The exposome concept aims to consider all environmental stressors simultaneously. The high dimensionality of the data and the correlations that may exist between exposures raise various statistical challenges. Some methodological studies have provided insight into the efficiency of specific modeling approaches when exposome data are assessed once for each subject. However, few studies have considered the situation in which environmental exposures are assessed repeatedly. Here, we conduct a simulation study to compare the performance of statistical approaches for assessing exposome-health associations in the context of multiple exposure variables. Different scenarios were tested, assuming different types and numbers of exposure-outcome causal relationships. An application study using real data collected within the INMA mother-child cohort (Spain) is also presented. In the simulation experiment, the assessed methods showed varying performance across scenarios, making it challenging to recommend a one-size-fits-all strategy. Generally, sparse partial least-squares and the deletion-substitution-addition algorithm tended to outperform the other tested methods (ExWAS, Elastic-Net, DLNM, and sNPLS). Notably, the performance of all methods declined as the number of true predictors increased. The absence of a clearly superior approach underscores the additional challenges posed by repeated exposome data, such as more complex correlation structures and interdependencies between variables, and highlights the need for careful consideration when selecting a statistical method. In this regard, we provide recommendations based on the expected scenario. Given the heightened risk of reporting false positive or false negative associations when applying these techniques to repeated exposome data, we advise interpreting the results with caution, particularly in unfavorable settings such as those with a limited sample size.