Guimbaud, Jean-Baptiste; Sabidó Aguadé, Eduard, 1981-; Borràs, Eva; Júlvez Calvo, Jordi; Urquiza, José M.; Casas Sanahuja, Maribel; Bustamante Pineda, Mariona; Nieuwenhuijsen, Mark J.; Vrijheid, Martine; López Vicente, Mònica, 1988-; de Castro, Montserrat; Basagaña Flores, Xavier; Maitre, Léa
2024-11-22; 2024-11-22; 2024
Guimbaud JB, Siskos AP, Sakhi AK, Heude B, Sabidó E, Borràs E, et al. Machine learning-based health environmental-clinical risk scores in European children. Commun Med (Lond). 2024 May 23;4(1):98. DOI: 10.1038/s43856-024-00513-y
2730-664X
http://hdl.handle.net/10230/68783
Background: Early life environmental stressors play an important role in the development of multiple chronic disorders. Previous studies that used environmental risk scores (ERS) to assess the cumulative impact of environmental exposures on health are limited by the diversity of exposures included, especially for early life determinants. We used machine learning methods to build early life exposome risk scores for three health outcomes using environmental, molecular, and clinical data.
Methods: In this study, we analyzed data from 1622 mother-child pairs from the HELIX European birth cohorts, using over 300 environmental, 100 child peripheral, and 18 mother-child clinical markers to compute environmental-clinical risk scores (ECRS) for child behavioral difficulties, metabolic syndrome, and lung function. ECRS were computed using LASSO, Random Forest and XGBoost. XGBoost ECRS were selected to extract local feature contributions using Shapley values and derive feature importance and interactions.
Results: ECRS captured 13%, 50% and 4% of the variance in mental, cardiometabolic, and respiratory health, respectively.
We observed no significant differences in predictive performances between the above-mentioned methods. The most important predictive features were maternal stress, noise, and lifestyle exposures for mental health; proteome (mainly IL1B) and metabolome features for cardiometabolic health; child BMI and urine metabolites for respiratory health.
Conclusions: Besides their usefulness for epidemiological research, our risk scores show great potential to capture holistic individual level non-hereditary risk associations that can inform practitioners about actionable factors of high-risk children. As in the post-genetic era personalized prevention medicine will focus more and more on modifiable factors, we believe that such integrative approaches will be instrumental in shaping future healthcare paradigms.
application/pdf
eng
© The Author(s) 2024. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Machine learning-based health environmental-clinical risk scores in European children
info:eu-repo/semantics/article
http://dx.doi.org/10.1038/s43856-024-00513-y
Epidemiology
Paediatric research
info:eu-repo/semantics/openAccess
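The abstract states that local feature contributions were extracted from the XGBoost scores using Shapley values. The record does not say how these were computed, so the following is only a minimal sketch of the underlying idea: exact Shapley values obtained by enumerating feature coalitions. The scoring function `toy_score`, its coefficients, the feature `physical_activity`, and all input values are hypothetical and are not taken from the study; features left out of a coalition are assumed to be replaced by a reference value.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to baseline.

    Features outside a coalition are set to their baseline value.
    The cost grows as 2**n, so this only suits a handful of features.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        # Score with coalition features taken from x, the rest from baseline.
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_score(f):
    # Hypothetical risk score with one interaction term (invented coefficients).
    return (2.0 * f["maternal_stress"]
            + 1.0 * f["noise"]
            + 0.5 * f["maternal_stress"] * f["noise"]
            - 1.0 * f["physical_activity"])


# Hypothetical child profile and reference profile (illustrative values only).
child = {"maternal_stress": 2.0, "noise": 4.0, "physical_activity": 1.0}
reference = {"maternal_stress": 0.0, "noise": 0.0, "physical_activity": 0.0}

contributions = shapley_values(toy_score, child, reference)
for name, phi in contributions.items():
    print(f"{name}: {phi:+.2f}")
```

In this toy case the interaction term is shared equally between the two interacting features, and the contributions sum to the difference between the child's score and the reference score. Tree-based models with hundreds of features would in practice need a dedicated approximation or tree-specific algorithm rather than full enumeration.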