Body height has traditionally been looked upon as a mirror of the condition of society, short height being an indicator of poor nutritional status, poor education, and low social status and income. This view has recently been questioned. We aimed to quantify the effects of nutrition, education, sibship size, and household income, factors that are conventionally considered to be related to child growth, on body height of children and adolescents raised under urban Indian conditions. We re-analyzed several anthropometric measurements and questionnaires with questions on sibship size, fathers' and mothers' education, and monthly family expenditure, from two cross-sectional growth studies performed in Kolkata, India. The first Kolkata Growth Study (KG1) took place in 1982-1983, with data on 825 Bengali boys aged 7 to 16 years; the second Kolkata Growth Study (KG2) took place between 1999 and 2011, with data on 1999 boys aged 7 to 21 years from Bengali Hindu families, and data on 2195 girls obtained between 2005 and 2011. Indian children showed a positive but non-significant secular trend in height and significant secular trends in weight and BMI between 1982 and 2011. Yet, multiple regression analysis failed to detect an association of nutritional status (expressed in terms of skinfold thickness), monthly family expenditure, and sibship size with body height of these children. The analysis only revealed an influence of parental education on female, but not on male, height. We failed to detect influences of nutrition, sibship size, and monthly family expenditure on body height in a large sample of children and adolescents raised in Kolkata, India, between 1982 and 2011. We found a mild positive association between parental education and girls' height. The data question current concepts regarding the impact of nutrition and of household and economic factors on growth, and instead underscore the effect of parental education.
Regulatory regions, like promoters and enhancers, cover an estimated 5–15% of the human genome. Changes to these sequences are thought to underlie much of human phenotypic variation and a substantial proportion of genetic causes of disease. However, our understanding of their functional encoding in DNA is still very limited. Applying machine or deep learning methods can shed light on this encoding, and gapped k-mer support vector machines (gkm-SVMs) or convolutional neural networks (CNNs) are commonly trained on putative regulatory sequences. Here, we investigate the impact of negative sequence selection on model performance. By training gkm-SVM and CNN models on open chromatin data and corresponding negative training datasets, we compare both learners and two approaches for generating negative training data. Negative sets use either genomic background sequences or sequence shuffles of the positive sequences. Model performance was evaluated on three different tasks: predicting elements active in a cell type, predicting cell-type-specific elements, and predicting elements' relative activity as measured from independent experimental data. Our results indicate strong effects of the negative training data, with genomic backgrounds showing the best results overall. Specifically, models trained on highly shuffled sequences perform worse on the complex tasks of tissue-specific activity and quantitative activity prediction, and seem to learn features of artificial sequences rather than regulatory activity. Further, we observe that insufficient matching of genomic background sequences results in model biases. While CNNs matched and exceeded the performance of gkm-SVMs for larger training datasets, gkm-SVMs gave robust and best results for typical training dataset sizes without the need for hyperparameter optimization.
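The two kinds of negative set named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how sequences were shuffled or which properties were matched, so everything here is an assumption made for illustration: a full mononucleotide shuffle stands in for "sequence shuffles", and matching on length and GC content stands in for "matching of genomic background sequences". The function names `shuffle_sequence`, `gc_fraction`, and `pick_gc_matched`, the `tolerance` parameter, and the toy sequences are all hypothetical.

```python
import random


def shuffle_sequence(seq, rng):
    """Return a random permutation of seq (assumed mononucleotide shuffle).

    Base composition is preserved; the order of bases is not.
    """
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def gc_fraction(seq):
    """Fraction of G/C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq.upper()) / len(seq)


def pick_gc_matched(positive, candidates, tolerance=0.05):
    """Return the first candidate with the same length as `positive` whose
    GC fraction differs by at most `tolerance`; return None if none fits.

    Length and GC content are assumed matching criteria, not the paper's.
    """
    target = gc_fraction(positive)
    for cand in candidates:
        same_length = len(cand) == len(positive)
        if same_length and abs(gc_fraction(cand) - target) <= tolerance:
            return cand
    return None


# Toy sequences, invented for this sketch only.
rng = random.Random(0)
positive = "ACGTGCGCATTA"
shuffled = shuffle_sequence(positive, rng)
background = pick_gc_matched(
    positive, ["AAAAAAAAAAAA", "GCATGCATGCAT", "ACGT"]
)
```

A shuffled negative keeps the base composition of its positive sequence but destroys longer motifs, whereas a background negative is a real genomic sequence chosen to resemble the positive in simple properties. If the matching criteria are too loose, a model can separate the classes on the unmatched property alone, which is one way to read the bias the abstract attributes to insufficient matching.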