Changes in the human microbiome are associated with many human diseases. Next-generation sequencing technologies make it possible to quantify microbial composition without the need for laboratory cultivation. One important problem in microbiome data analysis is to identify the environmental and biological covariates that are associated with different bacterial taxa. Taxa count data in microbiome studies are often over-dispersed and include many zeros. To account for such over-dispersion, we propose an additive logistic normal multinomial regression model to associate the covariates with bacterial composition. The model naturally accounts for sampling variability and zero observations and also allows for a flexible covariance structure among the bacterial taxa. To select the relevant covariates and estimate the corresponding regression coefficients, we propose a group ℓ1 penalized likelihood estimation method. We develop a Monte Carlo expectation-maximization algorithm to implement the penalized likelihood estimation. Our simulation results show that the proposed method outperforms the group ℓ1 penalized multinomial logistic regression and the Dirichlet multinomial regression models in variable selection. We demonstrate the methods using a data set that links the human gut microbiome to micro-nutrients in order to identify the nutrients that are associated with the human gut microbiome enterotype.
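As an illustration of the model class named above, the following is a minimal sketch of how taxa counts could be simulated from an additive logistic normal multinomial regression. It is not the authors' implementation: the function name `simulate_lnm_counts`, the parameter names, the choice of the last taxon as the reference category, and the omission of an intercept are all assumptions made for this example. The group ℓ1 penalty and the Monte Carlo expectation-maximization fitting step are not shown.

```python
import numpy as np

def simulate_lnm_counts(X, B, Sigma, depth, rng):
    """Draw taxa counts from an additive logistic normal multinomial model.

    X:     (n, p) covariate matrix (hypothetical input).
    B:     (p, K-1) regression coefficients, one column per non-reference taxon.
    Sigma: (K-1, K-1) covariance of the latent log-ratios.
    depth: total number of reads per sample.
    The last taxon is assumed to be the reference category.
    """
    n = X.shape[0]
    k_minus_1 = B.shape[1]
    # Latent log-ratios: linear predictor plus correlated Gaussian noise.
    noise = rng.multivariate_normal(np.zeros(k_minus_1), Sigma, size=n)
    eta = X @ B + noise
    # Inverse additive log-ratio transform; the reference taxon is fixed at 0.
    eta_full = np.hstack([eta, np.zeros((n, 1))])
    eta_full -= eta_full.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(eta_full)
    probs /= probs.sum(axis=1, keepdims=True)
    # Multinomial sampling of counts given each sample's latent composition.
    counts = np.vstack([rng.multinomial(depth, p) for p in probs])
    return counts, probs
```

In this sketch, a covariate with an all-zero row in `B` has no effect on any taxon, which is the group structure that a group ℓ1 penalty is designed to select. The Gaussian noise on the log-ratio scale is what produces over-dispersion relative to a plain multinomial logistic model.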
We document the prevalence and the pattern of dermatologic diseases among primary and secondary school students visiting a Student Health Service Center in Hong Kong. We also highlight differences in the prevalence of skin diseases between these two groups. A total of 1006 students from both primary (n = 559) and secondary schools (n = 447) were seen in a regional, population-based screening center during the period from October 1996 to September 1997. Each student was asked to answer a simple questionnaire to identify any skin problems and to explore health-seeking behavior. Students were then examined for evidence of skin disease. A total of 314 students (31.3%) had one or more skin disorders, the most common of which were acne vulgaris (9.9%), eczema (6.8%), café au lait spots (4.4%), congenital melanocytic nevus (3.6%), superficial fungal infections (2.2%), keratosis pilaris (1.3%), and pityriasis alba (1.0%), which represented 93% of the skin disorders encountered. Acne vulgaris and tinea cruris were distinctly more common in secondary school students, while atopic eczema and congenital melanocytic nevi were more commonly found in primary school students. Among the 314 students with skin disease, 129 (41%) had symptoms while 185 (59%) did not. Ninety of the 129 students (70%) with symptomatic skin problems did not seek medical attention. The two predominant skin diseases, acne vulgaris and endogenous eczema, both chronic skin problems, not only cause morbidity in affected individuals and families but also use considerable resources in the community. The lack of medical intervention reported by symptomatic students in this study was unexpectedly high. It is therefore useful to monitor the epidemiology of skin problems in children so that relevant skin health education programs and preventive measures can be planned and implemented effectively.
We consider S-estimators of multivariate location and common dispersion matrix in multiple populations. Instead of averaging the robust estimates of the individual covariance matrices, as done by Todorov, Neykov and Neytchev (1990), the observations are pooled to estimate the common covariance more efficiently. Two such proposals are evaluated by a breakdown point analysis and Monte Carlo simulations. Their application to discriminant analysis is also considered.
Academic Press. AMS 1991 subject classifications: 62H30, 62F35.