To identify operational taxonomic units (OTUs) that signal disease onset in an observational study, a powerful strategy is to select participants in matched sets, profile their metagenomes over time, and then perform trajectory analysis. Existing trajectory analyses model an individual OTU or the whole microbial community without adjusting for within-community correlation or for latent factors specific to each matched set. We proposed a joint model with matching and regularization (JMR) to detect OTU-specific compositional trajectories predictive of host disease status, using nested random effects and covariate taxa pre-selected by Bray-Curtis distance and elastic net regression. In a simulation study that generated temporal high-dimensional metagenomic counts with random intercepts or slopes, we demonstrated that JMR effectively controlled false discoveries and pseudo biomarkers. Applying JMR and competing methods to the simulated data and to the TEDDY cohort showed that JMR outperformed the other methods and identified important taxa in infants’ fecal samples whose dynamics preceded host disease status.

IMPORTANCE

We proposed a framework that links taxon-specific compositional trajectories to the host's risk of developing disease without transforming the relative abundances. Our model accounted for heterogeneity between matched sets, which improved detection power while keeping the false-positive rate well controlled. The inherent negative correlation in microbiota composition induced by the sum-to-one constraint was adjusted for by including the top-correlated taxa as covariates. We designed a simulation pipeline to generate true biomarkers for disease onset (i.e., causal OTUs and/or OTUs interacting with causal OTUs) as well as pseudo biomarkers caused by the sum-to-one constraint or latent noise. Our method reduced the pseudo biomarker rate in the simulations and identified more microbial trajectories signaling the autoimmune status of young children enrolled in the TEDDY cohort.
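
As an illustration of the Bray-Curtis pre-selection step mentioned above, the following minimal sketch ranks candidate covariate taxa by their Bray-Curtis dissimilarity to a target taxon's abundance profile across samples. The exact selection criterion and the subsequent elastic net step used by JMR are not specified here, so the function names (`bray_curtis`, `nearest_taxa`), the toy taxa, and the choice of keeping the `k` closest profiles are all assumptions made for illustration only.

```python
from typing import Dict, List, Sequence


def bray_curtis(u: Sequence[float], v: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two non-negative abundance profiles."""
    if len(u) != len(v):
        raise ValueError("profiles must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    # Two all-zero profiles are treated as identical (an assumption of this sketch).
    return numerator / denominator if denominator > 0 else 0.0


def nearest_taxa(profiles: Dict[str, List[float]], target: str, k: int) -> List[str]:
    """Return the k taxa whose profiles are closest to `target` (hypothetical helper)."""
    distances = [
        (bray_curtis(profiles[target], profile), name)
        for name, profile in profiles.items()
        if name != target
    ]
    distances.sort()  # ascending dissimilarity; ties broken by taxon name
    return [name for _, name in distances[:k]]


# Toy relative abundances of four hypothetical taxa across three samples;
# within each sample the four values sum to one.
profiles = {
    "otu_a": [0.50, 0.40, 0.30],
    "otu_b": [0.30, 0.30, 0.30],
    "otu_c": [0.15, 0.20, 0.30],
    "otu_d": [0.05, 0.10, 0.10],
}
candidates = nearest_taxa(profiles, "otu_a", k=2)
print(candidates)
```

In a fuller pipeline, the taxa returned here would be one plausible candidate set to pass to a regularized regression such as the elastic net, which would then decide which of them to retain as covariates.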