Comprehensive characterizations of genetic diversity and demographic models of ethnolinguistically diverse Chinese populations are essential for elucidating their forensic characteristics and evolutionary past. We developed a 114-plex NGS InDel panel to genotype 114 genome-wide markers and investigated the genetic structures of Zhuang, Hui, Miao, Li, Tibetan, Yi, and Mongolian populations, encompassing five language families. This panel demonstrated robust performance, with exceptional potential for forensic individual identification and paternity testing, evidenced by the combined power of discrimination for 77 autosomal InDels (ranging from 1 − 3.6400 × 10⁻³⁰ to 1 − 3.5713 × 10⁻³²) and the combined power of exclusion (ranging from 1 − 2.1863 × 10⁻⁶ to 1 − 2.1261 × 10⁻⁷). The cumulative mean exclusion chance for 32 X-chromosomal InDels ranged from 0.99996 to 0.99999 for trios and from 0.99760 to 0.99898 for duos. We also analyzed genetic similarities and differences between these populations and 27 global populations, revealing distinct clusters among African, South Asian, East Asian, and European groups, with the studied populations showing close genetic affinity to East Asians. Notably, we identified geography-related genetic substructures: Inner Mongolia Mongolians and Gansu Huis formed a northern branch, Tibetans and Yis from Sichuan constituted a highland branch, and Guangxi Zhuangs exhibited close ties with Hainan Lis and Guangxi Miaos in the southern branch. Additionally, many InDels proved to be ancestry-informative markers for biogeographic ancestry inference. Collectively, these findings underscore the utility of the 114-plex NGS InDel panel as a complementary tool for forensic investigations and as a source of insights into the genetic architecture of ethnolinguistically distinct Chinese populations.
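Combined forensic parameters of this kind are conventionally obtained from per-marker values under an independence assumption, as 1 − ∏(1 − Pᵢ), which is why they are reported in the form "1 − x". The following is a minimal sketch of that standard product rule, not the authors' pipeline: the function name `combined_complement` and the per-marker values are hypothetical and are not taken from the study. It works with the complement x directly, in log space, because a value such as 1 − 10⁻³⁰ cannot be distinguished from 1.0 in double precision.

```python
import math


def combined_complement(per_marker_values):
    """Return prod(1 - v_i), i.e. x in 'combined power = 1 - x'.

    Assumes the markers are independent (an assumption of this sketch).
    Summing log1p(-v) keeps precision when the product becomes very small.
    """
    log_sum = 0.0
    for v in per_marker_values:
        if not 0.0 <= v < 1.0:
            raise ValueError(f"per-marker value must be in [0, 1): {v}")
        log_sum += math.log1p(-v)  # log(1 - v), accurate for small v
    return math.exp(log_sum)


# Hypothetical per-marker power of discrimination values (illustration only).
pd_values = [0.60, 0.55, 0.62]
x = combined_complement(pd_values)
print(f"CPD = 1 - {x:.4e}")
```

Reporting the complement rather than the combined power itself avoids rounding every large panel to exactly 1.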
Supplementary Information
The online version contains supplementary material available at 10.1186/s12864-024-10894-y.