Background
The diagnostic performance of ultrasound in evaluating thyroid nodules varies across risk stratification systems, leading to inconsistency and uncertainty regarding diagnostic sensitivity, specificity, and accuracy.

Objective
To compare the diagnostic performance, in detecting thyroid cancer, of distinct ultrasound risk stratification systems proposed in the last five years.

Evidence acquisition
A systematic search of the PubMed, EMBASE, and Web of Science databases was conducted to identify relevant studies published up to December 8, 2022 that reported the diagnostic performance of any of the following ultrasound risk stratification systems: European Thyroid Imaging Reporting and Data System [Eu-TIRADS]; American College of Radiology TIRADS [ACR TIRADS]; Chinese version of TIRADS [C-TIRADS]; and a computer-aided diagnosis system based on deep learning [S-Detect]. With histopathology and cytology as the diagnostic gold standard, an individual meta-analysis was performed for each system to obtain its optimal cut-off value, and a network meta-analysis was then conducted on the best risk stratification category of each system.

Evidence synthesis
This network meta-analysis included 88 studies with a total of 59,304 nodules. The most accurate risk category thresholds were TR5 for Eu-TIRADS, TR5 for ACR TIRADS, TR4b and above for C-TIRADS, and possible malignancy for S-Detect. At these thresholds, sensitivity ranged from 68% to 82% and specificity from 71% to 81%. C-TIRADS TR4b had the highest sensitivity and ACR TIRADS TR5 had the highest specificity; however, ACR TIRADS TR5 also had the lowest sensitivity.
C-TIRADS ranked first in both diagnostic odds ratio (DOR) and area under the curve (AUC).

Conclusion
Among the four ultrasound risk stratification systems, this systematic review preliminarily indicates that C-TIRADS has favorable diagnostic performance for thyroid nodules.

Systematic review registration
https://www.crd.york.ac.uk/prospero, CRD42022382818.
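
For readers less familiar with the accuracy measures named in the evidence synthesis, the sketch below shows how sensitivity, specificity, and the DOR follow from a single study's 2×2 table of test result against the histopathology or cytology reference. It is a minimal illustration, not the pooling method used in this review: the class and function names are hypothetical, the counts in the usage example are invented for illustration and are not data from any included study, and the 0.5 continuity correction for zero cells is one common convention assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Nodule counts cross-classified by index test and reference standard.

    Hypothetical helper type for illustration only.
    """
    tp: int  # test positive, malignant on the reference standard
    fp: int  # test positive, benign on the reference standard
    fn: int  # test negative, malignant on the reference standard
    tn: int  # test negative, benign on the reference standard


def sensitivity(t: TwoByTwo) -> float:
    """Proportion of malignant nodules that the test flags as positive."""
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    """Proportion of benign nodules that the test reports as negative."""
    return t.tn / (t.tn + t.fp)


def diagnostic_odds_ratio(t: TwoByTwo) -> float:
    """Odds of a positive test in malignant nodules over the odds in benign ones.

    Assumption: if any cell is zero, 0.5 is added to every cell so that the
    ratio stays finite (a common continuity correction).
    """
    cells = [t.tp, t.fp, t.fn, t.tn]
    if 0 in cells:
        cells = [c + 0.5 for c in cells]
    tp, fp, fn, tn = cells
    return (tp * tn) / (fp * fn)


if __name__ == "__main__":
    # Invented counts for illustration; not taken from the review.
    example = TwoByTwo(tp=80, fp=20, fn=20, tn=80)
    print(f"sensitivity = {sensitivity(example):.2f}")
    print(f"specificity = {specificity(example):.2f}")
    print(f"DOR         = {diagnostic_odds_ratio(example):.1f}")
```

A higher DOR means the threshold separates malignant from benign nodules better overall, which is why it serves as a single ranking measure alongside the AUC when sensitivity and specificity point in different directions.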