Background: Ultrasound (US) examination is helpful in the differential diagnosis of thyroid nodules (malignant vs. benign), but its accuracy relies heavily on examiner experience. The aim of this study was therefore to develop a less subjective diagnostic model aided by machine learning. Methods: A total of 2064 thyroid nodules (2032 patients, 695 male; mean age = 45.25 ± 13.49 years) met all of the following inclusion criteria: (i) hemi- or total thyroidectomy, (ii) maximum nodule diameter 2.5 cm, (iii) examination by conventional US and real-time elastography within one month before surgery, and (iv) no previous thyroid surgery or percutaneous thermotherapy. Models were developed with nine commonly used algorithms on a randomly selected 60% of the samples and validated on the remaining 40%. All models were validated on a data set with a pretest probability of malignancy of 10%. The models were refined through 1000 repetitions of derivation and validation and compared with diagnosis by an experienced radiologist. Sensitivity, specificity, accuracy, and area under the curve (AUC) were calculated. Results: A random forest algorithm yielded the best diagnostic model, which performed better than radiologist diagnosis based on conventional US only (AUC = 0.924 [confidence interval (CI) 0.895-0.953] vs. 0.834 [CI 0.815-0.853]) and based on both conventional US and real-time elastography (AUC = 0.938 [CI 0.914-0.961] vs. 0.843 [CI 0.829-0.857]). Conclusions: Machine-learning algorithms based on US examinations, particularly the random forest classifier, may diagnose malignant thyroid nodules better than radiologists.
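The evaluation scheme described in the Methods (repeated random 60%/40% derivation/validation splits, scored by sensitivity, specificity, accuracy, and AUC) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names, the `fit` callable (a placeholder for any of the nine algorithms, such as a random forest), and the seed are assumptions introduced for illustration, and the adjustment of the validation set to a 10% pretest probability is not modeled.

```python
import random
from statistics import mean


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = malignant)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true, scores):
    """AUC as the probability that a random malignant case outscores a
    random benign case; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_split_auc(X, y, fit, n_repeats=1000, train_frac=0.6, seed=0):
    """Mean validation AUC over repeated random derivation/validation splits.

    `fit(X_train, y_train)` is a hypothetical callable that returns a
    scoring function mapping one sample to a malignancy score.
    """
    rng = random.Random(seed)
    idx = list(range(len(y)))
    cut = int(train_frac * len(idx))
    aucs = []
    for _ in range(n_repeats):
        rng.shuffle(idx)
        train, test = idx[:cut], idx[cut:]
        score = fit([X[i] for i in train], [y[i] for i in train])
        y_test = [y[i] for i in test]
        if len(set(y_test)) < 2:
            continue  # AUC is undefined when only one class is present
        aucs.append(auc(y_test, [score(X[i]) for i in test]))
    return mean(aucs)
```

Any learner can be plugged in through `fit`, provided it returns a per-sample scoring function; sensitivity, specificity, and accuracy then follow from `confusion_metrics` once the scores are thresholded into predicted labels.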