Background
Clinicians are confronted with an increasing number of patients with thyroid nodules. Reliable preoperative diagnosis of thyroid nodules remains a challenge because cytological examination of fine-needle aspiration biopsies is often inconclusive. Although molecular analysis of thyroid tissue has shown promise as a diagnostic tool in recent years, it has not been successfully applied in routine clinical use, particularly in Chinese patients.

Methods
Whole-transcriptome sequencing of 19 primary papillary thyroid cancer (PTC) samples and matched adjacent normal thyroid tissue (NT) samples was performed. Bioinformatics analysis was carried out to identify candidate diagnostic genes. RT-qPCR was then performed to evaluate these candidate genes, and four genes were finally selected. Based on these four genes, a diagnostic algorithm was developed (training set: 100 thyroid cancer (TC) and 65 benign thyroid lesions (BTL)) and validated (independent set: 123 TC and 81 BTL) using the support vector machine (SVM) approach.

Results
We identified four genes: fibronectin 1 (FN1), gamma-aminobutyric acid type A receptor beta 2 subunit (GABRB2), neuronal guanine nucleotide exchange factor (NGEF), and high-mobility group AT-hook 2 (HMGA2). An SVM model based on these four genes achieved 97.0 % sensitivity, 93.8 % specificity, 96.0 % positive predictive value (PPV), and 95.3 % negative predictive value (NPV) in the training set. In the independent validation set, it also performed well (92.7 % sensitivity, 90.1 % specificity, 93.4 % PPV, and 89.0 % NPV).

Conclusions
Our diagnostic panel can accurately distinguish benign from malignant thyroid nodules using a simple and affordable method, which may find routine clinical application in the near future.

Electronic supplementary material
The online version of this article (doi:10.1186/s13046-016-0447-3) contains supplementary material, which is available to authorized users.
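
As a reading aid for the performance figures in the Results, the following minimal sketch shows how sensitivity, specificity, PPV, and NPV follow from confusion-matrix counts. The abstract reports only the percentages and the set sizes (100 TC / 65 BTL for training, 123 TC / 81 BTL for validation); the individual counts below are back-calculated assumptions chosen to be consistent with those figures, not values stated by the authors. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary malignant-vs-benign classifier."""
    tp: int  # thyroid cancers called malignant
    fn: int  # thyroid cancers called benign
    tn: int  # benign lesions called benign
    fp: int  # benign lesions called malignant


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the four standard diagnostic metrics as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# ASSUMPTION: counts inferred from the reported percentages and set sizes;
# the abstract itself does not list them.
training = ConfusionCounts(tp=97, fn=3, tn=61, fp=4)      # 100 TC, 65 BTL
validation = ConfusionCounts(tp=114, fn=9, tn=73, fp=8)   # 123 TC, 81 BTL

for name, counts in (("training", training), ("validation", validation)):
    metrics = diagnostic_metrics(counts)
    summary = ", ".join(f"{k} {100 * v:.1f} %" for k, v in metrics.items())
    print(f"{name}: {summary}")
```

Under these assumed counts, the rounded values match the reported ones (for example, training PPV is 97 / (97 + 4), about 96.0 %). Note that PPV and NPV depend on the proportion of malignant cases in each set, so they would differ in a population with a different prevalence.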