The diagnosis of tuberculosis depends on detecting Mycobacterium tuberculosis (Mtb). Unfortunately, recognizing patients with extrapulmonary tuberculosis (EPTB) remains challenging because of its insidious clinical presentation and the poor performance of diagnostic tests. To identify biomarkers for EPTB, the GSE83456 dataset was screened for differentially expressed genes (DEGs), followed by a gene enrichment analysis. One hundred and ten DEGs were obtained, mainly enriched in inflammation- and immune-related pathways. Weighted gene co-expression network analysis (WGCNA) identified 10 co-expression modules. The turquoise module, which correlated most strongly with EPTB, contained 96 DEGs. Further screening with the least absolute shrinkage and selection operator (LASSO) and support vector machine recursive feature elimination (SVM-RFE) narrowed the 96 DEGs down to five key genes. All five key genes were validated in the GSE144127 dataset. CARD17 and GBP5 had high diagnostic capacity, with AUC values of 0.763 (95% CI: 0.717–0.805) and 0.833 (95% CI: 0.793–0.869), respectively. Using single-sample gene set enrichment analysis (ssGSEA), we evaluated the infiltration of 28 immune cell types in EPTB and explored their relationships with the key genes. The results showed significant infiltration of 17 immune cell subtypes in EPTB. CARD17, GBP5, HOOK1, LOC730167, and HIST1H4C were significantly associated with 16, 14, 12, 6, and 4 immune cell subtypes, respectively. RT-qPCR confirmed that the expression levels of GBP5 and CARD17 were higher in EPTB than in controls. In conclusion, CARD17 and GBP5 have high diagnostic efficiency for EPTB and are closely related to immune cell infiltration.
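To make the reported diagnostic metric concrete, the following is a minimal sketch of how an AUC and a 95% confidence interval can be computed for a single gene's expression values. It is illustrative only: the data are simulated, the function names `auc` and `bootstrap_ci` are hypothetical, and the percentile bootstrap is an assumption made here for simplicity, not necessarily the method used to obtain the intervals quoted above.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random case scores higher than a
    random control (ties count as 0.5); labels are 1 = case, 0 = control."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (an assumed
    method for this sketch)."""
    rng = random.Random(seed)
    idx = range(len(scores))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]  # resample with replacement
        lab = [labels[i] for i in sample]
        if 0 < sum(lab) < len(lab):  # keep only resamples with both classes
            stats.append(auc([scores[i] for i in sample], lab))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated expression values (NOT study data): 60 cases with a higher
# mean than 60 controls.
rng = random.Random(42)
labels = [1] * 60 + [0] * 60
scores = [rng.gauss(1.0, 1.0) for _ in range(60)] + \
         [rng.gauss(0.0, 1.0) for _ in range(60)]

point = auc(scores, labels)
lower, upper = bootstrap_ci(scores, labels)
print(f"AUC = {point:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

The pairwise definition used in `auc` is equivalent to the area under the ROC curve and keeps the sketch free of external dependencies; for real expression data, an established ROC package would be the more practical choice.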