Background
There is growing interest in the use of 18F-FDG PET-CT to monitor tuberculosis (TB) treatment response. However, TB causes complex and widespread pathology, which is challenging to segment and quantify in a reproducible manner. To address this, we developed a technique to standardise uptake (Z-score) and to segment and quantify tuberculous lung lesions on PET and CT concurrently, in order to track changes over time. We used open source tools and created a MATLAB script. The technique was optimised on a training set of five pulmonary tuberculosis (PTB) cases after standard TB therapy and 15 control patients with lesion-free lungs.

Results
We compared the proposed method to a fixed threshold (SUV > 1) and to manual segmentation by two readers, and piloted the technique successfully on scans of five control patients and five PTB cases (four cured and one failed treatment case), at diagnosis and after 1 and 6 months of treatment. Z-score-based segmentation correlated better with manual segmentation than SUV > 1 segmentation did, in terms of overall spatial overlap (measured by the Dice similarity coefficient) and specificity (1 minus the false positive volume fraction). However, SUV > 1 segmentation appeared more sensitive. Both the Z-score and SUV > 1 methods showed very low variability when measuring change over time. In addition, total glycolytic activity, calculated using segmentation by Z-score and lesion-to-background ratio, correlated well with traditional total glycolytic activity calculations. The technique quantified various PET and CT parameters, including the total glycolytic activity index, metabolic lesion volume, lesion volumes at different CT densities and combined PET and CT parameters.
The quantified metrics showed a marked decrease in the cured cases, with changes already apparent at month one, but remained largely unchanged in the failed treatment case.

Conclusions
Our technique shows promise for segmenting and quantifying the lung scans of pulmonary tuberculosis patients in a semi-automatic manner, appropriate for measuring treatment response. Further validation is required in larger cohorts.
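
For illustration only, the comparison metrics named above can be sketched in Python; this is not the MATLAB script described in this work. The sketch assumes that the Z-score is computed voxel-wise against the mean and standard deviation of uptake in lesion-free reference lung, and that specificity is one minus the fraction of non-lesion voxels wrongly included. The abstract states neither definition in full, so both are assumptions. The function names, the Z-score cut-off of 3.5 and the voxel values are hypothetical and synthetic.

```python
import numpy as np


def zscore_map(suv, ref_mean, ref_std):
    """Standardise SUV voxels against reference lung uptake (assumed definition)."""
    return (np.asarray(suv, dtype=float) - ref_mean) / ref_std


def dice(seg, ref):
    """Dice similarity coefficient between two binary masks."""
    seg = np.asarray(seg, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    total = seg.sum() + ref.sum()
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(seg, ref).sum() / total


def specificity(seg, ref):
    """1 minus false positive volume fraction (assumed: FP / non-lesion voxels)."""
    seg = np.asarray(seg, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    negatives = (~ref).sum()
    if negatives == 0:
        return 1.0
    return 1.0 - np.logical_and(seg, ~ref).sum() / negatives


def sensitivity(seg, ref):
    """Fraction of reference lesion voxels captured by the segmentation."""
    seg = np.asarray(seg, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    positives = ref.sum()
    if positives == 0:
        return 1.0
    return np.logical_and(seg, ref).sum() / positives


# Synthetic toy data: six voxels, the last two marked as lesion by a reader.
suv = np.array([0.5, 0.6, 0.7, 1.2, 2.5, 3.0])
manual = np.array([False, False, False, False, True, True])

# Hypothetical reference statistics and Z-score cut-off (not from the source).
z_mask = zscore_map(suv, ref_mean=0.6, ref_std=0.2) > 3.5
suv_mask = suv > 1.0  # fixed threshold used as the comparator in the abstract

for name, mask in (("Z-score", z_mask), ("SUV > 1", suv_mask)):
    print(name, dice(mask, manual), specificity(mask, manual), sensitivity(mask, manual))
```

The toy voxels only show how the three metrics are computed from a pair of masks; they say nothing about the performance reported in the Results.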