Genetic analysis of wood chemical composition is often limited by the cost and throughput of direct analytical methods. Fourier transform near-infrared (FT-NIR) spectroscopy is fast and inexpensive and overcomes many of these limitations, but it is an indirect method that relies on calibration models, which are typically developed and validated with small sample sets. In this study, we used >1500 young greenhouse-grown trees from a single clonally propagated Populus family, grown at low and high nitrogen, to compare the effect of FT-NIR calibration sample sizes of 150, 250, 500 and 750 on calibration and prediction model statistics and on heritability estimates, with the models developed from pyrolysis molecular beam mass spectrometry (pyMBMS) wood chemical composition data. As calibration sample size increased from 150 to 750, predictive model statistics improved slightly. Overall, stronger calibration and prediction statistics were obtained for lignin, S-lignin, S/G ratio, and m/z 144 (an ion from cellulose) than for C5 and C6 carbohydrates and m/z 114 (an ion from xylan). Although model statistics differed only slightly between the 250- and 500-sample calibration sets, when predicted values were used to estimate genetic control, the 500-sample set gave results substantially closer to those obtained with the pyMBMS data. With the 500-sample calibration models, genetic correlations obtained with FT-NIR and
pyMBMS methods were similar. Quantitative trait loci (QTL) analysis with pyMBMS data and FT-NIR predictions identified only three common loci for lignin traits. FT-NIR identified four QTLs that were not found with pyMBMS data, and these QTLs were for the less well-predicted carbohydrate traits.

OPEN ACCESS. Forests 2014, 5, 467
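
To make the comparison of genetic control between measured and predicted values concrete, the following is a minimal sketch of a clone-based broad-sense heritability estimate, assuming a balanced one-way design (equal ramets per clone) and ignoring the nitrogen treatment. The abstract does not state the statistical model actually used, so the function name `broad_sense_heritability`, the ANOVA-based variance components, and all numbers below are illustrative assumptions, not the study's method or data.

```python
from statistics import mean


def broad_sense_heritability(values_by_clone):
    """Estimate H^2 = Vg / (Vg + Ve) from a balanced one-way ANOVA.

    values_by_clone maps a clone label to a list of trait values, one per
    ramet. Hypothetical helper: assumes the same number of ramets per clone.
    """
    groups = list(values_by_clone.values())
    n_clones = len(groups)
    n_reps = len(groups[0]) if groups else 0
    if n_clones < 2 or n_reps < 2 or any(len(g) != n_reps for g in groups):
        raise ValueError("need >= 2 clones with equal ramet counts (>= 2 each)")

    grand_mean = mean(v for g in groups for v in g)
    clone_means = [mean(g) for g in groups]

    # Mean square among clones and mean square within clones (residual).
    ms_between = (
        n_reps * sum((m - grand_mean) ** 2 for m in clone_means) / (n_clones - 1)
    )
    ms_within = sum(
        (v - m) ** 2 for g, m in zip(groups, clone_means) for v in g
    ) / (n_clones * (n_reps - 1))

    # Method-of-moments genetic variance; negative estimates are set to zero.
    var_genetic = max((ms_between - ms_within) / n_reps, 0.0)
    total = var_genetic + ms_within
    return var_genetic / total if total > 0 else 0.0


# Invented values for three clones with two ramets each: a reference
# measurement and a noisier prediction of the same trait.
reference = {"A": [10.0, 12.0], "B": [14.0, 16.0], "C": [18.0, 20.0]}
predicted = {"A": [9.0, 13.0], "B": [13.0, 17.0], "C": [17.0, 21.0]}

h2_reference = broad_sense_heritability(reference)
h2_predicted = broad_sense_heritability(predicted)
```

Running the same estimator on reference values and on predictions from each calibration set is one simple way to see how prediction error inflates the residual variance and lowers the heritability estimate, which is the kind of comparison the abstract reports for the 250- and 500-sample sets.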