Accurate Determination of Genotypic Variance of Cell Wall Characteristics of a Populus trichocarpa Pedigree Using High-Throughput Pyrolysis-Molecular Beam Mass Spectrometry

Harman‐Ware, Anne E.; Macaya-Sanz, David; Abeyratne, Chanaka Roshan; Doeppke, Crissa; Haiby, Kathleen; Tuskan, Gerald A.; Stanton, Brian J.; DiFazio, Stephen P.; Davis, Mark F.

doi:10.21203/rs.3.rs-23478/v1

Cited by 5 publications

(11 citation statements)

References 44 publications


“…As shown in the PCA scores plot in Figure 1 and similarly to that reported previously [1, 2, 7, 9, 28], samples cluster according to primary biomass type (family, being hardwood, softwood, or grasses) based on their spectra prior to RUV correction (instrument drift correction) from m/z 30–450 and to a lesser degree, secondary biomass type (species) and Sample ID. The first principal component (x-axis), as outlined before, generally separates the biomass types according to relative lignin and sugar abundance.…”

Section: Results (supporting)

confidence: 81%

“…The underlying assumptions of normalizing large datasets across different instruments for RNAseq are applicable to py-MBMS analysis: (1) data are complex and contain replicates; (2) there are differences in sample loadings; (3) occurrence of operator technician variation. Normalization strategies for py-MBMS have traditionally included reference standards from NIST and internal controls, but they typically fail to account for instrument drift across time or space and thus can only be comparable within runs, although we have previously reported one different type of correction or normalization strategy [28]. Similar limitations occur across RNA sequencing runs, which vary in library depth, preparation, and reagent quality.…”

Section: Results (mentioning)

confidence: 99%


Machine Learning-Based Classification of Lignocellulosic Biomass from Pyrolysis-Molecular Beam Mass Spectrometry Data

Nag, Gerritsen, Doeppke, et al. 2021. IJMS. Self-citation.


High-throughput analysis of biomass is necessary to ensure consistent and uniform feedstocks for agricultural and bioenergy applications and is needed to inform genomics and systems biology models. Pyrolysis coupled with mass spectrometry, such as pyrolysis-molecular beam mass spectrometry (py-MBMS), is becoming increasingly popular for the rapid analysis of biomass cell wall composition and typically requires different data analysis tools depending on the need and application. Here, the authors report the py-MBMS analysis of several types of lignocellulosic biomass to gain an understanding of spectral patterns and variation with associated biomass composition and use machine learning approaches to classify, differentiate, and predict biomass types on the basis of py-MBMS spectra. Py-MBMS spectra were also corrected for instrumental variance using generalized linear modeling (GLM), with the relative abundances of select ions used as spike-in controls. Machine learning classification algorithms, e.g., random forest, k-nearest neighbor, decision tree, Gaussian Naïve Bayes, gradient boosting, and multilayer perceptron classifiers, were used. The k-nearest neighbors (k-NN) classifier generally performed the best for classifications using raw spectral data, and the decision tree classifier performed the worst. After normalization of spectra to account for instrumental variance, all the classifiers had comparable and generally acceptable performance for predicting the biomass types, although the k-NN and decision tree classifiers were not as accurate for prediction of specific sample types. Gaussian Naïve Bayes (GNB) and extreme gradient boosting (XGB) classifiers performed better than the k-NN and the decision tree classifiers for the prediction of biomass mixtures.
The data analysis workflow reported here could be applied and extended for comparison of biomass samples of varying types, species, phenotypes, and/or genotypes or subjected to different treatments, environments, etc. to further elucidate the sources of spectral variance and patterns and to infer compositional information based on spectral analysis, particularly for data without a priori knowledge of the feedstock composition or identity.


“…Three half-sib families of male parents from a half-diallel designed cross (7 × 7) were used to generate three genetic maps [32]. A similar protocol as described above was used to call variants.…”

Section: Methods (mentioning)

confidence: 99%

Sequencing and Analysis of the Sex Determination Region of Populus trichocarpa

Zhou, Macaya‐Sanz, Schmutz, et al. 2020. Genes. Self-citation.


The ages and sizes of a sex-determination region (SDR) are difficult to determine in non-model species. Due to the lack of recombination and enrichment of repetitive elements in SDRs, the quality of assembly with short sequencing reads is universally low. Unique features present in the SDRs help provide clues about how SDRs are established and how they evolve in the absence of recombination. Several Populus species have been reported with a male heterogametic configuration of sex (XX/XY system) mapped on chromosome 19, but the exact location of the SDR has been inconsistent among species, and thus far, none of these SDRs has been fully assembled in a genomic context. Here we identify the Y-SDR from a Y-linked contig directly from a long-read PacBio assembly of a Populus trichocarpa male individual. We also identified homologous gene sequences in the SDR of P. trichocarpa and the SDR of the W chromosome in Salix purpurea. We show that inverted repeats (IRs) found in the Y-SDR and the W-SDR are lineage-specific. We hypothesize that, although the two IRs are derived from the same orthologous gene within each species, they likely have independent evolutionary histories. Furthermore, the truncated inverted repeats in P. trichocarpa may code for small RNAs that target the homologous gene for RNA-directed DNA methylation. These findings support the hypothesis that diverse sex-determining systems may be achieved through similar evolutionary pathways, thereby providing a possible mechanism to explain the lability of sex-determination systems in plants in general.


“…Pyrolysis or thermal degradative methods coupled with chromatographic and/or mass spectrometry techniques can be performed on minimally processed biomass, yield highly reproducible results, and can be used in high-throughput platforms to analyze lignin content and composition in biomass [3, 29–33].…”

Section: Introduction (mentioning)

confidence: 99%

Comparison of Methodologies used to Determine Aromatic Lignin Unit Ratios in Lignocellulosic Biomass

Happs, Addison, Doeppke, et al. 2021. Preprint. Self-citation.


Background: Multiple analytical methods have been developed to determine the ratios of aromatic lignin units, particularly the syringyl/guaiacyl (S/G) ratio, of lignin biopolymers in plant cell walls. Chemical degradation methods such as thioacidolysis produce aromatic lignin units that are released from certain linkages and may induce chemical changes rendering it difficult to distinguish and determine the source of specific aromatic lignin units released, as is the case with nitrobenzene oxidation methodology. NMR methods provide powerful tools used to analyze cell walls for lignin composition and linkage information. Pyrolysis-mass spectrometry methods are also widely used, particularly as high-throughput methodologies. However, the different techniques used to analyze aromatic lignin unit ratios frequently yield different results within and across particular studies, making it difficult to interpret and compare results. This also makes it difficult to obtain meaningful insights relating these measurements to other characteristics of plant cell walls that may impact biomass sustainability and conversion metrics for the production of bio-derived fuels and chemicals.
Results: The authors compared the S/G lignin unit ratios obtained from thioacidolysis, pyrolysis-molecular beam mass spectrometry (py-MBMS), HSQC liquid-state NMR and solid-state (ss) NMR methodologies of pine, several genotypes of poplar, and corn stover biomass. An underutilized approach to deconvolute ssNMR spectra was implemented to derive S/G ratios. The S/G ratios obtained for the samples did not agree across the different methods, but trends were similar with the most agreement among the py-MBMS, HSQC NMR and deconvoluted ssNMR methods. The relationship between S/G, thioacidolysis yields, and linkage analysis determined by HSQC is also addressed.
Conclusions: This work demonstrates that different methods using chemical, thermal, and nondestructive NMR techniques to determine native lignin S/G ratios in plant cell walls may yield different results depending on species and linkage abundances. Spectral deconvolution can be applied to many hardwoods with lignin dominated by S and G units, but the results may not be reliable for some woody and grassy species of more diverse lignin composition. HSQC may be a better method for analyzing lignin in those species given the wealth of information provided on additional aromatic moieties and bond linkages. Additionally, trends or correlations in lignin characteristics such as S/G ratios and lignin linkages within the same species such as poplar may not necessarily exhibit the same trends or correlations made across different biomass types. Careful consideration is required when choosing a method to measure S/G ratios and the benefits and shortcomings of each method discussed here are summarized.
