The widespread legalization of Cannabis has opened the industry to contemporary analytical techniques for chemotype analysis. The gene regulation and pharmacokinetics of dozens of cannabinoids remain poorly understood, which makes them of high interest for pharmacology research. Retailers in many medical and recreational jurisdictions are typically required to report the concentrations of at least some cannabinoids, and commercial cannabis laboratories have consequently collected large chemotype datasets covering the diverse oil profiles of commercially available Cannabis cultivars. In this work, a dataset of 17,600 cultivars tested by Steep Hill Inc. is examined using machine learning techniques to interpolate missing chemotype observations and to cluster cultivars into groups based on chemotype similarity. The results indicate that cultivars cluster based on their chemotypes, and that some imputation methods work better than others at grouping these cultivars by chemotypic identity. Because of missing data and the low signal-to-noise ratio for some less common cannabinoids, the behavior of those cannabinoids could not be accurately predicted. These findings have implications for characterizing complex interactions in cannabinoid biosynthesis and improving phenotypical classification of Cannabis cultivars.
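The impute-then-cluster workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not name the imputation or clustering algorithms used, so column-mean imputation and plain k-means are assumed here purely for illustration. The cannabinoid columns (THCA, CBDA, CBGA), the sample values, and the function names are all hypothetical.

```python
import math

def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing center.
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda i: math.dist(p, centers[i])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[i])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical chemotype rows: [THCA, CBDA, CBGA] in percent by weight.
# None marks a cannabinoid that was not reported for that sample.
samples = [
    [20.1, 0.3, 0.8],
    [18.5, None, 0.6],
    [22.0, 0.2, None],
    [0.7, 14.2, 0.4],
    [None, 12.9, 0.3],
    [0.5, 15.1, None],
]

filled = impute_column_means(samples)
labels, centers = kmeans(filled, k=2)
print(labels)
```

In this toy data, mean imputation moves the samples with missing values toward the middle of the chemotype space, which suggests why the choice of imputation method could affect how cleanly cultivars group. A neighbor-based imputer may behave differently on the same rows.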