Raman spectroscopy is commonly used in microplastics identification, but equipment variations yield inconsistent data structures that disrupt the development of communal analytical tools. We report a strategy to overcome the issue using a database of high-resolution, full-window Raman spectra. This approach enables customizable analytical tools to be easily createda feature we demonstrate by creating machine-learning classification models using open-source random-forest, K-nearest neighbors, and multi-layer perceptron algorithms. These models yield >95% classification accuracy when trained on spectroscopic data with spectroscopic data downgraded to 1, 2, 4, or 8 cm–1 spacings in Raman shift. The accuracy can be maintained even in non-ideal conditions, such as with spectroscopic sampling rates of 1 kHz and when microplastic particles are outside the focal plane of the laser. This approach enables the creation of classification models that are robust and adaptable to varied spectrometer setups and experimental needs.
The rapid growth in microplastic pollution research is influencing funding priorities, environmental policy, and public perceptions of risks to water quality and environmental and human health. Ensuring that environmental microplastics research data are findable, accessible, interoperable, and reusable (FAIR) is essential to inform policy and mitigation strategies. We present a bibliographic analysis of data sharing practices in the environmental microplastics research community, highlighting the state of openness of microplastics data. A stratified (by year) random subset of 785 of 6,608 microplastics articles indexed in Web of Science indicates that, since 2006, less than a third (28.5%) contained a data sharing statement. These statements further show that most often, the data were provided in the articles’ supplementary material (38.8%) and only 13.8% via a data repository. Of the 279 microplastics datasets found in online data repositories, 20.4% presented only metadata with access to the data requiring additional approval. Although increasing, the rate of microplastic data sharing still lags behind that of publication of peer-reviewed articles on environmental microplastics. About a quarter of the repository data originated from North America (12.8%) and Europe (13.4%). Marine and estuarine environments are the most frequently sampled systems (26.2%); sediments (18.8%) and water (15.3%) are the predominant media. Of the available datasets accessible, 15.4% and 18.2% do not have adequate metadata to determine the sampling location and media type, respectively. We discuss five recommendations to strengthen data sharing practices in the environmental microplastic research community.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.