Abstract. Integrating mineralogy with data science is critical to
modernizing Earth materials research and its applications to geosciences.
Data were compiled on 95 650 garnet sample analyses from a variety of
sources, ranging from large repositories (EarthChem, RRUFF, MetPetDB) to
individual peer-reviewed publications. An important feature is the inclusion
of mineralogical “dark data” from papers published prior to 1990. Garnets
are commonly used as indicators of formation environments, which directly
correlate with their geochemical properties; thus, they are an ideal subject
for the creation of an extensive data resource that incorporates
composition, locality information, paragenetic mode, age, temperature,
pressure, and geochemistry. For the data extracted from existing databases
and literature, we increased the resolution of several key attributes.
Petrogenetic and paragenetic attributes were extended from generic material
types (e.g., igneous, metamorphic) to more specific rock-type names (e.g.,
diorite, eclogite, skarn), and locality information was refined to include
continent, country, area, geological context, longitude, and latitude. We
also used end-member and quality index calculations to help assess the
quality of each garnet sample analysis. This comprehensive dataset of garnet
information is an open-access
resource available in the Evolutionary System of Mineralogy Database (ESMD)
for future mineralogical studies, paving the way for characterizing
correlations between chemical composition and paragenesis through natural
kind clustering (Chiama et al., 2022; https://doi.org/10.48484/camh-xy98).
We encourage scientists to contribute their own
unpublished and unarchived analyses to the growing data repositories of
mineralogical information that are increasingly valuable for advancing
scientific discovery.