Data access, use, and reuse are crucial for empirical science and evidence-based policymaking, and they rely on metadata that help both users and producers discover and use data. Metadata quality is pivotal for the federal government's understanding of what data are available for evidence building: it enables agencies to identify gaps and redundancies in data production and enhances the quality of evidence through reproducible research. The alignment of federal data agencies' incentives with those of the private sector, alongside technological advancements, now supports feedback-driven data classification that leverages machine learning to improve data discoverability and categorization. This article outlines the multiple classification needs of one statistical agency, the National Center for Science and Engineering Statistics, and proposes a machine learning approach for classifying data sets based on their usage in research. The approach aligns with legislative and policy frameworks and aims to enhance data governance, interoperability, and utility for evidence-based decision-making.