As artificial intelligence (AI) and data science education gain importance in K-12 curricula, there is a growing need for well-designed, sustainable educational datasets tailored to different school levels. Sustainable datasets should be reusable, adaptable, and accessible so that they support long-term AI and data science education goals. However, research on the systematic categorization of difficulty levels in educational datasets is limited. This study addresses this gap by developing a framework of sustainable educational dataset standards based on learners’ developmental stages and data preprocessing requirements. The proposed framework consists of five levels: Level 1 (grades 1–4), where data preprocessing is unnecessary; Level 2 (grades 5–6), involving basic data cleaning; Level 3 (grades 7–9), requiring attribute manipulation; Level 4 (grades 10–12), involving feature merging and advanced preprocessing; and Level 5 (teachers/adults), requiring the entire data science process. An expert validity survey was conducted with 22 elementary and secondary school teachers holding advanced degrees in AI education. The results showed high validity for Levels 1–4 but relatively lower validity for Level 5, suggesting the need for separate training and resources for teachers. Based on the content validity ratio (CVR) results and expert feedback, the educational dataset standards were revised, particularly for Level 5, which targets teachers and adult learners. The findings highlight the importance of expert validation, step-by-step experiences, and an interdisciplinary approach in developing educational datasets. This study contributes to the theoretical understanding of educational datasets and provides practical implications for teachers, students, educational institutions, and policymakers implementing effective and sustainable AI and data science education in K-12 settings.
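As an illustration only, the five-level framework and the CVR mentioned above can be sketched in a few lines of Python. The sketch assumes that CVR refers to Lawshe's content validity ratio, CVR = (n_e − N/2) / (N/2), where n_e is the number of panelists who rate an item as valid and N is the panel size; the abstract does not state which formula or rating threshold the study used. The names `DatasetLevel`, `LEVELS`, `level_for_grade`, and `content_validity_ratio` are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetLevel:
    """One level of the dataset framework as summarized in the abstract."""
    level: int
    audience: str
    preprocessing: str


# Levels and descriptions are taken from the abstract; the structure is illustrative.
LEVELS = (
    DatasetLevel(1, "grades 1-4", "no preprocessing required"),
    DatasetLevel(2, "grades 5-6", "basic data cleaning"),
    DatasetLevel(3, "grades 7-9", "attribute manipulation"),
    DatasetLevel(4, "grades 10-12", "feature merging and advanced preprocessing"),
    DatasetLevel(5, "teachers/adults", "entire data science process"),
)


def level_for_grade(grade: int) -> DatasetLevel:
    """Return the framework level for a K-12 grade (1-12).

    Level 5 targets teachers and adults, so it is never returned here.
    """
    if not 1 <= grade <= 12:
        raise ValueError("grade must be between 1 and 12")
    if grade <= 4:
        return LEVELS[0]
    if grade <= 6:
        return LEVELS[1]
    if grade <= 9:
        return LEVELS[2]
    return LEVELS[3]


def content_validity_ratio(n_valid: int, n_panelists: int) -> float:
    """Lawshe's CVR (assumed formula): (n_e - N/2) / (N/2).

    Ranges from -1.0 (no panelist agrees) to 1.0 (all panelists agree).
    """
    if n_panelists <= 0:
        raise ValueError("n_panelists must be positive")
    if not 0 <= n_valid <= n_panelists:
        raise ValueError("n_valid must be between 0 and n_panelists")
    half = n_panelists / 2
    return (n_valid - half) / half
```

With a panel of 22, a hypothetical item endorsed by 20 panelists would give `content_validity_ratio(20, 22)`, that is 9/11 or about 0.82. This count is invented for illustration and is not a result reported by the study.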