Diverse Data Repository for Data Science Education

Project Title: Diverse Data Repository for Data Science Education
Principal Investigator: Katie Burak
Co-Applicants: Elham Khoda, Assistant Professor of Teaching, Computer Science, Faculty of Science
Faculty: Science
Funding Year: 2025
Project Summary: This project aims to create a user-friendly data repository of datasets covering a wide range of topics, including EDI-related data. By curating freely available, open-source data, the repository will give instructors an accessible resource for introducing diverse and meaningful topics into their classrooms. The datasets will be available through an open-access website and as R and Python packages, so that students, educators and practitioners who use these languages can integrate them directly into their workflows. The proposed resources will support teaching in UBC’s Master of Data Science program, specifically in courses such as DSCI 552 and DSCI 571.

Along with the datasets, there will be example problems that showcase various types of data science questions, offering instructors motivation and practical inspiration for incorporating these topics into their teaching. The history of each dataset, along with a narrative and key points for educators to discuss with their students, will also be documented. Providing this rich context and explaining the significance of the data not only addresses a common gap in most freely available data repositories but also helps students engage more deeply with the material by connecting it to real-world issues and meaningful discussions.
Grant Type: OER Affordability Grant
Funded Amount: $24,976