Automated Classification of Pedagogical Materials with NLP

Ensuring that computer science curricula align with international standards is a complex challenge, given the extensive guidelines published by organizations such as ACM and IEEE. Manually assessing the coverage of each course requires significant time and resources.

This study proposes a Natural Language Processing (NLP)-based approach to automate the classification of pedagogical materials. Two families of techniques are explored: traditional methods based on parsing, tagging, and embeddings, and Large Language Models (LLMs).
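To make the vector-based side of this idea concrete, the following is a minimal sketch and not the study's actual pipeline. It assumes that each curriculum knowledge area is represented by a short keyword description and that a document is assigned to the area whose description it most resembles. Simple word-count vectors with cosine similarity stand in for learned embeddings here. The names KNOWLEDGE_AREAS, vectorize, cosine, and classify, as well as the keyword lists, are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical keyword descriptions of three knowledge areas.
# These lists are illustrative assumptions, not taken from the study
# or from the ACM/IEEE guidelines.
KNOWLEDGE_AREAS = {
    "Algorithms and Complexity":
        "algorithm sorting searching graph complexity analysis recursion",
    "Operating Systems":
        "process thread scheduling memory virtual file system kernel",
    "Networking and Communication":
        "network protocol routing tcp ip socket packet http",
}


def vectorize(text):
    """Turn text into a bag-of-words count vector (a stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(document):
    """Return the best-matching knowledge area and the score for every area."""
    doc_vec = vectorize(document)
    scores = {
        area: cosine(doc_vec, vectorize(description))
        for area, description in KNOWLEDGE_AREAS.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    label, scores = classify(
        "Lecture 4: process scheduling, threads, and virtual memory"
    )
    print(label)
```

In a fuller system the count vectors would presumably be replaced by dense embeddings, and a document with no overlap at all (every score equal to zero) would need to be flagged rather than assigned to the first area by default.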

Preliminary results indicate that these techniques can classify documents effectively, reducing the workload for curriculum administrators.