Title: Using NLP to create corpus-based vocabulary exercises in Latin classes
Authors: Beyer, Andrea; Schulz, Konstantin
Proceedings: INTED2020 Proceedings
ISBN: 978-84-09-17939-8
DOI: 10.21125/inted.2020.0562
Year: 2020
Pages: 1750-1757
Notes: --

Abstract

Learning a historical language differs from learning a modern language in that it emphasizes work on texts rather than everyday communication. As a result, not only expectations and motivation differ, but also the teaching methodology. Whereas learners of modern languages focus on language production, learners of Latin read or translate their texts. Because any given Latin word or phrase occurs with low overall frequency in this kind of learning environment, students are often unfamiliar with a given word and are therefore ultimately unable to translate the texts. To address this underlying problem of Latin classes (in German high schools), we investigate whether corpus-based methods support Latin vocabulary acquisition better than other methods used in language teaching, and how corpus-based tasks might be implemented in class (in analog and digital form). To this end, we adapt the methodology of data-driven (language) learning as an educational innovation for Latin classes. In this context, we reuse various tools from the Classics and the natural language processing community to develop our corpus-based software (Machina Callida). In parallel, we carried out several intervention studies using a design-based research approach. These studies yielded some interesting insights, e.g. that the majority of students fail to lemmatize words correctly. We also tested the user experience of our software by collecting feedback from experts (teachers, students). We then used both kinds of results to improve the software continuously, e.g. by readjusting the type of exercises or the order of tasks in our so-called vocabulary unit.
Finally, we started a new study using the so-called vocabulary unit, which serves as an example both of how to implement, to some extent, data-driven learning in Latin classes and of how to carry out an intervention study within the software. The results available so far indicate that students' short-term learning outcomes benefit slightly more from a context-based vocabulary task (cloze). Although the results are promising, we need to collect more data, especially data that takes individual learning progress into account (e.g. by implementing user management).
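
To illustrate what a context-based (cloze) vocabulary task looks like, the following is a minimal sketch, not the Machina Callida implementation. The function name `make_cloze` and the gap marker are hypothetical choices made for this example, and the sample sentence is the opening of Caesar's De Bello Gallico. The sketch matches surface word forms only; it does not lemmatize.

```python
import re
from typing import Tuple


def make_cloze(sentence: str, target: str, gap: str = "_____") -> Tuple[str, str]:
    """Replace the first whole-word occurrence of `target` with a gap.

    Hypothetical helper for illustration only. Returns the gapped
    sentence and the removed word form, which serves as the solution.
    """
    # Match the target as a whole word, ignoring case.
    pattern = re.compile(rf"\b{re.escape(target)}\b", flags=re.IGNORECASE)
    match = pattern.search(sentence)
    if match is None:
        raise ValueError(f"{target!r} does not occur in the sentence")
    gapped = sentence[:match.start()] + gap + sentence[match.end():]
    return gapped, match.group(0)


sentence = "Gallia est omnis divisa in partes tres."
gapped, solution = make_cloze(sentence, "divisa")
print(gapped)    # Gallia est omnis _____ in partes tres.
print(solution)  # divisa
```

Because the match is made on the inflected form as it appears in the text, a corpus-based tool would additionally need lemma information to select every sentence in which a given vocabulary item occurs.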