Don’t score too early! Evaluating argument mining models on incomplete essays
Article in conference proceedings › Research › peer-reviewed
Publication details
| Authors | Nils-Jonathan Schaller, Yuning Ding, Thorben Jansen, Andrea Horbach |
| Original language | English |
| Published in | Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025) |
| Pages | 345–355 |
| Publisher | Association for Computational Linguistics |
| ISBN | 979-8-89176-270-1 |
| DOI/Link | https://aclanthology.org/2025.bea-1.27/ |
| Publication status | Published – 07.2025 |
Students' argumentative writing benefits from automated feedback, particularly when it is delivered throughout the writing process. Argument Mining (AM) technology shows promise for delivering automated feedback on argumentative structures; however, existing systems are frequently trained on completed essays. Although completed essays provide rich context information, concerns have been raised about how useful models trained on them are for supporting writers on incomplete texts during the writing process. This study evaluates the robustness of AM algorithms on artificially fragmented learner texts from two large-scale corpora of secondary school essays: the German DARIUS corpus and the English PERSUADE corpus. Our analysis reveals that token-level sequence-tagging methods, while highly effective on complete essays, degrade significantly when the context is limited or misleading. Conversely, sentence-level classifiers remain relatively stable under such conditions. We show that deliberately training AM models on fragmented input substantially mitigates these context-related weaknesses, enabling AM systems to better support dynamic educational writing scenarios.
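The abstract does not say how the essays were fragmented, so the following is only a minimal sketch of one plausible way to simulate incomplete texts: cutting each essay into cumulative sentence prefixes. The function names, the naive regex sentence splitter, and the chosen fractions are all assumptions made for illustration, not the authors' procedure.

```python
import math
import re


def split_sentences(text: str) -> list[str]:
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def make_prefix_fragments(text: str, fractions=(0.25, 0.5, 0.75, 1.0)) -> list[str]:
    # Simulate an essay "in progress" by keeping only its first sentences.
    # The fractions are hypothetical cut points, not values from the paper.
    sentences = split_sentences(text)
    fragments = []
    for frac in fractions:
        # Keep at least one sentence, rounding the cut point up.
        n = max(1, math.ceil(frac * len(sentences)))
        fragments.append(" ".join(sentences[:n]))
    return fragments


essay = "School should start later. Teenagers need sleep! Is that so hard? I think not."
for fragment in make_prefix_fragments(essay):
    print(repr(fragment))
```

Each fragment could then be passed to an AM model in place of the complete essay, either to measure how its predictions change with less context or to build fragmented training input.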