ASLAN at BEA 2026 Shared Task 2: Voting Across Scoring Paradigms
Contribution to conference proceedings › Research › Peer-reviewed
Publication details
| Authors | Marie Bexte, Yuning Ding, Josef Ruppenhofer, Nils-Jonathan Schaller, Daniel Ignacio Mora Melanchthon, Torsten Zesch, Andrea Horbach |
| Original language | English |
| Published in | Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026). Association for Computational Linguistics. |
| Publisher | Association for Computational Linguistics |
| ISBN | 979-8-89176-409-5 |
| Publication status | Published – 07.2026 |
This paper describes the ASLAN system contribution to the BEA 2026 Shared Task on rubric-based short answer scoring for German (Gombert et al., 2026). We investigate three complementary modeling paradigms: similarity-based scoring, instance-based classification, and rubric-prompted large language models (LLMs). For the unseen answers track, where test answers belong to prompts observed during training, we compare question-specific and generic scoring models as well as ensemble variants. For the unseen questions track, where models must generalize to previously unseen prompts, we primarily rely on zero-shot LLM-based scoring using the scoring rubrics. Our experiments show that similarity-based models outperform instance-based models and LLM-based models in the unseen answers setting. In addition, we find that ensemble methods improve robustness over individual models