Increasing the generalizability of similarity-based essay scoring through cross-prompt training

Conference contribution (Article) › Research › Peer reviewed

Publication data

By	Marie Bexte, Yuning Ding, Andrea Horbach
Original language	English
Published in	Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)
Pages	225–236
Editor (Publisher)	Association for Computational Linguistics
ISBN	979-8-89176-270-1
DOI/Link	https://aclanthology.org/2025.bea-1.17
Publication status	Published – 07.2025

In this paper, we address generic essay scoring, i.e., the use of training data from one writing task to score data from a different task. We approach this by generalizing a similarity-based essay scoring method (Xie et al., 2022) to learning from texts that are written in response to a mixture of different prompts. In our experiments, we compare within-prompt and cross-prompt performance on two large datasets (ASAP and PERSUADE). We combine different numbers of prompts in the training data and show that our generalized method substantially improves cross-prompt performance, especially as the number of prompts used to form the training data increases. In the most extreme case, this more than doubles performance, increasing QWK from .26 to .55.
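The results above are reported as QWK, which in essay scoring commonly stands for quadratic weighted kappa, an agreement measure between human and predicted scores that penalizes large disagreements more heavily than small ones. This record does not describe how the authors computed it, so the following is only a minimal sketch of the standard definition; the function name, its parameters, and the toy scores are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def quadratic_weighted_kappa(human, predicted, min_score, max_score):
    """Sketch of quadratic weighted kappa for integer scores.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(human) != len(predicted) or not human:
        raise ValueError("score lists must be non-empty and of equal length")

    n = len(human)
    observed = Counter(zip(human, predicted))  # joint counts of (human, predicted)
    hist_human = Counter(human)                # marginal counts of human scores
    hist_pred = Counter(predicted)             # marginal counts of predicted scores

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            # Quadratic penalty; the usual normalizing constant cancels in the ratio.
            weight = (i - j) ** 2
            weighted_observed += weight * observed[(i, j)]
            # Expected count if the two raters were independent.
            weighted_expected += weight * hist_human[i] * hist_pred[j] / n

    if weighted_expected == 0:
        # Both raters gave every essay the same single score.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    print(quadratic_weighted_kappa([1, 2, 3, 4], [1, 2, 3, 4], 1, 4))
    print(quadratic_weighted_kappa([1, 1, 2, 2], [1, 2, 1, 2], 1, 2))
```

Under this definition, a value of 1 means perfect agreement and a value of 0 means agreement no better than chance, which is the scale on which the reported change from .26 to .55 should be read.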

IPN - Leibniz Institute for Science and Mathematics Education