Strengths and weaknesses of automated scoring of free-text student answers
Journal article › Research › Peer reviewed
Publication data
Authors | Marie Bexte, Andrea Horbach, Torsten Zesch
Original language | English
Published in | Informatik Spektrum, 47(3-4)
Pages | 78-86
Publisher | Springer
ISSN | 0170-6012, 1432-122X
DOI | https://doi.org/10.1007/s00287-024-01573-z
Publication status | Published, September 2024
Free-text tasks, in which students write a short answer to a specific question, are a well-established method for assessing learner knowledge. To address the high cost of scoring these tasks manually, automated scoring models can be used. Such models come in various types, each with its own strengths and weaknesses. Comparing these types helps in selecting the most suitable one for a given problem; depending on the assessment context, this decision can also be driven by ethical or legal considerations. When deployed successfully, a scoring model has the potential to substantially reduce costs and improve the reliability of the scoring process. This article compares the different categories of scoring models across a set of criteria that are of immediate relevance when deploying such models in practice.
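To make the general idea concrete, the sketch below illustrates one simple category of scoring model: a similarity-based scorer that compares a student answer to a reference answer and awards a point if the two are sufficiently similar. This is a minimal illustration, not the approach evaluated in the article; the function name, TF-IDF representation, similarity threshold, and example texts are all illustrative assumptions.

```python
# Minimal sketch of a similarity-based free-text scorer (illustrative only).
# Assumption: a student answer is marked correct if its TF-IDF cosine
# similarity to a reference answer exceeds a chosen threshold.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def score_answer(student_answer: str, reference_answer: str, threshold: float = 0.5) -> int:
    """Return 1 if the student answer is sufficiently similar to the reference, else 0."""
    vectorizer = TfidfVectorizer()
    vectors = vectorizer.fit_transform([reference_answer, student_answer])
    similarity = cosine_similarity(vectors[0], vectors[1])[0, 0]
    return int(similarity >= threshold)


if __name__ == "__main__":
    # Hypothetical example question: "What does photosynthesis do?"
    reference = "Photosynthesis converts light energy into chemical energy stored in glucose."
    student = "Plants use light energy to produce glucose and store chemical energy."
    print(score_answer(student, reference))
```

A scorer of this kind is cheap and transparent, but it only rewards lexical overlap with the reference answer; other model categories discussed in the article trade off such simplicity against accuracy, data requirements, and explainability in different ways.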