Comparing generative AI and expert feedback to students’ writing: Insights from student teachers

Journal article, Research, Peer reviewed

Publication data


By: Thorben Jansen, Lars Höft, Luca Bahr, Johanna Fleckenstein, Jens Möller, Olaf Köller, Jennifer Meyer
Original language: English
Published in: Psychologie in Erziehung und Unterricht, 71(2)
Pages: 80-92
Editor (Publisher): Ernst Reinhardt Verlag
ISSN: 0342-183X
DOI/Link: https://doi.org/10.2378/peu2024.art08d (Open Access)
Publication status: Published – 04.2024

Feedback is crucial for learning complex tasks like writing, yet creating it is time-consuming, so students often receive too little of it. Generative artificial intelligence, particularly Large Language Models (LLMs) like ChatGPT 3.5-Turbo, has been discussed as a way to provide more feedback. However, evidence that AI-generated feedback already meets the quality criteria for classroom use is still lacking, and studies have yet to investigate whether LLM-generated feedback seems useful to its potential users. In our study, 89 student teachers evaluated the usefulness of feedback on students’ argumentative writing, comparing LLM-generated with expert-generated feedback without being told the source of the feedback. Participants rated LLM-generated feedback as useful for revision in 59 % of texts (compared to 88 % for expert feedback). In 23 % of cases, participants preferred to give the LLM-generated feedback to students. Our discussion focuses on the conditions under which AI-generated feedback might be used effectively and appropriately in educational settings.