Self-assessment accuracy in the age of artificial intelligence: Differential effects of LLM-generated feedback

Journal article › Research › peer-reviewed

Publication details

By	Lucas Wilhelm Liebenow, Fabian T.C. Schmidt, Jennifer Meyer, Johanna Fleckenstein
Original language	English
Published in	Computers & Education, 237, Article 105385
Publisher	Elsevier
ISSN	0360-1315, 1873-782X
DOI/Link	https://doi.org/10.1016/j.compedu.2025.105385
Publication status	Published, November 2025

Feedback is a promising intervention to foster students' self-assessment accuracy (SAA), but its effect can vary depending on students' initial skill levels or prior performance. In particular, lower-performing students who are less accurate might benefit more from feedback in terms of SAA. To deepen our understanding, the present study investigated the mechanism and dependencies of feedback effects on SAA in the context of large language models (LLMs). In a randomized controlled experiment, we examined the effect of LLM-generated feedback on SAA, considering students' initial performance and initial SAA as potential moderators. A sample of N = 459 upper secondary students wrote an argumentative essay in English as a foreign language and then revised their text. After finishing their first draft (pretest) and their revision (posttest), students self-assessed their writing performance. Students in the experimental group received GPT-3.5-turbo-generated feedback on their first draft during revision. Students in the control group revised their text without feedback. Our results indicated no significant main effect of LLM-generated feedback on students' SAA. However, we found a significant interaction effect between feedback and students' pretest SAA on SAA changes, indicating that lower-calibrated students improved their SAA more with feedback than students with similar pretest SAA who received no feedback. Exploratory analyses revealed that students with higher pretest SAA did not improve their SAA with feedback; instead, their SAA decreased. We discuss this nuanced evidence and draw implications for research and practice using LLM-generated feedback in education.



IPN - Leibniz-Institut für die Pädagogik der Naturwissenschaften und Mathematik