NEPS technical report for mathematics: Scaling results for the additional study Thuringia

Project report (Research)

Publication data


By: Anna-Lena Kock, Lara Aylin Petersen, Kristin Litteck
Original language: English
Published in: (NEPS Survey Paper; No. 84)
Pages: 31
Editor (Publisher): Leibniz Institut für Bildungsverläufe, Nationales Bildungspanel
DOI/Link: https://doi.org/10.5157/NEPS:SP84:1.0 (Open Access)
Publication status: Published – 03.2021

The National Educational Panel Study (NEPS) investigates the development of competencies across the whole life span and develops tests for assessing these competence domains in different age groups. To evaluate the quality of the competence tests, a wide range of analyses based on item response theory (IRT) are performed. This paper describes the data and scaling procedure for the mathematics competence test administered in the additional study Thuringia. The test was designed to assess the graduating classes of 2010 (the last year not affected by the reform of the “Leistungskurs-Grundkurs-System”) and 2011 (the first year after the reform). In total, 2,266 students participated in these two waves. The mathematics test consisted of 41 items (distributed among eight booklets), representing different content areas as well as different cognitive components. A Rasch model was used to scale the data. Item fit statistics, differential item functioning, Rasch homogeneity, and the test's dimensionality were evaluated to ensure the quality of the test. These analyses showed that the test exhibited good reliability and that the items showed a satisfactory model fit. Furthermore, test fairness was confirmed for different subgroups. One limitation of the test was a number of recognizable gaps at the upper end of the scale's item difficulties. Overall, the mathematics test had good psychometric properties that allowed for the estimation of a reliable mathematics competence score. Besides the scaling results, this paper also describes the data available in the Scientific Use File and provides the ConQuest syntax for scaling the data.
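
For readers unfamiliar with the Rasch model mentioned in the abstract, the following is a minimal sketch of its standard dichotomous form, in which the probability of a correct response depends only on the difference between a person's ability and an item's difficulty. It is not the report's ConQuest syntax and does not reproduce the report's exact model specification; the function name `rasch_probability` and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def rasch_probability(theta: float, beta: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    theta: person ability in logits (hypothetical value, not from the report)
    beta:  item difficulty in logits (hypothetical value, not from the report)
    """
    x = theta - beta
    # Evaluate the logistic function in a numerically stable way.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


# Illustrative values only: three persons answering one fairly hard item.
abilities = [-1.0, 0.0, 1.0]
difficulty = 0.5
probabilities = [rasch_probability(t, difficulty) for t in abilities]
```

Under this form, a person whose ability equals the item difficulty has a 50% chance of answering correctly. This is one way to read the limitation noted in the abstract: where few items have difficulties near the upper end of the scale, high-ability students are measured less precisely.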