The effect of nonignorable missing data in computerized adaptive test on item fit statistics for polytomous item response models
By: Shudong Wang, Hong Jiao
For both linear and adaptive tests, it is crucial to evaluate model-data fit because the goodness-of-fit of item response theory (IRT) models is relevant to any purpose of a test. To date, all item fit statistics have been derived for linear tests, and almost all studies have been done in the context of linear testing. These studies rest on assumptions that hold under regular conditions for fixed test forms, such as no missing responses and a normally distributed unidimensional ability in the population.
There has been increasing concern about the presence of disengaged test taking in international assessment programs and its implications for the validity of inferences made regarding a country’s level of educational attainment. In this paper, the author discusses six important insights yielded by 20 years of research on this issue, along with their implications for assessment programs.
By: Steven Wise
This technical report documents the processes and procedures employed by NWEA to build and support the MAP Reading Fluency assessment.
Products: MAP Reading Fluency
This study investigated test-taking engagement on a large-scale state summative assessment. Overall, results of this study indicate that disengagement has a material impact on individual state summative test scores, though its impact on score aggregations may be relatively minor.
This paper describes a method for identifying partial engagement and provides validation evidence to support its use and interpretation. When test events indicate the presence of partial engagement, effort-moderated scores should be interpreted cautiously.
To avoid the subjectivity of having a single person evaluate a construct of interest, multiple raters are often used. While a range of models to address measurement issues that arise when using multiple raters have been presented, few are available to estimate growth in the presence of multiple raters. This study provides a model that removes all but the shared perceptions of raters at a given timepoint then adds on a latent growth curve model across timepoints. Results indicate that the model shows promise for use by researchers who want to estimate growth based on longitudinal multi-rater data.
Topics: Item response theory
To avoid the subjectivity of having a single person evaluate a construct of interest (e.g., a student’s self-efficacy in school), multiple raters are often used. This study provides a model for estimating growth in the presence of multiple raters.