Simulation study for evaluating MAP Growth item pools with grade-level constraints
This simulation study examines the measurement precision, item exposure rates, and the depth of the MAP Growth item pools under various grade-level restrictions.
There has been increasing concern about the presence of disengaged test taking in international assessment programs and its implications for the validity of inferences made regarding a country’s level of educational attainment. In this paper, the author discusses six important insights yielded by 20 years of research on test-taking disengagement and their implications for assessment programs.
By: Steven Wise
This technical report documents the processes and procedures employed by NWEA to build and support the MAP Reading Fluency assessment.
Products: MAP Reading Fluency
This study investigated test-taking engagement on a large-scale state summative assessment. Overall, results of this study indicate that disengagement has a material impact on individual state summative test scores, though its impact on score aggregations may be relatively minor.
This paper describes a method for identifying partial engagement and provides validation evidence to support its use and interpretation. When test events indicate the presence of partial engagement, effort-moderated scores should be interpreted cautiously.
To avoid the subjectivity of having a single person evaluate a construct of interest, multiple raters are often used. While a range of models has been presented to address the measurement issues that arise when using multiple raters, few can estimate growth in the presence of multiple raters. This study provides a model that removes all but the shared perceptions of raters at a given timepoint, then adds a latent growth curve model across timepoints. Results indicate that the model shows promise for use by researchers who want to estimate growth based on longitudinal multi-rater data.
This study is the first to apply semi-supervised learning to CDM. In addition, we used a validation test to choose appropriate parameters for the ANNs instead of typical statistical criteria such as AIC and BIC.
By: Kang Xue, Laine Bradshaw
Topics: Measurement & scaling
To avoid the subjectivity of having a single person evaluate a construct of interest (e.g., a student’s self-efficacy in school), multiple raters are often used. This study provides a model for estimating growth in the presence of multiple raters.