10 (More) Questions to Ask When Comparing and Evaluating Interim Assessments

Not all assessments are created equal. In fact, they vary widely in their designs, purposes, and validity. Some deliver data powerful enough for educators to make informed decisions at the student, class, school, and district levels, and some don’t. Along with measuring student growth and achievement, quality interim assessment data can provide stability during times of transition in standards and curriculum (think Common Core).

We’ve blogged previously on questions to ask when evaluating student growth assessments, but other questions are equally important. Here we share 10 (more) questions, why they’re important to ask, and what to look for:

1. Was the assessment designed to provide achievement status and growth data?

Why it’s important: Growth data help engage students in a growth mindset; they learn their ability isn’t fixed, but can increase with effort. Data that fail to reflect student growth deny students and their communities the opportunity to be proud of what they accomplished. An assessment is only as good as the scale that forms its foundation.

What to look for: Look for a stable vertical scale that is regularly monitored for scale drift.

2. Is the assessment adaptive?

Why it’s important: Adaptive assessments choose items based on a student’s response pattern, so the student’s true ability level can be measured precisely with the fewest possible test items.

What to look for: Look for whether the assessment adapts to each student with each item. Many so-called adaptive assessments adapt only after a whole set of questions has been answered, which yields less precise results.
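To make item-by-item adaptation concrete, here is a minimal sketch in Python, assuming a simple Rasch (1PL) model; the difficulties and responses below are invented for illustration, and real adaptive engines add content balancing, exposure control, and richer IRT models:

```python
import math

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_theta(responses):
    """Grid-search maximum-likelihood ability estimate from a list of
    (item_difficulty, answered_correctly) pairs."""
    grid = [g / 10.0 for g in range(-40, 41)]  # theta from -4.0 to +4.0
    def loglik(theta):
        return sum(
            math.log(rasch_p(theta, b)) if correct
            else math.log(1.0 - rasch_p(theta, b))
            for b, correct in responses
        )
    return max(grid, key=loglik)

def pick_next_item(theta, unused_difficulties):
    """Rasch item information peaks when difficulty matches ability, so
    the most informative unused item is the one closest to the estimate."""
    return min(unused_difficulties, key=lambda b: abs(b - theta))

# After every response, re-estimate ability and adapt the next item:
responses = [(0.0, True), (0.8, True), (1.5, False)]
theta = estimate_theta(responses)                    # ~1.5 on this toy data
print(pick_next_item(theta, [-1.0, 0.5, 1.2, 2.0]))  # -> 1.2
```

An assessment that adapts only after a fixed block of items would, in this sketch, call estimate_theta once per block rather than once per response, which is exactly where the lost precision comes from.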

3. Does it provide test items at the appropriate difficulty level for each student?

Why it’s important: Instructional readiness does not always relate directly to grade level. Students may be at, above, or below grade level in terms of what they are ready to learn.

What to look for: Look for an assessment that adapts both within and across grade levels, so that it can measure the student’s true starting point. Also review the span of content the assessment covers. An assessment that truly informs educators about each student’s instructional readiness draws on content that spans multiple grades.

4. Does the assessment link to relevant resources to support instruction?

Why it’s important: Data linked to views, tools, and instructional resources can help educators answer the critical question, “How do we make these data actionable?”

What to look for: Look for links to instructional resources that support students in learning what they are ready to learn. These resources may include open educational resources (OER), complete curricula, or resources tied to blended learning models based on a student’s assessment results.

5. Do the data inform decision-making at the classroom level?

Why it’s important: At the classroom level, key uses for the data include grouping students, differentiating instruction, and identifying students for the programs and resources that will best support their individual needs.

What to look for: Look at whether the report views and tools empower classroom-level instructional decisions and simplify differentiation and grouping.

6. Can the data inform decision-making at the building and district levels?

Why it’s important: Report views that aggregate data for a school or across multiple sites serve the data needs of building- and district-level administrators.

What to look for: Look at the features and capabilities of the reports and tools to maximize the value of the assessment data. The more the assessment data are leveraged, the less time must be spent gathering them, lending efficiency to your assessment process.

7. Is the item pool sufficient to support the test design and purpose?

Why it’s important: High-quality items are a critical component of any assessment. An item pool’s sufficiency can be judged by a number of factors, including its development process, maturity, size, depth, and breadth.

What to look for: Look at the item development, alignment, and review processes. This information should be available in the assessment provider’s technical manual.

8. Do the assessment data have validity?

Why it’s important: Every assessment is designed with a purpose, or purposes, that its data can support. Validity, the degree to which an assessment measures what it intends to measure, determines whether the inferences made from the data are sound.

What to look for: Examine the test design and its intended purpose(s). Does it have a strong research underpinning or a track record of reliability and accuracy? Understand what content is covered, how the test should be administered and scored, and what its standard error of measurement is.
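As one concrete piece of that picture, the standard error of measurement in classical test theory follows directly from the score spread and the test’s reliability. A minimal sketch, with purely illustrative numbers:

```python
import math

def standard_error_of_measurement(score_sd, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    return score_sd * math.sqrt(1.0 - reliability)

# Illustrative values: a scale with SD 10 and reliability 0.91
print(standard_error_of_measurement(10.0, 0.91))  # ~3.0 scale-score points
```

A smaller SEM means a tighter band around each student’s observed score, which is what makes fine-grained instructional decisions defensible.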

9. Does the assessment provider develop norms from their data? How often are the norms updated?

Why it’s important: Norms provide a relevant data point that contextualizes a student’s assessment results and helps students and teachers with goal setting. They also provide context around student growth, answering questions such as “How much growth is sufficient?” and “Is the student gaining or losing ground relative to their peers?”

What to look for: Look for the assessment provider’s practices around norming, including how often normative studies are conducted, whether the population is nationally representative, and whether status and growth norms are developed.
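As a toy illustration of what status and growth norms each answer, assuming a small made-up norming sample (real norms come from large, nationally representative studies):

```python
def percentile_rank(value, norm_sample):
    """Percent of the norming sample at or below a given value."""
    at_or_below = sum(1 for v in norm_sample if v <= value)
    return 100.0 * at_or_below / len(norm_sample)

# Status norm: where does a score of 205 fall among same-grade peers?
fall_scores = [188, 192, 195, 199, 201, 205, 207, 214, 220, 225]
print(percentile_rank(205, fall_scores))  # -> 60.0

# Growth norm: is this student's gain typical for students who
# started at a similar score?
peer_gains = [2, 4, 5, 6, 7, 8, 9, 11]
print(percentile_rank(7, peer_gains))     # -> 62.5
```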

10. Can the assessment predict student performance on high-stakes, year-end summative tests and college benchmarks?

Why it’s important: Knowing whether students are on track to achieve proficiency on state assessments helps teachers adjust instructional pacing, plan interventions, and provide additional resources.

What to look for: Look for predictive studies that link student scores to proficiency levels for their state assessments or college entrance examinations.
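One common form such a study takes is a regression linking interim scores to later outcomes. Here is a sketch, assuming you have historical pairs of fall interim scores and year-end proficiency results (the scores, labels, and the scikit-learn dependency are all illustrative):

```python
from sklearn.linear_model import LogisticRegression

# Historical data: fall interim score -> reached proficiency (1) or not (0)
fall_scores = [[188], [192], [195], [199], [201], [207], [214], [220]]
proficient  = [0,     0,     0,     1,     0,     1,     1,     1]

model = LogisticRegression().fit(fall_scores, proficient)
prob = model.predict_proba([[205]])[0][1]
print(f"Estimated chance of year-end proficiency: {prob:.2f}")
```

A provider’s published linking studies do this at scale, with far larger samples and careful attention to how well the predictions hold up across years and populations.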

There is no such thing as a perfect assessment system, but one founded on a theory of action that incorporates multiple measures, provides opportunities for relevant feedback, and has credibility and defensibility (Brian Gong, “Multiple Measures: A Personal Response,” presented at the Reidy Interactive Lecture Series, Boston, MA, 2011) will go a long way toward maximizing results so that students are supported as they progress on their unique learning paths.