Five characteristics of quality educational assessments: Part one

Assessment literacy involves understanding how assessments are made, what type of assessments answer what questions, and how the data from assessments can be used to help teachers, students, parents, and other stakeholders make decisions about teaching and learning. Assessment designers strive to create assessments that show a high degree of fidelity to the following five traits:

1.  Content validity
2.  Reliability
3.  Fairness
4.  Student engagement and motivation
5.  Consequential relevance

In this blog post, we’ll cover the first characteristic of quality educational assessments: content validity.

Understanding content validity 

One of the most important characteristics of any quality assessment is content validity. Simply put, content validity means that the assessment measures what it is intended to measure for its intended purpose, and nothing more. For example, if an assessment is designed to measure Algebra I performance, then reading comprehension issues should not interfere with a student’s ability to demonstrate what they know, understand, and can do in Algebra I.

Content validity is evidenced at three levels: the assessment design, the assessment experience, and the assessment questions, or items. The assessment design is guided by a content blueprint, a document that clearly articulates the content that will be included in the assessment and the cognitive rigor of that content. The content standards the test is designed to assess determine what content makes it into the test’s item pool.

The next level where content validity matters is the assessment experience itself: when the student sits down to take the assessment, what items do they see? In a fixed-form, grade-level test, most or all students at a given grade level see the same item set, namely the items assessing the standards for the grade level to which the student is assigned. In a cross-grade, computer-adaptive test, an item selection algorithm presents each student with items sampled from a broad range of standards and adapts to the in-the-moment performance of the test taker. Each student sees items at the difficulty level that’s appropriate for them, based on their previous responses. This adaptivity enables test developers to provide very precise information about a student’s learning and performance in a domain area.
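
To make the idea of adaptivity concrete, here is a minimal sketch in Python. It is not the item selection algorithm of any particular assessment; it assumes a simple Rasch-style model, a hypothetical item pool where each item has an `id` and a `difficulty`, and a deliberately simplified rule for updating the ability estimate after each response. All function and parameter names are illustrative.

```python
import math

def p_correct(ability, difficulty):
    """Rasch-style probability that a student at `ability` answers correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def pick_next_item(ability, pool, seen):
    """Pick the unseen item whose difficulty is closest to the current estimate."""
    candidates = [item for item in pool if item["id"] not in seen]
    return min(candidates, key=lambda item: abs(item["difficulty"] - ability))

def update_ability(ability, difficulty, correct, step=0.5):
    """Simplified update (an assumption, not an operational scoring method):
    move the estimate up after a correct answer and down after an incorrect
    one, in proportion to how surprising the response was."""
    observed = 1.0 if correct else 0.0
    return ability + step * (observed - p_correct(ability, difficulty))

def run_adaptive_test(pool, answers, start=0.0, length=3):
    """Administer `length` items; `answers` maps item id -> True/False."""
    ability, seen, administered = start, set(), []
    for _ in range(length):
        item = pick_next_item(ability, pool, seen)
        seen.add(item["id"])
        administered.append(item["id"])
        ability = update_ability(ability, item["difficulty"], answers[item["id"]])
    return ability, administered

# Hypothetical five-item pool spanning easy (-2) to hard (+2).
pool = [
    {"id": "a", "difficulty": -2.0},
    {"id": "b", "difficulty": -1.0},
    {"id": "c", "difficulty": 0.0},
    {"id": "d", "difficulty": 1.0},
    {"id": "e", "difficulty": 2.0},
]
```

In this sketch, a student who keeps answering correctly is routed to progressively harder items, and a student who keeps answering incorrectly is routed to easier ones, so two students can start at the same item and still see different tests.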

Content validity also matters at the building-block level of an assessment such as MAP® Growth™: the questions, or items, themselves. Experts in both content and assessment develop items to measure the concepts and skills in the standards at the indicated levels of cognitive complexity. Every item in a high-quality assessment goes through a rigorous development process with several levels of review, which helps ensure that item content is clear, accurate, and relevant. The result is a robust and aligned item pool that serves to provide the most accurate information possible about a student.

Content validity is supported in a number of ways in educational assessments, including:

  • General assessment design principles that control for readability
  • Content expert review cycles
  • Evidence-centered design methodology
  • Statistical analysis of student performance on test items
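
As one illustration of the last point in the list above, the sketch below computes two classical item statistics in Python: item difficulty (the proportion of students who answered correctly) and the point-biserial correlation between an item score and the total test score. It is a minimal example under stated assumptions, not a description of how any specific assessment is analyzed; the function names and the sample data are hypothetical.

```python
from statistics import mean, pstdev

def item_difficulty(item_scores):
    """Proportion of students who answered the item correctly (0/1 scores)."""
    return mean(item_scores)

def point_biserial(item_scores, total_scores):
    """Correlation between a 0/1 item score and students' total test scores.

    Simplification: the total here includes the item itself; analysts often
    remove the item from the total before correlating.
    """
    sd = pstdev(total_scores)
    p = mean(item_scores)
    if sd == 0 or p in (0, 1):
        return 0.0  # undefined without variation; reported as 0 in this sketch
    mean_correct = mean(t for s, t in zip(item_scores, total_scores) if s == 1)
    mean_incorrect = mean(t for s, t in zip(item_scores, total_scores) if s == 0)
    return (mean_correct - mean_incorrect) / sd * (p * (1 - p)) ** 0.5

# Hypothetical data: five students, one item, and their total test scores.
item_scores = [1, 1, 1, 0, 0]
total_scores = [5, 4, 4, 2, 1]
difficulty = item_difficulty(item_scores)
discrimination = point_biserial(item_scores, total_scores)
```

A clearly positive point-biserial suggests that students who do well on the test overall also tend to answer the item correctly. A value near zero or below zero is a signal to send the item back for content review, since it may be measuring something other than the intended content.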

One way to check content validity of an assessment is to ask these guiding questions:

  • How closely does what the assessment measures match the intended (instructed) content?
  • What knowledge or skills does the student most need to perform successfully on this assessment?
  • If the student performs successfully on this assessment, what does that mean?

In closing

Content validity is foundational to making accurate inferences. If an educator is unclear about what an assessment is measuring, then the inferences made will be uninformative. In other words, the assessment will have failed in its prime directive: to provide valuable information about what the test taker knows and can do. An assessment can have all sorts of bells and whistles, incorporate cutting-edge technology and functionality, and have a great suite of reports that tell a compelling assessment narrative, but if the test is lacking content validity, it is not worth much. What’s more, when data from an assessment that lacks content validity is used to inform instruction, the result could include wasted time and inappropriate growth expectations for students. For these reasons, content validity is central to a high-quality educational assessment.

Learn more about validity in our guide, Not all assessment data is equal: Why validity and reliability matter. In my next post on characteristics of quality educational assessments, I’ll explore the importance of reliability.
