Measuring Student Growth: What is a Scale and What Does Stability Mean?

Measuring student growth is a crucial part of an educator’s role, and the stability of a measurement scale over time is necessary in order to measure that growth accurately. So what is a scale, and what does stability mean?

Scale is one of those words that befuddles learners of English because it has various, unrelated meanings, from the covering on certain animals, to a cause of blindness, to an instrument with graduated degrees. When we talk about scales in educational measurement, we’re talking about something akin to the last definition: a construct that indicates the degree of student ability in a certain area, such as mathematical reasoning. We call this ability a latent trait because we can’t observe the amount of a student’s mathematical reasoning ability directly, the way we can see a student’s hair color or the shoes she is wearing. Even though we can’t physically interact with it, we know mathematical reasoning is a real thing. It is similar to a psychological state such as happiness: it exists in the interiority of a person. Quantifying, or measuring, latent traits is a subtle endeavor, one to which we bring the power of statistical modeling to bear.

To measure a latent trait, we must first elicit it. We elicit evidence of a latent trait through the use of instruments such as test items, performance tasks, and writing samples. The evidence is a proxy, or representative, of the trait itself. We infer that a person is happy by certain indicators such as body language, laughter, or smiling. Likewise, we infer the degree of a latent trait such as mathematical reasoning by the evidence we get from eliciting that trait.

Once we elicit evidence of a latent trait, we measure that evidence by using a scale. The scale used by the Measures of Academic Progress® (MAP®) K-12 assessment is an equal interval scale called the RIT scale. An equal interval scale provides more than information about order: the distance between adjacent points is the same everywhere on the scale, so a given difference in scores represents the same amount of the underlying quality wherever it occurs. Another familiar example of an equal interval scale is the temperature scale on a thermometer. The value of an equal interval scale is that it is consistent and objective; such qualities help make a scale stable over time.
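To make the equal interval idea concrete, here is a minimal sketch in Python. It assumes a Rasch-style logistic model, which is an illustrative choice and not something this article specifies. The function name `p_correct` and the `units_per_logit` value are hypothetical and do not describe how RIT scores are actually computed.

```python
import math

def p_correct(ability: float, difficulty: float, units_per_logit: float = 10.0) -> float:
    """Probability of a correct response under a Rasch-style model.

    The model and the units_per_logit value are assumptions made for this
    illustration only.
    """
    logit = (ability - difficulty) / units_per_logit
    return 1.0 / (1.0 + math.exp(-logit))

# The same 10-point gap between student and item, at two different
# locations on the (hypothetical) scale.
low = p_correct(ability=190, difficulty=180)
high = p_correct(ability=240, difficulty=230)

print(f"gap of 10 near 185: {low:.4f}")
print(f"gap of 10 near 235: {high:.4f}")
```

Under these assumptions, the two probabilities are identical, because the model depends only on the distance between the student and the item. That is what it means for a 10-point difference to represent the same amount of the underlying quality anywhere on the scale.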

Scale stability means that scales maintain their measurement characteristics, allowing for comparisons of assessment scores among groups of students, growth estimates, and longitudinal studies. For example, a RIT score of 215 in 1975 would be equivalent to a RIT of 215 in 1995. Maintaining this stability of scale, so that our scale retains its measurement properties in exactly the same way year after year, is a crucial part of what we do here at NWEA.

Not all educational measurement scales are inherently stable. Many scales can, and do, drift over time, meaning that the underlying quantity of what is being measured shifts. Items can become more difficult or less difficult than their original calibrations indicate as time goes on. This is disastrous for measuring growth or making any sort of longitudinal comparisons. There are a number of reasons a scale might drift over time:

  • Curricular changes (including pedagogical approaches)
  • Changes in standards or what is valued in a content area
  • Changes in testing populations over time
  • Changes in the stakes of a test or assessment, its social context, or the meanings and consequences attached to it
  • Changes in the intended purposes of the assessment
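Whatever its cause, drift shows up as a gap between an item's original calibration and the difficulty it exhibits later. The Python sketch below illustrates that idea only and is not a description of NWEA's monitoring procedure. The item names, difficulty values, and 3-point threshold are all hypothetical.

```python
def difficulty_shifts(original: dict, recalibrated: dict) -> dict:
    """Change in difficulty for every item present in both calibrations."""
    common = original.keys() & recalibrated.keys()
    return {item: recalibrated[item] - original[item] for item in sorted(common)}

def flag_drift(shifts: dict, threshold: float = 3.0) -> list:
    """Items whose difficulty moved by more than `threshold` scale points."""
    return [item for item, shift in shifts.items() if abs(shift) > threshold]

# Hypothetical item difficulties on a hypothetical scale, at two points in time.
original = {"item_a": 205.0, "item_b": 212.0, "item_c": 198.0}
recalibrated = {"item_a": 205.5, "item_b": 217.0, "item_c": 197.0}

shifts = difficulty_shifts(original, recalibrated)
flagged = flag_drift(shifts)

print(shifts)
print(flagged)
```

In this made-up data, only `item_b` moves by more than the threshold. An item flagged this way would need review before its scores could be compared across the two time points.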

Guarding our scale’s integrity in the face of such changes allows us at NWEA to maintain a scale with remarkable stability over time. This stability gives us confidence in the educational decisions that are made as a result of the data gleaned from our MAP assessments. We are aware of the importance of maintaining scale stability, so we make it a priority to monitor our scales for drift. The scale connects it all: items to scores to students. We ensure our data are the gold standard in measuring growth because we know that every student can grow. Reporting that growth accurately makes a difference in the lives of students across the country and around the world.