To Measure a Year’s Growth, Begin with the Student

At NWEA, one of the research team’s functions is to provide high-level technical support on a range of assessment issues, including the use and interpretation of the MAP/MPG norms. In that capacity, one question we receive regularly is, “What percentage of students typically make a year’s worth of growth on MAP in a single year?” This question is difficult to answer, because it’s precisely like asking what percentage of students increase their height by one year’s worth of growth in a single year. Both questions assume a false premise: that the attributes in question are measured and expressible as fixed units over a constant time interval.

Policy is inadvertently complicating this matter. Policymakers are accustomed to assessments designed to estimate whether students have demonstrated grade-level “mastery” of or “proficiency” with academic content as articulated in a set of content standards. Mastery and proficiency here are typically defined by the state as a scale score that meets or exceeds a predetermined level. Familiarity with such assessments leads policymakers to rely on metrics that summarize student performance as above or below grade level. At the same time, policymakers have increasingly enacted policies emphasizing the measurement of student academic growth. In doing so, they have relied on grade level as a time metric. This has led to the use of the term “years of growth,” as if a predetermined amount of change should occur between years.

Estimating growth over time is not the same thing as estimating grade-level “mastery.” Tests designed to estimate grade-level “mastery” assess the extent to which students are learning what has been established for them to learn. Tests designed to measure growth focus not on grade-level mastery but on whether improvement has occurred. These are different kinds of assessments, and they use different metrics.

Whether measuring height or student achievement, one defines growth as the change in that attribute between two time points. If my daughter was 39 inches tall on her fourth birthday and 42 inches tall on her fifth birthday, then she grew three inches during that year. So, for her, one year’s worth of growth was three inches, because that’s how much she grew during the year. If she produced a MAP math score of 140 on her fourth birthday and a 160 on her fifth birthday, she would have shown 20 RIT points of growth. So, for her, one year’s worth of growth was 20 RIT points, because that’s how much she grew during the year.

NWEA provides growth norms that allow one to compare a student’s observed growth with that of a nationally representative comparison group. The norms provide a context for knowing how much growth is typical or atypical for students over a school year or between varying time intervals within a school year. For example, the fall-to-spring growth exhibited by first graders with fall math RIT scores of 130 is described by a normal distribution (i.e., bell curve) with a mean of 22.6 points and a standard deviation of 6.9, looking like this:

[Figure: bell curve of fall-to-spring growth for first graders with a fall math RIT score of 130 (mean 22.6, standard deviation 6.9)]

About half of first graders with this fall score of 130 will show fall-to-spring growth less than about 23 RIT points, and about half will show more than 23 RIT points. Some schools tabulate the percentage of their students whose growth meets or exceeds mean growth for students who start out at the same RIT level (50% would be about typical), and report this value as a performance indicator for how well their students are doing.
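As a rough illustration of the distribution just described, the sketch below uses Python’s standard-library `statistics.NormalDist` with the mean (22.6) and standard deviation (6.9) quoted above. The function names `growth_percentile` and `met_typical_growth` and the sample growth values are hypothetical, chosen only for illustration; this is not NWEA’s norming procedure, just a way to see what “typical” means for this one group of students.

```python
from statistics import NormalDist

# Growth norm quoted in the text: first graders with a fall math RIT of 130.
NORM_130 = NormalDist(mu=22.6, sigma=6.9)


def growth_percentile(observed_growth, norm=NORM_130):
    """Share of the comparison group expected to grow less than observed_growth."""
    return norm.cdf(observed_growth)


def met_typical_growth(observed_growth, norm=NORM_130):
    """True when growth meets or exceeds the mean growth of the comparison group."""
    return observed_growth >= norm.mean


# Hypothetical fall-to-spring growth values (in RIT points), invented for illustration.
for growth in (15, 22.6, 30):
    share_below = growth_percentile(growth)
    print(f"growth={growth}: {share_below:.0%} of peers grew less, "
          f"met typical growth: {met_typical_growth(growth)}")
```

A student whose growth equals the mean sits at the 50th percentile by construction, which is why about half of any comparable group falls on each side of it.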

Here’s the important thing: while 23 points of growth might be typical for the first grader who has a fall RIT score of 130, it’s not typical for all first graders. First graders with fall math scores of 180 have a fall-to-spring growth distribution that has a mean of 13.0 and a standard deviation of 6.9, looking like this:

[Figure: bell curve of fall-to-spring growth for first graders with a fall math RIT score of 180 (mean 13.0, standard deviation 6.9)]

For these first graders, “typical” growth is only 13 points. The amount of growth typically achieved within a year is much less for these students than for the ones in the prior example, who had a lower fall RIT score. The point is that a “year’s worth of growth” (as defined by mean normative values) varies across kids of differing initial achievement, and across kids of different ages. Growth in achievement, just like growth in height, is not constant across all kids. Furthermore, students who meet typical growth are not necessarily on track to meet any external performance criteria, such as state proficiency or college readiness. These students are merely showing change that is at or above average, compared to other students like themselves.
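To make the dependence on starting score concrete, here is a minimal sketch of the school-level indicator mentioned earlier (the percentage of students meeting or exceeding mean growth for their own starting point), set against a single flat cutoff applied to everyone. The lookup table holds only the two norms quoted in this article; the roster, the function names, and the 23-point flat cutoff are hypothetical and exist only for illustration.

```python
from statistics import NormalDist

# Conditional growth norms quoted in the text, keyed by fall math RIT score.
# Only these two starting scores appear in the article; a real norms table
# would cover many more and is not reproduced here.
GROWTH_NORMS = {
    130: NormalDist(mu=22.6, sigma=6.9),
    180: NormalDist(mu=13.0, sigma=6.9),
}


def percent_meeting_typical(students):
    """Percent of students whose growth meets or exceeds the mean for their fall score."""
    met = sum(
        1 for fall, spring in students
        if (spring - fall) >= GROWTH_NORMS[fall].mean
    )
    return 100.0 * met / len(students)


def percent_meeting_flat_cutoff(students, cutoff):
    """Percent of students whose growth meets or exceeds one cutoff, ignoring start score."""
    met = sum(1 for fall, spring in students if (spring - fall) >= cutoff)
    return 100.0 * met / len(students)


# Hypothetical roster of (fall RIT, spring RIT) pairs, invented for illustration.
ROSTER = [(130, 155), (130, 150), (180, 195), (180, 190)]

print(percent_meeting_typical(ROSTER))          # 50.0
print(percent_meeting_flat_cutoff(ROSTER, 23))  # 25.0
```

In this made-up roster, the student who went from 180 to 195 gained 15 points: above typical for that starting score, yet counted as falling short under the flat 23-point cutoff. The same 15-point gain would be below typical for a student who started at 130.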

If you work with data from growth measures such as MAP, the next time you are asked what percentage of students show a year’s worth of growth, give the correct answer: 100%. But if you are asked what percentage of students show change that is “typical” or is consistent with (statistical, normative) expectations, don’t forget to take into account: 1) where students start, and 2) how much instructional time is in the test interval of interest.