Recently in the New York Times, Stanley Fish wrote a piece titled “The Two Cultures of Educational Reform.” Though his essay focused on higher education, his thoughts, and those of Derek Bok and William Bowen, about the limits and prospects of empirical, data-driven reform (assessment data) are equally relevant to K-12 education. Here is Fish quoting Bok:
“Some of the essential aspects of academic institutions — in particular the quality of the education they provide — are largely intangible and their results are difficult to measure.” Indeed, he adds, the “result is that much of what is important to the work of colleges and universities may be neglected, undervalued, or laid aside in the pursuit of more visible goals.”
Here, Bok is clearly worried about the limited scope of what we can measure, and about our tendency to do only what will be measured. I must say, now that the tools of empirical measurement are beginning to be used to look at universities, it is refreshing to have the past presidents of Harvard and Princeton express strong skepticism about these efforts, though in general they support the use of data to guide reform.
We have always claimed too much for our summative assessments in K-12 education. Even the last generation of assessments, those driven by the NCLB legislation, did a poor job of achieving their rather limited purposes: identifying schools that were doing a good job with all groups of students, and letting parents know whether their children were making adequate yearly progress (AYP). Proficiency levels across the states were all over the place, and the common argument is that most states set their proficiency bars too low to guarantee that students would exit high school adequately prepared for college or career. We all know stories of strong schools tripped up by the poor performance of one subgroup and labeled as “in need of improvement.” We are all aware of the effect of demographics on school performance in terms of AYP, and that other measures of student growth might show that many high-poverty schools are doing an excellent instructional job. The summative assessments were not even doing a good job of measuring the tangibles, let alone the intangibles that Bok references. And we all know how the intangibles (the arts, social studies, etc.) were marginalized in many schools in order to focus on reading and math.
Yet here we are, moving to a next generation of summative assessments from which we are going to expect much more than we did from the previous generation. We are asking these new tests to measure “career and college readiness” and, across the earlier grades, adequate yearly progress toward career and college readiness. I would argue that career and college readiness is so complex that it is one of the intangibles, or at least that it is a product of a high-quality education, many aspects of which are intangible. We have already addressed the fact that the Common Core State Standards (CCSS) do not define readiness, though they offer some benchmarks and suggest some instructional shifts that the authors assert are necessary for career and college readiness. Thus, it is hard to agree that simply aligning assessments to the CCSS will create assessments that tell us much about career and college readiness.
Additionally, these new-generation summative assessments are considerably longer, and proficiency cut scores are being set extremely high to reflect “career and college readiness.” Further, these assessments are being used to evaluate teachers. The question I ask is: are these CCSS summative tests worth the strain we are putting on our teachers and, more importantly, our students? We need to be very clear about what the CCSS-aligned summative assessments do measure, and very clear about what we are saying when we label large portions of our students as not proficient.