Living in Dialogue: Hocus Pocus History of High Stakes Tests

John Thompson

April 6, 2015

Did you hear the one about “Voodoo Economics?” President Ronald Reagan said that his “Supply Side Economics” would cut taxes, increase spending, and reduce the deficit!?!?

If a 22nd century historian were to uncover Reagan’s claim, and yet discover that all of the physical and digital records of the 1980s and several subsequent decades had been lost, it might be hard to prove that Reaganism didn’t reduce the deficit.

Perhaps the same agnosticism could apply to claims that No Child Left Behind boosted student performance as measured by the reliable NAEP tests – except for one reason. NAEP records are readily obtainable by a quick Google search.

NAEP data may not prove what I believe to be the best summary of the evidence – that NCLB and subsequent NCLB-type testing caused more harm than good for students. But, NAEP metrics do prove the intellectual dishonesty of the true believers who claim that high stakes testing has improved so-called “student performance.”

In fact, NAEP scores were increasing before NCLB and their growth slowed after NCLB testing took effect. The American Institutes for Research’s Mark Schneider, known by the conservative Fordham Institute as the “Statstud,” is just one scholar who documented this pattern, concluding “pre-NCLB gains were greater than the post-NCLB gains.”

Curiously, Schneider was also one of the true believers who first pushed the silly claim that NCLB deserves credit for test score gains that occurred before the law was enacted. Illogically, reformers claim test score increases from 1999 to the winter of 2002 were the result of a law that was enacted in the winter of 2002. The actual passage of NCLB high stakes testing was the tail of a “meteor” that was dubbed “consequential accountability.” And that brings us to the latest convoluted spin trying to deny that test-driven reform has failed. The most recent example is Tom Loveless’s “Measuring Effects of the Common Core.”

At least Loveless’s approach to the pre-NCLB effects of NCLB is much more modest. He claims that it is “unlikely” that accountability efforts and increased reform-related spending did not “influence” pre-NCLB NAEP scores. Even so, Loveless offers no credible reason to believe that increases in 1999 test results should be attributed to stakes attached to tests that were imposed three years later.

So, what evidence does Loveless offer for his conclusion that NCLB might deserve credit for the test scores that preceded it? He cites Derek Neal and Diane Whitmore Schanzenbach, who reported that “‘with the passage of NCLB lurking on the horizon,’ Illinois placed hundreds of schools on a watch list and declared that future state testing would be high stakes.”

Neal and Schanzenbach were studying Chicago schools, however, and they concluded the opposite. They report that “ISAT performance played a small role in the CPS rules for school accountability over this time (1999 to 2001).” Neal and Schanzenbach explain that “in one year, the ISAT went from a relatively low-stakes state assessment to a decidedly high stakes exam.” But, “in the springs of 1999, 2000, and 2001, CPS took the ISAT with the expectation that the results would not have significant direct consequences in terms of the state accountability system.”

By the way, Neal and Schanzenbach’s title, “Left Behind by Design,” is not exactly a ringing endorsement of Schneider’s and Loveless’s spin in favor of No Child Left Behind.

Loveless cites two other papers in his footnotes, but a careful reading of them provides evidence for both sides of the issue as to whether testing that preceded NCLB was actually high stakes, whether it changed behavior in positive or negative ways, or whether it contributed to improved test scores.

Rather than get bogged down in arcane details, policy-makers should remember that true believers in consequential accountability can’t even agree on which states had such systems. They acknowledge that the economy and demographic changes could explain changes in NAEP results, but they do not try to control for those factors. Moreover, test score growth of early adopters of accountability did not necessarily increase in comparison to their previous scores. And, the fact that early adopters had the money to increase education spending may explain subsequent (though often short-lived) gains in student performance. Above all, even the papers cited by Loveless help explain why NCLB-type testing, whether it helped some students or not, probably damaged other students.

While Loveless and others debate whether the effects of the No Child Left Behind Law should be seen as beginning in 1999 or 2003, they don’t seem aware of a more likely conclusion. When debating the effects of stakes attached to NCLB, shouldn’t we ask when stakes started to be attached to its tests? Schools started to receive additional NCLB funding in 2002, but for many or most of them, the law’s punitive measures did not take effect for another three years. There is no definitive answer, but I would argue that the 2005 NAEP is likely to be the single best indicator of the first effects of NCLB accountability provisions.

I would also argue that 8th grade NAEP reading scores should be seen as the most valuable metric when evaluating an education policy. A variety of reforms have been shown to improve math scores without improving reading, but I would argue that being able to read for comprehension is the much more important skill required for schooling and life. Also, boosts in 4th grade test scores that wash out by 8th grade are not as likely to change the educational and life trajectories of students.

But, here I go again. I’m once again trying to parse the details of quantitative research and why some academics seem allergic to asking how their statistical models relate to these pesky real-world issues. Instead, I should ask why non-educators who study education policy are so willing to cherry-pick evidence in support of the bizarre spin of reformers who seem to believe in teleology.

What do you think? Why are economists and think tanks so insensitive to the realities in school and yet so sensitive to the spin of corporate reformers?

This blog post has been shared by permission from the author.
Readers wishing to comment on the content are encouraged to do so via the link to the original post.
Find the original post here: Living in Dialogue

The views expressed by the blogger are not necessarily those of NEPC.