Living in Dialogue: Will Tom Toch Ever Find a Reform Bad Enough to Abandon?

John Thompson

August 25, 2016

In 2008, Tom Toch’s “Rush to Judgment” reviewed science-based alternatives to value-added teacher evaluations. Toch explained that the District of Columbia’s test-heavy IMPACT teacher evaluation system cost $1000 per teacher. For that price, we could have invested in proven, win-win alternatives such as the National Board for Professional Teaching Standards certification, the Toledo peer review plan, or Robert Pianta’s holistic approach to professional development. The United States spent $500 billion a year on education, and a $3 billion investment in better teacher evaluations should have been seen in the context of the $14 billion spent annually on professional development, which was often ineffective. Toch observed that some would like to use standardized test results in evaluations, but that those scores “aren’t great measures.” Then, in a sentence that seemed inexplicable after such an overview, he called for test scores to count for as much as 50% in teacher evaluations.

This was the time when corporate reformers took the opportunity to push their entire competition-driven model on the nation, and when Toch was involved in an even more confusing case of wordsmithing. Educational consultant Marc Dean Millot wrote in This Week in Education about Toch’s “Sweating the Big Stuff: A Progress Report on the Movement to Scale Up the Nation’s Best Charter Schools.”

Based on interviews with CMO insiders, publicly available data, and his own analysis, Toch presents a compelling indictment of the “new philanthropy’s” primary investment strategy for education reform. His arguments should be available to all and addressed on the merits. Instead, someone at EdSector hacked away at Toch’s evidence until it fit the rhetoric of CMO advocates.

My take on the paper’s confusing conclusion on the scaling up of charter schools was to emphasize the contradictions inherent in Toch’s paper:

But Toch then writes “the research for a report on CMOs that I’ve produced for the think tank Education Sector reveals that many of these organizations are going to be hard-pressed to deliver the many schools that Duncan wants from them.” … Toch describes “four dozen charter networks’ opening about 350 schools with some 100,000 seats over the past decade. This is a long ways from the 5,000 failing public schools. …” (emphasis mine)

Toch further wrote,

CMOs increasingly face the challenge of either paying their teachers more as they gain seniority, or churning through teachers and making it tougher to sustain their schools’ powerful cultures.

Also,

student attrition is high in CMO schools, fueled by higher standards, long hours, and transient families. A study of four San Francisco Bay Area KIPP middle schools found that 60 percent of entering students departed before graduating. The loss of revenue from so many departing students is devastating, but the price of bringing in replacement students is also high.

In retrospect, reading these two papers in tandem provides an insight into why the Duncan administration adopted the questionable strategy of including test score growth in teacher evaluations. That metric was intended to drive the reward-and-punish approach to both individual educators and the mass closures of schools. Now it is clear that high-stakes testing made both corporate reforms more destructive. And the rush to implement these mandates did not help. But, then as now, corporate reformers were not very good at facing up to the facts of life in actual schools.

The history of spinning research in order to support the desires of corporate reformers now repeats itself in Toch’s “Grading the Graders.” He writes:

Because of the speed, complexity, and reach of reform, the new teacher evaluation systems emerging in states and school districts are very much works in progress. Flawed, fractious, and incomplete, their return on investment is not yet fully visible.

Some states and districts are merely going through the motions of change, more compliant than committed to careful teacher evaluation and the opportunities it creates to improve staffing decisions and teacher performance.

Such a statement is representative of Toch’s method. His report lists the reasons why the Gates/Duncan approach to teacher evaluations failed, and why the value-added portion of the experiment was the most unsalvageable. Then, he seems to assume that their basic model must be salvaged. Toch thus exemplifies the education version of the old irony: we’re hurt by every Gates/Duncan data-driven mandate that we implement, but we’ll make it up on volume.

Toch notes the $6.7 million price tag for D.C.’s IMPACT in 2011-2012, and how the money for such expenditures is now gone. Today, “many state departments of education and local school districts suffered budget cuts in the wake of the recent recession that have made it harder to respond to demands for reform.”

As Toch seemed to understand eight years ago, the attempt to dramatically reform teacher evaluations, while instituting equally dramatic, untested, and hurried reforms – that were inherently contradictory to each other – created a number of Perfect Storms of dysfunction. For instance, Toch observes, “The pace and scope of change in teacher evaluation have been dramatic in recent years. Add to this the complexity of the reforms involved, and it’s no wonder a host of methodological and morale challenges has arisen that must be addressed if the reforms are to achieve their full potential to strengthen instruction, make teaching more attractive work, and raise student achievement.”

Why Toch believes, or claims to believe, that continuing to squander teachers’ energy and morale on these dubious policy gambles would make the profession more attractive is beyond me. Even more inexplicable is his willingness to stay the course even though “the reform goals of improving instruction in high-need schools and using student achievement to evaluate teachers are set in conflict with each other.” Even though he admits that the value-added portion of evaluations is inherently biased against teachers in high-challenge schools, and thus “top teachers have an incentive to avoid working in struggling schools,” Toch still claims that reducing the relative importance of that unfair metric, but not ending it, would encourage teachers to transfer to those schools!?!?

Nobody questions Toch’s knowledge about education, and he assembles a long list of reasons why value-added evaluations failed. The following are just a few of his points. It would have taken a huge effort to lay the foundation for even the qualitative part of the evaluations, but the process was rushed into place. The value-added portion isn’t “particularly helpful in identifying the differences among the nation’s many mid-range teachers,” who make up about 70% of the profession.

Despite the incredible amount of resources dedicated to creating metrics for the 70% of teachers in non-tested courses, no good solutions have been discovered for quantifying their effectiveness. For instance, some districts took the laughable approach of evaluating teachers based on the outcomes of students and subjects they didn’t teach. Others created student learning objectives, SLOs, but they were “also time-consuming and therefore expensive to create. And because they’re typically non-standardized assessments, they make dependable teacher-to-teacher comparisons next to impossible.” Moreover, the challenge of using tests to evaluate the 13% of teachers who teach special education was even greater.

Toch notes that increasing the number of standardized tests to provide metrics for all teachers would be expensive, but he doesn’t adequately address the worst problem. More high-stakes tests mean more teach-to-the-test malpractice. And, of course, that is the inherent flaw of the Gates/Duncan approach. It makes no sense to use a system for firing the bottom 5 to 10% that is virtually guaranteed to lower the effectiveness of the rest of the profession.

As anyone who read Toch’s 2008 paper should have anticipated:

Several elements of the evaluation reform movement have alarmed and angered many teachers, turning them against reform. These include the fast pace of mandated changes; the early focus on removing bad teachers; the heavy reliance on principals who turned out to be underprepared; the new, untested use of student achievement results; the simultaneous rollout of the Common Core curriculum and new tests; poor-quality feedback in many jurisdictions; and a lack of improvement resources.

Then he somehow manages to pin the blame for all this on the inflammatory press, unions that responded to their rank and file and backed off from trying to rescue the experiment, and Arne Duncan’s overreach.

I don’t claim to be objective. Like so many of the educators who were tasked with implementing the Gates/Duncan evaluation approach, I believe that, predictably, value-added evaluations were tied with Prohibition and the War on Drugs as the worst unforced social policy error of the last century. I would think that any objective observer would say that the hard-earned lesson should be that the reward-punish approach to evaluations should be repudiated, and a completely new approach adopted. The next era of school improvement should draw upon the promise of the Toledo Plan’s peer review and/or the National Board or Robert Pianta’s approach to professional development. However, Toch concludes, “The hard-learned lessons of the past several years suggest that building on that progress, staying the course on reform despite the dissolution of the Obama incentives, is in the best interests of both students and teachers.”

What do you think? Why would a respected expert, in 2008 and in 2016, lay out the problems with an expensive, untested, and risky policy, and then go along with it? When Toch’s criticisms of high-stakes testing for individuals are read next to the flags he raised about mass charterization, doesn’t that imply that corporate reformers ignored the facts because their goal was defeating teachers and unions?

This blog post has been shared by permission from the author.

The views expressed by the blogger are not necessarily those of NEPC.