Re-examination of results finds that the data undermine calls for the use of value-added models for teacher evaluations
BOULDER, CO (January 13, 2011) – A study released last month by the Gates Foundation has been touted as “some of the strongest evidence to date of the validity of ‘value-added’ analysis,” showing that “teachers' effectiveness can be reliably estimated by gauging their students' progress on standardized tests” [http://articles.latimes.com/2010/dec/11/local/la-me-gates-study-new-20101211]. However, according to professor Jesse Rothstein, an economist at the University of California at Berkeley, the analyses in the report do not support its conclusions. “Interpreted correctly,” he explains, they actually “undermine rather than validate value-added-based approaches to teacher evaluation.”
Rothstein reviewed Learning About Teaching, produced as part of the Bill & Melinda Gates Foundation’s “Measures of Effective Teaching” (MET) Project, for the Think Twice think tank review project. The review is published by the National Education Policy Center, housed at the University of Colorado at Boulder School of Education.
Rothstein, who in 2009-10 served as Senior Economist for the Council of Economic Advisers and as Chief Economist at the U.S. Department of Labor, has conducted research on the appropriate uses of student test score data, including the use of student achievement records to assess teacher quality.
The MET report uses data from six major urban school districts to, among other things, compare two different value-added scores for teachers: one computed from official state tests, and another from a test designed to measure higher-order, conceptual understanding. Because neither test maps perfectly to the curriculum, substantially divergent results from the two would suggest that neither is likely capturing a teacher’s true effectiveness across the whole intended curriculum. By contrast, if value-added scores from the two tests line up closely with each other, that would increase our confidence that a third test, aligned with the full curriculum teachers are meant to cover, would also yield similar results.
The MET report considered this exact issue and concluded that “Teachers with high value-added on state tests tend to promote deeper conceptual understanding as well.” But what does “tend to” really mean? Professor Rothstein’s reanalysis of the MET report’s results found that over forty percent of those whose state exam scores place them in the bottom quarter of effectiveness are in the top half on the alternative assessment. “In other words,” he explains, “teacher evaluations based on observed state test outcomes are only slightly better than coin tosses at identifying teachers whose students perform unusually well or badly on assessments of conceptual understanding. This result, underplayed in the MET report, reinforces a number of serious concerns that have been raised about the use of VAMs for teacher evaluations.”
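For a rough sense of scale, the comparison above can be sketched as a small simulation. This is an illustration only, not the MET report's or Rothstein's method: it assumes the two value-added scores are jointly normal, and the function name and the correlation values tried are hypothetical choices, not figures from the report. If the two scores were unrelated, about half of the teachers in the bottom quarter on one test would land in the top half on the other, which is the coin-toss baseline.

```python
import numpy as np


def share_bottom_quarter_in_top_half(rho, n_teachers=200_000, seed=0):
    """Share of teachers in the bottom quarter on one score who fall in the
    top half on a second score, when the two scores have correlation rho.

    Assumption (not from the report): both scores are standard normal.
    """
    rng = np.random.default_rng(seed)
    state = rng.standard_normal(n_teachers)
    noise = rng.standard_normal(n_teachers)
    # Build a second score whose correlation with the first is rho.
    alt = rho * state + np.sqrt(1.0 - rho**2) * noise

    bottom_quarter = state <= np.quantile(state, 0.25)
    top_half = alt > np.median(alt)
    return float(top_half[bottom_quarter].mean())


if __name__ == "__main__":
    # Illustrative correlations only; none is taken from the MET report.
    for rho in (0.0, 0.25, 0.5, 0.75, 1.0):
        share = share_bottom_quarter_in_top_half(rho)
        print(f"correlation {rho:.2f}: {share:.1%} of bottom-quarter teachers in top half")
```

The closer the printed share is to 50 percent, the less one score says about the other.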
Put another way, “many teachers whose value-added for one test is low are in fact quite effective when judged by the other,” indicating “that a teacher’s value-added for state tests does a poor job of identifying teachers who are effective in a broader sense,” Rothstein writes. “A teacher who focuses on important, demanding skills and knowledge that are not tested may be misidentified as ineffective, while a fairly weak teacher who narrows her focus to the state test may be erroneously praised as effective.” If those value-added results were used for teacher retention decisions, students would be deprived of some of their most effective teachers.
The report’s misinterpretation of the study’s data is unfortunate. As Rothstein notes, the MET project is “assembling an unprecedented database of teacher practice measures that promises to greatly improve our understanding of teacher performance,” and it may yet offer valuable information on teacher evaluation. However, the new report’s “analyses do not support the report’s conclusions,” he concludes. The true guidance the study provides, in fact, “points in the opposite direction from that indicated by its poorly-supported conclusions” and indicates that value-added scores are unlikely to be useful measures of teacher effectiveness.
Find Jesse Rothstein’s review on the NEPC website at:
Find Learning About Teaching: Initial Findings from the Measures of Effective Teaching Project, by Thomas J. Kane and Steven Cantrell, on the web at:
The Think Twice think tank review project (http://thinktankreview.org), a project of the National Education Policy Center, provides the public, policy makers, and the press with timely, academically sound reviews of selected think tank publications. The project is made possible in part by the generous support of the Great Lakes Center for Education Research and Practice.
The mission of the National Education Policy Center is to produce and disseminate high-quality, peer-reviewed research to inform education policy discussions. We are guided by the belief that the democratic governance of public education is strengthened when policies are based on sound evidence. For more information on NEPC, please visit http://nepc.colorado.edu/.
This review is also found on the GLC website at http://www.greatlakescenter.org/