Any attempt to evaluate teachers that is repeatedly described as "scientific" is naturally going to provoke rebuttals that verge on technical geek-speak. The MET Project's "Ensuring Fair and Reliable Measures of Effective Teaching" brief is one such attempt. MET was funded by the Bill & Melinda Gates Foundation.
At the center of the brief's claims are a couple of figures (“scatter diagrams” in statistical lingo) that show remarkable agreement between teachers' value-added (VAM) scores in Language Arts and Math across two consecutive years. The dots fall almost exactly on straight lines. A teacher with a high VAM score one year can be relied on to have an equally high VAM score the next, or so Figure 2 seems to say.
Not so. The dots in the scatter diagrams are not individual teachers' VAM scores but averages of groups of VAM scores. For some unexplained reason, the statisticians who analyzed the data for the MET Project report divided the 3,000 teachers into 20 groups of about 150 teachers each and plotted the average VAM score for each group. Why?
And whatever the reason might be, why would one do such a thing when it has been known for more than 60 years that correlating group averages grossly overstates the strength of the relationship between two variables? W.S. Robinson in 1950 named this the "ecological correlation fallacy." Please look it up in Wikipedia. The fallacy was used decades ago to argue that African-Americans were illiterate because the correlation between percent African-American and percent illiterate was extremely high when measured at the level of states. In truth, at the level of persons, the correlation is very much lower; we’re talking about differences as great as .90 for aggregates vs .20 for persons.
The fact that the average VAM score for a group of 150 teachers agrees with next year's average for the same 150 teachers gives us no confidence that an individual teacher's VAM score is reliable across years. In fact, such scores are not reliable, as several studies have shown repeatedly.
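For readers who want to see the effect rather than take it on faith, here is a small simulation sketch. Nothing in it comes from the MET data: the teacher scores are made up, the numbers signal_sd and noise_sd are arbitrary choices that put the person-level year-to-year correlation near .23, and ranking teachers on their year-one score before cutting them into 20 groups is my assumption about how such groups might be formed. It simply compares the correlation across 3,000 individual "teachers" with the correlation across the 20 group averages.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate(n_teachers=3000, n_groups=20, signal_sd=0.55, noise_sd=1.0, seed=0):
    """Return (individual-level r, group-average r) for made-up scores.

    Each hypothetical teacher has a stable component plus independent
    yearly noise, so the person-level correlation is roughly
    signal_sd**2 / (signal_sd**2 + noise_sd**2).
    """
    rng = random.Random(seed)
    stable = [rng.gauss(0, signal_sd) for _ in range(n_teachers)]
    year1 = [s + rng.gauss(0, noise_sd) for s in stable]
    year2 = [s + rng.gauss(0, noise_sd) for s in stable]

    r_individual = pearson(year1, year2)

    # Assumption: rank teachers on the year-one score, then cut the
    # ranking into equal-sized groups (150 each with the defaults).
    order = sorted(range(n_teachers), key=lambda i: year1[i])
    size = n_teachers // n_groups
    group_year1, group_year2 = [], []
    for g in range(n_groups):
        members = order[g * size:(g + 1) * size]
        group_year1.append(sum(year1[i] for i in members) / size)
        group_year2.append(sum(year2[i] for i in members) / size)

    r_group = pearson(group_year1, group_year2)
    return r_individual, r_group


if __name__ == "__main__":
    r_individual, r_group = simulate()
    print(f"correlation across individual teachers: {r_individual:.2f}")
    print(f"correlation across group averages:      {r_group:.2f}")
```

Averaging 150 teachers at a time washes out most of the year-to-year noise, so the 20 group dots line up far more neatly than the 3,000 individual dots do, even though no individual score has become any more stable.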
So we aren't going to fire groups of 150 teachers arbitrarily lumped together who might have low VAM scores, nor pay big bonuses to the high VAM group. Nor are we going to fire those teachers whose Language Arts VAM score is low, because the odds are substantial that the same teachers' Math VAM score might be average or even above. We would see that such teachers are hardly the exception if the authors of the MET Project brief had simply shown us scatter plots of individual teachers' VAM scores instead of having tripped up on Robinson's ecological correlation fallacy.
This blog post has been shared by permission from the author.
The views expressed by the blogger are not necessarily those of NEPC.