
The Problem with Value-Added Measures

Matthew Di Carlo, in a January 26th blog post discussing Florida's use of value-added assessment of teachers and schools, wrote:

I would argue that growth-based measures are the only ones that, if they’re designed and interpreted correctly, actually measure the performance of the school or district in any meaningful way....

Those italics are in the original, and they are a bit of a cop-out. In my opinion, there is no way to "design and interpret correctly" the various growth measures that have been proposed for measuring the contribution of a teacher, or even a group of teachers, to a group of children's learning. In the first place, any high-stakes, punitive system of measuring teachers for purposes of monetary rewards or other benefits produces not just teaching to the test (a problem so pervasive that even the President of the U.S. can talk about it in a State of the Union address) but also cheating...and before those feelings of moral outrage begin to take you over, please summon the honesty to admit that you would cheat too if placed in the same circumstance.

But there is another problem with value-added measures that is talked about much too infrequently.

Way back in the 1990s, while I was moderating an online discussion of the Tennessee Value-Added Assessment System (TVAAS, the work, I believe, of a University of Tennessee business professor named William Sanders), Sanders himself, and later his assistant and occasional co-author Sandra Horn, made brief appearances in the discussion and quickly retreated. That discussion, among a dozen or more scholars, runs to several thousand words and is available to anyone here:

I happened to be giving a talk in the mid-1990s at a conference in Denver where Sanders was also speaking. After my talk I was approached by a young woman who identified herself as Horn and asked if I had time for a brief conversation. "Yes, certainly." Horn started off by saying that Sanders and she felt that if I just understood a few things about TVAAS, the objections I had expressed in the online discussion would surely be cleared up. "Try me."

For 15 minutes I listened to descriptions of TVAAS that were entirely irrelevant to my objections. Finally I interrupted:

GVG: Let me pose a hypothetical to you. Suppose that there are two classes of children and that Class A and Class B are taught by two teachers who teach in exactly the same way. In fact, every word, action, and thought they produce is identical. And suppose further that these two groups of children begin the school year with identical knowledge acquired in the past. Now here is the critical assumption. Suppose that the pupils in Class A have an average IQ of 75, and the pupils in Class B have an average IQ of 125. Do you believe that your measure of teacher value-added will produce the same numeric value for these two teachers?

SH: Yes.

Rather than deliver an impromptu lecture on the difference between aptitude (mental ability, a portion of which is undeniably inherited) and school achievement, I excused myself.

And such is the Achilles' heel of all the so-called value-added assessment systems. They act as though statistically equating groups of students on achievement tests (fallible as that equating is) has held all other influences constant (ceteris paribus), and hence that the gain score is a valid and fair measure of the contribution of a teacher or a school to learning. It is not, and it never will be.
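The hypothetical posed to Horn can be made concrete with a toy calculation. This is a sketch under an assumed model, not TVAAS itself: suppose aptitude moderates how much of identical instruction converts into measured achievement gain (the multiplier here is a hypothetical illustration, and the numbers are invented). Then two teachers who teach identically produce different gain scores, which is exactly the confound a value-added measure would misattribute to the teachers.

```python
# Toy model (hypothetical, for illustration only): measured gain is
# instruction delivered, scaled by an assumed aptitude multiplier.
# Nothing here reproduces TVAAS; it only dramatizes the confound.

def expected_gain(instruction_points: float, mean_iq: float,
                  baseline_iq: float = 100.0) -> float:
    """Return the class's measured gain under the toy model.

    The multiplier (mean_iq / baseline_iq) is an illustrative
    assumption, not an estimated parameter from any real system.
    """
    return instruction_points * (mean_iq / baseline_iq)

identical_instruction = 10.0  # both teachers deliver the same teaching

gain_a = expected_gain(identical_instruction, mean_iq=75)   # Class A
gain_b = expected_gain(identical_instruction, mean_iq=125)  # Class B

print(gain_a)  # 7.5
print(gain_b)  # 12.5
```

Under this assumed model, identical teaching yields unequal gain scores, so attributing the difference to the teachers, as Horn's "Yes" implies, cannot be right.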

This blog post has been shared by permission from the author.
Readers wishing to comment on the content are encouraged to do so via the link to the original post.
Find the original post here:

The views expressed by the blogger are not necessarily those of NEPC.

Gene V Glass

Gene V Glass is a lecturer in the Connie L. Lurie College of Education of San José State University. He is also currently a Senior Researcher at the National Educ...