Janresseger: Northwestern University Economist Uses Data to Prove Students’ Test Scores Fail to Measure Quality Teaching

Jan Resseger

November 20, 2018

Mike Rose, a UCLA education professor, understands a lot about teaching. In his extensive writing about education, Rose explains good teaching with precision and insight. He capped a four-year visit to excellent classrooms across the United States by publishing the story of those teachers in Possible Lives. He has also written widely about what good teachers do and what ought to be considered when teachers are evaluated.

Rose explains: “Teaching done well is complex intellectual work, and this is so in the primary grades as well as Advanced Placement physics. Teaching begins with knowledge of subject matter, of instructional materials and technologies, of cognitive and social development. But it’s not just that teachers know things. Teaching is using knowledge to foster the growth of others. This takes us to the heart of what teaching is…. The teacher sets out to explain what a protein or a metaphor is, or how to balance the terms in an algebraic equation, or the sociological dynamics of prejudice, but to do so needs to be thinking about how to explain these things: what illustrations, what analogies, what alternative explanations when the first one fails. This instruction is done not only to convey particular knowledge about metaphors or algebraic equations, but also to get students to understand and think about these topics. This involves hefty cognitive activity, … but the teacher is doing it with a room full of young people—which brings a significant performative dimension to the task.”

Rose continues: “Thus teaching is a deeply social and emotional activity. You have to know your students and be able to read them quickly, and from that reading make decisions to slow down or speed up, stay with a point or return to it later, connect one student’s comment to another’s. Simultaneously, you are assessing on the fly Susie’s silence, Pedro’s slump, Janelle’s uncharacteristic aggressiveness. Students are, to varying degrees, also learning from each other, learning all kinds of things, from how to carry oneself to how to multiply mixed numbers. How teachers draw on this dynamic interaction varies depending on their personal style, the way they organize their rooms, and so on—but it is an ever-present part of the work they do.”

Rose further describes characteristics of the classrooms created by excellent teachers and what he has observed about how teachers continue to improve their practice through their careers: “The classrooms were safe. They provided physical safety…. but there was also safety from insult and diminishment… And there was safety to take intellectual risks… Intimately related to safety is respect… Respect also has a cognitive dimension. As a New York principal put it, ‘It’s not just about being polite—even the curriculum has to be challenging enough that it’s respectful.’ Talking about safety and respect leads to a consideration of authority… A teacher’s authority came not just with age or with the role, but from multiple sources—knowing the subject, appreciating students’ backgrounds, and providing a safe and respectful space. And even in traditionally run classrooms, authority was distributed. Students contributed to the flow of events, shaped the direction of discussion, became authorities on the work they were doing. These classrooms, then, were places of expectation and responsibility.”

As Rose discusses the characteristics of good teaching, he is not evaluating teachers by standardized test scores in language arts and mathematics. Arne Duncan’s U.S. Department of Education made the evaluation of teachers by their students’ test scores a condition for states to qualify for Race to the Top grants and No Child Left Behind waivers. Major research bodies, namely the American Statistical Association and the American Educational Research Association, have nonetheless condemned the use of test scores and econometric value-added measures of teacher quality because of their unreliability.

Now Northwestern University economist C. Kirabo Jackson has developed an econometric model demonstrating that students’ test scores, including those based on value-added models, miss the most important characteristics of teachers: the sort of qualitative characteristics Mike Rose so clearly describes. In Education Next, Jackson explains: “I find that, while teachers have notable effects on both test scores and non-cognitive skills, their impact on non-cognitive skills is 10 times more predictive of students’ longer-term success in high school than their impact on test scores. We cannot identify the teachers who matter most by using test-score impacts alone, because many teachers who raise test scores do not improve non-cognitive skills, and vice versa. These results provide hard evidence that measuring teachers’ impact through their students’ test scores captures only a fraction of their overall effect on student success.”

Jackson creates a behavior index to measure whether students act out, skip class or fail to hand in homework. He concludes: “A student whose 9th-grade behavior index is at the 85th percentile is a sizable 15.8 percentage points more likely to graduate from high school on time than a student with a median behavior index score. I find a weaker relationship with test scores…”

Jackson then studies specific teachers and whether the same teachers demonstrate facility at raising students’ language arts and math scores, on the one hand, and improving their behavior, on the other. His data demonstrate that, “(M)any teachers who are excellent at improving one skill are poor at improving the other, but also that knowing a teacher’s impact on one skill provides little information on the teacher’s impact on the other.” You will need to read Jackson’s new article to learn his methodology for evaluating teachers’ impact on students’ behavior.

Jackson concludes: “These results confirm an idea that many believe to be true but that has not been previously documented—that teacher effects on test scores capture only a fraction of their impact on their students. The fact that teacher impacts on behavior are much stronger predictors of their impact on longer-run outcomes than test-score impacts, and that teacher impacts on test scores and those on behavior are largely unrelated, means that the lion’s share of truly excellent teachers, those who improve long-run outcomes—will not be identified using test-score value added alone… This analysis provides the first hard evidence that such contributions to student progress are both measurable and consequential.”

While I’m delighted that Jackson’s research has exposed a tragic flaw in the practice of judging teachers by their students’ standardized test scores, I hope this new research does not prompt policymakers to start demanding that states use Jackson’s methodology to measure teachers’ impact on student behavior.

Thank goodness, Jackson himself suggests a more qualitative approach: “To fully assess teacher performance, policymakers should consider measures of a broad range of student skills, classroom observations, and responsiveness to feedback alongside effectiveness ratings based on test scores.” In other words, Jackson acknowledges that experts on pedagogy, people like Mike Rose, know what they are talking about when they analyze and describe excellent teaching.

This blog post has been shared by permission from the author.

The views expressed by the blogger are not necessarily those of NEPC.