Should Computers Grade Essays?

Todd Farley is the scourge of standardized testing. His book, “Making the Grades,” is a shocking exposé of the industry. Todd spent nearly 15 years scoring tests, and he knows the tricks of the trade.

In this article, he skewers the latest testing craze: machine-scoring of essays.

Having demonstrated the fallibility of humans who score essays, Farley is no more impressed by computer scoring. As he puts it:

“…the study’s major finding states only that “the results demonstrated that overall, automated essay scoring was capable of producing scores similar to human scores for extended-response writing items.” A paragraph on p. 21 reiterates the same thing: “By and large, the scoring engines did a good [job] of replicating the mean scores for all of the data sets.” In other words, all this hoopla about a study Tom Vander Ark calls “groundbreaking” is based on a final conclusion saying only that automated essay scoring engines are able to spew out a number that “by and large” might be “similar” to what a bored, over-worked, under-paid, possibly-underqualified, temporarily-employed human scorer skimming through an essay every two minutes might also spew out. I ask you, has there ever been a lower bar?”

Farley quotes the promoters of automated scoring, who say that the machines are faster, cheaper, and more consistent than human scorers. The promoters also stand to make money.

He concludes: “Maybe a technology that purports to be able to assess a piece of writing without having so much as the teensiest inkling as to what has been said is good enough for your country, your city, your school, or your child. I’ll tell you what though: Ain’t good enough for mine.”

One of the responses to Farley’s post came from Tom Vander Ark, who is a tech entrepreneur and a target of Farley’s post.

Vander Ark wrote: “The purpose of the study was to demonstrate that online essay scoring was as accurate as expert human graders and that proved to be the case across a diverse set of performance tasks. The reason that was important is that without online scoring, states would rely solely on inexpensive multiple choice tests. It is silly to suggest that scoring engines need to ‘understand,’ they just need to score at least as well as a trained expert grader and our study did just that.”

A reader of this blog saw this exchange on Huffington Post and sent me this comment:

“Diane–we use an automated essay scorer at my school, and I have seen coherent, well-thought out writing receive scores below proficient, while incoherent, illogical writing (with more and longer words, and a few other tricks that automated scorers like) receive high scores. The students who suffer the most are the highest level students, the verbally gifted writers who write with the goal of actually being understood, “silly” as that may be.”

“In fact, all standardized testing penalizes the brightest students–those who think outside the box. Standardized testing is the box.”

This blog post, which first appeared on the author's blog, has been shared by permission from the author.
Readers wishing to comment on the content are encouraged to do so via the link to the original post.
The views expressed by the blogger are not necessarily those of NEPC.

Diane Ravitch

Diane Ravitch is Research Professor of Education at New York University and a historian of education. She is the Co-Founder and President of the Network for Public Education.