High Stakes Testing
by
Gerald Bracey, Ph.D.
Center for Education Research, Analysis, and Innovation
School of Education
University of Wisconsin-Milwaukee
PO Box 413
Milwaukee WI 53201
414-229-2716
December 5, 2000
CERAI-00-32
An Education Policy Project Briefing Paper
“This is the year that U.S. schools went test-crazy.” Thus reads the opening line of an April 16, 2000, article by David Bacon in the Oakland Tribune.[1] The statement contains one inaccuracy: the schools didn’t so much go crazy for the tests as they went crazy trying to cope with the tests imposed on them by governors, legislators, and state boards of education, all cheered on by business and industry. That quibble aside, Bacon captured the feelings of many people observing education.
To be sure, this was not a sudden madness. In the 1970’s and early 1980’s, some 35 states had adopted some version of a “Minimum Competency Test” to assure that high school diplomas were not based on so-called social promotion, seat time, or both. In 1977 a report on the apparent decline in SAT scores made every minute change in those scores front page news, at least when the scores went down; when the scores went up, that result got buried with local news.
The new illness, though, was more virulent. When Bacon penned his article, all but one state had adopted or created standards for public school students and 41 had adopted or constructed tests for measuring and passing judgment on student performance.
Students now are more at risk of not graduating than in the era of Minimum Competency Testing because the tests are tougher or the scores needed to pass are unrealistically high. Fully 90% of the students in Arizona failed at the first administration, and if the failure rates continue, over half of Arizona’s sophomores will not graduate in 2002.[2] In Virginia, 98% of the schools failed the first administration of its new state tests, 93% the second.[3, 4] In addition, students are being retained in grade or forced to attend summer school based on test scores. Proposals exist to start testing students in kindergarten. Teachers are warned that their raises, bonuses or even their jobs are on the line. Principals and superintendents suffer similar threats. While the emphasis has been on the negative, on rarer occasions the bonuses of teachers, principals and superintendents are tied to specific test score gains.
Whereas tests were once used largely as monitoring devices, they now have enormous consequences for many people. Hence the catch-phrase “high-stakes testing.”
Looking at the frenzy about testing, two questions immediately come to the fore. The first: Why? The second: Are the testing programs having their desired impact? The short answer to the first question is, “A loss of trust in teachers and administrators.” The answer to the second is, “No.” The balance of this report considers each question in detail, in turn.
Why Did Americans Become Nervous About Their Public Schools?
American public schools have always suffered from criticism, but the critics became more numerous and more vocal shortly after World War II. As the nation moved towards universal secondary education, it was also engaged in a space and arms race with the Soviet Union. For the first time, schools were perceived as integral to national defense. Colleges would, of course, prepare the engineers, scientists and mathematicians needed to meet the Red Menace, but those colleges and universities had to start with the products of high schools, and there was looming anxiety in some quarters that these schools were falling short. Rising enrollments and graduation rates heightened anxieties, as people feared those increases reflected a decline in rigor. When the Russians launched Sputnik in 1957, the beeps emitted by that small satellite proved to the critics that they had been right.
A quarter-century later schools were hit once again, this time with the publication of a paper Sputnik, “A Nation at Risk.” The 1983 report was, the New York Times observed in 1997, merely propaganda, but it was not recognized as such in many quarters at the time.[5] Its highly selective and negatively spun statistics were used as a clarion call to overhaul the schools.
The anxiety people might feel about their schools was heightened by the fact that, as journalist Peter Schrag observed, good news about schools served no one’s political education reform agenda.[6] The Reagan and Bush administrations, pushing privatization, vouchers and tuition tax credits, actively suppressed positive data where they could and ignored positive data where they could not actually control the flow of information. Thus, a 1992 international study in mathematics and science, which found American ranks mostly (but not entirely) low, was given a large press conference by the U.S. Department of Education.[7] An international study in reading that found American students second in the world was ignored.[8] A large analysis of the U.S. public school system by Sandia National Laboratories engineers was suppressed for being too positive. It was finally published after the Clinton administration arrived, but was seen by few people.[9] U.S. Department of Education officials denied that the report was suppressed, but Lee Bray, the now-retired Vice President of Sandia National Laboratories responsible for the report, is emphatic that it was.[10]
The Clinton-Gore years have seen an increased press for additional resources for public schools, but they have emphasized the problems of schools that require the resources. American universities use a similar approach in their attempts to obtain funding from governments and foundations. And, as they have for the last 100 years, business and industry have found American education wanting and have tried to prescribe what is to be taught.
The consequence of this negativity coming from so many sources is that virtually everyone is willing to believe the worst about the schools. For instance, in the mid-1980’s two lists appeared showing the most serious problems in the schools in the 1940’s and in the 1980’s. In the 1940’s, schools were plagued by students talking out of turn, not raising their hands, chewing gum in class, and breaking in line. In the 1980’s, drugs, violence, gangs and teen pregnancy had become the most serious problems. Yale University professor Barry O’Neill found that many people along the entire political spectrum assumed that the lists were based on research and were true. O’Neill revealed them as a hoax.[11]
All of the above events contributed to a feeling that the people running the schools could not be trusted to provide accurate information on what students were or were not learning. Something more objective was needed, something that did not depend on the subjective judgments of teachers. That something in most instances turned out to be a test.
Is the public’s nervousness warranted? Not according to the data, mostly test data, that exist. There are many aspects of schooling that cannot be measured with tests, but tests are the major source of data we currently have that permit comparisons of schools, states, or nations. What do these tests show?
· Scores on standardized achievement tests attained record high levels in the mid- to late 1980’s and remain there.[12]
· Scores on the National Assessment of Educational Progress have risen to all-time highs. Gains have been especially dramatic for blacks and Hispanics.[13]
· The proportion of students scoring above 650 on the SAT mathematics section attained record levels around 1995 and has remained at the all-time high.[14] This cannot be accounted for by Asian-American students, who are too few in number, constituting some 9% of all SAT test-takers.[15] Of the 75% increase between 1981 and 1995, black, white, Hispanic and Native American students accounted for 57%.[16]
· The number of students taking Advanced Placement examinations has risen from just over 1000 in 1961 to over 1,000,000 currently.[17]
· American students are second in the world in reading.[18]
It would thus seem that the condition of public education, insofar as it can be adequately assessed by existing test instruments, shows no cause for alarm. Even if the raison d’être of the high stakes testing programs were missing, they could still be acceptable programs if they were shown to be causing achievement to increase. A check of the data, though, not only fails to find such improvements, but uncovers a gaggle of unfortunate outcomes.
Examining these outcomes results in an extended answer to the second question: Are the tests having their intended impact?
Before answering the question, we should consider carefully the words of Robert L. Linn, probably the most respected psychometrician in the nation. In the March 2000 Educational Researcher, Linn examined the evidence on the impact of high-stakes testing and offered this assessment:
As someone who has spent his entire career doing research, writing, and thinking about educational testing and assessment issues, I would like to conclude by summarizing a compelling case showing that the major uses of tests for student and school accountability during the past 50 years have improved education and student learning in dramatic ways.
Unfortunately, that is not my conclusion. Instead, I am led to conclude that in most cases the instruments and technology have not been up to the demands that have been placed on them by high-stakes accountability. Assessment systems that are useful monitors lose much of their dependability and credibility for that purpose when high stakes are attached to them. The unintended negative effects of the high-stakes accountability uses often outweigh the intended positive effects.[19]
Before examining these “unintended negative consequences,” it should be noted that there is a gap between what test scores reveal and what people want to know. New York Times reporter Anemona Hartocollis put it this way: “In the war of perception against reality, almost nothing can be harder to gauge than the meaning of test scores…Yet parents and teachers are encouraged [to use tests] to judge their children and schools the way investors watch the Dow industrials.”[20]
Thus, one of the negative consequences of high-stakes testing is to drive a wedge between parents and their children. Parents, having watched their children for years, have a feel for “what they are about.” But the test might say otherwise. Fortunately, most parents are skeptical about what tests say. A poll by the American Association of School Administrators found that two thirds of parents say a test can’t measure a child’s progress and half say that tests don’t reflect what children know.[21]
On a more societal level, high stakes testing is increasing social stratification. On the Virginia Standards of Learning U.S. History test, required for graduation, only 13% of black students and 23% of Hispanic students passed, compared with 40% of white students. And this was on the second administration, after a year of intense preparation for the test. Similar gaps were found on all tests. For instance, 76% of white students passed the Algebra I test, while only 36% of blacks and 49% of Hispanics scored high enough to pass.[22] When statewide tests were introduced in Texas, the dropout rates for black and Hispanic students rose sharply and have not returned to previous levels.[23]
These tests have been presented by people such as Diane Ravitch and E. D. Hirsch, Jr., as engines of social justice. By providing a universal set of standards to which all must measure up, they reason, schools serving poor and minority pupils can be held accountable for improving their performance. This argument has a certain appeal, but it would make sense only if test scores were used to help diagnose which schools need additional resources in order to meet the needs of more troubled student populations. More often, however, the strongest advocates of high-stakes testing either are silent on this point, retreat to the argument that school funding doesn’t matter, or advocate outright penalties for schools, or their administrators, whose students have the lowest scores. They fail to explain how, given the enormous score differences that favor affluent and majority students, such tests will improve the chances of success for poor and minority children.
Under the gun of the tests, teachers are abandoning their usual curricula and modes of teaching to lecture about test-oriented material. In many instances, they are omitting aspects of the curriculum not on the test. One local school board in a large Virginia district held a special session to determine if it needed to mandate recess for its elementary schools because so many of them had abandoned it in favor of test preparation.[24] In Texas, where science and social studies were not initially included in testing, teachers reported that those subjects virtually disappeared. When the science and social studies tests appeared, science and social studies were quickly geared to what those tests tested.
Tests can easily misrepresent the achievements of a school. For instance, six high schools in Miami-Dade and Broward County, Florida, made a list of the top 100 high schools in the entire nation, based in part on the number of Advanced Placement examinations taken per student.[25] Yet in the Florida state accountability system, which grades schools from A to F, all six of these same schools received a grade of C.
To be sure, there are limitations to using AP exams as a measure of quality, but the differing pictures painted by the different measures point to another problem afflicting many high-stakes programs: severe judgments are being made on the basis of a single test score. The standards for test use promulgated jointly by the American Psychological Association, the American Educational Research Association and the National Council on Measurement in Education say clearly that no decisions about human beings should be made in such a way.[26] Even the commercial test developers who are realizing enormous profits from the test boom concur.
Parents, teachers and students are rebelling against these tests in various ways, another indication that the people most affected by the tests do not find them to be healthy for children. Some parents simply refuse to permit their children to take the tests, while others openly organize for their repeal. Some students, on pain of suspension, refuse to take the tests. And some principals and teachers, faced with instruments they do not believe validly assess what their students know, cheat.
Everyone in the nation supports efforts to improve schools. But there is a growing realization among many, perhaps most, people that the imposition of high stakes tests carries, as Linn wrote, unintended negative consequences that defeat the purpose of improvement. It is time for those who would govern our nation and our communities to offer a more thoughtful and humane program for holding schools accountable.
1. Bacon, David, “Value of School Testing Open to Question.” Oakland Tribune, April 16, 2000.
2. Flannery, Pat, and Pearce, Kelly, “AIMS Off Mark, Graduations at Risk.” Arizona Republic, Sept. 28, 2000, p. A1.
3. Mathews, Jay, and Benning, Victoria, “97 Percent of Schools in Virginia Fail New Exams.” Washington Post, January 9, 1999. The figure was later amended to 98%.
4. Benning, Victoria, and Mathews, Jay, “Test Fail 93% of Schools in Virginia.” Washington Post, August 14, 1999, p. A1.
5. Applebome, Peter, “Dire Prediction Deflated: Johnny Can Add After All.” New York Times, June 11, 1997, p. A31.
6. Schrag, Peter, “The Near-Myth of Our Failing Schools.” Atlantic Monthly, October, 1997.
7. Lapointe, Archie E., Mead, Nancy A., and Askew, Janice M., Learning Mathematics. Lapointe, Archie E., Askew, Janice M., and Mead, Nancy A., Learning Science. Princeton, New Jersey: Educational Testing Service. Report Nos. 22-CAEP-01 and 22-CAEP-02.
8. Elley, Warwick P., How in the World Do Students Read? The Hague, 1992. Available in the United States through the International Reading Association, Newark, Delaware.
9. Carson, C. C., Huelskamp, R. L., and Woodall, T. D., “Perspectives on Education in America.” Journal of Educational Research, May/June, 1993, pp. 249-311.
10. Personal communication, August 2000.
11. O’Neill, Barry, “Anatomy of a Hoax.” New York Times Sunday Magazine, March 6, 1994, pp. 46-49.
12. Hoover, H. D., director, Iowa Testing Programs, Iowa City. Data supplied in personal communication, 1998; publication forthcoming.
13. NAEP 1999 Trends in Academic Progress. Office of Educational Research and Improvement, U.S. Department of Education, August, 2000, Report No. NCES-2000-469.
14. Bracey, Gerald W. Unpublished analyses.
15. “Profiles of College Bound Seniors, 2000.” New York: The College Board, August, 2000.
16. Bracey, op. cit.
17. Advanced Placement Yearbook, 2000. New York: The College Board.
18. Elley, op. cit.
19. Linn, Robert, “Testing and Accountability.” Educational Researcher, March, 2000, pp. 4-15.
20. Hartocollis, Anemona, “Test Scores Are Up. From This We Can Conclude?” New York Times, June 11, 2000.
21. Ibid.
22. “Assessments: State & Fairfax County Public Schools Passing Rates.” Data: Virginia Standards of Learning. Supplied by Office of Testing and Evaluation, Fairfax County Public Schools, 11/4/99.
23. Haney, Walter, “The Myth of the Texas Miracle in Education.” Education Policy Analysis Archives, available at: http://olam.ed.asu.edu/epaa/v8n41/.
24. Sinha, Vandana, “Give Kids Recess, Virginia Beach Parents Urge.” Norfolk (VA) Virginian-Pilot, March 21, 2000, p. B1.
25. “The 100 best high schools.” Newsweek, March 13, 2000, pp. 51-53.
26. The American Psychological Association, the American Educational Research Association, and the National Council on Measurement in Education. Standards for Educational and Psychological Testing. Washington, DC, 1985.