Jersey Jazzman: NJ Court Strikes Down Graduation Test; An Opportunity to Re-Think Testing?

Miss me?

Yesterday, a New Jersey appellate court struck down regulations that required students to pass two PARCC tests -- the statewide tests implemented a few years ago under the Christie administration -- as a requirement to graduate. Sarah Blaine has an excellent legal analysis of the ruling that you should read. Let me add a few thoughts to it, coming less from a legal perspective and more from an educational one:

SOSNJ has posted a copy of the ruling on its Facebook page. The ruling notes that back in 1979 the NJ Legislature called for "'a statewide assessment test in reading, writing, and computational skills . . . .' N.J.S.A. 18A:7C-1. The test must 'measure those basic skills all students must possess to function politically, economically and socially in a democratic society.' N.J.S.A. 18A:7C-6.1."

As Sarah points out, state statutes are usually vague when it comes to things like setting standards; functioning "politically, economically and socially" will mean different things to different people. It's also not clear why the Legislature thought a test was necessary for graduation. Was it to uphold the value of a New Jersey diploma by making sure all holders could meet a certain standard? Was it to hold districts accountable for their programs? Was it to make sure the state, as a whole, was providing the resources needed for students to have educational success?

Or is it possible the Legislature never really thought about the purpose of the test? I'm not sure; I do, however, know that in the time since the passage of the law, psychometricians have been giving long and hard thought to something called validity: in the most basic terms, whether a test measures what we want it to measure, and whether it should be used for the purposes for which it is used.

The passage of the federal No Child Left Behind Act in 2002 gave us gobs of test outcome data, which have been used by policymakers and researchers to evaluate and justify educational policies. Unfortunately, many of these folks have failed to ask the most basic question of the tests they rely on: are they valid measures of what we want to measure?

This might seem picky: a test is a test, right? Why can't we just say a kid needs to pass a test to graduate? It sounds simple at first... until you get to the problem of what to test, and whether you should use a test for purposes other than those for which it was originally designed, and even what outcome should be considered "passing."

As I've noted before, there's been a lot of fuzzy thinking about this over the last few years. "College and career ready," for example, has been held up as a standard all children should meet. But it's a meaningless phrase, artificially equating admission to institutions that, by design, only accept a certain percentage of the population with being able to participate in the workforce. "College and career ready" essentially means that everyone should be above average, a logical impossibility.

Clearly, the NJ Legislature didn't mean to set such a standard. On its face, the law is calling for a test that shows whether or not a student has reached a level of education that allows them to participate in society. Which brings us to the PARCC tests in question: the Algebra 1 and English Language Arts (ELA) Grade 10 tests.

Forget for a moment whether these tests "measure those basic skills all students must possess to function politically, economically and socially in a democratic society." Because it's a hard enough lift just showing that these tests measure a student's abilities in language and algebra in a way that is valid and reliable. I'm going to be way too simplistic here, but...

By valid, we mean the extent to which the test is actually measuring what it purports to measure. This is much trickier than many people are willing to admit: take, for example, word problems on an algebra test. Do they measure the ability to apply mathematical concepts to real-world situations... or do they measure a student's proficiency in the English language?

For a test to be valid, we have to present some evidence that our interpretation of the test's score can be used for the purposes we have set out. If, for example, a mathematically adept student can't pass an algebra test because they aren't able to read the questions in English, we have a potential validity problem -- we may not have a meaningful measure of what we want to measure. Maybe we want to use the test to place the student in the correct math class. It could be our test outcome isn't giving us the feedback we need to make the right decision -- we have to give some reason to believe it is.

By reliable, we mean that the test can consistently gauge a student's abilities. Do test scores vary, for example, based on whether the test is taken on a computer or on paper? Do they vary based on the weather? All sorts of unobserved factors can influence a test outcome, and some tests vary more on these factors than others.

Validity and reliability are actually closely intertwined. What we should remember, however, is that test outcomes are always measured with error, and will vary due to differences in things we don't want to measure. This applies to the PARCC tests -- especially when we use those tests to determine whether a student should receive a diploma, a task for which they were not designed.

In yesterday's ruling, the court points to two problems with using the PARCC tests to fulfill the mandates of the state's law:

We hold N.J.A.C. 6A:8-5.1(a)(6), -5.1(f) and -5.1(g) are contrary to the express provisions of the Act because they require administration of more than one graduate proficiency test to students other than those in the eleventh grade, and because the regulations on their face do not permit retesting with the same standardized test to students through the 2020 graduating class. As a result, the regulations as enacted are stricken.

It's worth noting that the court said there may be other problems with the regulations calling for the use of PARCC, but that these two problems -- the tests aren't equivalent to an eleventh grade test, and there are no provisions to allow retesting -- are enough to overturn the regulations.

What the court is basically doing here is calling into question the validity and the reliability of the PARCC for the purpose of granting a diploma. The validity problem comes from the fact that neither the Algebra 1 nor the ELA-10 exam is measuring what the law says it's supposed to measure: whether an 11th grader is able to "function politically, economically and socially in a democratic society." How could they? They're not 11th grade tests!

The court rightfully restrained itself from going any further than this basic flaw in the regulations, but I'm under no such restriction, so let me take this a step further. Where has anyone ever made the case that passing the Algebra 1 exam is a valid measure of the "computational" skills needed to be a fully capable citizen? Keep in mind that NJDOE's own guide to the exam says that, among other tasks, Algebra 1 students should be able to:

Identify zeros of quadratic and cubic polynomials in which linear and quadratic factors are available, and use the zeros to construct a rough graph of the function defined by the polynomial.

Graph the solutions to a linear inequality in two variables as a half-plane (excluding the boundary in the case of a strict inequality), and graph the solution set to a system of linear inequalities in two variables as the intersection of the corresponding half-planes. 

Given a verbal description of a linear or quadratic functional dependence, write an expression for the function and demonstrate various knowledge and skills articulated in the Functions category in relation to this function.

The passing rate last year for the Algebra 1 test was only 46 percent. I know this is a cliche, but it rings true: given that rate, how many members of the NJ Legislature or the State Board of Education could pass the PARCC Algebra 1 test right now? If they can't, does that mean they don't have anything to contribute to our state?

One of the reasons the senior leaders of the NJDOE under the previous administration pushed so hard for a switch to the PARCC was that the previous math tests (the NJASK) had what are called "ceiling effects" -- basically, too many kids were getting perfect (or close to perfect) scores on the test. PARCC cheerleaders told us this was a huge problem; we needed to be able to sort the kids at the top of the distribution because... uh... reasons?

The PARCC Algebra 1 is not, therefore, measuring whether a student meets a basic level of achievement in math. It's a test that is attempting to gauge algebra ability, and it includes plenty of items that have low passing rates so as to tease out who is at the top of the score distribution. On its face, therefore, the test is not suitable for the purposes set out in law -- a point both Sarah and Stan Karp of the Education Law Center have been making for years.

The reliability problem in the regulations comes from the lack of retesting opportunities for students who fail the PARCC tests on their first try. Again: all test outcomes are measured with error. A student who fails one administration of a test may have a "true test score" that is much higher; however, due to circumstances having nothing to do with their academic abilities, they may get a score lower than their "true" score.
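To make that concrete, here is a minimal sketch in Python of why a single administration can misclassify a student. It assumes the simplest textbook model: an observed score is the "true" score plus normally distributed error, and errors on separate attempts are independent. The function name and every number in it (a true score of 52, a cut score of 50, a standard error of 5) are invented for illustration; none of them come from PARCC or the NJDOE.

```python
from statistics import NormalDist


def chance_of_passing(true_score, cut_score, sem, attempts=1):
    """Probability that at least one observed score meets the cut score.

    Assumptions (illustrative only): observed = true score + normal error
    with standard deviation `sem`, and errors are independent across attempts.
    """
    # Chance that a single observed score lands below the cut.
    fail_once = NormalDist(mu=true_score, sigma=sem).cdf(cut_score)
    # The student fails overall only if every attempt falls short.
    return 1 - fail_once ** attempts


# Hypothetical student whose true score sits slightly above the cut.
one_try = chance_of_passing(true_score=52, cut_score=50, sem=5)
two_tries = chance_of_passing(true_score=52, cut_score=50, sem=5, attempts=2)

print(f"Passes with one attempt:  {one_try:.2f}")
print(f"Passes with two attempts: {two_tries:.2f}")
```

Under these made-up numbers, a student who "truly" clears the bar still fails about one time in three on a single attempt, and a second attempt cuts that risk to roughly one in eight. Real tests will differ in their scales and error, but the direction of the effect is the point.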

When the stakes are as high as they are in a graduation test, there must be a chance for students to take the test again. But this is difficult when the test, like the PARCC, has to limit its administrations due to security concerns. I can't say for sure that the HSPA, New Jersey's old test, addressed this problem as well as it should have. There were, however, other alternative tests available to students if they didn't pass the HSPA.

I can only guess as to what comes next, but it's highly unlikely, given the rhetoric during the campaign, that the Murphy administration will challenge this ruling before the state Supreme Court. Which means the state's students are in a real bind: the law says they have to pass an 11th grade test, but the state doesn't have one ready to go. It takes a good bit of time to develop a valid, reliable test. Maybe there's something available off the shelf -- but it would still be unfair to students to spring a brand new test on them without giving their schools the chance to actually teach the content on which those tests are based.

In the short term, the Legislature should work quickly and amend the law so today's high school students can get their diplomas without having to pass tests that are invalid for the purpose of "measur[ing] those basic skills all students must possess to function politically, economically and socially in a democratic society." 

I know that some legislators have invested a great deal of their reputations into the PARCC, but they need to step up and do the right thing here. A lot of kids have been working hard and playing by the rules, and they shouldn't feel their diplomas are at risk simply because the state acted rashly. No student should miss out on graduating due to this ruling.

In the long term: we are well past the time for this state to have a serious conversation, informed by expertise, about what exactly we are trying to achieve in our schools and how testing can help us get there. I have no doubt the usual suspects will claim anyone taking this position is setting low standards and in the pocket of the teachers union and doesn't really care about kids and blah blah blah...

Those, however, are the same folks who pushed hard for PARCC without engaging in a meaningful debate with skeptics about the purposes of testing and the consequences of implementing the current regime. They never bothered to address the carefully laid out, serious concerns of folks like Sarah and Stan and many others. Many of them made wildly ambitious claims about the benefits of PARCC, suggesting the tests could be used for all sorts of purposes for which they were never designed.

They are also (mostly) the same folks who have continually downplayed the role of school funding in educational outcomes; they insisted on higher standards without pausing for a second and asking whether schools have what they need to meet those standards. Remember: our current funding formula, which isn't even being followed, came years before we moved to the Common Core and PARCC. If we've adjusted our standards upward, isn't it sensible to think we'd have to adjust the resources needed upward as well?

Let me be clear: I am all for accountability testing. I think there is a real and serious danger of short-changing schools and exacerbating inequality if we don't use some universal measure to assess how students are performing. But we've got to have an understanding of what tests can and can't do if we're going to use them to evaluate policies. And we've got to be extremely cautious when we attach high-stakes decisions, such as graduation, to test outcomes.

Let's view this ruling as an opportunity: a chance to make smart, well-informed decisions about testing. New Jersey happens to be the home to some of this country's most highly-regarded experts in testing, learning, and education policy. Let's bring them in and have them help fix this mess. Our kids deserve no less.
 

h/t Mike Keefe

This blog post has been shared by permission from the author.

The views expressed by the blogger are not necessarily those of NEPC.

Jersey Jazzman

Jersey Jazzman is the pseudonym for Mark Weber, a New Jersey public school teacher and parent. Weber is also a doctoral student at Rutgers University in Education Theory, Organization, and Policy.