Can Americans Trust Their National Report Card?

Published March 1, 2000

Testing data tumble out of the National Assessment of Educational Progress (NAEP), the so-called “Nation’s Report Card,” with great regularity. But can the numbers always be trusted? Researchers have raised disturbing concerns about whether NAEP test score gains could have been achieved by excluding weaker students.

Stung by charges that one of the U.S. Department of Education’s pet reform states–Kentucky–showed questionable gains in fourth-grade NAEP reading scores released last spring, the National Center for Education Statistics (NCES) asked a research contractor for the Kentucky Department of Education to prepare a new set of figures. The study, demanded by NAEP governing board member Wilmer Cody, concluded the gains were real. NCES officials say the case is closed, but others in the technical education community don’t agree.

If NCES won’t reopen the matter to determine whether this study can survive skeptical scrutiny, a congressional inquiry is possible. A hearing on Capitol Hill last May probed possible politicization of NCES but stopped short of looking directly into manipulation of test scores.

Concerns about political influence on NAEP test scores arose last spring after Vice President Al Gore personally announced the 1998 NAEP reading scores at a Department of Education (DoEd) conference attended by cheering political supporters. Kentucky’s reading scores were up a statistically significant 6 points over 1994 testing. Other big gainers, such as Connecticut, which was up 10 points, were mostly states that had pursued DoEd-favored reform plans.

Yet, as Kentucky activist and testing expert Richard Innes was among the first to point out, Kentucky excluded a whopping 10 percent of students from the 1998 testing because of various learning disabilities (LD). In 1994, the state had excluded only 4 percent. Moreover, from 1992 to 1998, the proportion of Kentucky’s school population officially designated as “disabled” skyrocketed from 7 percent to 13 percent–an increase of 86 percent. It did not seem unreasonable to conclude that dropping many more of the weaker students from testing could make the state’s average test scores look much better.

In Connecticut, LD exclusions jumped from 4 percent in 1994 to 10 percent in 1998. In some other high-reform states, a similar pattern was observed: higher NAEP scores accompanied by significant increases in exclusion rates. North Carolina’s NAEP scores were up 3 points, while exclusions were up 5 percentage points. In Secretary of Education Richard Riley’s home state of South Carolina, scores were up 7 points, but testing exclusions were up 5 percentage points.

The state that may have shown the firmest gain in reading was California, which actually excluded 2 percent fewer students from testing in 1998 than in 1994, but showed a 5-point gain in overall scores. Notably, the Golden State recently junked a 1987 whole-language mandate and returned to phonics in teaching reading.

NCES initially took the criticism of skewed scores seriously. The agency asked the Educational Testing Service, a NAEP contractor, to look into how NAEP reading scores might have looked had exclusion rates been uniform across the years. While ETS concluded no one can ever know the answer for sure because the reading abilities of the students excluded in 1998 are unknown, the contractor nevertheless speculated that states like Kentucky and Maryland could have received inflated report cards because of increased exclusions.
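A what-if of this kind can be illustrated with a simple sensitivity calculation. This is only a sketch of the general idea, not ETS’s actual method: the 10 and 4 percent exclusion rates are the Kentucky figures cited earlier, while the reported score of 218 and the assumed scores for the additional excluded students are hypothetical placeholders.

```python
def adjusted_mean(reported_mean, new_excl, base_excl, assumed_extra_mean):
    """Estimate the average had exclusions stayed at the earlier rate.

    The additional excluded students (new_excl - base_excl of all students)
    are added back to the tested group at an assumed average score.
    """
    tested = 1 - new_excl          # share of students actually tested
    extra = new_excl - base_excl   # share excluded beyond the earlier rate
    total = tested * reported_mean + extra * assumed_extra_mean
    return total / (tested + extra)


# HYPOTHETICAL reported average for the tested students.
reported = 218

# Because the excluded students' reading ability is unknown, try a range
# of assumptions instead of a single guess.
for assumed in (150, 170, 190, 218):
    estimate = adjusted_mean(reported, 0.10, 0.04, assumed)
    print(assumed, round(estimate, 2))
```

The lower the assumed score for the additional excluded students, the more the adjusted average falls below the reported one. If those students are assumed to read as well as everyone else, the adjustment vanishes. That is why no single answer is possible without knowing how the excluded students would have performed.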

That’s when NAEP board member Cody–who also is Kentucky’s commissioner of education–hotly demanded another study. To conduct the research, NAEP’s governing board selected Dr. Lauress Wise of the Human Resources Research Organization (HumRRO), which already was under contract with Cody’s Kentucky Department of Education to do research on the state’s education reform.

HumRRO attempted to create NAEP-equivalent scores for the state’s LD students by looking at how they performed on a statewide test–the Kentucky Instructional Results Information System (KIRIS). The group found the students did well enough on KIRIS that including them would not have dropped NAEP averages much. With that, NCES declared that Kentucky’s NAEP gains were genuine after all.

But are 1998 KIRIS test scores reliable for validating NAEP test scores? After all, KIRIS was so flawed that Kentucky scuttled it after the 1998 tests. Moreover, a RAND study by Dr. Dan Koretz suggested that most of the Kentucky pupils excluded from NAEP didn’t exactly take a reading test with KIRIS. In compliance with federally mandated individualized education plans, these students had the test read aloud to them. This accommodation–more a test of spoken word comprehension than of reading ability–is not permitted by NAEP. The Wise report did not attempt to sort out which students took real reading tests and which didn’t.

Among the questions for Congress to raise is the impact on test scores of students wearing LD labels. If LD students aren’t given a test of reading ability, how can a school district possibly know whether it is succeeding in teaching them to read?

Robert Holland is a senior fellow at the Lexington Institute in Arlington, Va.

For more information …

The report by Dr. Lauress Wise, “Impact of Exclusion Rates on NAEP 1994 to 1998 Grade 4 Reading Gains in Kentucky,” is available on the Internet at http://nces.ed.gov/commissioner/remarks99/9_27_99.asp#approach.

NAEP reports, including its “NAEP 1998 Reading Report Card for the Nation and the States” and “NAEP 1998, 1994, and 1993 National and State Reading Summary Data Tables for Grade 4 Student Data, Weighted Percentages, and Average Composite Scale Scores,” are also available on the Internet at http://nces.ed.gov.