Scoring of School Tests Found to Be Inaccurate

Education: Officials concede using too small a sample for many grades. O.C. had six cases with under 25% tallied.

TIMES STAFF WRITERS

State education officials broke their own rules in scoring California’s new achievement tests, counting so few examinations that the scores for hundreds of schools may be wildly inaccurate and the results in half the cases are less precise than was promised, a Times computer analysis shows.

Among the most extreme examples found in The Times study are a San Bernardino County middle school where only 1% of the math tests were counted, and two schools--Vicentia Elementary in Corona and Columbia Elementary in Tuolumne County--where results were based on exactly one student’s work.

At Roosevelt Elementary in Indio, 169 students took the tests, but the state graded only four papers, then reported that everyone there scored at the lowest possible level, showing “little or no mathematical ability.”

“I kept reading (the results) over and over again. . . . I kept trying to turn the bar graph upside-down. It really had me panicked,” Principal Kennedy Rocker said. “It’s not even a snapshot. It’s a very blurry Polaroid with some Vaseline over the lens.”

Short of time and money in the first year of the revolutionary California Learning Assessment System, which 1 million students took last spring at a cost of $15 million, education officials decided to score only a fraction of the tests at each school, promising that results would remain statistically sound.

But when told of The Times analysis, several independent statisticians agreed that although the scores in reading, writing and math may offer a general picture of California students as a whole, sampling error makes them too imprecise to be used for judging most districts or individual schools.

“It appears they lost control,” said Lee Cronbach, a retired Stanford professor of education who has studied testing for 60 years. “(Sampling) is certainly something that can be managed, but in the first run it can go wrong, and it apparently went wrong here.”

State officials acknowledged dozens of sampling problems with the scoring system, and said last week that reporting results on schools where the most extreme snafus occurred was a serious error. They said corrections and apologies would go out to some schools this month, while a new group of students takes the tests with a state promise that this time many more exams will be scored.

Still, education officials strongly defended CLAS, a new approach to testing that evaluates students’ thought processes, as well as their ability to derive correct answers, and measures performance against tough statewide standards.

“When you’re beginning a program as massive as this one, with limited funding, there have to be choices made,” Acting Supt. of Public Instruction William D. Dawson said. “The expectation is that it’s going to be simple, it’s going to be perfect and it’s going to be painless the first time through. That’s impossible. It’s just an absolutely extraordinary accomplishment to have moved as far as we have as constructively as we have.”

Results released last month to schools and the media painted a grim portrait of student achievement across California, with an especially woeful performance in math, where at least a third of the students statewide showed little or no understanding of basic concepts.

In their attempt to obtain valid samples from the 1993 tests, CLAS architects set some guidelines: At least 25% of the exams taken at every school were to be scored. The percentage varied by school but the state said the number should never fall below 44 tests for fourth grade, 70 for eighth grade and 80 for sophomores. At the smallest schools, all exams were to be counted.
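As a rough illustration of how those rules fit together, the following Python sketch computes the smallest sample the stated guidelines would allow and labels a result against it. It is only a lower bound built from the rules as described above. The names `GRADE_FLOOR`, `minimum_required` and `classify` are hypothetical, and the state's actual per-school targets are not reproduced here.

```python
# Illustrative only: a lower bound implied by the rules as described above.
# The state's actual per-school targets are not reproduced here.

# Minimum number of scored tests per grade, from the stated guidelines.
GRADE_FLOOR = {4: 44, 8: 70, 10: 80}


def minimum_required(tested: int, grade: int) -> int:
    """Smallest sample consistent with the stated rules (an assumption)."""
    quarter = (tested + 3) // 4  # 25% of the tests, rounded up
    # A school cannot have more tests scored than were taken, so the
    # smallest schools end up with every test counted.
    return min(tested, max(GRADE_FLOOR[grade], quarter))


def classify(tested: int, scored: int, grade: int) -> str:
    """Label one school/subject result against the stated rules."""
    if scored * 4 < tested:
        return "under 25% scored"
    if scored < minimum_required(tested, grade):
        return "below minimum"
    return "meets minimum"


print(classify(95, 8, 4))    # 8 of 95 fourth-grade tests scored
print(classify(81, 22, 4))   # 22 of 81 fourth-grade tests scored
```

Under these assumptions, a fourth-grade result with eight of 95 tests scored falls in the most serious category, while 22 of 81 clears the 25% rule but still misses the fourth-grade minimum of 44.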

However, using the state’s data, the Times analysis found that:

* Overall, the guidelines were broken more than 11,000 times, meaning that in 49.8% of the cases, individual school results were based on smaller samples than the state intended. In about half of those cases, the errors were minor, with samples falling one to five tests short of the guidelines. But in some 400 instances, the samples were at least 15 tests off.

* The most serious errors, where fewer than 25% of tests were scored, occurred in 148 cases, mostly in elementary schools. In general, the sampling problems were the worst at elementary schools, where the guidelines were broken more often than they were followed.

* On the writing segment, more than 60% of the small schools did not have all their tests scored as the guidelines require.

* The ability of the samples to reflect a school’s diversity also broke down when the smallest number of tests were counted. At one school where less than 4% of the children have learning disabilities, for example, 40% of the tests scored belonged to those students, skewing the school’s results.

* In Orange County, there were six cases where less than 25% of a school’s tests were scored. Those and other less extreme sampling violations occurred in 49% of the cases countywide; among fourth-graders, samples were too small 58% of the time, with guidelines broken in 75% of the schools on the writing segment and 72% in reading. At the 10th-grade level, math samples were too small in 61% of the schools.

* In Los Angeles County, there were more than 50 cases--most of them in the Los Angeles Unified School District--of the extreme violation in which fewer than 25% of a school’s tests were scored.

Across the county, the overall sampling guidelines were violated in 53% of the cases. For fourth-graders, samples fell below the standard in 81% of the schools on the writing segment and 70% in reading.

State officials agree that it would be best to avoid sampling and have announced plans to count all tests and report individual results when the CLAS system--which is expected to cost $55 million a year--is fully in place: “Every student, every paper--that’s the plan,” Dawson said.

CLAS Director Dale Carlson said that since the 1993 results were released last month, the contractor that handled the scoring has discovered sampling errors at about 70 schools. But the Times analysis found that in at least twice that number of instances, not even the required minimum of 25% of tests were scored.

Although that magnitude of sampling error directly spoiled the scores at only about 1% of the state’s schools, the imprecise data was factored into district averages and other comparisons used for the 7,000 schools that took the test.

“It was a waste of money,” said Barbara Anderson, principal of Santee Elementary in San Jose where 90 children took the tests but only nine or 10 were scored in each subject. “It’s a difficult test to give, it’s a hard test for students, and you’d at least like to have some (accurate) results.”

In retrospect, Carlson said, he wishes the state had not publicized data based on so few students’ work in schools such as Anderson’s. He called the situation very disappointing.

“It’s hard to think of anything that can go wrong that didn’t go wrong this year,” Carlson said. “A lot of mistakes were made on a lot of people’s part--most all of them were related to it being the first time.”

The private firm hired to process the tests, CTB Macmillan/McGraw Hill, is trying to figure out how the errors occurred, Carlson said. So far, officials say the problems resulted from answer sheets getting lost at the contractor’s office, arriving there defaced, or being split into groups that caused processors to believe that fewer students at certain schools took the tests.

Winnie Young, CLAS project director for the company, said the state Department of Education instructed her not to discuss the tests with the media.

Unlike previous standardized tests, which used multiple choice answers scored by a computer, the CLAS tests include essays, diagrams and explanations that must be judged by people. Last summer, about 2,000 public schoolteachers statewide were paid $100 a day to score a random sample of tests. Overall, 48.7% of the state’s tests were scored, and because most students took three tests, that left 1.5 million answer sheets untouched.
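The 1.5 million figure can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch, not the state's accounting: it assumes exactly three tests per student, whereas the article says only that most students took three.

```python
# Figures from the article; "three tests per student" is an approximation,
# since the article says only that most students took three.
students = 1_000_000
tests_per_student = 3
scored_per_thousand = 487  # 48.7% of tests were scored

answer_sheets = students * tests_per_student
unscored = answer_sheets * (1000 - scored_per_thousand) // 1000
print(f"{unscored:,} answer sheets left unscored")
```

With those inputs the unscored total comes to a little over 1.5 million, in line with the figure above.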

Each graded test was evaluated against the tough new achievement standards and assigned a score of 1 to 6. Then every school, district and county in the state was told what percentage of their students scored at each level.

All test scores contain some degree of “standard error,” a statistical term that measures the level of precision. CLAS results have higher standard error than previous tests because students are given a diverse set of problems and many of their answers are explanatory rather than absolute.

Small samples amplify standard error, in some cases raising it so high that the results have little meaning, said Gary Phillips, associate commissioner of the U.S. Department of Education’s National Center for Education Statistics.

“If you’re going to be giving a test that has high stakes . . . if it’s going to affect the lives of the students, the schools and the state, then you want to have as low a standard error as possible,” Phillips said.
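To see how quickly a small sample inflates the error, consider the textbook standard error of a sample proportion, with a correction for sampling from a finite group. The sketch below is an assumption-laden illustration, not the method the state or The Times used. It assumes a simple random sample drawn without replacement, uses the worst-case share of 50%, and covers sampling error only, leaving out any variation between scorers. The function name `sampling_se` is hypothetical.

```python
import math


def sampling_se(population: int, sample: int, share: float = 0.5) -> float:
    """Standard error of a sample proportion with finite population correction.

    Assumes a simple random sample drawn without replacement; `share` is the
    true fraction of students at a given score level (0.5 is the worst case).
    """
    if not 1 <= sample <= population:
        raise ValueError("sample must be between 1 and the population size")
    if population == 1:
        return 0.0
    variance = share * (1 - share) / sample
    correction = (population - sample) / (population - 1)
    return math.sqrt(variance * correction)


# Roosevelt Elementary math: 169 tested, 4 scored, guideline of 48.
for n in (4, 48, 169):
    print(n, round(sampling_se(169, n), 3))
```

Under these assumptions, scoring four of 169 papers gives a standard error of roughly 25 percentage points, compared with about 6 points at the guideline sample of 48 and zero when every paper is scored.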

Eva Baker, a UCLA professor of education who runs the Center for Student Testing, Evaluation and Standards, cautioned people against choosing schools or changing curricula based on the test scores.

“I don’t think people should be massively running around moving people about at this point,” Baker said. “High-stakes decisions ought not to be made until they have the kinks worked out.”

Even though educators warn against it, the public typically uses standardized tests as a barometer for judging schools.

Real estate agents carry lists of the scores, encouraging buyers to move into neighborhoods where the numbers are highest. With new state laws allowing open enrollment, public schools compete with one another for students, and many parents see CLAS as the simplest way to measure schools’ success.

So to Steve Simpkins, principal at Canyon Hills Junior High in Chino, the CLAS results are “less than useless.”

“When incorrect scores come out, people remember that first information,” he said. “We’re going to have to do damage control.”

The Los Angeles Unified School District, the state’s largest, included schools with some of the worst sampling mistakes: 51 schools where less than 25% of the tests were scored, 37 of them among fourth-graders, according to the Times analysis.

Throughout the district, the guidelines were broken in some way more than 1,100 times, or 61% of the total.

Statewide at the eighth-grade level, more than half of the schools had fewer than half of their math tests scored. Similarly, fewer than half the sophomore math tests were scored at 49% of California’s high schools.

But the sampling problems were most widespread at the fourth-grade level, where thousands of schools--76% in writing, 65% in reading and 30% in math--had fewer tests scored than promised by the state guidelines.

According to those rules, any school with 49 or fewer students should have had all of its tests scored. But the data shows that about 500 such schools, more than half of the total, did not have 100% counted in the writing or math sections. In reading, more than 400 of these small schools, 48% of the total, did not have 100% of their tests counted.

“That is an unfortunate occurrence,” said Dawson, the state’s top education official.

In writing, 43 schools had less than 25% of their tests scored, while in reading, there were 37 such schools. On the math tests, 21 schools had less than 25% of their tests scored.

Like about half a dozen other schools, Cahuenga Elementary in Los Angeles makes the list in all three categories. At Cahuenga, only eight of 95 papers were graded in each area, instead of 46, as the guidelines require.

“I work so hard. . . . Our test scores have always shown that we were above the state average,” Cahuenga Principal Lloyd Houske said. “To drop to the lowest level--it was terrible.”

At Lerdo Primary in Kern County, 106 students took the exams, but only 12 were scored in reading and 13 in writing (the sampling guidelines call for 46). At Muir Elementary in Fresno, only 25 of 151 writing papers were graded (it should have been 48). At Excelsior Elementary in Garden Grove, the math results are based on five of the 30 students tested (the guidelines require 30).

“What do you get for what you’re paying?” asked Christine Olsen, spokeswoman for a Sacramento County district where five of 72 math tests were scored for one school. “There are a lot of things that are better about CLAS (than previous tests). . . . But (these results are) just flat not accurate.”

The state’s scoring guidelines instructed that “the larger the school, the larger the number of students sampled,” but in fact, some large schools had results based on tiny numbers of students.

About four dozen of the state’s junior high and high schools had less than 25% of their tests counted, breaking the main sampling rule.

Some examples: two of 350 math tests were scored at Terrace Hills Junior High in San Bernardino County, 14 of 360 tests at Sequoia Junior High in Simi Valley, 16 of 305 tests at Tetzlaff Junior High in Cerritos, 41 of 494 tests at Mar Vista Middle School in San Diego County and 74 of 590 tests at Portola Middle School in Tarzana.

“It’s ludicrous,” said Rollin Grider, director of curriculum for the district that includes Terrace Hills. “These results are not valid. We can’t draw any conclusions from (them). They told us they’d get us some new results, so we’re waiting for those. And if they don’t, we’ll scream.”

Another key question educators are asking is: Which students were counted? The state’s sampling was done randomly to ward off bias, but some administrators say that in schools in which the samples were too small, the group often lacked diversity and did not reflect the student population.

Cahuenga Elementary’s student body is one-third Asian American and two-thirds Latino, but the tests that were scored all belonged to students of Asian heritage, Houske said.

The problem was flipped at Santee Elementary in San Jose, where only nine or 10 students were scored on each portion of the exam. Santee is 40% Asian American, but no Asian American students’ tests were graded, according to the principal.

Further, four of the tests graded in each area belonged to students with learning disabilities, although only 3.8% of the students overall have such problems.

“It wasn’t our population. The CLAS test is a better assessment tool than the former tests were, but they need to make sure there’s enough time and energy and money to correct them and give them back to schools so they have correct information,” Principal Anderson said.

“It’s not a nice feeling to know that the public is seeing information like this. They don’t know what we know about it. They don’t know it’s inaccurate.”

Richard O’Reilly is The Times’ director of computer analysis. The story was written by Jodi Wilgoren.

Small Samples

The Department of Education scored only a fraction of the California Learning Assessment System (CLAS) tests taken at each school last spring. To ensure statistical validity, there were sampling guidelines, including a promise not to base any school’s scores on less than 25% of the tests taken there. But in 148 instances--about 1% in most testing areas--schools got results based on less than 25% of the students tested. When no tests were scored, schools did not receive any results. Some examples:

ORANGE COUNTY

Taft Elementary

Orange Unified (Grade 4)

              Reading  Writing  Math
Tested            165      165   166
Scored             29       38    49
% Scored           18       23    30
Guidelines         48       48    48

Excelsior Elementary

Garden Grove Unified (Grade 4)

              Reading  Writing  Math
Tested             30       30    30
Scored              0        0     5
% Scored            0        0    17
Guidelines         30       30    30

Woodbury Elementary

Garden Grove Unified (Grade 4)

              Reading  Writing  Math
Tested             81       81    82
Scored             22       21    10
% Scored           27       26    12
Guidelines         46       46    46

Lathrop Intermediate

Santa Ana Unified (Grade 8)

              Reading  Writing  Math
Tested            344      374   371
Scored             74       95    72
% Scored           22       25    19
Guidelines        116      122   122

*

LOS ANGELES COUNTY

Cahuenga Elementary

Los Angeles Unified (Grade 4)

              Reading  Writing  Math
Tested             95       95    96
Scored              8        8     8
% Scored            8        8     8
Guidelines         46       46    46

Eastman Avenue Elementary

Los Angeles Unified (Grade 4)

              Reading  Writing  Math
Tested            139      139   140
Scored             34       19    39
% Scored           24       14    28
Guidelines         48       48    48

Tetzlaff Junior High

ABC Unified (Grade 8)

              Reading  Writing  Math
Tested            305      299   305
Scored            107      138    16
% Scored           35       46     5
Guidelines        108      108   108

Sierra Vista Intermediate

Covina Valley Unified (Grade 8)

              Reading  Writing  Math
Tested            310      307   307
Scored             99      117     9
% Scored           32       38     3
Guidelines        110      108   108

Millikan Junior High

Los Angeles Unified (Grade 8)

              Reading  Writing  Math
Tested            432      388   406
Scored            114      166    59
% Scored           26       43    15
Guidelines        134      124   130

Gardena Senior High

Los Angeles Unified (Grade 10)

              Reading  Writing  Math
Tested            468      468   454
Scored            178      130    98
% Scored           38       28    22
Guidelines        158      158   154

*

VENTURA COUNTY

Sequoia Junior High

Simi Valley Unified (Grade 8)

              Reading  Writing  Math
Tested            360      349   360
Scored            120      146    14
% Scored           33       42     4
Guidelines        120      118   120

*

RIVERSIDE COUNTY

Vicentia Elementary

Corona-Norco Unified (Grade 4)

              Reading  Writing  Math
Tested             23       23   104
Scored              1        0    44
% Scored            4        0    42
Guidelines         23       23    46

Roosevelt Elementary

Desert Sands Unified (Grade 4)

              Reading  Writing  Math
Tested            176      176   169
Scored             32       35     4
% Scored           18       20     2
Guidelines         48       48    48

Railroad Canyon Elementary

Lake Elsinore Unified (Grade 4)

              Reading  Writing  Math
Tested            151      151   150
Scored             47       29    55
% Scored           31       19    37
Guidelines         48       48    48

*

SAN BERNARDINO COUNTY

Del Norte Elementary

Ontario-Montclair Elementary (Grade 4)

              Reading  Writing  Math
Tested             98       98    98
Scored             19       19    18
% Scored           19       19    18
Guidelines         46       46    46

Canyon Hills Junior High

Chino Unified (Grade 8)

              Reading  Writing  Math
Tested            338      334   338
Scored             10       42     8
% Scored            3       13     2
Guidelines        116      114   116

Terrace Hills Junior High

Colton Joint Unified (Grade 8)

              Reading  Writing  Math
Tested            354      350   347
Scored             88      124     2
% Scored           25       35     1
Guidelines        118      118   118

*

SAN DIEGO COUNTY

Sherman Elementary

San Diego City (Grade 4)

              Reading  Writing  Math
Tested            101      101    97
Scored             26       24    37
% Scored           26       24    38
Guidelines         46       46    46

Mar Vista Middle

Sweetwater Union High (Grade 8)

              Reading  Writing  Math
Tested            499      499   494
Scored             99      127    41
% Scored           20       25     8
Guidelines        146      146   146

*

ELSEWHERE IN THE STATE

Muir Elementary

Fresno County (Grade 4)

              Reading  Writing  Math
Tested            151      151   151
Scored             26       25    55
% Scored           17       17    36
Guidelines         48       48    48

Lerdo Primary

Kern County (Grade 4)

              Reading  Writing  Math
Tested            106      106   179
Scored             12       13    51
% Scored           11       12    28
Guidelines         46       46    48

Santee Elementary

Santa Clara County (Grade 4)

              Reading  Writing  Math
Tested             90       90    90
Scored              9        9    10
% Scored           10       10    11
Guidelines         46       46    46

Skycrest Elementary

Sacramento County (Grade 4)

              Reading  Writing  Math
Tested             72       72    72
Scored             33       37     5
% Scored           46       51     7
Guidelines         46       46    46

East Cottonwood Elementary

Shasta County (Grade 4)

              Reading  Writing  Math
Tested            114      113   116
Scored             12       13    36
% Scored           11       12    31
Guidelines         46       46    46

Columbia Elementary

Tuolumne County (Grade 8)

              Reading  Writing  Math
Tested             50       26    50
Scored              0        1     0
% Scored            0        4     0
Guidelines         50       26    50

Elk Grove High

Sacramento County (Grade 10)

              Reading  Writing  Math
Tested            545      520   538
Scored            165      157    98
% Scored           30       30    18
Guidelines        174      168   170

Source: State Department of Education

Researched by RICHARD O’REILLY and JODI WILGOREN / Los Angeles Times
