Test 'Average' Covers Half of All Students

By RICHARD LEE COLVIN and NICK ANDERSON

July 16, 1998 12 AM PT

TIMES STAFF WRITERS

Time for a quiz--part statistics, part philosophy.

One student’s answers on the Stanford 9 achievement test put him ahead of 76% of his peers nationally.

Another winds up behind 76% of the other youngsters--in the bottom quarter, in other words.

Are these students: (A) different, (B) the same, or (C) similar?

According to the test result reports being sent to parents, the correct answer is C--both students are average.

Even though one’s performance was encouraging and the other’s worrisome, both fall into a vast middle range that, by design, includes roughly half of the 4.1 million California students who took the tests this year.

Of course, some students--those in the lowest 23%--are told that their score was below average. And those in the top 24% on any portion of the tests get the good news that their performance was above average.

The rest are lumped into the massive average group, meaning it includes students whose showings were markedly different.

For example, the math portion of the exam for eighth-graders had 78 questions. Students who answered 55 correctly are being told that their performance was average. And so are those who answered only 29 questions correctly.

Using a Range Rather Than a Point

The company that created the Stanford 9 exams, Harcourt Brace Educational Measurement, defends the approach as appropriately cautious--a recognition that such tests do not measure a child’s performance as precisely as, say, a scale gauges weight. A student might score 10 points higher--or lower--if he took the exam a second time, according to one outside analysis.

But many educators, policymakers and testing experts disagree with the company’s broad definition of “average,” saying that it distorts the term beyond meaning.

Some wonder whether the approach represents hypersensitivity to the self-esteem of students--a desire to brand as few as possible “below average.”

The critics say that teachers certainly wouldn’t lump a child at the 23rd percentile with a classmate at the 76th. And the “average” label may confuse parents--whether it’s those who should be pleased that their child scored near the top or those who should be concerned because of a score at the low end.

Dan Edwards, the education spokesman for Gov. Pete Wilson, fears that school administrators will misuse the definition of average to signal that performance is “OK.” Doing that, Edwards said, would contribute to an unwarranted complacency.

“Don’t be fooled if somebody is telling you, ‘Don’t worry about it, your child turned in an average test score,’ ” he said. “Those are people trying to hide behind bad test scores by using this methodology.”

He said parents and policymakers should focus on a hard number, “the national percentile rank, grade by grade.”

But Tom Brooks, Harcourt’s own statistical expert, sees “average” as a subjective concept. The company thus chooses to define it as a range of scores rather than as a single midpoint--the 50th percentile--where half of the students scored below and half above.

Doing otherwise, Brooks said, “automatically dooms half of the kids to failure.”

The definition of average--as the middle half of all students--came with the testing system the state bought last November, only a few months before the tests were given.

Harcourt Brace says the definition is the same one it has used for years. It’s also accepted by the eight other states where the Stanford 9 was given this year to test skills in everything from math and reading to science and social studies. Those states include Alabama and West Virginia, places not known for sensibilities aimed at boosting students’ self-esteem.

Brooks says that the approach merely reflects how people naturally define what’s average in other areas.

He uses the analogy of height.

According to the National Center for Health Statistics, the average American male is 5-foot-9. Would we call a man who is 5 foot, 7 inches tall--shorter than three-quarters of the men--short? And is a man who is 5 foot, 11 inches tall--taller than three of four peers--tall? Or are both more or less average?

“I think you can make that same kind of extension to academic achievement,” Brooks said. “There’s some range in which average is average. And even though you might be below the 50th percentile, that’s not necessarily bad and . . . you might be perfectly capable of functioning.”

Like Harcourt, most testing companies view average as a range. They want to discourage making too much of small differences, such as between the 47th and 51st percentiles. In some school districts, differences of a couple of points can determine who is admitted to programs for the gifted or the learning disabled.

Still, other experts say it’s being overly generous to stretch the average range beyond the 40th to 60th percentile.

For parents, knowing that “their kid falls into a range that encompasses fully half of the kids . . . won’t be particularly useful,” said Daniel Koretz, a researcher who studies testing for Rand Corp.

Separate Scale for Ability

Whatever way average is defined, though, it doesn’t answer the questions that probably are of most concern to parents: Even if Johnny did about the same as other students, does that mean he--or they--can read? Or do algebra?

Indeed, Harcourt has a whole separate scale for viewing test scores by those criteria.

The company had teachers analyze its tests to figure out what various scores in each subject said about a child’s ability to do the work expected at his grade level.

The panel spelled out a range of abilities, starting at the lowest end with “below basic.” That means that a student has “less than partial mastery” of the skills.

The next rungs up are “basic” mastery and then “proficient,” which means a student has the skills to succeed at the next grade.

Above that, a student is considered “advanced.”

This scale, however, is not shared with parents. And even teachers must look it up in a technical manual that is sold to school districts.

Those who do check it out would find another reason to wonder about the wisdom of the testing company’s definition of average.

The average range, it turns out, dips so low that it often includes students whose scores--on the proficiency scale--are “below basic,” meaning they are unprepared for the next grade.

Many educators don’t need a technical manual to tell them that some of the average students are struggling. Any student with scores around or below the 30th percentile “has some major gaps,” said Susan-Harumi Bentley, an assistant superintendent in the Carlsbad Unified School District in San Diego County.

The broad definition of average serves to “obfuscate the real achievement of the children,” said Robert L. Baker, a USC education professor. “There has to be some truth in labeling, so we’re honest with ourselves and honest with the parents.”

The State Board of Education agrees there is a problem with what parents are being told.

Next year, it plans to customize the Stanford 9 for California. The major goal is to make sure the tests are aligned with new standards that spell out what the board believes students should know in each grade.

Gerry Shelton, who manages the testing program for the Department of Education, said the state plans another change: It will scrap Harcourt’s definitions.

Shelton said the state will try to tell parents whether their children knew the material that they should have. It will not even attempt, he said, to say what it means to be average.

To get a special report on the state’s failing schools, “California’s Perilous Slide,” go to The Times’ Web site: www.latimes.com/schools

(BEGIN TEXT OF INFOBOX / INFOGRAPHIC)

What Does ‘Average’ Mean?

Results for the Stanford 9 standardized tests are calculated in three different ways that some parents may find contradictory or, at the least, confusing.

On the reports sent to parents at home, scores are given as percentiles. A score at the 50th percentile means the student did better than 50% of his or her peers in a national sample.

But parents also will be told whether their child’s score is “below average,” “above average” or, in most cases, “average.” All those whose scores are between the 23rd and the 76th percentile are considered “average.”

Below average: 23%

Above average: 76%

****

Eighth-Grade Math: Two Yardsticks

The “average” designation includes a wide range of performance. On the eighth-grade math test, for example, the “average” range includes those who correctly answer between 29 and 55 of the 78 questions.

Below average: 25% (29 correct answers)

Above average: 75% (55 correct answers)

The “average” range is so broad it includes students whose skills may be severely lacking, as well as those who are “proficient.”

Harcourt Brace Educational Measurement, which created the Stanford 9 tests, has four categories: “below basic,” “basic,” “proficient” and “advanced.”

On the eighth-grade math test, students ranking below the 36th percentile are considered to have “below basic” abilities, which means that “average,” as defined by the first measure, includes students whose skills are “below basic,” and even some who are “proficient.”

Below basic: 36%

Basic: 36%-73%

AVERAGE: 23%-76%

Proficient: 73%-96%

Advanced: 96% +
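
The overlap between the two yardsticks can be sketched in a few lines of Python. This is only an illustration built from the eighth-grade math cutoffs listed above, not Harcourt's actual scoring procedure. The function names are hypothetical, and the handling of scores that fall exactly on a boundary (the 23rd, 73rd, 76th and 96th percentiles) is an assumption, since the published ranges share their endpoints.

```python
# Illustrative sketch only: cutoffs come from the eighth-grade math
# figures in the infobox; boundary handling is an assumption.

def average_label(percentile: int) -> str:
    """Label sent to parents. Assumes the 23rd and 76th percentiles
    both count as "average" (an inclusive range)."""
    if percentile < 23:
        return "below average"
    if percentile > 76:
        return "above average"
    return "average"


def proficiency_label(percentile: int) -> str:
    """Proficiency category for eighth-grade math. Assumes each band
    includes its lower cutoff and excludes its upper one."""
    if percentile < 36:
        return "below basic"
    if percentile < 73:
        return "basic"
    if percentile < 96:
        return "proficient"
    return "advanced"


if __name__ == "__main__":
    # Four students who would all be reported to parents as "average".
    for p in (23, 35, 50, 76):
        print(f"{p}th percentile: {average_label(p)} / {proficiency_label(p)}")
```

Under these assumptions, students at the 23rd and 35th percentiles come out as "average" and "below basic" at the same time, while a student at the 76th percentile is "average" and "proficient."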