Editorial: Casting doubt on linking teacher evaluations to test scores


A new study out of USC and the University of Pennsylvania finds that value-added measurements, a way of using student test scores to evaluate teacher performance, aren’t a very good way of judging teacher quality. This isn’t the first study to cast doubt on what has become a linchpin educational policy of the Obama administration, but there’s an interesting element that lends its findings extra weight: It was funded by the Bill & Melinda Gates Foundation, a well-known supporter of using test scores in teacher evaluations.

In fact, researcher Morgan S. Polikoff, an assistant professor of education at USC, said the findings ran counter to what he had expected. Yet he was unflinching in his conclusion in a YouTube video on the research: “Value-added scores don’t seem to be measuring the quality and content of the work that students are doing in the classroom.”

This shouldn’t put the kibosh on all use of value-added, which many states have adopted (California has not). Evidence continues to build on both sides of the issue, and many studies have found that increases in test scores, though they might not correlate with teacher quality, do have important ramifications for student success down the road.


If that’s true, though, there are legitimate questions about what we consider to be teacher quality. The measures used in the latest study were widely accepted ones adopted for other research. Among them: Good use of classroom time. A nurturing and respectful classroom environment. Students who are asked challenging questions. If these don’t lead to improved academic performance, though, maybe they’re not as vital as everyone thought.

Another important issue is whether better tests would do a better job of measuring teacher quality. The new tests based on the Common Core standards were designed to do that; it remains to be seen whether they succeed.

What the new study should accomplish, though, is to persuade policy makers to press the pause button when it comes to high-stakes use of the test scores to judge whether a teacher deserves a pat on the back or a shove out the door. Polikoff suggests test results might be more useful as a qualitative measurement that enables schools to fine-tune their teacher training.

The problem is that, under pressure from the U.S. Department of Education, states have been rushing to set up rubrics for judging teachers based, to a significant degree, on rigid use of test scores. The Obama administration is pulling Washington state’s waiver from No Child Left Behind requirements solely because the state didn’t follow through with this one aspect of its reform plan. Education policy is supposed to follow well-established research, not the other way around.