Peter Foltz was frustrated that it was impractical to give essay exams to his psychology students at New Mexico State University. With about 200 students in his classes, grading the exams would take him three or four weeks.
“So I go with multiple choice exams, and that’s not a good learning experience for the student,” Foltz said. “Making students have to write is actually one of the best ways to make them learn.”
So Foltz joined with other psychology professors at the University of Colorado at Boulder to develop computer software that can do what they no longer have time to do--grade essay exams.
The “Intelligent Essay Assessor” compares the student’s essay to a large body of information--a downloaded textbook, for example--to see whether the essay includes the right words in the right context.
“It uses a new mathematical analysis technique to learn word meanings from large bodies of text, then applies this to determine the similarity of meaning between a student essay and comparison essays that have been created or graded by an expert,” said Thomas Landauer (Thomas.Landauer@colorado.edu), a University of Colorado psychology professor who has worked on the technology for 10 years.
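The technique Landauer describes is latent semantic analysis. A minimal sketch of the core idea, assuming NumPy and using invented toy documents (the actual Intelligent Essay Assessor is far more sophisticated):

```python
# Toy sketch of latent semantic analysis (LSA): project documents into a
# low-dimensional "semantic" space via SVD, then compare by cosine
# similarity. All documents here are invented examples.
import numpy as np

reference = [
    "the heart pumps blood through arteries and veins",
    "blood flows from the heart through arteries and veins",
]
student = "the heart moves blood through vessels"

docs = reference + [student]
vocab = sorted({w for d in docs for w in d.split()})

# Term-by-document count matrix.
A = np.array([[d.split().count(w) for d in docs] for w in vocab], dtype=float)

# Truncated SVD is the mathematical core of LSA: keep only the top k
# singular values, so documents with different words but similar usage
# patterns end up near each other.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # one k-dimensional vector per document

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Score the student essay by its similarity to the expert-graded references.
score = max(cosine(doc_vecs[-1], v) for v in doc_vecs[:-1])
print(round(score, 2))
```

Because the comparison happens in the reduced space rather than over raw words, an essay can score well without repeating the reference essays verbatim, which is why the authors insist this is more than a keyword search.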
The software is not just a sophisticated keyword search, both men insist. Instead, it searches for words used in a context that indicates the student understands the meaning of each phrase.
Simply lumping all the right words into an essay won’t earn a passing grade.
“We’ve tried to write bad essays and get good grades, and we can sometimes do it if we know the material really well,” Landauer said. “The easiest way to cheat this system is to study hard, know the material and write a good essay.”
Foltz says the software passed with flying colors when he tested it on his own students.
“We had students write over 2,000 essays on 11 different topics,” he said.
The essays were graded by the computer as well as by two instructors or teaching assistants. Even professors sometimes disagree on the quality of a student’s work, but the computer agreed with the human graders as well as they agreed with each other, Foltz said.
And it took the computer only a “second or two” to render a grade, compared with five to 30 minutes for the humans.
Since the software was unveiled a couple of months ago, the professors have drawn criticism from some educators who see it as further isolating students from their professors, depersonalizing education and reducing the involvement of potential mentors in a student’s progress.
But Foltz and Landauer insist the program should let professors be more involved in teaching their students, not less, by freeing time once spent grading essays for working directly with them.
In his trial run at New Mexico State University in Las Cruces, Foltz said the program was shown to be a highly efficient teaching aid, not just a depersonalized tester.
Students sent their essays to a Web site where the program evaluated them instantly. The computer told the student the “estimated grade,” Foltz said, “and it came back with suggestions about what was missing from the essay. It told them what they should go back and look at in order to rewrite their essay.”
It also told them which pages in the textbook to look up, or it referred them to the professor’s class notes, which were also available at the site.
“It helped them figure out what they didn’t know,” Foltz said.
The program works better in some areas than others. It was not designed to evaluate creative writing, for example; it works best at testing a student’s knowledge of a particular set of facts or concepts, such as how the human heart works.
“If it sees an essay that looks highly different from one it’s graded previously, it can turn that one back to the teacher and say, ‘I don’t know how to grade this one. I have low confidence in my grade for it.’ ”
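The fallback described here is a confidence threshold. A hypothetical sketch, with an invented `needs_human` function and made-up threshold, assuming the grader already has similarity scores against previously graded essays:

```python
# Toy confidence check: if a new essay looks too different from everything
# the model has graded before, defer to a human grader. The function name
# and threshold are illustrative, not from the actual system.
def needs_human(similarities, threshold=0.6):
    """Return True when no previously graded essay is similar enough."""
    return max(similarities) < threshold

print(needs_human([0.1, 0.2]))  # unusual essay -> True
print(needs_human([0.9, 0.7]))  # familiar essay -> False
```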
Thus, a highly creative essay might end up with a human grader instead.
Landauer admits some people might be uncomfortable with turning another part of the educational process over to computers.
“I know that most people will find it very surprising that a computer can do all this, but the plain fact is that it now can,” he said. “We have developed a new theory and a powerful analysis technique that captures the meaning of text well enough to make these once-inconceivable feats not only possible but practical.”
You can try it yourself at https://Lsa.colorado.edu.
Where will it all end? Will the computer of the future be able to write the essays, and then submit them to a more powerful computer that will grade them?
“I still think that’s where the humans need to be involved, in doing the writing,” Foltz said.
I hope he’s right.
Lee Dye can be reached via e-mail at email@example.com.