Caltech Student Has the Stats to Make It to the Major Leagues
Growing up in New Jersey, Ari Kaplan was a different kind of kid, the brainy kind. When his classmates were playing Little League, young Ari was home playing with his computer. He came to baseball late. First he was only a fan, but this spring, in his sophomore year at Caltech, he decided to become a player.
Kaplan tried out for the Caltech squad and made the team. But then, so did everyone else. Kaplan came off the bench four times. He was 0 for 4--all strikeouts.
But Ari Kaplan is one bench-warmer who is destined for the big leagues. This month, the 20-year-old phenom will leave the Pasadena campus and join the Baltimore Orioles, because he can crunch numbers the way Jose Canseco crunches fastballs.
When it comes to computing statistics, Ari Kaplan is The Natural. Baseball is beckoning because, as any fan knows, the sport thrives on numbers--and more so than ever in the computer age. Stats are used by coaches making strategic decisions, by agents seeking bigger contracts for star players, by fans competing and wagering on fantasy teams playing in fantasy leagues. Publishers crank out books loaded with baseball numbers, and newspapers are printing more stats than ever--and even promoting fantasy baseball competitions for readers.
So in the world of baseball, where ERA has nothing to do with the women’s movement, there’s always room for another good stat. Perhaps Kaplan’s RE, for Reliever Effectiveness, could someday be as familiar as the earned-run average, the classic measure of pitching performance. Players and coaches who have perused his computations are intrigued.
Then again, Kaplan’s formulas, developed on an undergraduate research fellowship, may all wind up in the circular file of dubious stats, such as the game-winning RBI.
Either way, Kaplan seems delighted just to be around the game he loves. He got his job with the Orioles after presenting his paper--"How Do You Spell Relief? An Analysis of Baseball Pitching, 1876 to Present"--at a meeting of the Caltech board of trustees last September. Trustee Eli S. Jacobs was impressed--and Jacobs happens to own the Orioles.
“I feel like a child trick-or-treating,” Kaplan said during a recent visit to Dodger Stadium.
Kaplan also is thrilled to be thrust into the debate that rages in ballparks across the land. To critics, baseball’s appetite for numbers has become a form of gluttony, a blight on the poetry and romance of the game. “Just what we need,” sportswriter Steve Dilbeck of the San Bernardino Sun groused upon being introduced to Kaplan. “Another statistic.”
But, baseball followers agree, there are good stats and there are bad stats. According to Bill James, author of “The Baseball Abstract” and several other works that take a close statistical look at the game, the analyst’s job is to figure out the difference.
“It’s shaking out,” said James, a Ruthian figure in the field of baseball stats. “Our ability to generate stats has gotten way ahead of our ability to make any sense of it. The first generation of computers gave us lots of numbers, but it’s going to take . . . a lot of work by people like Mr. Kaplan before we understand what all this means.”
Kaplan, at least, thinks he’s on to something. He got funding for his effort and a $3,000 stipend last summer through Caltech’s summer research fellowship program for undergraduates. While other students researched everything from asteroids to honey bees, Kaplan developed his baseball computer programs and loaded them with data on the Dodgers’ staff.
“Caltech does a lot of forefront research,” said Carolyn Merkel, director of the fellowship program. “But we’re not isolated from the rest of the world.”
Much of Kaplan’s analysis concerned the inadequacy of one statistic, the earned-run average. The ERA is the number of earned runs--that is, runs scored without the help of fielders’ errors--a pitcher allows per nine innings pitched; the lower the number, the better.
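The definition reduces to one line of arithmetic. The sketch below is only an illustration of it: the function name and the sample season line are hypothetical and do not come from Kaplan's programs or from any pitcher mentioned here.

```python
def era(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs allowed per nine innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * earned_runs / innings_pitched


# Hypothetical season line (not from the article): 50 earned runs in 200 innings.
print(era(50, 200.0))  # 2.25
```

Box scores often write a third of an inning as ".1" and two-thirds as ".2"; this sketch assumes innings are passed in as a true fraction instead.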
Kaplan focused on a variable that is known as the “inherited runner”--a runner who is left on base when a pitcher leaves a game and is replaced by another pitcher. If those runners score, the runs are charged to the pitcher who allowed them to reach base, affecting his ERA. Thus, a relief pitcher could pitch poorly but come out of the game with his own ERA untarnished.
Kaplan set out both to improve upon the ERA concept and to develop a way of measuring how effectively relief pitchers prevent inherited runners from scoring.
Instead of the conventional ERA, Kaplan suggests a range. Consider this contrast of Dodger pitchers Orel Hershiser and Mike Morgan. Both had excellent ERAs last season--Hershiser’s was 2.31, Morgan’s was 2.54. But were they really that close in performance?
Kaplan says no. If none of the runners these pitchers left on base had scored, Hershiser’s ERA would have been 2.24 and Morgan’s would have been 2.35. But if all had scored, Hershiser’s ERA would have climbed only to 2.54, while Morgan’s would have jumped to 3.36.
Kaplan calls these relatively simple concepts the Potential ERA (PERA) and Worst-Case ERA (WERA). But there is nothing simple about the stat he calls Reliever Effectiveness, or RE.
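One way to picture the range is as a best-case and a worst-case ERA around the official figure. The sketch below is a guess at the arithmetic, not Kaplan's actual formula: it assumes every runner a pitcher left on base who later scored was charged to him as an earned run, it reads PERA as the none-scored bound and WERA as the all-scored bound, and its function names and season totals are invented for illustration.

```python
def era(earned_runs, innings_pitched):
    """Earned runs allowed per nine innings pitched."""
    return 9 * earned_runs / innings_pitched


def era_range(earned_runs, innings_pitched, runners_left, runners_left_scored):
    """Return (best_case, actual, worst_case) ERA for one pitcher.

    best_case:  none of the runners he left on base scored
    worst_case: all of the runners he left on base scored
    Assumes each runner left on base who scored counted as an earned run.
    """
    if not 0 <= runners_left_scored <= runners_left:
        raise ValueError("runners_left_scored must be between 0 and runners_left")
    best = era(earned_runs - runners_left_scored, innings_pitched)
    actual = era(earned_runs, innings_pitched)
    worst = era(earned_runs + (runners_left - runners_left_scored), innings_pitched)
    return best, actual, worst


# Hypothetical pitcher: 50 earned runs in 180 innings, 10 runners left on base,
# 4 of whom scored after he departed.
best, actual, worst = era_range(50, 180.0, 10, 4)
print(round(best, 2), round(actual, 2), round(worst, 2))  # 2.3 2.5 2.8
```

The wider the gap between the two bounds, the more a pitcher's official ERA depended on the relievers who followed him.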
Kaplan set out to measure how well relievers performed when they entered games with runners on base. The problem was accounting for the wide variety of situations--bases loaded and no outs, for example, or runners on first and third and one out.
To calculate the performance of individual relievers, Kaplan divided the number of a pitcher’s inherited runners that scored by the total expected to score, based on a matrix of probabilities developed by statistician Pete Palmer. If the resulting number was below 1, the reliever was better than average; above 1, he was worse than average. “The higher the RE,” Kaplan noted, “the worse the pitcher.”
All of which is bad news for the Dodgers’ Tim Crews. Kaplan found that Crews, though he had a decent ERA of 3.21, allowed 17 inherited runners to score when only 11.08 should have. Thus, his RE was an ignominious 1.53.
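The Crews figure can be checked directly from the two numbers given. The sketch below does that, then shows how an expected total might be summed from a table of game situations. The table values are placeholders made up for illustration and are not Pete Palmer's probabilities; the function names and situation labels are likewise hypothetical.

```python
def reliever_effectiveness(runners_scored, expected_to_score):
    """Inherited runners who scored, divided by the number expected to score."""
    if expected_to_score <= 0:
        raise ValueError("expected_to_score must be positive")
    return runners_scored / expected_to_score


# Figures reported for Tim Crews: 17 scored, 11.08 expected.
print(round(reliever_effectiveness(17, 11.08), 2))  # 1.53

# Placeholder table, NOT Palmer's matrix: expected number of inherited runners
# to score in one relief appearance, keyed by (bases occupied, outs).
HYPOTHETICAL_EXPECTED = {
    ("loaded", 0): 1.5,
    ("first_and_third", 1): 0.75,
    ("second", 2): 0.25,
}


def expected_inherited_runs(appearances, table):
    """Sum the table's expected value over every situation a reliever entered."""
    return sum(table[situation] for situation in appearances)


# A made-up reliever who entered three such situations and let 2 runners score.
appearances = [("loaded", 0), ("first_and_third", 1), ("second", 2)]
expected = expected_inherited_runs(appearances, HYPOTHETICAL_EXPECTED)
print(expected)                             # 2.5
print(reliever_effectiveness(2, expected))  # 0.8
```

A result below 1, as in the made-up case, would mark a reliever who stranded more inherited runners than the table predicts.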
Did Jay Howell, the Dodgers’ ace reliever, have the best RE last season? No. Surprisingly, Ricky Horton, a left-hander who failed to live up to the Dodgers’ expectations and is now with the St. Louis Cardinals, proved to be in top form when he came into the game with runners on base. Horton’s ERA was an ugly 5.06, but his RE was a strong .63.
“Horton was the best? That is surprising. Very surprising,” said Dodger announcer Ross Porter, a well-known purveyor of baseball numbers. A stat that rates Horton above Howell, he suggested, has credibility problems.
“Unless you’re Ricky Horton,” Porter added. “Then you say it shows how important I am and you use it at (a salary) arbitration hearing.”
Dodger pitcher John Wetteland quickly pointed out that the stat would not take into account certain strategic situations in which a team with a big lead would be willing to allow runs to score if it can get an out in exchange. Porter suggested it also cannot account for the emotional element of the game--a circumstance, say, in which a pitcher like Crews is called in to “mop up” in a losing cause.
All in all, though, Wetteland seemed impressed with Kaplan’s work.
“I like it,” he said. “It’s amazing how stats can lie. But this really puts a lot of things in perspective.”
Wetteland, who is now struggling with an ERA of 7.82, was supposed to start that day, but he had the flu. His replacement was Crews. He pitched four innings, giving up two runs, one earned. The Dodgers lost, 8-3, and Crews was charged with the loss.
Although he follows the Dodgers, Kaplan’s favorite team is the Mets. Before he entered Caltech, Kaplan achieved a measure of notoriety as one of the original Shea Stadium “Coneheads.” Donning conical headdresses that made them look like alien invaders from an old “Saturday Night Live” skit, Kaplan, his brother Todd and their pals would cheer on Met pitcher David Cone. Their antics landed them on the highlights of Mel Allen’s “This Week In Baseball.”
A spokesman for the Orioles said Kaplan will be working with their chief statistician.
Kaplan will be back at Caltech in the fall, hitting the books and, come springtime, trying to hit, period.
Kaplan expects more playing time next year. That .000 batting average, he suggests, is not a true measure of his abilities. “Hey,” he said, “I fouled off a couple.”