Beyond OPS: filling in the gaps
by John Walsh
February 25, 2009

I wanted to talk a little about OPS. I love OPS. I use it all the time when I need to know, say, whether Nick Markakis had a good year in 2008*, or if I’m wondering which was Vlad Guerrero’s best year.** OPS will not give you a definitive answer, but it’ll get most of the job done.

*Markakis’ OPS last season was .897, a very good number (seventh in the American League). I’m beginning to really like Nick Markakis: fine bat, solid defense, rifle arm, decent base runner. The 25-year-old Oriole right fielder doesn’t seem to have any holes in his game.

**Vlad’s best year with the bat was probably 2000 or 2003 or 2004. He topped 1.000 OPS in each of those years.

The value of OPS derives from its ability to measure offensive production pretty well, while being super easy to calculate. Actually, you don’t need to calculate it anymore, since it’s found on most mainstream baseball web sites nowadays. By now we have a feeling for what’s a good OPS: .750-.770 is about average, and 1.000 is a league-leading-type number.

OPS’s biggest shortcoming is probably its failure to correct for park effects: a 1.000 OPS in Colorado is not as impressive as the same number achieved in San Diego. This defect is remedied by OPS’s close relative: OPS+, which takes into account overall league offensive level in addition to park effects and puts everything on a nicely calibrated scale where 100 is average, 150 is 50 percent better than average, and so on.*

*A typical league-leading value for OPS+ is around 170 or so. In 2008, Albert Pujols led MLB with an OPS+ of 190. Albert, though, was way ahead of the pack—only one other player topped 165 (Chipper Jones at 174).

So, is OPS+ the best way to judge a player? First of all, let’s not forget that hitting is just one aspect of baseball.
When estimating a player’s value or comparing two players such as, say, Manny Ramirez and Carl Yastrzemski, you need to consider defense (range and arm), position played, base running and maybe some other stuff as well. So, when I said that Markakis had a good year last year, I really don’t know that. I need to check his defense and base running. (I just checked: average range and base running, super arm—overall a very good season.)

The neglected double play

Even putting aside defense and base running, OPS has some shortcomings, one of which is the subject of this article. Now, let me say right off the bat that this isn’t going to be one of those articles that compares a bunch of different run estimators, complete with correlation coefficients, citations of RMSE and the like.* Instead, I’m going to focus on one aspect of hitting that OPS doesn’t take into account: hitting into double plays or, more specifically, grounding into double plays (known as GIDP or GDP). Sometimes a player’s tendency to ground into double plays or his skill in avoiding the double play can make a real difference in how we view his production. At least, it should make a real difference.

*I don’t want to disparage the “theoretical” work that goes into finding the best run estimator. That kind of research is essential for moving sabermetrics forward, but this article is going to be smaller in scope.

I don’t mean to pick on OPS here. As far as I know, no statistic that purports to evaluate offensive production includes GDP. Batting runs, base runs, runs created, wOBA, VORP—none of these incorporate double plays. And for a good reason—not all players have the same opportunities for hitting into a double play, so to include double plays in a stat like OPS, you’d need to consider how often a player comes to bat with a runner on first base and fewer than two outs. And that requires an analysis of play-by-play data, not just season totals. Taking GDPs into account fairly is a lot of work.
Work that, in many cases, is probably not justified. But not all cases, not all.

My own double-play philosophy

My views on the GDP have changed over the years. I used to disparage the slow-footed oafs who seemed to ground into double plays with monotonous consistency. I can remember cringing whenever George “Boomer” Scott came to bat with Rice or Yaz on first base, knowing, just knowing that a nifty 4-6-3 double play was imminent. I can remember once standing up in the right field bleachers at Fenway and yelling, “Strike out, ya’ bum!” Boomer hit a two-hopper to the right side. I hated Boomer.

Many years later, but still several years ago, my brother and I were talking about Mike Piazza.

“What do you think of Piazza?” he asks me.

“What do I think of him? He’s a future Hall of Famer and the best guy on the team, that’s what I think of him! How can you not love Piazza?”

“Doesn’t he ground into a lot of double plays?” asks my brother. The Mets star catcher, despite being the best hitter on the team (as measured by OPS, heh), was being bad-mouthed in the New York press for grounding into a lot of double plays.

“Yes, I suppose he does, but hey, he’s a big strong guy who hits the ball hard and doesn’t run that well. Plus he’s always got Alfonzo on first base.” Then I asked a question: “Do you know who the career leader* in grounding into double plays is?”

“Who?”

“Hank Aaron*.”

Ah.

*Cal Ripken has since passed Aaron on the career GDP list. Cal was a pretty good player, too.

This was the point I was trying to make: excellent hitter, hits with lots of guys on base, doesn’t run well (and not many catchers do)—of course, he’s going to hit into a lot of double plays. Don’t worry about it, it’s part of the package. Don’t hold it against him.

That’s wrong, though. Mike Piazza is what he is (or was, I guess), a .300 hitter, decent plate discipline, exceptional power—all of it exceedingly rare in a catcher. But, you know what?
Piazza did hit into a lot of double plays; that’s part of his package, too. And if you’re going to evaluate a player, you need to evaluate the whole package, or at least as much of it as you can.

Figuring out double-play performance

You have probably already guessed that the main reason that Ripken and Aaron have hit into more double plays than anybody else is that they had more chances to hit into twin killings than most. In fact, they are nos. 3 and 2 on the all-time (actually, since 1953, when the Retrosheet play-by-play data kicks in) most GDP opportunities list. Obviously the thing to do is to normalize a player’s GDP by his GDP opportunities. “Normalize” is just a fancy word that means “divide,” and don’t ask me why I used it at all.

Even when you normalize the GDP number, though, do we know how to judge the result? For example, Albert Pujols has GDP’d in about 12.4 percent of his opportunities. Is that good? Bad? Average? You need a reference point, say the average value for all players, which happens to be 11.5 percent, more or less.* So, Pujols grounds into double plays at a slightly higher rate than the average batter. Now, Pujols ranks third in double plays grounded into since 2001, so if you didn’t consider his opportunities, you’d conclude that he was one of the worst GDPers of his time, instead of merely slightly below average.

*As mentioned, the average GDP rate in recent years is 11.5 percent.** It varies slightly from year to year, but not much. Note that I’m only including groundball double plays. About 2 percent of double-play situations result in non-groundball double plays, some of them line drives, some fly outs and not a few of them strike ’em out/throw ’em out caught stealing double plays. I’m only considering GDP’s—I don’t think batters have much control over the other kinds. Or, at least, they have less control.

**Just for fun, I checked the Retrosheet data for the year 1911.
That’s not a typo—Retrosheet has play-by-play data from 1911, and in that Dead Ball year, the average GDP rate was around 6 percent, about half the modern value. Whether the difference is due to different strategies (lots of bunting, hit-and-runs, stolen base attempts) or inferior defense is an interesting question. Maybe for a future article.

Ok, so we can compare a player’s GDP rate with the league average, but an even better measure is the number of GDP’s above or below average. That is, above or below the average number of GDP given a certain number of opportunities. Let’s go back to Pujols and look at his 2008 numbers: he came up in 173 double-play opportunities and grounded into 16 DPs. His rate is 9.25 percent, which is actually better than average. An average batter with 173 opportunities would have grounded into about 20 double plays (173 times .114, which was the average GDP rate in 2008), so let’s call Pujols +4 in DPA (double plays avoided*). Make sense?

*Feel free to forget the acronym DPA as soon as you have finished reading this article. Despite being on the geeky side, I have a real aversion to the proliferation of acronyms to label arcane baseball stats. Right now, I need to give this stat a name, so I can refer to the thing I’m writing about, but you can safely forget it after finishing this article. I’m going to.**

**UPDATE: This article will probably have a follow-up, so don’t forget what DPA is for another week or so.
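
The Pujols arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `double_plays_avoided` and its parameters are made up for this example, and the opportunity counts themselves would still have to come from play-by-play data.

```python
def double_plays_avoided(opportunities, gdp, league_rate):
    """Expected GDPs for an average batter minus the batter's actual GDPs.

    Positive means the batter hit into fewer double plays than average.
    """
    expected_gdp = opportunities * league_rate
    return expected_gdp - gdp


# Pujols, 2008 (figures from the text): 173 opportunities, 16 GDPs,
# league-average GDP rate of .114.
pujols_rate = 16 / 173
pujols_dpa = double_plays_avoided(173, 16, 0.114)

print(f"GDP rate: {pujols_rate:.2%}")
print(f"DPA: {pujols_dpa:+.1f}")
```

The unrounded result is about +3.7, which is where the +4 quoted above comes from once the expected total is rounded to 20.
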
The best and the worst

Now that we’ve decided to evaluate GDP tendencies with DPA (right, Double Plays Avoided), here are the leaders over the last three seasons (2006-2008):

+-----------------+------+------+------+
| name            | Opps | GDP  | DPA  |
+-----------------+------+------+------+
| Sizemore_Grady  |  308 |   10 | 25.4 |
| Beltran_Carlos  |  406 |   25 | 21.8 |
| Patterson_Corey |  233 |    6 | 20.9 |
| Abreu_Bob       |  505 |   38 | 20.2 |
| Suzuki_Ichiro   |  313 |   17 | 19.0 |
| Utley_Chase     |  376 |   25 | 18.3 |
| Giambi_Jason    |  300 |   17 | 17.6 |
| Damon_Johnny    |  261 |   13 | 17.1 |
| Howard_Ryan     |  415 |   31 | 16.8 |
| Hawpe_Brad      |  387 |   28 | 16.6 |
+-----------------+------+------+------+

This list makes sense; most of these guys have good speed and can get down the line fast to avoid the double play. Wait, what are Giambi and Howard doing on this list? What, do they get exceptional jumps out of the box or something? Jim Thome is 17th on this list (14.6 DPA) and David Ortiz is 20th (+12.7). Is anybody else surprised that these sluggardly sluggers are above average at not hitting into double plays? What’s this about? In other words, what makes a hitter good at avoiding the double play?

The obvious answer, foot speed, is only a partial answer. Otherwise Giambi and Howard would not be on the list. These big slow guys avoid double plays with another technique: they avoid hitting ground balls. Actually, a fair fraction of the time, these guys avoid hitting the ball at all. For example, in 2008, 37 percent of Giambi’s plate appearances resulted in a strikeout, walk or hit-by-pitch. And when he did put the ball in play it usually wasn’t a ground ball. In fact, both Giambi and Howard hit a ground ball in only 20 percent of their GDP opportunities. Compare that to Ichiro, who hit a grounder in over 40 percent of his GDP opps. That’s how Giambi and Ryan Howard and Jim Thome manage to be above average when it comes to avoiding the double play—they don’t hit many ground balls.
Here’s a list of the trailers, the guys who are grounding into more than their share of double plays (2006-2008):

+-----------------+------+------+-------+
| name            | Opps | GDP  | DPA   |
+-----------------+------+------+-------+
| Tejada_Miguel   |  438 |   82 | -31.5 |
| Molina_Yadier   |  247 |   53 | -24.6 |
| Hudson_Orlando  |  301 |   55 | -20.3 |
| Konerko_Paul    |  373 |   63 | -20.0 |
| Castillo_Jose   |  257 |   49 | -19.3 |
| Berroa_Angel M. |  134 |   34 | -18.5 |
| Martinez_Victor |  350 |   58 | -17.5 |
| Molina_Ben      |  292 |   51 | -17.4 |
| Lo Duca_Paul    |  222 |   43 | -17.3 |
| Young_Mike      |  432 |   67 | -17.2 |
+-----------------+------+------+-------+

I guess there aren’t many surprises here: catchers, other slow guys, guys who don’t strike out or walk too much. Some are good hitters, but when you evaluate their worth, better take into account all these rally-killers they are hitting into. Which raises the question: how much are these double plays costing the team, anyway?

The price you pay

Let’s try to put a run value on the double play. Now, a generic out costs the team about .3 runs, more or less. So, we might estimate that a double play is worth, well, double that: minus .6 runs. The thing is, though, the double play seems more costly than just two outs, doesn’t it? It just seems to hurt more* than two generic outs. And you know what? The double play is more costly than two regular outs. That’s because it always happens with fewer than two outs and runners on base. These are often situations where the offense is expected to score multiple runs; the double play drastically reduces that run scoring potential.

*The worst moment of my Little League career (which admittedly had quite a few bad moments): runners on the corners, one out, we’re down by two in the bottom of the seventh (final) inning. I step to the plate and hit it right at the third baseman. I run hard down the line, I think I can beat it out. I see the relay to second base from the corner of my eye.
I’m starting to worry as the ball is on its way to first. I’m out by a full step. Grounded into a double play to snuff out our comeback and end the game! Do you know how rare it is for 11-year-olds to turn the double play? I cried as our manager gathered up the bats.

So, I went through the Retrosheet play-by-play data to figure out the run value of the GDP: I get -.85 runs, pretty much constant over the last 55 years or so. So, now we can attach a run value to these extra double plays (above or below average). We do that by taking the difference between the value of the double play (minus .85) and the generic out. In other words, how much did the double play cost relative to making a single out (e.g. striking out or popping out or grounding into a 6-4 forceout, but beating the throw to first)? Now an average generic out is, as mentioned above, worth -.3 runs. But in a double-play situation, the generic out is somewhat more costly—it’s roughly -.38 runs. So the difference in value between hitting into a double play and making a generic out is -.5 runs*.

*The actual value is -.47 runs, but to be honest, I don’t really know if the real value is -.47 or -.45 or -.51 runs. It’s best to round off to -.5 runs. Let’s not pretend we know more than we really do.

I won’t reproduce the above tables with the run values; it’s enough to divide the DPA number by two to get the runs saved (or cost). So Grady Sizemore was worth an extra 11 runs (about one win) to the Indians over the last three years. At the other extreme, Miguel Tejada cost the Orioles and Astros nearly 18 runs (almost -2 wins) over the same time period. These are not large numbers, but as I like to point out, in today’s game a win on the free agent market is valued at around $4-4.5 million.

What have we learned?

For one, we’ve seen that for some players it’s important to take into account their performance in double-play situations when estimating their value. Not for the majority of players, but for some.
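
For readers who want to turn a DPA figure into runs and wins, the conversion described in “The price you pay” can be sketched as follows. The constant and function names are hypothetical, the DPA of +10 is a made-up input rather than any player’s figure, and the 10-runs-per-win factor is an assumption (a common rule of thumb, in line with “11 runs (about one win)” above), not something computed in this article.

```python
GDP_RUN_VALUE = -0.85          # run value of a GDP (from the article)
OUT_RUN_VALUE_DP_SIT = -0.38   # generic out in a double-play situation (from the article)
RUNS_PER_DP_AVOIDED = 0.5      # the article's rounded figure
RUNS_PER_WIN = 10.0            # ASSUMPTION: rule-of-thumb conversion, not from the data


def extra_cost_of_gdp():
    """How much worse a GDP is than a single out in the same situation."""
    return GDP_RUN_VALUE - OUT_RUN_VALUE_DP_SIT


def dp_runs(dpa, runs_per_dp=RUNS_PER_DP_AVOIDED):
    """Runs saved (positive) or cost (negative) for a given DPA."""
    return dpa * runs_per_dp


def dp_wins(dpa, runs_per_win=RUNS_PER_WIN):
    """Approximate wins corresponding to a given DPA."""
    return dp_runs(dpa) / runs_per_win


print(f"Extra cost per GDP: {extra_cost_of_gdp():.2f} runs")

hypothetical_dpa = 10.0  # made-up input for illustration
print(f"Runs: {dp_runs(hypothetical_dpa):+.1f}")
print(f"Wins: {dp_wins(hypothetical_dpa):+.2f}")
```

Under these assumptions a batter needs a DPA of roughly plus or minus 10 in a season before double plays move his value by half a win.
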
Here’s a list of players, from the 2006 to 2008 time period, whose double-play performance changes their overall single-season value by at least a half-win:

+-------------------+------+------+---------+
| Name              | Year | Team | DP_runs |
+-------------------+------+------+---------+
| Thome_Jim         | 2006 | CHA  |     6.1 |
| Johnson_Kelly     | 2008 | ATL  |     5.3 |
| Beltran_Carlos    | 2006 | NYN  |     5.0 |
| Glaus_Troy        | 2006 | TOR  |    -5.0 |
| Molina_Yadier     | 2008 | SLN  |    -5.0 |
| Peralta_Jhonny    | 2008 | CLE  |    -5.1 |
| Berroa_Angel M.   | 2006 | KCA  |    -5.3 |
| Butler_Billy      | 2008 | KCA  |    -5.5 |
| Escobar_Yunel     | 2008 | ATL  |    -5.5 |
| Ordonez_Magglio   | 2008 | DET  |    -5.5 |
| Guerrero_Vladimir | 2008 | ANA  |    -6.2 |
| Tejada_Miguel     | 2008 | HOU  |    -7.1 |
+-------------------+------+------+---------+

We’ve also learned that avoiding double plays is not just a question of foot speed. You can also avoid the DP by not hitting ground balls. Indeed, the correlation between speed and double play tendencies is not all that strong. In a future piece I will investigate this issue a bit more. I also want to have a look at historical double-play performances: the best and the worst all-time at avoiding the double play.