The peak of perfection
Lovers of great pitching are living in a golden age. The last six years, 2007 to 2012, have all had multiple no-hitters, including the first post-season no-no since Don Larsen. Even more striking has been the explosion of perfect games. In three and a half seasons, we have witnessed five of those gems (plus a sixth that was maddeningly close). The first 133 years of organized baseball produced 17 perfect games, roughly one every eight years. Right now, we’re getting them more than once a season.
Students of the game are asking why. The turning of the wheel toward a lower-scoring game is part of the equation, as is the greater number of games played in the major leagues each season, almost twice the number of 1960 and before. These seem inadequate to explain all of the current surge, though. More than one person has calculated the historic odds of an average pitcher getting 27 outs from 27 batters, multiplied it by all the games played, and concluded that things have gone haywire.
I did not aim for such a big conclusion. My curiosity started more modestly: I wanted to calculate how likely it was that the pitchers who did throw perfect games would have done so. Given their performance in the years they accomplished the deed, and throughout their careers, what were the underlying chances that they would get themselves into the record books this way?
The grungy details
The numbers are straightforward: how many batters the pitchers faced, how many batting outs they produced, then a lot of exponents. One needs to discount intentional walks and sacrifice bunts from the equations. The latter literally cannot happen in a situation where a perfect game is possible; the former practically never happens in such a situation. Thus, I do not count intentional passes as plate appearances, or sacrifice bunts as plate appearances or outs. Sacrifice flies, being glorified fly balls, count as plate appearances and outs.
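For readers who want to check the arithmetic, here is a minimal Python sketch of the calculation described above. It assumes a perfect game means 27 consecutive outs at a constant per-batter out rate, and that each start is independent of the others. The function name `perfect_game_odds` and its parameters are illustrative only, not taken from the original worksheet.

```python
def perfect_game_odds(adj_pa, reached, starts, outs_needed=27):
    """Return (outs per PA, chance per game, chance over all starts).

    adj_pa  -- plate appearances, minus intentional walks and sacrifice bunts
    reached -- batters who reached base by any means
    starts  -- number of starts in the span (season or career)
    """
    out_rate = (adj_pa - reached) / adj_pa
    # Chance of retiring 27 straight batters at a constant out rate.
    per_game = out_rate ** outs_needed
    # Chance of at least one perfect game across all starts.
    per_span = 1 - (1 - per_game) ** starts
    return out_rate, per_game, per_span


# Sandy Koufax, 1965, using the adjusted totals from the season table.
rate, game, season = perfect_game_odds(adj_pa=1366, reached=325, starts=44)
print(f"Outs/PA {rate:.3f}  Prob-Game {game:.4%}  Prob-Season {season:.3%}")
```

Run on Koufax's 1965 line, this lands very close to the 0.0652 percent and 2.827 percent listed for him in the season table, which suggests the sketch matches the method used, though the original calculation may differ in small details.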
I also counted post-season appearances in the year and career numbers. Larsen’s perfect game came in the World Series, and Roy Halladay came one walk away from a playoff perfecto in 2010, “settling” for a no-hitter. Granted, there’s a heightened degree of difficulty when playing against a team that’s made the World Series (or earlier rounds), but they are games, and it’s not like they count less than the regular season.
For much of baseball history, all this works out fine. For five of the 22 perfect games, though, some numbers are missing from the records. Intentional walks were only regularly counted starting in 1955, and players reaching on errors were tracked from 1948 on. So for Lee Richmond and Monte Ward in 1880, Cy Young in 1904, Addie Joss in 1908, and Charlie Robertson in 1922, I had to do some guesstimating.
Even at a site like this, going into great detail would strain readers’ patience. I will note one peculiarity I discovered while doing this work. In 1948, the first year reached on error was tabulated, 33.3 percent of recorded errors resulted in a batter reaching base. The next year, this jumped to 40.8 percent, then 48.8 percent, then 52.5 percent in 1951, and it crept upward from there. (The 2011 figure was 58.4 percent.) I doubt that actual fielding competence, or scorers’ rulings, changed this rapidly. It appears that, despite lack of notation in many split lines, the ROE totals are incomplete for the first several years that the stat was tracked. Future researchers, take note.
Season numbers
The table below shows seasonal statistics for all 22 pitchers to throw perfect games. Season stats for Philip Humber and Matt Cain are complete through July 1, and must be adjusted accordingly to compare them with full-season numbers.
Pitcher                   Starts  PA(Adj'd)  Reached  Outs/PA  Prob-Game   Prob-Season
Lee Richmond 1880         66      2410       750      0.689    0.00425%    0.280%
Monte Ward 1880           67      2351       678      0.712    0.0102%     0.684%
Cy Young 1904             41      1433       394      0.725    0.0170%     0.694%
Addie Joss 1908           35      1208       295      0.756    0.0521%     1.808%
Charlie Robertson 1922    34      1118       400      0.642    0.000642%   0.0218%
Don Larsen 1956           22      795        254      0.681    0.00306%    0.0674%
Jim Bunning 1964          39      1123       322      0.713    0.0109%     0.425%
Sandy Koufax 1965         44      1366       325      0.762    0.0652%     2.827%
Catfish Hunter 1968       34      949        286      0.699    0.00623%    0.212%
Len Barker 1981           22      654        207      0.683    0.00345%    0.0759%
Mike Witt 1984            34      1022       326      0.681    0.00313%    0.106%
Tom Browning 1988         36      992        282      0.716    0.0120%     0.430%
Dennis Martinez 1991      31      895        265      0.704    0.00764%    0.237%
Kenny Rogers 1994         24      710        236      0.668    0.00183%    0.0439%
David Wells 1998          34      967        263      0.728    0.0190%     0.643%
David Cone 1999           33      877        287      0.673    0.00224%    0.0742%
Randy Johnson 2004        35      956        242      0.747    0.0378%     1.315%
Mark Buehrle 2009         33      860        276      0.679    0.00290%    0.0955%
Dallas Braden 2010        30      775        232      0.701    0.00674%    0.202%
Roy Halladay 2010         36      1066       294      0.724    0.0165%     0.591%
Philip Humber 2012        12      299        103      0.656    0.00112%    0.0134%
Matt Cain 2012            16      443        115      0.740    0.0299%     0.477%
There are only three pitchers showing a better than 1 percent chance to throw a perfect game the season they did it: Addie Joss, Sandy Koufax, and Randy Johnson. (Matt Cain might get there with a hot second half and/or some playoff starts.) Koufax’s number exceeds the rest through a convergence of factors: he had a great season with a lot of starts in a historically low run environment. Randy Johnson had a better ERA+ in his perfect-game year than Koufax (176 vs. 160), but he was pitching in a five-man rotation at the tail end of an offensive explosion. Joss’ ERA+ outstripped both, and he pitched deep in a deadball era, but a higher error rate depressed his chances.
Six pitchers had less than a 0.1 percent chance of making history that season. Larsen, Barker, and Rogers had the fewest starts of anyone on the list (except this year’s entries so far), helping explain their long odds. Cone pitched well, but yielded lots of walks. Buehrle has a reputation for getting results better than his underlying numbers, which certainly happened one day in 2009.
As for Charlie Robertson, he comes across as the biggest fluke on the list. He didn’t pitch badly that year, with a 111 ERA+, but much of that value came in suppressing extra-base hits during the early stretch of the Ruthian bat-boom. His OBP-against at .346 was just better than the suddenly inflated league average. Perfect games were next to impossible in that offensive flood, and the relatively anonymous name of the man who managed it just highlights the unlikeliness.
The perfect game from the worst pitcher for a season has a fairly unexpected source: Catfish Hunter. Hunter, still only 22, was down in ’68 from his first truly good season, and posted a mere 83 ERA+. It was still uncertain then whether Catfish was a young star in the making or just a kid who lucked into a good ERA in ’67. Getting his perfecto in the Year of the Pitcher was a mark in the “lucky break” column, but he’d provide some countervailing evidence in seasons to come.
Philip Humber’s 2012 ERA+ through July 1 is, at 71, even worse than Catfish’s, and his perfect-game chances are worst on the list (though for only half a season: he has time to pass Robertson). As I write, he’s on the disabled list with a right elbow flexor strain. One can wonder whether the effort of producing a perfect game could have caused the injury, or at least a lesser strain that left him ineffective and grew into outright injury. To answer, I note that in his first four games this year, he threw 115, 96, 115, and 107 pitches. His perfect game was the 96. He wasn’t busting any pitch counts to get in the record books, so we probably should look elsewhere to explain his woes.
Just for some fun, I ran the numbers on three other pitchers who threw unofficial perfect games: Harvey Haddix in 1959, Pedro Martinez in 1995, and Armando Galarraga in 2010. The odds are still for throwing nine perfect innings, even though one did more and another did arguably less (or arguably more).
Pitcher           Starts  PA(Adj'd)  Reached  Outs/PA  Prob-Game   Prob-Season
Haddix 1959       29      875        240      0.726    0.0174%     0.5035%
Martinez 1995     30      776        237      0.695    0.00533%    0.1597%
Galarraga 2010    24      615        201      0.673    0.00229%    0.0549%
Haddix had a fine 1959 when he pitched those twelve perfect innings. Odds-wise, he’s solidly in the upper half of our exclusive group. Martinez’s low chances seem surprising, but it would be like Koufax tossing a perfect game in 1960, or Hunter in 1968. It was a couple years before he started putting up those mind-bending numbers. Galarraga’s odds are no shock: one could argue he’s the flukiest pitcher on the list, without even being on the official list.
Career numbers, and why they’re unreliable
I have also calculated the probabilities for our perfect pitchers to have done the deed throughout their careers. It is here that I discovered an unreliability in the numbers—but an unreliability that turns out to be informative, giving some perspective on our current perfect-game boom. I will explain it soon, but for now, I must echo a myriad of sports betting-line columns and state that the following numbers are for entertainment purposes only. Again, stats for contemporary pitchers are complete through July 1.
Pitcher      Starts  PA(Adj'd)  Reached  Outs/PA  Prob-Game   Prob-Career
Richmond     179     6819       2610     0.617    0.000220%   0.0394%
Ward         262     10162      2956     0.709    0.00932%    2.412%
Young        818     29678      9277     0.687    0.00392%    3.156%
Joss         260     8880       2530     0.715    0.0117%     2.993%
Robertson    141     4265       1592     0.627    0.000332%   0.0468%
Larsen       177     6718       2261     0.663    0.00154%    0.273%
Bunning      519     15341      4673     0.695    0.00550%    2.813%
Koufax       321     9576       2708     0.717    0.0127%     3.983%
Hunter       495     14368      4199     0.708    0.00885%    4.285%
Barker       194     5590       1877     0.664    0.00159%    0.309%
Witt         301     8906       2923     0.672    0.00216%    0.649%
Browning     303     7990       2534     0.683    0.00336%    1.014%
Martinez     569     16740      5364     0.680    0.00295%    1.666%
Rogers       482     14317      4951     0.654    0.00106%    0.508%
Wells        506     14745      4674     0.683    0.00338%    1.698%
Cone         437     12543      3959     0.684    0.00357%    1.548%
Johnson      619     17385      5325     0.694    0.00515%    3.137%
Buehrle      385     10745      3465     0.678    0.00272%    1.043%
Braden       79      2056       673      0.673    0.00224%    0.177%
Halladay     368     10706      3211     0.700    0.00659%    2.396%
Humber       40      1200       385      0.679    0.00291%    0.116%
Cain         222     5895       1750     0.703    0.00741%    1.633%
While producing this table, I briefly had the post-season numbers for our contemporary pitchers isolated, and got an eye-opener. In four career post-season starts, Mark Buehrle works out as having had a 0.0542 percent chance of a perfect game (on 34 batters reaching in 121 plate appearances, both numbers duly adjusted). This exceeds Charlie Robertson’s perfect game chances (0.0468 percent) for his entire career of 141 starts. Matt Cain outdoes them both, working up a 0.1139 percent chance in only three post-season starts. Even with the unsteadiness of the numbers, this illuminates just how unlikely it was for a journeyman pitcher in the Era of the Babe to do what Robertson did.
I was surprised to find Hunter edging out Koufax for the highest probability. Half again as many starts, in a similarly low-offense era, helped Catfish plenty. Lee Richmond ends up with an even worse career probability than Robertson does, despite having 38 more starts. It’s the errors (and maybe my projections on errors for 1880) producing this result, so you can take this with a grain of salt. This is not, however, the big problem I was talking about.
So what is that problem? You’ll get a hint of it if you look at Sandy Koufax’s season and career numbers. He’s 3.98 percent for his whole career, and 2.83 percent for 1965 alone. Was Sandy that much less likely to have thrown perfect games in his other seasons? I checked: he wasn’t. He was at 1.26 percent in 1966. Perfect game chances are multiplicative, not additive, but that still comes out as 4.05 percent for the years 1965 and 1966, compared to 3.98 percent for 1955 to 1966.
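The two-season figure can be reproduced with a short Python sketch. It assumes the seasons are independent, so the chance of at least one perfect game is one minus the chance of missing in every season. The helper name `combined_chance` is made up for illustration.

```python
def combined_chance(*chances):
    """Chance of at least one success across independent spans."""
    miss_all = 1.0
    for p in chances:
        miss_all *= 1 - p  # chance of coming up empty in this span too
    return 1 - miss_all


# Koufax's season odds for 1965 and 1966, as quoted in the text.
two_years = combined_chance(0.0283, 0.0126)
print(f"1965 and 1966 combined: {two_years:.2%}")
```

Simple addition would give 4.09 percent; compounding trims that to about 4.05 percent, which is still higher than the 3.98 percent computed from his pooled career totals.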
This is, of course, impossible. What’s going on here?
The hitch is that these collective numbers represent a wide diversity of individual days and situations. On some days, the pitcher will be loose and ready; the fastballs will crackle; the breaking balls will snap; the catcher won’t need to move his glove; the wind will be blowing in at Dodger Stadium. Other days, he’ll have a cold or a tight shoulder or a twinge in his elbow; he’ll never find top gear with the fastball; the curve will be limp; his precision will be gone; he’ll be pitching at Coors Field, or Wrigley with a hot wind blowing out. There are good days and bad days, and while you can average them, they don’t average out where perfect games are concerned.
Take, for example, a pitcher good enough to get outs on 70 percent of his batters faced. That is a mean figure: he’s not pitching at 70 percent efficiency every day. If the variance of his underlying efficiency is large enough, it can produce huge effects on his perfect-game chances. Say he has two lousy starts at 60 percent, followed by a locked-in day at 90 percent. The 90 percent day is good enough to produce a 5.81 percent chance of a perfect game, whereas three games at 70 percent would only rate a 0.0197 percent chance.
That’s an extreme example, but the principle holds even with much smaller variances. Staying with our 70 percent pitcher, let’s postulate one day where he’s working at a 23 outs/32 batters rate (71.9 percent) and two where he’s at 20/29 (69.0 percent). Add the three days together, and he’s at 63/90, or 70 percent on the dot. (I’m assuming he gets to face more batters on days he’s pitching better.) With no variance, his odds are, as stated above, 0.0197 percent. With the slight variance, it rises to 0.0222 percent.
And it’s not only due to the good start having a greater variance from the mean than the lesser ones. Switch the numbers, make it two 22/31 (71.0 percent) days and a 19/28 (67.9 percent) day. The collective perfect-game chance is 0.0219 percent, still significantly ahead of the 0.0197 percent for the no-variance case. The conclusion is plain: any variance from the average produces gains in perfect-game chances from the “on” days that more than offset losses from the “off” days.
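The three-start examples above can be checked with a few lines of Python. This is a sketch under the same simplifying assumptions as the rest of the piece: each day has a fixed out rate, a perfect game is 27 straight outs, and starts are independent. The name `span_chance` is illustrative. The sketch compounds the daily chances rather than summing them, but at odds this small the two approaches agree to the digits shown.

```python
def span_chance(daily_out_rates, outs_needed=27):
    """Chance of at least one perfect game, given each day's out rate."""
    miss_all = 1.0
    for rate in daily_out_rates:
        miss_all *= 1 - rate ** outs_needed
    return 1 - miss_all


steady = span_chance([63 / 90] * 3)                  # 70 percent every day
mild = span_chance([23 / 32, 20 / 29, 20 / 29])      # one good day, two off
switched = span_chance([22 / 31, 22 / 31, 19 / 28])  # two good days, one off
extreme = span_chance([0.60, 0.60, 0.90])            # two lousy, one locked in

for label, chance in [("steady", steady), ("mild", mild),
                      ("switched", switched), ("extreme", extreme)]:
    print(f"{label:9s}{chance:.4%}")
```

The ordering comes out the same as in the text: the steady pitcher trails both mild-variance cases, and the extreme case dwarfs them all. The underlying reason is that raising an out rate to the 27th power is a sharply convex operation, so a gain on a good day is worth more than an equal-sized loss costs on a bad one.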
Koufax recapitulates that on a career level. He had several years when he was an okay pitcher, followed by several years when he was The Left Arm of God. Taking a 12-year average of his performance dilutes how dominant he was in those glory years, and crashes the math on how likely he was to put together a perfect game. The same thing happens year to year, month to month, day to day.
We have no mechanism by which to gauge how good a pitcher truly is from one day to another. Hits and runs, walks and strikeouts do tell a valid story, but they are fluctuations around an unseen baseline. We can’t measure that daily baseline yet, and though the emerging science of baseball is making inroads on a dozen different levels, I don’t think we will ever have it nailed down.
Breaking down the numbers into ever-smaller groups to chase a more precise answer brings on a reductio ad absurdum. Just because a pitcher retired 27 straight batters one day doesn’t mean he was a 100 percent lock that day to pitch a perfect game. It might have been as high as a 1-2 percent chance; it might have been one in a million.
What is pretty sure is that the chances are better than they appear from taking averages for a season, a career, or the whole history of baseball. The tables above are useful for comparison between the pitchers, but they end up representing a floor on the true range of probabilities. Perfect games turn out to be more likely than straightforward calculations make them out to be, and the current surge we are witnessing is not as big a warping of the percentages as some have concluded.
Might we figure out some day what the percentages really are? It would mean figuring out the true shape of the bell curve of pitchers’ performances, how the curve bends differently for aces, journeymen, and fringe players, how far and how often they stray from their average talent level. Even an approximation would be a weighty undertaking, a mountain too steep for me to climb.
But someone else might. People do keep beating the odds, and more often than you think.
References & Resources
Baseball-Reference.com
Retrosheet.com
Great, interesting article. I particularly enjoyed the point about variance typically being neglected in using large range totals to calculate these types of odds.
I still wonder if a lot of these sorts of questions are ill-formed. What I mean by that is this – Phil Humber throws a perfect game, and it strikes us as flukey and incredibly unlikely, given that he is not, say, Sandy Koufax in his prime. So we start calculating odds that Phil Humber would have thrown that perfect game in that game, in this season, in a season, etc.
But none of those questions really get at the phenomenon we just observed. We are surprised not that *Phil Humber* threw a perfect game, but that *someone like Phil Humber* threw a perfect game. Probabilities are calculated as successes (or hits) over the entire space of possible outcomes. So it seems important to note that hits here don’t just include PH’s perfect games, but perfect games tossed by anybody of his caliber (or really, his caliber or below). So whether that was flukey or unlikely depends on a lot more than just Humber’s chances per se.
This may be more of a fine philosophical point than a statistical, sabermetric one, but the problem as I see it is the timing of the question. If I ask what Humber or Cain or whoever’s chances are ahead of time, then these calculations seem to reflect the questions. What is the chance that, say, Matt Cain will throw a perfect game in his next 10 starts? That we can guesstimate, and guesstimate better by accounting for variance and such. But if I wait until after it happens and then ask what the chances were that it would have happened, my calculations don’t really track the phenomenon I’m interested in. Because now what I’ve seen is that *a* perfect game was thrown – if I didn’t specify which one I was looking for ahead of time, then it seems inaccurate to pretend that this particular perfect game was the one that I was thinking of all along.
It’s like winning the lottery, to use a classic example. If one of my friends wins the lottery, that will seem amazing and incredibly unlikely. But I’ll be surprised if anyone I know wins the lottery, not just he/she. So that’s the type of odds I should be calculating after the fact. Clearly my friends who buy 100 tickets are more probably the winner than the ones who buy one, to use a Koufax/Humber comparison. But if it turns out I have a lot of friends, and I know everyone who played the lottery – then I shouldn’t be surprised at all.
For comparison’s sake, is it possible to produce the numbers for some extraordinary seasons that did not produce a perfect game? Say some of the top ERA+ seasons of all time (Tim Keefe 1880, Pedro Martinez 2000, Dutch Leonard 1914, Greg Maddux 1995, Walter Johnson 1913, Bob Gibson 1968, etc.) and perhaps a few other interesting seasons (Old Hoss Radbourn 1884, Nolan Ryan pick the season, etc.).
Nyet: Good points all around. I didn’t tackle the odds of other players getting perfect games because the article was threatening to grow quite huge already. Just as it’s untrue that Koufax’s perfect-game chances were 100%, it’s untrue to think that, say, Bob Gibson’s chances were 0%. And if you play enough long-shots, one of them should hit. It’s a selection bias, but one that gave me an excuse not to run the numbers for every starting pitcher ever, so I embraced it. <wink>
But since Paul G. asked, I’ll run them for one guy who didn’t get a perfect game: Pedro Martinez in 2000. And let me be the first to say: holy mackerel! (Okay, I’m not remotely the first guy to say “Holy mackerel!” about how Pedro pitched in 2000.)
Starts: 29
Reached/PA: 185/815
Outs per PA: 0.773 (Koufax ’65 was only 0.762.)
Prob-Game: 0.0957% (Koufax ’65 was 0.0652%.)
Prob-Season: 2.74% (Koufax ’65 was 2.83%.)
Only Pedro’s limited starts keep him from busting Sandy’s season odds. The numbers are saying Pedro was basically a thousand-to-one shot to throw a perfect game any day he took the mound in 2000. And remember, with the variance effect it was even likelier than that. I’d play those odds.
Thanks, Shane. After-the-fact probability questions, particularly in sports scenarios, are interesting, primarily because of the overwhelming temptation to narrow in on the small subset of the present. I really like how you emphasized here the differences between “for any given game” v. “within the season” mindsets.
One more thought to add to your variance concept. The pitcher will vary from game to game – inter-game variance – and that will increase his chances compared to the steady-as-she-goes guy. But another effect is that the pitcher will experience intragame variance, simply because the batters will be of differing quality. And this actually harms the pitcher’s chances. A quick demonstration – what are a pitcher’s chances of getting back to back guys out? If he can retire guys 80% of the time, then it will be (.8)^2 = a 64% chance. But if player B is better than A (or vice versa), and they just average out to .8 – say, .9 and .7 – then it’s .9 * .7 = 63% chance. It gets worse when you’re raising to the 27th power, obviously.
(This is a variation of the farmer problem: a farmer has, say, 20 feet of fence and wants to build a four-cornered pen with maximal area. The way to do this, no matter how much fence he has, is always to make a square. In this case 5 x 5 = 25, which is more than 6 x 4 = 24 or 7 x 3 = 21. If the pen can have no corners, a circle is similarly better than an ellipse.)
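The back-to-back-batters arithmetic in the comment above extends naturally to a full 27-out game. The Python sketch below is only an illustration: the nine per-batter out rates are invented so that they average exactly 70 percent, and the lineup is assumed to bat in order three times with no other changes. The names `all_out_chance`, `lineup`, `flat_game`, and `mixed_game` are hypothetical.

```python
import math


def all_out_chance(out_rates):
    """Chance of retiring every batter in the sequence, assuming independence."""
    return math.prod(out_rates)


# The two-batter example from the comment.
uniform_pair = all_out_chance([0.8, 0.8])  # about 0.64
mixed_pair = all_out_chance([0.9, 0.7])    # about 0.63

# A made-up lineup whose out rates average 0.70, faced three times through.
lineup = [0.78, 0.74, 0.66, 0.62, 0.70, 0.70, 0.72, 0.68, 0.70]
flat_game = all_out_chance([0.70] * 27)
mixed_game = all_out_chance(lineup * 3)

print(f"two batters: uniform {uniform_pair:.2f}, mixed {mixed_pair:.2f}")
print(f"27 batters:  uniform {flat_game:.5%}, mixed {mixed_game:.5%}")
```

For any fixed average, the product is largest when every factor is equal, which is the same fact that makes the square the best pen. So batter-to-batter variation pulls perfect-game odds down, working against the day-to-day variation discussed in the article.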