# The Physics of the Most Perfect Game

It happens. You’re walking down the street and you suddenly feel the need to straighten your tie, fix your hair, or adjust the collar on your coat. You see a weak reflection of yourself in a window and get to work only to realize the people inside the window have a clear view and are trying hard not to laugh.

Now, a normal person would just be embarrassed and walk away as quickly as possible. However, a nerd like me would continue to stare at the window because this episode brings to mind an experiment to help me grasp the nature of the universe.

Instead of standing in front of a window, I could shine a flashlight at a piece of glass. Most of the light goes straight through, but a small fraction, about four percent, reflects back toward the flashlight. Which part of the light bounces back? Could you build a flashlight without the bounce-back light? So many questions.

The flashlight’s beam, like everything else in our world, can be broken down into smaller entities. Perhaps you have heard of the fundamental particles of light called “photons.” The beam is composed of about a billion-billion identical photons coming out of the flashlight each second.

Since the photons are identical, there is no way to distinguish in advance which ones will reflect off the glass and which will go through. Therefore, the only workable explanation is statistical. As each photon encounters the glass, it has a one-in-25 chance (four percent) of bouncing back.

This statistical behavior turns out to be fundamentally true of everything in nature and is the basis for the science called “Quantum Mechanics.” However, The Hardball Times is about baseball, not physics. So, let’s move on to understanding a bit about how statistics work.

If a billion photons encounter the glass, then about 40 million photons will reflect. However, if you look at 25 photons hitting the glass, you would most likely see one photon reflect, but you would not be surprised to see zero, or two, or maybe even three reflect. Finally, if you examine a single photon you really can’t predict which it will do at all.
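For the curious, here is a minimal sketch of the 25-photon case. It assumes each photon reflects independently with the four percent chance described above; the function name `reflect_probability` is made up for illustration.

```python
from math import comb


def reflect_probability(n_photons: int, k_reflected: int, p_reflect: float = 0.04) -> float:
    """Binomial probability that exactly k of n photons reflect.

    Assumes every photon reflects independently with probability p_reflect.
    """
    return (
        comb(n_photons, k_reflected)
        * p_reflect**k_reflected
        * (1 - p_reflect) ** (n_photons - k_reflected)
    )


# Chances of seeing zero, one, two or three reflections out of 25 photons.
probs = [reflect_probability(25, k) for k in range(4)]
for k, p in enumerate(probs):
    print(f"{k} reflected: {p:.1%}")
```

Under those assumptions, one reflection comes out as the single most likely result, at roughly 38 percent, with zero close behind at about 36 percent, two at about 19 percent and three at about 6 percent.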

The universe – like baseball – runs on statistics. So, if you look at a stat like on-base percentage (OBP) it might give you a good sense about how likely a batter is to reach base over the course of a season. However, it is less helpful if you try to use it over a given week when a hitter might be on a hot streak or in a slump. It is even less predictive for a single at-bat.

Nonetheless, when it comes to perfect games, it is really the only relevant statistic that is easy to find and apply. Below is a table listing all 23 perfect games in major league history.

| Pitcher | HOF | Date | Team | Opponent |
|---|---|---|---|---|
| Lee Richmond | | 6/12/1880 | Worcester Ruby Legs | Cleveland Blues |
| Monte Ward | | 6/17/1880 | Providence Grays | Buffalo Bisons |
| Cy Young | √ | 5/5/1904 | Boston Pilgrims | Philadelphia Athletics |
| Addie Joss | √ | 10/2/1908 | Cleveland Naps | Chicago White Sox |
| Charlie Robertson | | 4/30/1922 | Chicago White Sox | Detroit Tigers |
| Don Larsen | | 10/8/1956 | New York Yankees | Brooklyn Dodgers |
| Jim Bunning | √ | 6/21/1964 | Philadelphia Phillies | New York Mets |
| Sandy Koufax | √ | 9/9/1965 | Los Angeles Dodgers | Chicago Cubs |
| Catfish Hunter | √ | 5/8/1968 | Oakland Athletics | Minnesota Twins |
| Len Barker | | 5/15/1981 | Cleveland Indians | Toronto Blue Jays |
| Mike Witt | | 9/30/1984 | California Angels | Texas Rangers |
| Tom Browning | | 9/16/1988 | Cincinnati Reds | Los Angeles Dodgers |
| Dennis Martinez | | 7/28/1991 | Montreal Expos | Los Angeles Dodgers |
| Kenny Rogers | | 7/28/1994 | Texas Rangers | California Angels |
| David Wells | | 5/17/1998 | New York Yankees | Minnesota Twins |
| David Cone | | 7/18/1999 | New York Yankees | Montreal Expos |
| Randy Johnson | √ | 5/18/2004 | Arizona D-backs | Atlanta Braves |
| Mark Buehrle | | 7/23/2009 | Chicago White Sox | Tampa Bay Rays |
| Dallas Braden | | 5/9/2010 | Oakland Athletics | Tampa Bay Rays |
| Roy Halladay | | 5/29/2010 | Philadelphia Phillies | Florida Marlins |
| Philip Humber | | 4/21/2012 | Chicago White Sox | Seattle Mariners |
| Matt Cain | | 6/13/2012 | San Francisco Giants | Houston Astros |
| Felix Hernandez | | 8/15/2012 | Seattle Mariners | Tampa Bay Rays |

There are few slouches on this list of pitchers. In fact, more than 25 percent are members of the Hall of Fame and there are few pitchers you haven’t at least heard about. The performance most worthy of note, of course, is Don Larsen’s perfecto in the World Series.

Charlie Robertson’s effort against Detroit included an 0-for-3 day by Ty Cobb. Cobb had a rough start in 1922. On that day in April he was batting only .083 with a .154 OBP. He recovered and finished the campaign with his more typical .401 batting average.

Imagine the misfortune of Ossee Schrecongost of the Philadelphia Athletics, who was blanked during Cy Young’s perfect game. He was then traded to the White Sox in time to be blanked by Addie Joss four years later. Young’s perfect game was thrown against Hall of Famer Rube Waddell and Joss pitched his against another Hall of Famer, Ed Walsh.

That’s fun and all, but let’s get back to statistics. We’ll start with something relatively simple like dice. Suppose you hold a single die. What are the chances you won’t throw an ace (one)? Well, you’ll throw an ace about one in six throws, so the odds of throwing an ace are one in six, or the probability of throwing an ace is one-sixth. The chances of not throwing a one then must be one minus one-sixth or five-sixths.

Suppose you are going to throw the die two separate times. What are the chances that you will not throw at least one ace? There are 36 possible combinations. Eleven of them have at least one ace. So the probability of throwing no aces is 36 minus 11, or 25 of the 36 possible outcomes. An easier way to calculate this number is to just take the five-sixths chance for the first throw and multiply by the five-sixths chance for the second throw.
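If you would rather count than trust me, a short sketch like this one lists all 36 two-throw outcomes and checks the count against the multiplication shortcut. It assumes a fair six-sided die, and the variable names are mine.

```python
from fractions import Fraction
from itertools import product

# Every ordered result of throwing one fair die twice.
faces = range(1, 7)
outcomes = list(product(faces, repeat=2))

# Outcomes that contain at least one ace (a one).
with_ace = [roll for roll in outcomes if 1 in roll]

# Probability of no aces, by counting and by the multiplication shortcut.
no_ace = Fraction(len(outcomes) - len(with_ace), len(outcomes))
shortcut = Fraction(5, 6) * Fraction(5, 6)

print(len(outcomes), len(with_ace), no_ace, shortcut)
```

Exact fractions are used here instead of decimals so the two answers can be compared without rounding getting in the way.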

Suppose you intend to throw the die 27 times. What are the chances you never throw an ace? Well, I guess it would be five-sixths times five-sixths times five-sixths…until you do it 27 times. If you care, the probability is 0.73 percent, which means the odds are once in 137 tries.
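The 27-throw figure takes one line of arithmetic to check. This is only a sketch, again assuming a fair die and independent throws.

```python
# Chance of avoiding an ace on each of 27 independent throws of a fair die.
p_no_ace_27 = (5 / 6) ** 27

# The same chance expressed as "once in this many tries".
one_in = 1 / p_no_ace_27

print(f"{p_no_ace_27:.2%}, or about once in {one_in:.0f} tries")
```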

What does this have to do with perfect games? If you accept that the on-base percentage (OBP) is the probability that a batter will reach base, then the probability that he won’t get on base is one minus the OBP. If no batter reaches base, then you have a perfect game.

So, the probability of throwing a perfect game is roughly the product of one minus the OBP for each batter each time he comes to the plate. The box scores from Baseball-Reference.com include the OBP for each batter at the beginning of the game. However, they only go back to the 1920s.
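Here is a rough sketch of that product. It assumes the same nine batters each come up three times with no substitutions and that every plate appearance is independent. The OBP values are invented for illustration and are not taken from any real box score; `perfect_game_probability` is a hypothetical helper name.

```python
from math import prod


def perfect_game_probability(lineup_obps, times_through_order=3):
    """Chance that no batter reaches base.

    Assumes each lineup spot bats times_through_order times and that
    plate appearances are independent, so the chances simply multiply.
    """
    return prod((1 - obp) ** times_through_order for obp in lineup_obps)


# Made-up OBPs for a nine-man lineup (illustrative only, not real data).
example_obps = [0.350, 0.340, 0.360, 0.330, 0.320, 0.310, 0.300, 0.290, 0.150]

p = perfect_game_probability(example_obps)
print(f"about 1 in {1 / p:,.0f}")
```

As a sanity check, giving every batter an OBP of one-sixth turns this back into the dice problem of 27 throws without an ace.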

For the four older games I found box scores (without OBPs) at Baseball-Almanac.com. I then went back to Baseball-Reference.com to find the OBP for each batter for the year. Finally, I could calculate the odds for each perfect game and compare them for sheer entertainment value. The table below is sorted from the most likely perfect game to the least likely, with the calculated odds for each.

| Pitcher | Date | Team | Opponent | Odds (1 in N) |
|---|---|---|---|---|
| Sandy Koufax | 9/9/1965 | Los Angeles Dodgers | Chicago Cubs | 1,575 |
| Lee Richmond | 6/12/1880 | Worcester Ruby Legs | Cleveland Blues | 2,804 |
| Monte Ward | 6/17/1880 | Providence Grays | Buffalo Bisons | 3,005 |
| Philip Humber | 4/21/2012 | Chicago White Sox | Seattle Mariners | 4,842 |
| David Wells | 5/17/1998 | New York Yankees | Minnesota Twins | 8,369 |
| Len Barker | 5/15/1981 | Cleveland Indians | Toronto Blue Jays | 10,703 |
| Cy Young | 5/5/1904 | Boston Pilgrims | Philadelphia A’s | 10,723 |
| Jim Bunning | 6/21/1964 | Philadelphia Phillies | New York Mets | 14,784 |
| Tom Browning | 9/16/1988 | Cincinnati Reds | LA Dodgers | 16,833 |
| Don Larsen | 10/8/1956 | New York Yankees | Brooklyn Dodgers | 18,799 |
| Addie Joss | 10/2/1908 | Cleveland Naps | Chicago White Sox | 19,510 |
| Mike Witt | 9/30/1984 | California Angels | Texas Rangers | 21,622 |
| Catfish Hunter | 5/8/1968 | Oakland Athletics | Minnesota Twins | 22,501 |
| Charlie Robertson | 4/30/1922 | Chicago White Sox | Detroit Tigers | 29,129 |
| Dallas Braden | 5/9/2010 | Oakland Athletics | Tampa Bay Rays | 32,751 |
| David Cone | 7/18/1999 | New York Yankees | Montreal Expos | 41,183 |
| Dennis Martinez | 7/28/1991 | Montreal Expos | LA Dodgers | 43,433 |
| Roy Halladay | 5/29/2010 | Philadelphia Phillies | Florida Marlins | 46,074 |
| Felix Hernandez | 8/15/2012 | Seattle Mariners | Tampa Bay Rays | 51,421 |
| Matt Cain | 6/13/2012 | San Francisco Giants | Houston Astros | 52,858 |
| Kenny Rogers | 7/28/1994 | Texas Rangers | California Angels | 57,651 |
| Randy Johnson | 5/18/2004 | Arizona D-backs | Atlanta Braves | 87,708 |
| Mark Buehrle | 7/23/2009 | Chicago White Sox | Tampa Bay Rays | 121,275 |

Sandy Koufax was my childhood idol. I remember listening to Vin Scully call his perfect game in 1965. So, it saddened me greatly to see that he had the most likely perfecto of all. The Cubs, as usual, were not a good team. They finished eighth that year — losing 90 games.

Talk about bad timing. On the night in question, two September call-ups, Byron Browne and Don Young, played their first major league game for the Cubs. In addition, Chicago’s pitcher, Bob Hendley, couldn’t hit water if he fell out of a boat. He went hitless in 14 at-bats that year, striking out 10 times. In essence, Koufax was really facing only six major league batters that night. Of the six, only Ron Santo, Ernie Banks and Billy Williams had OBPs above .300.

By this methodology, Buehrle’s flawless performance in 2009 was the most difficult to accomplish. You might recall that in 2008 Tampa Bay lost the World Series to the Phillies. In 2009, the Rays were set on returning to the Fall Classic. However, the Yankees and Red Sox were also playing great ball. On July 23, the Rays were 6.5 games behind in the AL East but had a record of 52-44.

Their line-up that day included Ben Zobrist, Evan Longoria, Carl Crawford and B. J. Upton. Only one batter had an OBP of less than .300; Zobrist boasted a gaudy .413. Of course, since it was in the American League the Rays pitcher didn’t bat.

I hope this table of odds for perfect games will start lots of fun conversations-discussions-arguments and you’ll add your thoughts to the comments below. After all, these sorts of statistical gymnastics are not just part of the National Pastime, but they are a fundamental truth about the behavior of our universe.

Great read!

I was shocked to see Humber as the 4th most likely…..those 2012 Mariners really couldn’t get on base.

Random thought: is there any evidence that these happen on day games after night games or “getaway” days more than random chance would suggest?

Would changes in the strike zone affect the odds? Koufax pitched his perfecto at a time when the strike zone was very large. That presumably affected the chances of no one getting on base. On the other hand, it doesn’t look like there was any disproportionate number of perfect games during the large strike zone era (which I think was about 1963-1968). There were three during that period but there were also three in 2012 alone.

I would think the strike zone would be one factor baked in to the respective OBPs.

I was wondering what happened around 1980 that created conditions for so many more perfect games. Maybe just more games in general, with MLB about doubling in size since 1961? “Not as many good hitters” I don’t think is plausible, given the vast expansion of the talent market since then. It’s interesting to me that only Witt’s game (after 1980) happened in September, presumably against a lineup loaded with 40-man roster call-ups.

Anybody here seen a perfect game in person, or come close to one? I’ve seen a couple reverse no-hitters (leadoff man or first two batters got hits and then zip). That’s all I got.

I was at Fenway in mid-2008 when John Lackey took a no-no into the 9th. That game also happened to be the final nail in the coffin for Sox fans and Manny, when he was booed for his lack of hustle. There was one play earlier in the game in which Manny could have easily got an infield hit on a throw that pulled the first baseman off the bag, but he was still out because he was running less than 50%. And I believe it was also the same day that the Angels traded for Teixeira.

I saw the Red Sox pitch one against the Blue Jays during spring training, 14 March 2000–Pedro Martinez (3.0 IP) plus five relievers

I noticed you forgot to include Armando Galarraga’s perfect game, which, by the way, happened. I ran the numbers and it came out almost exactly tied with Felix Hernandez at 51,327, so therefore Armando Galarraga was exactly as good a pitcher as Felix Hernandez.

Good stuff. Man, the Dodgers and Rays came out on the wrong side of that list three times each, with the Rays accumulating those zeroes in a very short span.

The closest I ever came to seeing a perfect game was Brandon Webb’s one-hitter against the Cardinals. September 9, 2006 at the BOB. Even at that, it was several plays away from perfect. The one hit was a Scott Rolen double, but there were also two errors, and a HBP.

As a preliminary thought, I would imagine that wOBA would be more useful than OBP, since it is a stat which could be usefully adjusted to the ballpark in which the game was pitched. Not likely to make a major difference though except in some really close comparisons (e.g. maybe Cy Young and Len Barker swap places).

A more significant thought though is that there are a lot of other *really* subtle factors that go into a perfect game which aren’t captured by OBP. Errors, for example, are not reflected at all (nor should they be!). However, an error by Hanley Ramirez is the only thing which kept Clayton Kershaw from a perfect game against the Rockies in 2014. Given the Rockies’ 2014 line-up, I would imagine that game was actually lower probability than many of the perfect games here. Other factors also apply, like pitch framing. Roy Halladay came within one pitch of a perfect game in the 2010 NLDS. What if Carlos Ruiz had done a slightly better job at framing that cutter? Ruiz was very highly regarded as a framer in 2010; how does that affect the odds? And of course, there are stupid “soft” factors, like Tabata’s elbow in Scherzer’s near-miss no-no.

Generally, I just don’t think that calculating perfect game odds solely from OBP is as enlightening as it might seem. It does certainly cause some performances to stand out (Koufax’s remarkably unremarkable perfecto as a good example), but it really under-sells how unbelievably improbable these performances are, and how little of the probability is under the pitcher’s control.

Interesting article. A couple of points worth addressing. In looking at a player’s OBP, it should be adjusted, or regressed, toward the mean (think Stein’s paradox: http://statweb.stanford.edu/~ckirby/brad/other/Article1977.pdf). Players with fewer plate appearances will have less reliable OBPs, so games earlier in the season will not be as accurate as games later, in general. Adjusting to the mean OBP of the given year would provide a more accurate OBP. Second, I think probability of reaching a base also needs to be adjusted by the given pitcher. The probability of getting on base is certainly not the same against all pitchers, as is assumed. Not sure the best approach to this, but similar to the first point, regressing a player’s OBP toward the pitcher’s OBP may be a good one.

Okay, really geeky question for @mathEmagician: Do we care about Stein’s paradox? We don’t want the estimator with lowest MSE for the vector (OBP1, OBP2, . . . OBP9). We want the best estimator for the quantity (1 – OBP1) * (1 – OBP2) * . . . . (And I have no idea whether the estimator he used here is unbiased, has lowest MSE, whatever. But it sure is the obvious choice.)

This was my fault. Although I read the article in full, I apparently didn’t pay enough attention to all the details before making my remark. I thought Mr. Kagan was using OBP coming into the game, but he clearly states he used OBP for the year. Therefore, regressing toward the league average is much less relevant. My apologies.

I still stand by my second comment, regarding adjustment for the pitcher, however.

I’m curious as to how the probabilities of perfect games are reflected in the number of games played. Are we seeing the “expected number of photons reflected?”

There are currently 2430 games per season so I’m guessing that there have been something like 200,000 games in history (I don’t have time to do the math). We’ve seen 23 perfect games, so it’s about 1 in 9000. That’s in the ballpark of what would be predicted with the odds here.

Especially considering that many of these were great pitchers, we would have to assume that the OBP against these pitchers was on average lower than each player’s overall OBP. That would slightly increase the odds of a perfect game. However, I think the overall point is a good one. Getting a perfect game is a lot like rolling 27 straight times without an ace.

The Tom Browning part links to Tom Brown.

Was he “Johan Santa”-ed, so to speak?