Outsmarting Pythagoras

by Sal Baxamusa
September 17, 2007

I need to rant.

On July 20, Joe Sheehan of Baseball Prospectus wrote:

Run differential is a key measure of team quality, and a better predictor of future performance than win-loss record. With that said, it is worth taking the time to look more deeply and see what has gone into that run differential, and what the reasons are for any gap. The AL West may seem to hold two teams of basically even quality and disparate luck, but it’s clear that the Mariners, despite sharing a +11 differential with the A’s, are the better team and will continue to be so.

Rationalizing the difference between a team’s Pythagorean record and its actual record tops my list of analytical pet peeves. I don’t mean to pick on Joe; he’s a very good analyst and a fine writer. I’m using him as an example that even the best will engage in this fool’s errand. Exactly one month later, another very good analyst, THT’s own Chris Jaffe, tilted at the windmills:

Yet the D-backs this year have done quite the job mocking sabermetric orthodoxy. Not only have they floated over their ordained record, but they’ve teased people into thinking they’ll regress. After bursting out to a 46-35 start (while getting outscored all the while), they fell apart and dropped 13 of their next 17. “A-ha! Mathematical certainty will not be mocked for long!” Well, except in this case that is. Since then the D-backs have won 21 of 26. Sure they’ve outscored their opponents in that stretch, but only 140-116. They’re still wildly exceeding expectations.

Chris explained Arizona’s deviation as the result of consistent run scoring and inconsistent run prevention. While I don’t know whether that’s true in Arizona’s case, the general principle holds: consistent offense and inconsistent pitching is one way to beat your Pythagorean projection. But Arizona’s differential is too large to be explained by this effect alone.
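To see how consistency can matter, consider a made-up four-game stretch. The scores below are purely illustrative and are not Arizona’s: a team scores five runs every night while its pitching allows two runs three times and fourteen once. A minimal sketch in Python:

```python
# Hypothetical scores, invented for illustration only.
runs_scored = [5, 5, 5, 5]     # perfectly consistent offense
runs_allowed = [2, 2, 2, 14]   # three good outings plus one blowout loss

# Count games won and lost, then compare total runs for and against.
wins = sum(rs > ra for rs, ra in zip(runs_scored, runs_allowed))
losses = sum(rs < ra for rs, ra in zip(runs_scored, runs_allowed))
run_diff = sum(runs_scored) - sum(runs_allowed)

print(f"Record: {wins}-{losses}, run differential: {run_diff:+d}")
```

The toy team goes 3-1 with a run differential of zero, which any Pythagorean-style formula reads as a .500 team. It says nothing about whether a real club can arrange its runs this way on purpose.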

Mainstream writers trying to flash some saber-cred are getting into the act as well. On the same day that Chris’s article was published, Ken Rosenthal of FOX Sports wrote:

The Diamondbacks’ run differential, quite simply, is misleading. Between July 1 and Aug. 2, the D-Backs displayed a bizarre knack for getting blown out in the final game of a series. They suffered four such defeats, getting outscored, 48-1—and distorting their differential.

Knack? It may be bizarre, but it sure isn’t a “knack.” It seems that Rosenthal is implying that being on the wrong end of blowouts but winning close games is the result of some underlying strategy or skill. Logic tells us that a team with a good strategy would win lots of games by big margins and only lose the close ones. But attempting to outsmart oneself is tradition in baseball, from justifying Shannon Stewart as an MVP to batting a terrible hitter high in the lineup because of “bat control.”

And to these eyes, it is a growing phenomenon. Over the last few years, at first on fan blogs and message boards but increasingly in analytical and mainstream circles, the following sentence has become more and more common: “Run differential is useful, but here’s why it isn’t applicable to team X.”

I’m too dumb to outsmart myself, so allow me to be stupid about it: the Pythagorean formula is the best way to estimate a team’s winning percentage short of counting wins. When a team under- or over-performs its Pythagorean record, a large part of it is…well, I won’t say it’s luck, since that word tends to incite rioting among baseball fans. Let’s just say it’s not any kind of skill that we know of yet.
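For readers who have not seen it written out, here is a minimal sketch of the formula. The function name is hypothetical, and the exponent of 1.83 is an assumption: this column never states which exponent it uses. Bill James’s original version used 2, and 1.83 is a common refinement. The sample inputs are Seattle’s totals through July 19 from the Coda table.

```python
def pythagorean_wins(runs_scored, runs_allowed, games, exponent=1.83):
    """Expected wins from runs scored and allowed.

    exponent=1.83 is an assumption; the article does not say which
    exponent it uses. Bill James's original formula used 2.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)

# Seattle through July 19, 2007: 456 scored, 445 allowed, 92 games played.
seattle = pythagorean_wins(456, 445, 92)
print(round(seattle, 1))
```

With these inputs the estimate lands at roughly 47 wins, against Seattle’s actual 53 at that point.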

There are two main reasons why teams under- or over-perform their Pythagorean records: 1) a bizarre run distribution that is distorted from its typical Weibullian shape and 2) an inordinate number of blowouts or squeakers going for or against you. The first point has been previously reported—under its pseudonym, “consistency”—anecdotally by yours truly, analytically by David Gassko, and very elegantly by Pizza Cutter. The second point may be related to a finding by David that pitcher leveraging has some correlation with Pythagorean differential.

Even that correlation accounts for something like +/- 2.7 wins for 95% of teams, whereas 95% of teams fall within +/- 7.5 wins of their Pythagorean record. That’s not to say that we should ignore something like pitcher leveraging. But it does mean that a large chunk of the difference between actual wins and Pythagorean wins is “I-won’t-call-it-luck.” And none of the cited factors of consistency or bullpen leverage have been shown—although I’m open to seeing evidence to this effect—to be the result of some kind of predictable or skill-based combination of situational hitting, managerial strategy or roster construction.
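A rough way to size that “large chunk” is to treat both 95% ranges as about two standard deviations and compare variances. This is a back-of-the-envelope sketch, not a result from the cited studies. It assumes roughly normal distributions and that the leveraging effect is independent of everything else.

```python
import math

leverage_range = 2.7   # +/- wins for 95% of teams, from the correlation above
total_range = 7.5      # +/- wins for 95% of teams, actual minus Pythagorean

# Assumption: both ranges are about two standard deviations of roughly
# normal distributions, and the leveraging effect is independent of the rest.
share_explained = (leverage_range / total_range) ** 2
residual_range = math.sqrt(total_range ** 2 - leverage_range ** 2)

print(f"Share of variance explained: {share_explained:.0%}")
print(f"Unexplained 95% range: +/- {residual_range:.1f} wins")
```

Under those assumptions, leveraging covers about 13% of the variance and leaves a 95% range of about +/- 7.0 wins unaccounted for.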

It’s time to stop trying to outsmart ourselves. While we may be able to point to certain factors that have caused a team to have a large Pythagorean differential, it’s folly to use those facts as evidence that the trend will continue.

Coda

As an example, let’s look at what happened in the AL West. Here are the standings before and after Sheehan’s column ran:

On July 19, 2007
Team     W   L   pW    pL    RS   RA
Seattle  53  39  47.0  45.0  456  445
Oakland  45  50  48.7  46.3  402  391

Since July 19, 2007
Team     W   L   pW    pL    RS   RA
Seattle  25  31  24.3  31.7  270  311
Oakland  29  27  27.0  29.0  299  308

Wouldn’t you know it? The A’s have gained four real games on the Mariners since then and have played almost three games better in the Pythagorean world. Furthermore, Seattle’s record is now within a game of its Pythagorean projection. Whatever enabled the Mariners to outdo their Pythagorean record earlier in the year seems to have worn off in mid-July, despite Sheehan’s observation that “the Mariners are the better team and will continue to be so.”
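The arithmetic behind that paragraph can be checked directly from the two tables. The sketch below uses a hypothetical helper, games_ahead, that applies the usual games-behind calculation. All inputs are copied from the tables rather than recomputed.

```python
def games_ahead(team, other):
    """Standard games-behind arithmetic: positive if `team` leads `other`."""
    (w1, l1), (w2, l2) = team, other
    return ((w1 - w2) + (l2 - l1)) / 2

# (wins, losses) since July 19, 2007, from the second table.
oakland, seattle = (29, 27), (25, 31)
# Pythagorean wins over the same stretch, also from the second table.
oakland_pw, seattle_pw = 27.0, 24.3

real_gain = games_ahead(oakland, seattle)        # games Oakland gained
pythag_gap = round(oakland_pw - seattle_pw, 1)   # Pythagorean edge for Oakland

# Seattle's actual wins minus Pythagorean wins, before and after July 19.
seattle_before = round(53 - 47.0, 1)
seattle_after = round(seattle[0] - seattle_pw, 1)

print(real_gain, pythag_gap, seattle_before, seattle_after)
```

Oakland picks up 4.0 games in the standings and 2.7 in Pythagorean wins, while Seattle’s surplus over its Pythagorean record shrinks from 6.0 wins to 0.7.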

Again, I don’t mean to single out Joe. If he is susceptible to falling into this trap, we should all be cautioned. To his credit, Sheehan pointed to the tremendous Mariner bullpen as one reason why the M’s had outperformed their Pythagorean record. But that bullpen has faltered considerably since July 19:

Sea-borne reliever    Before    After
J.J. Putz             0.79      2.89
George Sherrill       1.24      5.40
Sean Green            2.70      5.70
Brandon Morrow        3.79      5.31
Eric O'Flaherty       3.45      5.00

None of this is to say that we should have predicted with absolute certainty that Oakland and Seattle would play equal baseball over the rest of the year. But we certainly should not have expected Seattle to outplay Oakland, much less outplay their Pythagorean fate!

And what of Arizona? Since Chris and Ken published their articles on August 20—an insanely small sample in which really anything can happen—Arizona has gone 13-12 with a Pythagorean record over that time of 12.0-13.0. Beware of trying to outsmart Pythagoras—or yourself.

References & Resources
For a nice statistical take on the predictive value of the Pythagorean formula, check out Clay Davenport’s article from 2004.

Dave Studeman took a look at this topic when he looked at Ten Things About One-Run Games. Dan Fox chimed in during the White Sox championship season and there was a good discussion at the Baseball Think Factory.

Nate Silver has noted that there is an ongoing problem in baseball – and in analysis, I would argue – in putting too much emphasis on current goings-on and not enough on the big picture. One might consider analysts trying to rationalize Pythagorean differentials as part of that problem.
