The hangover effect

That’ll teach me to make predictions. Or it should.

Last month, I wrote a THT Live entry about a 19-inning game the Pittsburgh Pirates won over the St. Louis Cardinals. I observed that Pittsburgh had played a 19-inning marathon against the Braves in July of the previous season, lost it in heartbreaking fashion, and promptly plunged out of playoff contention and back into the second division. With the Pirates on the bright side of the equation this time, I stuck my neck out and forecast that they would ride the momentum to a playoff berth this year.

Hubris, may I introduce you to Nemesis? Oh, I see you know each other already.

Pittsburgh had already been in an 8-12 dip before that Long Day’s Journey Into Extra Scorecard Pages, but the slide turned into freefall. In the 36 days since their super-sized win (not including last night’s game), the Pirates have gone an ugly 8-24. A five-week stretch with four losing streaks of at least four games and no winning string beyond two has not only wiped out their October chances, but likely consigned the franchise to its 20th consecutive losing season.

How do we explain this? More importantly, how do I explain this so I don’t look like such an idiot? (Hm. With those instincts, I might have a future in politics after all. When’s the filing deadline?)

Well, the Pirates collapsed right after losing a 19-inning game, and the next year they collapsed right after winning a 19-inning game. Could the moral be, don’t play 19-inning games? More generally, might it be that engaging in extremely long baseball games, win or lose, has a deleterious effect on teams in subsequent games? How long might this adverse effect last? And does it make a difference whether you win the marathon or lose it?

In hopes of salvaging something from the blazing wreckage of my prediction, I decided to investigate this question. I chose 18 innings, the length of two regulation games, as my lower bound for a marathon game, fitting both Pirates contests in with a little to spare. As for how long the hangover might last, I checked the teams’ results for 30 days after the marathon game, broken down into several sub-groups.

I looked from 1990 to the 2011 All-Star break, cutting it off there so neither Pirates game makes the list. It would be unfair to bias the study by including the collapses that made me ask my questions in the first place. There were 23 such games in the majors in this period, giving me 46 teams to study. (The count of distinct teams is actually lower, as some teams had two marathons in one year, but never within one month, so the “hangover” periods never overlap. The 2008 Padres came closest, with two marathons 38 days apart.)

There are parts of the 30-day period that receive special attention. I threw out future games against the opponent a team played a marathon against, since both teams are working under the same handicap, and they’ll produce a combined .500 record in those games anyway. Since marathons often happen in the middle of a series, this takes plenty of first and second games after the marathon off the board. I also will look differently at games played after a substantial rest, such as the All-Star break.

Two digressions: fall-ball and half-life

I would have had another hiccup in the data dealing with games that came within 30 days of the end of the season. It turned out, though, that my data set didn’t have that problem. Of all the regular-season marathon games played from 1990 through 2011, 24 in total, none came in the month of September (or after). The monthly breakdown: four in April, five in May, four in June, three in July, eight in August, and zero in September.

We can only speculate as to a reason. Expanded rosters in September could have a suppressing effect on extremely long games. Extra pinch-hitters or pinch-runners and spare bullpen arms might increase opportunities to break a deadlock. It could also just be luck, but a lone marathon-free month would be five times as likely to be one of the other five months as to be September, so the timing is suggestive.
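For a rough sense of how much luck that would take, here is a minimal Python sketch. It rests on an assumption that is mine and not part of the study: that each of the 24 marathons was equally likely to land in any of the six months from April through September, ignoring differences in how many games each month holds. The variable names are purely illustrative.

```python
import math

MONTHS = 6   # April through September (assumed equally likely, a simplification)
GAMES = 24   # marathon games from 1990 through 2011

# Chance that one specific month, such as September, gets none of the games.
p_september_empty = ((MONTHS - 1) / MONTHS) ** GAMES

# Chance that at least one of the six months gets none, by inclusion-exclusion.
p_any_month_empty = sum(
    (-1) ** (k + 1) * math.comb(MONTHS, k) * ((MONTHS - k) / MONTHS) ** GAMES
    for k in range(1, MONTHS)
)

print(f"September empty: {p_september_empty:.4f}")
print(f"Some month empty: {p_any_month_empty:.4f}")
```

Under that equal-months assumption, a specific empty month is a long shot, while some empty month somewhere is merely uncommon. That fits "suggestive" better than "proven."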

Of course, I originally wrote this passage mere days before the Orioles and Mariners broke the September drought with an 18-inning game on Sept. 18. Just one more way Baltimore has been bucking the numbers all year.

There was another interesting matter I noticed while doing the research. Don’t worry, we’ll get to the actual point of the article. But if I spot something intriguing while combing through the numbers, I’m going to take the detour and point out that interesting thing. (Yes, this is why no one ever lets me drive them anywhere.)

Going through Baseball-Reference’s inning summaries, I saw the number of games per season that went into each extra inning, from the 10th onward. What struck my eye was the pattern by which the numbers decreased. Each following inning would have roughly half the games of its predecessor: almost never exactly half, but usually close.

I was reminded of half-life, the time it takes for half a quantity of a radioactive substance to decay into another element or isotope. Could it be that one inning is the natural half-life of an extra-inning contest?

Not quite, but nearly. Going by the 1990-2011 numbers, 47.03 percent of all extra innings (10th, 11th, all the way up to 22nd) saw the game end in that frame. It’s nearly a coin toss, but the coin is slightly biased in favor of getting to the next inning.

                      Games getting to inning indicated (1990-2011)
Inning         10th  11th  12th  13th  14th  15th  16th  17th  18th  19th  20th  21st  22nd
Games Reaching 4402  2366  1264   664   335   160    82    43    24    11     5     2     2
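
For anyone who wants to check the half-life arithmetic, here is a minimal Python sketch built only on the table above. It assumes that no game went past the 22nd inning and that every game reaching the 10th eventually produced a result. The function names are my own illustrations, not anything from Baseball-Reference.

```python
# Games reaching each extra inning, 1990-2011, copied from the table above.
games_reaching = {
    10: 4402, 11: 2366, 12: 1264, 13: 664, 14: 335, 15: 160, 16: 82,
    17: 43, 18: 24, 19: 11, 20: 5, 21: 2, 22: 2,
}

def end_rates(reaching):
    """Share of the games reaching each inning that ended in that inning."""
    innings = sorted(reaching)
    rates = {}
    for i, inning in enumerate(innings):
        # Games that survived into the next inning; assumed zero after the last one listed.
        survivors = reaching[innings[i + 1]] if i + 1 < len(innings) else 0
        rates[inning] = (reaching[inning] - survivors) / reaching[inning]
    return rates

def overall_end_rate(reaching):
    """Games that ended in extras divided by total extra innings played."""
    total_innings = sum(reaching.values())
    games_ended = reaching[min(reaching)]  # every game that reached the 10th ended somewhere
    return games_ended / total_innings

rates = end_rates(games_reaching)
overall = overall_end_rate(games_reaching)

for inning in sorted(rates):
    print(f"inning {inning}: {rates[inning]:.3f} of games ended")
print(f"overall: {overall:.4f}")
```

The per-inning rates bounce around more as the samples shrink in the later innings, which is why the pooled figure is the one worth quoting.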

This may not be a profound discovery. Indeed, with my luck, it’s something Bill James tossed off as page-filler in his Baseball Abstracts 30 years ago. But it’s a tidbit of baseball knowledge, and who knows where it might be useful some day. I’ll store it away, just in case.

Oh, and there’s no strong indication that the odds change in different run environments. It would make sense for a high-run environment to see extra-inning games end faster, but the numbers I have don’t prove it. The numbers from the bombs-away 1990s don’t notably vary from those of the recent Years of the Pitcher.

With that, the detour is done.

After the marathons

The combined seasonal winning percentage for the 46 teams in the survey was .5013. Teams with two marathons in one season (the 2008 Padres, 2006 Astros, and 2001 Red Sox) were double-counted, since there were two post-marathon stretches to which I’d be comparing their overall performance.

The results of the tracking do not count “rematch” games played against a team’s marathon opponent. Also, there was a series in 2001 between Detroit and Arizona that came soon after their marathon games against different teams. I left these games out of the equation.

I broke my tracking into six groups. One group holds games from the first to third day after the marathon; the second goes from the fourth to seventh day; the others run out to the 10th, 15th, 20th and 30th days. I did this so that, in case there was a “hangover,” I could see where it ceased to have its effect.

Perhaps I should have tracked it longer.

Group of days            1-3     4-7     8-10    11-15   16-20   21-30

Marathoners' record     42-43   61-88    61-63   94-102  83-100  189-213
Winning percentage      .4941   .4094    .4919   .4796   .4536   .4701

Cumulative record       42-43  103-131  164-194 258-296 341-396 530-609
Cumulative win pct.     .4941   .4402    .4581   .4657   .4627   .4653

Since I started writing at The Hardball Times, I’ve come up with a pretty fair number of hypotheses regarding baseball, and seen most of them pop like soap bubbles when I put them under scrutiny. (Or they burn down, fall over, then sink into the swamp. I’m flexible with my metaphors.) I’m not accustomed to one of my theories passing the acid test, but I’d say this one has done it.

There is no time grouping that does not show a losing record, and tellingly, the groups closest to .500 are the smallest samples. Overall for the month, they finish .036 behind their season winning percentages, over a sample of 1,139 games, equivalent to seven full team-seasons. This is well over two standard errors, a confidence percentage in the upper 90s. (My methods of calculation disagreed, so I’m being conservative.)
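
One simple way to check the standard-error claim is sketched below in Python. This is not necessarily either of the methods I used. It assumes every game is an independent trial with the same .5013 chance of a win, which real schedules certainly violate, and the helper names are illustrative only.

```python
import math

# Won-lost records by day group, copied from the table above.
groups = [(42, 43), (61, 88), (61, 63), (94, 102), (83, 100), (189, 213)]
season_pct = 0.5013  # combined seasonal winning percentage of the 46 teams

def cumulative(records):
    """Running won-lost totals after each group of days."""
    totals = []
    wins = losses = 0
    for w, l in records:
        wins += w
        losses += l
        totals.append((wins, losses))
    return totals

def z_score(wins, losses, baseline):
    """Standard errors between the observed and baseline winning percentages."""
    games = wins + losses
    observed = wins / games
    # Binomial standard error, assuming independent games (a simplification).
    std_error = math.sqrt(baseline * (1 - baseline) / games)
    return (observed - baseline) / std_error

total_wins, total_losses = cumulative(groups)[-1]
z = z_score(total_wins, total_losses, season_pct)
print(f"{total_wins}-{total_losses}, z = {z:.2f}")
```

On these assumptions the 30-day record sits roughly 2.4 standard errors below the season baseline, in line with "well over two."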

Before I throw any confetti, there is one potential skewing factor in the numbers: midseason breaks. A number of teams got several days off in a row during the 30-day windows, which could definitely mitigate any lingering fatigue or bullpen overworking that flowed from an extra-long game. Three instances were due to All-Star breaks, and a fourth to the week-long halt in baseball after 9/11.

I modified the survey by omitting any games that came after such breaks, and ran the numbers again.

Group of days            1-3     4-7     8-10    11-15   16-20   21-30

Marathoners' record     42-43   60-82    55-57   88-94   74-92  152-182
Winning percentage      .4941   .4225    .4911   .4835   .4458   .4551

Cumulative record       42-43  102-125  157-182 245-276 319-368 471-550
Cumulative win pct.     .4941   .4493    .4631   .4702   .4643   .4613

The marathoners’ lot is improved at times during the first 15 days, but it falls off later, especially in the last 10 days. There is still no interval where the record rises even to .500. Cumulatively, they fare a little worse with the long-rest cases combed out, as one would expect if the days off let teams recover and regain their accustomed level of play. With the marathoners now .040 behind their season results, the confidence percentage inches up despite the smaller sample size.

I’m satisfied. Playing an 18-plus inning game has a significant negative effect on a team’s future play, and that negative effect lasts for at least a month. The remaining question is whether the winner of the marathon ends up doing better than the loser.

I am re-tweaking the sample for this question. Games against fellow marathoners will count this time, as we’re observing not overall performance but the split between winners and losers. Games after extensive breaks will still be left out.

Teams that won their marathons averaged a season record of .5027; the losers posted an average of .4999. This gives us an idea of the separation in records we can expect if there is no effect on future performance based on winning or losing the extra-long game. I will go directly to the modified sample this time to show how both winners and losers fared.

Group of days            1-3     4-7     8-10    11-15   16-20   21-30

Winners' record         32-27   42-38    28-31   44-51   38-49   79-97
Winning percentage      .5424   .5250    .4746   .4632   .4368   .4489

Cumulative record       32-27   74-65   102-96  146-147 184-196 263-293
Cumulative win pct.     .5424   .5324    .5152   .4983   .4842   .4730

Losers' record          28-34   25-51    29-28   49-51   41-47   78-90
Winning percentage      .4516   .3289    .5088   .4900   .4659   .4643

Cumulative record       28-34   53-85   82-113  131-164 172-211 250-301
Cumulative win pct.     .4516   .3841    .4205   .4441   .4491   .4537

There is a stark difference for the first week after the marathon, the winners outperforming their season records while the losers just crater. In all subsequent segments, the losers do outperform the winners, but this makes up only half the early won-loss differential, in three times as many games. Overall, the winners have a .0193 winning percentage advantage over the losers, when we would expect .0028.
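
The split can be reproduced from the table with a short Python sketch. The records are copied from above; the helper names and the first-week cut after the second group are my own framing for illustration.

```python
# Won-lost records by day group (1-3, 4-7, 8-10, 11-15, 16-20, 21-30).
winners = [(32, 27), (42, 38), (28, 31), (44, 51), (38, 49), (79, 97)]
losers = [(28, 34), (25, 51), (29, 28), (49, 51), (41, 47), (78, 90)]

def pct(records):
    """Combined winning percentage over a list of won-lost records."""
    wins = sum(w for w, _ in records)
    losses = sum(l for _, l in records)
    return wins / (wins + losses)

def net(records):
    """Games over (positive) or under (negative) .500."""
    return sum(w - l for w, l in records)

def games(records):
    """Total games in a list of won-lost records."""
    return sum(w + l for w, l in records)

# First week is the first two groups; everything after is the rest.
early_diff = net(winners[:2]) - net(losers[:2])
later_diff = net(winners[2:]) - net(losers[2:])
early_games = games(winners[:2]) + games(losers[:2])
later_games = games(winners[2:]) + games(losers[2:])

overall_gap = pct(winners) - pct(losers)
expected_gap = 0.5027 - 0.4999  # gap in the two groups' season records

print(f"early differential: {early_diff:+d} in {early_games} games")
print(f"later differential: {later_diff:+d} in {later_games} games")
print(f"overall gap: {overall_gap:.4f} vs expected {expected_gap:.4f}")
```

The won-loss differential here is simply the winners' games over .500 minus the losers', which is one reasonable reading of the comparison, not the only one.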

Given the sample size and the lopsided shape of the numbers, this isn’t enough to make a definitive call, but it suffices to call it highly suggestive. In the first week after a marathon, winners seem to get a little bump, while losers suffer badly. After that, both sides tend to regress toward the mean, though it’s a mean of reduced performance for any team that plays a super-long game.

Winners and losers

What teams did the best after their curfew-busting games? That hinges on how you define your terms. The 2004 A’s went 19-9 in the 30 days after their Aug. 8 marathon against Minnesota (including a next-day win against the Twins). The 1996 Dodgers were 17-8 in non-rematch games for their stretch. However, both teams finished the year solidly over .500. The 2001 Rangers had a .451 mark that year, but went an overall 13-8 after their marathon with Boston, beating their season percentage by over 150 points.

The undisputed under-achievers are those self-same Red Sox that the 2001 Rangers went 18 against. They lost that game, then the next day’s game closing out the Texas series. Then they lost three straight in Cleveland. Then they went home and lost three straight at Fenway to the Yankees. They lost one to the visiting Indians, then finally won one, only to lose the rubber game. Then they went to the Bronx and lost three straight to the hated pinstripes. Boston lost eight straight after the Texas marathon, and 12 of 13.

The only thing that could stop this baseball disaster was a real disaster: 9/11. When baseball resumed a week later, Boston played out the string at a perfectly ordinary 10-10. For the 30 days after the Red Sox went 18 rounds with Texas, though, this 82-79 team went 4-16. The Sox had had an earlier marathon versus Detroit that season, and went an uninspiring 13-15 afterward, but their post-Texas crash stands unchallenged … as long as you don’t include recent Pittsburgh squads.

What it means now

The 2011 Pirates went 8-21 in the 30 days after their epic defeat in Atlanta; the 2012 Pirates went 7-19 in the month after their epic triumph in St. Louis. That’s 13 and 12 games under .500, respectively. Counting by that benchmark, the 2011 collapse was even worse than the 2001 Red Sox’s.

But we cannot blame it all on the marathon games. Going by the minus-.040 standard set by our modified test group, a nine-inning game on the pivotal days would have meant an expectation of only 1.16 and 1.04 added wins for the 2011 and 2012 teams. Neither team would have salvaged a playoff appearance from a one-win uptick, though it remains to be seen whether an added victory would have salvaged a winning season, or at least a .500 mark, for this year’s Bucs.
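
The added-wins figures are just games played times the penalty, as this small Python sketch shows. It assumes the minus-.040 gap applies evenly to every game in the 30-day window, which is a simplification, and the function name is illustrative.

```python
MARATHON_PENALTY = 0.040  # winning-percentage shortfall from the modified test group

def expected_added_wins(wins, losses, penalty=MARATHON_PENALTY):
    """Extra wins expected over the window had the marathon been a nine-inning game."""
    return (wins + losses) * penalty

pirates_2011 = expected_added_wins(8, 21)  # 29 games after the Atlanta marathon
pirates_2012 = expected_added_wins(7, 19)  # 26 games after the St. Louis marathon

print(f"2011: {pirates_2011:.2f} added wins")
print(f"2012: {pirates_2012:.2f} added wins")
```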

The Pittsburgh plummets get some strong narrative drive from this explanation, but ultimately it’s the teams that have to bear responsibility for how they play. Statistically, they could have over-performed expectations as easily as they under-performed them. One can look to the teams they played to see the other ways the aftermath could have played out.

Their 2011 marathon foes, the Braves, far outperformed postgame expectations, going 19-9 over the next 30 days. Ironically, it was after that span that they would suffer the collapse that took them out of the postseason mix. This year, the Cardinals had a more average reaction, sliding along at 13-14 in the month after going 19 with Pittsburgh. This treading water let several teams stay in or sneak back into postseason contention, though a recent surge has given the Redbirds a cushion.

And in the “developing stories” category, the Baltimore Orioles are currently 4-2 after winning their 18-inning war in Seattle. They are going to need to defy a little history if they intend to keep up the pace and overhaul the Yankees for the AL East title. Of course, that’s what they have been doing for the previous 154 games.

And I’m not going to make any predictions against them. I’ve learned that lesson.

References & Resources
Retrosheet and Baseball-Reference: justifying the Internet all by themselves.

A writer for The Hardball Times, Shane has been writing about baseball and science fiction since 1997. His stories have been translated into French, Russian and Japanese, and he was nominated for the 2002 Hugo Award.
11 years ago

Interesting stuff. I could no more do an analysis like this than sprout wings and fly to the moon, but it’s interesting to see it done by people who know how. (Well, most of the time it is . . .)

Shane Tourtellotte
11 years ago

Studes:  I don’t have a ready answer for why the effect should last that long, either.  I tracked it that far out just to be thorough, to see when any effect that did exist ran out.  Only it didn’t.  If forced to guess, I’d say it was a lot of little factors that add up.  I am open to other theories.

MikeS:  I may do that.  I may also examine the related question of how well teams perform after double-headers.  That’s 18-plus innings of baseball, too, although at least you’re planning on playing 18, which should cushion the blow somewhat.

11 years ago

Very cool, Shane. I especially like the half-life thing.  Don’t recall having that pointed out before.

But I’m struggling a bit with your primary finding. In particular, I’m having problems envisioning how a long game can affect a team’s record 20 days later.

Paul G.
11 years ago

Besides the obvious wear on the bullpen and the general fatigue factor, is it possible that marathon games may produce more injuries in the short-term?  I would be especially concerned about any catcher that was called upon to play the entire game or at least the large majority of it.  Relievers that have to pitch more than they usually do could also be suspects, as would be veterans playing the equivalent of a doubleheader on artificial turf.

Another factor that might impact things is if the bullpen is particularly taxed or if a starter is forced to relieve for several innings, this could result in a short-term roster change to add another pitcher.  Until the roster returns to normal this will most likely result in giving innings to a lesser pitching talent and possibly a reduction in options for the offense if a position player was bumped for the pitcher.  Worst case this may require putting a veteran bench player on waivers and either losing him to another team or not having him available until they can re-sign him if no other team takes interest.

11 years ago

Politics?  no, but writing for Will Rogers/Jay Leno looks good.

September isn’t the question to me, the question is August with 2x the average of the prior months.  Could it be the reshuffling of players due to deadline trading causes the baseball team to miss and sputter for a while?

“bucking the numbers”?  yup, a future with Leno

Half-Life is exactly it.  When you get to the end of a half-life period, you have a 50% chance of survival, so half the games drop out.  I saw this several years ago, but never connected it with radioactivity, it just seemed to be part of the sudden death nature of extra innings.

Where is your test group?  Teams at the same season day (1-162)?  The same teams over similar periods prior to the marathon?  I think you just show that marathon teams are weak—which sounds familiar.

Thanks for another interesting facet of the hardest game.

Greg Simons
11 years ago

I agree with studes, that half-life thing is very interesting.

Hank G.
11 years ago

If there is an effect, wouldn’t it show up after doubleheaders also?

Of course, there aren’t that many doubleheaders anymore to study, and maybe knowing ahead of time that you are going to play 18 innings in a single day allows preparation and dampens any (possible) effect.

Shane Tourtellotte
11 years ago

Hank G.:  See my comment above.  I’m going to look into the question sometime after October Madness has passed. (I can call it that, can’t I?  It’s not copyrighted.)  My gut tells me a planned double-header should be easier on teams than an unexpected double-header’s worth of baseball, but my gut doesn’t exactly bat 1.000 on these things.  That’s why we check them, right?

Hank G.
11 years ago

Duh! How did I miss that comment?

11 years ago

You compared the 30 days after with the whole season.  It might be interesting to compare the thirty days after with the thirty days before as that might be a better representation of the roster at the time.

Shane Tourtellotte
11 years ago

Studes:  There’s definitely something to that, though only approximately.  I ran some numbers for 2011, looking at how often one would mathematically expect run totals in an inning to be equal, compared to how often they were in extras (thus going to another inning).  It’s about 56.4% for the mathematical model, and 53% in real-life extra frames.

When I get time, I’ll look at other years.  If that difference holds up … well, I’m not wholly sure what it’d mean, but it would be interesting.

11 years ago

I was thinking about the half-life thing.  In today’s run environment, there are about nine runs scored per game, or one per inning.  So it might be that what we’re seeing is the result of a roughly 50% probability of one team scoring a run each inning.  50% of all games will be decided in the next inning.  That sort of thing.

11 years ago

Shane, easy answer: the home team (unless the visitors have scored) needs only one run, so they use one-run strategies – more bunting, etc. – instead of higher-variance strategies that increase both your chances of a big inning and your chances of not scoring.