Should Managers Save Their Challenges?

When is the right time to challenge a play? (via Rich Anderson)

In the seventh inning of an otherwise unremarkable May 9, 2015 game against the Texas Rangers, Tampa Bay Rays manager Kevin Cash received a standing ovation, a gesture of appreciation usually reserved for managers ejected after a colorful, face-to-face tirade against a vilified umpire. But Cash had done no such thing. He had just challenged a call, and for the first time in 12 attempts that year, he succeeded in having the call overturned.

A refresher on the rules: Both teams’ managers start the game with one challenge, and receive another if their first challenge is successful. Beginning in the eighth inning, a manager out of challenges can prompt the umpire crew chief to initiate a review of a call. As a result, the final few innings see, on average, 39 percent more challenges than the first seven, though the number of overturned calls increases only marginally.

Sources: Baseball Savant, Baseball-Reference

Until 2014, there was no way to overturn bad calls. Once manager challenges and instant replay were introduced, teams had a way to appeal them. The strategy behind deploying challenges hasn’t been scrutinized, except when it goes wrong. In replay’s inaugural month, Giants skipper Bruce Bochy lost a challenge on a close play at first base, and was then unable to challenge a not-so-close play at the plate later that inning. Cash made (small) waves for challenging any call that looked close, without consulting a bench coach looking at video, and lost his first 11 challenges of the 2015 season.

What’s the optimal strategy? Challenge at the first opportunity, or wait for a crucial moment?

For the start of the 2017 season, Major League Baseball fine-tuned the rules for instant replay. The headline was a new 30-second time limit for a manager to decide whether to challenge a call, and a two-minute limit for umpires to decide on the call. Not receiving as much attention, but arguably more strategically significant, was the rule change pushing umpire-initiated reviews to the eighth inning; in prior years, it had been the start of the seventh.

While subtle, this inning change should shift manager strategy a bit, increasing the value of their challenge. As described by Oliver Roeder at FiveThirtyEight, the challenge is similar to an American-style financial option that now expires at the conclusion of the seventh inning, rather than the sixth. In theory, managers should now be slightly more hesitant to use their challenges, for fear of needing them later. In practice, challengeable calls occur so rarely (though perhaps Earl Weaver disagrees) that any call with a decent chance of being overturned deserves a challenge.

Since 2014, there have been 4,090 challenges in 14,574 games, of which 2,029 have led to overturned calls, a 49.6 percent overturn rate. A look at the calls overturned during the 2014 season showed that the average gain for the challenging team was 0.65 runs, or roughly a 6.5 percent increase in win expectancy. However, that gain is magnified if it comes at a particularly impactful moment in the game. An overturned call at first base with two outs and the bases loaded in the seventh will matter a whole lot more than, for instance, the same call with the bases empty.
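As a quick check on the arithmetic, the counts quoted above can be turned into rates. This is only a sketch: the three totals come from the paragraph above, while the helper name `rate` and the per-game figures are illustrative additions, and "games" is taken exactly as counted in the article.

```python
# Totals quoted in the article (2014 onward).
CHALLENGES = 4090
OVERTURNED = 2029
GAMES = 14574


def rate(part, whole):
    """Return part / whole as a plain fraction (hypothetical helper)."""
    return part / whole


overturn_rate = rate(OVERTURNED, CHALLENGES)    # share of challenges overturned
challenges_per_game = rate(CHALLENGES, GAMES)   # how often a challenge happens
overturns_per_game = rate(OVERTURNED, GAMES)    # how often a call is reversed

print(f"overturn rate:       {overturn_rate:.1%}")
print(f"challenges per game: {challenges_per_game:.2f}")
print(f"overturns per game:  {overturns_per_game:.2f}")
```

The overturn rate comes out at 49.6 percent, matching the figure above, and the other two lines work out to roughly 0.28 challenges and 0.14 overturned calls per game, which gives a sense of how rarely a challengeable call comes along.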

The leverage index (LI) attempts to quantify the importance of such a play by looking at the change in win probability given the range of outcomes for a play relative to the average situation. For example, a scenario with a 3.0 LI will have three times the impact on a game that an average scenario has. Jesse Wolfersberger’s terrific analysis here at THT of challenge situations used LI as a proxy for importance. Wolfersberger set up the framework of how to evaluate challenge decisions, but his article was written before we had any actual data. Now that we have data, we can figure out how managers can optimize their challenging strategy. Leverage index is still a great lens to look through for this endeavor, but it doesn’t consider how likely another important situation is to arise.

A Modified Leverage Index

A more useful measure of the importance of a situation for the purpose of replay requires a new concept of leverage index. Rather than compare the variability of a situation relative to the average situation in any game, it makes more sense to compare it to the average situation from that point onward in that game. To do this, I’ve created a modified leverage index — I’ll call it LI-7 — which measures the variability of a situation relative to all those that follow it, up through the bottom of the seventh (the last play where a challenge holds any value).

LI-7 should be thought of as a near-sighted, in-game measure of importance, whereas the traditional leverage index is better equipped to compare importance over multiple games. LI-7 is defined for a particular situation as the average change in win probability for the play, divided by the average change in all subsequent plays up through the last out of the seventh. (For example, a bases-loaded, no-out situation while up two runs in the sixth will have an LI below 1, as the leading team is very likely to win either way. However, the LI-7 value will be above 2, as this situation is going to be one of the most important remaining challengeable situations.) For bullpen and pinch-hitting decisions, the analogous LI-9 is more useful (the denominator being the average change in win probability for all plays until the end of the game), but that’s a different article. For comparison’s sake, these are the LI-7 charts for a no-out, nobody-on-base situation, and a two-out, bases-loaded situation.

Source: Retrosheet

Source: Retrosheet
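To make the contrast between the two measures concrete, here is a minimal sketch of the LI-7 definition given above next to the traditional leverage index. The function names and every number in it are invented for illustration (they are not drawn from Retrosheet), and the "swing" of a play is assumed to mean its average absolute change in win probability.

```python
from statistics import mean


def leverage_index(play_swing, league_average_swing):
    """Traditional LI: this play's swing relative to the average situation."""
    return play_swing / league_average_swing


def li7(play_swing, remaining_swings_through_seventh):
    """LI-7: this play's swing relative to the average swing of the plays
    still to come, up through the last out of the seventh inning."""
    if not remaining_swings_through_seventh:
        raise ValueError("need at least one remaining play to compare against")
    return play_swing / mean(remaining_swings_through_seventh)


# Hypothetical inputs: a team already ahead, so every remaining play,
# including this one, moves win probability less than a typical play does.
play_swing = 0.030                 # assumed swing for the current play
league_average_swing = 0.035       # assumed swing for an average situation
remaining_swings = [0.010, 0.015, 0.012, 0.018]  # assumed later plays

print(f"LI:   {leverage_index(play_swing, league_average_swing):.2f}")
print(f"LI-7: {li7(play_swing, remaining_swings):.2f}")
```

With these made-up inputs the traditional LI lands below 1 while LI-7 lands above 2, the same pattern as the bases-loaded example in the sixth: the play is modest by league-wide standards, yet it towers over everything else that remains challengeable in that game.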

Given the LI-7 value of a situation, the inning, and the probability of a call being overturned, we can determine if a call is worth challenging. Since the probability is unknown, we’ll express it as the confidence of the manager in the call being reversed. The inning determines the expected number of future challenges; the loss of a challenge has a higher opportunity cost in the first inning than in the seventh. To estimate the total number of possibly challengeable calls, we’ll take the average from the seventh inning onward, where reviews occur upon request.

From 2014-16 in the late innings, 7.87 challenges occurred per 100 innings; they were overturned at a 40.4 percent rate. For a challenge to be worthwhile, it must therefore have an expected value (the probability of being overturned, multiplied by LI-7 plus the value of any future challenges) greater than that of any future successful challenge. On average, 0.0026 challenges are overturned per out (counting both teams’ turns at the plate, with the challenge falling in favor of the team challenging). Both teams thus start the game expecting to overturn 0.11 calls, a number that drops linearly to zero over the course of seven innings. This generates the following cheat chart for challenges, similar to the one constructed earlier by Wolfersberger, but which uses LI-7 instead of the leverage index.
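As a rough check on the arithmetic behind that chart, the break-even logic can be sketched in a few lines. This is one reading of the paragraph above, not the author’s actual code: it assumes a successful challenge is worth the play’s LI-7 plus the expected future overturns (since the challenge is retained), that a held challenge is worth the expected future overturns alone, and that each future overturn is valued at an average LI-7 of 1. The function names are hypothetical; the 0.0026 rate and the seven-inning horizon come from the text.

```python
OVERTURNS_PER_OUT = 0.0026      # from the article: overturned calls per out
OUTS_THROUGH_SEVENTH = 7 * 6    # both teams bat: 6 outs per inning, 7 innings


def expected_future_overturns(outs_elapsed):
    """Expected overturned calls still to come; falls linearly to zero."""
    remaining = max(OUTS_THROUGH_SEVENTH - outs_elapsed, 0)
    return OVERTURNS_PER_OUT * remaining


def break_even_confidence(li7, outs_elapsed):
    """Smallest overturn probability at which challenging now beats waiting.

    Assumed comparison: p * (li7 + future) > future, solved for p.
    """
    if li7 <= 0:
        raise ValueError("li7 must be positive")
    future = expected_future_overturns(outs_elapsed)
    return future / (li7 + future)


def should_challenge(confidence, li7, outs_elapsed):
    """True when the manager's confidence clears the break-even point."""
    return confidence > break_even_confidence(li7, outs_elapsed)


# First out of the game, with a low-importance play (LI-7 of 0.4).
print(f"{expected_future_overturns(0):.2f} expected overturns at game start")
print(f"{break_even_confidence(0.4, 0):.2f} break-even confidence")
```

Under these assumptions a team starts the game expecting about 0.11 overturned calls, and even a play with an LI-7 of only 0.4 needs a confidence of just over 0.21 to justify a challenge. The threshold only falls as the game goes on or as LI-7 rises.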

Given the three years of data available, the chart favors challenging much more than previously thought. Umpires turn out to be really good at their job, calling at least 99.6 percent of calls correctly. There is support for that in this graph, but also in the first graph presented above. In it, we can see that there is no evidence that calls made later in the game are more likely to be overturned; there is nothing to suggest any umpire fatigue or inconsistency. A call made in a late, close game is as likely to be missed as any other call, implying that, as a whole, umpires aren’t clutch or unclutch.

Going back to the initial question, if there is at least a one-in-five chance of the challenge succeeding (LI-7 rarely falls below 0.4), it will be worth doing regardless of situation, as there is little chance that a better opportunity for a challenge will present itself. In short: a manager’s optimal strategy is to challenge as much as he can. Each one nudges the odds of victory, just a little bit, in his team’s direction. In other words, Kevin Cash’s seemingly brazen strategy is right on the money.

Kolya Illarionov is a recent graduate of Belmont (Mass.) High School, and will be studying applied math at Brown University in the fall. He also writes about baseball.

Kolya, I bought into everything I read, right up until the “recent graduate of Belmont (Mass.) High School” part in your biography section. 😉

Seriously, this was fantastic analysis AND very well written. Where can I find other things you’ve written?

Bob Rittner

If the object of the replay is to get the call right, the whole concept of challenges, particularly limited ones, is stupid. It adds a dimension to the game that has nothing to do with play on the field but rather with playing the odds on the bench. In other words, the manager now isn’t making decisions based on whether he is using the right player or playing strategy, but whether he thinks the replay will prove him correct, and if he is wrong, he cannot use it again even if the situation is more dire and the error more…
