Circle the Wagons: Running the Bases Part III

“AXIOM I: A ballplayer’s purpose in playing baseball is to do those things which create wins for his team, while avoiding those things which create losses for his team.

AXIOM II: Wins result from runs scored. Losses result from runs allowed.”
The Bill James Baseball Abstract 1984

Those simple axioms, as James laid them out, have led many baseball analysts down the road of translating popular statistics into the currency of runs with which, it is often said, wins are purchased. The most traveled of these roads is the long line of run estimators that analysts continually tweak in order to measure how well an offensive player does those things that create runs, and therefore wins, for his team.

In my first two articles on baserunning, I’ve concentrated on understanding how often baserunners take extra bases in various situations in order to calculate how many bases and at what rate the best baserunners outdo the worst. I’ve also delved into understanding a bit about the role that ballparks play. In this final installment I’ll convert the calculation of bases into runs to see how well good baserunners create wins for their team by looking at the season leaders and trailers and the leaders and trailers for the five-year period of 2000-2004.

The Method

Since the late 1950s, performance analysts have been calculating average outcomes based on the combinations of outs and baserunners in an inning, an approach made popular by Pete Palmer and John Thorn in their 1984 book The Hidden Game of Baseball. These eight (representing the eight states in which you can find baserunners) by three (zero, one, and two outs) matrices are often called Run Expectancy tables. For example, below is the Run Expectancy table for the major leagues in 2004.

Runners   0 outs   1 out  2 outs
xxx        0.538   0.287   0.114
1xx        0.926   0.550   0.246
x2x        1.160   0.710   0.336
12x        1.467   0.958   0.461
xx3        1.454   0.972   0.362
1x3        1.854   1.224   0.522
x23        2.134   1.472   0.618
123        2.255   1.595   0.808

In other words, when the bases are empty with nobody out, the average team scores just over half a run (.538) in the remainder of the inning, while with the bases loaded and one out teams average 1.595 runs. These matrices have been put to a variety of uses including calculating break-even percentages in order to evaluate basic strategic decisions. For example, I wrote both a Pocket PC and desktop version of a software program I call Big League Manager that makes these calculations based on data from 1999-2002 as you change the scenario.
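As a minimal sketch of how such a table can be used, the snippet below stores the 2004 values from the table above and derives a break-even success rate for a steal of second. The function names are hypothetical, and the break-even formula is the usual expected-runs algebra, offered as an assumption rather than as a description of what Big League Manager does.

```python
# 2004 MLB Run Expectancy values, copied from the table above.
# Keys are base states ("1x3" = runners on first and third); each tuple holds
# expected runs for the rest of the inning with 0, 1, and 2 outs.
RUN_EXPECTANCY_2004 = {
    "xxx": (0.538, 0.287, 0.114),
    "1xx": (0.926, 0.550, 0.246),
    "x2x": (1.160, 0.710, 0.336),
    "12x": (1.467, 0.958, 0.461),
    "xx3": (1.454, 0.972, 0.362),
    "1x3": (1.854, 1.224, 0.522),
    "x23": (2.134, 1.472, 0.618),
    "123": (2.255, 1.595, 0.808),
}


def run_expectancy(bases, outs):
    """Look up expected runs for a base state and an out count (0, 1, or 2)."""
    if outs not in (0, 1, 2):
        raise ValueError("outs must be 0, 1, or 2")
    return RUN_EXPECTANCY_2004[bases][outs]


def break_even_rate(before, success, failure):
    """Success rate at which an attempt neither gains nor loses expected runs.

    Each argument is a (bases, outs) pair. Hypothetical helper: solves
    p * RE(success) + (1 - p) * RE(failure) = RE(before) for p.
    """
    re_before = run_expectancy(*before)
    re_success = run_expectancy(*success)
    re_failure = run_expectancy(*failure)
    return (re_before - re_failure) / (re_success - re_failure)


# Stealing second with a runner on first and nobody out.
steal_second = break_even_rate(("1xx", 0), ("x2x", 0), ("xxx", 1))
print(f"bases loaded, one out: {run_expectancy('123', 1):.3f}")
print(f"break-even for stealing second, nobody out: {steal_second:.1%}")
```

With these 2004 values the runner needs to succeed roughly 73% of the time for the attempt to pay for itself.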

As discussed in the previous articles in this series, the evaluation of baserunning that I’ve done centers around runners taking extra bases, which are transitions from one cell of the matrix above to another. So if a baserunner moves from first to third on a single with one out, he’s helped transition his team from a state where they were expected to score .550 runs to one where the expectation is 1.224 runs. But of course we can’t give full credit to the runner for the additional .674 runs. After all, the hitter does play a role, and in the vast majority of the cases the baserunner could have jogged to second without a play.

So in order to count the runner’s contribution appropriately we should look at what would have normally occurred and compare it with what actually did occur. In this case we’ll assume the runner would have gotten to second anyway, making it first and second and one out (.958), and subtract it from the actual situation (1.224). This allows us to credit the runner with his contribution (.266) to increasing the Run Expectancy. Likewise, if a runner gets thrown out, he’ll receive negative credit since he cost the team a scoring opportunity. For example, getting thrown out at third on a single with nobody out costs the team .917 runs, calculated as the new situation (.550 = a runner on first with one out) minus the situation had he not tried to advance (1.467 = runners on first and second with nobody out). By following this method we can build a derivative table of run values for the various outcomes in the three baserunning scenarios I’ve used to measure baserunning performance. It should be noted that in making these calculations I did not give any credit to a runner for advancing the “standard” number of bases, e.g. one base for a single and two for a double.
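The two worked examples above reduce to one subtraction: Run Expectancy of the actual state minus Run Expectancy of the state that would have resulted from the standard advance. Here is a small sketch using the 2004 table; `runner_credit` is a hypothetical name, not something taken from the original analysis.

```python
# 2004 Run Expectancy table: base state -> expected runs with 0, 1, 2 outs.
RUN_EXPECTANCY_2004 = {
    "xxx": (0.538, 0.287, 0.114),
    "1xx": (0.926, 0.550, 0.246),
    "x2x": (1.160, 0.710, 0.336),
    "12x": (1.467, 0.958, 0.461),
    "xx3": (1.454, 0.972, 0.362),
    "1x3": (1.854, 1.224, 0.522),
    "x23": (2.134, 1.472, 0.618),
    "123": (2.255, 1.595, 0.808),
}


def runner_credit(actual, baseline):
    """Run value credited to the runner for one opportunity.

    `actual` is the (bases, outs) state after the play; `baseline` is the
    state that the standard advance (one base on a single) would have left.
    """
    actual_bases, actual_outs = actual
    base_bases, base_outs = baseline
    return (RUN_EXPECTANCY_2004[actual_bases][actual_outs]
            - RUN_EXPECTANCY_2004[base_bases][base_outs])


# First to third on a single with one out: 1.224 - 0.958.
first_to_third = runner_credit(actual=("1x3", 1), baseline=("12x", 1))

# Runner from second thrown out at third on a single, nobody out beforehand:
# the batter stands on first with one out (0.550) instead of first and
# second with nobody out (1.467).
thrown_out = runner_credit(actual=("1xx", 1), baseline=("12x", 0))

print(f"{first_to_third:+.3f}")
print(f"{thrown_out:+.3f}")
```

Running `runner_credit` over every outcome in each baserunning scenario is one way to build the derivative table of run values described above.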

That derivative table can then be applied to each actual opportunity in which a baserunner finds himself. So if Juan Pierre moved from first to third on a single with one out we’ll credit him with .266 runs for that opportunity, and so on. I call the total across all opportunities Base Runner Runs (BRR).

But just as with measuring the total number of bases gained, BRR needs to be compared against some baseline since the number of runs has a lot to do with the number of opportunities a runner has. So I also used the derivative table to calculate the number of runs the runner would be expected to contribute given the opportunities he had. The difference between BRR and that number (Expected Runs or ExR) I call Incremental Runs (IR). And because IR, like Incremental Bases in my previous articles, is also weighted by opportunity, I calculate the ratio of BRR to ExR, called Incremental Run Percentage (IRP), which is a rate statistic akin to OPS that shows at what rate players contribute runs with their baserunning.

Finally, I take all of these numbers and park adjust them using a technique similar to the one discussed last week, although refined a bit by adjusting the BRR, IR, and IRP only for those opportunities that occurred in the player’s home park.

While that sounds confusing, and it is, just remember that BRR is the total number of runs a player contributes with his baserunning, IR is that portion of the runs he contributes that are over and above what would have been expected given his opportunities, ExR is the number of runs that he would have been expected to contribute, and IRP is a means of comparing two runners regardless of the number and type of opportunities.
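The relationships among the four numbers can be sketched in a few lines. This is an illustration only: the pairing of an actual and an expected run value per opportunity is an assumed data layout, park adjustment is left out, and IRP is computed as BRR divided by ExR because that form reproduces the IRP column in the tables that follow.

```python
def summarize(opportunities):
    """Aggregate (actual_run_value, expected_run_value) pairs for one runner.

    The pair layout is a hypothetical input format; the expected value stands
    in for whatever the derivative table says an average runner contributes
    in that situation.
    """
    brr = sum(actual for actual, _ in opportunities)      # Base Runner Runs
    exr = sum(expected for _, expected in opportunities)  # Expected Runs
    ir = brr - exr                                        # Incremental Runs
    irp = brr / exr if exr else float("nan")              # Incremental Run Pct
    return {"BRR": brr, "ExR": exr, "IR": ir, "IRP": irp}


# Sanity check against published totals: Luis Castillo's 2000 line
# (BRR 14.42, ExR 9.16), collapsed into a single pair.
castillo = summarize([(14.42, 9.16)])
print(f"IR  = {castillo['IR']:.2f}")   # table shows 5.25; inputs are rounded
print(f"IRP = {castillo['IRP']:.2f}")
```

The rounded inputs give an IR of 5.26 against the table’s 5.25, and an IRP of 1.57, which matches the table.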

The Results

First, let’s take a look at the yearly leaders and trailers in IR. I’ve included the Incremental Base Percentage (IBP) from my previous articles for reference.

Leaders                           Opp     IBP     BRR     ExR      IR     IRP
2000            Luis Castillo      57    1.23   14.42    9.16    5.25    1.57
2001            Juan Pierre        41    1.20    9.06    5.14    3.92    1.76
2002            Jacque Jones       47    1.14   11.46    7.38    4.08    1.55
2003            Raul Ibanez        55    1.12   13.52    8.91    4.61    1.52
2004            Reed Johnson       53    1.15   13.68    8.68    5.00    1.58

2000            Joe Randa          50    0.80    2.81    6.86   -4.05    0.41
2001            Adrian Beltre      25    0.70   -0.57    4.05   -4.61   -0.14
2002            Frank Thomas       40    0.77    0.42    5.88   -5.45    0.07
2003            Edgar Martinez     39    0.87    4.11    9.22   -5.11    0.45
2004            Bill Mueller       47    0.76    2.17    7.40   -5.23    0.29

From these tables you can see that the leaders generally create an additional four to five runs while the trailers cost their teams an equivalent amount. The magnitude of the impact of baserunning accords very well with the research that James Click published in the 2005 Baseball Prospectus, although our individual leaders and trailers vary somewhat.

A couple of surprises here are that Raul Ibanez and Reed Johnson float to the top in 2003 and 2004 although neither is known for exhibiting speed through stolen bases. On the flip side, Adrian Beltre and Joe Randa are a bit surprising since both are considered to have average speed.

Over the five-year period from 2000 to 2004 the leaders and trailers in IR are:

Top 10                    Opp     IBP     BRR     ExR      IR     IRP
Luis Castillo             272    1.10   51.92   38.19   13.73    1.36
Juan Pierre               259    1.11   45.50   33.02   12.47    1.38
Mike Cameron              168    1.14   39.73   29.11   10.61    1.36
David Eckstein            216    1.11   40.07   29.46   10.61    1.36
Ray Durham                203    1.11   36.65   26.72    9.93    1.37
Cristian Guzman           209    1.11   36.41   26.56    9.85    1.37
Jay Payton                168    1.10   37.48   27.80    9.68    1.35
Rafael Furcal             220    1.09   39.36   29.83    9.53    1.32
Jimmy Rollins             180    1.11   37.16   28.16    9.01    1.32
Johnny Damon              256    1.07   43.40   34.47    8.92    1.26

Bottom 10                
Rich Aurilia              151    0.92   13.96   22.86   -8.90    0.61
Frank Thomas              143    0.88   11.83   21.04   -9.21    0.56
Mike Lieberthal           139    0.87   11.55   20.92   -9.37    0.55
Ben Molina                138    0.86   13.27   22.81   -9.54    0.58
Rafael Palmeiro           207    0.88   17.04   27.17  -10.12    0.63
Bill Mueller              191    0.87   17.50   27.67  -10.17    0.63
Carlos Delgado            237    0.90   28.64   39.50  -10.86    0.73
Richie Sexson             157    0.88   13.78   24.80  -11.02    0.56
Dmitri Young              156    0.87   11.38   22.83  -11.45    0.50
Edgar Martinez            178    0.89   17.53   30.11  -12.58    0.58

So over the course of the five years the spread is +/-13 runs. These lists are a bit more in line with what you’d expect, with the possible exception of Eckstein ranking so highly. In looking at his season-by-season totals, he’s garnered more than 3.25 runs twice and almost two and a half another time in just four years to make a pretty impressive showing.

But as mentioned earlier, ranking by IR alone favors those with more opportunities, so we’ll also show the yearly leaders and trailers by IRP for those who had more than 20 opportunities.

Leaders                           Opp     IBP     BRR     ExR      IR     IRP
2000            Tom Goodwin        40    1.21    9.18    5.58    3.59    1.64
2001            Cristian Guzman    26    1.28    7.77    4.16    3.60    1.87
2002            Ray Durham         40    1.23    6.80    3.71    3.09    1.83
2003            Omar Vizquel       22    1.10    3.44    1.55    1.90    2.23
2004            Chase Utley        21    1.20    4.16    1.71    2.45    2.43

2000            Jose Canseco       28    0.80    0.39    3.05   -2.66    0.13
2001            Javy Lopez         23    0.74   -0.69    2.31   -3.00   -0.30
2002            Ben Molina         32    0.73   -0.14    4.09   -4.23   -0.03
2003            Craig Counsell     28    0.78   -0.85    2.70   -3.55   -0.32
2004            Mike Piazza        27    0.71   -0.32    4.22   -4.54   -0.08

Here two players with good baserunning reputations, Tom Goodwin and Omar Vizquel, finally make appearances while the surprise leader in 2004 is Chase Utley. Although he had just 21 opportunities, he parlayed them into 2.45 incremental runs for an IRP of 2.43 (143% more runs than would have been expected given his opportunities).
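To see why the two rankings disagree, here is a short sketch that ranks four 2004 lines copied from the yearly tables above, first by IR and then by IRP with a minimum-opportunity filter. The helper names and the tuple layout are hypothetical.

```python
# 2004 rows copied from the yearly tables: (name, opportunities, BRR, ExR).
ROWS_2004 = [
    ("Reed Johnson", 53, 13.68, 8.68),
    ("Bill Mueller", 47, 2.17, 7.40),
    ("Chase Utley", 21, 4.16, 1.71),
    ("Mike Piazza", 27, -0.32, 4.22),
]


def incremental_runs(row):
    """IR: runs contributed beyond expectation (BRR - ExR)."""
    return row[2] - row[3]


def incremental_run_pct(row):
    """IRP: BRR as a multiple of ExR, independent of opportunity count."""
    return row[2] / row[3]


def leader(rows, key, min_opp=0):
    """Best row by `key` among runners with more than `min_opp` opportunities."""
    eligible = [row for row in rows if row[1] > min_opp]
    return max(eligible, key=key)


by_ir = leader(ROWS_2004, incremental_runs)
by_irp = leader(ROWS_2004, incremental_run_pct, min_opp=20)
pct_above_expected = (incremental_run_pct(by_irp) - 1) * 100

print(f"IR leader:  {by_ir[0]} ({incremental_runs(by_ir):+.2f} runs)")
print(f"IRP leader: {by_irp[0]} ({pct_above_expected:.0f}% above expectation)")
```

Raising `min_opp` to 25 drops Utley from the pool and hands the IRP lead back to Reed Johnson, which shows how sensitive a rate leaderboard is to the cutoff.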

And now the leaders and trailers in IRP over the five-year period from 2000 to 2004 for those who had 75 or more opportunities:

Leaders                   Opp     IBP     BRR     ExR      IR     IRP
Miguel Cairo              105    1.14   18.87   12.41    6.45    1.52
Tom Goodwin                97    1.16   23.48   16.07    7.40    1.46
Jack Wilson               117    1.14   21.62   14.85    6.77    1.46
Larry Bigbie               86    1.10   15.00   10.44    4.56    1.44
Pokey Reese                86    1.12   19.24   13.87    5.37    1.39

Richie Sexson             157    0.88   13.78   24.80  -11.02    0.56
Mike Lieberthal           139    0.87   11.55   20.92   -9.37    0.55
Dmitri Young              156    0.87   11.38   22.83  -11.45    0.50
Daryle Ward                77    0.88    4.25    9.00   -4.75    0.47
Fred McGriff              111    0.86    5.94   12.89   -6.95    0.46

Once again a few surprises here, including Miguel Cairo who comes out on top, as well as the new Rockie Larry Bigbie.


Since I’m sure you’re tired of looking at tables by now I’ll mercifully end this article and series with a few random thoughts.

  • Is using the Run Expectancy tables really better than simply counting incremental bases? In two ways I think it is. First, it at least quantifies the contribution baserunning makes in terms of runs and therefore wins and losses. And second, when looking at individuals it weights baserunning decisions more accurately since getting to third with two outs is much less valuable (over six times less) than doing so with nobody out. Simply counting bases does not capture this dimension. The weakness of this approach is that it doesn’t take into account the score. Teams that find themselves ahead a lot or behind a lot may have their baserunners underestimated since they’ll often play station-to-station.
  • The four to five incremental runs contributed by the best baserunners in a season are generally considered to be the equivalent of about half a win per season using estimates based on the Pythagorean method. So the spread from best to worst is right at one win. So does baserunning make a difference? Yes, but the magnitude is small compared to that between good and bad offensive players, where the difference is in the tens of runs.
  • The spread over a five-year period is about two and a half wins (26 runs) for individual players.
  • Over an extended period the best baserunners contribute about 50% more runs due to their baserunning than would be expected and often more than 100% more in a given season.
  • Although space prohibited listing the team leaders, the spread of IR for teams was around 20 runs in an individual season, meaning the best baserunning teams pick up a couple of extra wins per season over the worst due to their baserunning. That’s an advantage that can make the difference between winning the Wild Card and staying home in October.
  • I did find that there was a very slight home team advantage in play. Home teams had a higher IR in each of the five seasons, but over the course of the five seasons the difference was around 100 runs (averaging out to under a run per team season). As a result, I didn’t make any corrections for home team advantage.
  • Is this data actionable? In other words can it be used to make decisions even though the magnitude of the skill is fairly small? Well, at the very least it has a role to play in quantifying baserunning in order to put it in perspective when discussing individual players. While nothing is ever certain in baseball statistics (which is why we like them so much) this kind of analysis adds to our knowledge since claims like “Larry Walker is worth two wins per season because of his baserunning” alone can be more easily dismissed.
  • Look for future refinements and breakdowns by team and year on my blog in the coming days.
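A few of the figures in the list above can be checked with back-of-the-envelope arithmetic. The sketch below is a rough illustration, not the calculation behind the article: it assumes a league-average team that scores and allows 780 runs over 162 games and uses the classic Pythagorean exponent of 2. The run values come from the 2004 Run Expectancy table, and the 30-team count reflects the 2000-2004 major leagues.

```python
GAMES = 162
BASELINE_RUNS = 780  # assumed runs scored and allowed by an average team


def pythagorean_wins(runs_scored, runs_allowed, games=GAMES, exponent=2.0):
    """Expected wins from the Pythagorean formula RS^x / (RS^x + RA^x)."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return games * scored / (scored + allowed)


def extra_wins(extra_runs, baseline=BASELINE_RUNS):
    """Wins gained by adding `extra_runs` of offense to an average team."""
    return (pythagorean_wins(baseline + extra_runs, baseline)
            - pythagorean_wins(baseline, baseline))


# Taking third instead of stopping at second, on a single with a runner on
# first: values from the 2004 table for nobody out and for two outs.
gain_no_outs = 1.854 - 1.467
gain_two_outs = 0.522 - 0.461
leverage_ratio = gain_no_outs / gain_two_outs

# Home-team edge: about 100 runs spread over 30 teams and 5 seasons.
home_edge_per_team_season = 100 / (30 * 5)

print(f"5 runs   ~ {extra_wins(5):.2f} wins")
print(f"20 runs  ~ {extra_wins(20):.2f} wins")
print(f"nobody out vs. two outs: {leverage_ratio:.1f}x")
print(f"home edge: {home_edge_per_team_season:.2f} runs per team season")
```

Under these assumptions five runs come out to roughly half a win and twenty runs to about two wins, in line with the estimates above.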
