Circle the Wagons: Running the Bases Part II
“No game in the world is as tidy and dramatically neat as baseball, with cause and effect, crime and punishment, motive and result, so cleanly defined.”
—Paul Gallico
In last week’s article I introduced a framework for evaluating baserunning in an attempt to determine the magnitude of the impact both good and bad baserunners have on their teams. In that article I used play-by-play data from Retrosheet for the five-year period from 2000 to 2004 to analyze three situations that I felt would likely reveal baserunning skill. These were:
In these situations I took into account both the number of outs and the fielder to which the ball was hit. In crunching the numbers I then introduced the concept of Expected Bases (the number of bases a runner was expected to gain given his opportunities), Incremental Bases (the difference between the Expected Bases and the actual number of bases gained), and Incremental Base Percentage (the ratio of total bases gained to expected bases). What I found was that there is about a 30-base span between the best and worst baserunners and that the IBP typically ranges from 1.15 to .85. A couple of minor corrections to the first article can be found on my blog.
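For readers who want to check the arithmetic behind these measures, here is a minimal Python sketch of the two definitions. The function names are illustrative only, and the inputs are the five-year totals shown for Damian Jackson in the player table later in this article (184 bases gained against 160 expected).

```python
def incremental_bases(bases_gained, expected_bases):
    """IB: actual bases gained minus Expected Bases (EB)."""
    return bases_gained - expected_bases


def incremental_base_pct(bases_gained, expected_bases):
    """IBP: ratio of actual bases gained to Expected Bases (EB)."""
    return bases_gained / expected_bases


# Five-year totals taken from the player table in this article.
ib = incremental_bases(184, 160)
ibp = incremental_base_pct(184, 160)
print(ib, round(ibp, 2))
```

An IBP above 1.00 means the runner gained more bases than expected given his opportunities; below 1.00 means he gained fewer.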
Refinements
As many readers immediately perceived, these raw numbers are not the end of the story. A number of readers e-mailed suggestions for refining the framework, which included factoring in singles stretched into doubles and doubles stretched into triples (the effect of third base coaches), trying to factor out the effects of the hit-and-run, and including the scenario where second is occupied with a runner on first when a batter singles. While these are all excellent ideas and would make the framework more accurate, they all have problems.
Unfortunately, the play-by-play data from Retrosheet provides no ability to determine when a Juan Pierre stretches a hit rather than simply coasting into second or third. Nor does it allow for determining when a hit-and-run was on, something that would certainly affect the Cardinals’ statistics given Tony La Russa’s penchant for putting on the play.
And while as a Cubs fan I’m well aware of the cost of a third-base coach, having endured “Waving” Wendell Kim in 2003 routinely sending runners to their doom, I can’t think of a good way to isolate the effect of third-base coaches. Information about who was coaching third in a given game is nonexistent, and coaches probably don’t move around enough to be able to generate comparisons of their effect on teams. In addition, an aggressive third-base coach will not only cause more runners to be thrown out but will also cause more bases to be gained, which may well cancel each other out. Finally, I did consider including the situation where second base was occupied but dropped the idea since I was concerned that the defense might make a play on the lead runner, allowing the runner on first to take third for “free.”
But while I chose not to pursue these refinements, there are a couple that are both doable and meaningful, as James Click also discovered in his baserunning analysis published in the 2005 Baseball Prospectus. To introduce the first, take a look at the team leaders in IBP for each of the five years:
Year  Team  Opp  Bases  EB   IB  IBP   OA
2000  COL   445  730    678  52  1.08  7
2001  COL   379  628    582  46  1.08  8
2002  COL   350  541    517  24  1.05  10
2003  COL   413  661    629  32  1.05  7
2004  COL   445  734    692  42  1.06  10
In addition to the Rockies, Texas and St. Louis each appear in the top five three times, with the White Sox and Twins twice each, meaning that 15 of the 25 spots are occupied by the same five teams. On the other end of the spectrum, the Red Sox, Dodgers, Brewers, Phillies, Blue Jays, Giants and Astros all appear in the bottom five more than once. Enough said. What is going on is that parks play a role in determining how often runners advance and how many bases they gain when doing so. And that difference is largely based on the field to which the ball is hit and the surface of the park. For example, in looking at the scenario where there is a runner on first base and the batter singles, the following tables show the high and low totals for the five years when the ball is fielded by each of the three outfield positions.
Single to Left
      Park            Opp  To3rd  Scores  OA  Pct
High  Fenway Park     369  72     2       5   .201
Low   Sky Dome        240  24     1       1   .104

Single to Center
      Park            Opp  To3rd  Scores  OA  Pct
High  Coors Field     357  145    4       3   .417
Low   Jacobs Field    268  54     2       1   .209

Single to Right
      Park            Opp  To3rd  Scores  OA  Pct
High  Wrigley Field   347  171    3       4   .501
Low   Yankee Stadium  364  126    4       5   .357
Here you can see that Fenway’s Green Monster has the effect of doubling the chances that a runner would move from first to third or score on a single to left, as opposed to the Sky Dome (Rogers Centre), where the AstroTurf (replaced with FieldTurf this season) and smaller dimensions ostensibly hold runners in check. The spacious center-field area at Coors Field allows runners to advance or score at twice the rate of Jacobs Field, while the right-field well 353 feet from the plate at Wrigley allows runners to advance to third or score 50% of the time. In Yankee Stadium, at 318 feet in right field, runners advance to third or score just 36% of the time.
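The Pct column in the tables above is not defined explicitly, but it appears to be the share of opportunities in which the runner reached third or scored. The short Python sketch below rests on that assumption, uses a hypothetical helper name, and reproduces the single-to-left figures.

```python
def advance_rate(opportunities, to_third, scored):
    """Share of runners on first who reached third or scored on a single.

    Assumes Pct = (To3rd + Scores) / Opp; the article does not state this.
    """
    return (to_third + scored) / opportunities


# (Opp, To3rd, Scores) from the single-to-left table.
fenway = advance_rate(369, 72, 2)
skydome = advance_rate(240, 24, 1)
print(f"{fenway:.3f} {skydome:.3f} ratio={fenway / skydome:.2f}")
```

Under that reading the Fenway rate is a little under twice the Sky Dome rate, which is where the "doubling" comes from.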
In order to see how these effects played out overall, I calculated a park factor (IBP/PF) for each park for every season. My approach to doing so was similar to how Batter and Pitcher Park Factors (BPF, PPF) are calculated. First I calculated the aggregate IBP in all games played at each park and then did the same for all road games for the team that played in that park. So for example, the Rockies and their opponents recorded an IBP of 1.03 in Coors Field in 2003 and an IBP of .99 on the road. I then divided the home IBP by the road IBP, in this case 1.03/.99 = 1.04, and took half the difference since the team only plays half of its games at home. For Coors Field in 2003 the IBP/PF was calculated at 1.02. For the five seasons the Coors Field factors were:
Year  IBP/PF
2000  1.03
2001  1.03
2002  1.02
2003  1.02
2004  1.01
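The park factor calculation described above can be sketched in a few lines of Python. This is one reading of "took half the difference": the home/road ratio's distance from 1.00 is halved. The function name is illustrative, and the inputs are the 2003 Coors Field figures quoted above.

```python
def ibp_park_factor(home_ibp, road_ibp):
    """IBP/PF: halve the home/road ratio's distance from 1.00.

    The halving reflects that a team plays only half its games at home.
    """
    ratio = home_ibp / road_ibp
    return 1 + (ratio - 1) / 2


# 2003 Coors Field: IBP of 1.03 at home, .99 on the road.
coors_2003 = ibp_park_factor(1.03, 0.99)
print(round(coors_2003, 2))
```

With these inputs the ratio is about 1.04 and the resulting factor rounds to 1.02, matching the 2003 entry in the list.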
These are pretty consistent and show that Coors generally provides a 2-3% advantage to runners in gaining incremental bases. One might wonder if the quality of the home team’s outfield defense might skew these numbers, but since the same defense plays both at home and on the road the effect, if any, should be cancelled out.
Once the park factors were calculated I took a look at their variability. The lowest IBP/PF was recorded in San Diego (Qualcomm Stadium) in 2002 at .96 and the highest was in Texas in 2003 at 1.04, so generally you can see that these park factors don’t vary as much as BPF and PPF, which can range from .90 to 1.10 or higher in the case of Coors Field. And even though some parks such as Coors Field and The Ballpark at Arlington show consistency, there is also a good deal of variability in the season-to-season factors for a particular park. Running a simple regression (and excluding teams that moved parks such as the Padres, Reds, Brewers, Pirates and Phillies) showed that the correlation coefficients for the four pairs of consecutive seasons were positive three times but reached .3 only once, which is pretty weak. Clearly single-year park factors here should be taken with a grain of salt. As a result, I averaged the park factor for each park across the seasons it was in use during the five years, with the results below:
Park                                Team  IBP/PF
Coors Field                         COL   1.02
Ballpark at Arlington (Ameriquest)  TEX   1.02
Bank One Ballpark                   ARI   1.02
Royals Stadium                      KCA   1.01
Stade Olympique                     MON   1.01
Comerica Park                       DET   1.01
Miller Park                         MIL   1.01
Comiskey Park II                    CHA   1.01
Enron Field                         HOU   1.01
Wrigley Field                       CHN   1.01
Shea Stadium                        NYN   1.01
SBC Park                            SFN   1.01
Network Associates Coliseum         OAK   1.01
Tropicana Field                     TBA   1.00
Citizen's Bank Park                 PHI   1.00
Turner Field                        ATL   1.00
Fenway Park II                      BOS   1.00
Hubert H Humphrey Metrodome         MIN   1.00
Minute Maid Park                    HOU   1.00
Skydome (Rogers Centre)             TOR   1.00
Safeco Field                        SEA   1.00
Stade Olympique, Hiram Bithorn      MON   1.00
Dodger Stadium                      LAN   1.00
PNC Park                            PIT   0.99
Edison International Field          ANA   0.99
Pro Player Stadium                  FLO   0.99
Jacobs Field                        CLE   0.99
PacBell Park                        SFN   0.99
Oriole Park at Camden Yards         BAL   0.99
Great American Ball Park            CIN   0.99
U.S. Cellular Field                 CHA   0.99
Yankee Stadium II                   NYA   0.99
Oakland Coliseum                    OAK   0.99
Petco Park                          SDN   0.99
Busch Stadium II                    SLN   0.99
Veterans Stadium                    PHI   0.99
Qualcomm Stadium                    SDN   0.99
Cinergy Field                       CIN   0.98
County Stadium                      MIL   0.97
Three Rivers Stadium                PIT   0.97
This further shrinks the variability to where there is only a 5% swing between the best and worst parks. Because of the small spread, the argument can be made that characteristics that allow runners to advance more frequently in a ballpark are cancelled by other characteristics that don’t, resulting in most parks drifting back towards the middle. If that’s the case, what we really need to do is calculate park factors for each scenario or at least each field to which the ball is hit. That makes sense, but it will have to wait until another day.
Using these park factors I then went back and recalculated the IBP for each player in each season and then re-ranked the players for the last five years. The top 15 (actually 17 with ties) are listed below where Bases+ is the number of bases adjusted for park factor and IBP+ is the adjusted IBP.
Name             Opp  Bases  EB   IB  Bases+  IBP   IBP+
Damian Jackson   110  184    160  24  184     1.15  1.15
Jack Wilson      117  203    179  24  204     1.14  1.14
Mike Cameron     168  286    252  34  287     1.14  1.14
Raul Mondesi     120  201    179  22  203     1.12  1.13
Miguel Cairo     105  173    153  20  173     1.13  1.13
Chris Singleton  112  192    171  21  192     1.12  1.12
Brian Jordan     114  196    175  21  196     1.12  1.12
Juan Pierre      259  411    367  44  409     1.12  1.11
Vernon Wells     111  186    167  19  186     1.11  1.11
David Eckstein   216  352    319  33  355     1.10  1.11
Barry Larkin     140  237    216  21  240     1.10  1.11
Alfonso Soriano  136  216    196  20  217     1.10  1.11
Jimmy Rollins    180  297    269  28  299     1.10  1.11
Torii Hunter     170  279    252  27  279     1.11  1.11
Cristian Guzman  209  349    315  34  349     1.11  1.11
Ray Durham       203  334    301  33  333     1.11  1.11
Luis Castillo    272  450    410  40  454     1.10  1.11
As you can see, the IBP+ values are very similar to the unadjusted totals. (You’ll also note that Damian Jackson, who was inadvertently excluded from the list in my first article, now takes the top spot from Jack Wilson by percentage points.) Those players who were helped most by their park were:
Name           Opp  Bases  EB   IB   Bases+  IBP   IBP+
Michael Young  148  238    223  15   232     1.07  1.04
Hank Blalock   101  136    152  -16  132     0.90  0.87
Larry Walker   173  288    263  25   282     1.10  1.07
Todd Helton    230  384    361  23   376     1.06  1.04
Jeff Cirillo   147  238    223  15   234     1.07  1.05
As you might have expected, we have two Rangers and three Rockies, all of whom gained from four to eight bases because of their parks. In the case of Larry Walker it drops him from 24th to 39th on the list of 210 players with 100 or more opportunities over the given years. You can find a complete list of the 210 players on my blog. The players who were most hurt by their parks include:
Name             Opp  Bases  EB   IB  Bases+  IBP   IBP+
Hideki Matsui    107  173    168  5   175     1.03  1.04
Ryan Klesko      187  292    282  10  296     1.04  1.05
Marquis Grissom  149  239    225  14  243     1.06  1.08
Aaron Boone      110  169    161  8   172     1.05  1.07
Brian Giles      199  321    302  19  327     1.06  1.08
Once again, it’s not surprising that two Padres are on the list, given that both Qualcomm and PETCO had IBP/PFs under 1.0. Note, too, that the magnitude of the effect is only around four bases over the five seasons.
When applied at the team level, the leaders and trailers for the past five seasons are:
Leaders
Year  Team  Opp  Bases  EB   IB   IBP   OA  IBP/PF  IBP+
2000  MIL   329  525    510  15   1.03  7   0.97    1.06
2001  SLN   329  511    490  21   1.04  9   0.98    1.07
2002  MIN   366  559    551  8    1.01  8   0.97    1.05
2003  OAK   401  641    617  24   1.04  11  0.98    1.06
2004  SLN   447  711    677  34   1.05  6   0.99    1.06

Trailers
Year  Team  Opp  Bases  EB   IB   IBP   OA  IBP/PF  IBP+
2000  CHN   346  502    531  -29  0.95  12  1.00    0.94
2001  MIL   311  436    469  -33  0.93  13  1.03    0.90
2002  ARI   340  489    504  -15  0.97  11  1.04    0.94
2003  MIL   383  555    578  -23  0.96  13  1.03    0.93
2004  BOS   484  713    756  -43  0.94  12  0.99    0.95
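The exact adjustment formula is not spelled out in the text, but dividing a team's raw IBP by its park factor comes close to the IBP+ column in the team table. The Python sketch below makes that assumption explicit. The function name is hypothetical, and because the published factors are rounded to two decimals, some rows differ from the table by about .01.

```python
def park_adjusted_ibp(bases, expected_bases, park_factor):
    """IBP+ under the ASSUMPTION that the adjustment is raw IBP / park factor."""
    return (bases / expected_bases) / park_factor


# (Bases, EB, IBP/PF) from the team table: the 2000 and 2001 Brewers.
mil_2000 = park_adjusted_ibp(525, 510, 0.97)
mil_2001 = park_adjusted_ibp(436, 469, 1.03)
print(round(mil_2000, 2), round(mil_2001, 2))
```

The two Brewers seasons show the adjustment working in both directions: a park factor below 1.00 raises the 2000 club's figure, while a factor above 1.00 lowers the 2001 club's.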
It is interesting that the 2000 Brewers led the league while the 2001 edition was in the cellar. A quick look reveals that in 2000 Ron Belliard and Marquis Grissom accounted for 20 incremental bases all by themselves in 91 opportunities, with James Mouton adding over five more in just 10 opportunities. In 2001, with Grissom gone, his replacement Devon White added just over three incremental bases, while Belliard accounted for just two and Jose Hernandez along with Jeromy Burnitz were dinged for -14, thereby driving the team to the bottom.
On Deck
The second refinement is to take my measures of Incremental Bases (IB) and Incremental Base Percentage (IBP) and convert these into runs gained or lost. That will be the subject of next week’s article.