Base Stealer Intangibles (Part 2)

Do Base Stealers Disrupt the Pitcher When on First Base?

This is the question I first asked in Part 1 of this study. I set out to look at what batters do when a top base stealer is on first and compare that to what the same batters do in all situations. Before actually looking at those numbers, though, I considered how batting performance is affected by defensive positioning when there is a runner on first. In that case, I found (not surprisingly) that a lot more ground balls get through the infield; most of them are due to the first baseman having to hold the runner on, but a significant number are also due to the middle infielders playing for the double play when there are fewer than two outs. This effect has to be accounted for when trying to see if the best base stealers have a disruptive effect on the pitcher.

I identified the top 10 base stealers during the period 2003-2005 (I call them the Stealers) and considered all plate appearances with those runners on first base. The raw numbers show an overall improvement in hitting when there is a Stealer on first base (S1B for short). Batting average went up 25 points, although the gains in OBP (7 points) and SLG (7 points) were more modest. Still, batters created more runs per game (RC27 = 6.2) when there was a Stealer on first base than they did in all situations (RC27 = 5.7). As promised last time, we now try to determine how much of that improvement is due to defensive positioning and how much due to disruption of the pitcher.

Taking Defense Into Account

In a previous article, I showed how one can account for different defensive alignments when evaluating batting performance in a specific situation. The key point is that you have to consider batting elements that depend only on the batter-pitcher matchup, not on the defense. Most readers are familiar with using only strikeouts, walks and home runs to evaluate pitchers; these are defense-independent stats. To evaluate hitters, I add the counts of the different batted-ball types, or trajectories: fly balls, ground balls, line drives and pop-ups, abbreviated with F, G, L and P, respectively.

To translate the trajectories into outcomes, we use a table of probabilities, called the Hit-Trajectory (HT) Matrix, calculated from the play-by-play data. For example, we find from the play-by-play data that, on average, a ground ball will result in an out 77% of the time. Furthermore, its chances of becoming a single, double or triple are 21%, 2% and 0.1%, respectively. Here is the full HT Matrix, determined from the 2003-2005 play-by-play data using all situations:

 Average HT Matrix:  
          Out   Single   Double   Triple       HR 
 F      0.733    0.056    0.080    0.012    0.119 
 G      0.767    0.213    0.019    0.001    0.000 
 L      0.265    0.519    0.176    0.015    0.025 
 P      0.981    0.015    0.003    0.000    0.000 

Given this matrix, we can convert the batted-ball types into outs and hits, giving an estimate of what the runner on first base (R1B) performance would have been against a normal defensive alignment.
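As a rough illustration of that conversion, here is a minimal Python sketch that multiplies a set of batted-ball counts by the average HT Matrix shown above. The batted-ball counts are made-up numbers chosen only for illustration (they are not from the study), and the function and variable names are my own.

```python
# Average HT Matrix from the table above: each row gives the probability that
# a batted ball of that type becomes an out, single, double, triple or home run.
OUTCOMES = ["Out", "Single", "Double", "Triple", "HR"]
AVERAGE_HT = {
    "F": [0.733, 0.056, 0.080, 0.012, 0.119],
    "G": [0.767, 0.213, 0.019, 0.001, 0.000],
    "L": [0.265, 0.519, 0.176, 0.015, 0.025],
    "P": [0.981, 0.015, 0.003, 0.000, 0.000],
}


def expected_outcomes(batted_balls, ht_matrix):
    """Convert counts of batted-ball types into expected outs and hits."""
    totals = dict.fromkeys(OUTCOMES, 0.0)
    for trajectory, count in batted_balls.items():
        for outcome, prob in zip(OUTCOMES, ht_matrix[trajectory]):
            totals[outcome] += count * prob
    return totals


# Hypothetical batted-ball counts (NOT data from the study).
sample = {"F": 300, "G": 450, "L": 200, "P": 50}
expected = expected_outcomes(sample, AVERAGE_HT)
for outcome in OUTCOMES:
    print(f"{outcome:>6}: {expected[outcome]:7.1f}")
```

Because each row of the matrix sums to (very nearly) one, the expected outs and hits add back up to the number of batted balls put in.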

Before we do that, however, there is one more adjustment to make. THT’s own Dave Studeman pointed out in a recent article that not all batted-ball types are created equal. For example, a single fly ball by Barry Bonds is worth about 0.4 more runs than a fly ball by Einar Diaz (this is pretty amazing, when you think about it). Anyway, we have already seen that our sample of batters is not “average,” and therefore we cannot use the HT Matrix calculated using all batters. So, I have determined a custom HT Matrix for the batters in this study (refer to Part 1 for a list of top base stealers and the batters that bat behind them), which I show here:

 Custom HT Matrix:  
          Out   Single   Double   Triple       HR 
 F      0.726    0.063    0.081    0.015    0.115 
 G      0.754    0.225    0.020    0.001    0.000 
 L      0.267    0.518    0.173    0.016    0.024 
 P      0.974    0.020    0.006    0.000    0.000 

Note that these are the hit (and out) probabilities for the batters in our sample in all situations, not in runner-on-first situations. This will enable us to translate the performance of those hitters from runner-on-first situations to generic ones. The numbers in the custom HT Matrix are only slightly different from those in the average one, but the differences are significant. Specifically, the batters in our sample have higher hit probabilities for flyball, groundball and pop-up trajectories.

OK, now we can translate the Stealer-on-first base (S1B) offensive line into a defense-independent (DI) performance: home runs, strikeouts and walks remain unchanged in the translation and the batted balls are converted into hits and outs using the HT matrix.

Defense Independent Performance

Let’s go right to the numbers:

            AB      H     2B     3B     HR     BB      K
   All:   3286    904    180     20    105    381    593
R1B-DI:   3385    968    198     23     94    303    502

The first line was shown in Part 1, and I’ve now replaced the runner-on-first line with its defense-independent translation. We can see that the number of hits has been reduced by the translation to a defense-independent context, as expected. Curiously, doubles and triples are slightly higher in the defense-independent line. Here is the rest of the offensive line:

           AVG    OBP    SLG     RC   OUTS   RC27 
   All:  0.275  0.351  0.438    504   2381   5.71
R1B-DI:  0.286  0.345  0.441    515   2416   5.76

So, there are some differences in the “All” and “DI” rows, which we might expect given that the strikeout, walk and home run numbers, which aren’t affected by the defense independent translation, were quite different in the two cases. Taking these numbers at face value, there seems to be a very slight improvement in batter performance when there is a Stealer on first base.
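For readers who want to check the rate stats, here is a minimal sketch that recomputes AVG, SLG, RC and RC27 from the counting stats in the two tables above. It assumes the RC column is the basic Runs Created formula, (H + BB) x TB / (AB + BB); the article does not say which version was used, but this one appears to reproduce the published RC values. OBP is left out because I could not confirm how it was calculated, and the OUTS values are taken directly from the table.

```python
def batting_line(ab, h, doubles, triples, hr, bb, outs):
    """Rate stats from counting stats; `outs` is taken from the table as-is."""
    total_bases = h + doubles + 2 * triples + 3 * hr
    avg = h / ab
    slg = total_bases / ab
    # Basic Runs Created: an assumption about the formula behind the RC column.
    rc = (h + bb) * total_bases / (ab + bb)
    # RC27: runs created per 27 outs, i.e. per game.
    rc27 = 27 * rc / outs
    return {"AVG": avg, "SLG": slg, "RC": rc, "RC27": rc27}


all_line = batting_line(3286, 904, 180, 20, 105, 381, outs=2381)
di_line = batting_line(3385, 968, 198, 23, 94, 303, outs=2416)

for label, line in (("All", all_line), ("R1B-DI", di_line)):
    print(f"{label:>7}: AVG {line['AVG']:.3f}  SLG {line['SLG']:.3f}  "
          f"RC {line['RC']:.0f}  RC27 {line['RC27']:.2f}")
```

Small differences in the last digit of RC27 can arise depending on whether RC is rounded before dividing by outs.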

Controlling for Pitcher

One thing that I haven’t considered yet is the quality of the pitching. There is reason to believe that the inherent quality of the pitching with a Stealer (or Runner) on first base is below average. The reasoning is that the worse pitchers will put more runners on and will tend to pitch more often with a runner on first base than an average pitcher will. I checked this hypothesis in the following way: I made a list of all pitchers (and the corresponding number of batters faced) for the Stealer-on-first base sample. I then looked at how those pitchers did in all situations and compared them to the average pitcher. I weighted the contribution of each pitcher in the Stealer-on-first base sample appropriately using the batters faced. Here are the results:

                 AVG   OBP   SLG   OPS  |  RC27
All pitchers:   .264  .335  .423  .758  |  5.16 
S1B pitchers:   .264  .330  .420  .749  |  5.04 

So, it appears that, contrary to expectations, pitchers in Stealer-on-first situations are actually a little better than average. This means that the small “disruption” value of 0.05 runs per game needs to be adjusted upward. I think a reasonable way to do this is simply to add the difference found here (0.12 runs) to the “disruption” value found earlier (0.05 runs) to get 0.17 runs/game of disruption. This is probably not mathematically rigorous, but it’s likely good enough for our purposes.
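The weighting and the adjustment can be sketched as follows. The three pitchers in the list are invented purely to show the batters-faced weighting (they are not from the study); the adjustment at the end uses the RC27 figures published in the tables above.

```python
def weighted_rc27(pitchers):
    """Batters-faced-weighted average of each pitcher's overall RC27 allowed.

    `pitchers` is a list of (batters_faced_in_sample, overall_rc27_allowed).
    """
    total_bf = sum(bf for bf, _ in pitchers)
    return sum(bf * rc27 for bf, rc27 in pitchers) / total_bf


# Hypothetical pitchers (NOT data from the study).
sample = [(40, 4.50), (25, 5.60), (35, 5.20)]
sample_quality = weighted_rc27(sample)
print(f"Weighted RC27 allowed by sample pitchers: {sample_quality:.2f}")

# Adjustment described in the text, using the published figures.
all_pitchers_rc27 = 5.16
s1b_pitchers_rc27 = 5.04
raw_disruption = 5.76 - 5.71          # DI line minus "All" line
adjusted = raw_disruption + (all_pitchers_rc27 - s1b_pitchers_rc27)
print(f"Adjusted disruption: {adjusted:.2f} runs per game")
```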

Any Runner on First Base

The above analysis shows a small effect of “disruption” of the pitcher by a Stealer. But, to get to that conclusion, I had to do some complicated stuff, such as calculating a customized HT matrix, translating batted ball types into a defense-independent context and controlling for pitching quality. Maybe there’s an easier way: how about considering the case where any runner (Runners), not just a Stealer, is on first base (with second base open)? Presumably the important features of the defensive alignment do not change much depending on who the runner on first is. (Some very slow runners are not held on, especially when the pitching team is ahead, but I don’t think this will change the results much.) Now, we cannot simply compare the batting line with Stealers on first with the batting line with Runners on first, since we’ve already seen that this would lead to the selection bias discussed above. However, it’s fair to compare the improvement in batting that occurs with Runners and Stealers on first base. Let’s look at the Stealers again (note, I am not making the Defense-Independent translation here):

 Batting Performance with Stealers on 1B
         AB      H     2B     3B     HR     BB      K   
All:   3286    904    180     20    105    381    593   
S1B:   3355   1007    187     14     94    303    502   

         AVG    OBP    SLG     RC   OUTS    RC27 
All:   0.275  0.351  0.438    504   2381    5.72
S1B:   0.300  0.358  0.448    539   2348    6.19

The following is what we obtain if we consider any runner on first base, with second base open:

 Batting Performance with Runners on 1B
         AB      H     2B     3B     HR     BB      K
All: 103133  27356   5597    547   3401   9997  19367
R1B: 104327  29439   5791    489   3344   8323  18154

        AVG    OBP    SLG     RC   OUTS    RC27 
All:  0.265  0.337  0.429  14908  75777    5.31
R1B:  0.282  0.342  0.443  15824  74888    5.71

With a Stealer on first base, the improvement is 0.47 in runs per game (6.19 minus 5.72). When any runner is on first, the improvement is nearly as large, 0.40 runs per game (5.71 minus 5.31). So, if we assume the improvement with Runners on first base is all due to defensive alignment, i.e., the Runners as a group don’t disrupt, then we may conclude that the Stealers provide an additional 0.07 runs per game by “disruption”. There is no need to make a correction for quality of pitcher in this case, since the pitcher quality for the S1B and R1B samples is identical. The value found here (0.07 runs of “disruption”) is less than but fairly close to the value we found with the Defense Independent analysis with control for pitchers (0.17 runs).
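The comparison is a simple difference of differences, which can be written out using the RC27 values from the two tables above. The function name is my own.

```python
def improvement(rc27_runner_on_first, rc27_all_situations):
    """Gain in runs per game when a runner is on first, second base open."""
    return rc27_runner_on_first - rc27_all_situations


stealer_gain = improvement(6.19, 5.72)   # Stealers on first vs. all situations
runner_gain = improvement(5.71, 5.31)    # any runner on first vs. all situations

# If the Runners' gain is all defensive alignment, the remainder is "disruption".
disruption = stealer_gain - runner_gain
print(f"Stealers: {stealer_gain:.2f}  Runners: {runner_gain:.2f}  "
      f"Disruption: {disruption:.2f}")
```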

Simpler Measures

In an attempt to make things even more straightforward, I thought it might be interesting to look at some simpler things that could indicate a pitcher is getting “rattled” by the Stealer on first: namely walks, hit batsmen and balks. If a pitcher is losing his cool out there, it stands to reason he might make more of these kinds of mistakes.

Walks have actually already been included in the above analysis, and we’ve seen that walk rate goes down by about 20% when there is a Stealer on first base. About half of this decrease is inherent to the sample of pitchers: the pitchers in the Stealer-on-first base sample walk about 10% fewer batters than the average pitcher. I believe that the rest of the reduction in walks in Stealer-on-first base situations is due primarily to a change in approach on the part of batters and pitchers, and is not caused by any real improvement in control in Stealer-on-first base situations. The batter wants to put the ball in play to advance the runner, and the pitcher wants to avoid putting another runner on via the walk. In any case, there is no evidence to suggest that a Stealer on first base causes the pitcher to increase his walk rate.

Hit batsmen might be a better indicator of pitcher disruption; it has nothing to do with the approach of the pitcher or batter, and it’s clearly a (big) mistake on the part of the pitcher (except in beanball wars, but I am neglecting those here). When you look at the data you find that slightly more batters are hit by pitches when a Stealer is on first base. In the sample studied, 42 batters were hit by pitches, while the expected number was 35. However, since such a small number of batters are hit, it’s necessary to perform a statistical test on the result to see if there is a real effect or just a fluctuation of the data. In fact, there is a fair chance (approximately 20%) that a difference this large would arise from statistical “noise” alone, with no pitcher disruption at all. Statisticians usually require a probability of less than 0.05 to claim that a real underlying effect exists.

What about balks? First, I looked at the overall balk rate for situations where there is a runner (any runner) on first base and second base open. I found that pitchers balked about 2.7 times per 1000 batters faced. When I look for balks with a Stealer on first base and second base open, I find about 5.5 balks per 1000 batters faced. This time, the increase is significant: the p-value is 0.0024, meaning that it’s very unlikely that the increase in balks is due to statistical fluctuation.
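To give a feel for these two tests, here is a minimal sketch that treats each count as a Poisson variable and computes the chance of seeing at least the observed number of events. This is my own assumption about a reasonable test, not necessarily the one used above, so it will not reproduce the quoted probabilities exactly. The HBP counts (42 observed, 35 expected) come from the text; the balk counts (20 observed, 10 expected) are my rough approximation from the quoted rates applied to about 3700 plate appearances.

```python
import math


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected), a one-sided p-value."""
    term = math.exp(-expected)   # P(X = 0)
    below = 0.0                  # accumulates P(X <= observed - 1)
    for k in range(observed):
        below += term
        term *= expected / (k + 1)   # P(X = k + 1) from P(X = k)
    return 1.0 - below


# Hit batsmen: counts quoted in the text.
p_hbp = poisson_upper_tail(42, 35)

# Balks: approximate counts (an assumption), from 5.5 and 2.7 per 1000
# batters faced applied to roughly 3700 plate appearances.
p_balk = poisson_upper_tail(20, 10)

print(f"HBP  one-sided p-value: {p_hbp:.3f}")
print(f"Balk one-sided p-value: {p_balk:.4f}")
```

Under these assumptions the hit-batsmen excess stays well above the usual 0.05 threshold while the balk excess falls well below it, which matches the qualitative conclusions in the text.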

Wrapping it All Up

The goal of this study was to confirm or refute the notion that good base stealers disrupt the opposing pitcher/defense simply by their presence on first base. I reasoned that any disruption would show up in the performance of the batters who came to the plate with a Stealer on first base. I studied the performance of 219 batters (almost 3700 plate appearances) when one of the top 10 base stealers was on first base (with second base open) and compared that to what would be expected from those particular hitters. I then used a custom hit-trajectory matrix to convert the Stealer-on-first base performance into a defense-independent context that can be compared to the generic case. Finally, a small correction for pitcher quality was found to be necessary.

The results obtained indicate a small effect of disruption, amounting to about 0.17 in RC27, for the Stealer on first base situation. An independent cross-check was made considering what happens with any runner on first and again a very small effect of disruption was found (0.07 runs per game for that check). Finally, I found that pitchers hit a few more batters than expected when a Stealer was on first, although the effect was not statistically significant. It could be real, but we can’t say for certain. Pitchers do commit more balks with a Stealer on first base.

Assuming these small effects are real, how much are these base stealer intangibles worth over a season? A typical Stealer is on first base for 180-220 plate appearances per season. That corresponds to about 128 outs (for this group), which adds up to 4.75 games. Assuming the improvement due to disruption is 0.17 runs per game, this gives a measly 0.8 runs over the whole season. Let’s throw in an extra balk and a half-HBP (I didn’t include the HBPs in the OBP calculation for simplicity) and we get an additional 0.5 runs (more or less), for a grand total of about 1.3 extra runs a year.
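The season arithmetic can be written out as a short sketch, using only the figures quoted in the paragraph above; the function name is mine.

```python
OUTS_PER_GAME = 27


def season_runs(rc27_gain, outs_with_stealer_on_first, extra_runs=0.0):
    """Runs added over a season for a given per-game (per-27-outs) gain."""
    games = outs_with_stealer_on_first / OUTS_PER_GAME
    return rc27_gain * games + extra_runs


disruption_runs = season_runs(0.17, 128)
total_runs = season_runs(0.17, 128, extra_runs=0.5)   # balk and HBP allowance

print(f"Disruption only: {disruption_runs:.1f} runs")
print(f"With balks/HBP:  {total_runs:.1f} runs")
```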

So, the next time you tune into the White Sox game and Hawk Harrelson is telling you that “Scotty” Podsednik, by virtue of his ability to disrupt the pitcher, is worth more than what his statistics show, well you now know he’s telling you the truth. Podsednik is worth a little over one more run per season.


The above was all written and ready to go when I received an e-mail from Mitchel Lichtman (also known as “mgl” in sabermetric circles). After having read Part 1 of the study, he wrote to say that he thought my sample size of 3700 plate appearances was likely too small to draw any hard conclusions. In any case, he suggested that I quantify in a statistical way the precision of my findings (whatever they turned out to be).

Well, I hadn’t worried about this too much, since 3700 plate appearances just seemed like a lot to me, but (as usual) Mitchel was right. I found that base stealers “disrupted” pitchers only to the tune of 0.17 in terms of RC27, but the uncertainty (one standard deviation) on that number is about 0.4 runs. So, what does this say about our conclusion, that “disruption” amounts to 0.17 runs per game? We can say the following:

There is an 84% chance that a Stealer on first base improves batter performance by less than 0.57 in RC27.

I know, it’s much more satisfying to be able to say, the disruption effect is X runs per game, but that’s life.

In terms of additional runs for a team over the course of a season, we can say that

It’s 84% likely that a base stealer adds fewer than 3.2 runs over the course of the season.
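The 84% figure is what you get if you assume the uncertainty on the estimate is normally distributed (my assumption; the text only quotes a one-standard-deviation uncertainty) and ask how likely it is that the true value lies below the estimate plus one standard deviation. A minimal sketch, reusing the 128 outs and 0.5 extra runs from earlier in the article:

```python
from statistics import NormalDist

# Estimate of disruption in RC27 with its one-standard-deviation uncertainty.
estimate = NormalDist(mu=0.17, sigma=0.4)

# One-sided upper bound: estimate plus one standard deviation.
upper_rc27 = estimate.mean + estimate.stdev
confidence = estimate.cdf(upper_rc27)

# Convert the bound to season runs: 128 outs with a Stealer on first,
# 27 outs per game, plus the 0.5-run allowance for balks and HBP.
upper_season_runs = upper_rc27 * 128 / 27 + 0.5

print(f"Upper bound: {upper_rc27:.2f} RC27 at {confidence:.0%} confidence")
print(f"Season upper bound: {upper_season_runs:.1f} runs")
```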

I tried re-doing the analysis using the Top 20 Stealers over the last three years (instead of the Top 10) to increase the sample size. The result I found is a little more stringent: the 3.2 runs per season number goes to about two runs per season.

So, instead of saying that Podsednik adds a little over one run per season due to his “disruptive” powers, we must content ourselves with saying that he very likely adds no more than two runs per season.

References & Resources

  • Mark Pankin, Do Base Stealers Help the Next Batters? This PowerPoint
    presentation contains a wealth of information on the subject at hand.
    There is no attempt to disentangle the effects of defensive alignment,
    but many other aspects of the subject are covered.

  • Cyril Morong, Does Base Stealing Create Havoc? This article tries to
    answer exactly the same question that I’ve posed. It’s an interesting
    take on the subject, written without the benefit of play-by-play data.

  • The folks over at Retrosheet cannot be praised
    highly enough. They collect, digitize and make available play-by-play
    data for significant portions of baseball history. Studies like these
    (and many others!) would not be possible without their efforts.

  • Thanks to Dave Studeman, who read a preliminary version of this
    article and made several useful suggestions, and also to MGL and
    other readers of Part 1 who e-mailed with comments and suggestions.
