The Most Underrated Base Stealer of the 1980s and ’90s

Vince Coleman is one of the greatest base stealers in baseball history. (via YouTube)

Rickey Henderson is the most celebrated base stealer of all time. For 25 years he terrorized opposing pitchers, catchers, and middle infielders with his pitch anticipation, aggressiveness, and raw speed. Seemingly every walk or single turned into a double. Often that double turned into a triple, and on a few occasions a triple turned into a home run.

But as the sabermetric revolution gained traction in the minds of fans and analysts, we shone the spotlight less on his raw stolen base total and more on his 80.7 percent success rate. While that rate is excellent, it ranks 11th-highest out of players with 500 or more attempted steals. It leaves us room to consider other players as the best base stealers in major league history.

The man immediately above Henderson on this list (if you round to three decimal places) also garnered attention on the diamond. Yet even though he led his league in steals six years in a row, made multiple All-Star appearances, and had a higher success rate at swiping bags, fans don’t view this guy the same way they do Henderson.

This makes sense. The man I’m referring to was less talented with the bat, and his career was shorter. But even if we focus just on stealing bases, this guy’s name never comes up.

I’m not talking about Tim Raines. “Rock” does have a higher stolen base percentage, and he did suffer in Henderson’s shadow. But Raines was vindicated when he was elected to the Hall of Fame. And as we’ll see later, his base stealing prowess may be overrated.

No, I’m talking about a third guy, one who played against both men but in their shadows. Statistical analysis provides evidence that we should celebrate Vince Coleman, not Henderson or Raines, as the most successful base stealer of his generation.

Modeling Success

How can I make this claim? I used empirical Bayesian analysis to estimate the true-talent stolen base percentage for all three men. This technique lets us compare players across eras and with different sample sizes while incorporating the principle of regression to the mean. For more details about empirical Bayesian analysis, refer to my article in The Hardball Times Annual 2018 and David Robinson’s excellent book.

Bayesian analysis begins with our prior expectations of each player’s talent. I modeled prior expectations on two factors:

  • The season
  • The number of attempted steals in that season

Why the season? Because major league-wide stolen base percentage fluctuates over time:

Why the number of attempts? Because managers and coaches control the running game, and they give more attempts to successful base stealers. If a player attempts 100 steals in a season, we should expect his true-talent stolen base percentage is higher than a player who attempts 20 steals in that same season.

The following graph shows how these factors interact:

The solid line shows the most likely true-talent stolen base percentage of a player who attempts 20 steals in a season. (The dashed lines show the 97.5 percent and 2.5 percent outcomes.) If a player in 1960 attempted 20 steals, we would expect him to have a true-talent success rate of about 64 percent. If a player in 2017 attempts 20 steals, we’d expect him to have a true-talent success rate of about 74 percent.

Why the change? Refer to the graph before this one. Today’s base stealers are more successful because today’s managers understand the value of not making an out far better than their 1960s counterparts did. Today’s managers give steal attempts to talented base stealers, whereas managers in the 1950s let darn near anyone run. Our prior expectation of a player’s true-talent stolen base percentage must reflect this change in thinking.

A comment: Before settling on this model, I tried one that used Retrosheet data to account for stolen base attempts per opportunity instead of just raw attempt totals. I thought accounting for opportunities would normalize players who reached base at different rates. It may have, but that model produced poor results, so I stuck with raw attempt totals, which produced more realistic results.

Accounting for On-field Success

As players perform on the field, we pay less attention to the prior expectations and more to what they’ve demonstrated in real life. The graph that mixes prior expectations with on-field results is a probability distribution of the player’s true-talent stolen base percentage for that season. The distribution shows the range in which that true talent most likely exists. (We call this distribution the posterior distribution because it comes after observing what the player did.)

The following graph shows the prior and posterior distributions for Henderson’s 1992 season:

Notes:

  • The prior curve peaks at 76.9 percent. This is what we would expect any player’s true-talent stolen base percentage to be, knowing only the season (1992) and the number of attempted steals (59) in that season.
  • The posterior curve peaks at 79 percent. This is what we would expect Henderson’s true-talent stolen base percentage to be, given not only the prior expectation of 76.9 percent, but also the observed SB% of 81.3 percent.
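
The shift from 76.9 percent to 79 percent can be reproduced with a simple beta-binomial update. What follows is a minimal sketch, not the exact model behind the graph. It assumes the prior is a beta distribution described by its peak and a hypothetical "strength" of 66 pseudo-attempts, a value back-solved here so the result lands near 79 percent. The 48 steals and 11 times caught are inferred from 59 attempts at roughly 81 percent. The function names are made up for the example.

```python
def beta_from_mode(mode, strength):
    """Return (alpha, beta) for a beta distribution with the given mode.

    `strength` is alpha + beta - 2, the number of pseudo-attempts the
    prior is worth. It is a hypothetical knob, not a value from the article.
    """
    alpha = mode * strength + 1.0
    beta = (1.0 - mode) * strength + 1.0
    return alpha, beta


def beta_mode(alpha, beta):
    """Peak of a beta distribution (valid when alpha > 1 and beta > 1)."""
    return (alpha - 1.0) / (alpha + beta - 2.0)


def update(alpha, beta, steals, caught):
    """Conjugate update: add successes to alpha and failures to beta."""
    return alpha + steals, beta + caught


# Prior peak from the article; the strength of 66 is an assumption.
prior = beta_from_mode(0.769, 66.0)
# 48 steals, 11 caught: inferred from 59 attempts at about 81 percent.
posterior = update(*prior, steals=48, caught=11)
posterior_peak = beta_mode(*posterior)
print(f"prior peak: {beta_mode(*prior):.3f}")
print(f"posterior peak: {posterior_peak:.3f}")
```

Under these assumptions the posterior peak sits between the prior peak and the observed rate. A stronger prior (a larger `strength`) would pull it closer to 76.9 percent, and more attempts would pull it closer to the observed rate.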

The drop between actual and estimated stolen base percentage is regression to the mean in action. Think about the variables involved in a steal attempt. You have not only the player’s speed and acceleration, but also his first step, the pitcher’s motion, the speed of the pitch to the plate, the arm strength and accuracy of the catcher, and the ability of the shortstop or second baseman to catch the ball and apply the tag. Judging a player by his on-field success rate ignores these factors.

To find career estimates of stolen base percentage, I repeated the above for each player-season and added up the totals. Here’s what I found.

Henderson vs. Coleman

The following graph shows the posterior distributions for both players, along with labels of what each area means:

The peaks of each distribution show the most-likely true-talent SB% for each man:

  • Coleman: 79.4 percent
  • Henderson: 78.8 percent

But comparing point estimates doesn’t account for the range of probabilities shown by the distributions. As the labels show, there is some chance Coleman is the better thief and some chance Henderson is. Just looking at the graph makes it difficult to tell.

We can use an A/B test to calculate the probability Coleman is the better thief. Numerical integration tells us there’s a 70 percent chance Coleman is better. Simulating one million seasons for each player, and calculating the percentage of seasons in which Coleman’s stolen base percentage is higher than Henderson’s, gives us the same result.
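
The simulation side of that comparison can be sketched as follows. This is an illustration only. It assumes each player's career posterior can be approximated by a single beta distribution, and the shape parameters below are hypothetical placeholders, not the fitted values behind the graphs in this article. The function name `prob_first_is_better` is likewise made up for the example.

```python
import random


def prob_first_is_better(first, second, draws=200_000, seed=0):
    """Estimate P(rate_first > rate_second) by Monte Carlo.

    `first` and `second` are (alpha, beta) tuples describing each
    player's posterior distribution of true-talent success rate.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        a = rng.betavariate(*first)
        b = rng.betavariate(*second)
        if a > b:
            wins += 1
    return wins / draws


# Hypothetical posteriors (placeholders, not the article's fitted values).
player_a = (800.0, 208.0)   # mean of about 0.794
player_b = (1500.0, 404.0)  # mean of about 0.788
estimate = prob_first_is_better(player_a, player_b)
print(f"P(player A is better) is roughly {estimate:.2f}")
```

The printed probability depends entirely on the placeholder parameters, so it should not be read as a reproduction of the 70 percent figure. Numerical integration over the same two densities should agree with the simulated value up to sampling noise.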

The following plot of the players’ joint densities shows this chance:

About 70 percent of the cloud is on Coleman’s side of the plot. That’s the chance he’s a better base stealer than Henderson.

How can this be true? The two men have nearly identical record-book stolen base rates. And I already told you steal attempts factor into the model. Henderson attempted 812 more steals than Coleman, so you’d think Henderson would emerge superior.

The answer lies in two graphs. First, recall the graph near the beginning of this article about the estimated stolen base percentage of a 20-attempt player in any given year. Notice that around 1985, expectations for stolen base percentage start to rise sharply.

Now look at the following graph, which shows the attempts and success rate per season for each player:

The post-1985 rise in expectations harms Henderson. Most of his high-attempt seasons occurred prior to 1985, but he was less successful in those seasons. By the time he began stealing bases more successfully, he was attempting fewer steals than Coleman. Conversely, Coleman racked up his highest attempts and most of his highest success rates after 1985.

Henderson also hung on a bit too long, accruing some low-attempt, low-success seasons at the end of his career. This longevity pushed his raw steals total to record-level heights but harmed his estimated stolen base percentage. Conversely, Coleman burned out instead of fading away. It’s possible he may have succeeded less often had he stuck around longer, but we don’t have any evidence this is true.

What about Raines?

Yeah, what about him? Surely his 84.7 percent stolen base success rate means his estimated rate trumps Coleman’s. Right?

Let’s find out:

Yikes. Raines’ curve peaks at 75.9 percent. That rate is far behind not only his actual rate, but also those of Henderson and Coleman. Why?

My model doesn’t believe in Raines because it sees his managers didn’t give him the green light a lot. Look at Raines’ 1979–1980 seasons and his 1993–2001 seasons (minus the year 2000 when he didn’t play). He was successful at swiping bags during those seasons, but the low number of attempts cancels the success out. These 10 seasons account for almost half his career. True, managers were more conservative in 2001 than they were in 1993, but not so much that Raines looks good.

Perhaps my model is penalizing Raines unfairly. I do find it counterintuitive that he was successful on the base paths during those years but didn’t get the green light a lot. His OBP was pretty high in the late stages of his career, so he should have had a decent number of steal attempts.

Maybe his managers those years were biased or conservative on the base paths to a degree that the model doesn’t account for, or maybe they failed to believe their own eyes. These arguments would make sense to me, and a future study could try to correct for it.

Regardless, the following graph illustrates my point. It shows stolen base percentage overperformance by subtracting estimated stolen base percentage from actual stolen base percentage:

A positive value indicates an actual stolen base percentage higher than what the model expects. By this method, Raines was overrated as a base stealer for much of his career. This fact helps highlight Coleman’s status as underrated, especially since the two played during the same era.

But I come not to bury Raines, but to praise Coleman. Take a bow, Vince. The evidence suggests you were a better base stealer than both of your more well-known peers. They may have had the all-around game to get inducted into the Hall of Fame. You, however, can rest easy knowing you out-stole them on the base paths.

Ryan enjoys characterizing that elusive line between luck and skill in baseball. For more, subscribe to his articles and follow him on Twitter.
21 Comments
han60man
6 years ago

Nice work. I published a piece with similar conclusions for SABR; see Aug 2009 issue of BTN here: http://sabr.org/research/statistical-analysis-research-committee-newsletters

GoNYGoNYGoGo
6 years ago

Interesting points Ryan.

Couple of issues though:

1) Coleman was certainly not underrated as a basestealer during the 1980s, especially when he led the league every year from 1985 – 1990, was a ROY winner, a 2x all star, and received the largest free agent contract to that point in time in the history of the NY Mets. The reason he isn’t remembered more is that his 13 year career was far shorter than either Henderson’s or Raines’. Remembered at all, it’s generally for throwing lit firecrackers at Dodger Stadium at fans asking for an autograph, injuring a 2 year old girl.

2) Your methodology and the last 3 graphs showing each player’s SB% overperformance don’t measure the actual value or volume of the actual stolen bases. For example, Coleman’s best season, by far, shown is 1996. Despite the fact that Raines has 7 seasons with a greater outperformance, in 1996 Coleman stole 12 out of 14 bases. The graph weighs equally this season with every other season of both his and Henderson’s and Raines’ careers.

3) Continuing with the last series of charts, Coleman had 5 seasons of negative overperformance in 13 seasons, while Raines had only 2 or 3 (tough to tell on my monitor) in 23 seasons and Henderson 6 in 25 years.

Paul G.
6 years ago

Do keep in mind that Vince was playing his home games at Busch Stadium, which had artificial turf. It was easier to steal bases on turf compared to grass. Rickey’s prime was spent in Oakland and New York, playing on grass.

WhatLeylandNoooo
6 years ago
Reply to Paul G.

But it could also be argued his legs would wear down more and/or more quickly on turf compared to grass/dirt impacting acceleration and speed over time.

Paul G.
6 years ago

Quite possibly, but that takes time and how it would impact him is difficult to predict. Coleman’s career was significantly shorter than Henderson’s and if the negative impact of the turf caused his career to go off a cliff – he was always marginal as a corner outfielder except for his speed – then it might skew this analysis. If you look at his Rbaser rating on B-R.com, Coleman was fantastic in 1986-1987 and then never remotely that good again. By the time he got to the Mets he could not stay on the field consistently, so perhaps the turf had done him in. How would a brief but fantastic peak followed by relegation to semi-regular status affect this analysis?

LHPSU
6 years ago

I don’t know how Vince Coleman can be underrated as a base stealer when he’s generally viewed as an all-leg, no-bat, no-glove player. As career 11.6 WAR players go, he’s quite well-known and it’s because people knew him as the guy who could really run.

evo34
6 years ago
Reply to LHPSU

Exactly. If you ask any fan alive in the ’80s who they think of as all-time great basestealers, Coleman would probably be named the second most, behind Rickey.

I imagine the author was not alive in 1990, but how did this article’s premise [unmasking an underrated speed demon] make it past THT’s editors?

Las Vegas Wildcards
6 years ago
Reply to evo34

Good point, we can’t undersell the value of seeing these players in action. If you were alive in the 80s, Vince Coleman was known as a terror on the basepaths, and a key part of those successful St. Louis teams. He intimidated pitchers and catchers during this era.

Jetsy Extrano
6 years ago

What exactly is this “true talent SB%” you’re estimating? Against what difficulty of steal opportunity, some kind of normalized difficulty?

During a season, some of a player’s opportunities are easy and some are hard. If they and their team are strategic about it, they’re stealing in the ones where they expect to be above break even in wins. So I see your use of attempt volume gets at this somewhat — more attempts means you probably went on some objectively harder attempts — but can you argue the handling is complete?

Seems you have a bias because Henderson had a higher OBP than Coleman. Assume two players in the same years with the same talent and the same green-light threshold of difficulty to attempt at. But one had a larger pool of attempts. Your model assumes that means he had the green light more, and has a higher prior for ability.

Paul G.
6 years ago

The other question I have is there any distinction between stealing second and stealing third (and stealing home). The break even point is not the same for each base.

Thunder Donut
6 years ago

Can confirm. Automatic stolen base in RBI Baseball for the NES.

evo34
6 years ago

There is nothing “Bayesian” or “prior” about choosing an arbitrary statistic (raw SBA) and assuming it represents a true skill level. It’s just lazy. To get an idea of the complexity of isolating true SB ability from team tendencies, read this:

https://www.fangraphs.com/blogs/how-can-we-predict-stolen-base-talent/

Or this:

https://www.fangraphs.com/fantasy/manager-influence-on-stolen-bases/

DiscoJer
6 years ago

Vince Coleman is up for the Cardinals Hall of Fame this year, along with vastly superior players, like Ray Lankford and Scott Rolen.

So at least in St. Louis, he’s probably overrated.

gc
6 years ago

Paul G already said the part about turf but if the true talent is based on how often you are given the green light, it is largely dependent on managers and the skill of the rest of the team (e.g. if Raines was followed by better hitters than Henderson or Coleman).
Another factor is how good a hitter the basestealer is, so Coleman may be given the green light more than Eric Davis because the loss of Davis’ bat would be a bigger loss than Coleman’s if they got injured. Trout may be getting fewer chances because of that.

evo34
6 years ago

This article would be an excellent (and quite subtle) April Fools Day joke. But since it’s not…

Themaven
6 years ago

The math is pretty. The reasoning behind it and the conclusions by the author bring to mind Mark Twain’s quote about statistics.