The Fastest Player in Baseball (Revisited)

Speed Scores help predict Sprint Speed with unsurprising accuracy.

In his 1987 Baseball Abstract, in an article entitled “The Fastest Player in Baseball,” Bill James introduced Speed Scores. A player’s Speed Score estimates how fast he is, on a 0-10 scale, based on his statistics; that is, based on the kinds of back-of-the-baseball-card statistics that were available in 1987. More specifically, James calculated a simple average of six factors, each of which captures an effect of speed on performance: (1) success stealing bases, (2) propensity to steal, (3) ability to leg out triples, (4) ability to score once on base, (5) ability to beat out double-play balls and (6) defensive range. The factors are defined as follows:

Just over 30 years later, MLB.com’s Tom Tango created Sprint Speed. Sprint Speed uses Statcast data, which includes runners’ positions recorded multiple times per second, to calculate each player’s average speed over his fastest one-second span in each of his “maximum effort runs.” The “maximum effort runs” for each player are the faster half of all runs during which he attempted to advance two or more bases.
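The calculation described above can be sketched in a few lines. This is only an illustration of the stated definition, not MLB's implementation: the function names, the input layout (cumulative feet covered at each tracking sample) and the rule for rounding the "faster half" are all assumptions, and the runs passed in are assumed to be pre-filtered to attempts of two or more bases.

```python
from statistics import mean

def fastest_one_second_speed(positions_ft, samples_per_second):
    """Greatest distance (feet) covered in any one-second window of one run.

    positions_ft holds the cumulative distance run at each tracking sample.
    """
    window = samples_per_second  # sample intervals that make up one second
    if len(positions_ft) <= window:
        raise ValueError("run is shorter than one second")
    return max(
        positions_ft[i + window] - positions_ft[i]
        for i in range(len(positions_ft) - window)
    )

def sprint_speed(run_speeds_ft_per_s):
    """Average of the faster half of a player's qualifying runs."""
    ranked = sorted(run_speeds_ft_per_s, reverse=True)
    keep = max(1, len(ranked) // 2)  # assumption: round down, keep at least one
    return mean(ranked[:keep])

# Made-up example: four qualifying runs, already reduced to ft/s.
example = sprint_speed([27.0, 29.0, 28.0, 26.0])  # averages 29.0 and 28.0
```

Keeping only the faster half is what lets the measure ignore runs where the player was not really trying.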

Sprint Speeds essentially measure what Speed Scores could only infer, but they don’t eliminate the value of Speed Scores. When we’re evaluating historical performances (anything before 2015) or any performances outside major league baseball, Sprint Speeds aren’t available. Sprint Speeds give us the opportunity to validate and maybe even improve on Speed Scores.

How Well Do Speed Scores Estimate Speed?

The graph above, which plots single-season Sprint Speeds against Speed Scores for player-seasons with at least 50 games played (2015-2017), shows that Speed Scores predict Sprint Speed pretty darn well. Better yet, there’s a nearly linear relationship between the two ratings.

Trying New Weights

Instead of weighting each of the six factors that make up Speed Scores evenly, we can use linear regression to find the weights for each factor that would best predict Sprint Speed. When we do that, we get the following weights:

Weights, via Linear Regression Results
Factor         Weight
SB success     0.19
SBA rate       1.68
3B rate        0.72
Run Scoring    1.11
GIDP           0.58
Range          1.73
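For readers who want to reproduce this kind of fit, here is a minimal least-squares sketch in plain Python. It is not the code behind the table: the function names are hypothetical, the data at the bottom is made up, and including an intercept is an assumption (the table lists only the factor weights).

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_weights(factor_rows, sprint_speeds):
    """Least-squares intercept and factor weights via the normal equations."""
    x = [[1.0] + list(row) for row in factor_rows]  # leading 1.0 = intercept
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, sprint_speeds)) for i in range(k)]
    intercept, *weights = solve(xtx, xty)
    return intercept, weights

# Made-up data: two factors on a 0-10 scale and speeds that follow
# 24 + 0.2 * f1 + 0.5 * f2 exactly, so the fit should recover those numbers.
rows = [(1, 2), (3, 1), (5, 7), (6, 4), (9, 8)]
speeds = [24 + 0.2 * f1 + 0.5 * f2 for f1, f2 in rows]
intercept, weights = fit_weights(rows, speeds)
```

With real player-seasons, each row would hold the six factor scores and the target would be the measured Sprint Speed in ft/s.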

Range, stolen base attempt rate and run scoring get the most weight. Perhaps surprisingly, stolen base success rate is given very little weight. A look at the correlations between each factor and Sprint Speed, as well as between every pair of factors, provides the explanation.

Sprint Speed Correlations
              sprint_speed  F1    F2    F3    F4    F5    F6
sprint_speed  1.00          0.57  0.72  0.55  0.58  0.41  0.67
F1            0.57          1.00  0.75  0.35  0.46  0.26  0.45
F2            0.72          0.75  1.00  0.44  0.56  0.35  0.57
F3            0.55          0.35  0.44  1.00  0.40  0.30  0.36
F4            0.58          0.46  0.56  0.40  1.00  0.23  0.44
F5            0.41          0.26  0.35  0.30  0.23  1.00  0.32
F6            0.67          0.45  0.57  0.36  0.44  0.32  1.00

Stolen base success rate doesn’t get little weight because it’s the factor least closely related to speed (it isn’t; that distinction belongs to F5, double plays). It gets little weight because stolen base success rate is highly correlated with the frequency of attempts, and the attempt rate is more closely tied to speed than success rate is.
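A two-predictor simplification makes the point concrete. The article's regression uses all six factors; the sketch below pretends only F1 and F2 exist and applies the textbook formula for standardized regression weights to three correlations from the table. The function name is hypothetical.

```python
def standardized_weights(r_y1, r_y2, r_12):
    """Standardized weights for two predictors, from correlations alone.

    r_y1, r_y2: correlation of each predictor with the target.
    r_12: correlation between the two predictors.
    """
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2

# From the correlation table: F1 = SB success, F2 = SB attempt rate.
beta_f1, beta_f2 = standardized_weights(r_y1=0.57, r_y2=0.72, r_12=0.75)
print(round(beta_f1, 2), round(beta_f2, 2))  # prints: 0.07 0.67
```

Once attempt rate is in the model, success rate has almost nothing left to add, even though its own correlation with Sprint Speed is a respectable 0.57.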

The best weights for one set of player-seasons might not be quite the same as the best weights for the next set, and in order to avoid being overly generous in evaluating the performance of our new weights, we have to look at their performance out of sample. To do this, we’ll slice the data (player-seasons from 2015 to 2017 with 50 or more games played) into 40 equal slices and predict the Sprint Speeds of players in each slice using regression weights derived from the other 39 slices as well as even weights.
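The slicing procedure can be sketched as follows. The helper names are hypothetical, the strided way of forming slices is an assumption (it presumes the rows are in no meaningful order), and the one-variable line fit and data at the bottom are made up purely to exercise the code.

```python
def k_fold_slices(n_rows, k):
    """Split row indices 0..n_rows-1 into k near-equal, non-overlapping slices."""
    return [list(range(i, n_rows, k)) for i in range(k)]

def out_of_sample_predictions(rows, targets, k, fit, predict):
    """Predict every row with a model fit only on the other k-1 slices."""
    predictions = [None] * len(rows)
    for held_out in k_fold_slices(len(rows), k):
        held = set(held_out)
        train = [i for i in range(len(rows)) if i not in held]
        model = fit([rows[i] for i in train], [targets[i] for i in train])
        for i in held_out:
            predictions[i] = predict(model, rows[i])
    return predictions

def fit_line(xs, ys):
    """One-variable least squares, standing in for the full regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx

def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept

# Made-up Speed Scores and speeds that are exactly linear in them.
scores = [2.0, 3.5, 5.0, 6.5, 8.0, 4.0, 7.0, 1.0]
speeds = [25.0 + 0.5 * s for s in scores]
preds = out_of_sample_predictions(scores, speeds, 4, fit_line, predict_line)
```

Because no player-season is ever predicted by a model that saw it, the resulting correlations are a fair comparison between fitted weights and even weights.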

Correlations of Predictions with Sprint Speed (out-of-sample)

  • Even Weights: 0.809
  • Regression Weights: 0.825

Our predictions using regression weights do a bit better than the original even weights, but it’s not exactly a blowout. Why is that? How can flying blind and assigning even weights do very nearly as well as weights fit to the data? In this case, it is not because we overfit the data in our regression model (ridge regression gives no improvement over ordinary least squares regression).

The truth is that when you’re using a set of positively correlated predictors, changing the weights doesn’t make much of a difference [see “Estimating Coefficients in Linear Models: It Don’t Make No Nevermind,” Wainer (1976)]. In our case, the correlation between the even weight predictions and the regression model predictions is 0.979. In other words, it really didn’t make no never mind, and if we’re going to seriously improve upon Speed Scores, we’re going to need to work on the factors themselves.
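Wainer's point can be checked directly against the correlation table. The correlation between two weighted sums of the same variables depends only on the weights and on the correlations among those variables. The sketch below makes one simplifying assumption that the article does not state: that the six factors have equal variance, so the raw regression weights can be applied to the correlation matrix as is. The names are hypothetical.

```python
from math import sqrt

# Correlations among F1..F6, copied from the table above.
R = [
    [1.00, 0.75, 0.35, 0.46, 0.26, 0.45],
    [0.75, 1.00, 0.44, 0.56, 0.35, 0.57],
    [0.35, 0.44, 1.00, 0.40, 0.30, 0.36],
    [0.46, 0.56, 0.40, 1.00, 0.23, 0.44],
    [0.26, 0.35, 0.30, 0.23, 1.00, 0.32],
    [0.45, 0.57, 0.36, 0.44, 0.32, 1.00],
]
R_SPEED = [0.57, 0.72, 0.55, 0.58, 0.41, 0.67]  # each factor vs. Sprint Speed
EVEN = [1.0] * 6
REGRESSION = [0.19, 1.68, 0.72, 1.11, 0.58, 1.73]  # from the weights table

def quad(u, v):
    """Covariance of two weighted sums of unit-variance factors: u' R v."""
    return sum(u[i] * R[i][j] * v[j] for i in range(6) for j in range(6))

def composite_correlation(u, v):
    """Correlation between the composites built with weights u and v."""
    return quad(u, v) / sqrt(quad(u, u) * quad(v, v))

def correlation_with_speed(w):
    """Correlation between a weighted composite and Sprint Speed."""
    return sum(wi * ri for wi, ri in zip(w, R_SPEED)) / sqrt(quad(w, w))

even_vs_regression = composite_correlation(EVEN, REGRESSION)
even_vs_speed = correlation_with_speed(EVEN)
```

Under that equal-variance assumption, the two composites correlate at about 0.98, in line with the 0.979 reported above, and the even-weight composite correlates with Sprint Speed at about 0.81.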

The Advantage of Batting Lefty

In his original article, James noted that “no one of these [factors] is a pure indicator of speed; for example, the frequency of grounding into double plays is affected by where you hit in the batting order, whether you bat right-handed or left-handed, how often you are called on to bunt, how hard you hit the ball and whether you play on a grass field or on artificial turf, among other things.” Here, I’ll attempt to adjust for one of those factors, batting hand.

Left-handed batters start out a step closer to first base, and this gives them a built-in advantage when it comes to hitting triples and staying out of double plays. Left-handed batters also hit the ball to right field more often, giving them a second advantage when it comes to triples.

The following graphs show that, given the same Sprint Speeds, lefty batters score higher on factor 3 (triples) and factor 5 (double plays) and have higher Speed Scores as a result.

We may also want to use the same square root transformation for triple rates that James used for stolen base attempts. The more triples a player hits, the less each additional triple tells us about his speed. Thinking about it another way, triple rates are skewed right, and if we don’t transform this variable, we’ll be forced to choose between scrunching most players together at the bottom of the 0-10 scale and chopping off much of the right tail by assigning all of the high-triple players a score of 10.
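The scrunching problem is easy to see with a toy example. The triple rates below are invented, and the scaling (largest value maps to 10) is a simplification chosen for illustration, not James' formula or the modified factor used in this article.

```python
from math import sqrt

def scale_to_ten(values, transform=lambda v: v):
    """Map values onto 0-10 so the largest transformed value scores 10."""
    transformed = [transform(v) for v in values]
    top = max(transformed)
    return [round(10 * t / top, 1) for t in transformed]

# Hypothetical triple rates: most players low, one far out in the right tail.
triple_rates = [0.001, 0.002, 0.003, 0.004, 0.016]

linear = scale_to_ten(triple_rates)        # typical players bunch near zero
rooted = scale_to_ten(triple_rates, sqrt)  # same players spread further up
```

With the square root applied first, the four ordinary players use more of the scale, and the outlier no longer dictates where everyone else lands.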

We end up with the following:

where XL is the fraction of the time that the batter hits from the left side of the plate. (If we want to avoid using play-by-play data, we can set XL equal to 0.27 for switch hitters.)

It’s worth pointing out that, in practice, we haven’t wandered very far from James’ original formulation. The correlation between the original factor 3 and our modified factor 3 is 0.95, and the correlation between the new and old factor 5s is 0.96. We have also only very slightly boosted the correlations between our factors and Sprint Speed (from 0.55 to 0.57 in the case of F3 and from 0.41 to 0.42 in the case of F5).

Infield Hits

Faster players get more infield hits, and we can use this to create a new factor. Infield hits are affected by batter handedness, and we’ll add in a handedness adjustment as we did for triples and double plays. Using infield hits, however, has a serious drawback since I only have this data back to 2002, fifteen years after Speed Scores were created. When competing against the original Speed Scores, this definitely counts as cheating.

Regression Weights for New Factors

Now let’s once again find the factor weights that give the best predictions, this time using our modified factors 3 and 5 and adding a seventh factor for infield hits.

Weights for New Factors
Factor         Weight
SB success     0.16
SBA rate       1.56
3B rate        1.01
Run Scoring    1.08
GIDP           0.58
Range          1.69
Infield Hits   0.93

Stolen base success rate is once again on the outside looking in, with range and stolen base attempt rate the most important factors. We can also look at how well different models for predicting Sprint Speed perform out of sample, once again splitting the data into 40 slices and predicting speed in each slice with a model built on the other 39 slices.

Correlations of Predictions with Sprint Speed (out-of-sample):

  • Even Weights, Original Factors: 0.809
  • Regression Weights, Original Factors: 0.825
  • Even Weights, Modified Factors: 0.831
  • Regression Weights, Modified Factors: 0.842

By modifying the factors and using regression weights, we’ve made a bit of progress. Not to let myself off the hook, but the truth, I suspect, is that there’s only so much better we can do. James’ original Speed Scores had already squeezed out almost all the juice.

The Fastest Players in The Sprint Speed Era

Now let’s look at what Sprint Speeds our system would have predicted for the fastest player-seasons in the Sprint Speed Era (2015-2017). For each player, I’m including his quantile rank in each of the seven factors and his predicted Sprint Speed based on those factors. Bradley Zimmer is the odd duck here. Our (modified) Speed Scores think he’s fast but wouldn’t accuse him of being one of the fastest players in the league due to pedestrian triple, double play and range factors. Thus far in his career, his speed has played down.
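A quantile rank of this kind can be computed as below. This is one common convention, assumed here for illustration: ranks run from 0 to 1 and tied values share the average of their positions. The article does not say which convention produced the tables, and the function name is hypothetical.

```python
def quantile_ranks(values):
    """Rank each value on a 0-1 scale (0 = unique lowest, 1 = unique highest).

    Tied values share the average of the positions they occupy.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    ranks = {}
    for v in set(values):
        first = ordered.index(v)  # 0-based position of the first tied value
        count = ordered.count(v)  # how many values are tied at v
        ranks[v] = (first + (count - 1) / 2) / (n - 1)
    return [ranks[v] for v in values]

# Made-up factor scores for five players, two of them tied at the bottom.
example_ranks = quantile_ranks([3, 1, 4, 1, 5])
```

Ties matter in practice: many slow players have identical rock-bottom scores on a factor, which is why the same small quantile values repeat down a column.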

Let’s also look at the slowest players. These players were, as a group, much slower than our Speed Scores predicted simply because Speed Scores lack the certainty needed to predict that anyone is as slow as these guys were. We get a taste of why stolen base success is given little weight by observing that Albert Pujols and Brian McCann have combined for a perfect 9-for-9 in stolen base attempts over the last two seasons. Despite his lack of speed, Dioner Navarro hit into only five double plays and managed a pair of triples in 2016.

Fastest Player Seasons of Statcast Era
First Last Age Pos Yr SS (ft/s) Pred SS F1 F2 F3 F4 F5 F6 F7
Byron Buxton 22 CF 2016 30.66 29.72 0.86 0.93 1.00 0.95 0.96 0.99 0.99
Billy Hamilton 26 CF 2016 30.22 29.75 1.00 1.00 0.76 1.00 0.89 0.98 0.96
Byron Buxton 23 CF 2017 30.12 29.77 1.00 0.94 0.98 0.88 1.00 1.00 0.99
Billy Hamilton 27 CF 2017 30.12 29.72 0.98 1.00 0.99 0.98 0.96 0.94 0.82
Bradley Zimmer 25 CF 2017 29.88 28.24 0.99 0.95 0.68 0.82 0.52 0.68 0.65
Paulo Orlando 30 RF 2015 29.66 28.64 0.34 0.83 0.99 0.95 1.00 0.46 0.81
Dee Gordon 29 2B 2017 29.65 28.91 0.96 0.99 0.86 0.99 0.90 0.66 0.44
Jarrod Dyson 32 CF 2016 29.58 29.35 0.95 0.98 0.98 0.95 0.83 0.96 0.67
Delino DeShields 25 LF 2017 29.58 29.25 0.92 0.97 0.67 0.99 0.99 0.63 0.99
Delino DeShields 23 CF 2015 29.55 29.79 0.89 0.97 0.99 1.00 0.99 0.94 0.94

Slowest Player Seasons of Statcast Era
First Last Age Pos Yr SS (ft/s) Pred SS F1 F2 F3 F4 F5 F6 F7
Albert Pujols 37 DH 2017 22.99 25.44 0.63 0.28 0.13 0.02 0.10 0.20 0.06
Brian McCann 33  C 2017 23.39 25.76 0.37 0.14 0.34 0.24 0.29 0.06 0.10
Brian McCann 32  C 2016 23.46 25.08 0.41 0.13 0.11 0.20 0.03 0.06 0.01
Albert Pujols 36 DH 2016 23.67 25.71 0.75 0.33 0.11 0.12 0.14 0.20 0.24
Dioner Navarro 32  C 2016 23.69 26.35 0.05 0.49 0.73 0.18 0.76 0.06 0.23
Brian McCann 31  C 2015 23.83 25.66 0.20 0.05 0.25 0.50 0.67 0.06 0.14
Victor Martinez 37 DH 2016 24.01 25.29 0.18 0.06 0.11 0.05 0.19 0.20 0.27
Bobby Wilson 33  C 2016 24.09 25.41 0.18 0.06 0.11 0.54 0.39 0.06 0.15
Dae-Ho Lee 34 1B 2016 24.16 25.51 0.18 0.06 0.11 0.12 0.36 0.20 0.53
Nick Swisher 35 LF 2015 24.25 25.40 0.20 0.05 0.11 0.00 0.05 0.75 0.07

The Fastest Players Ever

Let’s look at the predicted Sprint Speeds for the 20 fastest and 10 slowest players in the last 67 years (1951-2017). The game has changed over the years, and Speed Scores have changed along with it, so I’m going to make a suspect assumption. I’ll assume baseball’s tolerance for slow players may have wandered over the years but that the fast players have always been just as fast as they are now. Mathematically, I’m going to adjust Sprint Speeds so that the 75th percentile Sprint Speed is the same for every season. To put all of these player-seasons on even footing, I’ll use a Speed Score model that does not include factor seven, infield hits.
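One way to carry out that adjustment is to shift every season by a constant so that the 75th percentiles line up. The sketch below assumes an additive shift (the article does not say whether the adjustment was a shift or a rescaling) and uses invented speeds. The function names are hypothetical.

```python
from statistics import quantiles

def percentile_75(values):
    """75th percentile, using the statistics module's 'inclusive' method."""
    return quantiles(values, n=4, method="inclusive")[2]

def align_seasons(speeds_by_season, reference_season):
    """Shift each season so its 75th percentile matches the reference's."""
    target = percentile_75(speeds_by_season[reference_season])
    adjusted = {}
    for season, speeds in speeds_by_season.items():
        shift = target - percentile_75(speeds)  # same shift for every player
        adjusted[season] = [s + shift for s in speeds]
    return adjusted

# Made-up predicted speeds (ft/s) for two seasons.
predicted = {
    1980: [24.0, 25.0, 26.0, 27.0, 28.0],
    2017: [25.0, 26.0, 27.0, 28.0, 29.0],
}
aligned = align_seasons(predicted, reference_season=2017)
```

Anchoring on the 75th percentile rather than the mean encodes the stated assumption: the fast end of the league is held fixed while the slow end is free to drift from era to era.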

Unsurprisingly, the fastest players are exclusively outfielders and middle infielders and are mostly quite young. Willie Wilson, whose apparent decline in speed inspired James’ original Speed Scores, stands on top with his 1980 season and makes three other appearances on the list. Byron Buxton, who has the fastest actual Sprint Speed in the last three seasons, shows up at 16th on this list of predicted speeds. Maury Wills deserves some recognition for making the leaderboard twice at the ages of 30 and 33!

Fastest Player Seasons, 1951-2017
First Last Age Pos Yr Pred SS (ft/s) F1 F2 F3 F4 F5 F6
Willie Wilson 25 LF 1980 29.88 0.99 0.95 0.96 0.99 1.00 0.96
Carl Crawford 23 LF 2004 29.80 0.95 0.99 0.99 0.99 1.00 0.84
Billy Hamilton 27 CF 2017 29.79 0.98 1.00 1.00 0.99 0.96 0.93
Maury Wills 30 SS 1962 29.78 1.00 1.00 0.84 1.00 0.97 0.86
Willie Wilson 24 LF 1979 29.76 0.99 1.00 0.98 1.00 1.00 0.80
Juan Samuel 23 2B 1984 29.61 0.98 1.00 0.99 0.95 0.95 0.64
Jose Reyes 23 SS 2006 29.58 0.92 1.00 1.00 1.00 0.95 0.55
Jose Reyes 22 SS 2005 29.58 0.95 0.99 1.00 0.98 0.96 0.68
Willie Wilson 30 CF 1985 29.57 0.92 0.94 1.00 0.93 0.91 0.95
Maury Wills 33 SS 1965 29.57 0.94 0.99 0.66 0.96 0.96 0.99
Willie McGee 27 CF 1985 29.56 0.92 0.95 0.99 0.96 0.99 0.93
Curtis Granderson 26 CF 2007 29.55 1.00 0.82 1.00 0.99 0.99 0.98
Bert Campaneris 24 SS 1966 29.54 0.99 0.98 0.94 0.98 0.95 0.84
Rickey Henderson 26 CF 1985 29.53 1.00 0.98 0.76 1.00 0.81 0.99
Byron Buxton 23 CF 2017 29.53 1.00 0.98 0.96 0.90 1.00 0.99
Freddie Patek 27 SS 1971 29.53 0.93 1.00 0.99 0.95 0.92 0.93
Ray Lankford 24 CF 1991 29.53 0.76 0.99 1.00 0.99 0.95 0.92
Willie Wilson 28 CF 1983 29.51 0.99 0.94 0.88 0.99 0.96 0.94
Cesar Tovar 29 CF 1969 29.51 0.94 0.98 0.74 1.00 0.89 0.96
Billy Hamilton 24 CF 2014 29.50 0.80 1.00 0.93 0.93 1.00 0.92

The slowest player-seasons belong exclusively to players at positions that put little or no emphasis on speed: catcher, first base and designated hitter. Fred Kendall impresses by making this list at the relatively tender age of 27 and by having a son who was surprisingly speedy.

Slowest Player Seasons, 1951-2017
First Last Age Pos Yr Pred SS (ft/s) F1 F2 F3 F4 F5 F6
Lance Parrish 31  C 1987 24.65 0.06 0.05 0.04 0.00 0.00 0.03
David Ortiz 39 DH 2014 24.80 0.11 0.03 0.08 0.00 0.01 0.15
Joe Oliver 28  C 1993 24.83 0.16 0.01 0.05 0.00 0.28 0.04
Fred Kendall 27  C 1976 24.91 0.23 0.08 0.02 0.00 0.04 0.02
Mo Vaughn 31 1B 1999 24.92 0.16 0.02 0.08 0.00 0.25 0.12
Willie McCovey 39 1B 1977 24.92 0.65 0.06 0.01 0.00 0.01 0.16
A.J. Pierzynski 34  C 2011 24.93 0.08 0.01 0.20 0.03 0.04 0.04
Dave Valle 33  C 1993 24.93 0.34 0.04 0.05 0.02 0.06 0.04
Bengie Molina 34  C 2008 24.94 0.14 0.02 0.04 0.00 0.12 0.03
A.J. Ellis 31  C 2012 25.02 0.11 0.02 0.36 0.00 0.06 0.03

I was also interested in looking at the slowest players who were forced to cover more ground in the field. The following table shows the slowest player-seasons by players who weren’t stationed behind the plate, at first base or in the dugout.

Exposed Snails, 1951-2017
First Last Age Pos Yr Pred SS (ft/s) F1 F2 F3 F4 F5 F6
Ray Knight 34 3B 1987 25.41 0.17 0.02 0.04 0.01 0.32 0.39
Chris Johnson 29 3B 2013 25.43 0.12 0.04 0.10 0.12 0.08 0.30
Reggie Jackson 39 RF 1985 25.46 0.07 0.15 0.04 0.07 0.01 0.32
Todd Zeile 37 3B 2002 25.50 0.17 0.09 0.07 0.08 0.02 0.27
Ron Cey 35 3B 1983 25.59 0.19 0.01 0.07 0.11 0.15 0.26
Ken Reitz 29 3B 1980 25.62 0.06 0.05 0.04 0.02 0.33 0.25
Ed Sprague 26 3B 1993 25.65 0.34 0.04 0.16 0.05 0.05 0.27
Mike Moustakas 29 3B 2017 25.65 0.12 0.01 0.10 0.10 0.07 0.29
Dave Magadan 31 3B 1993 25.68 0.34 0.08 0.05 0.03 0.18 0.33
Matt Dominguez 25 3B 2014 25.70 0.04 0.09 0.08 0.21 0.04 0.36

Finally, let’s look at the fastest players who were behind the plate, at first base or in the dugout. These are, perhaps, the most glaring instances of wasted speed.

Wasted Speed, 1951-2017
First Last Age Pos Yr Pred SS (ft/s) F1 F2 F3 F4 F5 F6
Gary Sheffield 39 DH 2007 27.88 0.82 0.77 0.31 0.95 0.68 0.48
Don Baylor 34 DH 1983 27.76 0.71 0.67 0.42 0.54 0.74 0.70
Vic Power 32 1B 1959 27.72 0.18 0.80 0.60 0.98 0.26 0.07
Orlando Cepeda 22 1B 1959 27.69 0.82 0.98 0.46 0.49 0.62 0.07
Tommy McCraw 28 1B 1968 27.69 0.94 0.88 0.95 0.55 0.40 0.11
Wil Myers 25 1B 2016 27.69 0.97 0.96 0.78 0.89 0.62 0.14
Donn Clendenon 30 1B 1965 27.68 0.32 0.76 0.98 0.72 0.64 0.08
Hal McRae 35 DH 1980 27.64 0.72 0.46 0.69 0.78 0.44 0.41
Wes Parker 26 1B 1965 27.61 0.65 0.79 0.82 0.85 0.89 0.08
Johnny Damon 38 DH 2011 27.60 0.77 0.74 0.78 0.65 0.95 0.28

At the end of his original article, James noted, “While I’ve been treating this thing basically as a toy, just running numbers to see who looks better than who, there are some substantial sabermetric questions for which it would be handy.” This, it seems to me, remains true today. Speed Scores both demonstrate and take advantage of the many ways in which speed impacts the game.

I realized while working on this topic that I’ve been missing opportunities to use speed to project player performance. Speed Scores also suggest ways in which we might use Sprint Speeds. By just inverting the factor formulas, we transition from predicting speed from performance to predicting performance from speed.

Jared Cross is a co-creator of Steamer Projections and consults for a Major League team. In real life, he teaches science and mathematics in Brooklyn.