More inverted records
A while back, I reintroduced an old Bill James invention that translates a hitter’s line into a pitching line. It’s frivolous fun that has few, if any, practical applications, but it helps pass the time between baseball seasons.
In response to the original article, readers offered many suggestions for further work. One suggestion was to focus on “average” players. That’s sort of the premise here, although we will wander around a bit because… well, why not.
Preliminaries
To create our sample, I first identified all batting title qualifiers in 2010 (n = 151). After running the translations, I then identified all pitchers who worked at least 106 innings (n = 139) and added them to the pool, giving us a total of 290 players. I chose 106 innings as the threshold because that represents the lowest total among translated batting lines (Chase Utley’s 2010 comes out to 106.1 IP) and it gives a number fairly similar to the number of hitters.
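I then found the average translated pitching line of the 151 batters:

| Player | IP | H | R | ER | HR | BB | SO | ERA | HR/9 | BB/9 | SO/9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Sample average | 140.0 | 149 | 83 | 75 | 18 | 54 | 103 | 4.81 | 1.16 | 3.48 | 6.65 |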
As fate would have it, two hitters and a pitcher matched this ERA in 2010. Here is the average line along with the lines of those players:
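| Player | IP | H | R | ER | HR | BB | SO | ERA | HR/9 | BB/9 | SO/9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Sample average | 140.0 | 149 | 83 | 75 | 18 | 54 | 103 | 4.81 | 1.16 | 3.48 | 6.65 |
| Jason Hammel | 177.2 | 201 | 97 | 95 | 18 | 47 | 141 | 4.81 | 0.91 | 2.38 | 7.14 |
| Adam LaRoche | 142.1 | 146 | 84 | 76 | 25 | 48 | 172 | 4.81 | 1.58 | 3.04 | 10.88 |
| Raul Ibanez | 144.0 | 154 | 86 | 77 | 16 | 68 | 108 | 4.81 | 1.00 | 4.25 | 6.75 |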
Note that the sample’s 4.81 ERA is higher than the 2010 average MLB ERA (4.08), which makes sense since we are drawing only from the hitters who qualified for the batting title (and who presumably are better than those who didn’t).
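For anyone who wants to check the rate columns, here is a minimal Python sketch. It assumes the usual box-score convention in which ".1" and ".2" after the innings total mean one-third and two-thirds of an inning, and it treats ERA as earned runs per nine innings. The function names are mine, not part of the original translation method.

```python
def innings(ip_notation: str) -> float:
    """Convert box-score innings ('177.2' means 177 and 2/3) to a decimal value."""
    whole, _, thirds = ip_notation.partition(".")
    return int(whole) + (int(thirds) if thirds else 0) / 3


def per_nine(count: int, ip_notation: str) -> float:
    """Scale a counting stat to a per-nine-innings rate, rounded to two places."""
    return round(9 * count / innings(ip_notation), 2)


# Jason Hammel's 2010 line, copied from the table above.
hammel = {"IP": "177.2", "ER": 95, "HR": 18, "BB": 47, "SO": 141}

# "ER" per nine innings is ERA; the others are HR/9, BB/9, and SO/9.
rates = {stat: per_nine(hammel[stat], hammel["IP"]) for stat in ("ER", "HR", "BB", "SO")}
print(rates)  # {'ER': 4.81, 'HR': 0.91, 'BB': 2.38, 'SO': 7.14}
```

Innings are passed as strings so that "177.2" is never mistaken for the decimal number 177.2.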
For grins, I also divided our sample of 151 hitters into 10 groups according to OPS+ (plus Cesar Izturis, who was so bad he gets his own group). Group A consisted of players in the top 10% (1-15), Group B the next 10% (16-30), and so on. Here are the translated lines for the “leader” of each group:
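| Group | Rank | OPS+ | Player | IP | H | R | ER | HR | BB | SO | ERA | HR/9 | BB/9 | SO/9 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | 1 | 179 | Miguel Cabrera | 132.0 | 180 | 141 | 127 | 38 | 89 | 95 | 8.66 | 2.59 | 6.07 | 6.48 |
| B | 16 | 141 | Adrian Beltre | 144.1 | 189 | 110 | 99 | 28 | 40 | 82 | 6.17 | 1.75 | 2.49 | 5.11 |
| C | 31 | 130 | Nick Swisher | 141.0 | 163 | 103 | 93 | 29 | 58 | 139 | 5.94 | 1.85 | 3.70 | 8.87 |
| D | 46 | 122 | Vladimir Guerrero | 148.1 | 178 | 95 | 86 | 29 | 35 | 60 | 5.22 | 1.76 | 2.12 | 3.64 |
| E | 61 | 113 | Stephen Drew | 141.0 | 157 | 94 | 85 | 15 | 62 | 108 | 5.43 | 0.96 | 3.96 | 6.89 |
| F | 76 | 106 | Adam LaRoche | 142.1 | 146 | 84 | 76 | 25 | 48 | 172 | 4.81 | 1.58 | 3.04 | 10.88 |
| G | 91 | 102 | Carlos Pena | 132.0 | 95 | 72 | 65 | 28 | 87 | 158 | 4.43 | 1.91 | 5.93 | 10.77 |
| H | 106 | 95 | Ben Zobrist | 148.0 | 129 | 77 | 69 | 10 | 92 | 107 | 4.20 | 0.61 | 5.59 | 6.51 |
| I | 121 | 90 | Derek Jeter | 171.2 | 179 | 83 | 75 | 10 | 63 | 106 | 3.93 | 0.52 | 3.30 | 5.56 |
| J | 136 | 83 | A.J. Pierzynski | 125.0 | 128 | 50 | 45 | 9 | 15 | 39 | 3.24 | 0.65 | 1.08 | 2.81 |
| Izt | 151 | 50 | Cesar Izturis | 129.2 | 109 | 35 | 32 | 1 | 25 | 53 | 2.22 | 0.07 | 1.74 | 3.68 |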
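The grouping is simple enough to sketch in a few lines of Python. This assumes groups of 15 by OPS+ rank, which is what the leader ranks in the table imply (1, 16, 31, and so on), with rank 151 left over as its own "Izt" group. The function name and defaults are illustrative, not something from the original workbook.

```python
def ops_plus_group(rank: int, group_size: int = 15, n_groups: int = 10) -> str:
    """Return the group label for a 1-based OPS+ rank among the 151 qualifiers."""
    index = (rank - 1) // group_size
    if index >= n_groups:
        return "Izt"  # the leftover 151st hitter gets his own group
    return "ABCDEFGHIJ"[index]


# Ranks of the group "leaders" listed in the table above.
leader_ranks = [1, 16, 31, 46, 61, 76, 91, 106, 121, 136, 151]
labels = [ops_plus_group(rank) for rank in leader_ranks]
print(labels)  # ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'Izt']
```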
First off, mad props to the Orioles for letting Izturis qualify for the batting title. The gap between him and the second worst OPS+ among qualifiers (Alcides Escobar, 67) is greater than that between no. 59 (Mike Napoli, 113) and no. 104 (Starlin Castro, 97). Not every team would have the guts to stick a bat that bad out there every day, so way to go.
Cabrera’s line doesn’t compare with that of any pitcher in our sample, for the obvious reason that nobody (not even the Orioles) would let a guy work 106 innings while pitching like that. Izturis, on the other hand, matches up with the game’s elite pitchers:
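| Player | IP | H | R | ER | HR | BB | SO | ERA | HR/9 | BB/9 | SO/9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Cesar Izturis | 129.2 | 109 | 35 | 32 | 1 | 25 | 53 | 2.22 | 0.07 | 1.74 | 3.68 |
| Felix Hernandez | 249.2 | 194 | 80 | 63 | 17 | 70 | 232 | 2.27 | 0.61 | 2.52 | 8.36 |
| Josh Johnson | 183.2 | 155 | 51 | 47 | 7 | 48 | 186 | 2.30 | 0.34 | 2.35 | 9.11 |
| Clay Buchholz | 173.2 | 142 | 55 | 45 | 9 | 67 | 120 | 2.33 | 0.47 | 3.47 | 6.22 |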
Close matches
The next thing I did was try to find roughly equivalent lines among hitters and pitchers. I approached this a number of different ways.
My first attempt involved matching ERAs. There were many perfect matches, but here are the extremes. First the low end:
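| Player | IP | H | R | ER | HR | BB | SO | ERA | HR/9 | BB/9 | SO/9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Chone Figgins | 167.2 | 156 | 67 | 60 | 1 | 74 | 114 | 3.22 | 0.05 | 3.97 | 6.12 |
| Alberto Callaspo | 148.1 | 149 | 59 | 53 | 10 | 31 | 42 | 3.22 | 0.61 | 1.88 | 2.55 |
| Chris Carpenter | 235.0 | 214 | 99 | 84 | 21 | 63 | 179 | 3.22 | 0.80 | 2.41 | 6.86 |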
And then the high end:
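| Player | IP | H | R | ER | HR | BB | SO | ERA | HR/9 | BB/9 | SO/9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Nick Swisher | 141.0 | 163 | 103 | 93 | 29 | 58 | 139 | 5.94 | 1.85 | 3.70 | 8.87 |
| Scott Kazmir | 150.0 | 158 | 103 | 99 | 25 | 79 | 93 | 5.94 | 1.50 | 4.74 | 5.58 |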
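For the curious, finding perfect ERA matches boils down to grouping players by their published two-decimal ERA and keeping the values that appear on both sides. The sketch below uses a handful of rows from the tables above rather than the full 290-player pool. The data layout and function name are assumptions made for illustration.

```python
from collections import defaultdict

# (name, role, ERA) rows copied from the tables above; a tiny illustrative subset.
rows = [
    ("Chone Figgins", "hitter", 3.22),
    ("Alberto Callaspo", "hitter", 3.22),
    ("Chris Carpenter", "pitcher", 3.22),
    ("Nick Swisher", "hitter", 5.94),
    ("Scott Kazmir", "pitcher", 5.94),
    ("Miguel Cabrera", "hitter", 8.66),
]


def exact_matches(rows):
    """Map each ERA shared by at least one hitter and one pitcher to those players."""
    by_era = defaultdict(lambda: {"hitter": [], "pitcher": []})
    for name, role, era in rows:
        by_era[era][role].append(name)
    # Keep only ERAs that show up on both sides, in ascending order.
    return {era: group for era, group in sorted(by_era.items())
            if group["hitter"] and group["pitcher"]}


matches = exact_matches(rows)
for era, group in matches.items():
    print(era, group["hitter"], group["pitcher"])
```

Cabrera drops out here because no pitcher in the subset shares his 8.66. The same grouping would work for HR, BB/9, or SO/9 by swapping in a different column.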
Next I tried home runs. This ranged from several hitters and Brett Anderson at 6 to Joey Votto and Rodrigo Lopez at 37. Here is my favorite:
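| Player | IP | H | R | ER | HR | BB | SO | ERA | HR/9 | BB/9 | SO/9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Michael Cuddyer | 159.0 | 165 | 80 | 72 | 14 | 58 | 93 | 4.08 | 0.79 | 3.28 | 5.26 |
| Gavin Floyd | 187.1 | 199 | 92 | 85 | 14 | 58 | 151 | 4.08 | 0.67 | 2.79 | 7.25 |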
This one is cool because Cuddyer and Floyd match not only in homers but also in ERA and walks. Cuddyer and Floyd also are your Mr. Average for 2010, both checking in with an ERA that matches MLB average.
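Others of note are these three pairs, each of which had the same number of homers in nearly the same number of innings:

| Player | IP | H | R | ER | HR | BB | SO | ERA | HR/9 | BB/9 | SO/9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Jeff Francoeur | 121.1 | 113 | 53 | 48 | 13 | 30 | 81 | 3.56 | 0.96 | 2.23 | 6.01 |
| Hisanori Takahashi | 122.0 | 116 | 51 | 49 | 13 | 43 | 114 | 3.61 | 0.96 | 3.17 | 8.41 |
| Brandon Phillips | 162.1 | 172 | 86 | 77 | 18 | 46 | 83 | 4.27 | 1.00 | 2.55 | 4.60 |
| Rick Porcello | 162.2 | 188 | 96 | 89 | 18 | 38 | 84 | 4.92 | 1.00 | 2.10 | 4.65 |
| Mike Young | 168.0 | 186 | 93 | 84 | 21 | 50 | 115 | 4.50 | 1.13 | 2.68 | 6.16 |
| Chris Narveson | 167.2 | 172 | 96 | 93 | 21 | 59 | 137 | 4.99 | 1.13 | 3.17 | 7.35 |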
I also looked at perfect BB/9 matches. Low end:
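| Player | IP | H | R | ER | HR | BB | SO | ERA | HR/9 | BB/9 | SO/9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Roy Halladay | 250.2 | 231 | 74 | 68 | 24 | 30 | 219 | 2.44 | 0.86 | 1.08 | 7.86 |
| A.J. Pierzynski | 125.0 | 128 | 50 | 45 | 9 | 15 | 39 | 3.24 | 0.65 | 1.08 | 2.81 |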
And high end:
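| Player | IP | H | R | ER | HR | BB | SO | ERA | HR/9 | BB/9 | SO/9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Luke Scott | 112.0 | 127 | 91 | 82 | 27 | 59 | 98 | 6.59 | 2.17 | 4.74 | 7.88 |
| Scott Kazmir | 150.0 | 158 | 103 | 99 | 25 | 79 | 93 | 5.94 | 1.50 | 4.74 | 5.58 |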
How about SO/9? Sure, why not; again, from low to high:
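| Player | IP | H | R | ER | HR | BB | SO | ERA | HR/9 | BB/9 | SO/9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Carl Pavano | 221.0 | 227 | 95 | 92 | 24 | 37 | 117 | 3.75 | 0.98 | 1.51 | 4.76 |
| Shane Victorino | 149.1 | 152 | 88 | 79 | 18 | 53 | 79 | 4.76 | 1.08 | 3.19 | 4.76 |
| Bud Norris | 153.2 | 151 | 94 | 84 | 18 | 77 | 158 | 4.92 | 1.05 | 4.51 | 9.25 |
| Dan Uggla | 145.0 | 169 | 115 | 104 | 33 | 78 | 149 | 6.46 | 2.05 | 4.84 | 9.25 |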
A different approach
I eventually came up with a poor man’s variant on similarity scores. Lacking the resources to do exactly what I wanted, I lined up the 151 batters and 139 pitchers side by side, in ascending order of ERA. This gave 139 matched pairs, from Izturis (2.22 ERA) and Hernandez (2.27) to Robinson Cano (6.63 ERA) and Ryan Rowland-Smith (6.75).
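In code, the side-by-side lineup amounts to sorting each list by ERA and zipping them together, which drops whatever is left over on the longer list. Here is a rough sketch using a few players from this article. The data structures and function name are mine, and the pairs it produces for this tiny subset are illustrative only, since real pairings depend on the full lists.

```python
# (name, ERA) tuples; a small illustrative subset of the 151 hitters and 139 pitchers.
hitters = [
    ("Cesar Izturis", 2.22),
    ("Robinson Cano", 6.63),
    ("Miguel Cabrera", 8.66),
    ("Nyjer Morgan", 3.24),
]
pitchers = [
    ("Ryan Rowland-Smith", 6.75),
    ("Felix Hernandez", 2.27),
    ("R.A. Dickey", 2.84),
]


def pair_by_era(hitters, pitchers):
    """Sort both lists by ascending ERA and pair them off row by row."""
    def era(row):
        return row[1]
    # zip() stops at the shorter list, so surplus high-ERA hitters go unmatched.
    return list(zip(sorted(hitters, key=era), sorted(pitchers, key=era)))


pairs = pair_by_era(hitters, pitchers)
for hitter, pitcher in pairs:
    print(hitter, pitcher)
```

With this subset, Cabrera is the odd man out, much as the 12 highest-ERA hitters were in the full sample.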
Ideally, I would have evaluated each player against all other players, but instead I compared values only within each matched pair. I looked at the rate statistics (ERA, HR/9, BB/9, and SO/9) and calculated the difference between hitter and pitcher for each. Then I summed all four differences in a couple of different ways (it’s a bit messy but it yields decent results; details can be found in References & Resources, for those interested in learning more and/or improving the method). I ran numbers for all 139 matched pairs and then sorted by lowest total differential. Here are the best matches:
| Player | IP | H | R | ER | HR | BB | SO | ERA | HR/9 | BB/9 | SO/9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Shane Victorino | 149.1 | 152 | 88 | 79 | 18 | 53 | 79 | 4.76 | 1.08 | 3.19 | 4.76 |
| Jake Westbrook | 202.2 | 203 | 99 | 95 | 20 | 68 | 128 | 4.22 | 0.89 | 3.02 | 5.68 |
| Starlin Castro | 118.0 | 139 | 62 | 56 | 3 | 29 | 71 | 4.27 | 0.23 | 2.21 | 5.42 |
| Jeremy Guthrie | 209.1 | 193 | 93 | 89 | 25 | 50 | 119 | 3.83 | 1.07 | 2.15 | 5.12 |
| Jhonny Peralta | 145.0 | 137 | 69 | 62 | 15 | 53 | 103 | 3.85 | 0.93 | 3.29 | 6.39 |
| Jon Garland | 200.0 | 176 | 86 | 77 | 20 | 87 | 136 | 3.46 | 0.90 | 3.92 | 6.12 |
| Ryan Braun | 151.1 | 188 | 113 | 102 | 25 | 56 | 105 | 6.07 | 1.49 | 3.33 | 6.24 |
| A.J. Burnett | 186.2 | 204 | 118 | 109 | 25 | 78 | 145 | 5.26 | 1.21 | 3.76 | 6.99 |
| Gaby Sanchez | 146.1 | 156 | 88 | 79 | 19 | 57 | 101 | 4.86 | 1.17 | 3.51 | 6.21 |
| Wade LeBlanc | 146.0 | 157 | 69 | 69 | 24 | 51 | 110 | 4.25 | 1.48 | 3.14 | 6.78 |
| Robinson Cano | 150.2 | 200 | 123 | 111 | 29 | 57 | 77 | 6.63 | 1.73 | 3.40 | 4.60 |
| Ryan Rowland-Smith | 109.1 | 141 | 94 | 82 | 25 | 44 | 49 | 6.75 | 2.06 | 3.62 | 4.03 |
| Alex Gonzalez | 157.0 | 149 | 74 | 67 | 23 | 31 | 118 | 3.84 | 1.32 | 1.78 | 6.76 |
| Hiroki Kuroda | 196.1 | 180 | 87 | 74 | 15 | 48 | 159 | 3.39 | 0.69 | 2.20 | 7.29 |
| Alexis Rios | 148.2 | 161 | 80 | 72 | 21 | 38 | 93 | 4.36 | 1.27 | 2.30 | 5.63 |
| Derek Lowe | 193.2 | 204 | 88 | 86 | 18 | 61 | 136 | 4.00 | 0.84 | 2.83 | 6.32 |
| Nyjer Morgan | 139.0 | 129 | 55 | 50 | 0 | 40 | 88 | 3.24 | 0.00 | 2.59 | 5.70 |
| R.A. Dickey | 174.1 | 165 | 62 | 55 | 13 | 42 | 104 | 2.84 | 0.67 | 2.17 | 5.37 |
| Casey McGehee | 154.1 | 174 | 93 | 84 | 23 | 50 | 102 | 4.90 | 1.34 | 2.92 | 5.95 |
| Jeff Niemann | 174.1 | 159 | 86 | 85 | 25 | 61 | 131 | 4.39 | 1.29 | 3.15 | 6.76 |
| Franklin Gutierrez | 150.2 | 139 | 66 | 59 | 12 | 50 | 137 | 3.52 | 0.72 | 2.99 | 8.18 |
| Cole Hamels | 208.2 | 185 | 74 | 71 | 26 | 61 | 211 | 3.06 | 1.12 | 2.63 | 9.10 |
| Victor Martinez | 122.0 | 149 | 82 | 74 | 20 | 40 | 52 | 5.46 | 1.48 | 2.95 | 3.84 |
| John Lannan | 143.1 | 175 | 82 | 74 | 14 | 49 | 71 | 4.65 | 0.88 | 3.08 | 4.46 |
| Brennan Boesch | 118.0 | 119 | 64 | 58 | 14 | 40 | 99 | 4.42 | 1.07 | 3.05 | 7.55 |
| Ross Ohlendorf | 108.1 | 106 | 54 | 49 | 12 | 44 | 79 | 4.07 | 1.00 | 3.66 | 6.56 |
| Jeff Francoeur | 121.1 | 113 | 53 | 48 | 13 | 30 | 81 | 3.56 | 0.96 | 2.23 | 6.01 |
| Matt Cain | 223.1 | 181 | 84 | 78 | 22 | 61 | 177 | 3.14 | 0.89 | 2.46 | 7.13 |
Again, it’s not perfect (and I am sure there are better ways to achieve the intended goal), but this gives us some idea of players with similarly shaped lines. Need a guy who serves up homers? Try Cano or Rowland-Smith. How about a control freak? You want Gonzalez or Kuroda. Maybe you’re more of a strikeout guy: Gutierrez or Hamels. And on it goes, limited only by your imagination. Have fun!
References & Resources
Thanks to readers of the original article for finding it interesting enough to inspire a sequel. One suggested idea that I didn’t cover here is that of head-to-head matchups. The samples would be too small to have meaning but it might be fun—in a Spock versus Evil Spock kind of way—to see, e.g., how Cuddyer fared against Floyd in 2010 (he went 4-for-11 with a double and two strikeouts, if you’re curious).
As for the messy bits mentioned above, I summed ERA, HR/9, BB/9, SO/9 in two ways. The first takes the absolute value of each component, while the second takes the absolute value of the sum:
Method 1: |ERA dif| + |HR/9 dif| + |BB/9 dif| + |SO/9 dif|
Method 2: |(ERA dif + HR/9 dif + BB/9 dif + SO/9 dif)|
For example:
| Player | IP | H | R | ER | HR | BB | SO | ERA | HR/9 | BB/9 | SO/9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Robinson Cano | 150.2 | 200 | 123 | 111 | 29 | 57 | 77 | 6.63 | 1.73 | 3.40 | 4.60 |
| Ryan Rowland-Smith | 109.1 | 141 | 94 | 82 | 25 | 44 | 49 | 6.75 | 2.06 | 3.62 | 4.03 |

Method 1: |6.75 - 6.63| + |2.06 - 1.73| + |3.62 - 3.40| + |4.03 - 4.60| = 1.24
Method 2: |(6.75 - 6.63) + (2.06 - 1.73) + (3.62 - 3.40) + (4.03 - 4.60)| = 0.10
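For readers who want to tinker, the two sums can be sketched in Python as follows. The function and variable names are mine. The sketch only computes the two scores for one pair and makes no assumption about how the two were combined to rank the matches.

```python
RATE_STATS = ("ERA", "HR/9", "BB/9", "SO/9")


def differences(hitter, pitcher):
    """Pitcher value minus hitter value for each of the four rate stats."""
    return [pitcher[stat] - hitter[stat] for stat in RATE_STATS]


def method_1(hitter, pitcher):
    """Sum of the absolute differences: every gap counts against the match."""
    return round(sum(abs(d) for d in differences(hitter, pitcher)), 2)


def method_2(hitter, pitcher):
    """Absolute value of the summed differences: opposite-signed gaps cancel."""
    return round(abs(sum(differences(hitter, pitcher))), 2)


cano = {"ERA": 6.63, "HR/9": 1.73, "BB/9": 3.40, "SO/9": 4.60}
rowland_smith = {"ERA": 6.75, "HR/9": 2.06, "BB/9": 3.62, "SO/9": 4.03}

print(method_1(cano, rowland_smith))  # 1.24
print(method_2(cano, rowland_smith))  # 0.1
```

Method 2 comes out so much smaller for this pair because Rowland-Smith's lower strikeout rate offsets his higher ERA, home run, and walk rates once the signs are kept.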
There was a logic behind this at some point, but I forget what it was. What’s important is that using both together yields better results than using just one. If you have suggestions on how to improve this or if you have ideas for further study, please share them in the comments.
Izt group = hilarious. I don’t see why Carlos Lee and Chone Figgins get so much hate when we have an Izturis posting an OPS+ of 50.
Interesting idea, but it matches good hitters with bad pitchers and good pitchers with bad hitters.
What about a system designed to match the good with the good? Start with the league average for the various reciprocal categories and then flip them. For example, translate a desirably high strikeout rate for a pitcher into a desirably low strikeout rate for a hitter, based on their desirable deviations from league average.
@Matt: Amazingly, Izturis’ 50 OPS+ is only the 17th worst among batting title qualifiers since 1961. Clint Barmes (47 in 2006) is the most recent to have a worse OPS+, while Matt Walbeck’s 37 in 1994 is the lowest since Art Scharein’s 34 in 1933. I guess Izturis still has work to do if he wants to be remembered with the immortals.
@Sabertooth: That would attempt to answer a different question but sounds interesting. Thanks for the suggestion.