Confessions of an RBI Fanatic

I have a confession to make: I’m a big fan of the RBI. That may not
sound so bad to many of you, but think about my position. Here I am
writing for an “analysis-oriented” baseball website. Not only that,
but my own writing is often supported by quite a bit of statistical
analysis; I definitely consider myself squarely in the sabermetric
camp. That’s why I’m a bit nervous about revealing my love for the
run-batted-in. See, the RBI doesn’t have many fans among the
sabermetrically-minded. Just consider that when citing a player’s key
offensive stats we no longer write something like .300-30-100
(average, HR, RBI), but rather a line like this: .300/.350/.500
(average, on-base percentage, slugging percentage). Indeed, my
Baseball Prospectus 2004 Annual does not even report RBI for players.
We just don’t have much use for the RBI anymore, it seems. That’s why
it’s with some trepidation that I admit my penchant for the RBI.

Oh, I realize the shortcomings of the statistic. I know that the RBI is not a
good stat for evaluating hitting ability. It’s too dependent on things beyond a player’s
control: the ability of his teammates to get on base,
his position in the batting order, things like that. But, you know, a solid
base hit that scores a runner in a close game—well, it’s
exciting, dammit. It’s the play that gets you to clench your fist and yell
“Yeah!” (even if you’re alone watching on TV).
A key RBI in a tight spot is often the high point of a close
ballgame.

I believe the backlash that the RBI has received in the
analysis community arises from the general public’s overrating
of the stat. It’s true that some mediocre players have racked up some
pretty impressive RBI totals. And likewise, often the best hitters
will not be found among the league leaders in RBI. OK, that’s fine:
the RBI is not a particularly good measure of batting ability. But it
is one of the best things to see on a ball field, so it merits some
attention.

The Players’ Opinion

The RBI has always been held in high esteem by players, managers and coaches.
Hank Greenberg, one of the top
RBI men of his generation, recounted to Lawrence Ritter his view of
the RBI. From the incomparable The Glory of Their Times:

I’ve always believed that the most important aspect of hitting was
driving in runs. Runs batted in are more important than batting
average, more important than home runs, more important than
anything. That’s what wins ball games: driving runs across the plate.

Charlie Gehringer used to bat ahead of me, and if we had a man on
first base and Charlie was up, I’d yell, “Get him to third, Charlie,
just get him to third. I’ll get him in.” That was my goal: get that
man in.

Of course, Greenberg may have been biased; it’s natural to put a high
value on the thing at which you excel. I mean, if the question, “What
is the most important aspect of hitting?” is put to, say, Luis
Castillo, he might not give you the same answer as Greenberg. Still, I
think it’s probably safe to say that the ability to drive in runs is
considered very important by the majority of players, managers and
coaches. I could be wrong on that, since I don’t have first-hand
knowledge, but I don’t think so.

Making Sense of RBI Totals

One problem with the RBI statistic is that it doesn’t take into account
opportunity. Generally, a batter hitting third or fourth in the
batting order will have many more chances to drive in runs than a
leadoff man. Likewise, a hitter batting behind a couple of high-OBP
players will have more RBI opportunities than somebody batting behind
Corey Patterson and his .290 OBP. To determine the RBI opportunities of a player,
you need to turn to the play-by-play data.

I’ve devised my own method for determining the best RBI men in
baseball. The idea is to compare how a player does in his RBI
opportunities with the major league average. If he drives in
more than the average player would, given the same opportunities, he
gets “plus” RBI credits. If he drives in fewer, he gets “minus”
credits. This “plus/minus” system is nice in that it gives you an
immediate feel for how many extra RBI a particular hitter has
produced.

Before I start laying the tables on you, we need to hash through a few
details. Don’t worry, this will be short and sweet. First off, I’m
only interested in base runners driven in, i.e., I’m taking home
runs out of the equation. Home run totals are readily available, so we
know how often a player drives himself in. Next point: what is an RBI
opportunity? To me, it’s a plate appearance with runners on base.
That’s fairly simple, or is it? Should we count plate appearances when
the batter was walked? What about if he sacrificed the runner along?
Well, I’m going to count all plate appearances except intentional
walks. I realize that’s going to penalize selective hitters somewhat,
but let’s face it: selective hitters sometimes walk down to first base
instead of driving in runs. Actually, many hitters will take fewer
walks in an RBI situation. This is often a conscious decision on the
part of the hitter, as evidenced by this quote from Red Sox first
baseman Kevin Youkilis:

Sometimes, you know, in certain
situations you have to be more aggressive. With runners in scoring
position, you’ve got to be more aggressive and be ready to hit because
you want to drive in the runs; you don’t want to walk.

If it’s like second and third, one out, you get a pitch to hit, you’ve
got to hit it. That’s a big thing. You get a good pitch to hit in that
situation, you’ve got to hit it. You can’t go up there in that
situation taking a lot of pitches.

I also take into account where the runners are in any given
RBI opportunity. Obviously, it’s harder to drive a runner in from
first base than from third base. Also, when there is a runner on third
base, it’s much easier to drive him home with fewer than two outs,
because an out, if it’s not a strikeout or a pop fly, will often score
the runner in that situation. So, I take into account
the number of outs for runners on third base. You might be wondering
how often a runner is driven in from the different bases. Here you go:

Fraction of Runners Driven In
Base Occupied      Fraction Driven In
   1B                  0.054
   2B                  0.163
   3B, < 2 outs        0.518
   3B, 2 outs          0.236

So, using these averages and the specific opportunities for each batter,
I can determine how much better or worse he was at driving in runs
than the average major league batter.

Let's take Miguel Tejada's 150 RBI campaign in 2004 as an example.
Tejada hit 34 home runs that year, meaning he drove in 116 runners, a
very high total. In fact, that's the highest single-season total
achieved in the period 2000-2005. He had 381 plate appearances with
runners on base (also the highest in this period) with a total of 536
runners on. So, Miggy had a very large number of RBI opps in 2004. We
can break it down by the bases occupied when Tejada came to bat and
how successful he was in driving the runners in. Here is a table
showing the relevant numbers:

+----------+--------+------+------+-------+------+-------+------+-------+------+---------+
| Batter   | pa_rob | rob  | r1   | frac1 | r2   | frac2 |   r3 | frac3 | r3_2 | frac3_2 |
+----------+--------+------+------+-------+------+-------+------+-------+------+---------+
| Tejada   |    381 |  536 |  248 | 0.093 |  172 | 0.203 |   69 | 0.696 |   47 |   0.213 |
+----------+--------+------+------+-------+------+-------+------+-------+------+---------+
r1 = runners on 1B, frac1 = fraction of 1B runners driven in, etc.
r3 = runners on 3B, fewer than 2 outs
r3_2 = runners on 3B, 2 outs

Tejada drove in over 9% of his runners from first base, compared to
the league average of 5.4%. He was also much better at driving in
runners from second base and from third base with fewer than two
outs. When you actually do the arithmetic, you find that given the
runners on base that Tejada had in 2004, the average hitter would have
knocked in 88 runs, while Miggy drove 116 across the plate. The
difference between those two numbers (let's call it "Diff") is +28,
which is very good. It shows that while Tejada had a huge number of
opportunities, he made the best of them and drove in 28 more runs than
an average batter would have.
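
For anyone who wants to check that arithmetic, here is a minimal Python sketch. It uses only the league-average fractions and Tejada's 2004 runner counts from the tables above; the names (LEAGUE_FRAC, expected_rdi and so on) are made up for this illustration and don't come from any published tool.

```python
# League-average fraction of runners driven in, by base/out state
# (values from the "Fraction of Runners Driven In" table).
LEAGUE_FRAC = {
    "1B": 0.054,
    "2B": 0.163,
    "3B_lt2": 0.518,   # runner on 3B, fewer than 2 outs
    "3B_2out": 0.236,  # runner on 3B, 2 outs
}


def expected_rdi(runners):
    """Runners an average hitter would drive in, given these opportunities."""
    return sum(count * LEAGUE_FRAC[state] for state, count in runners.items())


# Tejada, 2004: runners on base when he came to bat (r1, r2, r3, r3_2).
tejada_runners = {"1B": 248, "2B": 172, "3B_lt2": 69, "3B_2out": 47}
tejada_rdi = 150 - 34  # RBI minus home runs = runners driven in

exp = expected_rdi(tejada_runners)
diff = tejada_rdi - exp

print(f"expected RDI: {exp:.0f}, actual RDI: {tejada_rdi}, Diff: {diff:+.0f}")
```

The four runner counts add up to the 536 total runners mentioned earlier, and the expected figure comes out a little over 88, which gives the +28 quoted in the text after rounding.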

Two adjustments I do not make: one for the speed of the runners on
base and one for the home ballpark. Okay, let's get to the results.

The Best RBI Men

I've got pbp data loaded for the years 2000-2005, so I'll be looking
at the best RBI seasons, as measured by Diff, during that period.
Here are the Top 20 RBI seasons over the last six years:

Top 20 Single-Season RBI Performances, 2000-2005
+-----------------------+------+--------+---------+------+------+
| Name                  | year | pa_rob | exp_RDI | RDI  | Diff |
+-----------------------+------+--------+---------+------+------+
| Teixeira_Mark         | 2005 |    355 |      64 |  101 |   37 |
| Helton_Todd           | 2000 |    347 |      71 |  105 |   34 |
| Cirillo_Jeff          | 2000 |    335 |      72 |  106 |   34 |
| Delgado_Carlos        | 2003 |    342 |      70 |  104 |   34 |
| Giambi_Jason          | 2000 |    331 |      61 |   94 |   33 |
| Delgado_Carlos        | 2000 |    329 |      64 |   96 |   32 |
| Gonzalez_Juan         | 2001 |    318 |      75 |  107 |   32 |
| Thomas_Frank          | 2000 |    309 |      69 |  101 |   32 |
| Ortiz_David           | 2005 |    351 |      70 |  101 |   31 |
| Helton_Todd           | 2001 |    352 |      66 |   97 |   31 |
| Tejada_Miguel         | 2002 |    351 |      68 |   97 |   29 |
| Martinez_Edgar        | 2000 |    354 |      79 |  108 |   29 |
| Ortiz_David           | 2004 |    323 |      70 |   99 |   29 |
| Tejada_Miguel         | 2004 |    381 |      88 |  116 |   28 |
| Pujols_Albert         | 2002 |    358 |      65 |   93 |   28 |
| Berkman_Lance         | 2002 |    307 |      59 |   87 |   28 |
| Rolen_Scott           | 2004 |    300 |      63 |   90 |   27 |
| Ramirez_Manny         | 2005 |    328 |      72 |   99 |   27 |
| Pujols_Albert         | 2003 |    301 |      55 |   81 |   26 |
| Sweeney_Mike          | 2000 |    373 |      90 |  116 |   26 |
+-----------------------+------+--------+---------+------+------+
RDI = runners driven in, i.e. RDI = RBI - HR

I have mentioned that there is no park adjustment, so the presence of
Helton (twice) and Cirillo on this list should be taken with a grain
of salt. The Rangers also play in a park favorable to hitters, but
Mark Teixeira's 2005 mark of +37 is very impressive nonetheless. It's
interesting that, by this measure, Tejada's 2002 season (131 RBI)
actually was a touch better than the 2004 season that we looked at
above.

We can also look at the whole period together to find out who is the
best RBI man of recent times. The answer is ... well, here's the
leader board:

Top 20 RBI Batters, 2000-2005, Ranked by Diff
+-----------------------+--------+---------+------+------+
| Name                  | pa_rob | exp_RDI | RDI  | Diff |
+-----------------------+--------+---------+------+------+
| Anderson_Garret       |   1843 |     373 |  500 |  127 |
| Helton_Todd           |   1919 |     368 |  490 |  122 |
| Ramirez_Manny         |   1771 |     377 |  497 |  120 |
| Guerrero_Vladimir     |   1722 |     326 |  443 |  117 |
| Delgado_Carlos        |   1820 |     372 |  487 |  115 |
| Sweeney_Mike          |   1564 |     323 |  436 |  113 |
| Pujols_Albert         |   1625 |     310 |  422 |  112 |
| Tejada_Miguel         |   2002 |     422 |  531 |  109 |
| Rodriguez_Alex        |   1979 |     390 |  484 |   94 |
| Sheffield_Gary        |   1824 |     368 |  457 |   89 |
| Ordonez_Magglio       |   1571 |     324 |  411 |   87 |
| Bonds_Barry           |   1200 |     208 |  293 |   85 |
| Giambi_Jason          |   1725 |     325 |  406 |   81 |
| Ortiz_David           |   1560 |     331 |  409 |   78 |
| Rolen_Scott           |   1634 |     336 |  413 |   77 |
| Kent_Jeff             |   1950 |     401 |  477 |   76 |
| Berkman_Lance         |   1743 |     351 |  427 |   76 |
| Gonzalez_Luis         |   1765 |     335 |  409 |   74 |
| Walker_Larry          |   1277 |     265 |  338 |   73 |
| Chavez_Eric           |   1742 |     343 |  414 |   71 |
+-----------------------+--------+---------+------+------+

Friends, Garret Anderson has been an RBI machine, driving in 127 more
runners than expected over the last six seasons. I was a little
surprised to see Anderson top the list, but it makes sense: the guy
rarely strikes out or walks and he hits the ball hard. Mike Sweeney
is another player in the same mold. Overall, this
is a list of some pretty great players. Barry Bonds, despite the
smallish number of opportunities (due to the many intentional walks
he's received), still ranks among the leaders.

Actually, the above list is biased towards players who have had many
RBI opportunities. If you're 10% better than average, then your Diff
value will grow with opportunities. So, I would like to present
another table, sorted by Diff per 300 plate appearances with runners
on (called Diff300). This puts all players on an even footing. (The
300 plate appearances with runners on base represents a typical
season's worth.) Here are the top 20 according to Diff300 (minimum 750
opportunities):

Top 20 RBI Batters, 2000-2005, Ranked by Diff300
+--------------------+--------+---------+------+------+---------+
| Name               | pa_rob | exp_RDI | RDI  | Diff | Diff300 |
+--------------------+--------+---------+------+------+---------+
| Sweeney_Mike       |   1564 |     323 |  436 |  113 |    21.6 |
| Teixeira_Mark      |    900 |     173 |  237 |   64 |    21.3 |
| Bonds_Barry        |   1200 |     208 |  293 |   85 |    21.3 |
| Pujols_Albert      |   1625 |     310 |  422 |  112 |    20.7 |
| Anderson_Garret    |   1843 |     373 |  500 |  127 |    20.7 |
| Guerrero_Vladimir  |   1722 |     326 |  443 |  117 |    20.3 |
| Ramirez_Manny      |   1771 |     377 |  497 |  120 |    20.3 |
| Helton_Todd        |   1919 |     368 |  490 |  122 |    19.1 |
| Delgado_Carlos     |   1820 |     372 |  487 |  115 |    19.0 |
| Walker_Larry       |   1277 |     265 |  338 |   73 |    17.2 |
| Ordonez_Magglio    |   1571 |     324 |  411 |   87 |    16.7 |
| Tejada_Miguel      |   2002 |     422 |  531 |  109 |    16.4 |
| Everett_Carl       |   1310 |     257 |  323 |   66 |    15.2 |
| Ortiz_David        |   1560 |     331 |  409 |   78 |    15.1 |
| Sheffield_Gary     |   1824 |     368 |  457 |   89 |    14.7 |
| Gonzalez_Juan      |    941 |     195 |  241 |   46 |    14.6 |
| Matsui_Hideki      |   1029 |     211 |  261 |   50 |    14.5 |
| Rodriguez_Alex     |   1979 |     390 |  484 |   94 |    14.3 |
| Rolen_Scott        |   1634 |     336 |  413 |   77 |    14.2 |
| Giambi_Jason       |   1725 |     325 |  406 |   81 |    14.1 |
+--------------------+--------+---------+------+------+---------+

It's mostly the same guys reshuffled, with the notable inclusion of
Teixeira, who has only three seasons under his belt. There are a
couple of other new names, as well: Carl Everett and Hideki Matsui.
(Does anybody else find it curious that one of these is named after a
dinosaur and the other one doesn't believe in dinosaurs?)
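
The Diff300 scaling itself is a one-liner. The sketch below is just an illustration, with a made-up function name, applied to a few rows of the table above. Because it starts from the rounded Diff values as printed, the results can differ from the published Diff300 column by a tenth or so.

```python
def diff300(diff, pa_rob):
    """Diff scaled to 300 plate appearances with runners on base."""
    return 300 * diff / pa_rob


# (name, pa_rob, Diff) copied from the Diff300 table, with Diff as published.
rows = [
    ("Sweeney_Mike", 1564, 113),
    ("Teixeira_Mark", 900, 64),
    ("Bonds_Barry", 1200, 85),
]

for name, pa_rob, diff in rows:
    print(f"{name:<15} {diff300(diff, pa_rob):5.1f}")
```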

Some Interesting Cases

In the table below I've listed a few players who had interesting RBI
numbers. I've also included some other players who are nearby in RBI
ability, for comparison.

+--------------------+--------+---------+------+------+---------+
| Name               | pa_rob | exp_RDI | RDI  | Diff | Diff300 |
+--------------------+--------+---------+------+------+---------+
| Beltran_Carlos     |   1665 |     334 |  396 |   62 |    11.1 |
| Abreu_Bobby        |   1872 |     373 |  437 |   64 |    10.2 |
| Konerko_Paul       |   1705 |     346 |  404 |   58 |    10.2 |
+--------------------+--------+---------+------+------+---------+
| Sosa_Sammy         |   1690 |     342 |  383 |   41 |     7.2 |
| Molina_Bengie      |   1241 |     261 |  288 |   27 |     6.6 |
| Thome_Jim          |   1680 |     349 |  383 |   34 |     6.1 |
+--------------------+--------+---------+------+------+---------+
| Spiezio_Scott      |   1116 |     239 |  242 |    3 |     0.7 |
| Snow_J.T.          |   1234 |     275 |  278 |    3 |     0.7 |
| Jeter_Derek        |   1626 |     316 |  319 |    3 |     0.5 |
| Jones_Andruw       |   2011 |     414 |  417 |    3 |     0.4 |
| Lieberthal_Mike    |   1231 |     251 |  252 |    1 |     0.2 |
| Polanco_Placido    |   1346 |     247 |  247 |    0 |    -0.1 |
+--------------------+--------+---------+------+------+---------+
| Gonzalez_Alex      |   1358 |     273 |  245 |  -28 |    -6.2 |
| Dunn_Adam          |   1270 |     243 |  216 |  -27 |    -6.5 |
| Diaz_Einar         |    809 |     164 |  146 |  -18 |    -6.8 |
+--------------------+--------+---------+------+------+---------+

I've included Bobby Abreu in the table, because apparently there are
some who believe that Abreu would rather take a walk than drive in a
run. Here's a quote from a recent article by S.I.'s Tom Verducci:
"He's the kind of hitter who is happy with a walk in run-scoring
situations, which sometimes leads to looking at third strikes."
Abreu has been as adept at driving runners in as Carlos Beltran or
Paul Konerko, and I don't hear anybody complaining about those guys.
(Caveat: the numbers are through 2005.)

Bengie Molina? Yep, hanging with Sosa and Thome as a solid RBI man. I
have no further comment.

I was very surprised to find that Derek Jeter is just average at
driving in runs. I mean, I think many people would love to have Jeter
step to the plate when an RBI is needed. But, in fact, he's no better
at driving in runners than Scott Spiezio, say, or Mike Lieberthal. And
Andruw Jones' high RBI totals (he'll top 100 for the fifth time this
season) have mostly been due to lots of opportunities. Given the same
number of chances, J.T. Snow, say, or Placido Polanco, would drive in
as many runs as Andruw. And finally, we come to Adam Dunn, who is
really not very good
at getting runs in. He ranks below Alex Gonzalez, which says it
all. Don't ask me which Alex Gonzalez, because I don't
know. But does it really matter?

Final Disclaimer

I realize that my method of determining the best RBI producers is not
perfect. Garret Anderson (circa 2001) may well have the best chance of driving in a
runner on base, and in that sense he could be preferable in a given
situation to, say, Manny Ramirez. However, Anderson also has a greater
chance of making an out than Ramirez does, so maybe you'd
rather have Ramirez up there after all. You probably would, actually.

Still, I think the things I've learned while doing this are
interesting and possibly even useful. Who knows? Imagine this future
hypothetical scene:

The Yankees are trailing by a run in
the eighth inning, two outs, runners on second and third. Bengie Molina,
the new Yankee backup catcher, is sitting on the bench, dreaming about
the post-game spread. Next to him, manager Joe Torre pulls from his
back pocket a 4x6 index card with "RBI DIFF" written across the top.
After studying the card for a few seconds, he turns to the player
sitting to his left and growls, "Molina, wake up and grab a bat.
You're hitting for Jeter."

References & Resources

  • Play-by-play data for seasons 1957-1998, 2000-2005 can be obtained at Retrosheet. And it's all free!
  • A more serious and rigorous look at the RBI from the sabermetric viewpoint is provided by Tom Ruane,
    here. It is an excellent study.

