Data Erratum Et Cetera

While pulling the stats together this week, I noticed a few things that might catch your interest. They caught mine, at least.

Data Errata

But first, a minor correction to last week’s article, Never Swat an Infield Fly. Turns out we had some bad data in our stats, which didn’t affect the general conclusions at all, but did affect several of the specific pitcher stat lines.

So let me repeat a table of the top and bottom IF/Fly pitchers, along with a few comments. I like to think there are two types of pitcher stats: FIP stats (K, BB, HR) and DER stats (G/F, LD, IF/Fly), and I’ll try to highlight them. Here are the leading IF/Fly pitchers.

Player         Team    RA    LD%    G/F IF/Fly    DER   K/9   BB/9  HR/9    Comments
Robertson N.   DET   4.12   .156   1.36   .356   .750   7.9    3.7   0.9    Great basic stats.
Hermanson D.   SFG   4.81   .138   1.14   .321   .706   6.4    2.7   1.2    Even better, but poor DER.
Lilly T.       TOR   4.52   .187    .88   .277   .715   8.2    3.9   1.4    Fine, except LD%.
Hentgen P.     TOR   6.45   .190    .92   .277   .744   3.9    4.0   1.8    Poor FIP, undeserved DER.
Gregg K.       ANA   4.21   .209   1.12   .275   .669   8.8    2.5   0.6    Superb, except LD%.
Zambrano V.    TBD   4.90   .123   1.05   .255   .740   8.1    6.6   1.1    The right P for a great OF
Gobble J.      KC    4.86   .149    .82   .254   .757   2.5    2.5   1.2    Bad FIP stats; great DER stats
Milton E.      PHI   4.32   .223    .63   .246   .707   7.3    3.9   1.7    Worst LD% on list
Wells D.       SDP   3.13   .162   1.52   .244   .761   3.1    0.8   0.8    GB pitcher with high IF/Fly
Sparks S.      ARI   5.51   .138   1.22   .243   .736   3.8    3.7   1.0    Knuckleballer, sui generis

Now here are the guys who have yielded the lowest rate of infield flies per flyball so far.

Player         Team    RA    LD%    G/F IF/Fly    DER   K/9   BB/9  HR/9    Comments
Mulder M.      OAK   3.00   .169   1.96   .043   .752   5.9    2.7   0.7    Home park helps DER
Lowe D.        BOS   6.52   .167   3.71   .042   .679   4.5    4.1   0.6    He's better than record
Sele A.        ANA   3.72   .152   1.12   .041   .717   4.5    3.4   0.8    Good set of DER stats
Affeldt J.     KC    5.75   .208   1.26   .041   .668   4.7    3.8   0.6    High LD%, low IF/Fly = low DER
Reyes D.       KC    4.78   .152   1.53   .034   .685   6.9    4.3   0.6    Seems to deserve better DER

We’ll keep tracking these stats on a regular basis. Remember, though, that FIP stats are more reliable and predictable than DER stats.

In Medias Res: Line Drives from the Batter’s Box

On the Stats page, we’ve been tracking the same stats from the batter’s point of view. In particular, we track the percent of batted balls that are line drives (LD% again), and BABIP, or Batting Average on Balls In Play (roughly the complement of DER, which measures the share of balls in play that fielders turn into outs). In general, the more line drives a batter hits, the more hits he’ll rack up. The same holds in reverse for pitchers and DER: the more line drives allowed, the lower the DER.

What I’ve found is that you can use a general formula to estimate a batter’s BABIP: .120 plus LD%. But what’s interesting is that there is a good-sized difference between leagues. Here are the relevant stats for the National and American Leagues:

       LD%   BABIP   Diff
AL    .177    .302   .125
NL    .182    .285   .103

There’s a bigger difference between LD% and BABIP in the American League than in the National, about 22 points’ worth. This is kind of strange, particularly because the AL is more of a flyball league (GB/FB ratio of 1.21 vs. 1.28 in the NL), and flyballs are more likely to be caught than groundballs.
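For anyone who wants to check the arithmetic, here is a minimal Python sketch of the rule of thumb applied to the league table. The league figures and the .120 constant come straight from the text; the function and variable names are my own, made up for illustration.

```python
# League-level figures copied from the table above.
LEAGUES = {
    "AL": {"ld_pct": 0.177, "babip": 0.302},
    "NL": {"ld_pct": 0.182, "babip": 0.285},
}

BASELINE = 0.120  # the rule-of-thumb constant from the text


def estimate_babip(ld_pct, baseline=BASELINE):
    """Rule-of-thumb BABIP estimate: a fixed baseline plus line-drive rate."""
    return baseline + ld_pct


def league_gaps(leagues):
    """Actual BABIP minus LD% for each league (the 'Diff' column)."""
    return {name: row["babip"] - row["ld_pct"] for name, row in leagues.items()}


gaps = league_gaps(LEAGUES)
for name, row in LEAGUES.items():
    est = estimate_babip(row["ld_pct"])
    print(f"{name}: estimated {est:.3f}, actual {row['babip']:.3f}, diff {gaps[name]:.3f}")

# How much larger the AL gap is than the NL gap.
print(f"AL diff minus NL diff: {gaps['AL'] - gaps['NL']:.3f}")
```

With these inputs the rule of thumb lands about five points under the actual AL figure and about seventeen points over the NL figure, which is the league split discussed above.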

I can think of a couple of potential reasons for this:

  • Ballpark influence. As two examples, Oakland Coliseum, with its wide foul areas, hurts BABIP. Coors Field, on the other hand, is a place where batted balls go to become hits.
  • There are better fielders in the National League. I have no idea if this is true; I just thought I’d say it to get you AL fans riled up.

So which teams aren’t seeing their line drives fall in for hits? Here’s a graph of the National League teams, showing how the difference between LD% and BABIP rises as the team hits more groundballs:


We knew, a priori, that the Rockies would be above the line due to Coors Field. But look at those poor Phillies. Is there something about that new park that drags down BABIP? Wide foul areas? Slow grass? Larry Bowa? Joe Dimino has already observed this phenomenon, and we should keep watching it during the year.

Now for the American League teams:


Carpe diem! A graphical non sequitur: in general, teams’ BABIP difference actually decreases with the GB/FB ratio of the team. How did this happen? Well, it could be park effects; you’ll notice the impact of Oakland Coliseum. But I really don’t know. I’m speechless.


The ERA Dictum

As we continue to explore the potential of FIP and DER stats, we continue to discuss ERA. Earned runs are neither a FIP stat nor a DER stat. They’re just there, because they’ve always been there. What if baseball stats were a tabula rasa, in which no stats or rules had yet been devised, yet we understood the underlying FIP and DER, our yin and yang, of baseball? Would we create this thing called Earned Run Average?

ERA creates its own distortions and inconsistencies. For instance, a lot more errors occur on groundballs than flyballs, so groundball pitchers tend to give up more unearned runs. Here’s a graph of each pitcher’s GB/FB ratio and the proportion of his allowed runs that are unearned (minimum of 50 innings pitched).


As you can see, the proportion of unearned runs increases as groundballs increase. So a flyball pitcher gives up fewer unearned runs, which means that his ERA will generally be higher than a groundball pitcher’s, given the same RA. Is this fair or right? I don’t know, but it’s worth chewing on.
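To make the distortion concrete, here is a small Python sketch. It relies only on the definitions: ERA counts earned runs per nine innings and RA counts all runs per nine innings, so ERA equals RA times the earned share of runs. The two pitchers and their unearned-run shares are hypothetical numbers picked for illustration, not figures from the graph.

```python
def era_from_ra(ra, unearned_share):
    """ERA implied by a run average when a given share of runs is unearned.

    Because ERA = 9 * ER / IP and RA = 9 * R / IP, it follows that
    ERA = RA * (1 - unearned runs / total runs).
    """
    if not 0.0 <= unearned_share <= 1.0:
        raise ValueError("unearned_share must be between 0 and 1")
    return ra * (1.0 - unearned_share)


# Hypothetical pitchers who allow runs at exactly the same rate.
SAME_RA = 4.50
flyball_era = era_from_ra(SAME_RA, unearned_share=0.05)     # assumed share
groundball_era = era_from_ra(SAME_RA, unearned_share=0.12)  # assumed share

print(f"Flyball pitcher ERA:    {flyball_era:.2f}")
print(f"Groundball pitcher ERA: {groundball_era:.2f}")
```

Same runs allowed, but the groundball pitcher's ERA comes out roughly three tenths of a run lower, purely because more of his runs were scored as unearned.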

That labeled outlier is Atlanta’s Horacio Ramirez, who has an ERA of 2.28 but has allowed 3.38 runs per game. Now, that’s still pretty good. But his FIP is an incredible 5.01, thanks partly to a DER of .771. So what’s his secret?

As a reader pointed out to me a while ago, Ramirez has pitched very well in the clutch so far this year. His OPS against has decreased from .751 with no one on, to .567 with runners on, to .515 with runners in scoring position. This is one of the odder pitching records in baseball, and I’d say it’s not likely to last. But I just looked at that AL BABIP graph again, so who knows?

I could go on ad nauseam, you know, but that’s enough for today. Fortunately for you, I have no idea how to say goodbye in Latin.

References & Resources
Bill James and Michael Humphreys have previously discussed the tendency for groundball pitchers to give up more unearned runs. Michael Wolverton at Baseball Prospectus has already issued a call for the end of Earned Run Average. I’m just joining in. And here’s a link to a bunch of Latin words.

Dave Studeman was called a "national treasure" by Rob Neyer. Seriously. Follow his sporadic tweets @dastudes.
