Ranking the Relievers

by Dave Studeman
January 19, 2005

Last week, we introduced a stat called Win Probability Added. Actually, we didn’t really introduce it at all. As explained earlier, it’s been introduced many times; we’re just applying it to the relief pitchers of 2004.

After my last article, one reader complained that WPA didn’t really add anything. Basically, his argument was “Why not stick to simple performance stats, like K/BB ratios and FIP, to evaluate relievers?” Okay, he didn’t really mention FIP, but I like to think he would have. The sentiment would be the same: Why bother with new stats for relievers? Why not just go with good performance stats? What we know is good enough, right?

Well, no, not really. Basic performance stats like FIP are great, but saves are also used to evaluate relievers. You may not like saves, but you really can’t overestimate the impact they’ve had on baseball fans. Think about it. Without the save, would Rollie Fingers be in the Hall of Fame? (Hint: he has only 103 RSAA, which is tied for 213th in major league history. One fewer than Jon Matlack.) Would Bruce Sutter have garnered 60% of the Hall of Fame vote in the last election? (123 RSAA, seven fewer than Milt Pappas.) I think the answer is pretty clearly no. Saves have played a huge role in how we think of relief pitchers.

Saves add something to the bag of pitcher statistics. They attempt to tell you how important a pitcher’s innings were. If relievers pitch well in close games, they’ve done more for the team than if they’ve pitched well in blowouts. Baseball fans tend to rely on saves for this type of extra information.

And this is a problem. As a statistic for measuring pitchers, saves are more useful than tarot cards, but far from ideal. Here is the definition of a save, according to Major League Baseball’s official definitions page:

A pitcher is credited with a save when he finishes a game won by his club, is not the winning pitcher, and either (a) enters the game with a lead of no more than three runs and pitches for at least one inning, (b) enters the game with the potential tying run either on base, or at bat, or on deck, or (c) pitches effectively for at least three innings.

That’s not a bad definition on the surface, but it stinks when you actually use it. For instance, it treats a three-run lead in the ninth with none out (Win Probability of 96%) the same as a one-run lead (81%). It’s amazing to think that a pitcher can get a save when he enters a game with a Win Probability of 96%.

If your team loses, you don’t get a save, no matter how well you pitched or how important your innings were. If you don’t finish the game, you don’t get a save.
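To make those complaints concrete, here is a minimal Python sketch of the save rule as quoted above. The `Outing` class and its field names are my own illustration, not anything from MLB, and the "tying run on base, at bat, or on deck" test is my reading of clause (b): the lead is no bigger than the number of baserunners plus two.

```python
from dataclasses import dataclass


@dataclass
class Outing:
    """One relief appearance. All field names are illustrative assumptions."""
    lead_on_entry: int       # runs ahead on entry (0 = tied, negative = behind)
    runners_on_entry: int    # baserunners on entry (0 to 3)
    outs_recorded: int       # 3 outs = one inning
    finished_game: bool
    team_won: bool
    winning_pitcher: bool = False
    pitched_effectively: bool = True  # scorer's judgment, used by clause (c)


def earns_save(o: Outing) -> bool:
    """Apply the quoted save definition to a single outing."""
    # He must finish a game his club won and not be the winning pitcher.
    if not (o.finished_game and o.team_won) or o.winning_pitcher:
        return False
    has_lead = o.lead_on_entry >= 1
    # (a) lead of no more than three runs, and at least one inning pitched
    clause_a = has_lead and o.lead_on_entry <= 3 and o.outs_recorded >= 3
    # (b) potential tying run on base, at bat, or on deck
    clause_b = has_lead and o.lead_on_entry <= o.runners_on_entry + 2
    # (c) at least three effective innings
    clause_c = o.outs_recorded >= 9 and o.pitched_effectively
    return clause_a or clause_b or clause_c


# Ninth inning, bases empty, one inning pitched in each case.
three_run_lead = Outing(lead_on_entry=3, runners_on_entry=0, outs_recorded=3,
                        finished_game=True, team_won=True)
one_run_lead = Outing(lead_on_entry=1, runners_on_entry=0, outs_recorded=3,
                      finished_game=True, team_won=True)
team_lost = Outing(lead_on_entry=1, runners_on_entry=0, outs_recorded=3,
                   finished_game=True, team_won=False)
did_not_finish = Outing(lead_on_entry=1, runners_on_entry=0, outs_recorded=3,
                        finished_game=False, team_won=True)

for name, outing in [("three-run lead", three_run_lead),
                     ("one-run lead", one_run_lead),
                     ("team lost", team_lost),
                     ("did not finish", did_not_finish)]:
    print(f"{name}: save = {earns_save(outing)}")
```

The three-run and one-run leads come out identical, and the last two outings earn nothing no matter how well the pitcher threw.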

Major League Baseball invented the hold to address some of these issues. Here’s the definition of a hold:

A relief pitcher is credited with a hold any time he enters a game in a save situation, records at least one out and leaves the game never having relinquished the lead. A pitcher cannot finish the game and receive credit for a hold, nor can he earn a hold and a save in the same game.

The hold addresses the “finish game” issue, though it still treats a three-run lead the same as a one-run lead.

And the most damning thing about both of these stats is that a pitcher who enters a tie game gets neither a Save nor a Hold. When a game is tied, everything is in doubt. Tie games are the epitome of “critical situations.” And yet, the relief pitcher who does his job well in a tie game is credited with neither stat (he does get credit for a win if the offense scores and retains the lead until the end of the game).
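The tie-game gap is easy to see in a small Python sketch of the hold rule. The quoted definition doesn't spell out what a "save situation" is, so `in_save_situation` below is my assumption: the pitcher enters with a lead that either is no more than three runs or puts the tying run on base, at bat, or on deck. The function and parameter names are made up for illustration.

```python
def in_save_situation(lead: int, runners: int) -> bool:
    """Assumed reading of 'save situation', judged at the moment of entry."""
    if lead < 1:
        # Tied or trailing: there is no lead to protect.
        return False
    return lead <= 3 or lead <= runners + 2


def earns_hold(lead: int, runners: int, outs_recorded: int,
               kept_lead: bool, finished_game: bool) -> bool:
    """Apply the quoted hold definition to one appearance."""
    return (in_save_situation(lead, runners)
            and outs_recorded >= 1
            and kept_lead
            and not finished_game)


# A setup man protecting a three-run lead and one protecting a one-run lead.
print(earns_hold(lead=3, runners=0, outs_recorded=3,
                 kept_lead=True, finished_game=False))
print(earns_hold(lead=1, runners=0, outs_recorded=3,
                 kept_lead=True, finished_game=False))

# Two scoreless innings in a tie game: no lead, so no save situation and no hold.
print(in_save_situation(lead=0, runners=0))
print(earns_hold(lead=0, runners=0, outs_recorded=6,
                 kept_lead=False, finished_game=False))
```

Both leads earn the same hold, and the tie-game outing earns nothing, which is exactly the complaint.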

Win Probability Added has none of these issues. As I’ve calculated it, WPA is the difference between the Win Probability when a reliever enters a game and the Win Probability when he leaves it, assuming his team’s offense scored at an average rate while he was in the game. It’s a measure of both how well he pitched and how critical his innings were. Tie games matter a lot. And you don’t have to finish a game to receive WPA points.

The best relief pitcher in the major leagues last year, as measured by WPA, had a lot of Saves and Holds. I’m talking about the phenomenon in Houston, Brad Lidge, who contributed almost one-and-a-half wins more than the second-best major league reliever, Eric Gagne. Gagne led the majors in WPA in both 2002 and 2003.

Here’s a list of all relief pitchers with a WPA of one or more, along with their innings pitched, saves and holds:

Pitcher                  Team         WPA      IP   Saves   Holds
Lidge, Brad              HOU         7.29    94.7      29      16
Gagne, Eric              LAD         5.98    82.3      45       0
Gordon, Tom              NYY         5.47    89.7       4      36
Nathan, Joe              MIN         5.39    72.3      44       0
Smoltz, John             ATL         5.24    81.7      44       0
Rivera, Mariano          NYY         4.92    78.7      53       0
Benitez, Armando         FLO         4.64    69.7      47       0
Cordero, Chad            MON         4.31    82.7      14       8
Shields, Scot            ANA         3.99   105.3       4      17
Otsuka, Akinori          SDP         3.88    77.3       2      34
Rodriguez, Francisco     ANA         3.86    84.0      12      27
Ryan, B.J.               BAL         3.76    87.0       3      21
Isringhausen, Jason      STL         3.74    75.3      47       0
Cordero, Francisco       TEX         3.48    71.7      49       0
Linebrink, Scott         SDP         3.13    84.0       0      28
Jones, Todd              CIN/PHI     3.07    82.3       2      26
Foulke, Keith            BOS         3.05    83.0      32       0
Vizcaino, Luis           MIL         3.02    72.0       1      21
Hoffman, Trevor          SDP         2.62    54.7      41       0
Takatsu, Shingo          CHW         2.56    62.3      19       4
Looper, Braden           NYM         2.46    83.3      29       0
Torres, Salomon          PIT         2.37    92.0       0      30
Wagner, Billy            PHI         2.32    48.3      21       1
Rincon, Juan             MIN         2.10    82.0       2      16
Madson, Ryan             PHI         2.08    76.3       1       7
Mota, Guillermo          LAD/FLO     2.00    96.7       4      29
Mesa, Jose               PIT         1.96    69.3      43       0
Hawkins, LaTroy          CHC         1.72    82.0      25       4
Lopez, Rodrigo           BAL         1.72    31.7       0       3
Romero, J.C.             MIN         1.59    74.3       1      16
Almanzar, Carlos         TEX         1.58    72.7       0      19
Timlin, Mike             BOS         1.52    76.3       1      20
King, Ray                STL         1.51    62.0       0      31
Worrell, Tim             PHI         1.43    78.3      19      20
Calero, Kiko             STL         1.38    45.3       2      12
Frasor, Jason            TOR         1.36    68.3      17       8
Rodriguez, Felix         PHI/SFG     1.30    65.7       1      20
Carrara, Giovanni        LAD/FLO     1.29    53.7       2       6
Duchscherer, Justin      OAK         1.29    96.3       0       6
Kolb, Dan                MIL         1.29    57.3      39       1
Miller, Matt             CLE         1.26    55.3       1       7
Fuentes, Brian           COL         1.24    44.7       0      13
Harper, Travis           TBD         1.24    78.7       0       9
Baez, Danys              TBD         1.16    68.0      30       1
Williamson, Scott        BOS         1.13    28.7       1       3
Guardado, Eddie          SEA         1.09    45.3      18       0
Dotel, Octavio           HOU/OAK     1.08    85.3      36       0
Telemaco, Amaury         PHI         1.03    54.3       0       5

Four of the top ten WPA leaders had more holds than saves. Two of the top ten, Chad Cordero and Scot Shields, didn’t have many of either. In fact, both Shields and KRod did more to help the Angels win than Troy Percival did. Ditto Tom Gordon vs. Mariano Rivera on the Yankees. Ryan and Lopez did more than Julio for Baltimore. Torres and Lopez vs. Mesa for the Pirates. Etc. Etc.

You may have noticed that Cincinnati’s Danny Graves, who tied for sixth-most saves in the National League with 41, isn’t on this list. Graves actually finished 305th in total WPA among major league relievers last year, with a WPA of -0.278. In fact, seven relievers contributed more to the Reds’ wins than Danny Graves!

I’m not saying that saves are a bad or meaningless stat. For about half the teams in the majors, the best reliever was also their saves leader. But WPA is better because it measures both performance and crucial innings accurately. As an example, let me point out a few things about Lidge’s record:

Lidge pitched in 80 games last year. He reduced his team’s probability of winning in only nine of those games.
His best WPA outing came on June 30 against the Cubs at Wrigley, when he entered a tie game in the bottom of the eighth with runners on first and second and none out (Astros’ chances of winning = 23%). He retired the Cubs without a run scoring, the Astros scored a run in the ninth, and he closed out the bottom of the ninth for the win. WPA of 0.519.
His third- and fourth-best outings came in extra innings of tie games, when he pitched two scoreless innings and left with the game still tied. By the WPA system, he received .440 WPA points in each of those games, but he didn’t receive a save, hold or win because the Astros didn’t lead when he was in the game.

There are stories like these behind every WPA score, but I don’t have the space to tell them all. Suffice it to say that for all the sophistication we bring to baseball statistics in some areas, we are lagging when it comes to relief pitching.

It’s time to reevaluate the way we evaluate relievers. You have to have play-by-play data to calculate WPA, but the good news is that we have play-by-play data for most of the years the relief specialist has been in vogue. We should use it and refer to it consistently. Tangotiger has given us a good start.

Rollie Fingers made the Hall of Fame due to his Saves totals, and Dennis Eckersley did too (to an extent). Bruce Sutter may well do the same thing. Relievers are being voted into the Hall nearly every year. We need to re-think the standards we apply to relief pitchers before it’s too late.

References & Resources
Here’s how I computed Bullpen WPA:

I calculated the Win Probability of the game at every point at which a reliever entered a game. This was based on the score, inning, number of outs and base situation.

I then calculated the Win Probability of the point at which the reliever left the game, based on the same factors AS WELL AS the number of runs an average team would have scored during his time in the game.

The difference between the two Win Probabilities equals Win Probability Added for that appearance. I then added up the Win Probability Added of all of a team’s relievers to get the team totals.

So in the Brad Lidge example against the Cubs, Lidge doesn’t earn extra points because the Astros happened to score in the ninth. WPA assumes there was an average offense playing behind him.
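The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, not the code I actually used: it takes the entry and exit Win Probabilities as given (the lookup by score, inning, outs and base situation is not shown), and the class and function names are made up. The first appearance uses Lidge's June 30 numbers; the 0.749 exit value is simply what a 23% entry and a 0.519 WPA imply. The other two appearances are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ReliefAppearance:
    pitcher: str
    wp_enter: float  # Win Probability when the reliever entered
    wp_exit: float   # Win Probability when he left, assuming an average offense


def appearance_wpa(app: ReliefAppearance) -> float:
    """WPA for one appearance: exit Win Probability minus entry Win Probability."""
    return app.wp_exit - app.wp_enter


def total_wpa(appearances: list[ReliefAppearance]) -> dict[str, float]:
    """Sum appearance-level WPA for each pitcher."""
    totals: dict[str, float] = defaultdict(float)
    for app in appearances:
        totals[app.pitcher] += appearance_wpa(app)
    return dict(totals)


appearances = [
    # Lidge, June 30: 23% on entry and WPA of 0.519, so exit is implied as 0.749.
    ReliefAppearance("Lidge", wp_enter=0.23, wp_exit=0.749),
    # Hypothetical: a clean ninth protecting a one-run lead (81% on entry).
    ReliefAppearance("Reliever A", wp_enter=0.81, wp_exit=1.00),
    # Hypothetical: the same reliever coughs up a lead the next night.
    ReliefAppearance("Reliever A", wp_enter=0.81, wp_exit=0.30),
]

for pitcher, wpa in total_wpa(appearances).items():
    print(f"{pitcher}: {wpa:+.3f}")
```

Team totals work the same way: key the sum on the team instead of the pitcher.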

For average runs the team would have scored, I used 0.5 runs per inning, or 4.5 runs a game. This overall average probably negatively impacts the Bullpen WPA of all American League teams, as well as teams in hitter’s parks. It positively impacts teams in the National League, and teams in pitcher’s parks.
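As a rough Python illustration of that flat rate: the sketch below assumes the expected runs scale linearly with the number of innings the reliever's team batted while he was in the game, which is my simplification here. The function name and the 0.55 alternative rate are hypothetical, the latter included only to show how a higher-scoring league or park could be plugged in.

```python
AVG_RUNS_PER_INNING = 0.5  # the flat rate used in this article (4.5 runs per game)


def expected_offense_runs(innings_batted: float,
                          runs_per_inning: float = AVG_RUNS_PER_INNING) -> float:
    """Runs an average offense would be expected to score over these innings."""
    return innings_batted * runs_per_inning


# A full nine innings at the flat rate gives the 4.5 runs per game quoted above.
print(expected_offense_runs(9))

# A reliever whose team bats twice while he is in the game.
print(expected_offense_runs(2))

# Hypothetical higher-scoring environment, for comparison only.
print(expected_offense_runs(2, runs_per_inning=0.55))
```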

Despite these caveats, I think this is a relatively good measure of bullpen effectiveness.

As I’ve defined it, WPA is a good tool for comparing relievers, but it isn’t a good measure of the absolute number of wins the pitcher contributed to his team. It gives all the credit for Win Probability Added to the pitcher, and not to his fielders or hitters. So be careful you don’t refer to it in an absolute sense.

I’m working on an e-book, tentatively called “The Bullpen Book of 2002-2004,” which will contain all these stats and more for major league bullpens during the past three years. I hope to have it done by early February. I’ll let you know if I do. Catchy title, huh?

My thanks, as usual, go to Doug Drinen and Tangotiger, who have done much of the pioneering work in the evaluation of relief pitching, and have graciously shared their expertise freely.