Closer
Two months ago, we talked about something called Win Probability, and we followed up with a couple of Win Probability Added (WPA) bullpen articles, Team Bullpens and Ranking the Relievers of 2004. WPA and bullpens are a match made in baseball statistics heaven, because WPA can tell you much more about relievers than the current bag of statistics can.
There’s a related statistic that I didn’t introduce at the time, though you may be familiar with it. It was developed by Doug Drinen in the Big Bad Baseball Annuals of the late 1990s, and it’s called “P”. Right, just one letter. P. A whole lot easier to remember than the standard three-letter acronym, don’t you think? Before we move on, feel free to insert your own juvenile P joke here (“spell ‘pig’ backwards and then say ‘funny’”). Okay. Back to the article.
P is a measure of how important a situation is, based on its potential impact on Win Probability. The higher the P, the more critical the situation. Bases loaded, bottom of the ninth, visiting team up by one run? Very high P. If you’re the visiting team, you want your best pitcher on the mound. Conceptually, P is very similar to Tangotiger’s Leverage Index, but the math is different. To calculate P, you simply take the difference between the current Win Probability and what the Win Probability will be if the pitcher retires the side with no more runs scoring.
For example, a home team’s Win Probability, with the bases loaded and none out in the bottom of the ninth of a tie game, is .936 (by my calculations). If the visiting team pitcher miraculously retires the next three batters without allowing a run, the Win Probability decreases to .500. So the P is .436 (.936 – .500). As you can imagine, that’s a very high P.
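If you'd like to see that arithmetic spelled out, here is a minimal sketch in Python. The function name `p_value` is my own label for illustration, not something from Drinen's work, and the sketch assumes both Win Probabilities are expressed from the batting team's point of view, as in the example above.

```python
def p_value(wp_now: float, wp_if_side_retired: float) -> float:
    """P = current Win Probability minus the Win Probability if the
    pitcher retires the side with no more runs scoring.

    Assumption for this sketch: both probabilities are measured from
    the batting team's point of view.
    """
    return wp_now - wp_if_side_retired


# Bases loaded, none out, bottom of the ninth, tie game (figures from the text).
p = p_value(0.936, 0.500)
print(f"P = {p:.3f}")  # prints "P = 0.436"
```

The Win Probability inputs themselves have to come from a Win Probability table; this sketch only covers the final subtraction.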
When a manager brings a new reliever into the game, you can learn a lot by calculating the P of that situation. Particularly, who does the manager turn to in high-P situations? Does he turn to his best reliever or the best “matchup?” Does he waste good relievers in low-P situations? Does he even understand how critical the situation is?
I’ve been able to calculate the P of every relief appearance from 2002 through 2004, so we can start to answer these questions. Here is a list of the relievers who were brought into a game most often when the P was 0.20 or higher, along with the WPA that resulted from those appearances and the number of saves or holds the pitcher subsequently received:
NAME              TEAM         App.     WPA  Saves  Holds  Avg. P
Marte, Damaso     CWS            18   2.531      5      4   0.291
Myers, Mike       ARI/BOS/SEA    17  -0.344      1      7   0.303
Ryan, B.J.        BAL            16   0.476      0      4   0.294
Rincon, Ricardo   CLE/OAK        15  -0.602      0      7   0.272
Romero, J.C.      MIN            14   0.908      0      7   0.283
Grimsley, Jason   BAL/KC         14  -1.055      0      1   0.271
Cormier, Rheal    PHI            14  -0.081      0      4   0.308
Bradford, Chad    OAK            13   0.849      0      6   0.247
Quantrill, Paul   LAD/NYY        12   1.613      0      7   0.248
Rhodes, Arthur    OAK/SEA        12  -1.200      0      5   0.247
Stanton, Mike     NYM/NYY        12   0.180      1      3   0.264
Groom, Buddy      BAL            12   1.898      2      2   0.345
The White Sox’s Marte was brought into more difficult situations than any other reliever during this time period, and he did his job extremely well (WPA of 2.531) — particularly in 2002 and 2003. Buddy Groom had the highest average P Value among all pitchers on this list, and he also performed very well (WPA of 1.898).
Overall, this list represents 169 appearances, 9 saves and 57 holds, which tells you something about the relative importance of saves and holds. And the list of pitchers is very revealing, because it primarily consists of middle relief and situational pitchers, including a couple of LOOGYs (Lefthanded One Out GuYs), such as Mike Myers and Buddy Groom. There are no pure closers here.
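Those totals are easy to check against the table above. Here is a small Python sketch that does so; the numbers are copied straight from the table, while the tuple layout and variable names are simply my choices for the illustration.

```python
# (pitcher, appearances, saves, holds), copied from the high-P table above.
rows = [
    ("Marte", 18, 5, 4),
    ("Myers", 17, 1, 7),
    ("Ryan", 16, 0, 4),
    ("Rincon", 15, 0, 7),
    ("Romero", 14, 0, 7),
    ("Grimsley", 14, 0, 1),
    ("Cormier", 14, 0, 4),
    ("Bradford", 13, 0, 6),
    ("Quantrill", 12, 0, 7),
    ("Rhodes", 12, 0, 5),
    ("Stanton", 12, 1, 3),
    ("Groom", 12, 2, 2),
]

appearances = sum(r[1] for r in rows)
saves = sum(r[2] for r in rows)
holds = sum(r[3] for r in rows)

print(appearances, saves, holds)  # prints "169 9 57"
# Share of these high-P appearances that ended in a save or a hold.
print(f"{saves / appearances:.1%} saves, {holds / appearances:.1%} holds")
```

In other words, roughly one in twenty of these high-P appearances produced a save, while about one in three produced a hold.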
Now, remember that P is based on the situation when the pitcher first enters the game. Therefore, P is going to be highest when a pitcher is brought into a game with men already on base. Pure closers, such as Gagne, Smoltz and Rivera, are typically used in what Bill James called the “Robb Nen” pattern: beginning of the ninth inning, none on, none out. So, by definition, today’s relief aces — the guys with the most saves — are not likely to be brought into high-P situations.
I’d like to talk about closers more specifically. First, here is a list of all pitchers with at least twenty saves last year, ranked by average P in 2004:
Name                  Team      App     WPA  Saves  P Value
Hoffman, Trevor       SDP        55   2.619     41    0.101
Cordero, Francisco    TEX        67   3.477     49    0.092
Kolb, Dan             MIL        64   1.288     39    0.091
Nathan, Joe           MIN        73   5.390     44    0.090
Herges, Matt          SFG        70  -2.262     23    0.089
Gagne, Eric           LAD        70   5.977     45    0.088
Benitez, Armando      FLO        64   4.644     47    0.086
Rivera, Mariano       NYY        74   4.916     53    0.086
Percival, Troy        ANA        52   0.743     33    0.084
Lidge, Brad           HOU/OAK    80   7.292     29    0.084
Looper, Braden        NYM        71   2.465     29    0.082
Graves, Danny         CIN        68  -0.278     41    0.081
Dotel, Octavio        HOU/OAK    77   1.077     36    0.081
Smoltz, John          ATL        73   5.240     44    0.080
Mesa, Jose            PIT        70   1.964     43    0.080
Hawkins, LaTroy       CHC        77   1.723     25    0.076
Wagner, Billy         PHI        45   2.316     21    0.076
Isringhausen, Jason   STL        74   3.738     47    0.074
Julio, Jorge          BAL        65   0.926     22    0.069
Urbina, Ugueth        DET        54   0.250     21    0.069
Chacon, Shawn         COL        66  -3.664     35    0.069
Foulke, Keith         BOS        72   3.047     32    0.067
Baez, Danys           TBD        62   1.160     30    0.060
For comparison, the average P for all reliever appearances in 2004 was 0.059, and it has remained fairly stable over the past three years. Trevor Hoffman had the highest P among all closers last year, primarily because almost half of his appearances occurred during a one- or two-run lead. Conversely, only 20% of Danys Baez’s appearances occurred with one- or two-run leads, and his overall P was about average, despite his thirty saves.
If you only want to use your closer at the start of the ninth inning, the score differential (a one-run vs. a two-run lead, for example) is obviously key to getting the most value from him. Let’s use P to identify the most important score differentials. Here is a graph of the P value at the beginning of the ninth inning, by score differential:

As you can see, tie games and one-run leads are by far the most important in the ninth inning (and extra innings, too). This makes a lot of sense. If you pitch a scoreless ninth with a one-run lead, you’ve just finished the game for your team. And if you do so in a tie game, you’ve given your team a chance to win it all in the bottom of the inning.
So do teams deploy their closers most often in the most critical situations? Well, let’s add a line to the graph that shows the percent of the time each team’s “closer” was used in each situation (closer being defined as staff leader in saves). Ideally, the line should follow the same outline as the bars in the graph — let’s see if it does:

It doesn’t. A closer is two-and-a-half times more likely to be brought into the ninth with a three-run lead (75% of the time) than with the score tied (30% of the time). Excuse my bold formatting, but this makes no sense at all!
Three-run leads are gimme situations; fans are heading to the exits. On the other hand, tie games in the ninth are the epitome of crucial situations. Yet most managers would rather use their closer with a three-run lead. What gives?
As Steve Treder documented so well last year, the save statistic has warped the way we think of critical situations. Instead of using closers in the most critical situations, managers use them to maximize saves rather than team wins. But a save is only a statistic; it’s not a strategic metric. WPA is both.
There is a lot to question about relievers in this day and age. Do closers really only have to pitch at the beginning of the ninth inning? Should managers really pitch to matchups as often as they do? Is the overuse of relievers leading to more, rather than fewer, injuries?
But I have one simple question: if managers really want to hold their closers back until the ninth inning, why aren’t they at least consistently using them in the most important situations?
References & Resources
In a couple of days, the Hardball Times Bullpen Book will be on sale. This book will include over 80 pages of WPA statistics on all relievers between the years 2002 and 2004, including P and WPA for each reliever each year. Watch for it!
My deepest thanks go to Doug Drinen, for giving me the insight and permission to use his creation.
Here’s a list of teams that used their closers in tie games in the ninth at least 50% of the time:
– Braves/Smoltz: 7 of 9 opportunities
– Red Sox/Foulke: 3 of 5
– Tigers/UUU: 9 of 15
– Mets/Looper: 8 of 14
– Yankees/Rivera: 2 of 4
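To put those counts on the same percentage scale as the rest of the article, here is a quick Python sketch. The counts come from the list above; the dictionary layout and names are just my choices for the illustration.

```python
# Team/closer: (times used in a ninth-inning tie, opportunities), from the list above.
tie_game_usage = {
    "Braves/Smoltz": (7, 9),
    "Red Sox/Foulke": (3, 5),
    "Tigers/UUU": (9, 15),
    "Mets/Looper": (8, 14),
    "Yankees/Rivera": (2, 4),
}

# Usage rate = times used / opportunities.
rates = {team: used / chances for team, (used, chances) in tie_game_usage.items()}

for team, rate in rates.items():
    print(f"{team}: {rate:.0%}")
```

Even these teams topped out at about 78% (the Braves with Smoltz), and the samples are small.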
By the way, I do believe that Tangotiger’s “Leverage Index” is a better metric than P. However, Tango has not yet released Leverage Indices for the past three years, and so I present P as a public service (and a way to address a pet issue of mine).
By the way, Tango pointed out to me that this subject was well covered by Baseball Prospectus nearly five years ago. I’m not sure why the baseball world hasn’t made more progress since then, but maybe my graphs can help a little bit.