Two months ago, we talked about something called Win Probability, and we followed up with a couple of Win Probability Added (WPA) bullpen articles, Team Bullpens and Ranking the Relievers of 2004. WPA and bullpens are a match made in baseball statistics heaven, because WPA can tell you much more about relievers than the current bag of statistics can.

There’s a related statistic that I didn’t introduce at the time, though you may be familiar with it. It was developed by Doug Drinen in the Big Bad Baseball Annuals of the late 1990s, and it’s called “P”. Right, just one letter. P. A whole lot easier to remember than the standard three-letter acronym, don’t you think? Before we move on, feel free to insert your own juvenile P joke here (“spell ‘pig’ backwards and then say ‘funny’”). Okay. Back to the article.

P is the measure of how important a situation is, based on its potential impact on Win Probability. The higher the P, the more critical the situation. Bases loaded, bottom of the ninth, visiting team up by one run? Very high P. If you’re the visiting team, you want your best pitcher on the mound. Conceptually, P is very similar to Tangotiger’s Leverage Index, but the math is different. To calculate P, you simply take the difference between the current Win Probability and what the Win Probability will be if the pitcher retires the side with no more runs scoring.

For example, a home team’s Win Probability, with the bases loaded and none out in the bottom of the ninth of a tie game, is .936 (by my calculations). If the visiting team pitcher miraculously retires the next three batters without allowing a run, the Win Probability decreases to .500. So the P is .436 (.936 – .500). As you can imagine, that’s a very high P.
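The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the code used to produce the numbers in this article: the function name `p_value` and its parameter names are made up for the example, and the Win Probability inputs are assumed to come from a table you already have. Taking the absolute value is a choice made here so that the result is the same whether the probabilities are stated from the home team's or the visiting team's point of view.

```python
def p_value(current_wp: float, wp_if_side_retired: float) -> float:
    """Return P: the size of the swing in Win Probability if the pitcher
    retires the side with no more runs scoring.

    Both arguments are hypothetical inputs and must be stated from the
    same team's point of view (here, the home team's).
    """
    return abs(current_wp - wp_if_side_retired)


# Bases loaded, none out, bottom of the ninth, tie game (figures from the text).
p = p_value(current_wp=0.936, wp_if_side_retired=0.500)
print(f"P = {p:.3f}")  # P = 0.436
```

Stated from the visiting team's side instead (.064 now, .500 if the side is retired), the same call returns the same .436.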

When a manager brings a new reliever into the game, you can learn a lot by calculating the P of that situation. Particularly, who does the manager turn to in high-P situations? Does he turn to his best reliever or the best “matchup?” Does he waste good relievers in low-P situations? Does he even understand how critical the situation is?

I’ve been able to calculate the P of every relief appearance from 2002 through 2004, so we can start to answer these questions. Here is a list of the relievers who were brought into a game most often when the P was 0.20 or higher, along with the WPA that resulted from those appearances and the number of saves or holds the pitcher subsequently received:

NAME                TEAM          App.        WPA   Saves  Holds   Avg. P
Marte, Damaso       CWS             18      2.531      5      4    0.291
Myers, Mike         ARI/BOS/SEA     17     -0.344      1      7    0.303
Ryan, B.J.          BAL             16      0.476      0      4    0.294
Rincon, Ricardo     CLE/OAK         15     -0.602      0      7    0.272
Romero, J.C.        MIN             14      0.908      0      7    0.283
Grimsley, Jason     BAL/KC          14     -1.055      0      1    0.271
Cormier, Rheal      PHI             14     -0.081      0      4    0.308
Bradford, Chad      OAK             13      0.849      0      6    0.247
Quantrill, Paul     LAD/NYY         12      1.613      0      7    0.248
Rhodes, Arthur      OAK/SEA         12     -1.200      0      5    0.247
Stanton, Mike       NYM/NYY         12      0.180      1      3    0.264
Groom, Buddy        BAL             12      1.898      2      2    0.345

The White Sox’s Marte was brought into more difficult situations than any other reliever during this time period, and he did his job extremely well (WPA of 2.531) — particularly in 2002 and 2003. Buddy Groom had the highest average P Value among all pitchers on this list, and he also performed very well (WPA of 1.898).

Overall, this list represents 169 appearances, 9 saves and 57 holds, which tells you something about the relative importance of saves and holds. And the list of pitchers is very revealing, because it primarily consists of middle relief and situational pitchers, including a couple of LOOGYs (Lefthanded One Out GuYs), such as Mike Myers and Buddy Groom. There are no pure closers here.
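If you want to check those totals yourself, here is a minimal sketch that tallies the table above. The variable and function names (`HIGH_P_RELIEVERS`, `totals`) are invented for this example; the numbers are copied straight from the table.

```python
# (name, appearances, WPA, saves, holds, average P), copied from the table above.
HIGH_P_RELIEVERS = [
    ("Marte, Damaso", 18, 2.531, 5, 4, 0.291),
    ("Myers, Mike", 17, -0.344, 1, 7, 0.303),
    ("Ryan, B.J.", 16, 0.476, 0, 4, 0.294),
    ("Rincon, Ricardo", 15, -0.602, 0, 7, 0.272),
    ("Romero, J.C.", 14, 0.908, 0, 7, 0.283),
    ("Grimsley, Jason", 14, -1.055, 0, 1, 0.271),
    ("Cormier, Rheal", 14, -0.081, 0, 4, 0.308),
    ("Bradford, Chad", 13, 0.849, 0, 6, 0.247),
    ("Quantrill, Paul", 12, 1.613, 0, 7, 0.248),
    ("Rhodes, Arthur", 12, -1.200, 0, 5, 0.247),
    ("Stanton, Mike", 12, 0.180, 1, 3, 0.264),
    ("Groom, Buddy", 12, 1.898, 2, 2, 0.345),
]


def totals(rows):
    """Sum appearances, saves and holds across the listed relievers."""
    appearances = sum(row[1] for row in rows)
    saves = sum(row[3] for row in rows)
    holds = sum(row[4] for row in rows)
    return appearances, saves, holds


appearances, saves, holds = totals(HIGH_P_RELIEVERS)
print(appearances, saves, holds)  # 169 9 57
```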

Now, remember that P is based on the situation when the pitcher first enters the game. Therefore, P is going to be highest when a pitcher is brought into a game with men already on base. Pure closers, such as Gagne, Smoltz and Rivera, are typically used in what Bill James called the “Robb Nen” pattern: beginning of the ninth inning, none on, none out. So, by definition, today’s relief aces — the guys with the most saves — are not likely to be brought into high-P situations.

I’d like to talk about closers more specifically. First, here is a list of all pitchers with at least twenty saves last year, ranked by average P in 2004:

Name                    Team            App       WPA    Saves   P Value
Hoffman, Trevor         SDP              55     2.619       41     0.101
Cordero, Francisco      TEX              67     3.477       49     0.092
Kolb, Dan               MIL              64     1.288       39     0.091
Nathan, Joe             MIN              73     5.390       44     0.090
Herges, Matt            SFG              70    -2.262       23     0.089
Gagne, Eric             LAD              70     5.977       45     0.088
Benitez, Armando        FLO              64     4.644       47     0.086
Rivera, Mariano         NYY              74     4.916       53     0.086
Percival, Troy          ANA              52     0.743       33     0.084
Lidge, Brad             HOU              80     7.292       29     0.084
Looper, Braden          NYM              71     2.465       29     0.082
Graves, Danny           CIN              68    -0.278       41     0.081
Dotel, Octavio          HOU/OAK          77     1.077       36     0.081
Smoltz, John            ATL              73     5.240       44     0.080
Mesa, Jose              PIT              70     1.964       43     0.080
Hawkins, LaTroy         CHC              77     1.723       25     0.076
Wagner, Billy           PHI              45     2.316       21     0.076
Isringhausen, Jason     STL              74     3.738       47     0.074
Julio, Jorge            BAL              65     0.926       22     0.069
Urbina, Ugueth          DET              54     0.250       21     0.069
Chacon, Shawn           COL              66    -3.664       35     0.069
Foulke, Keith           BOS              72     3.047       32     0.067
Baez, Danys             TBD              62     1.160       30     0.060

For comparison, the average P for all reliever appearances in 2004 was 0.059, and it has remained fairly stable over the past three years. Trevor Hoffman had the highest P among all closers last year, primarily because almost half of his appearances occurred during a one- or two-run lead. Conversely, only 20% of Danys Baez’s appearances occurred with one- or two-run leads, and his overall P was about average, despite his thirty saves.
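One way to put those two closers on the same scale is to divide each pitcher's average P by the league average. The sketch below does just that; `relative_p` and `LEAGUE_AVG_P` are names made up for this example, and the ratio is simply a convenience for comparison, not a statistic defined by Doug Drinen.

```python
LEAGUE_AVG_P = 0.059  # average P for all 2004 relief appearances (from the text)


def relative_p(avg_p: float, league_avg: float = LEAGUE_AVG_P) -> float:
    """Express a pitcher's average P as a multiple of the league average."""
    return avg_p / league_avg


# Highest and lowest average P among the 2004 closers in the table above.
for name, avg_p in [("Hoffman, Trevor", 0.101), ("Baez, Danys", 0.060)]:
    print(f"{name}: {relative_p(avg_p):.2f}x the average relief appearance")
# Hoffman, Trevor: 1.71x the average relief appearance
# Baez, Danys: 1.02x the average relief appearance
```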

If you only want to use your closer at the start of the ninth inning, the score differential (one vs. two run lead, for example) is obviously key to getting the most value from him. Let’s use P to identify the most important score differentials. Here is a graph of the P value at the beginning of the ninth inning, by score differential:


As you can see, tie games and one-run leads are by far the most important in the ninth inning (and extra innings, too). This makes a lot of sense. If you pitch a scoreless ninth with a one-run lead, you’ve just finished the game for your team. And if you do so in a tie game, you’ve given your team a chance to win it all in the bottom of the inning.

So do teams deploy their closers most often in the most critical situations? Well, let’s add a line to the graph that shows the percent of the time each team’s “closer” was used in each situation (closer being defined as staff leader in saves). Ideally, the line should follow the same outline as the bars in the graph — let’s see if it does:


It doesn’t. A closer is two-and-a-half times more likely to be brought into the ninth with a three-run lead (75% of the time) than with the score tied (30% of the time). Excuse my bold formatting, but this makes no sense at all!

Three-run leads are gimme situations; fans are heading to the exits. On the other hand, tie games in the ninth are the epitome of crucial situations. Yet most managers would rather use their closer with a three-run lead. What gives?

As Steve Treder documented so well last year, the save statistic has warped the way we think of critical situations. Instead of using closers in the most critical situations, managers use their closers to maximize saves rather than team wins. But a save is a statistic, not a strategic metric. WPA is both.

There is a lot to question about relievers in this day and age. Do closers really only have to pitch at the beginning of the ninth inning? Should managers really pitch to matchups as often as they do? Is the overuse of relievers leading to more, rather than fewer, injuries?

But I have one simple question: if managers really want to hold their closers back until the ninth inning, why aren’t they at least consistently using them in the most important situations?

References & Resources
In a couple of days, the Hardball Times Bullpen Book will be on sale. This book will include over 80 pages of WPA statistics on all relievers between the years 2002 and 2004, including P and WPA for each reliever each year. Watch for it!

My deepest thanks go to Doug Drinen, for giving me the insight and permission to use his creation.

Here’s a list of teams that used their closers in tie games in the ninth at least 50% of the time:
– Braves/Smoltz: 7 of 9 opportunities
– Red Sox/Foulke: 3 of 5
– Tigers/UUU: 9 of 15
– Mets/Looper: 8 of 14
– Yankees/Rivera: 2 of 4

By the way, I do believe that Tangotiger’s “Leverage Index” is a better metric than P. However, Tango has not yet released Leverage Index figures for the past three years, and so I present P as a public service (and a way to address a pet issue of mine).

By the way, Tango pointed out to me that this subject was well covered by Baseball Prospectus nearly five years ago. I’m not sure why the baseball world hasn’t made more progress since then, but maybe my graphs can help a little bit.

Dave Studeman was called a "national treasure" by Rob Neyer. Seriously. Follow his sporadic tweets @dastudes.
