And Here’s the Pitch …
This is a game to be savored, not gulped. There’s time to discuss everything between pitches or between innings.
-Bill Veeck
Last week The Washington Post printed a position-by-position analysis that detailed which players had swung and missed most and least frequently through August 20 of this season. This prompted a bit of discussion on the SABR-L listserv (and did I mention that if you’re not a member of SABR you should be?) as several members debated the values of this sort of data and the merit of analyzing it. Based on that discussion I thought this week I’d give a short synopsis of how pitch data is typically used and discuss what might and might not be learned from its analysis.
For this article I used Retrosheet data and included the 473 players who had at least 500 plate appearances in the years 2000 through 2004.
Pitches per Plate Appearance
Perhaps the most basic stat that can be derived from pitch data is the average number of pitches per plate appearance (P/PA). While total pitch counts are readily available on MLB.com, the site does not calculate P/PA for you. In any case, the leaders and trailers in pitches per plate appearance for 2000-2004 were:
Highest               PA   OPS  Pitches  P/PA
Rickey Henderson    1290   694     5595  4.34
Brad Wilkerson      2029   838     8740  4.31
Todd Zeile          2552   751    10953  4.29
Jeremy Giambi       1299   828     5574  4.29
Adam Dunn           2112   893     9039  4.28

Lowest                PA   OPS  Pitches  P/PA
Randall Simon       1425   733     4310  3.02
Brent Butler         597   664     1810  3.03
Nomar Garciaparra   2456   906     7638  3.11
Rey Sanchez         2196   627     6876  3.13
Deivi Cruz          2603   699     8197  3.15
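Since P/PA is simply total pitches divided by plate appearances, it is easy to compute yourself. Here is a minimal Python sketch using four rows from the table above; the function name and the dictionary layout are my own choices for illustration, not part of any Retrosheet tool.

```python
# (plate appearances, total pitches) copied from the table above.
players = {
    "Rickey Henderson": (1290, 5595),
    "Brad Wilkerson": (2029, 8740),
    "Randall Simon": (1425, 4310),
    "Nomar Garciaparra": (2456, 7638),
}


def pitches_per_pa(pa, pitches):
    """Average number of pitches seen per plate appearance."""
    return pitches / pa


# Round to two decimals to match the precision used in the table.
p_pa = {
    name: round(pitches_per_pa(pa, pitches), 2)
    for name, (pa, pitches) in players.items()
}

# Print from most to fewest pitches seen per plate appearance.
for name, value in sorted(p_pa.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<20}{value:.2f}")
```

The rounded results match the P/PA column for these four players.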
Not surprisingly, the players in the first list have much higher on-base percentages (OBP) than those in the second since it’s difficult to take a walk when you’re swinging so much, a la the quintessential “bad-ball hitter” Randall Simon. However, that’s not to say that players who see fewer than 3.4 pitches per plate appearance (which includes about 8% of the players in this study) can’t be successful. Nomar Garciaparra (3.11 P/PA) and Vladimir Guerrero (3.28 P/PA) are cases in point, along with Garret Anderson (3.32 P/PA) and Vernon Wells (3.37 P/PA). These players are aggressive hitters who often make solid contact.
That said, the players in the bottom 20% of P/PA have an average on-base plus slugging (OPS) of 719 while those in the top 20% have an OPS of 797. Because OPS correlates very well with run production and is therefore a good proxy for it, and because going deep into counts forces opposing pitchers to throw more pitches and tire sooner, it’s safe to say that, on average, players who see more pitches end up contributing more to their teams.
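The top 20% versus bottom 20% comparison can be sketched in a few lines of Python. This is only an illustration: it uses the ten players from the P/PA table as a toy sample rather than the full 473-player data set, and it assumes a simple unweighted mean of OPS within each group, which may not be exactly how the figures above were computed. The function name is hypothetical.

```python
# Toy sample only: (name, PA, OPS, pitches) for the ten players in the P/PA table.
sample = [
    ("Rickey Henderson", 1290, 694, 5595),
    ("Brad Wilkerson", 2029, 838, 8740),
    ("Todd Zeile", 2552, 751, 10953),
    ("Jeremy Giambi", 1299, 828, 5574),
    ("Adam Dunn", 2112, 893, 9039),
    ("Randall Simon", 1425, 733, 4310),
    ("Brent Butler", 597, 664, 1810),
    ("Nomar Garciaparra", 2456, 906, 7638),
    ("Rey Sanchez", 2196, 627, 6876),
    ("Deivi Cruz", 2603, 699, 8197),
]


def quintile_ops(rows, fraction=0.2):
    """Return (top_mean_ops, bottom_mean_ops) for the highest and lowest
    `fraction` of players when ranked by pitches per plate appearance."""
    ranked = sorted(rows, key=lambda r: r[3] / r[1], reverse=True)
    n = max(1, round(len(ranked) * fraction))
    top, bottom = ranked[:n], ranked[-n:]

    def mean_ops(group):
        # Unweighted mean (assumption): every player counts equally.
        return sum(r[2] for r in group) / len(group)

    return mean_ops(top), mean_ops(bottom)


top_ops, bottom_ops = quintile_ops(sample)
print(top_ops, bottom_ops)  # 766.0 698.5 for this ten-player sample
```

With only ten players, each "quintile" is just two hitters, so the gap here says nothing by itself; the same function run over the full sample is what would produce figures comparable to the 797 and 719 quoted above.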
Swinging at the First Pitch
A second aspect of pitch data that is often discussed concerns how often players swing at the first pitch. Here are the top and bottom five in this category:
Highest               PA   OPS  Pitches  P/PA  1stP  1stP/PA
Wily Mo Pena         563   771     2047  3.64   276    0.490
Nomar Garciaparra   2456   906     7638  3.11  1202    0.489
Karim Garcia         830   741     2813  3.39   403    0.486
Vinny Castilla      2738   734     8870  3.24  1306    0.477
Vladimir Guerrero   3168  1005    10399  3.28  1500    0.473

Lowest                PA   OPS  Pitches  P/PA  1stP  1stP/PA
Scott Hatteberg     2415   763     9828  4.07   229    0.095
Todd Zeile          2552   751    10953  4.29   263    0.103
Jason Kendall       3279   778    12842  3.92   341    0.104
Mark Ellis          1026   711     4167  4.06   121    0.118
Randy Velarde       1090   749     4358  4.00   130    0.119
Neither of these lists contains many surprises, with Garciaparra and Guerrero making the top five and Scott Hatteberg and A’s teammates Jason Kendall and Mark Ellis in the bottom five.
What is most interesting is that the top 20% have an average OPS of 760 while the bottom 20% recorded an almost identical 759 OPS. Further, the correlation coefficient (the measure of the linear relationship of two measures) between first pitch percentage and OPS was 0.01. In other words, it seems that players adopt different strategies as to whether to swing at the first pitch but at the plate appearance threshold of this study it doesn’t make a difference in their overall productivity.
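For readers who want to see what goes into a correlation coefficient, here is a minimal Python sketch of the standard Pearson formula. The function name is my own, and the sample data are just the ten players from the first-pitch table, so the value it produces is for those extremes only and should not be expected to match the 0.01 figure from the full sample.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# 1stP/PA and OPS for the ten players in the first-pitch table, in table order.
first_pitch_rate = [0.490, 0.489, 0.486, 0.477, 0.473,
                    0.095, 0.103, 0.104, 0.118, 0.119]
ops = [771, 906, 741, 734, 1005, 763, 751, 778, 711, 749]

r = pearson(first_pitch_rate, ops)
print(f"r = {r:.2f}")
```

A value near 0 means no linear relationship, while values near 1 or -1 mean the two measures rise and fall together or in opposition.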
Swinging and Missing
The category the Post focused on was how often players swung and missed. Over the last five years the leaders and trailers in this category included:
Highest               PA   OPS  Pitches  P/PA  Miss  Miss/P
Wily Mo Pena         563   771     2047  3.64   400   0.195
Russ Branyan        1403   805     5774  4.12  1087   0.188
Jared Sandberg       706   703     2796  3.96   498   0.178
Todd Greene          726   734     2460  3.39   425   0.173
Ruben Rivera        1010   690     3840  3.80   646   0.168

Lowest                PA   OPS  Pitches  P/PA  Miss  Miss/P
Juan Pierre         3037   742    10160  3.35   287   0.028
Luis Castillo       3231   743    13015  4.03   389   0.030
David Eckstein      2520   700     9531  3.78   291   0.031
Chuck Knoblauch     1393   685     5484  3.94   172   0.031
Scott Hatteberg     2415   763     9828  4.07   316   0.032
Once again you see free swingers with high strikeout rates among the leaders and contact hitters among the trailers, as you would expect. And because the free swingers tend to have higher slugging percentages and the contact hitters higher on-base percentages, the tradeoff means there is little correlation between swinging and missing and OPS.
That correlation is slightly positive (.11), with the top 20% recording a 745 OPS while the bottom 20% recorded a 770 OPS, but once again we’re likely seeing players focusing on their strengths in order to succeed. Of course I wouldn’t be hitting and running with the players in the first list…
Fouling off Pitches
Next, we’ll consider the ability to foul off pitches. The leaders and trailers in foul balls per plate appearance are:
Highest               PA   OPS  Pitches  P/PA  Foul  Foul/PA
Kevin Young         1669   722     6699  4.01  1370    0.821
Johnny Estrada       899   741     2999  3.34   736    0.819
Joe McEwing         1169   644     4739  4.05   933    0.798
Vance Wilson         714   692     2709  3.79   559    0.783
Tomas Perez         1051   692     3947  3.76   815    0.775

Lowest                PA   OPS  Pitches  P/PA  Foul  Foul/PA
Dave Roberts        1316   690     5057  3.84   583    0.443
Bill Haselman        546   714     1916  3.51   244    0.447
Jason Tyner          844   594     2743  3.25   390    0.462
Mark McLemore       2119   712     8417  3.97   982    0.463
Tom Goodwin         1388   661     5364  3.86   644    0.464
While it is often said that being able to foul off pitcher’s pitches in order to extend an at bat is a skill a la Richie Ashburn or Ichiro Suzuki (who was in the middle of the pack at .643), over the long run it doesn’t appear that fouling off more pitches than average makes a player any more successful. The top 20% had an OPS of 767 while the bottom 20% were at 753 with the correlation even lower than that for swinging and missing.
The idea that fouling off pitches is a skill is likely related to the several instances over the last five years of players fouling off pitch after pitch only to finally succeed. The most memorable was the 18-pitch at bat in the bottom of the 7th that the Dodgers’ Alex Cora recorded against the Cubs’ Matt Clement on May 12, 2004, which resulted in a two-run home run to right field. That pitch sequence went as follows:
BCBFFFFFFFFFFFFFFX
where B=ball, C=called strike, F=foul, and X=put into play. For the record, that’s fourteen consecutive foul balls.
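A pitch string in this notation is easy to tally programmatically. The Python sketch below rebuilds the Cora sequence from its description (18 pitches, fourteen straight fouls) and counts each pitch type; the helper name and the code-to-label mapping cover only the four codes defined above and are my own illustration.

```python
from itertools import groupby

# Only the four codes explained in the text.
CODES = {"B": "ball", "C": "called strike", "F": "foul", "X": "put into play"}

# Rebuilt from the description: ball, called strike, ball, 14 fouls, ball in play.
sequence = "BCB" + "F" * 14 + "X"


def longest_run(seq, code):
    """Length of the longest unbroken run of `code` in the pitch string."""
    return max((len(list(group)) for key, group in groupby(seq) if key == code),
               default=0)


# Tally how many pitches of each type appear in the sequence.
counts = {label: sequence.count(code) for code, label in CODES.items()}

print(len(sequence), "pitches")
print(counts)
print("longest foul streak:", longest_run(sequence, "F"))
```

The same helper works on any pitch string, so it could be used to search a season of data for other marathon at bats.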
Other long at bats that ended well for the batter include:
Plate Discipline
The final category is the percentage of pitches taken for balls not including intentional balls. The leaders and trailers here include:
Highest               PA   OPS  Pitches  P/PA  Ball  Ball/P
Barry Bonds         3050  1316    12060  3.95  5377   0.446
Chad Kreuter         634   757     2643  4.17  1166   0.441
Rickey Henderson    1290   694     5595  4.34  2437   0.436
Jason Giambi        3036  1020    12622  4.16  5468   0.433
Mark McLemore       2119   712     8417  3.97  3645   0.433

Lowest                PA   OPS  Pitches  P/PA  Ball  Ball/P
Alex Gonzalez       2321   674     8289  3.57  2410   0.291
Johnny Estrada       899   741     2999  3.34   880   0.293
A.J. Pierzynski     2015   775     6430  3.19  1893   0.294
Wilton Guerrero      649   629     2096  3.23   621   0.296
Rod Barajas          924   645     3292  3.56   979   0.297
Here is where we see the biggest difference, with the top 20% recording an OPS of 837, the bottom 20% an OPS of 705, and a correlation coefficient of .52.
Naturally, you would expect this since OBP is half of OPS and the number of balls taken directly impacts OBP. In addition, the percentage of pitches taken for balls is not only a function of the selectivity of the batter but also the desire of the pitcher to not serve up a fat pitch to a good hitter as illustrated by the inclusion of both power hitters like Bonds and Giambi as well as the likes of Henderson and McLemore.
The Verdict
So is pitch data an important tool for performance analysis?
My take is that it certainly can paint a picture of how a player approaches his at bats. I like to take a look at it, for example, when a player’s performance suddenly changes to see what they might be doing differently if anything. But as a tool for analysis these standard ways of looking at pitch data don’t really add much to the picture you get from aggregate statistics like OPS or Runs Created.