Quantifying Catcher Defense, and Other Stuff Like That

by David Gassko
November 17, 2005

I do not want to begin this article without acknowledging the previous work done in catcher fielding analysis by Tangotiger, Mitchel Lichtman, Chris Dial, Keith Woolner, Clay Davenport, Bill James and others. I only hope that this will be another step in improving our understanding of catcher defense.

To quantify catcher defense, it is imperative that we define it first. What is a catcher’s responsibility in the field? It seems that there are three things expected of a catcher on defense: to control the base paths, which can be measured with stolen base and caught stealing totals; to keep the ball in front of him, that is, prevent passed balls and wild pitches which lead to base advances; and to call a good game, an ability whose existence has often been questioned and which has been up to this point immeasurable.

First off, I want to say that I don’t have all the answers. As you read this article, you may even find many of the ideas here relatively obvious. They should be. However, I’m pretty sure that no one has ever attempted to tackle the subject of catcher defense from the angles taken here, which, in my mind, makes this project worthwhile.

This article will not even answer all the questions it will pose. But I would like to set up a solid framework for evaluating catcher defense, and hope that others will build on it. With the availability of extremely detailed play-by-play data, it is now possible to do all kinds of fielding analysis. However, the catcher position seems to have been forgotten in this progress, which really is too bad, because catchers have a high defensive worth.

It seems strange that baseball fielding analysis has made so little progress in terms of evaluating catcher fielding, when it seems to be universally acknowledged that catcher is the most important defensive position on the field. Bill James said so two decades ago when he first published the defensive spectrum:

1B – LF – RF – 3B – CF – 2B – SS – C

The idea of the defensive spectrum is that positions on the right are harder, and that players tend to move from right to left as they get older. Thus, a shortstop will often move to second base, but rarely will a second baseman move to shortstop. (And when that happens, it generally ends in disaster, a prominent case being that of Texas shortstop Michael Young, who is consistently rated as one of the worst defensive shortstops in the game by all defensive metrics that I know of). Here, James theorized that catcher is in fact the hardest defensive position to play. Not only is it physically challenging (the wear and tear catching puts on a player’s knees generally limits catchers to 130 or so games a year, and catchers tend to decline much earlier than players at other positions), but the nuances of catching take years to master.

Catchers need a strong throwing arm to control base stealers, agility to block pitches and of course, the ability to call a good game and work with pitchers. These are not universal skills, and the latter is almost an art form in terms of the knowledge and social skills it takes to perfect.

So, if we’ve established that catcher is an important defensive position, why is it that the defensive differences between the best and worst catchers seem so small? In Lichtman’s UZR ratings, the difference between the best and worst catcher is about 20 runs. In my ratings, the difference is about the same. Win Shares shows a difference of about 25 runs. Meanwhile, the spread at other positions is much greater with all systems, generally between 40 and 50 runs. This would suggest that it is a lot more important to put a good defensive player at a position other than catcher, even if commonly held beliefs indicate otherwise.

But this is not true. The fact is that without having a good measure of a catcher’s ability to improve the pitcher, we cannot fully understand the importance of catcher defense. However, it is not, in my opinion, faulty to assign great importance to catcher defense without measuring this aspect. That’s why I like Win Shares (which do actually attempt to evaluate catcher influence on team ERA, if with questionable results). Win Shares assign an “intrinsic weight” to each position, which basically means that an average fielder at shortstop is given more credit (Win Shares) than an average fielder at second base. In fact, the average Win Shares each player receives per 1,000 innings are as follows, per the Raindrops weblog:

POS   WS/1,000
1B      1.67
2B      4.45
3B      3.29
SS      5.00
OF      2.59
C       5.20

But, as far as I know, Bill James never explained how he derived these weights. They seem to make sense—as they conform to the order given in the defensive spectrum—but they need to be validated with empirical testing to be useful.

The question is: how do we set a correct weight for each position and how do we determine its relative defensive worth? This can be done by looking at the average plays made at each position. The fact is that the more balls a player is expected to get to, the more defensive worth a player at that position has. This is a simple enough method and it makes sense. The weights that we derive here should validate or reject the Win Shares’ “intrinsic weights” pretty simply. The following is a table that compares the percentage of total Fielding Win Shares (excluding catchers) that such a system believes each position should get, and what Win Shares says:

POS      ME       WS
1B     .087     .075
2B     .186     .201
3B     .131     .148
SS     .189     .225
OF     .407     .350

So Win Shares seems to be getting it mostly right. Based on your definition of plays (I used A+E for infielders, PO+E for outfielders, and independent plays, which are defined as a first baseman’s PO+A minus his infielders’ assists by Bill James in Win Shares, for first basemen) and on the sample size (I used all fielders in 2005), you will find different results. It seems that Win Shares underrates outfielders and overrates middle infielders, perhaps because there has been an increase in plays that are being made by outfielders in recent years.
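For anyone who wants to repeat the exercise with their own definition of plays, here is a minimal sketch of the calculation in Python. The function names are mine, and the league totals are hypothetical round numbers chosen only to make the arithmetic easy to follow; they are not the 2005 figures behind the table above.

```python
# Hypothetical league totals, chosen as round numbers for illustration only.
# They are NOT the 2005 figures behind the table above.
LEAGUE = {
    "1B": {"po": 1400, "a": 100, "infield_assists": 1300},
    "2B": {"a": 480, "e": 20},
    "3B": {"a": 280, "e": 20},
    "SS": {"a": 475, "e": 25},
    "OF": {"po": 990, "e": 10},
}


def plays_made(pos, stats):
    """Count plays at a position using the definitions given in the text."""
    if pos == "1B":
        # Independent plays: PO + A minus the assists of the other infielders.
        return stats["po"] + stats["a"] - stats["infield_assists"]
    if pos == "OF":
        # Outfielders: putouts plus errors.
        return stats["po"] + stats["e"]
    # Second base, third base and shortstop: assists plus errors.
    return stats["a"] + stats["e"]


def credit_shares(league):
    """Share of non-catcher fielding credit, proportional to plays made."""
    plays = {pos: plays_made(pos, stats) for pos, stats in league.items()}
    total = sum(plays.values())
    return {pos: count / total for pos, count in plays.items()}


shares = credit_shares(LEAGUE)
for pos, share in shares.items():
    print(f"{pos}  {share:.3f}")
```

Swapping in a different definition of a play only means changing `plays_made`; the shares always sum to one.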

As you may have noticed, catchers were left out of this analysis. Catchers make very few independent plays (which can be defined as PO minus strikeouts), and those plays say little about their worth. If we use plays made by catchers as our proxy for evaluating how much credit to give catchers for playing defense, we will arrive at a small number, one that is not representative of what we presume to be their true value. So let us instead take a different approach to answering this question.

Sherri Nichols coined Nichols’s Law of Catcher Defense, which states “that a catcher’s defensive reputation moves in inverse proportion to the quality of his hitting.” Since defensive reputation does correlate somewhat well with actual defensive performance, it is safe to say that in general, the worse a catcher is defensively, the better he is on offense and vice-versa. What’s cool about this, though, is that her law can be applied to all positions, not just individually, but in relation to each other. In other words, the worse the average player at a position is as a hitter, the tougher that position is defensively. This idea is extremely important to quantifying catcher defense. Take a look at the following chart of the Gross Production Average (GPA), which is (1.8*OBP + SLG)/4, at each position in 2005:

POS     GPA
C      .245
1B     .283
2B     .250
3B     .260
SS     .243
LF     .272
CF     .258
RF     .275
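One way to read the table is to sort the positions by GPA and set the result beside the defensive spectrum given earlier. The sketch below does only that, using the figures above; the variable names are mine. By these numbers, left field and right field trade places, and so do shortstop and catcher, each by a few points of GPA.

```python
# GPA by position in 2005, copied from the table above.
GPA = {"C": .245, "1B": .283, "2B": .250, "3B": .260,
       "SS": .243, "LF": .272, "CF": .258, "RF": .275}

# Bill James's defensive spectrum, easiest position first.
SPECTRUM = ["1B", "LF", "RF", "3B", "CF", "2B", "SS", "C"]

# Positions ordered from best-hitting to worst-hitting.
by_offense = sorted(GPA, key=GPA.get, reverse=True)

# Slots where the two orderings disagree, as (spectrum, by_offense) pairs.
mismatches = [(s, o) for s, o in zip(SPECTRUM, by_offense) if s != o]

print(by_offense)
print(mismatches)
```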

The tougher defensive positions have less offensive production. In fact, this confirms James’s defensive spectrum: if you put each position on a line, with the higher GPA positions to the left and the lower GPA positions to the right, you would have James’s defensive spectrum, with only two adjacent pairs flipped, barely: right field with left field, and catcher with shortstop. So how can we use this information?

It can be used to find the amount of defensive credit each position deserves. The correlation between each position’s GPA and the amount of credit it should get based on plays made is almost perfect: r^2 = .91. In other words, determining how much defensive credit each position should get (in terms of their “intrinsic weight” or what percentage of total WS the average player at a position should be assigned) by using offensive production is a wholly acceptable method. Using it, we can understand how much credit catchers should receive for their fielding.

To do this, I regressed the percentage of Fielding Win Shares a position “should” get on that position’s GPA. Knowing the exact impact of GPA on fielding responsibility allows me to assign Fielding Win Shares based on each position’s offensive production. Using ordinary least squares regression, I found that each additional point of GPA (.001) results in the loss of about .0025 of credit, or a quarter of a percentage point. Here is how the actual credit each position should get compares to that predicted by my model:

POS   ACTUAL    PREDICTED
1B     .087        .093
2B     .186        .175
3B     .131        .151
SS     .189        .192
LF     .121        .121
CF     .158        .155
RF     .128        .114
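The regression can be rerun from the numbers printed in this article. The sketch below is a plain least-squares fit of credit share on GPA for the seven non-catcher positions; the function name is mine, and the inputs are the rounded figures from the tables, so the output should be read as a check rather than an exact replay of the original calculation. On these inputs the slope comes out near -.0025 per point of GPA and r^2 near .91, and the predictions land within about .001 of the PREDICTED column.

```python
# GPA and plays-based credit share for the seven non-catcher positions,
# copied from the tables above (outfield credit split into LF, CF and RF).
GPA = {"1B": .283, "2B": .250, "3B": .260, "SS": .243,
       "LF": .272, "CF": .258, "RF": .275}
ACTUAL = {"1B": .087, "2B": .186, "3B": .131, "SS": .189,
          "LF": .121, "CF": .158, "RF": .128}


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x; also returns r^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


positions = list(GPA)
intercept, slope, r_squared = fit_line([GPA[p] for p in positions],
                                       [ACTUAL[p] for p in positions])
predicted = {p: intercept + slope * GPA[p] for p in positions}

# One "point" of GPA is .001, so divide the raw slope by 1,000.
print(f"credit change per point of GPA: {slope / 1000:.4f}")
print(f"r^2: {r_squared:.2f}")
for p in positions:
    print(f"{p}  actual {ACTUAL[p]:.3f}  predicted {predicted[p]:.3f}")
```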

With an almost perfect fit, using this model, we can judge the importance of catcher defense as well, or at least the perceived importance. Actually, let me digress here, as this is a big issue that needs to be addressed. I keep saying here that catcher defense is important, that it’s the toughest position to play, that catchers have a huge impact on the game, but there doesn’t seem to be much proof. The fact is, and I mentioned this earlier, that until we can properly quantify the run prevention abilities of catchers, we cannot quantify their true worth.

In his excellent article on catcher defense in the 1999 Baseball Prospectus, Keith Woolner concluded that “if there is a true game-calling ability, it lies below the threshold of detection. There is no statistical evidence for a large game-calling ability, but that doesn’t preclude that a small ability exists that influences results on the field.” In a follow-up study, he found that catchers save less than .60 points off ERA; however he did not approximate how much less. It seems that Bill James theorized that one standard deviation equals .11 points, meaning that 68% of catchers have an effect on pitcher ERA that is less than or equal to +/- .11 points, and 95% have an effect that is less than or equal to +/- .22 points. But even such a small effect would suggest that the difference between the best and worst catcher in the league would be 40-50 runs a season, only in terms of calling a game.

But if catchers have no measurable impact on ERA, why give them so much defensive credit when the impact of a good throwing arm and blocking pitches is relatively small? Three things come into play here:

1) There is generally some truth to commonly held beliefs. If every person who works in baseball or has ever played baseball believes that catcher defense is extremely important, then it probably is. Generally speaking, this many people, over many generations, cannot be wrong. It’s possible, but highly unlikely. Just because we cannot measure an effect does not mean it is nonexistent, and if we must turn to the baseball establishment to gauge its importance, so be it.

2) The system used here to establish how many Win Shares catchers should get, comparing catcher offense to offense produced at other positions, is so accurate at all positions, it is highly unlikely that it could be very wrong at catcher. In other words, the baseball establishment has done such a good job at balancing offense and defense at other positions that it would be shocking to find that it is significantly overrating the need for a good defensive catcher.

3) If it is possible that people who work in baseball are completely wrong (it wouldn’t be the first time), and catcher defense is indeed overrated, that does not mean that we should not be apportioning much credit to catchers based on the position they play. The fact is that the only other suitable explanation for the low level of offense at catcher is that very few players can be catchers, that most do not have the ability to call a game or to squat for three hours straight, etc. For those who do have that ability, there is little variance, but either you have it or you don’t. Either you can catch or you cannot. In that case, catchers deserve credit for possessing a rare skill, and that credit can be apportioned to them by looking at their offensive abilities as a group.

Finally, the question of what Win Shares really is comes into play. In Bill James’s construct, it is a quasi-above replacement metric, so I see no reason that my construct should not work for Win Shares. I’m not introducing replacement level into Win Shares; Bill James already did that. I’m simply refining where it should be.

So let’s get to the results. Using the equation derived from the regression, here is a comparison of the percentage of credit for fielding that we “should” be apportioning to each position and what Win Shares actually does:

POS      ME       WS
C      .158     .190
1B     .078     .061
2B     .147     .163
3B     .127     .120
SS     .161     .183
OF     .328     .285

You can see that Win Shares actually overrates catchers a bit, but the difference is relatively small (only about one WS a year). In fact, even in the outfield, the difference between actual and predicted is only about two win shares per season for all three outfielders combined. Win Shares does a pretty good job in splitting up defensive responsibility.
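The ME column can be rebuilt from figures already given. A least-squares line always passes through the point of means, which for the seven non-catcher positions is a GPA of .263 and a credit share of 1/7. Catchers hit 18 points below that, so the line gives them 18 times the per-point slope in extra credit. The last step in the sketch, adding the catcher to the pool and rescaling so that everything sums to one, is my assumption about how the catcher was folded in; it is not spelled out above, though it does land within .001 of every entry in the ME column. The variable names are mine.

```python
# PREDICTED column from the regression table earlier in the article.
PREDICTED = {"1B": .093, "2B": .175, "3B": .151, "SS": .192,
             "LF": .121, "CF": .155, "RF": .114}

# Point of means for the seven non-catcher positions (from the tables above).
MEAN_GPA_POINTS = 263
MEAN_SHARE = 1 / 7
CREDIT_PER_POINT = 0.0025   # credit lost per point of GPA, as quoted in the text
CATCHER_GPA_POINTS = 245

# Catchers hit 18 points below the mean, so they gain 18 * .0025 of credit.
catcher_raw = MEAN_SHARE + CREDIT_PER_POINT * (MEAN_GPA_POINTS - CATCHER_GPA_POINTS)

# Assumed step: add the catcher to the pool, then rescale to sum to one.
raw = dict(PREDICTED, C=catcher_raw)
total = sum(raw.values())
shares = {pos: value / total for pos, value in raw.items()}

# Combine the three outfield spots, as the table does.
shares["OF"] = shares.pop("LF") + shares.pop("CF") + shares.pop("RF")

for pos in ("C", "1B", "2B", "3B", "SS", "OF"):
    print(f"{pos}  {shares[pos]:.3f}")
```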

So where do we go from here? We need to actually measure catcher defense. Almost 3,000 words in, and we’ve finally gotten to numbers that involve real players. First, let’s look at stolen bases. What impacts steal rates? Surprisingly little. You would think that harder throwing pitchers might have some advantage because the ball gets to the plate faster, but they do not. The problem is that hard throwing pitchers have a tendency to not pay as much attention to the runner on base, while soft-tossers have to find every advantage they can get. On the other hand, of course, if you’re not throwing the ball hard, the ball is going to take longer to get to the plate, compensating for the fact that a base runner might not have gotten as big a jump. In short, these things pretty much cancel out, leaving us with one big variable: handedness.

Every baseball fan knows this. Not only can left-handed pitchers get a throw over to first more quickly, as they do not have to turn around to do so, but they also can do it more nonchalantly because a throw over to first can look as if it is part of their natural throwing motion. Moreover, and this is a smaller advantage, left-handed pitchers are more likely to face right-handed batters, and it is easier for a catcher to make a throw to second with a righty up than with a lefty, as all catchers are right-handed.

Thus, when examining a catcher’s ability to prevent stolen bases, it is essential to control for handedness. For example, in 2005, left-handed pitchers were expected to allow -.2 stolen base runs (measured as .193*SB-.437*CS, figures stolen directly from tangotiger.net) per 100 Innings Pitched, meaning that it is generally a bad idea to attempt a steal with a left-hander on the mound. Righties averaged .17 stolen base runs per 100 innings pitched, meaning that the average stolen base attempt against right-handed pitchers had a positive result. While these differences may seem small, they are important to note, as the overall effect we are trying to measure (how good catchers are at preventing stolen bases) is not very large either.
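For reference, the stolen base run formula is simple enough to write out. The sketch below uses the two weights quoted above; the function names and the sample line of 10 steals and 5 caught stealing in 150 innings are hypothetical. It also shows the break-even success rate the weights imply, which is just the caught stealing weight divided by the sum of the two weights, a little over 69 percent.

```python
SB_RUN_VALUE = 0.193   # runs gained per stolen base (weight quoted in the text)
CS_RUN_VALUE = 0.437   # runs lost per caught stealing (weight quoted in the text)


def stolen_base_runs(sb, cs):
    """Net runs the offense gains from its steal attempts."""
    return SB_RUN_VALUE * sb - CS_RUN_VALUE * cs


def runs_per_100_innings(sb, cs, innings):
    """Stolen base runs allowed, scaled to 100 innings pitched."""
    return stolen_base_runs(sb, cs) / innings * 100


def break_even_rate():
    """Success rate at which a steal attempt is worth exactly zero runs."""
    return CS_RUN_VALUE / (SB_RUN_VALUE + CS_RUN_VALUE)


# Hypothetical pitcher: 10 steals and 5 caught stealing in 150 innings.
print(round(stolen_base_runs(10, 5), 3))
print(round(runs_per_100_innings(10, 5, 150), 3))
print(round(break_even_rate(), 3))
```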

So to measure catcher defense, what we need to do is find expected stolen base runs based on the number of innings pitched by left-handed pitchers and the number pitched by right-handed pitchers on each team, and compare that figure to each catcher’s actual total. Here are the leaders and trailers in 2005:

                     RAA
Yadier Molina       7.60
Ivan Rodriguez      7.54
Brian Schneider     6.68
Henry Blanco        5.67
Jose Molina         5.62
Toby Hall           5.57
Danny Ardoin        4.88
Joe Mauer           4.53
Raul Chavez         4.06
Mike Matheny        3.86
 
A.J. Pierzynski     -3.8
Javy Lopez          -3.9
Todd Greene         -3.9
Sal Fasano          -4.0
Paul Lo Duca        -4.1
Victor Martinez     -4.6
J.D. Closser        -5.8
Jason Phillips      -6.1
Jason Kendall       -8.3
Mike Piazza         -8.9

You can see that most catchers live up to their reputation, with Matheny and Pudge near the top and Piazza near the bottom. Some young stars—namely Mauer, Schneider and the Molina brothers—are great defensively and should be getting a lot more recognition, while the Kendall deal keeps looking worse and worse for the A’s.
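The handedness adjustment behind these ratings can be sketched in a few lines. The league rates are the ones quoted earlier (-.2 runs per 100 innings for lefties, .17 for righties). Everything else is an assumption for illustration: the function names, the sign convention (expected minus actual, so that a positive number is good for the catcher), the idea of counting only the innings a catcher actually caught, and the sample catcher, who is hypothetical.

```python
# League stolen base runs allowed per 100 innings, by pitcher handedness,
# as quoted in the text for 2005.
LEAGUE_RATE = {"L": -0.20, "R": 0.17}


def stolen_base_runs(sb, cs):
    """Net runs gained by the offense, using the weights quoted in the text."""
    return 0.193 * sb - 0.437 * cs


def expected_runs(innings_by_hand):
    """Runs an average catcher would allow behind this mix of pitchers."""
    return sum(LEAGUE_RATE[hand] * innings / 100
               for hand, innings in innings_by_hand.items())


def runs_above_average(sb, cs, innings_by_hand):
    """Expected minus actual: positive means fewer runs allowed than average."""
    return expected_runs(innings_by_hand) - stolen_base_runs(sb, cs)


# Hypothetical catcher: 300 innings behind lefties, 700 behind righties,
# with 40 steals allowed and 25 runners thrown out.
raa = runs_above_average(sb=40, cs=25, innings_by_hand={"L": 300, "R": 700})
print(round(raa, 2))
```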

While researching this article, the problem of splitting credit between catchers and pitchers arose. If a catcher does particularly well, is it because he has a pitching staff that knows how to control the base paths or because the catcher has such a good arm? Generally, a pitching staff won’t be great either way, so these things tend to get cancelled out, but it still may be interesting to look at which pitchers are best at taking runners off the base paths and keeping them from stealing. Here are the leaders and trailers in 2005. (All the numbers in the table below have been adjusted for handedness):

                     RAA
Mike Maroth         4.04
Carlos Zambrano     3.87
Ryan Drese          3.38
Chris Capuano       3.11
Gustavo Chacin      2.73
Mark Mulder         2.68
D.J. Carrasco       2.56
Aaron Harang        2.55
Livan Hernandez     2.52
Jon Garland         2.42
 
Jamie Moyer         -2.3
Kris Benson         -2.4
Jason Vargas        -2.5
Victor Zambrano     -2.6
Brandon Webb        -2.7
Orlando Hernandez   -2.8
Barry Zito          -2.9
Jorge Sosa          -3.2
Kevin Millwood      -3.4
Jose Contreras      -4.2

The spread in pitcher ability here is very small: only eight runs between the best and worst. But that’s actually a pretty great difference if you think about it. For a starter throwing about 200 innings, it’s the equivalent of roughly a .35-point difference in ERA. If catchers were mostly responsible for base runners, you would see many players from the same team on one side of the list and would be unlikely to see two players from the same team on different sides of the list. Yet teammates do land on opposite sides: Jon Garland, Orlando Hernandez and Jose Contreras all play for the same team, yet Garland is among the best and Hernandez and Contreras are among the worst pitchers at preventing steals.

Moving on, let’s look at passed balls and wild pitches. We know that a catcher greatly affects both, but how much? A straight measurement doesn’t seem to work—some pitchers clearly have more control than others and it’s unfair to punish catchers for that. So how do we find how many passed balls and wild pitches a catcher is expected to have? Tangotiger proposed one possible method. He said that if you looked at a pitcher’s passed ball and wild pitch totals with every other catcher to whom he has ever pitched, you could find a baseline to compare a catcher to. His research indicated that this baseline would generally be equivalent to league average, though obviously that would be one concern. The other issue, a much more important one, is sample size. Such a method might work well for a pitcher who has been around for a decade and who has pitched many innings to multiple catchers. But what about inexperienced guys who have had one primary catcher for their whole careers? And how do we factor in the effects of age?

My theory is that passed balls and wild pitches can be predicted by looking at a pitching staff’s numbers. To confirm or deny this idea, I threw a bunch of variables into a regression equation and then used stepwise, univariate and multivariate models to find which variables were significant and which were not. Put simply, I made sure that each variable I threw out indeed had little or no effect on passed balls before eliminating it.

In the end, three variables were found to be statistically significant: strikeouts, earned runs and hit batsmen. A regression on 2004 data showed the same results. The more earned runs a pitching staff allows, the more wild pitches and passed balls it is expected to have; the same holds true for hit batters. This seems obvious enough—hitting batters and allowing runs generally shows a lack of control. Strikeouts have a negative impact on expected passed balls plus wild pitches, which is also expected—to get strikeouts, you have to throw strikes, and to throw strikes, you must have good control. Obviously, the less control a pitcher has, the more likely he is to throw a pitch so errant that it ends up at the backstop.
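To show the shape of the calculation, here is a minimal sketch of a three-variable regression and of the comparison it allows. It is not my actual model. The staff totals are invented and deliberately noise-free (they were built from fixed coefficients so the fit can be checked by eye), the coefficients mean nothing outside this example, and the function names are hypothetical. Only the signs are meant to echo the result described above: strikeouts negative, earned runs and hit batsmen positive.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    x = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


def expected(coefs, row):
    """Expected PB+WP for a staff with the given (K, ER, HBP) totals."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Invented staff totals: (strikeouts, earned runs, hit batsmen) and PB+WP.
STAFFS = [(1100, 640, 40), (950, 640, 70), (1100, 760, 70),
          (950, 760, 40), (1200, 700, 55), (1000, 600, 60)]
PB_WP = [46, 61, 64, 55, 53, 54]

coefs = fit(STAFFS, PB_WP)

# Hypothetical catcher whose staff allowed 48 PB+WP.
baseline = expected(coefs, (1000, 700, 50))
saved = baseline - 48
print([round(c, 3) for c in coefs])
print(round(baseline, 2), round(saved, 2))
```

The number that matters for a catcher is the last one: how many fewer passed balls and wild pitches his staff allowed than its strikeouts, earned runs and hit batsmen would predict. Turning that count into runs needs a run value per passed ball or wild pitch, which this sketch does not assume.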

So who were the best and worst catchers at preventing passed balls and wild pitches? Take a look:

                     RAA
Mike Lieberthal     4.96
Victor Martinez     4.05
Ramon Castro        2.62
Brad Ausmus         2.61
Jorge Posada        2.60
Yorvit Torrealba    2.43
Gregg Zaun          2.39
John Flaherty       2.29
Paul Lo Duca        2.17
Ramon Hernandez     2.13
 
Ivan Rodriguez      -2.1
Javy Lopez          -2.3
Miguel Olivo        -2.4
Geronimo Gil        -2.6
Chris Widger        -2.7
Chris Snyder        -2.8
Jose Molina         -3.3
A.J. Pierzynski     -3.7
Chad Moeller        -4.4
Bengie Molina       -5.3

Players who have great reputations as defensive catchers appear at the top of the list, while many who were near the top of the list in terms of stolen bases go to the bottom. Why? Throwing out base runners and blocking pitches are two very different skills. The first requires mainly a strong throwing arm; the latter requires agility. The average weight of the bottom 10 catchers here is 208 pounds; the top-10 average is 192 pounds. Obviously, weighing less would allow a catcher to be lighter on his feet and prevent more passed balls and wild pitches. Perhaps when evaluating catchers, teams should look at their body type for a good estimation of defensive ability.

Now let’s move back to that which I cannot quantify, namely game-calling ability. It is important to acknowledge first that it exists before we can continue; otherwise there is no point to the following exercise. I find it hard to believe that catchers have no impact on pitcher performance, mainly because we know that great preparation has made some pitchers’ careers (for example Greg Maddux and, to a lesser extent, Curt Schilling). A catcher who knows his pitcher and knows his opponent has to improve his pitchers, right?

I think it is impossible to deny that catchers impact pitcher performance. These effects are hard to pinpoint (when Chris Dial did a study into catchers’ effects on walks and strikeouts instead of the less granular ERA, he still found no discernible impact), but they are there. I mentioned earlier the need for more granular play-by-play data, and that is what I want to come back to. My local paper, The Boston Globe, likes to run charts like the one below:

[Chart not reproduced: Manny Ramirez’s hitting by pitch location]

The cool thing about this data is that it tells you a hitter’s strengths and weaknesses—here, we can see that Manny Ramirez feasts on pitches inside and becomes progressively worse the farther down and away they go. So, using this chart, a catcher would know to call pitches down and away, because we can predict that Ramirez will do worst on those pitches. If you broke the data down further, using pitch type as well, you could make an even more accurate prediction.
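In code, reading such a chart amounts to finding the cell where the hitter does worst. The grid below is a hypothetical right-handed hitter with placeholder batting averages, not Ramirez’s actual numbers, and the function name is mine.

```python
# Placeholder batting averages by pitch location for a hypothetical
# right-handed hitter. These are NOT Manny Ramirez's actual numbers.
# Rows run from up to down, columns from inside to away.
ZONE_AVG = [
    [.340, .310, .280],   # up
    [.360, .320, .260],   # middle
    [.330, .270, .210],   # down
]
ROWS = ["up", "middle", "down"]
COLS = ["inside", "middle", "away"]


def weakest_location(grid):
    """Return (height, side, average) for the cell where the hitter does worst."""
    value, r, c = min((v, r, c)
                      for r, row in enumerate(grid)
                      for c, v in enumerate(row))
    return ROWS[r], COLS[c], value


print(weakest_location(ZONE_AVG))
```

Splitting the grid further by pitch type would only mean keeping one such table per pitch and taking the minimum across all of them.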

And this is what a catcher’s job boils down to—a catcher needs to call the right pitches in the right location to get a batter out. The actual outcome is unimportant in terms of evaluating catcher defense; too much noise in terms of luck and fielding is involved. But if we kept track of this data for every catcher in the major leagues, I bet you would find some large, sustainable differences. You would find that some catchers call a better game than others, and you would be able to say precisely how much that is worth. Using this kind of granular data, we can take the next step in fielding analysis, and it will be a big one, I think. Now someone just has to keep track of this stuff.
