Analyzing the Strike Zone as a Three-Dimensional Volume

by Eric Lang
September 14, 2015

When it comes to the strike zone, you need to think three dimensionally.

Editor’s Note: This piece was initially given as a presentation at the marvelous 2015 Saber Seminar.

The strike zone has come under heavy fire lately from the analytics community, particularly here at The Hardball Times from the excellent Jon Roegele, because the zone has expanded greatly in recent years and helped contribute to the decrease in run scoring. That is an issue, but there is another interesting topic in the strike zone conversation: the strike zone in three dimensions.

The strike zone is defined by the MLB rule book as the area over home plate that extends from midway between the belt and shoulders of the batter down to the bottom of the knees. The latter part of the definition is not new; however, it did come as a surprise to me to learn that the strike zone covers the entire plate and forms a pentagonal volume.

Most baseball fans, and arguably players, would assume that pitches are called as balls or strikes based on their position at the front of home plate. However, the rule book defines a strike as a pitch where “any part of the ball passes through any part of the zone.” Therefore, a ball that passes through the strike zone for a total of just one inch should technically be called a strike. This is where you may think of backdoor strikes: the slider that catches the outside corner of the plate.

The PITCHf/x cameras and system that track pitches and display them during television broadcasts show only where the pitches are at the front edge of home plate. However, home plate is over 1.4 feet long, offering more distance for the ball to curve in from the side or drop into the zone. This is especially crucial for pitches that are above the strike zone at the front edge of the plate.
For a pitch released at 90 mph at a vertical release angle of -1 degrees, the ball will drop almost two inches over the course of that extra 1.4 feet of home plate, certainly enough for a pitch to fall into the strike zone. Therefore, it is not sufficient to judge ball and strike calls only by where the pitch was at the front of the plate. One must determine if a pitch passed through the three-dimensional version of the strike zone.

The PITCHf/x cameras provide the initial position, velocity and acceleration of each pitch in all three dimensions, and this is crucial to determining the trajectory of the pitch. But before we tackle the trajectory of the ball over the home plate area, we must address the geometry of the dish itself. The plate is 1.42 (17/12) feet deep and the same length across at the front edge. Halfway back toward the catcher (17/24 of a foot back), the sides angle in to form a right isosceles triangle that narrows from 1.42 feet to zero feet across. However, one must also add the radius of the ball to each edge to account for the fact that if any part of the ball crosses the zone, the pitch should be called a strike. Additionally, PITCHf/x records estimates of the top and bottom of the strike zone for each pitch. Given all of this information, I iterated back from the front edge of home plate to the back edge to see if the pitch ever crossed into the zone.

To conduct this study, I looked at data from the 2013 season, giving 757,556 pitches in total. I used the initial nine-parameter fit of the pitches and also extracted the position at the front edge and the outcome of the pitch. From the constant-acceleration fit, I iterated back over home plate at intervals of 0.025 feet (0.3 inches), giving a total of 56 iterations. Therefore, if a pitch was in the strike zone for all 56 iterations, then it went straight down the middle of the zone from the front edge to the back corner. Very few pitches did that.
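The iteration described above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the code used for the study: it assumes the usual PITCHf/x coordinate convention (origin at the back tip of the plate, y pointing toward the pitcher, z pointing up, all in feet), a ball radius of about 1.45 inches, and the simplified plate geometry given in the text. The function and parameter names (`half_width_at`, `time_at_y`, `count_zone_iterations`, `sz_top`, `sz_bot`) are illustrative.

```python
import math

PLATE_DEPTH = 17.0 / 12.0   # feet, front edge to back tip
BALL_RADIUS = 1.45 / 12.0   # feet; assumed value, not taken from the article
STEP = 0.025                # feet between samples, as in the article
N_STEPS = 56                # samples from the front edge toward the back tip


def half_width_at(depth):
    """Half-width of the plate (feet) at a given depth behind the front edge."""
    if depth <= PLATE_DEPTH / 2:
        return PLATE_DEPTH / 2            # rectangular front half
    return max(PLATE_DEPTH - depth, 0.0)  # triangular back half, tapering to 0


def time_at_y(y_target, y0, vy0, ay):
    """Time at which a constant-acceleration pitch first reaches y_target."""
    if ay == 0:
        return (y_target - y0) / vy0
    disc = vy0 ** 2 - 2 * ay * (y0 - y_target)
    return (-vy0 - math.sqrt(disc)) / ay


def count_zone_iterations(fit, sz_top, sz_bot):
    """Return (iterations spent in the 3D zone, strike at the front edge?).

    fit is the nine-parameter tuple (x0, y0, z0, vx0, vy0, vz0, ax, ay, az).
    """
    x0, y0, z0, vx0, vy0, vz0, ax, ay, az = fit
    hits = []
    for k in range(N_STEPS):
        depth = k * STEP
        t = time_at_y(PLATE_DEPTH - depth, y0, vy0, ay)
        x = x0 + vx0 * t + 0.5 * ax * t * t
        z = z0 + vz0 * t + 0.5 * az * t * t
        # Pad every edge by the ball radius: any part of the ball counts.
        in_x = abs(x) <= half_width_at(depth) + BALL_RADIUS
        in_z = sz_bot - BALL_RADIUS <= z <= sz_top + BALL_RADIUS
        hits.append(in_x and in_z)
    return sum(hits), hits[0]


# A made-up pitch that is slightly too high at the front edge but still falling.
dropping = (0.0, 50.0, 7.4072, 0.0, -130.0, -10.0, 0.0, 0.0, 0.0)
iterations, front_strike = count_zone_iterations(dropping, sz_top=3.5, sz_bot=1.5)
```

In this sketch, a pitch with `front_strike` equal to `False` and `iterations` greater than zero is one that missed at the front edge but entered the zone farther back.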
So, just how well did the umpires do in calling balls and strikes in a 3D sense? From the data set, 126,489 “takes” passed through the 3D strike zone at some point. All these pitches should have been called strikes. However, they were called strikes only 80.4 percent of the time. If the pitch was a strike at the front edge of home plate, the correct call percentage increases to 81.8 percent.

So that’s a bit better, but what about pitches that weren’t strikes at the front edge? From the large data set, I found 3,950 pitches (3.1 percent of those takes) that did not pass through the strike zone at the front edge but did at a later point. I’m going to call these “backdoor strikes.”

CORRECT CALL PERCENTAGE

| All Pitches | Strikes At Front Edge | Backdoor Strikes |
|-------------|-----------------------|------------------|
| 80.4        | 81.8                  | 31.8             |

Technically, these should all be called strikes. However, I found that roughly 68 percent were called balls. Not a great success rate. So we can see that umpires’ calls are more than likely influenced by the location of the ball at the front edge of home plate: if a pitch was not a strike at the front edge, it was much less likely to be called a strike. This can be confirmed by looking at the correct call percentage for pitches as a function of the time they spent over home plate in the strike zone.

From the above plot, one can clearly see that as the number of iterations increased, the correct call percentage also increased, but only if the pitch was a strike at the front edge. If a pitch was in the zone for less than six inches, it was called correctly about half the time. The correct call percentage increases all the way up to 93 percent for pitches that were in the zone for 14 or more inches. The jump that occurs at around eight inches (half the depth of the plate) is interesting. Pitches that spent less than half their time over home plate in the strike zone never got over the 60 percent threshold.
However, pitches that spent more than half their time in the strike zone never dipped under the 70 percent threshold. In other words, being in the strike zone for half the time spent over home plate, in addition to being a strike at the front edge, seems to be a crucial point in determining the accuracy of the umpires’ calls.

For backdoor strikes, however, the story is not so rosy. The correct call percentage of those pitches is independent of the length of time the pitch spent in the strike zone. Whether a backdoor strike nicked the edge of the strike zone or caught a lot of the zone, it was still much less likely to be called a strike. Simply being a strike at the front edge appears to be the biggest factor in a pitch being called a strike.

It is also interesting to look at where these backdoor pitches “missed.” Were they too high at the front edge of the plate or were they off to the side? Of that set of 3,950 pitches, 3,041 (77 percent) were above the top of the zone and the rest were off the plate to one of the sides. Therefore, one can assume that of the pitches that came into the strike zone at a later point, most fell into the zone from above rather than curved in from the side. This makes sense because if a pitch is above the zone, then gravity is acting to pull it into the strike zone. However, if a pitch is too far outside, then only the much weaker Magnus force is acting to push the ball toward the strike zone.

The heat map below shows the total number of iterations the ball spent in the strike zone versus the position of the ball at the front edge of home plate. One can see that a pitch was more likely to fall into the strike zone than curve in. If a pitch was six inches too high at the front edge, it still had a chance to spend 30 iterations (or nine inches) in the strike zone. On the other hand, if a pitch was six inches too far outside, then it could spend only seven or eight iterations in the zone (about two inches).
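The split between pitches that missed high and pitches that missed to the side can be sketched as a small classifier on the front-edge location. This is only an illustration under assumptions: the function name `front_edge_miss`, the labels, the ball radius of about 1.45 inches, and the choice to label a pitch that is both high and outside as "high" are all hypothetical rather than taken from the study.

```python
PLATE_HALF_WIDTH = 17.0 / 24.0  # feet, half the width of the front edge
BALL_RADIUS = 1.45 / 12.0       # feet; assumed value


def front_edge_miss(px, pz, sz_top, sz_bot):
    """Label where the ball sits relative to the zone at the front edge.

    px, pz are the horizontal and vertical locations (feet) of the ball's
    center at the front edge; sz_top and sz_bot bound the zone vertically.
    """
    if pz - BALL_RADIUS > sz_top:
        return "high"      # whole ball is above the top of the zone
    if pz + BALL_RADIUS < sz_bot:
        return "low"       # whole ball is below the bottom of the zone
    if abs(px) - BALL_RADIUS > PLATE_HALF_WIDTH:
        return "side"      # whole ball is off the plate to either side
    return "in_zone"       # some part of the ball touches the front face


# Made-up front-edge locations, each with a 1.5 to 3.5 foot zone.
labels = [
    front_edge_miss(0.0, 3.8, 3.5, 1.5),
    front_edge_miss(0.95, 2.5, 3.5, 1.5),
    front_edge_miss(0.2, 2.4, 3.5, 1.5),
]
```

Tallying these labels over the pitches that later entered the zone would give the kind of high-versus-side breakdown quoted above.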
Finally, we can break the analysis down by pitch type to see what types of pitches were the most likely to be backdoor strikes.

PITCH PERCENTAGES

| Pitch Type | % Of All Pitches | % Of Backdoor Strikes |
|------------|------------------|-----------------------|
| Fastball   | 36.1             | 33.1                  |
| Cutter     | 6.2              | 8.1                   |
| ChangeUp   | 8.4              | 4.6                   |
| Sinker     | 21.9             | 13.6                  |
| Slider     | 13.9             | 16.9                  |
| Curveball  | 11.1             | 21.7                  |

As the table shows, all pitches have the possibility of being backdoor strikes, but some more than others. Fastballs (both four-seam and two-seam) represent 36 percent of all pitches, so with the volume of fastballs thrown, it is expected that they would also make up the largest share of backdoor strikes.

Looking further down the table, it turns out that the first thought of the backdoor slider does hold water. Curveballs and sliders are disproportionately likely to be backdoor strikes. They are thrown only about 11 percent and 14 percent of the time, respectively, yet they represent 21.7 percent and 16.9 percent of all the backdoor strikes. Curves falling into the strike zone makes intuitive sense because a curve thrown with topspin that is too high at the front edge of the plate will have both gravity and the Magnus force pushing the ball down toward the strike zone.

In this brief look at the strike zone in 3D, I was able to conclude that if a pitch was not a strike at the front edge of home plate, it was much less likely to be called a strike. That seems to be a determining factor for umpires. Granted, the umpire almost certainly must pick a reference point for his strike zone, as it seems impossible for an ump to track the ball over the 1.4 feet of home plate and make a call based on that trajectory. The front edge of the plate seems pretty clearly to be the reference point of choice.
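As a side note on the pitch-type table above, one way to make "disproportionately likely" concrete is to divide each pitch type's share of backdoor strikes by its share of all pitches. The sketch below does that with the table's own percentages; the ratio itself and the names `pitch_shares` and `overrepresentation` are illustrative assumptions, not something computed in the study.

```python
# (share of all pitches, share of backdoor strikes), both in percent,
# copied from the pitch-type table.
pitch_shares = {
    "Fastball": (36.1, 33.1),
    "Cutter": (6.2, 8.1),
    "ChangeUp": (8.4, 4.6),
    "Sinker": (21.9, 13.6),
    "Slider": (13.9, 16.9),
    "Curveball": (11.1, 21.7),
}


def overrepresentation(shares):
    """Ratio of backdoor-strike share to overall share for each pitch type."""
    return {
        name: backdoor / overall
        for name, (overall, backdoor) in shares.items()
    }


ratios = overrepresentation(pitch_shares)

# Print pitch types from most to least over-represented.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:10s} {ratio:.2f}")
```

A ratio above 1 means the pitch type shows up among backdoor strikes more often than its overall usage would suggest. By this measure the curveball stands out at roughly twice its usage share, with the cutter and slider also above 1 and the changeup and sinker well below it.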
Also, I showed that breaking balls fall into the 3D strike zone disproportionately often, supporting the notion of the backdoor slider.

However, this should certainly not be the end of this study. In the future, I would like to analyze catchers and umpires to see which catchers are stealing the most strikes for their teams and which umpires are actually calling the most accurate games behind the plate.

References & Resources

I would like to thank Dr. Alan Nathan for his guidance and Harry Pavlidis for access to the PitchInfo Database.