# Digging Deep Into Inside Edge’s Fielding Data

Inside Edge Scouting Services embodies the never-ending pursuit to quantify every aspect of baseball. With the introduction of Statcast data into the public sphere of baseball analysis this year, Inside Edge finds itself with MLB-sponsored competition. The two can work in harmony, though, to help expand our understanding of how to put a price on defensive aptitude.

For every single ball put into play, the wizards at Inside Edge spatially code it, like plotting a landmark on a map. FanGraphs then uses these data to populate both hitters’ spray charts and defensive fielding charts. As an example, here are two Mike Trout charts:

FanGraphs adeptly explains the ins and outs of Inside Edge Fielding, but here’s a 30,000-foot overview. Each batted ball is assigned a percentage chance (or probability) it could have been fielded cleanly. Inside Edge established a series of bins into which batted balls of a particular probability are sorted: Impossible (0%), Remote (1-10%), Unlikely (10-40%), Even (40-60%), Likely (60-90%), and Routine (90-100%). I will henceforth refer to these bins as “levels of difficulty,” for ease of understanding.

This methodology differs from that of Baseball Info Solutions (BIS), another prominent provider of fielding data that is used in the calculation of Defensive Runs Saved (DRS) and Ultimate Zone Rating (UZR). While BIS’ metrics can be acutely mathematical — bordering on downright complicated — Inside Edge takes a simpler (but not elementary) and perhaps more intuitive approach to quantifying a player’s defensive aptitude.

It might be too much to ask Inside Edge to create 101 separate levels of difficulty, one for each percentage chance from zero percent to 100 percent. It’s almost too granular and possibly too burdensome to feature comprehensively on a site such as FanGraphs. But the lack of granularity inherent to providing six levels of difficulty dilutes our collective comprehension of the *exact* chances a player might have to convert a particular play.

What we *can* do, instead–and, by no coincidence, what I’ve done here for you, because I love you–is:

- Take what is now four years’ worth of binned fielding data.
- Aggregate it.
- Examine major league-wide trends.
- Reduce them back down to the player level, identifying historical* outliers.
- Assess whether WAR (wins above replacement) adequately accounts for the difficulty level of each player’s defensive season.

*\*As historical as data can be when they only date back to 2012.*

To reiterate, Inside Edge characterizes each ball in play as a particular level of difficulty. It also tracks the frequency with which a player converted that type of play, calculated as a percentage as well. Thus, we can actually verify if a player converted the appropriate number of balls in play within each level of difficulty. For example, a fielder should convert anywhere from 90 percent to 100 percent of the balls in play that Inside Edge classifies as “Routine.” A player who converts fewer than 90 percent of “Routine” balls in play should set off some mental alarm bells.

Inside Edge’s levels of difficulty and their lack of granularity make it difficult on an individual basis to know exactly how many batted balls a player *should have* converted. Let’s say our favorite center-fielding Kevins, of houses Kiermaier and Pillar, field 20 plays. All 20 of Kiermaier’s opportunities are characterized as having a 95 percent chance of being fielded cleanly. Meanwhile, eight of Pillar’s opportunities are characterized as being perfectly easy to convert (100 percent) while the other dozen are a bit harder, each assigned a 90 percent chance of being fielded cleanly. We can calculate the weighted averages, or expected values, for each:

- Kevin Kiermaier: (20 * 95%) ÷ 20 = 95%
- Kevin Pillar: ((8 * 100%) + (12 * 90%)) ÷ 20 = 94%

By Inside Edge’s numbers, the plays Pillar fielded were, on average, slightly more difficult to convert than those that Kiermaier fielded. But because of the aforementioned lack of granularity, all 40 of those batted balls will appear similarly and generally routine to the public.
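The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the `expected_rate` and `bin_label` helpers are hypothetical names, and the way `bin_label` treats boundary values (such as exactly 90 percent) is an assumption, since the published bin edges overlap.

```python
def expected_rate(plays):
    """Weighted average conversion probability for (count, probability) pairs."""
    total = sum(count for count, _ in plays)
    return sum(count * prob for count, prob in plays) / total


def bin_label(prob):
    """Map a probability to an Inside Edge bin; boundary handling is an assumption."""
    if prob == 0:
        return "Impossible"
    if prob <= 0.10:
        return "Remote"
    if prob <= 0.40:
        return "Unlikely"
    if prob <= 0.60:
        return "Even"
    if prob < 0.90:
        return "Likely"
    return "Routine"


# Opportunities from the example: (number of plays, assigned probability).
kiermaier = [(20, 0.95)]
pillar = [(8, 1.00), (12, 0.90)]

kiermaier_rate = expected_rate(kiermaier)
pillar_rate = expected_rate(pillar)

# Every one of the 40 plays lands in the same public-facing bin.
labels = {bin_label(prob) for _, prob in kiermaier + pillar}
print(round(kiermaier_rate, 2), round(pillar_rate, 2), labels)
```

The two expected rates differ by a percentage point, yet the set of bin labels contains a single entry, which is the loss of information the paragraph above describes.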

There’s a way to circumvent these issues, though, and it (finally) gets to the heart of this post. I aggregated all Inside Edge fielding data for each season, from 2012 through 2015, and parsed them by defensive position. Accordingly, we can glean the following:

- We can observe not only the actual conversion rates of plays but also the frequencies with which plays occur within each level of difficulty.
- We can observe fluctuations in the difficulty of fielding plays from year to year.
- We can quantify which defensive positions are most difficult. In other words, Inside Edge’s levels of difficulty do not treat each defensive position equally.
- And, perhaps most importantly, we can identify which players were subjected to uncharacteristically difficult (or easy) defensive seasons relative to their positional colleagues.

### Inside Edge Fielding, by Year (2012-15)

Having addressed the de facto table of contents, let’s look at how frequently plays occur within each level of difficulty:

| Year | Impossible (0%) | Remote (1-10%) | Unlikely (10-40%) | Even (40-60%) | Likely (60-90%) | Routine (90-100%) |
| --- | --- | --- | --- | --- | --- | --- |
| 2012 | 7.8% | 3.2% | 3.0% | 3.7% | 6.4% | 74.2% |
| 2013 | 9.9% | 2.4% | 3.6% | 3.2% | 5.8% | 75.2% |
| 2014 | 7.1% | 5.6% | 4.4% | 3.2% | 6.2% | 73.4% |
| 2015 | 7.3% | 5.3% | 3.7% | 2.8% | 5.6% | 75.4% |
| Total | 8.4% | 4.1% | 3.7% | 3.2% | 6.0% | 74.6% |

Generally, the frequencies at which we observe each level of difficulty remain fairly stable throughout the four-year sample. There’s a peculiar spike in impossible plays accompanied by a dip in remote plays in 2013, but other than that, everything is pretty consistent. Most plays are routine, which makes sense, given offense in baseball is predicated mostly on failure.

Next, the average conversion rate of plays by difficulty:

| Year | Impossible (0%) | Remote (1-10%) | Unlikely (10-40%) | Even (40-60%) | Likely (60-90%) | Routine (90-100%) |
| --- | --- | --- | --- | --- | --- | --- |
| 2012 | 0.0% | 8.0% | 27.7% | 56.3% | 79.5% | 97.9% |
| 2013 | 0.0% | 8.2% | 30.5% | 56.2% | 81.7% | 98.0% |
| 2014 | 0.0% | 4.6% | 27.5% | 59.1% | 79.7% | 97.8% |
| 2015 | 0.0% | 4.5% | 29.1% | 50.1% | 77.9% | 97.9% |
| Total | 0.0% | 5.8% | 28.7% | 55.6% | 79.7% | 97.9% |

It’s good to see that all impossible plays are actually impossible. And it’s good to see elsewhere that the major league-average play-conversion rates within each level of difficulty actually *fall* within the range of probabilities stated by each bin.

But it’s important to note that the conversion rates above are not centrally located within each range. For example, the average probability of converting a play classified as “Even (40-60%)” is actually closer to 55 percent, not the 50/50 coin flip we might assume would develop via a normal distribution. In fact, the frequencies in the previous table indicate that the distribution of play difficulty skews strongly left (with a slight bump in the left tail), so plays pile up toward the easier end of the scale. Accordingly, the mean conversion rate within a bin tends to exceed that bin’s midpoint.

We can have some fun with these data as they pertain to errors, as formally ruled by official scorekeepers:

| Year | Plays | Errors | Fld% | x-Success | Sub-60% Errors |
| --- | --- | --- | --- | --- | --- |
| 2012 | 108,936 | 3,002 | .972 | 80.9% | -0.6% |
| 2013 | 109,267 | 2,745 | .975 | 81.5% | -0.1% |
| 2014 | 110,028 | 2,912 | .974 | 80.2% | -1.0% |
| 2015 | 108,208 | 2,821 | .974 | 80.9% | -1.1% |
| Total | 436,439 | 11,480 | .974 | 80.9% | -0.7% |

*Plays, errors, and fielding percentage (Fld%) are intuitive, but in case you need a refresher: 1 – (Errors ÷ Plays) = Fld%*

Expected success rate, or “x-Success,” calculates the overall success rate as a weighted average of each conversion rate by level of difficulty, according to their respective frequencies. It’s the expected value, in nerdy math speak. For the mathematically disinclined, expected value can be calculated as:

$$\mathrm{EV}(x) = \sum_{i} p(x_i)\, x_i$$

…where $x_i$ is the outcome and $p(x_i)$ is the probability that $x_i$ will occur.

For the sake of this post, $p(x_i)$ (probability) represents frequency and $x_i$ (outcome) represents difficulty. Subscript $i$ denotes a series of outcomes: in this case, the various difficulty bins previously described. Sigma (Σ) denotes summation. Ultimately, the long-form version of x-Success would be written as:

$$\text{x-Success} = \left[\text{frequency}_{\text{Impossible}} \times \text{difficulty}_{\text{Impossible}}\right] + \left[\text{frequency}_{\text{Remote}} \times \text{difficulty}_{\text{Remote}}\right] + \dots + \left[\text{frequency}_{\text{Routine}} \times \text{difficulty}_{\text{Routine}}\right]$$
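As a rough check on the formula, here is a minimal Python sketch that recomputes x-Success for two seasons from the rounded percentages in the two tables above. It reads each bin’s “difficulty” as the league conversion rate for that bin, which is an interpretation on my part; the function and variable names are hypothetical.

```python
BINS = ["Impossible", "Remote", "Unlikely", "Even", "Likely", "Routine"]

# Share of plays in each bin, copied from the frequency table (2012 and 2015 rows).
FREQUENCY = {
    2012: [0.078, 0.032, 0.030, 0.037, 0.064, 0.742],
    2015: [0.073, 0.053, 0.037, 0.028, 0.056, 0.754],
}

# League conversion rate in each bin, copied from the conversion table.
CONVERSION = {
    2012: [0.000, 0.080, 0.277, 0.563, 0.795, 0.979],
    2015: [0.000, 0.045, 0.291, 0.501, 0.779, 0.979],
}


def x_success(frequencies, conversions):
    """Expected success rate: sum over bins of frequency times conversion rate."""
    return sum(f * c for f, c in zip(frequencies, conversions))


x_success_by_year = {
    year: x_success(FREQUENCY[year], CONVERSION[year]) for year in FREQUENCY
}
for year, value in x_success_by_year.items():
    print(year, f"{value:.1%}")
```

Because the inputs are the rounded figures printed in the tables, results can drift from the published x-Success column by about a tenth of a percentage point.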

x-Success is not perfectly precise, given the inherent inexactitude of Inside Edge’s categorization process, but it’s close. It’s comforting to see the numbers so consistent from year to year, but a critical look might undo the numbers altogether. If x-Success represents the percentage of total plays converted into outs, then its complement (one minus x-Success) should equal the league’s batting average on balls in play. Take the difference and, well, you know as well as I do that the major leagues didn’t hit below the Mendoza Line last year.

I asked FanGraphs’ resident data guru Jeff Zimmerman to investigate the discrepancies. Fortunately, he noted that deep within FanGraphs’ database lies a handful of batted balls that were classified as impossible and were not assigned a fielder to be held accountable for it, likely due to the sheer nature of each play’s impossibility. Thus, the total number of plays that turned up for me in my dig through FanGraphs’ leaderboards fell short of the number of plays that actually occurred. Given the major league batting average has hovered around .250, we can assume with non-zero comfort that roughly five percent of all plays are both (1) impossible and (2) not assigned a fielder.

The final column in the table above, “Sub-60% Errors” (labeled “Errors < 60%” in the positional tables below), requires a little bit of imagination. Players commit errors–thousands of them every year–but we don’t know the level of difficulty for each error. We *can* assume, though, that scorekeepers assign errors only to the easiest plays, sequentially: first the missed routine plays; once those are accounted for, the missed likely plays; and so on. “Errors < 60%” therefore tries to guess how many errors are credited (debited?) to fielders on plays considered “Even (40-60%)” or harder, as a percentage of plays remaining:

((# of errors) – (# of missed plays: Routine, Likely)) ÷ (# of plays: Even, Unlikely, Remote, Impossible)

For example, an “Errors < 60%” of 10 percent means 10 percent of all Even, Unlikely, Remote, and Impossible plays, whether for the league or the individual player, are ruled as errors. By this methodology, the values at or just below zero in the last column indicate that, generally, no relatively difficult plays are ruled as errors. In other words, the number of easy plays (Routine, Likely) that went unconverted exceeds the number of errors committed. It’s a pleasant finding, a confirmation of reasonable expectations about scorers and/or Inside Edge’s standards by which plays are classified. But we’ll see soon that the finding breaks down when the data are further disaggregated.
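The sequential allocation can be sketched as follows. This is a simplified illustration, not the exact calculation behind the tables: it takes the easy-play term to mean Routine and Likely plays that were not converted, which is an interpretation, and the function name and sample counts are hypothetical.

```python
def errors_below_60(errors, missed_easy, harder_plays):
    """Share of harder plays (Even or tougher) that would be charged as errors.

    errors:       total errors charged by official scorers
    missed_easy:  unconverted Routine and Likely plays, assumed to absorb errors first
    harder_plays: number of Even, Unlikely, Remote and Impossible plays
    """
    spillover = errors - missed_easy  # errors left over after the easy misses
    return spillover / harder_plays


# Hypothetical player: 12 errors, 8 missed easy plays, 40 harder plays.
player_rate = errors_below_60(errors=12, missed_easy=8, harder_plays=40)

# Hypothetical league-like case: missed easy plays outnumber errors,
# so the result is negative and no hard plays need to absorb an error.
league_rate = errors_below_60(errors=2800, missed_easy=2900, harder_plays=20000)

print(f"{player_rate:.1%}", f"{league_rate:.1%}")
```

A positive result suggests some errors were charged on plays that were not easy to begin with; a zero or negative result suggests the easy misses alone can account for every error.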

### Inside Edge Fielding, by Position

The following tables depict the play frequency and conversion, respectively, by difficulty and defensive position:

| Position | Impossible (0%) | Remote (1-10%) | Unlikely (10-40%) | Even (40-60%) | Likely (60-90%) | Routine (90-100%) |
| --- | --- | --- | --- | --- | --- | --- |
| P | 2.2% | 6.0% | 4.5% | 6.4% | 12.7% | 68.2% |
| C | 7.9% | 16.1% | 31.1% | 8.0% | 5.7% | 31.2% |
| 1B | 1.2% | 3.0% | 2.4% | 3.4% | 7.3% | 82.7% |
| 2B | 1.8% | 3.6% | 2.5% | 3.0% | 5.7% | 83.4% |
| SS | 2.2% | 4.5% | 2.9% | 3.1% | 6.2% | 81.1% |
| 3B | 2.1% | 4.9% | 3.6% | 4.6% | 9.1% | 75.7% |
| LF | 21.2% | 2.7% | 2.1% | 2.1% | 4.0% | 67.9% |
| CF | 15.7% | 2.6% | 1.7% | 1.5% | 3.2% | 75.2% |
| RF | 20.5% | 2.6% | 2.1% | 2.0% | 3.7% | 69.2% |
| Total | 8.4% | 4.1% | 3.7% | 3.2% | 6.0% | 74.6% |

I highlighted some of the interesting outliers. Outfielders are subject to far more impossible plays than infielders; I imagine these are typically deep shots to the gap and line drives/Texas Leaguers that fall in over the heads of corner infielders down the baselines.

Also — and, perhaps, intuitively — catchers play probably the most difficult position on the field (although that point can be debated endlessly). What caught my attention, though, is that center fielders are subject to more “Routine” and “Likely” plays than the corner outfield spots. Part of that might be selection bias; center fielders are typically highly athletic and, thus, liable to convert more plays than a different type of player (say, the prototypical bat-first left fielders). Maybe the selection bias is enough to account for five percentage points of batted balls, but maybe there also exists reason to believe we are overvaluing the need for defensive prowess in center fielders. Don’t shoot the messenger, it’s just what Inside Edge’s data show. Regardless, it’s a peculiar development.

| Position | Impossible (0%) | Remote (1-10%) | Unlikely (10-40%) | Even (40-60%) | Likely (60-90%) | Routine (90-100%) |
| --- | --- | --- | --- | --- | --- | --- |
| P | 0.0% | 10.5% | 31.5% | 65.2% | 83.9% | 96.2% |
| C | 0.0% | 8.7% | 30.1% | 57.2% | 79.6% | 96.6% |
| 1B | 0.0% | 3.9% | 21.9% | 55.1% | 79.4% | 97.5% |
| 2B | 0.0% | 4.5% | 28.1% | 57.1% | 80.3% | 98.1% |
| SS | 0.0% | 3.9% | 26.4% | 48.6% | 76.9% | 97.2% |
| 3B | 0.0% | 3.8% | 26.4% | 54.9% | 75.8% | 96.2% |
| LF | 0.0% | 6.3% | 29.8% | 55.3% | 80.9% | 99.1% |
| CF | 0.0% | 7.0% | 32.1% | 55.8% | 83.7% | 99.4% |
| RF | 0.0% | 5.3% | 30.2% | 53.9% | 83.7% | 99.1% |
| Total | 0.0% | 5.8% | 28.7% | 55.6% | 79.7% | 97.9% |

Again, I highlighted any unusual details. Pitchers convert remote plays more than 10 percent of the time and even plays more than 65 percent of the time–conversion rates higher than what’s expected of each difficulty level by definition. I don’t know if there’s a takeaway here, other than some plays that pitchers field are, in theory, classified incorrectly. (For the sake of this exercise, it’s a non-issue.)

| Position | Fld% | x-Success | Errors < 60% |
| --- | --- | --- | --- |
| P | .940 | 82.5% | 7% |
| C | .930 | 50.0% | 8% |
| 1B | .970 | 88.9% | 0% |
| 2B | .978 | 88.9% | 0% |
| SS | .969 | 86.0% | 0% |
| 3B | .959 | 83.3% | 0% |
| LF | .989 | 72.5% | 0% |
| CF | .991 | 79.1% | 0% |
| RF | .988 | 73.5% | 0% |
| Total | .974 | 80.9% | 0% |

Despite the fact that easy plays outnumber errors as a whole for the league, the same does not hold true specifically for pitchers and catchers, who incur a high ratio of errors on plays classified as “Even” or harder. It seems that, per Inside Edge’s standards, pitchers and catchers are not held to the same standards as the other defensive positions. While catchers *are* subject to difficult plays more frequently than other positions (they should be expected to convert only half of their plays!), their fielding percentages may unfairly punish them to an extent.

### The Easiest and Most Difficult Defensive Player-Seasons, by Position

Using the information above, we can determine the most difficult defensive seasons from 2012 through 2015 by comparing each player’s expected success rate to the league average in a given year at a given position. To compare, I calculated Z-scores of expected success rates by player, year and position.
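A minimal sketch of that standardization step, under stated assumptions: the sample values are invented, the population standard deviation is used because the post does not say which version was applied, and the sign is flipped so that a lower expected success rate (tougher chances) produces a positive difficulty score, matching the direction of the tables that follow.

```python
from statistics import mean, pstdev


def difficulty_z_scores(x_success_by_player):
    """Z-score each player's expected success rate within one year/position group.

    The sign is flipped so that a lower expected success rate (harder plays)
    yields a positive difficulty score; that orientation is an assumption.
    """
    values = list(x_success_by_player.values())
    mu = mean(values)
    sigma = pstdev(values)  # population SD; the post does not say which was used
    return {name: -(x - mu) / sigma for name, x in x_success_by_player.items()}


# Hypothetical x-Success values for one group, e.g. shortstops in a single season.
group = {"Player A": 0.82, "Player B": 0.86, "Player C": 0.90}
z = difficulty_z_scores(group)

for name, score in z.items():
    print(name, round(score, 2))
```

Each year-and-position group would be passed through the function separately, so a shortstop is only ever compared with other shortstops from the same season.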

First, the hardest defensive seasons, minimum 100 plays:

| Position | Player | Season | Errors | “Deserved” Errors | Difficulty (Z-Score) | Def |
| --- | --- | --- | --- | --- | --- | --- |
| C | Jarrod Saltalamacchia | 2014 | 15 | 10 | 0.6 | 4.6 |
| 1B | Yonder Alonso | 2014 | 2 | 2 | 0.6 | -1.4 |
| 2B | Skip Schumaker | 2013 | 5 | 4 | 0.8 | -14.2 |
| SS | Tyler Pastornicky | 2012 | 7 | 7 | 1.2 | -10.3 |
| 3B | Nick Castellanos | 2014 | 15 | 18 | 0.5 | -16.3 |
| LF | Carlos Gonzalez | 2012 | 4 | 6 | 0.7 | -12.6 |
| CF | Yasiel Puig | 2014 | 2 | 0 | 1.1 | -5.7 |
| RF | George Springer | 2014 | 7 | 2 | 0.6 | -7.9 |

Def: defensive value, per FanGraphs

“Deserved” errors according to “Errors < 60%” method

Pitchers excluded; none exceeded minimum play threshold

It’s no coincidence that the players above who were subjected to some of the most difficult defensive seasons in recent memory also generated the worst yearly defensive values of their careers. Defensive value is not a rate statistic, but even as a ratio of playing time, all of Schumaker, Pastornicky, Castellanos, Gonzalez, Puig and Springer experienced the worst defensive seasons of their careers. Puig and Springer, two of the game’s young phenoms, have markedly improved their defense since 2014. But such improvements may not be entirely self-manifested — their overall defensive difficulty has eased up, too. Ironically, even though Castellanos’ horrid defense in 2014 is partly vindicated by the sheer difficulty of the plays he encountered, he still butchered a lot of easy plays. Let’s not give him too much credit.

| Position | Player | Season | Errors | “Deserved” Errors | Difficulty (Z-Score) | Def |
| --- | --- | --- | --- | --- | --- | --- |
| C | Matt Wieters | 2013 | 3 | 5 | -0.7 | 14.8 |
| 1B | Steve Pearce | 2014 | 1 | 0 | -0.5 | 5.4 |
| 2B | Brian Roberts | 2013 | 1 | 5 | -0.6 | 1.3 |
| SS | Zack Cozart | 2015 | 3 | 7 | -0.8 | 4.4 |
| 3B | Eric Chavez | 2012 | 5 | 8 | -0.6 | -2.3 |
| LF | Brandon Guyer | 2015 | 0 | 2 | -0.9 | 0.2 |
| CF | Chris Young | 2013 | 0 | 2 | -0.6 | -1.5 |
| RF | Hunter Pence | 2015 | 3 | 2 | -0.6 | 2.5 |

Def: defensive value, per FanGraphs

“Deserved” errors according to “Errors < 60%” method

Pitchers excluded; none exceeded minimum play threshold

The defensive seasons above aren’t especially impressive, but even Pearce, Roberts, Cozart, Guyer and Pence were subjected to their easiest defensive seasons since 2012.

There’s enough evidence here to inspire further research. I ran a regression that measured the correlation between each player’s difficulty Z-score and his defensive value (Def) relative to playing time (in other words, converted into a rate statistic). I limited the sample to players who fielded at least 100 plays at a particular position. For players who fielded at least 100 plays at more than one position, I kept only the most frequently-played position and omitted the rest. This left me with a sample of roughly 1,100 player-seasons from 2012 through 2015.

The regression yielded a slightly negative correlation coefficient (r = -0.180), indicating the difficulty of a player’s defensive season moves negatively with his defensive value, a major component of his WAR. When controlling for player fixed effects (in which difficulty is specified as the independent variable and defensive value the dependent variable), the regression produces an adjusted r-squared of 0.5616. The coefficient estimate for Z-Score conveys that holding all else constant, a player’s defensive value decreases (or increases) by one run per 100 plate appearances for every standard deviation above (below) the mean that the difficulty of his defensive season ranks.
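For readers who want to try something similar, here is a small pure-Python sketch of the two quantities described above: a plain Pearson correlation, and a slope estimated after demeaning both variables within each player (the standard “within” estimator for player fixed effects). The rows are made-up numbers chosen so the arithmetic is easy to follow; they are not drawn from the sample used in this post, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def within_player_slope(rows):
    """OLS slope of Def rate on difficulty after demeaning both within each player.

    rows: iterable of (player, difficulty_z, def_rate) tuples.
    """
    by_player = defaultdict(list)
    for player, z_score, def_rate in rows:
        by_player[player].append((z_score, def_rate))
    num = den = 0.0
    for seasons in by_player.values():
        mz = mean(z for z, _ in seasons)
        md = mean(d for _, d in seasons)
        num += sum((z - mz) * (d - md) for z, d in seasons)
        den += sum((z - mz) ** 2 for z, _ in seasons)
    return num / den


# Hypothetical player-seasons: (player, difficulty Z-score, Def per unit of playing time).
rows = [
    ("A", -1.0, 3.0), ("A", 1.0, 1.0),
    ("B", 0.0, -2.0), ("B", 2.0, -4.0),
]
r = pearson_r([z for _, z, _ in rows], [d for _, _, d in rows])
slope = within_player_slope(rows)
print(round(r, 3), slope)
```

Demeaning within each player strips out how good a fielder is on average, so the slope reflects only how his defensive value moves when his own chances get harder or easier from one season to the next.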

One could argue, then, that Castellanos was docked eight to nine runs in defensive value simply because of how difficult his defensive season was relative to the rest of the league’s third basemen. Granted, he still had himself an atrocious season at the hot corner. But the eight-or-so runs you give back to him put him very close to the defensive value he generated in 2015 in roughly the same amount of time–still bad, but not *as* bad.

### Bonus Content! 2016 Defense So Far

Here are 2016’s easiest and most difficult defensive seasons thus far:

| Position | Player | Difficulty (Z-Score) | Def |
| --- | --- | --- | --- |
| C | Dioner Navarro | 1.2 | -0.1 |
| 1B | Eric Hosmer | 0.9 | -12.7 |
| 2B | Brian Dozier | 2.4 | -1.6 |
| SS | Asdrubal Cabrera | 2.3 | 0.4 |
| 3B | Matt Duffy | 0.9 | 8.2 |
| LF | Melky Cabrera | 0.9 | -4.8 |
| CF | Jackie Bradley Jr. | 0.6 | -0.3 |
| RF | Matt Kemp | 1.6 | -10.1 |

Minimum 500 innings played at position, except catcher (300 innings)

Note that Navarro, Hosmer and Dozier are all on pace to post the worst defensive seasons of their careers–and that’s saying a lot for the defensively-challenged Hosmer. Meanwhile, Bradley is on pace to post the only negative defensive value of his career, aside from his very brief debut in 2013.

| Position | Player | Difficulty (Z-Score) | Def |
| --- | --- | --- | --- |
| C | Carlos Perez | -1.6 | 5.5 |
| 1B | Freddie Freeman | -1.0 | -4.4 |
| 2B | Ben Zobrist | -1.4 | 2.9 |
| SS | Corey Seager | -1.7 | 7.6 |
| 3B | Adrian Beltre | -1.8 | 8.9 |
| LF | Brett Gardner | -0.7 | -1.8 |
| CF | Mike Trout | -1.2 | 1.4 |
| RF | Mookie Betts | -0.7 | 1.7 |

Minimum 500 innings played at position, except catcher (300 innings)

Meanwhile, Freeman is roughly on pace to post his second-best defensive season, Betts hasn’t even reached the halfway point of what already *is* his best season, and Beltre is turning the clock all the way back to 2004.

That’s not to say all outliers behave similarly. Despite the difficulty of his defensive season, Duffy is on pace to generate the most defensive value of his career. But he has also played excellently, converting far more of his difficult plays than would normally be expected of him.

### Epilogue

I originally sought to better understand Inside Edge’s defensive data of my own accord and articulate the context of the data here. But my exploration evolved when anecdotal evidence seemed to turn into something more. En route, I offered here limited but still quantitative evidence that WAR, as we now calculate it, fails to properly account for the difficulty of defensive seasons, at least in the tail ends of the distribution of difficulties. Further analysis may further illuminate my findings or invalidate them completely. Such is life. But I am curious to know how much farther we can take this research.

Into the picture steps Statcast. Baseball fans now have so many new tools at their disposal. Major League Baseball uses cameras and all sorts of fancy technology to measure every player’s reaction time, acceleration, peak velocity, route efficiency, and an endless number of other variables for every single play. Work is already being done to control for a player’s starting position as well as the velocity and trajectory of the batted ball.

The possibilities are endless. It’s almost certain we’ll reach a point where we’ll scoff at how primitive our attempts to quantify fielding used to be. There will be no more guesswork, no more eye test. I don’t know if this is something Inside Edge will be able to publicly offer us. I don’t know if it’s in its best interest to try to compete directly with MLB in this sense.

Regardless, we’re not there yet. We have Daren Willman and Mike Petriello and now Tom Tango to man the helm of the U.S.S. Statcast, but their research, no matter how impressive it is and will be, is still in its fledgling stages. We may not have any combination of reliable, consistent, and public defensive metrics for years, let alone in 2016. Meanwhile, Inside Edge provides plenty of quality data; for a privately owned company collecting and generating its own proprietary data, the fact that we get to see any of it is a treat. And there’s no saying yet, definitively, whether it is better or worse than Statcast.

Until then, let us appreciate the data we currently have at our disposal before marveling at the possibilities the future holds for us. Old dogs can still learn new tricks, after all.

### References & Resources

- Inside Edge, FanGraphs
- FanGraphs Library, “Inside Edge Fielding”

Great stuff, since fielding data is in its early stages. But how do we know the folks at Inside Edge are “wizards?”

The lack of granularity is indeed a big issue.

Nice article but I’m not sure I agree with the conclusion. What your data may just as well be showing is the bias of IE’s scouts – plays that don’t get made get labeled as being more difficult than those that do (i.e. what looks ‘impossible’ for Asdrubal Cabrera may look ‘even’ for Andrelton Simmons). This is also the issue with BIS data that is used to calculate UZR / DRS but it could be that it is to a smaller degree. Agree that Statcast will be the ultimate arbiter here.

Agreed completely. I’m incredibly interested in the subconscious/cognitive biases that affect “difficulty” judgment. Ideally, the person who codes each play is comparing it to all identical (or virtually identical) plays that came before it and adopting that success rate. That would be a somewhat sound rationale. If they’re assigning all willy-nilly, there will be issues. Regardless of methodology, one might expect the noise to smooth out over the course of several hundred plays, but who knows.

I thought the same thing. I think it is probably partially this and partially what Alex is concluding – that DRS and UZR are in fact biased by the difficulty of a player’s chances (and not properly accounted for). In what proportion, I have no idea.

I’m guessing the “high” percentage of Remote plays made by pitchers were on liners through the box that just happen to hit the glove and stick.

That’s likely causing some of the difference in the pitching fielding classification. I’d also add that it’s likely that many of the pitcher’s tougher missed plays are being fielded by other players, so there’s a bias against difficult pitcher’s plays being included in the sample, unless they successfully field the ball.

> Thus, the total number of plays that turned up for me in my dig through FanGraphs’ leaderboards fell short of the number of plays that actually occurred. Given the major league batting average has hovered around .250, we can assume with non-zero comfort that roughly five percent of all plays are both (1) impossible and (2) not assigned a fielder.

I am not sure what league batting average is representing in the above statement. A quick query of Retrosheet shows 126,000 or 127,000 balls in play for each of the years from 2012 to 2014. That would mean 16,000 to 17,000 “impossible and non-assigned” balls in play for each of those years, or about 13%. This also corresponds with the BABIP for those years of .288 to .289, which is the relevant number, not batting average.

The problem with this study is that it tells us nothing new. We have known for a long time that there are only about 75 to 95 Balls In Play per position per year that are not “routine play” balls or “impossible play balls”. The very best fielders making nearly all of them, the very worst nearly none of them, and an average fielder about half of them. Nothing in the Fielding Edge Data or this study helps us to identify those plays or the players that excel at making them. When or if we get the data that Statcast is collecting on hit ball landing positions, hang time, and fielder starting position we can begin to use the fielding metric already outlined by Greg Rybarzyk in his excellent 2010 Pitch Fx presentation when we all thought we might get Field Fx data. I estimate with the Statcast data you can expect a minor improvement in accuracy of 2 to 4 runs per year per fielder from our already very good fielding metrics.

At the risk of overloading this post with content, I omitted “fielding scores” for each player, which was a comparison of each player’s actual success rate to his expected success rate. I think that’s one of the points you’re making here:

> Nothing in the Fielding Edge Data or this study helps us to identify those plays or the players that excel at making them.

In other words, who’s more impressive: Player A, who succeeds 95% of the time but was expected to succeed 95% of the time, or Player B, who succeeds only 90% of the time but was expected to succeed only 75% of the time? Arguably, Player B. It’s just one more step (that I ultimately left out) to use the methodology above to determine who has succeeded most compared to the average player at a particular level of difficulty. Unfortunately, I don’t have the data in front of me to supply a couple of names. I may follow up with such information in posts at FanGraphs, though, if it interests you.

I really enjoyed this piece. Kudos on all the work and the insight it revealed.

Thanks for this. I’m still poring through the article, but I see at least one correction:

In a left skewed distribution, the mean is generally less than the median, not greater.

It’s appalling how often I make that mistake as someone who, given his background, should never make that mistake. I would edit it, but the post is locked. Forever! Appreciate the catch nonetheless.

This research screams for a “xDef” stat not unlike xFIP that one could use to normalize (especially in-season) WAR figures.

I really like blended FIP-WAR/RA9-WAR to evaluate pitching seasons. I think a similar treatment to Def folding in the findings from this article about expected impact of hard/easy plays could be useful. It would at least be a step forward from “beware small sample defensive metrics” where we currently reside.

I’m having trouble understanding a couple of variables in your calculation.

1. I can’t reproduce the “Sub-60% Errors” and “Errors < 60%” in your tables, based on the definition you gave in your formula:

((# of errors) – (# of plays: Routine, Likely)) ÷ (# of plays: Even, Unlikely, Remote, Impossible)

You show a value of 7% for pitchers, and note that for pitchers, easy plays do NOT outnumber errors. But unless I'm missing something, the data actually shows routine plays at 68.2%, which far exceeds the error % of 6% (based on a fielding % of 94%).

Intuitively, the number of errors has got to be much, much smaller than the number of routine plays, so I can't imagine the "Errors < 60%" variable ever being a positive (and useful) number.

2. Because I don't follow the "Errors < 60%" calculations, I don't understand the derivation of the "Deserved Errors."

3. What is the distribution that the "Difficulty (Z-score)" is measured against?

From reading the article, I thought you were looking at a distribution where each data point is a different player's average play difficulty. The "outliers" are the players with the highest and lowest difficulty values, which are just the maximum and minimum x values.

In which case, I would expect the z-scores for the minimum and maximum x values to be far outside the values you calculated, and certainly always far outside the (-1, 1) band.

4. Perhaps the above issues are peripheral to your central assertion, which I interpret to be that DEF values calculated from BIS data may not adequately reflect variations in the encountered difficulty of play from player to player.

My rudimentary knowledge of BIS fielding data is that it does adjust for difficulty of play, albeit in a mechanical way using standardized inputs. So the safe conclusion is that there may be systematic discrepancies in how BIS and Inside Edge assess play difficulty. I think additional investigation would be needed to conclude that Inside Edge results are more accurate/sensible.

Right: “Sub-60% Errors” and its “Errors < 60%” counterpart are not very intuitive. I knew I’d have trouble explaining it clearly. I don’t have the data in front of me, but here’s an example of something we might see:

- Routine plays (missed / total): 1 / 100
- Likely plays: 1 / 40
- All other (harder) plays: 10 / 80
- Errors: 4

This player has missed 12 / 220 total plays and made 4 errors. If we assume that official scorers allocate errors to the easiest plays (sequentially), then the one routine play he missed should be considered an error, as should be the one likely play he missed. Meanwhile, the data generally shows that there are about as many errors as there are plays missed between the “Routine” and “Likely” groups. Conversely, plays that are missed *outside* of these two groups maybe shouldn’t be considered errors.

It’s highly likely, then, that this player, who committed four errors, was tagged with two errors on plays with difficulties harder than “Likely.” Hence “Errors < 60%,” because the errors were attributed to harder-than-usual plays. They arguably “aren’t deserved,” which is, again, where the “Deserved Errors” tag comes from as well. Those two errors occurred on 10 total missed plays in all other difficulties harder than “Likely,” so the “Sub-60% Error” rate would be 20%.

* * *

The Z-score is calculated relative to the distribution within each intersection of year and position: for example, 2012 shortstops, 2015 left fielders, etc.

* * *

I’m not sure Inside Edge results *are* more accurate/sensible. Anecdotally speaking, I think defensive value accommodates difficulty fairly well, but it’s probably because most defensive seasons fall within the bulk of the distribution of difficulty (if that makes sense). It’s the tails (extremely easy or extremely hard seasons) with which the defensive valuations appear to struggle. Again, all anecdotally, since I haven’t done too much rigorous testing, although there is a moderate linear correlation between difficulty and defensive value when controlling for player.

* * *

LOTS to respond to, but all great questions. Let me know if something still isn’t clear!
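For anyone who wants to replay the “Sub-60% Error” arithmetic from the example above, here is a minimal Python sketch. It assumes, as the example does, that errors are charged to the easiest missed plays first; the function name and its parameters are hypothetical and are not part of any Inside Edge or FanGraphs tooling.

```python
def sub60_error_rate(missed_routine, missed_likely, missed_harder, errors):
    """Share of harder-than-Likely misses scored as errors.

    Assumption (from the worked example): official scorers allocate
    errors to the easiest missed plays first, Routine and then Likely.
    """
    # Errors absorbed by misses in the two easiest bins.
    absorbed = min(errors, missed_routine + missed_likely)
    # Any remaining errors must have landed on harder (<60%) plays.
    leftover = min(errors - absorbed, missed_harder)
    return leftover / missed_harder if missed_harder else 0.0


# The hypothetical player from the example: 1 Routine miss, 1 Likely miss,
# 10 misses on all harder plays, and 4 errors in total.
rate = sub60_error_rate(missed_routine=1, missed_likely=1,
                        missed_harder=10, errors=4)
print(f"{rate:.0%}")  # 20%
```

Two of the four errors are left over once the Routine and Likely misses are accounted for, and those two fall on 10 harder misses, which gives the 20% figure quoted above.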

Looks like I didn’t close an italics tag after the word “outside” in the previous comment. Sorry if it’s hard to read because of it. I goofed!

I was actually looking to work with IE Fielding Data for my own purposes, and you’ve done some of it for me already with your “SUCCESS RATES BY LEVEL OF DIFFICULTY AND POSITION” table.

THANKS

This is the type of fielding analysis article I have been hoping to read on FG/THT since the introduction of Inside Edge data. Have there been any other attempts to separate out “degree of difficulty” or fielding “opportunity” from UZR or DRS? This looks like a blind spot in determining true fielding talent with the available metrics, one that seems largely unacknowledged to this point. Maybe we’re all waiting for Statcast metrics to obliterate our current understanding of fielding—I don’t know.

Outfield defense is the place where the effect of opportunity seems to make the biggest difference. If you look at outfielders with at least 300 innings this year, a wide range exists between the top and bottom players in percentage of non-routine and non-impossible plays. The average outfielder sees 10.5% of his assigned chances in this 1–90% range, but individuals range from 3.4% (Hyun Soo Kim) to 21.6% (Matt Kemp).

Take Colby Rasmus and Stephen Piscotty, for instance. Both have played most of 2016, and they have done so rather similarly from a defensive standpoint. Both have been used occasionally in center field but have played primarily in the outfield corners. Both are rated somewhat positively in range by both UZR and DRS, with a moderate edge in DRS for both. They have made an almost-identical percentage of their “routine” plays (0.1% difference).

The major difference between the two has been that 11.2% (23/205) of Piscotty’s chances have fallen in that 1–90% range, both non-routine and non-impossible, a figure that is very slightly above league average. Rasmus, on the other hand, has seen only 5.8% such opportunities (10/171). I think it is thus no coincidence that Rasmus has +4.6 average range runs (average of UZR & DRS), while Piscotty has a higher +6.6 average.
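As a rough check on those shares, the counts quoted in this comment can be run through a few lines of Python. The function and variable names are made up for illustration, and the 10.5% league average is simply the figure cited earlier in this comment, not something recomputed here.

```python
LEAGUE_AVG_SHARE = 0.105  # average outfielder's share, as cited in this comment


def opportunity_share(non_routine_chances, total_chances):
    """Fraction of a fielder's chances in the 1-90% difficulty range."""
    return non_routine_chances / total_chances


# (chances in the 1-90% range, total chances), as quoted above
players = {"Piscotty": (23, 205), "Rasmus": (10, 171)}

for name, (hard, total) in players.items():
    share = opportunity_share(hard, total)
    # Also express each share relative to the cited league average.
    print(f"{name}: {share:.1%} of chances, "
          f"{share / LEAGUE_AVG_SHARE:.2f}x league average")
```

By this count Piscotty has seen roughly twice the share of non-routine, non-impossible chances that Rasmus has, which is the gap in opportunity being described.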

On one hand, you can easily say that UZR and DRS are recording what happened, and this is true. But I think if you want to know the difference in talent level (aside from needing way more innings for sample size purposes), you need to account for the difference in opportunity. Perhaps some of this effect is offset by the negative correlation you found between degree of difficulty and defensive runs, but I think you would be hard pressed to conclude from the 2016 data that Piscotty has shown definitively more range than Rasmus. (I’m realizing that I am using “you” a lot in the collective sense, not to imply that anyone specifically is making this claim.)

Using a different and hypothetical example, suppose the Braves somehow managed to bat Freddie Freeman 1,000 times this year without breaking the rules. Freeman’s .365 wOBA is basically his career average rate, and he is well-regarded but not considered one of the league’s very best hitters. Would 38 or so batting runs (if you project out his current production to 1,000 PA) cause us to conclude that he is more valuable at the plate than Mike Trout, David Ortiz, and the other league leaders who have produced nearly that much batting value in fewer than 400 PAs so far this year?

Intuitively we would not, but because the Braves can’t realistically send Freeman to bat twice as often as anyone else, opportunity is not something we generally have to weigh heavily when evaluating a hitter. With fielding, however, I think we really need to adjust for “opportunity,” “degree of difficulty,” or whatever you want to call it, when we decide who is having the best season. An “xDef” stat as Scott proposed above would be quite useful.

Alex, good article.

This:

….but maybe there also exists reason to believe we are overvaluing the need for defensive prowess in center fielders.

The reason we have the best fielder at CF has nothing (or almost nothing) to do with the difficulty of the plays, relative to RF and LF. It is simply that there are many more plays (after removing the impossible and routine) in CF.

If I’m reading the leaderboards correctly… Billy Hamilton has made 6 plays (60% of 10 chances) on balls categorized as remote (1-10%).

All other center fielders in MLB have made 5 such plays combined. One each for Cespedes, Jones, Jay, Pillar, and Cain.

I can say from experience, it is exceedingly difficult to watch a play and then categorize it.

So theoretically you could create an adjusted WAR based on the “expected” defensive WAR rather than their actual production? Akin to how pitching WAR uses FIP to say “this is what we expect to happen, given average defense”, our new adjWAR should say “this is what we expected our fielder to do, given an average distribution of balls in play at him.”

That makes sense to me, but someone please correct me if I’m wrong.
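One way to picture the “expected” idea is to apply a fielder’s success rate in each difficulty bin to an average mix of chances instead of the mix he actually saw. The Python sketch below does only that. Every number in it is invented for illustration, the names are hypothetical, and it is not how FanGraphs, BIS, or Inside Edge compute anything.

```python
# Inside Edge difficulty bins, omitting Impossible (0%), which adds no plays.
BINS = ["Remote", "Unlikely", "Even", "Likely", "Routine"]


def expected_plays(success_rates, average_mix, chances):
    """Plays a fielder would be expected to make if his chances
    were spread across bins according to an average mix."""
    return sum(success_rates[b] * average_mix[b] * chances for b in BINS)


# Hypothetical per-bin success rates for one fielder (not real data).
fielder_rates = {"Remote": 0.10, "Unlikely": 0.30, "Even": 0.50,
                 "Likely": 0.80, "Routine": 0.99}

# Hypothetical average share of chances in each bin (not real data).
average_mix = {"Remote": 0.02, "Unlikely": 0.02, "Even": 0.02,
               "Likely": 0.04, "Routine": 0.90}

print(round(expected_plays(fielder_rates, average_mix, chances=200), 1))
```

Comparing that expected total with the same calculation for an average fielder would give a difficulty-neutral plays-above-average figure, which would still need a run value per play before it could feed into anything like an adjusted WAR.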