Baseball’s Framing Wars Are Not Over

by Evan Davis
November 6, 2017

Tyler Flowers still isn’t getting the recognition he deserves. (via Editosaurus, Rich Bamford & Michelle Jay)

Tyler Flowers had a career year for the Atlanta Braves this season. The 31-year-old catcher posted career bests in whiff rate, contact rate, strikeout rate and isolated power percentage. A 120 wRC+ not only lapped his previous career high, but made him one of the best offensive catchers in baseball. By True Average, his .309 mark was only three points behind position leader Buster Posey. It was a banner year for Flowers at the plate.

But his hitting, improved though it was, was not what made Flowers special. It’s not why he received a cover story in the pages of The Ringer back in July. It’s not why he caught and held my attention all the way back in 2015, when he was helping Chris Sale rack up strikeouts. No, what makes Tyler Flowers a headline is his ability to frame a pitch better than just about anyone.

Flowers was always a great pitch framer. Below is a table of Flowers’s rates of saving or stealing called strikes above the average catcher, per Baseball Prospectus’s Called Strikes Above Average stat. For reference, I’ve also included where Flowers ranked among all major league catchers who caught at least 2,000 called pitches in that season. (That’s a pool of about 55-60 catchers each year, which captures the starters and the backups on each team.)

Tyler Flowers, CSAA, 2014-2017

YEAR	INNINGS CAUGHT	CSAA	MLB RANK
2014	1,052.0	0.6%	24th
2015	878.1	1.8%	8th
2016	686.0	1.7%	t-5th
2017	745.0	3.4%	2nd

SOURCE: Baseball Prospectus

If we look at the accumulated value of the last four seasons, Flowers has saved or stolen nearly 1.9 percent more called strikes than the average catcher, second only to Yasmani Grandal in that timeframe. Flowers has been a wizard.

Not only has Flowers been great, but he’s climbed the leaderboard every year. He exploded this season, grabbing pitches on the margins at a rate that only Los Angeles Dodgers backup Austin Barnes could eclipse. Fold in his offensive contribution, and Flowers was worth nearly six wins of value, according to BP’s WARP model, still the only publicly available wins above replacement model to incorporate framing value. Only Posey squeaked out a fraction of value more than Flowers did, and he needed a late-season cleanup of his framing mechanics to get there. For most of 2017, Flowers was the best catcher in baseball.

And yet, Flowers was left off the National League All-Star team. He wasn’t shortlisted as a potential reserve after fan voting closed. He likely won’t receive any down-ballot MVP votes, whereas his performance could justify a 10th-place finish. Perhaps most galling, he wasn’t declared a finalist for the Gold Glove award.

What gives? How could one of the best catchers in baseball be completely ignored by fans and media? Such behavior suggests that despite the gains framing has made among sabermetricians and front offices, its value is still not properly appreciated by the broader baseball public. That needs to change.

…

Jeff Sullivan declared last year in these very pages that “framing was doomed from the start.” His thesis was a sound one—once teams recognized the outsize value of framing, they would begin to develop and employ catchers who were good at it. This would compress the range from the best to the worst, thus making framing just another standard tool in the toolbox, rather than a market inefficiency to be chased down and taken advantage of. “The market is going to end up flooded with good-receiving catchers,” Sullivan declared. “By then we’ll no longer recognize them as good-receiving catchers. Pitch-framing is sufficiently important that baseball teams will prioritize it right into insignificance.”

Teams are indeed doing that. The Minnesota Twins, one of the worst teams with regard to framing, hired Jason Castro last offseason. No player has crossed the ridiculously high four percent CSAA threshold since 2011. On the bottom end of the leaderboard, only one catcher has dropped below the ignominious negative-three percent threshold in the last four seasons.

Even when I have tried to make a forceful case for framing’s value, I have been met with a bemused, “you already won the fight, kid” attitude. (Skip to 11:45 for the relevant segment below.)

When you look at baseball solely through the lens of front offices, then sure, the framing fight is over. But while journalists, pundits, and more and more fans can get on board with analytical approaches to hitting and pitching, they can’t seem to make the leap when it comes to framing. Just observe how Tom Verducci reacts in that clip.

Dan Turkenkopf, formerly of Baseball Prospectus, The Hardball Times and Beyond the Box Score and now with the Milwaukee Brewers, published an article back in April 2008 (at the time the first major public study of the value of framing) and was shocked to discover just how much a called strike was really worth. “The results just seem too outlandish to be correct,” he gasped. Nearly a decade later, more sophisticated research has borne out Turkenkopf’s original conclusion.

That conclusion sort of passed the smell test anyway. After all, turning a 1-1 count into a 1-2 count, or an 0-1 into an 0-2, changes the entire complexion of the at-bat. You rack up a few dozen of those types of pitches in a game, or several thousand in a season, and suddenly a backstop worth 20 runs in framing alone doesn’t seem all that ludicrous.

But it didn’t feel right. You can’t see a well-framed pitch in the way that you can see a mammoth home run, or a filthy slider. One extra called strike in an at-bat changes a great deal, but it doesn’t appear to. There will be several more pitches to come after that. One pitch doesn’t decide the at-bat in the way that the home run or the swinging strike three does.

Plus, the jury isn’t entirely settled on what physical skills make a good framer. It’s a subtle mix of positioning to give the umpire the best look at the pitch; of knowing where the pitch is going so that you don’t have to move your body to receive it; of forearm and wrist control to stick the glove when the pitch is delivered; of stillness and perfect balance in your lower half to prevent extra movement and thus keep the glove “in the zone.” Each of these components is essential, but it’s not clear which components matter more than the others, or if a hierarchy exists at all.

I used to think that a “know it when I see it” approach could work when attempting to evaluate good or poor framers with my own eyes. But I sometimes watch guys who look like they are well positioned, have good balance, and keep their gloves “quiet” when receiving a ball on the edge of the zone, and then see the call go the other way. Or I’ll see a jerk of the glove in the same location, surely meant to be a ball. Strike.

We can’t trust our eyes when it comes to framing yet. At least, the layman can’t. Walks, doubles and home runs are self-evident, and their value follows. So too strikeouts and limiting walks and home runs. The backbone of most advanced statistics is there to be seen, even if the next argumentative step to get to wRC+, FIP and WAR might take a bit of work. Their fundamental stories are there for us to witness.

It’s so much harder to take the leap when you have to look almost solely at strike zone heatmaps, ump called zone ratios, and numerous little events that add up to something bigger. You can acknowledge that framing is a skill and that catchers try to use it (as Verducci does in the above clip), but because the reassuring (if often misleading) layer of the eye test is largely absent, its validity must be questioned.

Even analytically inclined writers and thinkers don’t fully embrace the size of framing’s impact on catcher defense. Neither Ultimate Zone Rating (UZR) nor Defensive Runs Saved (DRS) uses framing in its calculus of a catcher’s glovework. These two metrics, and not BP’s Fielding Runs Above Average (FRAA), help comprise the SABR Defensive Index, which is the statistical side of the Gold Glove vote.

This surprises me. The core of all modern analytic work in baseball is to determine the probability of a run scoring or being prevented. That’s the object of the game, after all: to score runs, and to prevent the other team from doing so. If we can determine the linear weight value of an event, we can determine just how valuable it is to a team’s chances for scoring and preventing runs.

I’m not telling you anything new here. This inductive process was begun decades ago, and is now almost universally accepted. The value of framing comes from this process, too. That’s what made Turkenkopf’s discovery so shocking. The method of discovery was analogous to that of the original linear weights process in the early 1980s, and yet it revealed scarcely believable results.

A more refined method was later developed by the likes of Harry Pavlidis, Dan Brooks and even Dan Meyer in these very pages. The basic work is to tally every event that happens after a ball or a called strike in every single count. Brooks and Pavlidis also assigned gradational credit to each ball or called strike based on the probability of that pitch being called in either direction, the batter’s handedness and the actual pitch being thrown. Jonathan Judge then took it one step further, adjusting the results for the effects of the individual umpire’s zone tendencies, the effect of the hitter and the zone that they create, and the pitcher’s ability to hit or miss the edges of the zone.

Once this process is done, we can find a linear weight value for a called strike and a ball. And sure enough, it sits between 0.13 and 0.16 runs, depending on the season. Roughly a sixth of a run on every extra called strike isn’t an arbitrary value. It wasn’t pulled out of a hat. It was arrived at inductively, accounting for all necessary factors. Correlation studies demonstrate the strong in-season and year-to-year reliability of pitch framers. It is a genuine skill. That should be enough.
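
To make the tallying step concrete, here is a minimal Python sketch of the simplest version of the idea: group taken pitches by count, then compare how plate appearances turned out after a ball versus after a called strike. The function name, the record layout and the toy numbers are all hypothetical and chosen purely for illustration. This is not the Brooks, Pavlidis and Judge model, which layers probabilistic credit and umpire, batter and pitcher adjustments on top.

```python
from collections import defaultdict


def called_strike_run_value(taken_pitches):
    """Estimate, per count, how many runs a called strike saves versus a ball.

    taken_pitches: iterable of (count, call, pa_run_value) tuples, where
      count        is the count before the pitch, e.g. "1-1"
      call         is "ball" or "called_strike"
      pa_run_value is the run value (for the offense) of how the plate
                   appearance eventually ended
    """
    sums = defaultdict(lambda: {"ball": 0.0, "called_strike": 0.0})
    counts = defaultdict(lambda: {"ball": 0, "called_strike": 0})
    for count, call, pa_run_value in taken_pitches:
        sums[count][call] += pa_run_value
        counts[count][call] += 1

    values = {}
    for count in sums:
        n_ball = counts[count]["ball"]
        n_strike = counts[count]["called_strike"]
        if n_ball == 0 or n_strike == 0:
            continue  # need both outcomes in a count to compare them
        avg_after_ball = sums[count]["ball"] / n_ball
        avg_after_strike = sums[count]["called_strike"] / n_strike
        # Positive result = runs taken away from the offense by the strike call.
        values[count] = avg_after_ball - avg_after_strike
    return values


# Toy records, invented for illustration only (not real data).
toy_pitches = [
    ("1-1", "ball", 0.30),
    ("1-1", "ball", -0.10),
    ("1-1", "called_strike", -0.25),
    ("1-1", "called_strike", 0.05),
    ("0-1", "ball", 0.00),  # no called strike seen in this count, so it is skipped
]

values = called_strike_run_value(toy_pitches)
for count, value in sorted(values.items()):
    print(f"{count}: {value:.2f} runs per called strike")
```

A real estimate would run this over every taken pitch in a season and weight the counts by how often they occur; the toy input only shows the shape of the calculation.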

It is certainly enough for front offices. Once their own research began to push forward at the dawn of the first Obama administration, more and more good framers got everyday jobs, or at least backup positions if their bats couldn’t play. Young catchers were trained to pay attention to their receiving. Coaches like Jerry Weinstein preached not only the skill, but also the run values. Teams and executives got it.

Fans, writers and apparently managers just can’t pull the trigger. They still want to trust their senses, when empiricism teaches us to push past the limitations of subjective perception to arrive at a larger truth. Since 2008—the year of PITCHf/x’s arrival and the real beginning of pitch framing discourse—18 Gold Gloves have been awarded to catchers. Each league’s FRAA leader in each season since has gone empty-handed, except for Posey in 2016.

Is there a corrective to this? The SABR Defensive Index could certainly make an effort to include framing as part of its rankings of catchers. It may not be enough to countermand the manager vote, but it could at least serve as some small check on the groupthink that sees Salvador Perez march his way to the award every year.

Beyond that, we have to be less cocksure about where the state of the discourse really lies. Writers and analysts like Sullivan may be comfortable in declaring framing’s death, but that only speaks to the sabermetric echo chambers, whose ranks most assuredly include whole swaths of baseball operations departments. The truth of the game may not be as easily or immediately accessible to the public at large, but we must bear in mind that while the bombs lobbed in the last decade hit many of their targets, not all went off. The history of intellectual movements necessarily requires a refortification of one’s positions to allow for empirical methodology to slowly work its way into the broader consciousness.

I say this not for myself, but for Tyler Flowers. He generated 423 extra called strikes over the past four seasons. Think about how many more swinging strikeouts he helped force because his pitchers could tee up their wipeout pitch thanks to being ahead in the count. Or how many called strikeouts he created because he stuck the glove on strike three. Or how many bad two-strike swings hitters had to take to protect the zone, generating a weak grounder or a lazy fly ball. Flowers has helped prevent runs in an active, measurable way. That’s playing great defense in my book, even if we can’t see it.
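
For a rough sense of scale, the 423 extra called strikes can be multiplied by the 0.13 to 0.16 run range cited earlier. The short Python sketch below does only that arithmetic; the constant and function names are illustrative, and applying a single flat run value to every strike is a simplifying assumption, since the real models vary the value by count and context.

```python
# Figures taken from the article; everything else here is illustrative.
EXTRA_CALLED_STRIKES = 423          # Flowers, 2014-2017
RUN_VALUE_RANGE = (0.13, 0.16)      # runs per extra called strike, by season
SEASONS = 4


def framing_runs(extra_strikes, runs_per_strike):
    """Flat estimate: every extra called strike is worth the same number of runs."""
    return extra_strikes * runs_per_strike


low_runs, high_runs = (
    framing_runs(EXTRA_CALLED_STRIKES, value) for value in RUN_VALUE_RANGE
)

print(f"Total: {low_runs:.1f} to {high_runs:.1f} runs")
print(f"Per season: {low_runs / SEASONS:.1f} to {high_runs / SEASONS:.1f} runs")
```

Under those assumptions the total lands at roughly 55 to 68 runs over four seasons, or about 14 to 17 runs a season from receiving alone.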

He may not be the National League’s most valuable player, but he’s a lot closer than people want to pretend. And that’s no accident. This time, the numbers help lift the veil from our eyes. We would do well to remember that, and to appreciate the awesome power of a called third strike lightly, elegantly falling into an outstretched mitt, its wearer a statue for the umpire to gaze upon in quiet admiration.

References & Resources

Baseball Prospectus, CSAA Leaderboard
Ben Lindbergh, The Ringer, “Tyler Flowers Is MLB’s Most Improved Player”
Jeff Sullivan, The Hardball Times Annual 2017, “Pitch Framing Was Doomed From the Start”
Dan Turkenkopf, Beyond the Box Score, “Framing the Debate”
Dan Brooks and Harry Pavlidis, Baseball Prospectus, “Framing and Blocking Pitches: A Regressed, Probabilistic Model: A New Method for Measuring Catcher Defense”
Dan Meyer, The Hardball Times, “Dynamic Run Value of Throwing a Strike (Instead of a Ball)”
Dan Brooks, Harry Pavlidis, and Jonathan Judge, Baseball Prospectus, “Moving Beyond WOWY: A Mixed Approach To Measuring Catcher Framing”

Evan Davis is a writer and broadcaster living in New York City. He has appeared regularly on MLB Network. Follow him on Twitter @EvanDavisSports and on Instagram.

14 Comments

Joe Joe

7 years ago

The reason I have trouble believing the magnitude of catcher framing numbers is that it is a black box that depends on catcher, ump, pitcher, and even the batter to a degree. It is easy to say a called strike is worth “x”. It is harder to say “y” percent of “x” goes to the catcher. Pitchers should be getting credits and discredits because of framing (i.e., if Flowers is worth 6 wins, the majority of the 3.5 wins attributed to framing should be subtracted from the pitchers).

mikejunt

7 years ago

Reply to Joe Joe

BP’s CSAA values also track for pitchers, umpires and even ballpark, so they do isolate for all those variables. Pitchers don’t pay a run price if they have a low CSAA: they already get hit with the ensuing walks or hits that result from a hitter’s count.

Anyone making this argument has never bothered to read how framing runs are calculated and is being deliberately ignorant because they’re uncomfortable with the statistic or the reality.

jondowd51

7 years ago

Came here for the 10/10 illustration, stayed for the article

Evan Davis

7 years ago

Reply to jondowd51

Good #content for all!

Jim

7 years ago

Evan, you are very correct, sir. Also, you write very well — a framer of words and sentences, if you will.

Bobby Ayala

7 years ago

I personally am resistant to “appreciating” this as a skill because it highlights an imperfection in the game that should be eliminated, rather than exploited. The catcher is literally tricking the umpire into thinking a ball is a strike, like a soccer player dives to trick the ref into awarding a foul, or an offensive lineman uses his body positioning to subtly hold without getting caught– it is essentially using dishonesty and trickery to skirt the rules of the game.

The technology exists to get balls and strikes called right, which would remove catcher framing as a thing. MLB could push to implement it at any time and then this conversation is moot.

mikejunt

7 years ago

Reply to Bobby Ayala

It’s not always that; sometimes the catcher is deceiving the umpire into calling a strike on a pitch that IS a strike, but is usually called a ball due to the same human imperfections.

Balls around the edge are called strikes a percentage of the time. Depending where they are this varies from 70-80% to 20-30% (balls more extreme than that are rarely framing related and are usually miscalls).

The catcher isn’t exclusively turning balls that are rarely strikes into strikes. Sometimes they are improving the odds of balls that should be strikes but aren’t always called, or simply ensuring that a usually-a-strike call is definitely one (moving a 50% pitch to 80%, moving a 75% pitch to 90%, etc). For example, the strikezone is basically a rectangle, and the upper outside corner of that rectangle is still a strike, but balls thrown there are called strikes less than 50% of the time.

To focus on only the balls that ‘should have been a ball’ is to dramatically underestimate the number of pitches being affected, and focus only on those instances that are most dramatically wrong. Sometimes the catcher is also making sure the umpire calls it correctly.

It’s also simply untrue that the technology exists to be accurate enough to play with. It’s pretty accurate for inside and outside calls, but the top and bottom of the strike zone on all the strikezone trackers are highly unreliable, especially for players of higher or lower than average height (and not just extremes like Judge and Altuve).

I suspect that if you were to go pitch by pitch and score the entire season by eye with 2 perfect camera angles (direct side view and direct top view), you would find that Trackman is NOT yet more accurate than the umpire. I have seen entire games where the Trackman zone is skewed about 2 inches inside or outside and it’s never recalibrated in game.

It would be nice if we had this technology, but we don’t.

One of the most valuable things Grandal and Barnes did this year was frame those high Rich Hill curveballs that are legally strikes but aren’t always called strikes into the strike zone – the ones where the ball drops down and crosses the top of the zone, a pitch that umpires are notoriously poor at calling because it never looks like a strike until it’s too late.

mgwalker

7 years ago

Reply to mikejunt

I would argue that a pitch that the umpire calls a ball is a ball, and a pitch that the umpire calls a strike is a strike. It keeps things nice, simple well-defined and — as long as the umpire is impartial — completely fair.

Bobby Ayala

7 years ago

Reply to mikejunt

mikejunt you’re only highlighting my point, all these things a catcher can do to influence the call and sway the umpire’s judgement– when instead we should be focusing on getting the call right.

Trackman doesn’t work, but tennis has been using Hawkeye for 16 years. The technology not only exists, it has been proven again and again on huge sporting stages.

Listen to your statement: “One of the most valuable things Grandal and Barnes did this year was frame those high Rich Hill curveballs that are legally strikes but aren’t always called strikes into the strike zone” -so one of the most valuable things they did was get umpires to do their jobs correctly? Wonderful, I know this is what I want from my sport.

mikejunt

7 years ago

Reply to Bobby Ayala

The boundaries in tennis are fixed.

The upward and lower boundaries of the strike zone change all the time.

That’s far from a fair comparison and tennis has never had to address the challenge of the hollow of the knee and the point midway between the belt and the lettering on the chest.

If the strike zone were always the same it wouldn’t be hard to track it at all.

Until you figure out how to not only track it from batter to batter, but track it as it moves during the pitch (many hitters either crouch over more or stand up straighter as they initiate their swing), please don’t act like it’s existing technology.

Bobby Ayala

7 years ago

Reply to mikejunt

Sensors can locate the features on a batter that determine the strike zone and map it accordingly, without even having any equipment added to the uniform. As soon as they step in the box it could size them up and establish a strikezone.

Snapchat filters track facial features and put graphics over them, and this is essentially the same idea: it identifies and tracks touchpoints, and uses that information, not to put puppy dog ears on your head, but to size you up and create a strikezone. It can move while you move in the batter’s box, and it could calculate your exact strikezone based on your body positioning at the time the ball crossed the plane of the plate. It’s actually much easier to program and implement than facial recognition.

The technology clearly exists, or perhaps a better way of saying it is everything necessary for the technology to work exists. It’s just a matter of MLB developing it, building the infrastructure (sensors and cameras in every park), and clearing the biggest hurdle: the umpires’ union. They will fight it bitterly until the end.

ItsPoPtime

7 years ago

An article on pitch framing that doesn’t mention Christian Vasquez at all? Weird

LenFuego

7 years ago

I have to confess that I am a pitch-framing skeptic.

As someone who umpired for several years (though admittedly many years ago), my opinion is that umpires simply do not use the location of the mitt as a determining factor in calling a pitch. When a pitch comes in, a “picture” of the path of the ball is imprinted in the mind’s-eye of the umpire, and that is what he/she uses to call the pitch. Heck, most of the time the umpire cannot even see where the catcher’s mitt is after the pitch, because his/her head is situated behind and only slightly above the catcher’s head – those who watch on TV and have not actually umpired have a hard time understanding that because every pitch they see is from the pitcher’s point of view, where the catcher’s mitt is in plain view 100% of the time.

So how do I account for pitch-framing data that suggests I am very wrong about this? I am spitballing a bit here since I have not given it a ton of thought, but it seems to me that minor differences in the setup of pitch location measurement equipment are far more likely to account for pitch-framing data differences than anything catchers do. It would take only very subtle changes – probably as small as a millimeter – to drastically change the data. For example, if, say, the San Francisco equipment were set up in a way that the equipment believes the strike zone is ever-so-slightly smaller in all four directions (left, right, top, bottom) than the actual plate, Buster Posey would become a pitch-framing legend. Or if the equipment were adjusted before a season, a catcher like, say, Jonathan Lucroy might suffer a sudden decrease in effectiveness.

I am sure you all will tell me that the data has been scrubbed for all that and it still comes up with a giant thumbs-up for pitch-framing. But I am still a skeptic. I can only say in my defense that I am not a troglodyte – I believe heavily in baseball science and sabermetrics, and I recognize that my skeptical opinion seems to be overwhelmed by the experts in this area, which has made me question my opinion again and again. And yet, as a former umpire, I remain a very healthy pitch-framing skeptic.

mikejunt

7 years ago

Reply to LenFuego

Did you umpire professionally? Because the whole point of framing the pitch is that MLB-caliber velocity is almost impossible to track with the naked eye and so the mind -subconsciously- uses the position of the mitt when drawing that line.

The idea of framing is NOT that the umpire is intentionally being misled; it’s that it’s a split-second impression of what the path of that 95 mph, bending sphere was, and the glove influences the umpire’s read of that pathway when it doesn’t move out of the zone.

Even though the difference is ‘only’ 10 MPH or so, you get about 1/3 more time to read a HS-caliber pitch (mid-80s unless you’re dealing with high end prospects) than a professional one.
