Plato, the NL Cy Young Race, and Kyle Hendricks as Epistemological Problem

Advanced stats show just how good Kyle Hendricks has been this season. (MLB.com screencap)

Plato is often believed to be an ancient Greek philosopher who used the figure of his teacher, Socrates, to expound his own philosophical beliefs about the nature of things, expressed in the form of the dialogue. More properly considered, Plato is one of the masters of theatrical literature, or perhaps literary theater. His dialogues are not historical recordings from memory of events that actually took place. The Socrates presented within Platonic dialogues bears some resemblance to what little we know of the real man who, as Dodgers announcer Vin Scully so memorably reminded us last spring, drank hemlock rather than be exiled from Athens for poisoning the minds of the city’s youth. That resemblance is minimal; Socrates is a character sprung from the mind of Plato, a vessel through which to impart the discourse of ideas.

But too often, even if that reading of Plato is accepted, it is not enough. Socrates may not even be the character who represents Plato’s own beliefs. Nowhere is this clearer than in the Theaetetus, a short, mid-period dialogue that dates to approximately 369 BCE. It tackles what may be one of the most vexing philosophical questions of all: What is knowledge?

Plato’s characters—Socrates, Theaetetus, and Theodorus—attempt to answer the question of knowledge. It is first proposed by Theaetetus that knowledge is merely sense perception, which he links with the sophist Protagoras’s thesis that “man is the measure of all things.” In other words, what an individual perceives, they then know. Socrates refutes this, calling to attention the different ways in which different people can experience the same phenomenon. If the wind feels cold to one person, and warm to another, what exactly do we know about the wind? Certainly not the essence of its temperature. Such relativism cannot be knowledge, according to Socrates.

So Theaetetus pivots away, offering that knowledge must be judgment. Specifically, the word “judgment” in this context refers to the evaluation of raw experiential information. Here too, Socrates slowly and methodically pokes holes in the idea that judgment on its own can qualify as knowledge, since false judgment can too easily occur. Memory can deceive, as can the persuasiveness of one who may convince you of the truth of something without you having experienced that thing directly. If you did not experience something, can you really know it? If you did experience something, but can’t remember it correctly, do you know it?

Theaetetus then attempts to refine a theory of knowledge further, by claiming that true judgment accompanied by logos, or an “account” of a thing, qualifies as knowledge. In other words, should we correctly judge experience to be true and then be able to correctly express that judgment with language, then we know it. But here, too, Socrates exposes a fallacy. Too many things in the world can be named, but not directly known by experience. Language is merely a combination of sounds, and so whatever thoughts or judgments are expressed in words would then have to qualify as knowledge. That sounds an awful lot like Protagoras’s relativism.

In the end, neither Socrates nor Theaetetus is able to establish a sufficient theory of knowledge. If Plato holds his metaphysics, laid out in the Republic, to remain true here, then he is not satisfied, through the character of Socrates, to claim any conclusive theory of knowledge because the mediation of human cognition does not allow for the ideal Forms of the world to objectively reveal themselves. But even here, this is not satisfactory, because in Socrates’s defeatism, Plato acknowledges that human beings must endeavor to know something. How, then, can this be resolved?

And most important of all: What do we know about Kyle Hendricks?

Kyle Hendricks would have driven Plato crazy. For a player who so consistently relies on the suppression of quality contact, few can agree on just how good Hendricks actually is at managing contact. It is generally accepted that contact management can’t be completely random variation. But how best to measure it?

TAv Against is a tempting number to examine, since it properly weights all hitting events, including strikeouts and walks, and makes park and league adjustments. It would seem to suggest that batter for batter, Hendricks has been the best pitcher in the National League this season.

One thing TAv Against can’t do is separate out the actual contact given up from the outcome of the batted ball. The result is the result, no matter how likely or unlikely it was based on the exit velocity and launch angle.

What’s more, the three pitchers with the lowest TAv Against in the NL all come from the same team, the Chicago Cubs. TAv Against may take the randomness of sequencing out of the equation in a way that ERA may not, but what it doesn’t do is account for defensive quality. The Chicago Cubs are first in baseball in Defensive Runs Saved, with 77. The second-place Red Sox have 50, a 27-run gap. The Cubs are also first in Park-Adjusted Defensive Efficiency, with a mark of 6.46. The second-place Dodgers have a score of 2.15. No matter what method you use, the Cubs are the best defensive team in baseball by a wide margin.

If, then, the three best pitchers in the National League at minimizing damage are playing in front of the best defenders in baseball, is it not possible to say that we know that these pitchers aren’t preventing all those runs on their own?

Let’s go deeper. We can attempt to remove defense from the equation by looking solely at contact quality, namely, the exit velocity and launch angle of batted balls. It looks like Hendricks is very good, elite, even, when we look at contact quality numbers like Hard% and barrel rate. Interestingly enough, he’s not the best by any of these metrics. By barrel rate, two high-strikeout power arms, Noah Syndergaard and José Fernández, sit above him.

Even these numbers can’t be fully trusted when examining contact management. Baseball Info Solutions doesn’t use exit velocity when measuring Hard%, so its calculation is imprecise. The barrel numbers are incomplete, too. A healthy percentage of batted balls are not logged by Statcast, either because of an aberrant glitch, or a consistent bias, as has been reported by several Statcast analysts. These biases consist of batted balls hit so softly, or at such extreme launch angles (e.g., extreme ground balls and pop-ups), that the radar can’t pick them up.

Furthermore, as Jonathan Judge et al. discovered at Baseball Prospectus, there are significant and measurable park effects at play with exit velocity. This makes intuitive sense, whether as a result of different calibration of the Statcast system in each park, or the idiosyncrasies of each park changing the location of a ball’s “reading” after it leaves the bat. Statcast reads balls as hit harder at AT&T Park and softer at Great American Ball Park, for example.

This is all to say that we know less about exit velocity than we may like to think. The application of this information to performance analysis has to be done with caution.

But let’s say, for the sake of argument, that a pitcher like Kyle Hendricks indeed appears to have some effect on contact management. How much of an effect? According to the Barrels stat, he’s a bit worse than perhaps his ERA suggests, especially when you factor in the defense playing behind him. Yet, another piece of evidence may contradict the size of the defense’s contribution, as Rob Arthur and Ben Lindbergh explained at FiveThirtyEight. Their methodology used exit velocity, launch angle, and their attendant linear weights to suss out the contribution of pitchers and defenders to run prevention. Cubs pitchers were over three times more responsible than their defenders for preventing runs.

This feels compelling, and even drove The Athletic Chicago’s Sahadev Sharma to say, “Every time I speak with [a front office friend of mine], he ends our conversation by reminding me that I should always be learning and I should be willing to change my mind if compelling evidence points me in another direction.” Sharma takes this missive and applies it to Arthur and Lindbergh’s study. Sharma is persuaded that Hendricks’s contact management is indeed a skill, one that he himself generates.

Yet a comment that Sharma made in a subsequent paragraph struck me. He remarks that Hendricks has a propensity for getting ahead in the count. Pitchers have a bigger advantage over hitters when the count is 1-2, say, rather than 2-1. This fact intrigues me because Hendricks also possesses the highest called strike percentage of any qualifying starting pitcher in baseball. He gets called strikes, he gets ahead of hitters, and he forces them to weakly hit a ball on a part of the plate they didn’t anticipate having to cover. A recipe for success.

That called strike percentage isn’t exclusively the work of Hendricks. Even though command is incredibly difficult to measure, most observers agree that Hendricks’s command is superlative. He gets calls on the edges of the zone because he can put the ball where he wants.

That’s not the only reason. Hendricks has thrown to some of the best pitch framers in the game in Miguel Montero, David Ross, and Willson Contreras. Even when you control for Hendricks’s tendency to throw on the edge and the effects of the hitters Hendricks has faced and the umpires’ individual strike zones, the Cubs backstops are among the best receivers in the game. Hendricks has worked most with Miguel Montero, who has saved or stolen 2.7 percent more strikes than the average catcher in baseball, per Baseball Prospectus’s Called Strikes Above Average metric. That’s second-best in the majors. Hendricks gets ahead in the count because of his own skill, but also in great part because of his catchers’ framing abilities.

Arthur and Lindbergh controlled for several factors when trying to parse the defensive contribution of pitchers and position players. They rated pitch framing as having a minor effect, mostly because they examined batted balls, which by definition don’t result in strikeouts or walks. Those two outcomes are obviously those most affected by framing. But if framing is a major reason why Hendricks can get ahead in counts and then force hitters to make weak contact on balls they otherwise would lay off, might not framing play a bigger role?

This, I feel, is not an insignificant question to ask. The run value of a called strike or ball has repeatedly been demonstrated to be so high that even if framing doesn’t directly affect a ball on contact, the things that led up to that contact can utterly change the complexion of the at-bat.

So then, is the defense making Hendricks look good, or is it Hendricks making the defense look good? Arthur and Lindbergh’s study points to the latter. Deserved Run Average, the former. DRA was developed by Baseball Prospectus in an attempt to isolate a pitcher’s performance, to go beyond ERA in the precision of its explanation of run-prevention responsibility while also going past the limitations of FIP in order to understand a pitcher’s control over batted balls. DRA adjusts a pitcher’s linear weights results for opposing hitter quality, park effects, defense, catcher framing, controlling the running game, and other critical contextual factors. In other words, it attempts to strip away everything about the results of a pitcher’s performance except what the pitcher himself does. DRA doesn’t use Statcast, but shrinks all hitting outcomes so that fluky results are pushed toward the mean. As Judge discovered, both home run and BABIP suppression become true skills within the DRA framework, a major breakthrough.
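
To make the idea of shrinkage concrete, here is a minimal sketch in Python. It is not DRA's actual model, which this article does not spell out. It assumes the simplest possible version of the idea: a weighted average between a pitcher's observed rate and the league mean, where a hypothetical `stabilizer` constant decides how hard small samples get pulled toward the mean. Every number below is invented for illustration.

```python
def shrink_toward_mean(observed_rate, sample_size, league_mean, stabilizer):
    """Pull an observed rate toward the league mean.

    The smaller the sample relative to `stabilizer` (a hypothetical
    tuning constant, not a DRA parameter), the harder the pull.
    """
    weight = sample_size / (sample_size + stabilizer)
    return weight * observed_rate + (1 - weight) * league_mean


# Made-up inputs: a .250 BABIP allowed against a .300 league mean.
small_sample = shrink_toward_mean(0.250, sample_size=100, league_mean=0.300, stabilizer=400)
large_sample = shrink_toward_mean(0.250, sample_size=1600, league_mean=0.300, stabilizer=400)

print(f"After 100 balls in play:  {small_sample:.3f}")
print(f"After 1600 balls in play: {large_sample:.3f}")
```

Under these assumptions the same .250 mark is treated as roughly .290 after 100 balls in play and roughly .260 after 1,600. A fluky stretch gets mostly discounted, while a sustained one is mostly believed.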

Hendricks, it so happens, comes out exactly neutral on contact runs by DRA. Zero saved, zero given up. DRA accounts for the park tendencies of Wrigley Field and the world-beating Cubs defense, and concludes that while Hendricks hasn’t hurt his team on contact, he hasn’t helped them, either. That neutrality, combined with slightly above-average strikeout and walk rates, makes him a very good pitcher, but not the best in the NL.

DRA isn’t the only measurement to rigorously control for defensive quality and contact quality. Tru ERA- puts Hendricks behind only Max Scherzer. Looking at contact alone, through Adjusted Contact Score, Hendricks is the best in the NL.

Hendricks does much better with Tru ERA and Adjusted Contact Score than he does with DRA or its attendant Contact Runs. Which methodology tells us more about Hendricks, and pitcher performance in general? It depends on whether you think pitch framing, opposing hitter quality, and control of the running game are important factors to consider when analyzing pitcher performance.

It also depends on how each method is measuring contact quality and the role of defense. They clearly disagree, and given the uncertainty built into Statcast data, we can’t even be sure if we’re measuring contact quality correctly. We’re probably closer than we were two years ago, but what is not known remains significant.

So let’s look at Hendricks’s position on some of these contact measurements we have been discussing, with his rank in each stat among the 30 qualified NL starting pitchers this season.

KYLE HENDRICKS, 2016 CONTACT STATS
Stat                       Score      NL Rank
Hard%                      25.7%      3rd
FB/LD avg. exit velocity   90.4 mph   4th
Barrels per batted ball    4.0%       3rd
Adjusted Contact Score     71         1st
DRA Contact Runs           0          13th
TAv Against                .212       1st

Let’s compare this to José Fernández, whose Three True Outcomes performance is second to none, but whose contact management results are perhaps more mixed.

JOSE FERNANDEZ, 2016 CONTACT STATS
Stat                       Score      NL Rank
Hard%                      32.2%      20th
FB/LD avg. exit velocity   92 mph     t-15th
Barrels per batted ball    3.6%       2nd
Adjusted Contact Score     114        30th
DRA Contact Runs           3.4        21st
TAv Against                .231       6th

Fernández gave up a lot of hard contact overall, and was fairly middling when hitters got some loft on the ball. Yet he seemed to keep that hard contact outside the barrel zone, suggesting that his hard-hit balls were less likely to do damage. Blengino’s Adjusted Contact Score disagrees, showing that Fernández was bad across the board on contact once adjusted for exit velocity, launch angle, defense and park effects. DRA Contact Runs also thinks that he’s fairly poor on contact, despite its emphasis on the shrinkage of linear weights outcomes, rather than Statcast data.

What happens when you add contact, walks, and strikeouts together, pushing beyond TAv Against by adjusting for the necessary context? How do these two pitchers come out looking?

HENDRICKS-FERNANDEZ, 2016 COMPARISON
Player           Blengino Tru ERA-   NL Rank   DRA    NL Rank
Kyle Hendricks   66                  2nd       3.52   12th
José Fernández   75                  4th       2.21   1st

Fernández’s performance on walks and strikeouts keeps his holistic performance scores close together on the leaderboards. Hendricks’s status as an elite pitcher rests almost exclusively on contact, which spreads his rankings out. We know more about contact than before, but we can’t truly know which pitcher is performing better.
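
One rough way to put a number on that disagreement is to compare each pitcher's NL rank under the two systems. The Python sketch below uses only the ranks from the comparison table above. The `rank_gap` helper is a name made up for this illustration, and a raw difference in ranks is a crude yardstick, not an established statistic.

```python
# NL ranks from the Hendricks-Fernández comparison table
# (30 qualified NL starting pitchers).
ranks = {
    "Kyle Hendricks": {"Tru ERA-": 2, "DRA": 12},
    "José Fernández": {"Tru ERA-": 4, "DRA": 1},
}


def rank_gap(player_ranks):
    """Distance between a pitcher's best and worst rank across metrics."""
    values = player_ranks.values()
    return max(values) - min(values)


gaps = {name: rank_gap(player_ranks) for name, player_ranks in ranks.items()}

for name, gap in gaps.items():
    print(f"{name}: the two metrics disagree by {gap} places")
```

Hendricks moves 10 places depending on which system you trust, while Fernández moves three.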

It is here that we return to Plato. The characters in the Theaetetus cannot settle on a definition or theory of knowledge. Too many elements of human beings’ cognitive perception of the world cloud the ability to truly know or understand. But futility is not Plato’s objective. Rather, the Theaetetus reads as a tool for further exploration into what comprises knowledge.

We see Kyle Hendricks pitch, and so believe that we know certain things about what he does. He has impeccable command. His pitches move. He induces weak contact, thus dominating hitters without striking them out at Kershaw-like levels.

But peel the layers back. How do we know that Hendricks has great command? We have no independent, quantifiable way to measure his command, except for the fact that his pitches seem to find their targets. The more we try to understand about his contact management, the less we can conclusively prove.

The Theaetetus points us, then, into what Plato was really after: Knowledge, ultimately, is conditional. To know something is not to say that it is true for all time. What constitutes a fact is a collection of perceptions, judgments, and accounts examined, analyzed, argued against and determined. In many ways, this is the scientific method with which any sabermetrically inclined baseball fan is familiar.

The most crucial element of a theory of conditional knowledge is that even in the absence of conclusiveness, we can conditionally say that something is known to us. Pure, objective knowledge of reality is impossible. But the probability of getting close is in fact a genuine yardstick.

That is why I settle on DRA. It can’t perfectly explain every single action and outcome on a baseball field, and who was responsible for each one. But it comes closer than any other measurement to doing so. It doesn’t run the risk of using an incomplete data set like Statcast, which, even with its controllable biases, still misses registering events it shouldn’t miss, or even an entire game’s worth of data, as occurred recently when Hendricks pitched against the Pirates at PNC Park.

Brian Kenny recently presented a segment of his program on MLB Network, MLB Now, wherein he made the case for Hendricks as the NL Cy Young Award winner. Kenny has tended to prefer defense-dependent pitching metrics to evaluate Cy Young candidates, because, in his (paraphrased) words, “It’s not about what should have happened. It’s about what did happen.” As such, ERA and RA9 and the corresponding RA9-WAR become perfect statistics to capture the descriptive performance of a pitcher.

Except, ERA doesn’t explain only what a pitcher did, nor does RA9. Those two metrics explain what a pitcher, a catcher, a hitter, seven defenders, and a ballpark did.

Similarly, FIP, which was born out of the Eureka moment of Voros McCracken in 2001, tells us more than ERA or RA9 about the underlying skill of a pitcher, since it focuses on the three things a pitcher can control. But it presumes that a pitcher has no control over batted balls hit against them.
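
For readers who have not seen it written out, here is FIP in its commonly published form, sketched in Python. The season constant rescales FIP to league ERA and changes every year. The 3.10 default below is an assumed placeholder, not the 2016 value, and the stat line is made up for illustration.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=3.10):
    """Fielding Independent Pitching, in its commonly published form.

    `constant` is an assumed placeholder; the real value is
    recalculated for each season.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    numerator = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings + constant


# Made-up season line, for illustration only.
example = fip(home_runs=15, walks=40, hit_by_pitch=5, strikeouts=180, innings=190)
print(f"FIP: {example:.2f}")
```

Nothing in the function takes a ball in play as input. A pitcher who lives on weak contact gets no credit for it here.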

The most important thing to remember is that we did know something about pitchers based on their ERAs at one time, because we had no better information to go on. We knew more about pitchers based on their FIPs than ERA for many years, because that was the best information we had to go on. Now, we know more about pitchers based on their DRA than ERA or FIP could tell us. It is the best information we have to go on. In the future, DRA may teach us less than some new method. We must be prepared for that moment. That moment will not diminish the current depth of our knowledge.

Ron Darling once said that “Sabermetricians will never be happy.” Maybe that’s not true of every sabermetrician, but the majority of us remain restless, seeking to know more than what has come before. It is why VORP has all but disappeared from analysis. It is why FIP is likely headed there. It is why Defensive Runs Saved and Ultimate Zone Rating will surely be eclipsed by more precise measures of defense.

Knowledge is mediated by human beings, but that mediation still has explanatory power, with the understanding that whatever we may know will almost certainly change in the future. The more you know, the less you know.

Evan Davis is a writer and broadcaster living in New York City. He has appeared regularly on MLB Network. Follow him on Twitter @EvanDavisSports and on Instagram.
11 Comments
Jim S.
7 years ago

Outstanding.

Craig S
7 years ago

So….is Hendricks your Cy winner or no?

Evan Davis
7 years ago
Reply to  Craig S

If I had a vote, my ballot would be:

1) Fernández
2) Syndergaard
3) Kershaw
4) Scherzer
5) Bumgarner

I imagine that Hendricks will, at minimum, finish in the top three in the final voting, and will be the favorite to win, along with Scherzer.

Heidi
7 years ago
Reply to  Evan Davis

Doubt it will Kershaw with the amount of time he missed

Johnny 23
7 years ago
Reply to  Evan Davis

Seems that with this list the strikeout plays a large part of the calculation

stephan
7 years ago
Reply to  Evan Davis

The innings gap between Scherzer and Fernandez/Syndergaard is greater than between Fernandez/Syndergaard and Kershaw, so what is the reasoning for placing Kershaw below those two? It seems that if you value the extra innings, Scherzer should be above those three pitchers, while if you are less concerned with the innings, Kershaw should be above F/S since the performance gap is at least as big between Kershaw and F/S as it is between F/S and Kershaw.

Rick
7 years ago

I think the New York Times should take notes based on this article and try and mimic it because this is so well written. You provide facts, deep deep analysis, and end the article without trying to portray your opinion as a fact.

Rick
7 years ago
Reply to  Rick

The reader is left to formulate their own opinion based on an unbiased collection of information.

bob m
7 years ago
Reply to  Rick

You are assuming the NY Times cares about baseball. It doesn’t. The proof is there everyday

Njguy73
7 years ago

“Sabermetricians will never be happy.” – Ron Darling

“Bad sabermetrics attempts to end the discussion by saying that I have studied the issue and this is the answer. Good sabermetrics attempts to contribute to the discussion in such a way as to enable it to move forward on a ground of common understanding.” – Bill James, 1981 Abstract

Which one went to Yale?

cktai
7 years ago

Interesting how a progressive article calling for the continued development of sabermetric analysis forgoes centuries of epistemology and philosophy of science and settles for Plato as the final word on “the scientific method”.