The Strange Career of Wes Ferrell (SP Leverage, Part 4)
This is the fourth article in a never-ending series on starting pitcher leverage. If you know the gist of these suckers, you can skip this paragraph. For the rest of you, starting pitcher leverage refers to the once-common practice of a team intentionally using its pitchers disproportionately against particular opposing teams. It could be an ace starting all the time against the best opposing teams, or southpaws starting against the most left-leaning offenses. For this study, I figured out that leveraging existed from the earliest days of baseball up to the 1960s, and thus I looked at the usage patterns for virtually every pitcher worth looking at. I ended up determining the leveraging for over two-thirds of all GS from 1876 to 1969. For this I invented a stat called AOWP+. Scroll down below to see exactly how this stat works. Short version: it’s set up like ERA+ or OPS+, centered on 100. A higher score means the pitcher was used more against the best teams, a low score means more against the worst teams, and if he’s used evenly against all opponents, his AOWP+ will be 100. So much for that.
So far I’ve looked at the best and worst leveraged careers, single seasons, and tried to determine how much impact leveraging had on a pitcher’s numbers. There’s a lot of ground I’d still like to cover, but there’s an issue I must contend with first: the specter of Dick Thompson.
The problem
I mentioned in this series’ debut article that Dick Thompson’s work inspired this entire venture. Though the notion of starting pitcher leveraging is fairly well known (find an old ChiSox fan who remembers the Go-Go Sox and he’ll tell you about all the times Billy Pierce started against the Yanks), Thompson is the only person I know of who had the insight to apply what’s known of leveraging to an individual pitcher in judging his career. In particular, in his book, “The Ferrell Brothers of Baseball,” he argued that Wes Ferrell is far better than anyone realizes because of how his teams used him.
Years ago he used to post on Baseball Primer detailing nuggets he’d learned. Most memorably, he claimed Ferrell from 1929 to 1936 was as good as Lefty Grove. For example, he’d point to 1930 and 1931 when Grove didn’t pitch much against the Yanks and never had to face his own squad’s potent offense, while the Indians routinely loaded Ferrell up against those teams. Since then I’ve come across members of the sabermetric community as prestigious as Rob Neyer mentioning Wes Ferrell as an example of a pitcher who was better than his numbers indicate because of his usage.
This was brilliant and frankly revolutionary research by Dick Thompson, and I salute him for having the idea to examine leveraging, the determination to see the project through, and the willingness to make his information public.
There’s just one problem. He’s all wrong.
Wes Ferrell was actually a rather poorly leveraged pitcher. Not always, mind you, but while with the Red Sox calling him poorly leveraged is a massive understatement. On the whole, leveraging actually diminishes, not enhances, his value. This realization was one of the most jarring things I uncovered in this study. Here’s his AOWP+ info for his career:
Year   GS  AOWP  TOWP  AOWP+  Team
1928    2   430   514     84  CLE
1929   25   502   495    101  CLE
1930   35   511   496    103  CLE
1931   35   513   498    103  CLE
1932   34   496   489    101  CLE
1933   26   513   498    103  CLE
1934   23   468   500     94  BOX
1935   38   481   498     97  BOX
1936   38   483   504     96  BOX
1937   35   492   501     98  BOX/WAS
1938   26   474   496     96  WAS/NYY
1939    3   482   490     98  NYY
1941    3   524   516    102  BOS
All   323   493   497  99.16
When Dick Thompson pointed to the early 1930s and how the Indians used Ferrell often against the Yanks and A’s, he was right. In 1930 Ferrell started six games against the Yanks and seven against the Athletics. He had no more than five starts against any other team. The next year he started six times against the A’s and four against the Bronx Bombers (plus seven more against the third-place Senators). However, that was merely common for an ace in those years, and by no means was it the most remarkable part of his usage.
For Ferrell, the key lies in Boston. Sure, his AOWP+s don’t look that bad for those years, but back in the day, when leveraging was pretty common, you rarely saw a big-name pitcher like Ferrell, who still was winning 20 games a year, consistently fall several points below 100. If you divide up all his 110 Boston starts into seven categories (games against the best-available opponent, second-best, third-best, and so forth down to the seventh-best, meaning the worst opposing team), here’s what his usage pattern looks like while a Red Sox:
Opponent   GS
Best       10
2nd Best   14
3rd Best   16
4th Best   11
5th Best   18
6th Best   20
Worst      21
Pretty neat, huh? Only one step out of place. And that gives him the benefit of the doubt, because in 1936 both the White Sox and Senators had a .536 mark. He had six starts against the former but only two against the latter. I listed the Sox as the third-best and Washington as the fourth-best. Flip it around and his usage looks even more bottom heavy. It’s even more amazing when you realize that contemporary ace pitchers were more likely to start against the best available team. In fact, never, in the entire history of baseball, has there been a pitcher so talented as Wes Ferrell, who—while still in his prime—was as poorly leveraged as he was by the Boston Red Sox. For perspective, here’s his entire career:
Teams     Cle  Box  Rest  Total
Best       28   10     6     44
2nd Best   20   14     8     42
3rd Best   27   16     7     50
4th Best   20   11     9     40
5th Best   18   18    10     46
6th Best   21   20     8     49
Worst      21   21     8     52
Total     158  110    56    323
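For anyone who wants to reproduce the best-to-worst breakdowns above, here is a minimal Python sketch of the bucketing. It assumes each season is supplied as two dictionaries, one mapping opponents to the pitcher's starts against them and one mapping opponents to their winning percentages. Ties in winning percentage are broken by putting the team he faced more often on top, as with the 1936 White Sox and Senators. The function names and the sample numbers are hypothetical illustrations, not real data or the actual tooling behind this study.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Tuple


def starts_by_opponent_rank(
    starts: Mapping[str, int], win_pct: Mapping[str, float]
) -> Dict[int, int]:
    """Map opponent rank (1 = best opponent) to starts against that opponent.

    `starts` should list every opponent, using 0 for teams the pitcher
    never faced, so that the ranks line up with the full league.
    """
    # Best winning percentage first; ties go to the team faced more often.
    ranked = sorted(starts, key=lambda team: (-win_pct[team], -starts[team]))
    return {rank: starts[team] for rank, team in enumerate(ranked, start=1)}


def career_rank_totals(
    seasons: Iterable[Tuple[Mapping[str, int], Mapping[str, float]]]
) -> Counter:
    """Add up the per-season rank buckets across several seasons."""
    totals: Counter = Counter()
    for starts, win_pct in seasons:
        totals.update(starts_by_opponent_rank(starts, win_pct))
    return totals


# Made-up four-team example with a tie at .536 (hypothetical numbers).
example_starts = {"X": 2, "Y": 6, "Z": 2, "W": 5}
example_pct = {"X": 0.600, "Y": 0.536, "Z": 0.536, "W": 0.400}
print(starts_by_opponent_rank(example_starts, example_pct))
```

Ranking is done season by season because "best available opponent" changes every year; only the bucket counts are summed across a career.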
And since Thompson compared Ferrell to Grove, here’s the great one’s leverage score, and below that, to finish off the parallel, how many times he started against the various teams in his career (since with Ferrell, when teams were tied, I put the team he faced the most on top, I’ll give Grove the same courtesy):
Year   GS  AOWP  TOWP   AOWP+  Team
1925   18   508   488     104  A's
1926   33   501   491     102  A's
1927   28   486   488     100  A's
1928   31   478   481      99  A's
1929   37   467   473      99  A's
1930   32   471   477      99  A's
1931   30   471   473     100  A's
1932   30   488   484     101  A's
1933   28   499   499     100  A's
1934   12   510   500     102  BOX
1935   30   500   498     100  BOX
1936   30   535   504     106  BOX
1937   32   532   498     107  BOX
1938   21   501   489     102  BOX
1939   23   516   485     106  BOX
1940   21   498   495     101  BOX
1941   21   487   495      98  BOX
All   457   496   489  101.43

Rival     A's  BoX  All
Best       46   32   78
2nd Best   35   28   63
3rd Best   33   36   69
4th Best   40   21   61
5th Best   32   32   64
6th Best   38   24   62
Worst      43   17   60
Total     267  190  457
First, Grove’s marks with the A’s were low for an ace back then. Other aces weren’t leveraged (Walter Johnson being the best example) because their teams used them as workhorses, throwing them out there as often as possible. Grove was never a workhorse like the Big Train. Also, the first two articles in this series showed lefties were especially likely to be used against the best teams disproportionately, which makes Grove’s pedestrian showing that much more unusual. Instead, Grove’s teammate (and fellow southpaw) Rube Walberg picked up the slack for him, and thus became baseball’s fourth best leveraged starter of all-time.
However, Grove bests Ferrell every way—better career marks, higher peak seasons, while Ferrell had the lower single seasons. Grove had a greater percentage of his starts against the best available team, and fewer against the worst. Even in terms of pure AOWP, without looking at leveraging, he beats Ferrell .496 to .493; a key point because part of Thompson’s argument was that Grove never had to face Philly’s own fantastic offense. Added bonus: the difference is most notable in the years they were teammates. It’s the damnedest thing.
Trying to Figure This One Out
Well, Dick Thompson did use innings pitched while I used GS, so that could be it. But I have a lot of trouble seeing that explaining the difference. Both men completed a large majority of their starts, so that shouldn’t make much difference. They both had numerous relief appearances, but in relief the situation matters more than opponent when it comes to leveraging. Hmmmmmm…
Dick Thompson is a brilliant researcher. Crikey, he even won SABR’s highest honor, the Bob Davids Award, for his work. As a general rule of thumb, when a society called the Society for American Baseball Research singles you out for brilliant baseball research, you’ve done something really right. So how the heck could he be so far off on this one? Well, that’s where this really gets, uh, “fun.”
You see, Dick Thompson became aware of my research and conclusions about Wes Ferrell around two years ago. (I had less refined ways of quantifying this stuff back then, but it came out the same for Wes). And oh lordy, you think that the people who firebombed Dresden made themselves some enemies. Without referring to me by name, in post #134 in this thread Thompson in short order, 1) dismissed it as “SABRmetric drivel,” 2) denigrated me as “the guy who tallies data from retrosheet and passes it off a[s] original research,” and 3) my personal favorite, said I “plagarize [sic] material by hitting a few computer keys.” Looks like I touched a nerve. I don’t want to rehash the entire thread (my response is post 154 under my handle of Dag Nabbit), but I do think there’s a connection between his botched interpretation and his emotionally venomous outburst.
The key to unlocking this mystery resides in a seemingly unconnected comment at the end of post #134. He mentions that he’s researching an obscure pre-integration black pitcher named Bill Jackman. From what he’s uncovered, Jackman’s a better pitcher than Dick Redding, Jose Mendez, either of the Foster brothers, and possibly the great Satchel Paige. Mind you, almost all those guys are in Cooperstown. Meanwhile, a few years ago I read a book, Cool Papas and Double Duties, where about 30 Negro League experts, and about as many former Negro Leaguers (many of whom died before the book came out), named up to 27 picks for the best Negro Leaguers not in Cooperstown at that moment. None mentioned Jackman.
There’s a theme underlying Thompson’s interpretations. He argues his subjects of interest are better than anyone thinks. Far better. Jackman’s as good as Satchel Paige. Wes Ferrell was comparable to Grove. Even Rick Ferrell, one of the most denigrated Hall of Fame selections of all-time, gets Thompson’s defense, as Thompson marshaled a series of quotes from old time baseball men talking about how great Rick was.
One gets the sense that Thompson doesn’t just research his players, but falls in love with them. And when that happens, he becomes blinded to any/all negative information about them and entranced solely by the wondrous parts. This is another reason why I think using IP instead of GS wouldn’t explain the difference. The problem of his Wes Ferrell research fits a larger pattern of interpretational bias.
He ends up making claims so beyond what’s reasonable that some schmuck like me can poke some serious holes in an interpretation he spent years working on. It’s a shame because it’s great research, and a fantastic starting point, but it’s a cautionary tale on the need to keep a critical eye on what you’re doing. As a general rule of thumb, when some random schlub can spend an hour poking around retrosheet and provide evidence your interpretation doesn’t hold water, you’ve done something really wrong.
This leaves me in a conundrum. Dick Thompson’s work on Wes Ferrell was the stimulus for all the work I’ve done on starting pitcher leverage. His insight into examining usage patterns is something I still treasure. He’s a far better researcher than I’ll ever be in his dedication to mining information on a specific player. Being an expert, however, does not mean one’s interpretations are sacrosanct and above question.
One final comment: the charge of plagiarism I find especially unfounded. All I have to say about that is that unless one makes no distinction whatsoever between using retrosheet and plagiarizing retrosheet, the charge is wholly without merit. Sorry for straying from the main thrust of this series, but when you engage in a major study like this, and the person who inspired it has such open contempt for it, you really need to address that. Next article, this series will get back on track as I look at something that previous articles have shown to be extremely important to the concept of starting pitcher leveraging: platoon leveraging.
References & Resources
What the heck is AOWP+?: The stat I invented to judge pitcher leveraging. It’s AOWP/TOWP*100. AOWP is Average Opponent Winning Percentage. TOWP is Team’s (Average) Opponent Winning Percentage. To figure AOWP for a single season, you take the number of starts a given pitcher had against each opposing team, and multiply that by the team’s winning percentage. After doing this for all rival squads, add up the products and divide by the pitcher’s total GS. The result is his AOWP. The same logic applies to TOWP, only here you look at how many games the team played against all rivals. If a pitcher’s used evenly, his AOWP will be the same as the TOWP, and he’ll have an AOWP+ of 100. If he’s used more against better teams, he’ll have a higher AOWP+. I calculated AOWP+ for 659 pitchers who started 182,000 games, including over two-thirds of all games from 1876-1969.
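As a rough illustration of the single-season recipe just described, here is a minimal Python sketch. It assumes the inputs are plain dictionaries keyed by opponent: the pitcher's starts against each team, the team's games against each team, and each opponent's winning percentage. The function names and the three-team example are hypothetical and only meant to show the arithmetic, not the spreadsheet work actually used for the study.

```python
from typing import Mapping


def weighted_opp_pct(games: Mapping[str, int], win_pct: Mapping[str, float]) -> float:
    """Opponent winning percentage, weighted by games against each opponent."""
    total = sum(games.values())
    if total == 0:
        raise ValueError("no games to average over")
    # Multiply games against each rival by that rival's winning percentage,
    # add up the products, and divide by the total number of games.
    return sum(n * win_pct[team] for team, n in games.items()) / total


def aowp_plus(
    pitcher_starts: Mapping[str, int],
    team_games: Mapping[str, int],
    win_pct: Mapping[str, float],
) -> float:
    """AOWP / TOWP * 100 for one pitcher-season."""
    aowp = weighted_opp_pct(pitcher_starts, win_pct)  # pitcher's own starts
    towp = weighted_opp_pct(team_games, win_pct)      # the team's full schedule
    return 100.0 * aowp / towp


# Made-up three-team league (hypothetical numbers, not real data).
pct = {"A": 0.600, "B": 0.500, "C": 0.400}
schedule = {"A": 22, "B": 22, "C": 22}
print(round(aowp_plus({"A": 6, "B": 4, "C": 2}, schedule, pct), 1))
print(round(aowp_plus({"A": 4, "B": 4, "C": 4}, schedule, pct), 1))
```

In this toy league the pitcher loaded up against the best team comes out above 100, while the evenly used pitcher lands at 100, matching the description above.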