MLB’s Rookies of the Year, Visualized

There won’t be a ton of debate about the Rookie of the Year Awards this year. (via Johnmaxmena2, Arturo Pardavila III & Michelle Jay)

An artifact of ancient major league history–one of those subjective, voting-dependent awards writers and fans love to loathe (but, also, secretly love to love)–the Rookie of the Year Award recognizes the best rookie-eligible player in the American and National Leagues in a given season. Voting for 2017’s rookies won’t happen until after the season, yet chatter always starts early, excited as we all were to be enjoying a new season of baseball. Accordingly, this discussion is either extremely premature or long overdue depending on where you fall on the loathe-love spectrum.

This post is driven less by narrative than by data and visualization. Of relative youth, I know–knew–little about former Rookie of the Year award winners beyond roughly the turn of the millennium. I also knew not of an existing database that compiled public-facing statistics exclusively for this (or other) awards. So I pulled together some statistics, slapped together some data visualizations in Tableau, and voila: the post laid out before you.

If you don’t like words, information, numbers, etc., you’d do well to skip ahead to the visualizations by clicking here. My goal is that, by using the visualizations, you can glean much of the same information presented herein–and even a little extra.

By the Numbers

From the award’s introduction in 1948 through the 2016 season, 9,662 hitters and 5,629 pitchers debuted and generated 1,635.3 and 2,568.9 wins above replacement (WAR), respectively, in their collective rookie seasons. The Baseball Writers Association of America has bestowed the coveted, occasionally disputed award to 138 players in that time–most often to hitters, who outnumber pitchers roughly three to one (100 to 38, or 72 percent to 28 percent) and have generated 0.3 more WAR in a given season (3.53 to 3.22) on average.

The WAR framework, however, attributes different amounts of WAR to hitters (roughly 570) and pitchers (roughly 430). When controlling for this, the script flips to give a slight edge to pitchers, who individually generate more WAR proportional to what’s available to them to earn.
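A rough sketch of that adjustment, assuming "controlling for this" simply means dividing each group's average Rookie of the Year WAR by the WAR available to that group. The variable and function names are mine, and the 570 and 430 figures are the approximations quoted above.

```python
# Average WAR for Rookie of the Year winners, as quoted in this post.
ROY_AVG_WAR = {"hitters": 3.53, "pitchers": 3.22}
# Approximate WAR the framework makes available to each group per season.
WAR_AVAILABLE = {"hitters": 570, "pitchers": 430}


def war_share(avg_war: float, available: float) -> float:
    """Return WAR as a fraction of the WAR available to the player's group."""
    return avg_war / available


shares = {
    group: war_share(ROY_AVG_WAR[group], WAR_AVAILABLE[group])
    for group in ROY_AVG_WAR
}

for group, share in shares.items():
    # Show each share as a percentage of the available pool.
    print(f"{group}: {share * 100:.2f}% of available WAR")
```

Under that assumption, the average winning pitcher claims a slightly larger slice of his pool than the average winning hitter does of his, which is the flip described above.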

It might seem, then, that there is bias at play in favor of hitters. Everything is relative, though; a rookie voting race is only as competitive as the rookies who compete in it. And rookie pitchers, while scarcer, have generally been more valuable in their debuts, producing, on average, 0.46 WAR compared to hitters’ 0.17 WAR. Accordingly, a rookie hitter with identical WAR to a rookie pitcher might seem relatively more valuable given the lower baseline or expectation for hitters. (In other words, rookie hitter productivity might be more volatile.) All that said, there may still be bias–I can’t get inside writers’ heads–but, by the numbers, everything checks out.
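For anyone who wants to check the baselines, here is a minimal sketch that rebuilds the per-rookie averages from the debut counts and WAR totals quoted earlier. The "margin" step is my own illustrative framing of the lower-baseline argument, not a metric used in the visualization.

```python
# Debut counts and collective rookie-season WAR since 1948, as quoted in this post.
DEBUTS = {"hitters": 9662, "pitchers": 5629}
ROOKIE_WAR = {"hitters": 1635.3, "pitchers": 2568.9}
# Average WAR for Rookie of the Year winners, also from this post.
ROY_AVG_WAR = {"hitters": 3.53, "pitchers": 3.22}

# Average WAR produced by a debuting rookie in each group.
baseline = {group: ROOKIE_WAR[group] / DEBUTS[group] for group in DEBUTS}

# How far the average award winner sits above the typical rookie in his group.
margin = {group: ROY_AVG_WAR[group] - baseline[group] for group in DEBUTS}

for group in DEBUTS:
    print(f"{group}: baseline {baseline[group]:.2f} WAR, "
          f"winner margin {margin[group]:.2f} WAR")
```

The baselines come out to the 0.17 and 0.46 figures above, and the average winning hitter clears his baseline by a wider margin than the average winning pitcher clears his.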

The best rookie seasons belong to Mike Trout (10.3 WAR, 2012) and Dwight Gooden (8.3, 1984), each of whom all but lapped their respective positional fields, historically speaking, shattering previous records held by Dick Allen (8.2, 1964) and Gary Peters (6.5, 1963). Not much time has elapsed since Trout took what likely will be a death grip on the throne, but Corey Seager (7.5, 2016) came within striking distance last year, good for the third-most WAR by a rookie hitter since the award’s introduction. Hideo Nomo (5.2, 1995) came closest to besting Gooden, but saying he “came close” to begin with–like Seager, having fallen three wins short–is a grave overstatement. Jose Fernandez (4.1, 2013; R.I.P.) gave us the best rookie pitcher performance of the last decade.

The worst Rookie of the Year seasons are property of the delightful Ken Hubbs (-0.5 WAR, 1962; he scored 90 runs, though!) and Todd Worrell (0.3, 1986). Eric Karros (1.0, 1992) most recently took a swing at Hubbs’ record, but at least Karros went on to have a lengthy, if not wildly mediocre, career. (Hubbs was killed in a plane crash after his second season.) Kazuhiro Sasaki (0.6, 2000) most recently threatened Worrell’s record, but, alas, both were closers, a designation by which a pitcher’s contributions are effectively capped. Jeremy Hellickson (1.7, 2011) had the worst Rookie of the Year season for a starting pitcher, and no one has really come close since.

Excluding Fernandez, the currently team-less Ryan Howard (2.2, 2005) is at risk of being the most recent Rookie of the Year to permanently depart the majors, whether voluntarily or forcibly. If he sees time at the major league level this season (or, somehow, in 2018), he could cede the title back to the duo of Jason Bay (1.8, 2004) and Bobby Crosby (2.6, 2004), both of whom debuted in 2003. Otherwise, it’s his for the taking, until, I don’t know, Huston Street or Andrew Bailey (if I were a betting man–and I wager I might be) accepts the torch from him.

Longest careers for Rookies of the Year? Pete Rose (15,892 plate appearances) and Tom Seaver (19,369 batters faced). Shortest? Joe Charboneau (722 plate appearances) and Sasaki (925 batters faced), albeit the latter by choice. Most productive careers? Willie Mays (149.8 WAR, although there’s a young bionic thumb-shaped man in Los Angeles of Anaheim who one day may beg to differ) and, unsurprisingly, the prolific Seaver (92.6 WAR). Least productive? Alfredo Griffin (-1.0 WAR) and Butch Metzger (-0.4 WAR).

Whew. So much data, so many ways to slice and dice it. I can’t feasibly fit everything you could ever want to know here. I hope my visualization can answer some of your remaining questions. Click here and have a go of it for yourself.

The Sophomore Slump

Among Rookies of the Year, 62 percent of hitters and 63 percent of pitchers suffered sophomore slumps, as measured by year-over-year declines in WAR. These percentages fail to account for rookies who may have experienced truncated seasons because they did not debut in April–an increasingly frequent occurrence, especially in light of teams’ desires to control the service clocks and costs of their young prospects. However, they also do not control for injuries players may have suffered during their sophomore campaigns that might artificially truncate their sophomore seasons. So, let’s call it a wash.

Rookies of the Year are liable to be 20 percent less productive by WAR in their sophomore seasons. There’s a relatively simple explanation to this: Given a distribution of talents for players who, themselves, will achieve any possible outcome along a distribution dictated by their talent levels, it’s fairly likely that a winner will have played way over his head without necessarily being the most talented rookie-eligible player that season.

That 20 percent dropoff, however, accounts for players who both improved and declined in their sophomore seasons. When split apart, players who improved achieved average WAR gains of 35 percent (pitchers) to 40 percent (hitters), whereas those who declined saw their production drop off 47 percent (pitchers) to 53 percent (hitters) on average. While pitchers are more likely to suffer sophomore slumps, hitters have more volatile productivity as sophomores, aligning with our assumption about positional volatility from the previous section.
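As a sanity check, the split figures can be blended back together. This is a back-of-the-envelope sketch that assumes every winner either improved or declined (nobody repeated his WAR exactly) and that the overall figure is a simple average of each player's percentage change; the post does not spell out either point, so treat the function below as illustrative.

```python
def blended_change(decline_rate: float, avg_decline: float, avg_gain: float) -> float:
    """Weight the average decline and average gain by how often each occurs.

    decline_rate: share of winners whose WAR fell as sophomores.
    avg_decline:  average fractional drop among those who declined.
    avg_gain:     average fractional rise among those who improved.
    """
    improve_rate = 1.0 - decline_rate
    return decline_rate * -avg_decline + improve_rate * avg_gain


# Slump rates, average declines and average gains quoted in this section.
hitters = blended_change(decline_rate=0.62, avg_decline=0.53, avg_gain=0.40)
pitchers = blended_change(decline_rate=0.63, avg_decline=0.47, avg_gain=0.35)

print(f"hitters:  {hitters:.1%}")
print(f"pitchers: {pitchers:.1%}")
```

Both groups land at a drop of roughly 17 to 18 percent under these assumptions, in the neighborhood of the 20 percent figure. The gap could come from rounding or from the overall number being computed differently.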

Moreover, 16 percent of hitters and 37 percent of pitchers never again matched their rookie-season WAR totals in any single season; in other words, they peaked as rookies. And seven percent of hitters and 11 percent of pitchers were worth fewer WAR cumulatively during the rest of their careers. Hitters gave back 29 percent of their annual WAR value, pitchers 47 percent. The volatility among pitchers likely can be attributed to how many of them emerged as dominant rookie closers. Few survive the ruthlessness of the closer carousel. It manages to chew up and spit out everyone who rides it.

2017 Candidates

In any other year, this might actually be an exciting discussion. Not that this discussion won’t be exciting. Perhaps the word I’m looking for is “competitive.” Outfielders Aaron Judge (NYY) and Cody Bellinger (LAD) have their respective awards locked down, barring a historically robust second-half performance by a competing rookie and equally-and-oppositely catastrophic collapses from the frontrunners.

American League

FanGraphs reader hscer compared Aaron Judge’s rookie season to baseball’s greatest offensive performances by 25-year-old rookies. But why stop there? By weighted runs above average (wRAA), the linchpin to the offensive component of WAR, Judge’s first-half mark (44.8 wRAA) ranks among the top 10 single-season marks since 1948–and that’s among all hitters, not just rookies.

Indeed, Judge’s 5.5 WAR by the All-Star break trumps Trout’s 4.7 WAR in 2012, albeit in 20 more games and 76 more plate appearances. Prorate Trout’s performance, and he trumps Judge’s overall value. But in offense alone, Judge’s .466 weighted on-base average (wOBA) trounces Trout’s .407 wOBA, even after controlling for the league’s robust output during the alleged Juiced Ball Era.
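The prorating claim can be checked without knowing either player's exact playing time. The sketch below assumes WAR scales linearly with plate appearances (a simplification) and solves for the Trout plate-appearance total at which his prorated WAR would merely tie Judge's. The function names are hypothetical helpers; the only inputs are the 4.7 WAR, 5.5 WAR and 76-plate-appearance gap quoted above.

```python
def prorate(war: float, own_pa: float, target_pa: float) -> float:
    """Scale WAR linearly from own_pa plate appearances to target_pa."""
    return war * target_pa / own_pa


def break_even_pa(trout_war: float = 4.7, judge_war: float = 5.5,
                  extra_pa: float = 76) -> float:
    """Trout PA total at which his prorated WAR would equal Judge's.

    Solves trout_war * (p + extra_pa) / p == judge_war for p.
    """
    return trout_war * extra_pa / (judge_war - trout_war)


threshold = break_even_pa()
print(f"Break-even Trout plate appearances: {threshold:.1f}")
```

If Trout came to the plate fewer times than that threshold before the 2012 break, which seems a safe bet for a first half, his prorated WAR tops Judge's 5.5.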

Alas, by simple extrapolation, Judge is poised to best Trout’s all-time WAR mark for a rookie (10.3) and flirt with the elusive 11-WAR season, achieved only 23 times* in baseball history (by a mere 12 players, I might add). Yet such extrapolation might be bold, perhaps reckless. His .426 batting average on balls in play (BABIP), if sustained for a full season, would be the highest since 1900. And while it may be safe to assume Judge is a hitter the likes of which we (who are alive) have never seen, it’s also relatively safe to assume he won’t betray the First Fundamental Law of Sabermetrics for another two months. (Indeed, as of writing this, his BABIP has already fallen 27 points to .399.)

*Twenty-five to account for Ty Cobb (1911) and Joe Morgan (1975), whose WARs, prior to rounding, fell somewhere between 10.95 and 10.99.

A hitter of prodigious power, it’s also possible Judge sustains an outlier rate of home runs to fly balls (HR/FB), similar to a man who, once upon a time, did Judge-esque things before court was in session; the parallels between Howard and Judge, at least in terms of age and power, are striking. And through it all, Howard hit with this much power through his age-26 season before it eroded a little and then a lot.

Which is all a roundabout way of saying you’d do well not to assume Judge keeps up his historic pace. If he finishes the season with, say, a 30 percent HR/FB and a .370 BABIP–fairly generous assumptions, albeit not outrageous ones–we’re still talking 45 home runs and a .270 batting average in a full season. That’ll do.

National League

There’s less to say of Bellinger, although that shouldn’t detract from his monumental achievements to date. It’s difficult to live in the shadow, figuratively and literally, of a gigantic man who has captured America’s collective baseball heart. Yet Bellinger has mashed his way to a record-setting rookie campaign as well, breaking all kinds of most-home-runs-in-X-games records en route to consecutive Rookie of the Month awards in May and June. (Judge, uh, has all three of the AL’s awards.)

Besides being about three inches shorter and 60-odd pounds lighter than Judge, Bellinger is cut from the same cloth as Judge, making tons of hard contact with plenty of fly balls to the pull side, where power plays up best. He also has the same downside: lots of whiffs. Fortunately for Bellinger (and the Los Angeles Dodgers), he’s a young, highly touted prospect. He will (or at least should) continue to develop his skills and refine his approach. The future, no doubt, is bright.

Data Visualization!

I don’t know how else to end this other than abruptly. If you clicked the link early on to skip my cockamamie Rookie of the Year chatter, hello! If you read everything, hello again! I wanted to develop a way to visualize and process the Rookie of the Year through a statistical lens. I hope it’s fun! Or, at the very least, it cures your workplace boredom for five sad minutes.

Quick tip: Using the dropdowns to the right, you can (1) toggle the X and Y axes to several different combinations of variables, and (2) filter and manipulate the data on the graph to your liking. You can also hover over and click on data points for more information. It’s very interactive, so just mess around with it — you’ll get the hang of it.

Glossary terms:

  • WAR: wins above replacement
  • FIP-: indexed fielding independent pitching (“FIP-minus”); pitchers only*
  • wRC+: indexed weighted runs created (“wRC-plus”); hitters only
  • TBF: total batters faced; pitchers only
  • PA: plate appearances; hitters only

*If you designate a hitter-only metric for one axis and a pitcher-only metric for the other, the graph will crap out. Fair warning!


Two-time FSWA award winner, including 2018 Baseball Writer of the Year, and 8-time award finalist. Featured in Lindy's magazine (2018, 2019), Rotowire magazine (2021), and Baseball Prospectus (2022, 2023). Biased toward a nicely rolled baseball pant.
Dennis Bedard
6 years ago

No need to apologize (twice) for the preceding narrative. It was fun reading while sipping my morning coffee.

Alex Chamberlain
6 years ago
Reply to  Dennis Bedard

(I won’t apologize for apologizing I won’t apologize for apologizing I won’t apologize for apologizing I won’t apol)

Thanks! I’m glad you enjoyed it!

6 years ago

This is great. I’d really like to also see it done with RA9-WAR for pitchers. Hellickson, for example, had an outstanding 5.2 WAR his rookie season by that metric, far exceeding his fWAR of 1.7. I’d guess voters have been much more influenced by ERA than by FIP, so it could explain a lot.

6 years ago
Reply to  TK07

And Hellickson’s rWAR basically splits the difference – 3.8. So there is a bit of disagreement about how to view his performance. Hellickson’s ERA has tended to be better than his FIP.

6 years ago

Gary Peters (6.5, 1963)

I said to myself “Who’s that?”

A two time All Star, received top ten MVP votes three times. His best season in 1966 Gary Peters had the lowest WHIP in the A.L. at 0.982 and an ERA of 1.98. For that effort over 204 innings Peters received no All Star placement and no postseason awards votes. Why? His W – L record was 12 -10.

As for Judge, he has 13 doubles. How many times have you watched a video of him hitting a ball the other way and thought “That’s a gapper” only to see it easily clear the RF fence? That kind of opposite field power will keep Judge around as a slugger.

Judge vs. Trout

This isn’t going to be close. Trout is 10.8 his rookie year. Judge is at 5.3 with 52 games left to play. Judge would have to be significantly better over those 52 games than he was during the first 52 games this year to catch Trout.

6 years ago
Reply to  JimmieFoXX

Yeah, but how many other AL pitchers had extremely low ERAs and WHIPs in 1966?

Paul G.
6 years ago

Ken Hubbs broke Bobby Doerr’s consecutive game and consecutive chances without an error records in his rookie season and, as a result, won the Gold Glove. He also led rookies in a bunch of offensive categories, mainly because he played almost every game and wasn’t awful. Modern metrics are unimpressed with his defense, but a 20-year-old full-time starter with elite defense and an adequate bat (at 20!) for a key defensive position is certainly an intriguing candidate, especially when one does not dig deeper.

Wes Roehl
6 years ago

Just a reminder that in Ryan Howard’s RoY season he played 88 games. So that 2.2 WAR came in a little more than half a season.

6 years ago

The RofY awards began in 1947, not 1948. Jackie Robinson was the first winner. On the numbers alone, and ignoring any other factor, Larry Jansen deserved to win.

6 years ago

I still say Albert Pujols has had the best rookie season of all time. He shouldn’t be punished just because his rookie season of 2001 just happened to fall at the very peak of MLB’s PED-assisted offensive explosion. I don’t care that he had a big advantage over rookies in pitching-dominant eras like the ’60s. He’s still put up the best raw numbers of any rookie in MLB history. Try creating an offensive counting metric based on wOBA rather than wRAA or wRC+ and see where the chips fall.