Cold Weather, Positions and Penalties

Catchers tend to fare better, offensively, in colder weather. (via Dirk Hansen)

Cold weather isn’t conducive to offense. Low temperatures raise air density, which reduces the carry of fly balls and saps games of hits and home runs. These unwelcoming, chilly conditions are infrequent in the regular season, but occur often in the postseason. It’s another way that hitters are hampered in the playoffs, since they’re already up against better pitching, better defense and heightened pressure.

Arguably, the cold-weather disadvantage is equal for both teams, since both are sent to the cooler to play.  But it’s possible that the cold affects some aspects of baseball—and certain players—more than others.  We’ll explore this topic in the next two days, breaking the ice with an analysis of how positional performance and baseball’s major penalties are impacted by the cold weather.


We’ll look at how wOBA changes across discrete five-degree ranges of temperature, with the line charts smoothed by the LOESS method. Our focus will be limited to night games so that sunshine doesn’t influence how players feel out on the field. For temperature data, we’ll use Retrosheet’s game-time Fahrenheit readings. When the reading is below 50 degrees and the wind speed is above 3 mph, the wind chill temperature is used in lieu of the standard air temperature.
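As a rough illustration of that temperature rule, here is a minimal sketch. The article doesn't say which wind chill translation was used, so the National Weather Service's 2001 wind chill formula is an assumption here, and the function names are hypothetical. Only the thresholds (below 50 degrees, wind above 3 mph) and the five-degree bins come from the text.

```python
def wind_chill_f(temp_f, wind_mph):
    """NWS (2001) wind chill index in degrees Fahrenheit (assumed formula)."""
    v = wind_mph ** 0.16
    return 35.74 + 0.6215 * temp_f - 35.75 * v + 0.4275 * temp_f * v


def effective_temp_f(temp_f, wind_mph):
    """Wind chill when it is below 50 F with wind above 3 mph, else air temperature."""
    if temp_f < 50 and wind_mph > 3:
        return wind_chill_f(temp_f, wind_mph)
    return temp_f


def five_degree_bin(temp_f):
    """Return the (low, high) edges of the five-degree bin containing temp_f."""
    low = int(temp_f // 5) * 5
    return (low, low + 5)


# A 40-degree game with a 10 mph wind is treated as roughly 34 degrees,
# which moves it from the 40-45 bin down to the 30-35 bin.
chilled = effective_temp_f(40, 10)
print(round(chilled), five_degree_bin(chilled))
```

A 60-degree game with the same wind keeps its air temperature, since the translation applies only below 50 degrees.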

All wOBA figures are adjusted for league run environment, batter quality, pitcher quality, park factor, and batter/pitcher handedness. Hitter- and pitcher-seasons needed a Marcel reliability score of 0.5 or higher to be included. Only open-air ballparks in cold-weather cities are considered (the list can be found in the resources section). Pinch hitters, substitutions, and pitchers-as-batters are excluded.  Plate appearances from 1988 and onward are used, for games in which temperature and wind speed data exist.

To keep the graphs uncluttered, confidence intervals aren’t shown. But these are important: just how uncertain is the calculated wOBA at a given temperature? That question is answered at my blog, where I post an individual chart for each curve, sandwiched inside a LOESS-smoothed 95 percent confidence band. The supplemental graphics will be referenced in this article because uncertainty is large in five-degree bins of cold temperatures, owing to a lack of games. The histogram below shows that even at “cold-weather parks,” the vast majority of contests take place in 60- to 85-degree weather.

[Figure: histogram of game-time temperatures]

Performance by Position

As abominable as the cold weather can be, it’s easier to tolerate when one is frequently in motion. Among position players, catchers are the most physically and mentally active on every pitch. Ranking behind backstops would be infielders, who get set in an active ready position on each pitch and are moving on the majority of plays (even strikeouts, when they throw the ball around the horn). Outfielders enter an idler crouch and don’t need to move an inch unless the ball is hit their way or they need to reposition themselves for the next hitter. To put that all together, I ask: Do players hit better in the cold if they’re situated at more active field positions?

The chart below divvies hitters up by whether their position was catcher, infielder, outfielder or designated hitter. Outfielders are subdivided by whether they made one (or more) plays in the prior half-inning; catchers and infielders weren’t split up further due to their heavy involvement in many plays. I include designated hitters and adjust for the offensive penalty associated with DHing rather than playing the field. As colors darken in the chart, players move farther down the positional spectrum.

[Figure: wOBA by temperature and position]

Before intersecting and overlapping at 50-55 degrees, the smoothed data come out drastically different. The two outfielder curves move in tandem throughout the chart, but especially in cold-weather games, where playmaking outfielders pick up a persistent, parallel advantage over their idler counterparts from 30 degrees up through 55. In hotter temperatures, the playmaking outfielders break apart from the idle outfielders and perform at the highest level.

Moving inward toward the diamond, infielders’ wOBA smoothly increases along with the temperature. Catchers do very well in the cold and poorly in the warmth, albeit through some of the smallest sample sizes at play in this study. The overall slope of designated hitters’ production, interestingly, stays flatter from cold to warm. We’ll revisit that finding soon; now, let’s expand the sample sizes and total up all positions’ wOBA before temperatures hit that convergence point of 55 degrees. In the table below, each positional group is ranked by wOBA.

Position Bin         PA      wOBA    St. Dev. (points)  95% Confidence Interval
C                    6,157   0.339   6.5                (.327, .352)
DH                   4,388   0.331   7.6                (.316, .346)
OF (made 1+ plays)   7,281   0.3241  5.9                (.313, .336)
INF                  27,661  0.3238  3.0                (.318, .330)
OF (no plays made)   13,728  0.322   4.3                (.314, .331)

Catchers still stack up at the top, followed by designated hitters. Infielders and active outfielders are neck and neck, with both picking up a ~2-point wOBA advantage over outfielders who didn’t make a play in the prior half-inning. Both the chart and table seem to show that the more active field positions perform the best in cold weather, which meets expectations.

Of course, the confidence intervals have quite a bit of overlap, so we can’t rule out the possibility that the “true” wOBA figures could actually rank in a different order. Another caveat is a point on timing: most regular-season cold-weather plate appearances are taken in April and May, so players are “fresher,” and hampered by fewer nagging injuries. This figures to be especially important for catchers, as their cold-weather stats are indicative of when their legs are at their strongest and their nicks are at their fewest. In contrast, the level of wear and tear on infielders and outfielders is much more comparable throughout the entirety of their curves.
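To make the overlap concrete, the sketch below rebuilds approximate 95 percent intervals from the wOBA and standard deviation columns of the positional table and checks every pair. It assumes a normal approximation (wOBA plus or minus 1.96 standard deviations), which the article doesn't spell out. Because the table's inputs are rounded, the endpoints can differ from the published ones by about a point.

```python
from itertools import combinations

Z_95 = 1.96  # two-sided 95 percent critical value, normal approximation (assumed)

# (wOBA, standard deviation in wOBA points), copied from the positional table
groups = {
    "C": (0.339, 6.5),
    "DH": (0.331, 7.6),
    "OF (made 1+ plays)": (0.3241, 5.9),
    "INF": (0.3238, 3.0),
    "OF (no plays made)": (0.322, 4.3),
}


def interval(woba, sd_points):
    """Approximate 95 percent interval; one wOBA point equals 0.001."""
    half_width = Z_95 * sd_points / 1000.0
    return (woba - half_width, woba + half_width)


def overlaps(a, b):
    """True when two (low, high) intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


intervals = {name: interval(*values) for name, values in groups.items()}
all_overlap = all(
    overlaps(intervals[a], intervals[b]) for a, b in combinations(intervals, 2)
)

for name, (low, high) in intervals.items():
    print(f"{name:20s} ({low:.3f}, {high:.3f})")
print("Every pair overlaps:", all_overlap)
```

Under these assumptions every pair of intervals overlaps, including catchers and infielders, whose ranges meet only at the edges.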

Where do these results leave designated hitters, who are never active in the field? My theory is that the DH penalty is alleviated by the cold, because DHs can keep warm in the dugout and clubhouse instead of standing around on a cold field. Is that what the DH curve’s steadiness in the chart hints at, in spite of its large uncertainty? Let’s check.

The DH Penalty

In the next table, we’ll calculate the DH penalty in five temperature-partitioned groups. The cold-weather plate appearances of interest are taken when it’s 55 degrees and cooler; the warmer remainder is split into quartiles. For readability, wOBA figures are rounded off at the standard third decimal and won’t perfectly match the penalty point differential (shown at the fourth decimal place). All stats were compiled with the delta method, with samples matching up by batter and temperature bin. Hitters needed a modest minimum of five PA to be included. Negative penalty figures indicate the presence of the DH penalty.

Temp. Bin (deg.)  PA as a DH  PA from the field  wOBA as a DH  wOBA from the field  Penalty (points)  St. Dev. (points)
55 and under      3,333       10,566             .328          .327                 +0.2              10.0
56 – 69           10,311      51,417             .332          .347                 -15.4             5.5
70 – 76           9,493       48,662             .332          .357                 -25.4             5.7
77 – 82           8,530       42,552             .324          .355                 -30.9             6.0
83 and up         8,209       46,204             .339          .357                 -19.0             6.1
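For readers unfamiliar with the delta method, here is a minimal sketch of a matched-pairs calculation for a single temperature bin. The weighting scheme (each batter weighted by the smaller of his two PA totals) and the reading of the five-PA minimum as applying to both roles are assumptions, since the article doesn't specify either. The function name and the sample batters are hypothetical.

```python
MIN_PA = 5  # assumed to apply to each batter's PA in both roles


def delta_method_penalty(matched_samples):
    """Return the DH penalty in wOBA points for one temperature bin.

    matched_samples: iterable of (pa_dh, woba_dh, pa_field, woba_field),
    one tuple per batter. Negative output means batters hit worse as a DH.
    """
    weighted_diff = 0.0
    total_weight = 0.0
    for pa_dh, woba_dh, pa_field, woba_field in matched_samples:
        weight = min(pa_dh, pa_field)  # assumed weighting: the smaller sample
        if weight < MIN_PA:
            continue  # batter lacks enough PA in one of the two roles
        weighted_diff += weight * (woba_dh - woba_field)
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no batter met the PA minimum in both roles")
    return 1000.0 * weighted_diff / total_weight


# Hypothetical batters in a single temperature bin (not real data)
example_bin = [
    (20, 0.320, 100, 0.350),
    (10, 0.340, 50, 0.330),
    (3, 0.500, 40, 0.300),  # dropped: fewer than five PA as a DH
]
penalty = delta_method_penalty(example_bin)
print(f"{penalty:+.1f} points")
```

With these made-up inputs the penalty comes out to about -16.7 points. Matching by batter means each hitter serves as his own control, so differences in talent between the DH and fielding pools don't leak into the estimate.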

I find that there’s no DH penalty in the cold weather. So if it’s chilly out and a team wants to get its heavy hitter out of the field, it could do so without worrying about the offensive toll imposed by the DH penalty. This result implies that cold weather produces a break-even point, where the various effects blend together and balance out to zero.

From there, the DH penalty follows a U-shaped path. It surges to 15 points in the 56 – 69 bin, worsens by another 10 and 5.5 points in the following two temperature groups, and lessens by 12 points in the final category. How can this pattern be explained? I surmise that just as it’s deleterious to stand out in the field during extremely cold temperatures, it’s also detrimental to play the field in extreme heat, albeit to a lesser extent. The DH penalty hits its damaging nadir in optimal baseball weather, when playing the field keeps players engaged and comfortable.

The interpretations become less tidy when we move to the rightmost column. The standard deviations here are big in all temperature ranges, especially the cold! At 55 degrees and under, an effect size of zero is our best guess, but the 95 percent confidence interval indicates that the “true” cold-weather effect could fall anywhere within a broad ~40-point range. Each of the warmer temperature groups posts between roughly 42,000 and 51,000 PA when not DHing, but they still produce confidence intervals with a ~22- to 24-point span.
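The interval widths quoted above can be reproduced from the table's penalty and standard deviation columns. This sketch again assumes a normal approximation, with the full width of a 95 percent interval equal to 2 × 1.96 standard deviations.

```python
Z_95 = 1.96  # two-sided 95 percent critical value, normal approximation (assumed)

# (penalty, standard deviation), both in wOBA points, from the DH penalty table
dh_penalty = {
    "55 and under": (0.2, 10.0),
    "56 - 69": (-15.4, 5.5),
    "70 - 76": (-25.4, 5.7),
    "77 - 82": (-30.9, 6.0),
    "83 and up": (-19.0, 6.1),
}

spans = {}
for temp_bin, (penalty, sd) in dh_penalty.items():
    half_width = Z_95 * sd
    spans[temp_bin] = 2 * half_width
    low, high = penalty - half_width, penalty + half_width
    print(f"{temp_bin:12s} ({low:+.1f}, {high:+.1f})  span {spans[temp_bin]:.1f}")
```

The cold bin's span works out to about 39 points, while the four warmer bins land between roughly 22 and 24 points.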

We can’t declare statistical significance for these DH penalty results (or the other positional results, for that matter). But the results do align well with our Bayesian prior, our baseball knowledge, and common sense. So I’m inclined to believe that we’re viewing the “true” pattern, supporting our takeaways even if we’re unsure about the actual magnitudes.

One follow-up question I have is whether the related pinch hitter penalty shows the same temperature-related trend. I looked into it by going back and creating a separate dataset that applied the same set-up to pinch hitters alone. The experiment ended quickly because the samples were just too small for analysis. For example: Pinch hitters had only 212 PA at 55 degrees and below. That’s hardly enough to make a good projection of any single hitter, let alone to judge whether a true effect exists for a subset.

The Times Through the Order Penalty

The times through the order (TTO) penalty is another of baseball’s well-known penalties: that starting pitcher performance worsens with familiarity and each turn through the lineup. In an article at Baseball Prospectus two years ago, Mitchel Lichtman took another deep dive into the TTO penalty and in one table separated day from night games. The timing split produced a fourth TTO penalty in the daytime, with the data also showing that TTO penalties in day games were ~3 points higher than those in night games. MGL concluded that pitchers’ penalty from familiarity does get mitigated when batters are hindered by the cooler temperatures of nighttime.

That raises another question for this study: Do TTO penalties shrink as night games go on and perhaps get colder? We’ll take a look with the next chart, which displays the wOBA curves for TTOs one, two, and three.


Curves for TTO1 and TTO3 largely run parallel all through their respective paths. As their slopes increase with rising temperature, the area between TTO1 and TTO3 does slightly increase. But what’s going on with TTO2? We’d expect it to fit neatly in between the curves for TTO1 and TTO3, but it doesn’t. It generally increases at a constant slope, although the line wobbles as games heat up.

Before trying to explain that curve, let’s consider the pooled stats via the same five temperature ranges used for DHs. With the delta method, I matched up samples by pitcher and temperature bin (with a minimum of six plate appearances), and calculated the penalties. Keeping in line with MGL’s findings on the TTO penalty, I dropped the top halves of first innings and all of ninth innings. Included in the table are penalties for the fourth (and onward) TTO. Again, negative numbers indicate total penalty points, here sustained by pitchers.

Temp. Bin (deg.)  1st TTO to 2nd TTO  2nd TTO to 3rd TTO  3rd TTO to 4th+ TTO
55 and under      -9.6                -4.7                -14.7
56 – 69           -7.9                -7.9                -2.7
70 – 76           -8.3                -1.8                +1.3
77 – 82           +4.1                -21.3               +16.1
83 and up         -1.6                -13.7               -1.3

As it gets warmer, we sort of see a lessening penalty from TTO1 to TTO2 and a rising penalty from TTO2 to TTO3. But both “trends” are inconsistent, in spite of some solid sample sizes (for all non-cold temperatures, the first two penalties typically contain 30,000 PA across every TTO to produce a joint SD of ~4 wOBA points). Most inconsistent of all are the smaller-sample 4th TTO trends, which jump around in a formless-looking pattern.

It’s hard to look at TTO2’s wobbling as anything but an aberration, maybe linked to the set-up of this study. If we ignore TTO2, we often see the aggregated ~15-point penalty that MGL found when a pitcher navigates through a batting order once, twice, and then three times at night. But even our summed penalties move around in these temperature bins, without showing a distinct trend. The TTO night-game effect found by MGL seems best applied in a general sense across all night games, without any additional temperature control.
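The summed penalties mentioned above come from adding the first two columns of the TTO table, which gives the total change from the first time through the order to the third. A short sketch of that arithmetic:

```python
# (1st TTO to 2nd TTO, 2nd TTO to 3rd TTO) penalties in wOBA points, from the table
tto_penalties = {
    "55 and under": (-9.6, -4.7),
    "56 - 69": (-7.9, -7.9),
    "70 - 76": (-8.3, -1.8),
    "77 - 82": (4.1, -21.3),
    "83 and up": (-1.6, -13.7),
}

# Total penalty from TTO1 to TTO3, skipping over the wobbly TTO2 curve
summed = {
    temp_bin: round(first + second, 1)
    for temp_bin, (first, second) in tto_penalties.items()
}

for temp_bin, total in summed.items():
    print(f"{temp_bin:12s} {total:+.1f}")
```

The totals range from about -10 to -17 points and sit near -15 in three of the five bins, with no steady movement from cold to warm.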

Concluding Remarks

We’ve gained insight on what to keep in mind when a front comes in: The DH penalty appears to be at its most damaging in perfect weather and most innocuous in extreme temperatures—especially the cold. Additionally, players at the more active field positions seem to fare better than their idler teammates in the cold.

There’s still more I want to discuss in this examination of cold-weather play. Tomorrow, we’ll look at how different player populations are affected by these adverse weather conditions.

References & Resources

Gerald Schifman is the lead researcher at Crain's New York Business and a writer at The Hardball Times. He previously worked in the New York Mets' baseball operations department and in Major League Baseball's publishing department. Follow him on Twitter @gschifman.
6 years ago

Nice work. I wonder if part of the extreme heat penalty is due to extreme heat generally being accompanied by extreme sun, reflections, shadows, etc. that can make it more difficult to pick up the incoming pitches.

6 years ago

While I enjoyed this article, I do think the OF didn’t need to be split up. While they may not have the need to “prep step” every pitch, though good fielders will, they also have the best opportunity to move around and keep themselves warm between pitches (also they should be actively backing up plays, but this is the bigs, so there’s no guarantee.)

I would be interested to look at the mixed effect of the cold and the length of the game. Does standing out in the cold (or heat) for longer have an effect on production? Time of game, or a team’s pitch count, may show that having your players out in the cold longer erodes their ability to generate offense.

Jetsy Extrano
6 years ago

Two things you mention in passing are pretty important: 1) we don’t actually see much here, with these uncertainty intervals, and 2) what we see as temperature may actually be effects of time within the season.

6 years ago

First, what is meant by saying that wOBA is adjusted for batter quality? I have no idea.

I have a tough time buying that catchers are the best hitters on the field in all games 50 deg. or colder. One issue you don’t mention is that the best catchers play more games at night and in the post-season (i.e., colder games) than their backups do.

For pitchers, you have to consider that SP quality and usage is far different in one of the cold months (Sep) as compared to the rest of the season.

Gerald Schifman
6 years ago
Reply to  evo34

The adjustment for batter quality is each hitter’s Marcel wOBA projection. This aims to place all subsets on equal footing, mitigating the issue you mention about catchers.

The adjustment for pitcher quality should wash out the issue of facing worse pitchers in September. Inexperienced call-ups drop out of the dataset due to the reliability requirement.

6 years ago

What is meant by “just how uncertain is the calculated wOBA at the given temperature?”

wOBA is a deterministic equation and we assume no measurement error in the events that it aggregates and weights (BB, HBP, 1B, 2B….). In other words, wOBA is precisely known. I think you are confusing the distribution of a sample of wOBA with uncertainty in wOBA.

Gerald Schifman
6 years ago
Reply to  nick

In that phrase, I aim to refer to the uncertainty in observed performance found at each subset and sample. Not any uncertainty in measured events or wOBA’s calculation itself.

6 years ago

“to uncertainty in observed performance”

There is trivial uncertainty in observed performance. Observed performance is equivalent to “measured events.”

This is exactly my confusion.

I believe your intent is to summarize a distribution of observations, rather than quantify uncertainty, as the observations are certainly known.

Perhaps you should invest in a statistics text.

6 years ago

These unwelcoming, chilly conditions are infrequent in the regular season,

Says someone who has never sat shivering in the damp night air of Safeco Field in May, or September, or even many evenings in July.