How reliable is UZR?
by Colin Wyers
July 1, 2009

Ultimate Zone Rating is now everyone’s favorite defensive stat on the Internet. But how reliable is it?

Let’s define reliable as year-to-year persistence. It’s not the only definition, but it’ll do. We’ll measure it using a weighted correlation coefficient. And for good measure, we’ll take a look at how to use it to regress to the mean. And for free, we’ll throw in a look at wOBA, a form of linear weights per plate appearance, as a point of reference.

In order to get this to work out correctly, we need to convert UZR into a rate stat. (Okay, maybe we don’t, but it’s the only way I’ve found so far.) What I did was take Chris Dial’s methodology for converting STATS Zone Rating into a UZR-like plus-minus rating and back-calculate a number that looks like ZR from a player’s UZR and their expected outs.

And now, the results of that trial:

          Obs    R      Const   Regress
IF UZR    107    0.36   190     64%
OF UZR     80    0.26   228     74%
wOBA      320    0.53   284     47%

I split UZR up by infield and outfield; it could be productive to further break down by position. I looked at stats from 2002-2008, yearly totals only. (I did the same with wOBA to provide an apples-to-apples comparison.)

The first column (Obs) is the average number of observations: either estimated chances (expected outs divided by average zone rating) for UZR or plate appearances for wOBA. Next (R) is the correlation between one year and the next. (It’s weighted by the harmonic mean of the chances in the two seasons.) Following that (Const) is the constant number of observations needed to regress to the mean, using this basic formula:

(Player’s Rate * Observations + League Average Rate * Constant) / (Observations + Constant)

And the last column (Regress) is that constant expressed as a percent: the share of the weight that falls on the league mean, Constant / (Observations + Constant), at the number of observations in column one. If that figure were 50%, then for a player with the number of observations in column one, you would put equal weight on his performance and on the league mean to estimate his true talent level.

The takeaway:

Everything regresses to the mean.
A hitter with 300 PAs should be regressed roughly 50% to the mean. (Assuming all you have is those 300 PAs, of course.)

Defensive metrics are less reliable than offensive metrics. (Which, as the wOBA row shows, are not as reliable as they are sometimes treated when it comes to determining a player’s inherent level of ability.)

An infielder’s UZR is more reliable than an outfielder’s UZR. This is partly because an outfielder sees fewer chances than an infielder, and partly because outfield defense is more difficult to measure than infield defense.
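For anyone who wants to play with the regression formula, here is a minimal Python sketch of it. The function names are mine, and the sample hitter (a .380 wOBA in 300 PAs against a .330 league average) is made up purely for illustration; only the Obs and Const figures come from the table. It also recomputes the Regress column as Constant / (Observations + Constant), which is the reading that matches the published percentages.

```python
def regress_to_mean(player_rate, observations, league_rate, constant):
    """Weighted blend of a player's observed rate and the league average.

    Implements: (Player's Rate * Observations + League Average Rate * Constant)
                / (Observations + Constant)
    """
    return (player_rate * observations + league_rate * constant) / (
        observations + constant
    )


def regression_fraction(observations, constant):
    """Share of the estimate's weight that falls on the league mean."""
    return constant / (observations + constant)


# (Obs, Const) pairs taken from the results table.
TABLE = {
    "IF UZR": (107, 190),
    "OF UZR": (80, 228),
    "wOBA": (320, 284),
}

for name, (obs, const) in TABLE.items():
    print(f"{name}: regress {regression_fraction(obs, const):.0%} to the mean")

# Hypothetical hitter: the rates below are illustrative, not from the article.
estimate = regress_to_mean(
    player_rate=0.380, observations=300, league_rate=0.330, constant=284
)
print(f"Estimated true-talent wOBA: {estimate:.3f}")
```

The design point is that the constant stays fixed while the number of observations varies, so the same function covers a part-timer and a full-season regular; more chances simply pull the estimate further from the league mean and closer to the player's own rate.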
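The year-to-year correlation itself can be sketched the same way. What follows is one plausible reading of "weighted by the harmonic mean of the chances in the two seasons": a weighted Pearson correlation where each player-season pair gets a weight equal to the harmonic mean of its two chance totals. The exact computation behind the table may differ, the helper names are mine, and the five season pairs are invented so the code has something to run on.

```python
import math


def harmonic_mean(a, b):
    """Harmonic mean of two non-negative counts (0 if both are 0)."""
    return 2 * a * b / (a + b) if (a + b) > 0 else 0.0


def weighted_correlation(x, y, w):
    """Weighted Pearson correlation of x and y with weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: (rate in year 1, chances in year 1,
#                     rate in year 2, chances in year 2)
seasons = [
    (0.82, 110, 0.80, 95),
    (0.78, 60, 0.81, 120),
    (0.85, 130, 0.83, 100),
    (0.76, 90, 0.79, 40),
    (0.80, 105, 0.78, 115),
]

year1 = [s[0] for s in seasons]
year2 = [s[2] for s in seasons]
weights = [harmonic_mean(s[1], s[3]) for s in seasons]

r = weighted_correlation(year1, year2, weights)
print(f"Weighted year-to-year correlation: {r:.2f}")
```

The harmonic mean is dominated by the smaller of the two seasons, so a pair with one full year and one short year counts for little, which seems like the right behavior when the short year is mostly noise.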