A Quick Comparison of UZR and Plus/Minus

by JT Jordan
April 11, 2010

FanGraphs has now added the Plus/Minus system to its already impressive array of baseball metrics, making the site even more useful. Now that the data is freely accessible, we can export it rather easily and dissect it as we please. So, let’s cut it up a bit and take a look at how it compares to UZR*.

First, Range:

Range	r	r^2	MAE	Max	#10+
1B	0.778	0.606	1.4	14.6	9
2B	0.809	0.655	1.74	16.4	17
3B	0.871	0.759	1.6	16.7	10
SS	0.770	0.593	2.14	19.5	42
LF	0.797	0.635	1.41	17.9	10
CF	0.750	0.562	2.03	18.5	27
RF	0.769	0.592	1.55	16.9	20

All in all, the systems usually rate players within 2 runs of one another, which is nice to see. “MAE” is the mean absolute error between the two systems, in runs. “Max,” by the way, refers to the maximum absolute difference between UZR and +/-. In other words, it is the largest difference in any one player’s range rating. To be honest, I’m shocked at how large those differences are. It’s not as if we’re looking at one or two outliers, either: “#10+” indicates the number of players that showed a difference of 10 runs or more between the systems. At roughly 10 runs per win, each position has some players who are being rated around 1.5 to 2 wins differently. The lack of agreement at shortstop (relative to other positions) is even more surprising to me. The “max” players, with their Plus/Minus range and UZR range, respectively:

1B: Mark Teixeira (2003), +21, +6.4
2B: Ian Kinsler (2007), +5, -11.4
3B: Scott Rolen (2003), -5, +11.7
SS: Rafael Furcal (2005), +20, +0.5
LF: Manny Ramirez (2003), -14, +3.9
CF: Andruw Jones (2005), +7, +22.3
RF: Ken Griffey Jr. (2007), -5, -21.9
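
For readers who want to reproduce this kind of table from an export, here is a minimal Python sketch of how the columns (r, r^2, MAE, Max, #10+) could be computed for one position. It is an illustration, not the exact procedure used for the figures above: the function name `compare_systems`, the `big_gap` parameter, and the toy numbers are all assumptions made for the example.

```python
from math import sqrt


def compare_systems(plus_minus, uzr, big_gap=10.0):
    """Summarize agreement between two paired lists of run values.

    Hypothetical helper: plus_minus[i] and uzr[i] are assumed to be the two
    systems' ratings for the same player-season at the same position.
    """
    n = len(plus_minus)
    if n != len(uzr) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")

    mean_pm = sum(plus_minus) / n
    mean_uzr = sum(uzr) / n

    # Pearson correlation from the sums of squares and cross-products.
    cov = sum((p - mean_pm) * (u - mean_uzr) for p, u in zip(plus_minus, uzr))
    var_pm = sum((p - mean_pm) ** 2 for p in plus_minus)
    var_uzr = sum((u - mean_uzr) ** 2 for u in uzr)
    if var_pm == 0 or var_uzr == 0:
        raise ValueError("correlation is undefined when one list is constant")
    r = cov / sqrt(var_pm * var_uzr)

    # Per-player absolute difference, in runs.
    gaps = [abs(p - u) for p, u in zip(plus_minus, uzr)]

    return {
        "r": r,
        "r2": r * r,
        "mae": sum(gaps) / n,                           # mean absolute error
        "max": max(gaps),                               # the "Max" column
        "n_big": sum(1 for g in gaps if g >= big_gap),  # the "#10+" column
    }


# Toy numbers, made up for illustration (not real player data).
toy_pm = [10, -4, 3, 0, -12]
toy_uzr = [8.0, -1.0, 5.0, 1.0, 0.0]
summary = compare_systems(toy_pm, toy_uzr)
# The gaps are 2, 3, 2, 1 and 12 runs, so MAE is 4.0, Max is 12.0,
# and exactly one player clears the 10-run threshold.
print(summary)
```

Running this once per position on the exported data would give one row of the table per call.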

EDIT 4/13: Here are the figures for “qualified” players (minimum ~900 innings):

Range	r	r^2	MAE
1B	0.796	0.633	3.51
2B	0.802	0.643	4.73
3B	0.877	0.768	4.32
SS	0.754	0.568	5.82
LF	0.831	0.690	4.53
CF	0.735	0.541	5.48
RF	0.820	0.672	4.58

The average error looks better now. Shortstop and center field have the highest level of disagreement, while third base looks to have the best agreement.

And, of course, the arm/double play ratings:

EDIT: David Appelman has pointed out that the arm ratings are not on the same scale, which throws the numbers off a bit.

DP/ARM	r	r^2	MAE	Max
2B	0.584	0.342	0.63	6.6
SS	0.774	0.600	0.41	3.5
LF	0.700	0.491	0.76	8.4
CF	0.742	0.551	0.86	10.5
RF	0.761	0.579	0.86	9.3

Oddly enough, as much disagreement as there is at shortstop in terms of range, the exact opposite is true of double play ratings. Second base looks a bit fishy to me, but multiple tests have given the same result. Overall, we’re still looking at an average error of about 1 run per position, but some of the differences are striking. Aaron Rowand’s arm rating in 2007 was +11 compared to a UZR rating of +0.5, a gap of a full win of value. Richard Hidalgo received close to a win of extra value in 2004, and Carl Crawford was 8.4 runs better according to Plus/Minus in 2005. Robinson Cano’s 2007 Plus/Minus rated him as a +9, while UZR suggested a more modest +2.4, and Michael Young’s 2006 Plus/Minus rated him as a +1 at turning the DP, while UZR rated him as a -2.5.

While the two systems track each other quite well, it’s interesting to see some of the large discrepancies between them. I don’t know the exact reason for this; I’ll leave that up to the Mitchel Lichtmans and the John Dewans of the world to discuss.** I get the feeling this subject is going to be heavily discussed for quite a while, and it’ll be interesting to see the work that is to come.

*All information here from 2003-2009.

**You can read about the differences between the systems here.

Edit: Changed figures at 1:15 AM PST. I overlooked ErrR in UZR’s range component. Mea maxima culpa.

8 Comments

Colin Wyers

Any cutoffs as far as playing time? And were you comparing DRS to UZR? Because that would include the Arm/DP ratings in the first comparison.

JT Jordan

No cutoff for playing time. As for the first comparison, I used the “range” portion of the metrics, so Arms and DP were excluded.

Probably should include Error runs in there, if you haven’t already.

You’re probably moderately overreporting the correlation and dramatically understating the average error, then.

Remember, UZR and Plus/Minus can be boiled down to:

Rate * Opportunity

The opportunity component is the same (or pretty close) between systems. So that’s “inflating” the correlation a little.

And, since these aren’t rates, the MAE is reported based upon the average player in your sample, who has very little playing time (about 30 games worth, give or take). Prorate everything out to 150 games, and you’re probably looking at an MAE of about 6-7.
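
As a rough illustration of the proration described in this comment, the sketch below scales a run total to a 150-game workload. The function name `prorate_runs`, the 9-innings-per-game conversion, and the sample numbers are assumptions made for the example, not anything taken from either system.

```python
def prorate_runs(runs, innings, games=150, innings_per_game=9.0):
    """Scale a run total to a full-season workload.

    Hypothetical helper: treats the rating as Rate * Opportunity and simply
    swaps in a fixed opportunity of `games` full games.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    return runs * (games * innings_per_game) / innings


# A 1.5-run gap over about 30 games (270 innings) scales to 7.5 runs over 150 games.
print(prorate_runs(1.5, 270))
```

Simple scaling like this exaggerates noise for players with very few innings, which is one reason a playing-time cutoff is the more common fix.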

That’s a very good point.

When I get the chance, I’ll re-do it with the “qualified” players and incorporate Error Runs into UZR’s Range portion.

The “range” portion has been fixed. I need to get some shuteye, and will take care of the rest (hopefully!) soon.

dkappelman

You should know that the ARM ratings for Plus/Minus and UZR are not on the same scale. ARM for Plus/Minus is not zeroed out, and the totals come to an excess of some +250 runs each year.

Thanks for pointing that out, David. I’m assuming those arm ratings are absolute runs saved. Any chance they’ll be normalized in the future?

Updated with qualified players.
