Ten Things I Didn’t Know Last Week

by Dave Studeman
July 26, 2007

The Cubs may go for a billion dollars.

So far, the highest price paid for a major league baseball team has been the $700 million John Henry and Tom Werner paid for the Boston Red Sox (including Fenway Park and 80% of NESN). Now that Mark Cuban has entered the bidding, the Cubs may break the record and soar all the way to $1 billion.

Forbes magazine recently estimated that the Cubs are worth around $600 million. A billion is a lot more. Can the Cubs be worth it? Well, no, normally you wouldn’t say that a business with $22 million in operating income is worth a billion dollars. But this is baseball, not a normal business. As this New York Times column says (subscription required, but it’s free), “Owners’ complaints about losses leave out the basic fact that capital gains are profits, too. And when scores of grown men desperately want to buy sports teams, there tend to be big capital gains.”

And that’s the key, isn’t it? We can say that spending a billion dollars for a baseball club that only makes $22 million is a bad idea, but as long as there are new buyers in the future willing to pay more and more, it really isn’t. In the end, the owners will get richer, no matter how much they pay free agents.

Personally, I don’t doubt that there will be future buyers of ballclubs who pay more exorbitant prices. We are living in a bit of a Gilded Age: strong economic growth with a number of very rich capitalists at the top of the food chain. The immense wealth being gathered by the top .01% of our country is a bullish sign for baseball club owners.

And when people own baseball clubs in order to max out their eventual sale, it only makes sense to invest most of their short-term revenue in the ballclub instead of holding it in investments. In other words, farm systems and free agents are worth more to club owners than stocks and bonds. Getting the best sales bid often requires competitive teams with a strong community presence; that takes continual maximum investment in the ballclub.

So watch those free agent salaries continue to soar. But if the capital gains tax increases, or the marginal tax rate on the top income bracket rises, the rich elite will take a hit and perhaps pull back on their ballclub purchasing. That will be your free agent bear sign.

The mid-year Gold Gloves

Mitchel Lichtman released his Ultimate Zone Rating stats for the first half of the year (thanks, MGL!), and I’ve also had a chance to review the THT Revised Zone Ratings and some Baseball Info Solutions statistics. Based on those stats, I nominate the following players as my Gold Glove choices of the year:

First base: UZR likes Todd Helton, but I’ve got to go with Albert Pujols again. His .861 Zone Rating leads the majors and he’s shown good range outside of the zone. Plus, he continues to make fine defensive plays such as handling tough throws to first. On the other hand, Dmitri Young has been horrific.

Second base: UZR and RZR agree that Mark Ellis deserves the mid-year Gold Glove, with a special nod to Chase Utley. Jeff Kent and Rickie Weeks have been showing the range of snails.

Shortstop: With Adam Everett on the disabled list, both metrics agree that Troy Tulowitzki is the best fielding shortstop so far this year. I’ve only seen Tulowitzki play a little, so I’ve got to catch more Rockies games. Hanley Ramirez has been having a fine year at bat, but he hasn’t been getting it done in the field.

Third base: Pedro Feliz has been outstanding for San Francisco, and Scott Rolen appears to have found his old magic at the hot corner. Edwin Encarnacion has nothin’.

Left field: Alfonso Soriano has not only exhibited fine range, but he’s got the second-best arm ranking among all left fielders. Hard to believe that this former second baseman has made such a great switch, but this is the second year he’s ranked highly (after those initial embarrassing games in Washington last year). Manny Ramirez and Raul Ibanez. That is all.

Center field: The difference between UZR/Stats data and RZR/BIS data really pops up in center field (get it?). UZR rates Grady Sizemore far ahead of the pack, but RZR gives the nod to Andruw Jones. If someone wants to study what’s going on with these two data sources, I’d start in center. In the meantime, trust your eyeballs. Mine go with Carlos Beltran.

Right field: Shane Victorino and Magglio Ordonez rate highly for both their range and arms, according to UZR and RZR. Jermaine Dye has apparently lost a lot of range this year, and it’s a good thing that Ken Griffey Jr. has been moved to right, but he doesn’t rate highly there, either.

Rick Morrissey is a goof.

I originally had a harsher word inserted after “a,” but this is a family website. Had to tone it down.

Rick Morrissey is a columnist for the Chicago Tribune, so I read every one of his columns. My opinion is that he’s a good writer who often just doesn’t have a clue. In a recent column called “Jackson’s Logic on Bonds a stretch,” Morrissey pooh-poohed the idea that Babe Ruth would have hit fewer home runs if blacks had been allowed to play major league ball in the 1920s.

Here is Jesse Jackson’s quote:

“My question is if 400 guys tested positive, do you put asterisks by all their names?” Jackson told the Sun-Times. “Do you put asterisks by [spitballer] Gaylord Perry’s name? Do you put asterisks by guys who had the ultimate enhancement [by] denying others a chance to compete?”

Here is Morrissey’s spin:

This line of thinking is based on what? That because blacks dominate in sports such as basketball, it logically follows that they would dominate in baseball?

No, Rick Morrissey, that isn’t what Jackson is saying. It’s a simple fact, really. If you increase the pool of available talent, the level of play will increase. Babe Ruth would have hit fewer home runs had blacks been allowed to play major league ball. Walter Johnson would have had a higher ERA.

Get this Morrissey whopper:

If you use Jackson’s logic, the stats of Negro leagues star Josh Gibson are suspect too. Because he wasn’t allowed to compete against whites in the big leagues, how many of his reported 962 homers came against substandard talent? Shouldn’t we call that number the ultimate enhancement too?

Yes, Rick, that is exactly right. But no one is saying that Gibson’s home run record should be the major league record. All records occur in context, such as steroids, dead balls and the color line, and context should be recognized.

Ah, my anger renders me inarticulate. Let me turn it over to my favorite baseball writer:

You see what I’m getting at here … It isn’t just that Babe Ruth might not have hit 714 homers had he played against men of color. It’s that Oscar Charleston might have hit 745 homers and stolen 800 bases. It’s that Josh Gibson may have hit 68 homers in a season. You are, after all, judged against the players of your time. Ruth is amazing because he was SO MUCH BETTER than anyone else of his time. But that may not have been so had black athletes been allowed in the game.

Thank you, Joe. Would you be willing to move to the Chicago Tribune?

So was Paul Richards

I’ve just started reading Bill Giles’ very entertaining new book, Pouring Six Beers at a Time. It’s full of good stories about baseball and people. For instance, Paul Richards was probably one of the least ethical people you’ll ever want to meet. He constantly falsified expense reports, cheated at golf (he actually carried two putters: a long one for his “gimme” putts and a short one for his opponents’) and was even suspected of taking kickbacks from young players for their signing bonuses.

According to the book, Richards claimed that he once offered Houston’s entire 40-man roster for Detroit’s. That was back in 1967, a year before the Tigers won the World Series and when the Astros’ roster was filled with great young players, like Joe Morgan and Jimmy Wynn. Crazy.

The real Texas Rangers MVP.

Don’t you hate it when people cherry-pick stats? My favorite is “the team is only 2-40 when down by two or more runs in the ninth.” Well, duh. Everyone is 2-40 when behind that much in the ninth. (Note: not real numbers!!!)

Here’s another one I’ve been hearing lately, talking about Mark Teixeira: Texas has shown it can win without him, going 16-12 while he was out.

Scott Lucas took that one to task, and found that Brad Wilkerson (he of the .229 batting average) must be the team MVP, because the Rangers are 13-26 when he isn’t in the lineup! The second-best record is Ramon Vazquez’s.

It’s a tough time to be a Rangers fan. The club is in a void, in which neither down nor up looks like a particularly good strategy. Hopefully, Jon Daniels can find some creativity.

How to apply a baseline to WPA.

Last week, I presented a list of the best pitchers, covering 1957 to the present, using Win Probability Added. The list raised a few guffaws, most notably because it didn’t seem to take the impact of career length into account. So I went back to the tables and calculated WPAB (yes, a new acronym!), or Win Probability Added (above) Baseline.

You can think of the baseline as replacement level or bench level or what-have-you. I essentially inserted a .350 winning percentage into the figures to see who added the most wins above that level (instead of a .500 winning percentage, which is what WPA uses). This method will give more credit to players who have long careers with many average years, as you can see in this list of the top twenty pitchers:

Pitcher                    WPAB
Clemens, Roger              112
Maddux, Greg                 97
Seaver, Tom                  95
Ryan, Nolan                  86
Johnson, Randy               82
Perry, Gaylord               81
Palmer, Jim                  80
Sutton, Don                  80
Glavine, Tom                 72
Blyleven, Bert               71
Smoltz, John                 69
Carlton, Steve               69
John, Tommy                  68
Gibson, Bob                  68
Brown, Kevin                 65
Mussina, Mike                63
Eckersley, Dennis            61
Marichal, Juan               61
Schilling, Curt              58
Drysdale, Don                57

First of all, you might have noticed that there are no relievers on this list, other than Eckersley (who spent half of his career as a starter). Secondly, pitchers with long careers certainly rank better here. For instance, Phil Niekro just misses the list of top twenty (he’s 22nd). Koufax (who was 16th in WPA) drops to 25th, just ahead of Jerry Koosman.
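Here is a rough sketch of why a lower baseline rewards long careers. I haven't spelled out the exact arithmetic above, so treat this as an assumption: WPA is taken as wins above .500, and a pitcher's workload is expressed in "game equivalents," which is a made-up measure for this illustration. The function name and the two pitchers are hypothetical.

```python
# Sketch only, under stated assumptions: WPA counts wins above a .500
# baseline, and "game_equivalents" is a hypothetical workload measure.

def wpa_above_baseline(wpa, game_equivalents, baseline=0.350):
    """Re-express WPA (wins above .500) as wins above a lower baseline."""
    # Each game equivalent of work earns the gap between .500 and the baseline.
    return wpa + (0.500 - baseline) * game_equivalents

# Two hypothetical pitchers with identical WPA but different career lengths.
short_career = wpa_above_baseline(wpa=30.0, game_equivalents=100)
long_career = wpa_above_baseline(wpa=30.0, game_equivalents=300)

print(round(short_career, 1), round(long_career, 1))
```

With the same WPA, the pitcher with three times the workload picks up three times the baseline credit, which is the effect you see in the list.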

You can also see why, in his original Historical Baseball Abstract, Bill James rated players by both career value and peak value. It’s pretty difficult to devise a metric that combines the two adequately. Some people would suggest using 50% WPA and 50% WPAB, but why 50%? It would be nice to find some value that has intrinsic meaning.

Notice how I didn’t mention Bert Blyleven and the Hall of Fame? Oops.

Willie Hernandez deserved that MVP.

Willie Hernandez had quite a year in 1984. He posted a 1.92 ERA in 80 games and 140 innings, with a 9-3 record and 32 saves. He was a critical contributor to Detroit’s World Series championship, and became the third reliever to ever win the Most Valuable Player award (after Jim Konstanty in 1950 and Rollie Fingers in 1981).

According to WPA and WPAB, he deserved that MVP. Hernandez posted 8.7 WPA in 1984, equivalent to Ryan Howard last year. In comparison, last year’s top reliever, Francisco Rodriguez, had a WPA of “just” 5.4 wins.

On the other hand, Rollie Fingers wasn’t the WPA leader in 1981, despite a 1.04 ERA. Dwight Evans led the league in WPA that year, with six (remember the strike?), followed by Steve McCatty and Dwayne Murphy. Rollie Fingers was fourth.

I can’t wait for Retrosheet to release the 1950 season, so we can see how Jim Konstanty, the first reliever to ever win an MVP, rates.

Nellie Fox, MVP

The White Sox won the American League title in 1959, and Nellie Fox (with a batting line of .306/.380/.389) was voted the MVP. If you’re like me, you’ve been perhaps a bit skeptical of this award, even though Win Shares has him tied for first in the league with Mickey Mantle.

Well, you can wipe that smirk off my face. Fox probably did deserve the MVP in 1959. His batting WPA was tied for second in the league (with Minnie Minoso), despite that lousy slugging percentage. The splits reveal the reason: Fox batted .383 with runners in scoring position in 1959, and .344 with RISP and two outs. He batted .364 in the ninth inning. When you combine his clutch hitting with his excellent glove, you’ve got yourself an MVP.

I wonder how many number two hitters have been MVPs?

A new way of using the Pythagorean formula.

Hopefully, you know what the Pythagorean formula is. If not, take a break and read this definition.

Thanks. It turns out that Matt Souders (aka SABRMatt) has developed a new approach that appears to improve the Pythagorean approach slightly. It takes more work, so I don’t know if you’ll find it worth the trouble, but it’s still kind of interesting. In essence, he applies the Pythagorean formula to each game and then takes the team’s average (instead of applying the Pythagorean formula to the average number of runs scored and allowed per game). He does this because it helps offset the impact that blowouts (like 20-5 wins; see Yankees-Tampa Bay) have on the formula.
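To make the difference concrete, here is a minimal sketch of the two approaches. It assumes the classic exponent of 2 for both; Matt's actual implementation may use a different exponent or other adjustments. The function names and the five-game stretch are invented for illustration.

```python
# Sketch only: exponent 2 is an assumption, and the game scores are made up.

def pythagorean(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def season_pythagorean(games, exponent=2.0):
    """Usual approach: apply the formula once to the run totals."""
    total_rs = sum(rs for rs, _ in games)
    total_ra = sum(ra for _, ra in games)
    return pythagorean(total_rs, total_ra, exponent)

def per_game_pythagorean(games, exponent=2.0):
    """Per-game approach: apply the formula to each game, then average."""
    return sum(pythagorean(rs, ra, exponent) for rs, ra in games) / len(games)

# Hypothetical stretch: one 20-5 blowout win and four one-run losses.
games = [(20, 5), (2, 3), (3, 4), (1, 2), (4, 5)]

season = season_pythagorean(games)
per_game = per_game_pythagorean(games)
print(round(season, 3), round(per_game, 3))
```

In this made-up stretch the team went 1-4, yet the season-total version sees a team outscoring its opponents 30-19 and expects it to win about 71% of the time. The per-game average comes out around 44%, because the blowout only counts as one game.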

Does it work? Well, Matt found that it fits the historic winning percentage of teams better than either the Pythagorean formula or “Pythagenpat.” And he’s applied it on a regular basis this year. Still, Matt didn’t say whether his formula did a better job of predicting team performance, so I thought I’d give it a try.
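For reference, here is a sketch of Pythagenpat as it is commonly described: the exponent is not fixed but grows with the run environment. The constant 0.287 is one commonly cited value and is an assumption here, as are the function name and the sample team.

```python
# Sketch only: k=0.287 is a commonly cited constant, assumed here; other
# values close to it are also in circulation.

def pythagenpat(runs_scored, runs_allowed, games, k=0.287):
    """Pythagorean winning percentage with a run-environment exponent."""
    runs_per_game = (runs_scored + runs_allowed) / games
    exponent = runs_per_game ** k
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# Hypothetical team: 800 runs scored, 700 allowed over 162 games.
expected = pythagenpat(800, 700, 162)
print(round(expected, 3))
```

The more runs per game, the larger the exponent, so a given run ratio translates into a more lopsided expected record in high-scoring environments.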

I took the first half of 2006 and calculated each team’s winning percentage, Pythagenpat percentage and Matt’s approach, then compared how well each metric predicted that team’s second-half record. Here’s what I found:

A team’s first-half winning percentage had a .05 R-squared with its second-half winning percentage. In other words, a team’s first-half record had very little correlation with its second-half record. Think about that for a moment.

The reason is simple: regression to the mean. Here’s a little table of how the bottom-half teams did compared to the top half (based on first-half records):
```
              First  Second   Diff
Bottom Tier    .440    .479   .039
Top Tier       .560    .520  -.040
```
Both sets of teams played nearly .500 ball the second half of the year. Crazy, huh?

The Pythagenpat formula was a definite improvement over the first-half winning percentage, with an R-squared of .11.

Matt’s method was a slight improvement over Pythagenpat; R-squared of .12.

This was not a definitive study by any means, but it does indicate that Matt’s method has some merit. Most importantly, however, this little exercise reminded me of the power of regression to the mean. If you want to predict how a team will perform in the second half, regress first, apply a Pythagorean formula second.
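If you want to try the "regress first" step yourself, here is a minimal sketch. The tier averages come from the little table above; everything else (the function names, the idea of reading a single "retained share" off two tier averages, and the .600 example team) is an illustrative assumption, not a fitted model.

```python
# Sketch only: helper names are hypothetical, and two tier averages are far
# too little data to pin down a real regression weight.

def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)

def retained_share(first, second, mean=0.500):
    """Share of a first-half edge over .500 that survived the second half."""
    return (second - mean) / (first - mean)

def regress_to_mean(win_pct, weight, mean=0.500):
    """Pull a winning percentage toward the mean, keeping `weight` of the gap."""
    return mean + weight * (win_pct - mean)

# Tier averages from the table above: (first half, second half).
bottom = retained_share(0.440, 0.479)
top = retained_share(0.560, 0.520)
print(round(bottom, 2), round(top, 2))

# A hypothetical .600 first-half team, keeping about a third of its edge.
projection = regress_to_mean(0.600, weight=(bottom + top) / 2)
print(round(projection, 3))
```

Both tiers kept only about a third of their first-half distance from .500. You could feed `r_squared` two lists of first-half and second-half percentages (or Pythagenpat estimates) to repeat the comparison with your own data.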

By the way, this clearly wasn’t a definitive study of Matt’s formula, just an inquiry, if you will.

Checkers is just complex tic-tac-toe

A mathematical proof has been published that checkers will always result in a tie, if neither side makes a mistake. Took the guy 19 years to develop the proof. Remember WarGames, the Matthew Broderick movie in which the Pentagon’s computer went haywire when it realized that some games are just unwinnable?

References & Resources
For more about regression to the mean and team records, read Clay Davenport’s article and the subsequent discussion at Batter’s Box.
