Interview: Dave “Retrosheet” Smith

I recently had the pleasure of picking the brain of David Smith, the man who brought the world the fantastic and underrated website Retrosheet, AKA the website that collects the box scores and play-by-play accounts of every game it can get its hands on.

Tell us all a little about your background—age, family, where you’re from, etc.

I am 59 years old, married and have a 10-year-old son. I was born in Dayton, Ohio, where I lived for two years, and we then moved to Connecticut when I was two. From age seven I grew up 30 miles north of San Diego, in Escondido, California. I have lived in Delaware since 1975.

What do you do for a living?

I am a biology professor at the University of Delaware.

How did you first become a baseball fan?

My mother has always been a big fan (and still is) and I started watching Brooklyn games on TV with her in the early 1950s when we lived in Connecticut.

Did you play at all as a kid? Were you any good?

Yes, I started at age 10 (Little League) and played into the middle of the season of my junior year in college. I was a catcher. I wasn’t great, but I was good enough to play regularly all those years. I led my high school team in home runs (and strikeouts) one year.

Favorite team? Do you have a favorite team in the other league?

My favorite has always been the Dodgers. I was also a fan of the PCL Padres and still root for them unless they play LA. In the AL, I like the Orioles very much as well, at least partly because my wife is a lifelong Oriole fan.

Favorite baseball memory?

The 1963 World Series. I was 15 and just lived and died with the Dodgers all year. It was really tough when St. Louis got close in September because everyone kept remembering the previous year when they lost the lead and playoff to the Giants. Getting a sweep over the Yankees in the World Series was the most wonderful possible way to cap it off.

Favorite player? Least favorite?

My favorite was always Sandy Koufax. That started long before he was a great pitcher. The first game I attended was July 18, 1958 in the LA Coliseum which I badgered my parents into taking me to because Koufax was pitching. He faced six batters, struck out two, walked four and was gone. Amazingly he then started the next night as well, but I was not there. I couldn’t have been happier when he turned into a star.

My least favorite is hard to pick, because I don’t have such strong negative feelings. I suppose if I have to pick one, it would be Juan Marichal. As with all such choices, this is completely personal and this mostly reflects the feelings of an ardent Dodger fan about one of the superb stars of the mortal enemy.

How many games do you watch in a given season?

I watch a game almost every day on TV. In person I usually see between five and 10. This year I have been at the park for seven.

How did you make the move from just being a baseball fan to being an active disseminator of baseball information for others?

I have to give credit to Bill James for having the national presence and foresight to start Project Scoresheet. I had been collecting my own play-by-play accounts at that point for about 25 years, but Bill showed that there were others who shared my passion for this information at least to some degree. I must add that the contributions of Project Scoresheet to Retrosheet were the scoring system and the software. As an organization, Project Scoresheet is a very poor model for what Retrosheet has become.

What inspired you to start this website?

The organization began informally in 1989 with the plan to distribute data by floppy disk. Of course, it would be four years before we had any full seasons to distribute! By that time, the internet was really beginning to take off and it was an obvious choice for us. Since our bedrock principle from day one has been that we would always give our information for free to anyone who wanted it, the internet was not only convenient, but an ideal vehicle for us. So in 1994 the website started in a very modest way.

How do you finance the site?

We don’t have much in the way of real dollar costs, just the monthly fee for the website and some expenses for copying game accounts, etc. Those costs are easily covered by donations we receive, the bulk of which have come from the Retrosheet Board of Directors and webmaster. Our major “financing” is from the enormous number of hours put in by volunteers, literally thousands of hours. I shudder to think what the real dollar cost of their skilled, selfless work would be on the open market.

If someone wants to donate money to Retrosheet, how should they contact you to do it?

There are two ways. Either make out a check to Retrosheet and send it to me or use the PayPal link on the front page of our website. I wish to stress that no one should feel obligated to make a donation, no matter how much they use the website.

Have you ever considered offering page sponsorships like Sean Forman has over at baseball-reference?

Never. I have great admiration for Sean and his site, but his operation is completely different from ours. In my opinion the core of our success has been that everyone knows that we are completely non-profit and that no one on the Board, especially me, has ever had any income from any Retrosheet-related activity. This completely open policy has, again in my opinion, led to many people being willing to volunteer their time, since they know they are not being taken advantage of.

In the book The Numbers Game, author Alan Schwarz wrote a section on you and Retrosheet. What did you think of that?

I knew he would mention us since he came to my house and spent a day viewing my basement (the “vault” as we like to joke) and talking with me. I was, however, stunned at the amount of attention he gave us and the glowing way he wrote about the organization. I am deeply flattered and very grateful.

How many hours a week do you spend on Retrosheet?

These days it probably averages about 30 hours a week. It used to be more.

Is the play-by-play data typed in by hand by someone? If not, how is it added in?

The historical play-by-play accounts are all entered by hand using custom software written for that purpose. The current accounts are collected electronically each day from a reliable source. That process is semi-automated and takes me 30 to 45 minutes per day.

What sort of fact-checking procedures do you go through before putting up any information on your website?

The primary procedure is to generate season totals for each player from our event files and then to compare those numbers to the official totals for that season. We are now making the transition to compare the daily totals to the daily official totals, which will improve the quality of our data even more.
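
To make that comparison concrete, here is a minimal sketch in Python of the kind of check described above. It is not Retrosheet's actual software or file format: the function names, the per-game record layout and the sample numbers are all assumptions made up for illustration. It sums per-game stat lines into season totals for each player, then lists every place where the computed total disagrees with the official one.

```python
from collections import defaultdict


def season_totals(game_lines):
    """Sum per-game stat lines into per-player season totals.

    game_lines is a hypothetical list of (player_id, {stat: value}) pairs,
    one pair per player per game.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for player, stats in game_lines:
        for stat, value in stats.items():
            totals[player][stat] += value
    return {player: dict(stats) for player, stats in totals.items()}


def find_discrepancies(computed, official):
    """Return (player, stat, computed_value, official_value) for each mismatch."""
    diffs = []
    for player in sorted(set(computed) | set(official)):
        mine = computed.get(player, {})
        theirs = official.get(player, {})
        for stat in sorted(set(mine) | set(theirs)):
            # A stat missing on either side is treated as zero.
            if mine.get(stat, 0) != theirs.get(stat, 0):
                diffs.append((player, stat, mine.get(stat, 0), theirs.get(stat, 0)))
    return diffs


# Made-up sample data, not real players or real totals.
game_lines = [
    ("player_a", {"AB": 4, "H": 2}),
    ("player_a", {"AB": 3, "H": 1}),
    ("player_b", {"AB": 5, "H": 0}),
]
official = {
    "player_a": {"AB": 7, "H": 3},
    "player_b": {"AB": 5, "H": 1},
}

computed = season_totals(game_lines)
discrepancies = find_discrepancies(computed, official)
for row in discrepancies:
    print(row)
```

Running the same comparison on daily totals rather than season totals would only change what goes into the two dictionaries; a mismatch would then point to a single date instead of a whole season.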

Not that many years ago, Retrosheet went down for a couple days. What happened? Have any steps been taken to assure that it won’t happen again?

That happened when the server was physically located in my basement. It was there for about eight years and this was the only serious problem we ever had. A tree remover dropped a branch on the data line leading to my house. It took about three days to get it replaced. Since January of 2007, our website has been hosted by a professional company with multiple backups and huge bandwidth. Not only does this obviously increase security and reduce the chance of outages, it also lowered our monthly costs from $210 to $10.

Ever thought of creating a coaches register section?

To be sure, and work is currently underway on exactly that. As always, I can’t offer a promise as to the timing, but I think we may post at least some coach information by the end of the year. Stay tuned.

When did you launch the site? How much traffic did you initially get?

It started in 1994. We initially had a couple of dozen unique visitors per day.

How much traffic do you get currently?

There are fluctuations, of course, mostly related to our receiving publicity in The New York Times or on ESPN (or from you!), but the average for the last year is about 2200 unique visitors per day. That has been increasing at about 10% per year for the last three to four years.

When you get feedback from users, what’s the thing(s) they say they like best about the site?

By far the most frequent comment I get is from people who are very happy to have been able to check on the details of a game they recall from their youth. Very often it is the first game they attended and there are strong memories with fathers, grandfathers, uncles, etc. I am touched that people choose to share these personal stories and I am very proud we are able to help.

What’s your personal favorite feature/aspect of this site?

The multiple interconnected links always please me. You can start by looking up a player, then click on the box score of a game he played in, then click on another player in the game, then maybe note a specific umpire and go check out his games officiated. Tom Ruane is responsible for all that functionality and I am very grateful for his hard work in this regard.

Did you ever think you’d have this much success collecting data?

Not even close. When we started, I told people I hoped to get 5% of the games since World War II. We now have about 90% of those games and something like 60% of all games played since 1901. It is success literally beyond my dreams.

There’s about 200,000 games played in MLB history by my reckonin’. How many do you have box scores for?

Probably on the order of 130,000.

How many games do you think we can possibly gain play-by-play data for? Are 19th century seasons, for example, recoverable or lost?

The National Association is in good shape, thanks to the tremendous work of Bob Tiemann. However, games from 1876 to 1900 will always be hard for us. I am sure that some sizable portion will remain beyond our reach in terms of play by play. However, we will ultimately be able to produce accurate box scores for all games and daily logs for players and those are good items to have.

Do you do any other baseball research, or is it just running your website?

I love doing research. There are two types. First, I get requests from teams, writers and ESPN each week, usually relating to an unusual occurrence that recently happened. For the second type of research, I take pride in my annual presentations at the SABR convention. I have made 13 of those now and won the Doug Pappas award for the best presentation in 2001 for my work on the 1951 pennant race.

It’s a great site and you do a fantastic job. Do you ever get sick of people telling you it’s a great site and you do a fantastic job?

That’s good too. No, I love hearing that sort of thing. As I said above, it is very rewarding that the information I am passionate about is useful to so many others.

What should the next Retrosheet update consist of?

We will be adding at least one more league-season (plus 2007) this winter. I can’t be more precise than that because it depends on the pace of the proofing.

Alright, now for the stupid stuff: Growing up, who was the first celebrity you had a crush on?

Wow. How about Joan Baez. (Remember, I’m old!)

Favorite song?

Alice’s Restaurant.

Second-favorite sport?


What are your five favorite movies?

Casablanca, Dr. Strangelove, The Meaning of Life (Monty Python), The Princess Bride, The Hunt for Red October

Favorite Muppet? Why that Muppet?

Animal. He was the favorite of my father and my son loves him too. It’s nice to be able to get away with being completely out of control the way he does.

Favorite color?

Dodger blue.

Favorite baseball writer? Favorite baseball book?

Leonard Koppett. The Thinking Man’s Guide to Baseball

Last good book you read?

Pox Americana by Elizabeth Anne Fenn. It is a great combination of history, politics and microbiology (which is my special area of expertise within biology).

What’s your favorite kind of ethnic food?

Mexican (I’m mostly from Southern California, remember!)

Alcoholic beverage of choice? Non-alcoholic?

Gin and tonic. Cherry coke.

Strange one: about 10 years ago there was a Japanese film called Afterlife. Its plot: when you die, you go to a way-station for three days. While there, you choose one memory from your life. You then spend the rest of eternity with that memory. If that’s what the great beyond is like, what memory would you choose to spend all eternity with?

Reading a history book with my son.

If you had to have something inscribed on your tombstone, what would it be?

Well, I won’t have a tombstone, since I will be cremated, but in the spirit of the question, let’s try: “He didn’t take himself too seriously.”
