Rider, slurve and… Titanic
by Max Marchi
April 9, 2010

It was said that I threw, basically, five pitches—fastball, slider, curve, change-up, and knockdown. I don’t believe that assessment did me justice, though. I actually used about nine pitches—two different fastballs, two sliders, a curve, change-up, knockdown, brushback and hit batsman.
Bob Gibson, Stranger to the Game.

Since PITCHf/x day one, I have been thinking about how to accurately label each pitch thrown. MLBAM’s Ross Paul came up with a solution in 2008 that does the job (fairly well) in real time, but when people write articles about a single pitcher (or a small number of them), they usually perform their own ad hoc classification. Currently Ross is working on improving the accuracy of his classifying method and we should expect the entire PITCHf/x database to provide labels more similar to the ones we produce on a pitcher-by-pitcher basis (I suppose). Thus, we just have to wait. End of the article.

Well, not really.

I have a sinking fastball to either side of the plate, a cutter (which changes the direction of my fastball so it breaks instead of sinking) to either side of the plate, a curveball I throw at three speeds and three angles, a straight change—using the same arm speed and position as a fastball but with grip and release that slows it dramatically, and change-ups to different locations that I throw off my sinker and which look like batting practice fastballs.
Orel Hershiser, Out of the Blue.

Since PITCHf/x day one, I have been thinking about the buckets into which pitches must fall. Currently we have:

Fastball
Four-seam fastball
Two-seam fastball
Sinker
Change-up
Curve
Slider
Cutter
Splitter
Knuckle

Are these categories all we need? Do we need all of them?

A.J. Burnett throws a hard curve, clocked at above 80 mph; Chris Carpenter’s number two is a 12-6 deuce, much slower (75 mph), with a great arc (-8.9 inches of vertical movement).
I’m not sure I want to classify them as the same pitch. I’m fine with classifying them both as curveballs when writing a top 10 list, like “here are the 10 best curveballs in MLB ranked by Run Value.” However, I’m not so comfortable doing so when evaluating batters’ abilities against certain types of pitches. For example, when facing right-handed pitchers, Mark Reynolds seems to have success against hard curves (+0.87 runs per 100 pitches in 2009), but suffers against uncle-charlies (-0.26). On the other hand, Pablo Sandoval holds his own facing slow curves (+0.24), while being helpless against tight ones (-0.76). I don’t care whether Burnett and Carpenter call their pitch simply curveball: Hitters are definitely seeing two different animals and reacting to them in different ways with different degrees of success.

So how many buckets? This is a question I tried to answer before plugging data into the statistical software: you always have to know what you’re doing before pushing that button, and you should avoid the Texas Sharpshooter’s fallacy. (For those not willing to follow the wiki link, he is the guy who used to shoot, then paint the target around the hole produced by his bullet.)

Fastballs. Four-seamers and two-seamers are terms that suggest how a pitcher grips the ball, but batters are more focused on how the ball behaves after it has left the hurler’s fingers. And so was the vernacular of the past, as Rob Neyer and Bill James illustrate in their Guide to Pitchers. I expected to find sinking fastballs (the two-seamers), rising fastballs (I know they don’t actually rise; they’re those with a high positive vertical movement), riding fastballs (significant horizontal movement to the throwing side) and cutters (moving to the glove side of the pitcher). Thus, four different pitches (maybe one more if you think the high-90s heaters need to be classified in a league of their own).

Curveballs. My guess was two, as shown in the example.

Changes of pace.
Let’s not consider Jamie Moyer for now, otherwise we should say three, four … 10? Probably two again. One harder, with horizontal movement similar to the fastball; the other one more of a pure slow pitch. Then there are splitters and forkballs. Thus, there might be as many as four different change-of-pace pitches, as long as splitters and forkballs, besides having their own names and their own ways of being delivered, behave in a distinctive way on their flight toward the plate.

Sliders. The slider made me stop for a while, to consider the big picture: Maybe we can classify all pitches in 10-15 buckets; but chances are that, when we perform cluster analysis on all the pitchers’ pitches, they form a continuum that’s hard to separate into a few categories. Slurves anyone? Slutters?

Odds and ends. We are left with knuckleballs, pitches coming from extreme angles (think Chad Bradford), and other strange beasts (bloopers, palmballs, shuutos, gyros).

So when I launched my favorite statistical software, I was expecting to get all the pitches classified in 10 to 15 groups.

A shortcut. There’s no way to perform a cluster analysis on the millions of pitches in the PITCHf/x database. I took a shortcut that has many limitations, but I believe it can be used for a first take on the subject. Using MLBAM classification, I took for each pitcher his average fastball, change-up, and so on, then performed the cluster analysis on these average pitches. Following are results for 2009 right-handed pitchers.

Results. Fourteen buckets came out of the* cluster analysis (using speed, horizontal and vertical movement, and release point as the classifying variables). So far we are quite in line with our initial hypotheses.

* In studies like this there’s not one classification, so it should read “the cluster analysis I’ve chosen to show.” Later I will hint at results from other takes on the issue.
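The shortcut (average each pitcher's pitches by MLBAM label, then cluster the averages) can be illustrated with a small Python sketch. It is not the procedure actually used for this article: the clustering method is not named in the text, so plain k-means stands in for it here (k-means gives hard assignments only, not membership probabilities), the release point is left out, and the function names and the toy rows are invented for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def average_pitches(pitches):
    """Collapse raw pitches into one average row per (pitcher, MLBAM label)."""
    groups = defaultdict(list)
    for pitcher, label, speed, h_mov, v_mov in pitches:
        groups[(pitcher, label)].append((speed, h_mov, v_mov))
    return {key: tuple(mean(col) for col in zip(*rows))
            for key, rows in groups.items()}


def standardize(rows):
    """Rescale each column to zero mean and unit variance (mph vs inches)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against a constant column
    return [tuple((x - m) / s for x, m, s in zip(row, mus, sds))
            for row in rows]


def kmeans(rows, k, iterations=50):
    """Plain k-means (an assumed stand-in); the first k rows seed the centers."""
    centers = [rows[i] for i in range(k)]
    assignment = [0] * len(rows)
    for _ in range(iterations):
        # Assign every row to its nearest center (squared Euclidean distance).
        assignment = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(row, centers[j])))
            for row in rows
        ]
        # Move each center to the mean of its members; keep it if it has none.
        new_centers = []
        for j in range(k):
            members = [row for row, a in zip(rows, assignment) if a == j]
            new_centers.append(
                tuple(mean(c) for c in zip(*members)) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return assignment


# Invented toy rows: (pitcher, MLBAM label, mph, horizontal in., vertical in.).
# These are NOT real PITCHf/x measurements.
pitches = [
    ("A", "Curve", 81.0, 3.0, -4.0), ("A", "Curve", 82.0, 3.4, -3.6),
    ("B", "Curve", 75.0, 6.2, -6.1), ("B", "Curve", 74.6, 5.8, -5.9),
    ("C", "Curve", 80.5, 2.9, -3.9), ("C", "Curve", 81.5, 3.1, -3.7),
    ("D", "Curve", 75.5, 6.0, -6.0), ("D", "Curve", 75.1, 6.1, -5.8),
]
averages = average_pitches(pitches)
keys = sorted(averages)
clusters = kmeans(standardize([averages[key] for key in keys]), k=2)
for key, cluster in zip(keys, clusters):
    print(key, cluster)
```

On this toy input the two harder curves land in one group and the two slower ones in the other. Standardizing first is a design choice of the sketch: without it, speed in mph would outweigh movement in inches when distances are computed.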
The following table shows the translation of pitcher/pitch combinations from the MLBAM classification to the one produced by the analysis I performed. I included only those combinations whose group, according to the clustering algorithm, is nearly certain (probability > 95 percent).

                    new classification
MLBAM                   1    2    3    4    5    6    7    8    9   10   11   12   13   14
Change-up              59    1    6    0    0    0    1    1  148    2    7    3    0    0
Curve                   0  113   17    0    0    8    1    0    0  140    9    0    4    0
Cutter                  0    0    1    1    0    0   57    0    0    0    0    5    0    0
Fastball                0    0    0    1    4    0   27   14    5    0    5    7    0    9
Four-seam fastball      0    0    0   10    6    0    6   36    0    0    7   14    0   82
Knuckle                 1    0    0    0    0    4    0    0    0    0    0    0    0    0
Sinker                  0    1    0    0   16    0    2   29    0    0    0    0    0    0
Slider                  0    6  262    0    0    3    8    0    1    1    1    0   41    0
Splitter                0    1    0    0    0    0    6    0    3    0    0    1    0    0
Two-seam fastball       0    0    1    1   23    0    4  152    0    0    0    0    0    1

To make sense of the new classification, some familiar labels have to replace the numbers from 1 to 14. To do that we need another table, reporting the average characteristics of each of the 14 pitches.

class.  speed (mph)  h.mov. (in.)  v.mov. (in.)
1              79.5          -5.3           6.9
2              81.2           3.1          -3.8
3              84.5           1.7           2.3
4              96.1          -7.8           6.6
5              90.8          -9.2           1.2
6              70.5           3.3           2.8
7              89.8           0.1           6.5
8              89.8          -9.4           7.0
9              84.3          -7.3           4.2
10             75.1           6.0          -6.0
11             84.2          -6.2          -3.8
12             86.6          -3.9          10.3
13             78.5           5.3           1.3
14             92.8          -4.1          10.5

They put a radar gun on the kid’s fastball a few minutes ago. […] Ninety-three point four miles per hour. That’s how they tell you speed now. They don’t try to show it to you: ‘smoke,’ ‘hummer,’ ‘the high hard one.’ I miss the old clichés. They had life. Who wants to hit a fastball with a decimal point when he can tie into somebody’s ‘heat’?
William Least Heat Moon, Blue Highways.

The curveballs we outlined in our initial example are pitch No. 2 (the tight one) and No. 10 (the slow one).

No. 4 is clearly a fastball, one that very few pitchers can throw, stopping the radar gun in the high 90s. In this bucket fall the well-known heaters of Joel Zumaya and Kyle Farnsworth.

There are several other fastballs in the above tables. No. 8, which makes up most of MLBAM’s two-seamers and two-thirds of the sinkers, actually shows, on average, a vertical movement similar to that of the heaters (No. 4); what differentiates it from nearly all the other pitches is the great horizontal movement (on the throwing arm side). Some of the pitchers throwing this one are Joba Chamberlain, Joe Nathan, John Lackey and the Cardinals’ one-two starters.

No. 5 is the other pitch with a lot of horizontal movement. Again, the speed is good for a fastball and MLBAM sees it as either a two-seamer or a sinker, and the very low vertical movement value confirms the sinking action. Here are the sinkers of Brandon Webb, Fausto Carmona and Roy Halladay.

No. 12 and No. 14 share the highest vertical movement, though they come at very different speeds. MLBAM sees both of them mainly as four-seamers. The former pitch has Paul Byrd, Livan Hernandez and Trevor Hoffman among its adepts; the latter Joakim Soria, Grant Balfour and Brad Penny.

Finally, No. 7 leaves the pitchers’ arms at around 90 mph, and has no lateral movement; this means that the right-handed batter sees the pitch as tailing toward the outside corner, like a slider or a cutter. The velocity and a look at MLBAM’s classification indicate we are dealing with the latter. If you need another clue, yes, Mariano Rivera is in there.

They call that a cut fastball now, but it’s what we used to call a sailer.
Charlie Metro, quoted in The Neyer/James Guide to Pitchers.

Sliders (as defined by MLBAM’s algorithm) go into two buckets. No. 3 shows lateral movement similar to the cutter (No. 7) but lower vertical break; No. 13 breaks more like curves (No. 2 and No. 10) in the horizontal plane, and it’s also clocked at less than 80 mph. We might consider this last pitch as a slurve: the sliders of Jason Schmidt, Bronson Arroyo and Francisco Rodriguez are to be found here.

Changes of pace are split between groups No. 1 and 9.
The former runs into right-handed batters like some of the fastballs (No. 7 and No. 8) and is very slow; the latter travels in the mid-80s and exhibits more lateral movement. I wonder whether the pitches also differ in the way they are thrown (circle vs straight change?).

Speaking of slower pitches, there isn’t a cluster identifying the splitters: they end up classified as No. 7, together with the cutters, or as No. 9, the “hard” change-ups.

We are left with a couple of groups. No. 11 collects a mix of change-ups, curves and fastballs. What stands out in the average measures of the pitch is the negative vertical movement. Once we look at the players delivering this kind of pitch (Chad Bradford, Brad Ziegler, Cla Meredith, Peter Moylan, …), well, you get the point: They are mainly low-arm-angle hurlers.

Finally, No. 6 mixes various really slow pitches. The knuckleballs are there, along with a few leftovers from the submarine/sidearm group and some roundhouse curves (Bronson Arroyo’s and Koji Uehara’s).

In case you are wondering whose knuckleball ends up in group No. 1, among the slow change-ups, well, that’s how MLBAM classified some of the pitches thrown by the Red Sox’ Dusty Brown.

Let’s name them!

Now the LORD God had formed out of the ground all the beasts of the field and all the birds of the air. He brought them to the man to see what he would name them; and whatever the man called each living creature, that was its name. So the man gave names to all the livestock, the birds of the air and all the beasts of the field.
Gen 2:19-20

Okay, before laying out a few caveats, future planning and concluding observations, let’s give a name to those numbers. I’m going to suggest one or more for each pitch type, and I expect you to either approve one of them or suggest something else in the comments section.

No. 1 – Slow change or, as they used to say in the past, simply slow ball.
No. 2 – Hard curve, tight curve.
No. 3 – Slider.
No. 4 – Heater (hummer, blazer…).
No. 5 – Sinker.
No. 6 – Floater, junk, feather.
No. 7 – Cutter, sailer.
No. 8 – This one tails to the throwing arm side. I would suggest tailing fastball, but according to Neyer and James, they used to call a pitch from a righty that runs into a right-handed batter a riding fastball.
No. 9 – I really don’t like the terms hard change and slow change, so I expect good suggestions from you for this and No. 1.
No. 10 – Slow curve, drop curve.
No. 11 – Low-arm-angle pitches. What do we call them as a group? Sidearmers? Submariners?
No. 12 – Okay, this is a fastball that’s not quite fast (high 80s), but stays up. I go with rising fastball.
No. 13 – Slurve.
No. 14 – Similar to No. 12, but 4-5 mph faster. Hopper comes to mind.

Alternate classifications. The classification I chose to present is the one I found easier to interpret and more in line with the initial hypothesis (thus yes, I did some Texas sharpshooting in the end!). However, alternate clusterings (obtained with different parameter settings and choices of explanatory variables) produced similar results. In particular, removing the release point information (which many have shown to be inconsistent/unreliable) doesn’t prevent the clustering algorithm from detecting the submariners/sidearmers. That, together with removing the pitcher/pitch combinations with a sample size of less than 30, produced a classification in 19 groups. The differences from the 14 groups outlined above were:

three change-ups instead of two: two different hard change-ups are identified, one that stays up (+6 in. of vertical movement) and one that rides into right-handed batters (-8 in. of horizontal movement);
a third group for the reeeeally slow curves, leaving the pitchers’ arms in the low 70s;
the low-arm-angle pitches split into two groups, with velocity being their main separator (87 mph vs 78 mph);
one more fastball, something between the hopper (slower), the rising fastball (less “rise”) and the riding one (faster but with a smaller tail). A straight fastball?;
one group mixing fast change-ups and slow fastballs (this is the hardest to digest, but we should have anticipated such a beast: think about the change-ups that look like batting practice fastballs in Hershiser’s quote).

The rest of the classification matches the 14 groups previously described.

Some examples of repertoires. Doc Halladay’s repertoire, according to this extended labeling of pitches, consists of a riding fastball, a sinker, a cutter, a hard change-up and a slurve. Josh Beckett gets tagged with heater, sinker, cutter, a slow curve and a fast change-up. Justin Verlander’s classification is quite easy and not much different from MLBAM’s: high heat, slider, tight curve, riding change-up. His two-seamers get labeled as riding fastballs. The pitchers who really put this classification to a severe test are those who continuously vary their release angles. Arroyo’s fastballs either rise or run into batters (the rider into righties, the cutter into lefties); his slider gets tagged as a slurve, his curve ends up among the floaters and his change-up is considered a slow one. Jeff Weaver comes out with a slider, a sinker, a cutter, a rider, a hard change and a slow curve.

To-do list. Classifying pitcher/pitch combinations using average values and starting with MLBAM’s labeling exposes the results to many shortcomings. A better, but way longer, approach would be to first perform a cluster analysis on each pitcher (going game by game would be even better), then perform the “meta-classification” on what comes out of that.
Some research is needed to assess whether the pain of going through all the work has any value.

Besides checking whether some hitters perform well on the slow curves but poorly on the tight ones, I believe it would be interesting to check if some combinations make pitchers better: Is riding fastball/hard change a better one-two sequence than heater/slow change? Which one is the better addition for a pitcher who already possesses a slider: a tight curve or a roundhouse one?

Furthermore, I would like to look at injury data in the near future. Are the pitchers with a hopping fastball more likely to make trips to the DL than those gifted with a tailing motion in their number one? (Completely made-up example.)

Opera naturale è ch’uom favella; ma così o così, natura lascia poi fare a voi secondo che v’abbella.
Dante, Par. XXVI, 130-132
Translation: “A natural action is it that man speaks; But whether thus or thus, doth nature leave To your own art, as seemeth best to you.”

Meanwhile, let’s have some fun with this lengthy article. Let’s have some major leaguers make the calls on the pitches’ names.

Tug McGraw: No. 8 Bo Derek fastball (“nice little tail”); No. 7 Cutty Sark (“it sails”); No. 5 Titanic (“it sinks”).
Bill Lee: No. 2 Toilet seat (“They [Bert Blyleven, Nolan Ryan and Camilo Pascual] threw curveballs that were called ‘toilet seats.’ A lot of hitters would buckle both knees when they first saw it. It gave the appearance that they were on the throne taking a cr**”); No. 6 Pus.
Satchel Paige: No. 8 Midnight rider; No. 1 Nothin’ ball; No. 14 Jump ball; No. 6 Bat dodger.
Ricky “Wild Thing” Vaughn: No. 4 Terminator.

Now it’s your turn, in the comments section below.

References & Resources

Data. PITCHf/x data from MLBAM.

Books
Bill James, Rob Neyer – The Neyer/James Guide to Pitchers: An Historical Compendium of Pitching, Pitchers, and Pitches.
Bill Lee, Jim Prime – Baseball Eccentrics: The Most Entertaining, Outrageous, and Unforgettable Characters in the Game.
Orel Hershiser, Jerry B. Jenkins – Out of the Blue: Orel Hershiser.
Bob Gibson, Lonnie Wheeler – Stranger to the Game: The Autobiography of Bob Gibson.
William Least Heat Moon – Blue Highways: A Journey into America.