My new post summing up the different factors that influence how umpires call pitches is up at Beyond the Box Score. It’s a continuation and expansion of some of my more recent posts here, so check it out if those interested you.
Continuing my look at how different variables affect how umpires call pitches, today let’s talk about what happens in each inning.
We’ll start with a table:
|Inning||Runs / 150 Pitches|
Remember that positive numbers are good for the pitchers (fewer runs), while negatives indicate more scoring. The innings that jump out are the first, the sixth and extras. I have no idea what to attribute the sixth inning to (perhaps starters are tiring and getting wilder in general, which makes umpires less lenient). I also don’t really know why umpires help the starters so much in the first, but I’m guessing it has something to do with an unconscious desire to give the pitcher the benefit of the doubt early, or to start the game off fast.
But I think I understand why the extras are so favorable to the batters (keeping in mind that the sample size for all the extra innings combined is about 1/9 of that for any other inning). Without someone scoring, the game can’t end. If the game doesn’t end, I can’t go home. I’m sure there’s no conscious reason why the umpires would behave this way, but I’m not sure I’d blame them if there were. After 3.5 hours of calling pitches, I’d probably want to do everything in my power just to be allowed to sit down.
I’m still working on a longer article that pulls all this information together. Unfortunately some of the analysis is taking longer than I was hoping. Look for it early next week, though.
Another in the series of short posts breaking down how umpires call close pitches for pitchers. Again, I’ll be doing a much more thorough job with this in a longer article this week over at Beyond the Box Score.
This time I looked at whether better pitchers got more close calls than bad pitchers. I broke all pitchers who pitched in 2006 and 2007 into three groups – those who had career runs averages (runs allowed per 9 innings) of less than 4.00 up through 2006, those with an RA between 4.00 and 6.00 and those with an RA over 6.00.
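As a rough illustration of the grouping (a sketch, not the code used for this study), here’s how the three buckets might be computed in Python. The function names, the group labels and the sample totals are all made up for the example; the only things taken from the post are the definition of RA and the 4.00 and 6.00 cutoffs.

```python
def runs_average(runs_allowed, innings_pitched):
    """Runs allowed per nine innings.

    Assumes innings_pitched is a true decimal (e.g. 6 1/3 innings = 6.333),
    not the box-score shorthand where 6.1 means 6 1/3.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9.0 * runs_allowed / innings_pitched


def ra_group(ra):
    """Bucket a career RA using the cutoffs described in the post."""
    if ra < 4.00:
        return "good"      # less than 4.00
    if ra <= 6.00:
        return "middle"    # between 4.00 and 6.00
    return "bad"           # over 6.00


# Hypothetical career totals (runs allowed, innings pitched) through 2006.
sample = {
    "pitcher_a": (80, 200.0),
    "pitcher_b": (100, 180.0),
    "pitcher_c": (70, 90.0),
}

groups = {name: ra_group(runs_average(r, ip)) for name, (r, ip) in sample.items()}
print(groups)
```

The important detail is that the RA is computed only from seasons up through 2006, so the grouping reflects what umpires could have known about a pitcher going into 2007.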
My hypothesis was that the better pitchers got more calls in 2007, as umpires looked more favorably on better pitchers. Surprisingly, this doesn’t appear to be the case. Good pitchers actually received .01 runs per game less than average from umpires’ calls. Bad pitchers were hurt by close calls to the tune of .25 runs per game. The middle group came out at .05 runs per game above average as a whole.
That’s a result I wasn’t expecting.
Probably the most common response to my look at catcher framing was that the pitching staff had to have something to do with the results. I totally agree, but I was curious about how the staff would affect the number.
I thought of three possible “biases” that could change how a pitcher is judged by an umpire: age, reputation (by which I mean success) and early-game wildness. I’m sure there are probably others, and I’m willing to take requests for study.
Anyway, these will probably be combined into a longer post for Beyond the Box Score later, but I thought I’d post the results here as I got them.
The first test I looked at was by age. I broke down all the pitchers from 2007 into three categories – under 25, between 26 and 35, and older than 35. The age breakdown was somewhat arbitrary, but I wanted to get a young group and an old group, with the hypothesis that umpires were kinder to the older pitchers.
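For the curious, the age split could be sketched like this in Python. This is an illustration under a few assumptions, not the code behind the numbers: age is taken as the season minus the birth year, pitchers with no known birth year are left out, and a pitcher who is exactly 25 (which the breakdown above doesn’t explicitly assign) is placed in the middle group.

```python
SEASON = 2007


def age_group(birth_year, season=SEASON):
    """Return 'young', 'middle' or 'old', or None if the birth year is unknown."""
    if birth_year is None:
        return None  # no birth year available, so the pitcher is excluded
    age = season - birth_year  # assumption: simple season-minus-birth-year age
    if age < 25:
        return "young"
    if age <= 35:
        return "middle"  # assumption: exactly 25 lands here
    return "old"


# Hypothetical birth years, purely for illustration.
for birth_year in (1985, 1982, 1972, 1970, None):
    print(birth_year, age_group(birth_year))
```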
Turns out that young pitchers saw missed calls cost them .37 runs per game, while older pitchers benefited by .23 runs per game. The middle age bracket gained .14 runs per game.
There are a couple of potential flaws in this study that I’ll point out, but I don’t think they’re too serious. First, the release of the Lahman database doesn’t include the Retrosheet IDs for rookies from 2007, so I didn’t have a birth year for them – although my presumption is that most will fall in the bottom group. I eliminated them from the study – which will lower both the number of opportunities and the number of missed calls from the young pitchers. If anything, this dampens the measured effect, and younger pitchers were really hurt a lot more.
Second, there’s definitely a selection bias in play. Older pitchers are those who have pitched well enough to stay around, while weaker young pitchers may be drummed out of the league before they turn 25. I’m not sure how to correct for that yet, but I might have a better idea when I try to measure how big a role reputation plays.
Long story short, younger pitchers appear to lose at least .6 runs per game to older pitchers based on umpires’ calls. This may not be solely related to age, but to a combination of factors that are correlated with age, which I’ll try to examine over the next few days.
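The .6 figure is just the distance between the youngest and oldest groups. A quick check in Python, using the numbers reported above and a sign convention (negative means runs lost by the pitcher) chosen for this example:

```python
# Runs per game from missed calls, by age group, as reported in this post.
# Negative = the calls cost the pitcher runs; positive = the pitcher benefited.
runs_per_game = {"young": -0.37, "middle": 0.14, "old": 0.23}

gap_old_vs_young = runs_per_game["old"] - runs_per_game["young"]
print(round(gap_old_vs_young, 2))
```

It’s “at least” that much because dropping the 2007 rookies likely understates how much the young group was hurt.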
I’ve decided to start work on a pitch classifier similar to what Josh Kalk describes over at from small ball to the long ball. I know he’s already built a good one, and can generate some really cool graphs from it, but he hasn’t released the algorithm, and it’s a good opportunity for me to use my Programming Collective Intelligence book. I know he’s got some advantages over me… rumor has it that he’s a theoretical physicist (and so is John Walsh)… but it should be a fun attempt nonetheless.
The goal is to use the PITCHf/x data to automatically classify each pitch as a fastball, curveball, whatever. Besides the obvious usefulness of this data for evaluating pitchers, I think it can be used to enhance some of my work with catchers.
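To give a feel for what “classify each pitch” could mean in practice, here is about the simplest thing that might work: a nearest-centroid classifier in Python. This is only a sketch, not Josh Kalk’s algorithm and not a finished design. The three features (speed plus horizontal and vertical movement) are an assumption about what’s useful, and every number in the labeled sample is invented for illustration rather than taken from real pitches.

```python
import math
from collections import defaultdict

# Invented, hand-labeled examples: (speed_mph, horizontal_move_in, vertical_move_in).
LABELED = [
    ((93.0, -5.0, 10.0), "fastball"),
    ((91.5, -6.0, 9.0), "fastball"),
    ((77.0, 5.0, -7.0), "curveball"),
    ((75.5, 6.0, -8.0), "curveball"),
    ((84.0, 2.0, 1.0), "slider"),
    ((85.0, 3.0, 0.0), "slider"),
]


def centroids(labeled):
    """Average the feature vectors for each pitch type."""
    by_label = defaultdict(list)
    for features, label in labeled:
        by_label[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def classify(pitch, cents):
    """Return the label whose centroid is closest (Euclidean distance)."""
    return min(cents, key=lambda label: math.dist(pitch, cents[label]))


cents = centroids(LABELED)
print(classify((92.0, -5.5, 9.5), cents))
print(classify((76.0, 5.0, -7.0), cents))
```

A real version would need labeled pitches to learn from (or a clustering step to find the groups without labels), some scaling so that speed doesn’t swamp movement, and probably separate models per pitcher, since one pitcher’s slider can look like another’s curveball.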
Anyway, this could be a fairly long process (and there may be some other interesting things coming along with it). I figure I’ll post progress updates here. After all, what is a blog but a place for me to blather on about things no one is interested in?