A Library of Batted Ball Stats
February 15, 2006
(Almost) every player for the last four years.
I’ve posted a lot of batted ball tables on this blog. Up to now, I’ve hesitated to “dump” them all on the Internet because I don’t believe in “data dumps.” The tables are kind of confusing and need to be interpreted carefully.
However, this is really cool data, and I think the table format works pretty well (thanks to comments from readers of this blog). So, the heck with it. I’ve posted the batted ball tables of every major leaguer who saw a decent chunk of time last year. You can start with the Batted Ball Index, which includes an index of all teams and a player search button.
I’ve also added some tips regarding how to read the tables. So please look them over and leave any comments or questions you might have about the stats on this blog. And spread the word—I hope that I didn’t do all this work in vain!
You have done what has needed doing for the longest time.
By breaking things down into their components, you can:
1 - truly describe quantitatively what we see and appreciate as fans
2 - be able to better determine how much a player can influence the play
A marriage of scouting and performance analysis. Sabermetrics at its finest.
Posted by tangotiger on 02/15 at 10:48 PM
First off, this is really great work. Thanks for posting it all.
Second, I just looked at the M’s page and have two questions/comments:
1. There are two listings for “Raul Ibanez”. I assume the top one is for Adrian Beltre (whose breakout 2004 came while hitting only 5% of his OF for HR? wow) and the bottom one is the Raul Ibanez the LF/DH.
2. Are any of these numbers park adjusted?
Posted by Trev on 02/15 at 11:22 PM
Tango, thanks. That means a lot to me.
Thanks for catching that mistake, Trev. I have a feeling there are a few other mistakes, too. I’ve corrected the Seattle page, and if anyone catches other mistakes please let me know.
The numbers are “kind of” park adjusted, as described on the Intro page (near the bottom on the right). The net runs per ball and home run/OF are park-adjusted, but not other stats (such as strikeout rates).
I have no good excuse for why I did it this way. I wanted to see what park adjustments on the component level would do to individual players and only got halfway there. Sorry if that feels like a “half ass” job.
Posted by on 02/16 at 08:37 AM
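For readers wondering what a component-level park adjustment might look like, here is a minimal sketch. It is not necessarily the method used for these tables (that one is described on the Intro page). It assumes a single multiplicative park factor per event, with 1.0 as neutral, and assumes half of a player's games are at home. The function name, the parameters, and the sample numbers are all hypothetical.

```python
def park_adjust(raw_rate, park_factor, home_share=0.5):
    """Scale a raw rate to a neutral park (illustrative assumption only).

    park_factor: 1.0 is neutral; above 1.0 means the home park inflates the event.
    home_share: fraction of the player's playing time in the home park.
    """
    # Blend the home park factor with a neutral 1.0 for road games.
    blended = home_share * park_factor + (1.0 - home_share) * 1.0
    return raw_rate / blended


# Hypothetical hitter: 12% HR per outfield fly in a park that inflates HR by 10%.
print(round(park_adjust(0.12, 1.10), 4))
```

The same function could be pointed at strikeout or walk rates, given a park factor for each, which is the half of the job the reply above says is still undone.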
Awesome! Studes, this is great stuff. Nice work putting this all together.
Posted by Tybor on 02/16 at 09:36 AM
At the GB/FB level, it becomes important to distinguish the run value of a single by batted ball type. Just off the top of my head, I’d say that the run value of a GB single is .43 runs, while it’s .50 for a FB single. I’m sure the double values would differ in a similar way.
K and BB should be adjusted by park. I’d guess that they are probably the most influenced of all the events, after the HR.
Posted by tangotiger on 02/16 at 01:00 PM
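The GB/FB point in the comment above can be made concrete with a little arithmetic. This sketch uses the two off-the-cuff run values quoted there (.43 and .50), which are rough guesses and not established weights. The hit counts are made up for illustration.

```python
# Run values quoted off the top of the commenter's head; treat them as rough.
GB_SINGLE_RUNS = 0.43
FB_SINGLE_RUNS = 0.50


def single_runs(gb_singles, fb_singles):
    """Total run value of a hitter's singles, split by batted ball type."""
    return gb_singles * GB_SINGLE_RUNS + fb_singles * FB_SINGLE_RUNS


# Two hypothetical hitters with 100 singles each but opposite GB/FB mixes.
hitter_a = single_runs(80, 20)  # mostly ground-ball singles
hitter_b = single_runs(20, 80)  # mostly fly-ball singles
print(round(hitter_a, 1), round(hitter_b, 1), round(hitter_b - hitter_a, 1))
```

Under these assumed values, the same number of singles is worth a few runs more or less depending on the mix, which is the distinction a single pooled weight would hide.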
I love data dumps. Would you be willing to post all 2002-2005 lines as a single Excel file, including players no longer playing?
Posted by Chuck on 02/16 at 01:16 PM
Tango, very interesting point about the relative value of a GB vs. FB hit. Are you aware of anyone who has established linear weights by batted ball type?
I agree that K rate should be adjusted by ballpark, but I’m less sure of the BB rate. My own research (published in the THT Annual) indicates that batted ball rates may be more persistent than BB rates (making them the higher priority). Either way, we can all agree I’ve only done half the work to fully park-adjust the data.
Chuck, I need to think about your request. BIS goes to a lot of expense to gather the data and make it available to their customers, and I don’t think they would appreciate my making their data essentially “free.”
Posted by on 02/16 at 01:35 PM
MGL has published them in the past at BTF. I have his list at home. I’ll try to find it for you.
You could also separate the FB into “long FB”, “med”, “short”, and again assign different run values for each kind.
Of course, you could then do it by actual x,y coordinate, and simply say that a ball that lands at 0.8 radians and 317 feet, taking 2.43 seconds to get there from home plate, is worth +.72 runs.
I’m not suggesting that you do this, but you are on the correct path to that ultimate goal, which is really:
runs = FZR + PZR + BZR + PARK + luck
Posted by tangotiger on 02/16 at 03:02 PM
Agree with all that’s been said above, Studes. I’ve been able to get very little done at work the past few days as a result. Thanks much.
Posted by Kent Bonham on 02/17 at 08:50 AM
Excellent work, the most entertaining stats I’ve seen in a very long while. One question: have you figured the year-to-year correlation of any of these yet?
Posted by dustin on 02/18 at 08:34 PM
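For anyone who wants to try the year-to-year correlation asked about above, one common approach is a Pearson correlation between the same players' rates in consecutive seasons. The sketch below is only an illustration: the ground-ball rates are invented, not taken from the tables, and a real study would also want a playing-time minimum.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented ground-ball rates for the same five players in back-to-back seasons.
year1 = [0.42, 0.48, 0.39, 0.51, 0.45]
year2 = [0.44, 0.47, 0.40, 0.49, 0.46]
print(round(pearson(year1, year2), 3))
```

A value near 1 would suggest the stat is persistent from one season to the next, while a value near 0 would suggest it is mostly noise.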