The Baseball Graphs Blog

Tuesday, February 28, 2006

The Fielding Bible

I received a copy of John Dewan’s Fielding Bible last week.  I’d been anticipating this book for quite a while.  I first heard about it many months ago and subsequently helped edit the player comments.  Which is my way of telling you I may not be completely unbiased here.

Nevertheless, I found the “Bible” to be a worthy addition to the ongoing development of fielding statistics, and that actually surprised me a bit.  Fielding stats based on play-by-play data have received a lot of attention over the past several years, led by MGL’s Ultimate Zone Rating and David Pinto’s Probabilistic Model of Range.  Other very knowledgeable folks, like Chris Dial, Mike Emeigh and David Gassko (among many others), have also added depth and insight to the discussion.

As far as I know, John Dewan and Bill James (who virtually co-authored the book) have not been part of these discussions, and it shows.  The book doesn’t really add a lot of new thinking to topics already covered by people like MGL.  However, play-by-play fielding stats are still in their infancy, so an independent approach that verifies the work of others is welcome.  Plus, I think the “Bible” includes enough new ideas to engage even the most educated fielding maven.

Most importantly, the book presents its stats and findings in an easy-to-understand manner that helps the reader see what’s behind many of these stats.  And that’s something that, frankly, few statistical writers have done so far.

There are many sections of the book to enjoy:

- Bill James’s comparison between Adam Everett and Derek Jeter.  In this opening essay, James concludes that Jeter is just not a good fielding shortstop at all, and Everett is a great one.  Most importantly, he reviews his thought process, approaching the issue from a number of angles.  He doesn’t endorse any single fielding stat as the best possible solution to the question.  This is James’s writing and thinking at its best, helping the reader understand the original question and how best to answer it.

- One-Year and Three-Year Registers of fielding stats.  There are a variety of fielding stats, though the central one is Dewan’s plus/minus system (see below).

- My favorite section, a graphical display of where hits tended to land against every major league team, with two pages for each team.  This is a brand-new way of looking at the data and I enjoyed it tremendously.

- A player ranking and comments section.  Again, John Dewan doesn’t rely upon any single fielding stat for his rankings or comments.  He incorporates many stats into his assessment, and even relies on scouts when the stats disagree.

Special sections on:

- The best and worst at turning double plays
- The best and worst at fielding bunts
- Relative Range Factor, a new stat by Bill James that is basically an update of the range methodology he used in Win Shares.
- Revised Zone Ratings

This last section is interesting because it reveals that Dewan (who invented Zone Rating while at STATS) and James have disagreed about something in Zone Rating for the last 20 years or so.  Dewan was the one who decided to include balls fielded out of a player’s zone in his Zone Rating numerator and denominator; James thought this was a bad idea.  This same topic has been discussed repeatedly at Baseball Think Factory.

The good news is that John changed his method for the “Bible” and now lists a Zone Rating that includes only balls in the zone, along with a separate listing of balls fielded out of zone.  In other words, you can look at the two side-by-side and make your own determination as to what they mean.
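
To make the difference between the two Zone Rating approaches concrete, here is a rough Python sketch.  The function name, the input layout and the sample numbers are all made up for illustration; this is not the book’s actual code or data, just the arithmetic as described above.

```python
def zone_ratings(balls):
    """balls: list of (in_zone, fielded) booleans, one per batted ball.

    Hypothetical layout, chosen only for this illustration.
    """
    in_zone_chances = sum(1 for in_zone, _ in balls if in_zone)
    in_zone_plays = sum(1 for in_zone, fielded in balls if in_zone and fielded)
    out_of_zone_plays = sum(
        1 for in_zone, fielded in balls if not in_zone and fielded
    )

    # Older approach: out-of-zone plays are added to both the
    # numerator and the denominator.
    denominator = in_zone_chances + out_of_zone_plays
    combined = (
        (in_zone_plays + out_of_zone_plays) / denominator if denominator else None
    )

    # Revised approach: in-zone balls only, with out-of-zone plays
    # reported as a separate count.
    revised = in_zone_plays / in_zone_chances if in_zone_chances else None
    return combined, revised, out_of_zone_plays


# Invented sample: 10 balls in the zone (8 fielded), 5 outside it (2 fielded).
sample = (
    [(True, True)] * 8
    + [(True, False)] * 2
    + [(False, True)] * 2
    + [(False, False)] * 3
)
combined, revised, out_of_zone = zone_ratings(sample)
```

With this invented sample, the combined rating comes out higher than the in-zone rating because the two out-of-zone plays count as automatic successes.  Listing them separately keeps the two kinds of information apart.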

The key stat is the “plus/minus” system, which is basically a simple version of UZR.  For shortstops, for instance, they calculated how often a groundball hit in a certain vector at a certain speed was fielded successfully, on average, by all shortstops.  They then compare each individual shortstop to those averages, based on the specific balls hit to that shortstop.  A shortstop’s plus/minus is the number of plays he made above or below what the average shortstop would have made on those same balls.

They don’t adjust the batted ball for the handedness of the batter or for the ballpark, and they don’t convert the plus/minus system into runs.  For outfielders, however, they do translate the plus/minus system into bases (this is called the “enhanced” plus/minus system).  And in the player comments, they show how each player performed at home and on the road, as well as how he did with left- or right-handed pitchers on the mound.
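
Here is a minimal Python sketch of the basic plus/minus idea as I’ve described it: figure out the average success rate for each vector/speed bucket, then credit each fielder with the difference between what he did and what the average fielder would have done on the same balls.  The record layout, the fielder names and the sample plays are invented for illustration, and the book’s real calculation may well differ in its details.

```python
from collections import defaultdict


def plus_minus(plays):
    """plays: list of (fielder, vector, speed, fielded) tuples.

    Hypothetical layout; vector and speed just need to be hashable.
    """
    # Average success rate for every (vector, speed) bucket,
    # pooled across all fielders.
    chances = defaultdict(int)
    made = defaultdict(int)
    for _, vector, speed, fielded in plays:
        chances[(vector, speed)] += 1
        made[(vector, speed)] += int(fielded)
    average = {bucket: made[bucket] / chances[bucket] for bucket in chances}

    # Each ball adds (actual result - expected result) to the fielder's total.
    totals = defaultdict(float)
    for fielder, vector, speed, fielded in plays:
        totals[fielder] += int(fielded) - average[(vector, speed)]
    return dict(totals)


# Invented sample: four hard-hit balls in the same vector, two per shortstop.
sample_plays = [
    ("Shortstop A", 17, "hard", True),
    ("Shortstop A", 17, "hard", True),
    ("Shortstop B", 17, "hard", False),
    ("Shortstop B", 17, "hard", True),
]
result = plus_minus(sample_plays)
```

In the sample, three of the four balls were fielded, so the bucket average is 0.75.  Shortstop A made both plays and ends up at +0.5, while Shortstop B ends up at -0.5.  No adjustment for batter handedness or ballpark is attempted here, which matches the description above.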

In the end, this is the strength of the Fielding Bible.  They’ve done a great job of presenting a lot of information and, instead of hiding it behind simple rankings or numbers, they’ve laid out a lot of individual detail.  If you want to disagree with their rankings, they give you the data to do it.  And that’s a healthy thing to do.

Bottom line:  VERY highly recommended.  Especially for the player comments.  :)


Posted by Studes on 02/28 at 12:32 PM
General
Tuesday, February 21, 2006

Batter Types

One thing I’ve wanted to do is use the batted ball library to investigate ways to categorize batters.  When we think of pitchers, we automatically think of strikeout pitchers, or groundball pitchers.  But when we think of batters, we tend to think of home run hitters, or hitters for average.  Nothing wrong with that, but I’ve wondered if there’s a more insightful way to categorize the men at the plate.

So here’s what I did.

Click for more...
Posted by Studes on 02/21 at 09:37 AM
Baseball Stats
Monday, February 20, 2006

Welcome, Brian Cashman

Believe it or not, my work is featured in several new books.  Dayn Perry has a fine new book called Winners, which includes a reference to the Win Shares Trade Balance Sheet that Mike Carminati and I produced last year.  In the index, my last name is listed between Strawberry and Sutcliffe.  That’s rarefied company.  I enjoyed Dayn’s book quite a bit, once I stopped staring at my name in the index.

In Baseball Hacks, Joe Adler has a list of his favorite sites and he calls Baseball Graphs “One of my favorite independent web sites… This site presents groups of statistics graphically, looking at statistics in ways that you’ve never seen before.”

Baseball Hacks is a very good resource for anyone who would like to better take advantage of all the statistical information available on the Internet, such as Retrosheet and the Baseball Databank.  Highly recommended for the potential geeks among you.


In a new book called Fantasyland, which I haven’t read yet, we learn that a real live General Manager likes to read Baseball Graphs, too.

When I ask Yankees general manager Brian Cashman if he reads any of this stuff, he rattles off a list of obscure sites he’s browsed, including something called Baseballgraphs.com, which maps the history of the game using advanced quantitative methods like defense efficiency ratio and fielding independent pitching.

To see what Brian Cashman might have been drawn to a couple of years ago, take a look at my 2003 page.  The style I developed there and in the historical graphs has since informed The Hardball Times website and Annual.

Welcome, Mr. Cashman!  Good luck this year.  And thank you, too, Dayn and Joe.  I hope your books find a wide audience.


Posted by Studes on 02/20 at 07:37 PM
Wednesday, February 15, 2006

A Library of Batted Ball Stats

I’ve posted a lot of batted ball tables on this blog.  Up to now, I’ve hesitated to just “dump” them all on the Internet because I just don’t believe in “data dumps.”  The tables are kind of confusing and need to be interpreted carefully.

However, this is really cool data, and I think the table format works pretty well (thanks to comments from readers of this blog).  So, the heck with it.  I’ve posted the batted ball tables of every major leaguer who saw a decent chunk of time last year.  You can start with the Batted Ball Index, which includes an index of all teams and a player search button.

I’ve also added some tips regarding how to read the tables.  So please look them over and leave any comments or questions you might have about the stats on this blog.  And spread the word—I hope that I didn’t do all this work in vain!


Posted by Studes on 02/15 at 01:18 PM
Baseball Stats

Peanut Butter Wiki

One of the services we use at the Hardball Times is called a Peanut Butter Wiki. It sounds silly, but it allows you to set up your own wiki as easily as creating a peanut butter sandwich.

A wiki is a superb tool for collaborative efforts. If you're a student in a study group, you can use it to work on group papers. At THT, we used it to share ideas and create the 2006 Hardball Times Annual.

PBWiki was our tool of choice because:

1. It's incredibly easy to use.
2. It's free.

You can't beat that. Consider this a ringing endorsement of PBWiki and visit it today!



You'll be happy to know that I have received more free space on our Wiki just for telling you about it. Consider this a win-win.
Posted by Studes on 02/15 at 09:54 AM
General
Monday, February 13, 2006

Jeremy Bonderman

Bonderman, Jeremy

               Net Runs per Ball    % of Batted Balls    %/OF  %/PA        Total Net Runs
Year      BFP     OF     LD     GB    OF%    LD%    GB%    HR     K    BB      OF      LD      GB      IF     NIP     Tot   R/G
2003      727   0.06   0.43  -0.08    29%    25%    43%   14%   15%    9%     8.9    58.4   -19.3    -4.5   -12.0    30.4   1.6
2004      793   0.06   0.37  -0.09    31%    18%    48%   14%   21%   10%     9.7    35.4   -22.7    -4.0   -22.4    -6.1  -0.3
2005      801   0.05   0.42  -0.09    30%    19%    48%   10%   18%    8%     8.1    46.9   -25.0    -5.3   -23.0     0.4   0.0
Avg.      774   0.05   0.40  -0.09    30%    21%    46%   13%   18%    9%     8.9    46.1   -22.3    -4.6   -19.1     7.5   0.4
vs. MLB         0.02   0.05   0.01    -1%     0%     2%    2%    1%   -1%     2.9     5.0     2.0     1.1    -4.9

Posted by Studes on 02/13 at 09:25 AM
Baseball Stats
Saturday, February 11, 2006

Batted Balls in the NL Central

STL

               Net Runs per Ball    % of Batted Balls    %/OF  %/PA        Total Net Runs
Year      BFP     OF     LD     GB    OF%    LD%    GB%    HR     K    BB      OF      LD      GB      IF     NIP     Tot   R/G
2002     6135  -0.01   0.33  -0.11    31%    21%    44%    9%   16%   10%   -17.2   313.6  -220.0   -42.4  -103.0   -89.3  -0.6
2003     6375   0.04   0.36  -0.10    32%    22%    41%   12%   15%    9%    67.0   382.6  -201.7   -51.6  -101.6    78.3   0.5
2004     6104   0.09   0.31  -0.12    30%    18%    48%   12%   17%    8%   124.3   255.1  -257.1   -36.3  -142.3   -74.2  -0.5
2005     6047   0.07   0.32  -0.13    27%    20%    51%   12%   16%    8%    89.1   280.1  -283.8   -31.6  -124.6   -90.6  -0.6
Avg.     6165   0.05   0.33  -0.12    30%    20%    46%   11%   16%    9%    67.2   306.4  -239.4   -40.5  -118.0   -42.4  -0.3
Vs. MLB         0.03   0.36  -0.10    31%    21%    44%   11%   17%   10%    47.6   327.1  -194.3   -45.5  -113.7    10.2   0.1


HOU

               Net Runs per Ball    % of Batted Balls    %/OF  %/PA        Total Net Runs
Year      BFP     OF     LD     GB    OF%    LD%    GB%    HR     K    BB      OF      LD      GB      IF     NIP     Tot   R/G
2002     6205   0.02   0.39  -0.11    30%    22%    44%   10%   20%   10%    21.5   365.8  -209.6   -44.1  -165.3   -48.9  -0.3
2003     6176   0.02   0.37  -0.12    30%    21%    44%   11%   18%   10%    21.9   330.4  -220.7   -50.8  -130.1   -64.4  -0.4
2004     6201   0.09   0.36  -0.11    32%    20%    44%   12%   21%   10%   121.3   299.7  -203.3   -42.1  -185.4   -26.8  -0.2
2005     6023   0.04   0.31  -0.11    29%    21%    46%   11%   19%    8%    45.6   284.1  -226.8   -35.5  -180.1  -125.9  -0.8
Avg.     6151   0.04   0.36  -0.11    30%    21%    45%   11%   20%    9%    51.8   320.0  -214.9   -43.2  -165.5   -66.9  -0.4
Vs. MLB         0.03   0.36  -0.10    31%    21%    44%   11%   17%   10%    47.5   326.3  -193.8   -45.4  -113.4    10.2   0.1


MIL

               Net Runs per Ball    % of Batted Balls    %/OF  %/PA        Total Net Runs
Year      BFP     OF     LD     GB    OF%    LD%    GB%    HR     K    BB      OF      LD      GB      IF     NIP     Tot   R/G
2002     6339   0.04   0.36  -0.10    32%    21%    42%   12%   16%   11%    60.6   343.0  -180.5   -51.4   -71.0    80.4   0.5
2003     6459   0.06   0.36  -0.08    32%    21%    41%   13%   16%   10%    95.3   362.2  -158.4   -60.1  -101.3   132.9   0.8
2004     6204   0.05   0.34  -0.07    33%    18%    44%   10%   18%    8%    68.2   273.1  -134.0   -46.4  -153.2    -2.5   0.0
2005     6208   0.05   0.31  -0.08    33%    21%    41%   11%   19%   10%    65.8   278.4  -148.7   -47.1  -148.5   -11.8  -0.1
Avg.     6303   0.05   0.34  -0.08    33%    21%    42%   12%   17%   10%    72.4   313.8  -155.9   -51.2  -118.6    49.1   0.3
Vs. MLB         0.03   0.36  -0.10    31%    21%    44%   11%   17%   10%    48.7   334.4  -198.6   -46.5  -116.2    10.5   0.1


CHC

               Net Runs per Ball    % of Batted Balls    %/OF  %/PA        Total Net Runs
Year      BFP     OF     LD     GB    OF%    LD%    GB%    HR     K    BB      OF      LD      GB      IF     NIP     Tot   R/G
2002     6236   0.02   0.38  -0.10    32%    22%    42%   11%   21%   11%    25.1   341.0  -165.7   -43.0  -178.8   -40.2  -0.2
2003     6227   0.00   0.36  -0.11    27%    22%    47%   11%   23%   11%     4.4   319.5  -198.0   -37.8  -191.4  -122.7  -0.7
2004     6262   0.05   0.34  -0.09    31%    19%    45%   11%   21%   10%    60.2   274.8  -166.0   -42.9  -195.0   -84.3  -0.5
2005     6185   0.06   0.31  -0.10    30%    21%    46%   13%   20%   10%    79.2   271.5  -183.7   -33.6  -168.5   -54.3  -0.3
Avg.     6228   0.03   0.35  -0.10    30%    21%    45%   12%   21%   10%    40.9   301.5  -178.3   -39.5  -183.5   -76.5  -0.5
Vs. MLB         0.03   0.36  -0.10    31%    21%    44%   11%   17%   10%    48.1   330.4  -196.2   -46.0  -114.8    10.3   0.1


PIT

               Net Runs per Ball    % of Batted Balls    %/OF  %/PA        Total Net Runs
Year      BFP     OF     LD     GB    OF%    LD%    GB%    HR     K    BB      OF      LD      GB      IF     NIP     Tot   R/G
2002     6131   0.06   0.37  -0.13    28%    21%    48%   13%   15%   10%    75.0   354.3  -264.9   -33.4   -71.5    44.1   0.3
2003     6293   0.02   0.39  -0.11    29%    22%    45%   11%   15%    9%    23.7   410.9  -236.6   -39.1   -92.5    44.8   0.3
2004     6197   0.07   0.34  -0.10    32%    19%    45%   10%   17%   10%    99.9   283.9  -205.6   -42.1  -113.0     1.6   0.0
2005     6264   0.07   0.33  -0.12    31%    21%    44%   11%   15%   11%    90.6   314.8  -235.2   -45.8   -66.8    43.8   0.3
Avg.     6221   0.05   0.36  -0.12    30%    21%    45%   11%   16%   10%    72.2   339.3  -235.2   -40.2   -85.9    32.7   0.2
Vs. MLB         0.03   0.36  -0.10    31%    21%    44%   11%   17%   10%    48.0   330.0  -196.1   -45.9  -114.7    10.3   0.1


CIN

               Net Runs per Ball    % of Batted Balls    %/OF  %/PA        Total Net Runs
Year      BFP     OF     LD     GB    OF%    LD%    GB%    HR     K    BB      OF      LD      GB      IF     NIP     Tot   R/G
2002     6296   0.01   0.36  -0.10    29%    21%    46%   11%   16%   10%     9.7   342.2  -205.2   -43.0   -95.1    -4.9   0.0
2003     6423   0.04   0.37  -0.10    31%    23%    43%   12%   15%   10%    59.0   407.0  -203.0   -44.3   -71.9   133.9   0.8
2004     6451   0.10   0.37  -0.08    33%    20%    43%   13%   15%   10%   161.6   334.5  -164.4   -50.6   -92.5   178.3   1.1
2005     6397   0.09   0.36  -0.09    33%    21%    41%   13%   15%    9%   147.0   364.1  -172.7   -51.8   -97.4   176.8   1.1
Avg.     6392   0.06   0.36  -0.09    32%    21%    43%   12%   15%   10%    91.2   361.6  -186.1   -47.4   -89.2   118.0   0.7
Vs. MLB         0.03   0.36  -0.10    31%    21%    44%   11%   17%   10%    49.3   339.1  -201.4   -47.2  -117.9    10.6   0.1

Posted by Studes on 02/11 at 02:46 PM
Baseball Stats

Historical Win Shares Spreadsheet

I’ve updated my historical Win Shares spreadsheets, which now include 2005 Win Shares.  You can download them from…

ftp://ftp.baseballgraphs.com/winshares

The 2004 and 2005 Win Shares are somewhat different from the original Bill James formula, as explained on the Hardball Times website.


Posted by Studes on 02/11 at 02:17 PM
Win Shares
Tuesday, February 07, 2006

Competitive Balance Review

Dan Fox recently wrote a great article at the Hardball Times called Competitive Balance and the CBA.  Dan looked at the typical variance in salaries between 1990 and 2005 to try to ascertain if the latest Collective Bargaining Agreement has restored any competitive balance in the majors.

Dan’s conclusion is that major league competitive balance has gotten worse instead of better during the time in question.  I like his methodology a lot, but I took a slightly different approach to the data and came to a different conclusion.

Specifically, I left the New York Yankees and the four expansion teams of the 1990’s out of the sample.  My reasoning was pretty simple; the Yankees don’t really count because they’ve made it clear (up to this past offseason) that they don’t care about any friggin’ payroll tax.  And the expansion teams might have skewed the data because their payrolls almost certainly followed a different path during the time in question.

I replicated Dan’s graph and applied it to my results.  Here ya go:

image

You can refer to Dan’s article for an explanation of the graph, but the key line is the red one, which represents the coefficient of variation (plotted against the axis on the right).  Suffice it to say that the CV has generally risen, but it reached its peak in the late 1990’s and has actually decreased since then.  Maybe, just maybe, the latest version of the CBA has helped after all.
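
For anyone who wants to try this at home, here is a quick Python sketch of the coefficient of variation (standard deviation divided by the mean) with the Yankees and the four 1990’s expansion teams left out.  The team abbreviations and the payroll numbers are placeholders I made up, and using the population standard deviation is an assumption on my part; neither Dan’s article nor my graph is tied to this exact code.

```python
from statistics import mean, pstdev

# Yankees plus the four 1990's expansion teams (Rockies, Marlins,
# Diamondbacks, Devil Rays).  The abbreviations are just labels for this sketch.
EXCLUDED = {"NYY", "COL", "FLA", "ARI", "TB"}


def payroll_cv(payrolls, excluded=EXCLUDED):
    """Coefficient of variation of team payrolls for one season.

    payrolls: dict of team abbreviation -> payroll (any consistent unit).
    """
    kept = [amount for team, amount in payrolls.items() if team not in excluded]
    return pstdev(kept) / mean(kept)


# Made-up payrolls, in millions, purely for illustration.
sample = {"AAA": 50.0, "BBB": 100.0, "CCC": 150.0, "NYY": 200.0}
print(round(payroll_cv(sample), 3))                  # 0.408 without the Yankees
print(round(payroll_cv(sample, excluded=set()), 3))  # 0.447 with everyone included
```

Run it once per season and plot the results by year to get a line like the red one in the graph.  Even in this toy example, a single very large payroll noticeably pushes the CV up, which is why I left the Yankees out.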


Posted by Studes on 02/07 at 03:22 PM
Economics