The Baseball Graphs Blog
Thursday, April 28, 2005
Red Sox vs. Devil Rays
I posted a review of a Boston/Tampa Bay game a couple of days ago at The Hardball Times. I also added the “P” of each situation to the graph, which might appeal a bit more to everyone’s graphical esthetics.
Also, there is a satirical take on Win Probability from Tim Keown at ESPN. I have a feeling he might be a contributor to the RedsZone Forum!
Thursday, April 21, 2005
Astros’ WPA
There’s a new blog in town, Third Coast Baseball, that has a very nice WPA review of the last two Astros games. It looks superb. As other folks publish their own WPA reviews, please post links to them here. It will be fun to see how everyone is handling the output and to share ideas.
Wednesday, April 20, 2005
Blue Jays/Red Sox
David Tybor has posted another WPA graph—this one of last night’s Blue Jays/Red Sox game, which was won in the late innings by the Blue Jays.
David’s approach to graphing the game is confusing to me, and we both thought it might be worth discussing here on the site. Here are some points:
- Essentially, the scale of the chart flips every time one team rises above or falls below 50%.
- Remember that some segment of the population is colorblind, meaning they won’t be able to make any sense of this graph.
- David’s response is that there is a lot of white space on the graphs I produce (good point!), and he was trying a new approach.
Any responses? My reply would be that essentially flipping the Y axis when one team rises above 50% makes it very hard to follow the flow of the game graphically, and it isn’t worth the tradeoff of less white space. One way to address the white space problem is to add more info to the graph (as long as it’s relevant), such as the “P” of each situation.
For those interested, I point you to this summary of Edward Tufte’s work. Tufte is the Godfather of Good Graphs.
Tuesday, April 19, 2005
WPA Updates
I posted a Game in Review at the Hardball Times, covering the pitching duel between Rich Harden and Jarrod Washburn last Saturday. Also, David Tybor has discovered the spreadsheet, and he posted a graph of last night’s Mets/Phillies game (scroll down to see it).
I love this graph, by the way. The final score was 5-4, Phillies, which would lead you to believe it was a close game. But the Phils led 5-0 going into the ninth, and even Cliff Floyd’s home run didn’t have a significant effect on the Phillies’ chances of winning.
In other news, I’ve stopped tracking the Reds’ WPA on a daily basis. It was just too much time, because I was trying to capture fielding WPA as well, which meant I had to watch every game in its entirety. But as a result, I do have some beginning thoughts on how to allocate WPA debits and credits between pitchers and fielders, using only play-by-play logs.
- Credit the following events 100% to the pitcher: strikeouts, walks, HBPs and home runs, for obvious reasons.
- Also, credit pop flies to the infield 100% to the pitcher, because they are converted into outs virtually 100% of the time.
- Line drives are trickier. My observation is that line drives caught by infielders are mostly a matter of chance, though you could also call it positioning. So I don’t tend to give infielders much credit for catching line drives, just as I don’t give them much of a debit for not catching line drives. Maybe 5% at the most?
- Line drives to the outfield do provide more of a chance for fielders to make an impact. Maybe 10%?
- Groundballs and flyballs provide fielders with the greatest chance to make an impact, and I would credit and debit 10% to 20% of the outcome to them in those cases.
Now, these are just guidelines for discussion purposes. Only use them if you want to. And, if you do want to allocate WPA between pitchers and fielders, you’ll find pretty quickly that you should really watch the entire game.
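As a rough illustration of the guidelines above, here is a minimal Python sketch of the split. The names (`FIELDER_SHARE`, `split_wpa`) and the event labels are made up for this example and are not part of the spreadsheet. The exact shares are also assumptions: I picked the upper bound (5%) for infield line drives and the midpoint (15%) of the 10% to 20% range for groundballs and flyballs.

```python
# Hypothetical fielder shares by event type, loosely following the
# guidelines in this post. These numbers are discussion starters only.
FIELDER_SHARE = {
    "strikeout": 0.0,
    "walk": 0.0,
    "hbp": 0.0,
    "home_run": 0.0,
    "infield_pop": 0.0,           # converted into outs virtually every time
    "infield_line_drive": 0.05,   # "maybe 5% at the most"
    "outfield_line_drive": 0.10,  # "maybe 10%?"
    "groundball": 0.15,           # assumed midpoint of 10% to 20%
    "flyball": 0.15,              # assumed midpoint of 10% to 20%
}


def split_wpa(event, wpa_change):
    """Split the defense's WPA change for one play.

    Returns (pitcher_wpa, fielder_wpa). The two parts always add up
    to wpa_change, whether it is a credit (positive) or a debit (negative).
    """
    share = FIELDER_SHARE[event]
    fielder_wpa = wpa_change * share
    pitcher_wpa = wpa_change - fielder_wpa
    return pitcher_wpa, fielder_wpa
```

For example, `split_wpa("groundball", -0.10)` debits about -0.085 to the pitcher and -0.015 to the fielder. Changing a share is a one-line edit to the table, which makes it easy to try different allocations on the same play-by-play log.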
One last thing. I’ve been using the play-by-play logs in mlb.com’s “Gameday” application, because they offer the most detail, including the type of batted ball and who caught each ball. If other folks have their own preferred play-by-play logs, please let me know.
Thursday, April 14, 2005
Another spreadsheet upload
I uploaded a new version of the spreadsheet. The previous one had a slight lookup problem in one of the tables that was throwing off some of the math. Please download the latest file, which is still Version 1.3, from ftp.baseballgraphs.com/wpa.
Two game updates:
The Cheat posted a WPA review of one of the White Sox’ games.
And I’ve got a review of the latest Reds’ game, featuring the intentional walk to Pujols with one out, and a game-ending double play.
Wednesday, April 13, 2005
New version of the spreadsheet uploaded
I found a bug in the spreadsheet involving run differentials of more than ten. Fixed it, and uploaded the new version (Version 1.3) to ftp.baseballgraphs.com.
I also posted the WPA of the most recent Reds/Astros game at the RedsZone Forum.
Tuesday, April 12, 2005
Sky’s WPA
Skyking has written up two games using WPA. I should have linked to these before, because he does a nice job of using WPA to diagnose a game. Maybe I’ll steal a couple of his ideas!
Here’s the link for the Devil Rays/Blue Jays game.
And here’s the link to the Braves/Marlins game.
Rangers/Mariners Game
I chronicle the WPA of last Saturday’s Rangers/Mariners game over at The Hardball Times. If you have any questions or comments about the article, you can post them here.
A central issue I raise in the article is how to measure the criticality (or importance, or leverage) of a specific situation. There are three definitions I know of:
- “P”, which is the difference in WP between the current situation and the WP if the pitcher pitches out of the situation to the end of the inning with no runs scoring. Invented by Doug Drinen.
- Tangotiger’s “Leverage Index”, which is similar, except that it measures the difference between the current situation and what would happen if the batter does something positive (I’m not sure how Tango calculates this). The key to Leverage Index is that an average situation is set to 1.0.
- Keith Woolner took Tango’s concept of Leverage Index and incorporated his own math. His approach is to calculate the difference a single run scored would make in WP and define that as the measure.
These are three very different ways of using Win Probability to define criticality (outs, batter, runs), though all three are based on the same idea: calculating the difference between the current state and some potential future state.
I have another idea. What if criticality is defined as the difference between two potential future states? For instance, what if it is defined as the difference between a strikeout in the current situation vs. a home run, given the current situation?
The notion of a “criticality measure” may be the most important thing to come out of Win Probability. It would be something many fans would find useful, I think. So it’s worth kicking around.
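To make the comparison concrete, here is a minimal Python sketch of two of these measures: “P” and the strikeout-versus-home-run idea. Everything in it is an assumption for illustration. `GameState`, `p_measure`, and `hr_vs_k` are names I made up, the Win Probability lookup is passed in as a function returning the home team’s WP, and `toy_win_prob` is a made-up placeholder, not a real WP table. The sketch also ignores game-ending situations, which a real WP lookup would have to handle.

```python
import math
from dataclasses import dataclass, replace

EMPTY = (False, False, False)


@dataclass(frozen=True)
class GameState:
    inning: int
    top: bool        # True while the visiting team bats
    outs: int
    bases: tuple     # (first, second, third) occupied flags
    score_diff: int  # home runs minus visitor runs


def end_of_half_inning(s):
    """State after the half inning ends with no further scoring."""
    if s.top:
        return replace(s, top=False, outs=0, bases=EMPTY)
    return replace(s, inning=s.inning + 1, top=True, outs=0, bases=EMPTY)


def strikeout(s):
    """State after the batter strikes out (runners hold)."""
    if s.outs == 2:
        return end_of_half_inning(s)
    return replace(s, outs=s.outs + 1)


def home_run(s):
    """State after a home run: batter and all runners score."""
    runs = 1 + sum(s.bases)
    delta = -runs if s.top else runs
    return replace(s, bases=EMPTY, score_diff=s.score_diff + delta)


def p_measure(s, win_prob):
    """'P': current WP vs. WP if the inning ends with no runs scoring."""
    return abs(win_prob(s) - win_prob(end_of_half_inning(s)))


def hr_vs_k(s, win_prob):
    """Proposed measure: WP after a home run vs. WP after a strikeout."""
    return abs(win_prob(home_run(s)) - win_prob(strikeout(s)))


def toy_win_prob(s):
    """Placeholder only: a made-up curve that looks at nothing but the score."""
    return 1.0 / (1.0 + math.exp(-0.5 * s.score_diff))
```

Plugging in a real WP table in place of `toy_win_prob` would let you compute both measures for every situation in a game log and see where they disagree. I took absolute values so both measures read as “how much is at stake” regardless of which team is batting; that is a choice of this sketch, not part of either definition.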
Monday, April 11, 2005
Blue Jays’ WPA
Thomas Ayers has posted the WPA totals from yesterday’s 4-3 win over the Red Sox at the Batter’s Box Interactive Magazine.
Spreadsheet Tips
If you work with the WPA spreadsheet, here are a few tips:
- Don’t save the spreadsheet under a name other than WinExp.xls if you want the macros to work. I generally log a game in the spreadsheet as is, and save it under a different name only when I’m done.
- Make sure your Macro security is set to “Medium.” Otherwise, the macros won’t run.
- The macros don’t seem to work on a Mac. Don’t know why.
If you do log a game, be sure to enter the info in the “Start” page, as well as the pitchers in the “Front End” page. Then, for each play, enter the score difference, inning, outs, and base situation, and note each of the following in the four yellow description boxes:
- Play description
- Offensive player involved
- Percent of play to be allocated to pitcher
- The name of the fielder, if any, who has WPA credited to him
It probably works best to keep the file in two places: the original WinExp file in one directory, and a copy in another directory that you use to log games. This way, if you change something in the working copy, it won’t affect the original.
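If you’d rather script that step than copy the file by hand, here is a minimal Python sketch. The function name `make_working_copy` and the directory names in the usage line are hypothetical. The one thing it does deliberately is keep the file name unchanged in the new directory, since the macros depend on the WinExp.xls name.

```python
import shutil
from pathlib import Path


def make_working_copy(original, work_dir):
    """Copy the original spreadsheet into work_dir under the same file name.

    Returns the path of the working copy. The original is left untouched.
    """
    original = Path(original)
    work_dir = Path(work_dir)
    # Refuse to copy onto the original itself.
    if work_dir.resolve() == original.parent.resolve():
        raise ValueError("work_dir must differ from the original's directory")
    work_dir.mkdir(parents=True, exist_ok=True)
    target = work_dir / original.name  # same name, so the macros still work
    shutil.copy2(original, target)
    return target
```

Usage would look something like `make_working_copy("wpa/original/WinExp.xls", "wpa/games/reds-astros")`, with one working directory per game.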
Sunday, April 10, 2005
WPA Central
There seems to be a lot of interest in Win Probability Added these days. I have a Win Probability Added spreadsheet that Jon Daly and I built based on the previous work of Tangotiger, Keith Woolner and Doug Drinen, and I’ve used it to track several games at The Hardball Times.
I’m happy to share the spreadsheet with others, and several people have gotten copies. The thing is, WPA is an evolving approach, and the spreadsheet has evolved too. Plus, my e-mail conversations with other folks using the worksheet have become a bit disjointed. So… I’ve decided to use this blog as a central gathering place for WPA devotees using the spreadsheet. For this summer, we’ll limit our Baseball Graphs posts to WPA, its uses, applications and implications.
In related news, I’ve come to an agreement with the RedsZone Forum to track all of the Reds’ games this summer with the spreadsheet. I had wanted to track a specific team all year long with WPA, so this agreement has given me some focus. I’ll be posting WPA results on their site each day, hopefully within twenty-four hours of the game’s end.
Plus, I’ll be writing a weekly column for The Hardball Times called “Game in Review,” in which I’ll use WPA to track the ins and outs of a single game.
So that’s the story. If you want a copy of the spreadsheet, Version 1.2 is available at ftp.baseballgraphs.com/wpa. Keep in mind that I keep finding bugs and ways to improve the interface. Plus, there are lots of unresolved issues regarding how to score certain types of plays. So, if you’re interested in WPA, please drop by often and post any questions or issues.

