The Win Shares Baseline
December 16, 2003
A month ago, I wrote a couple of articles about Game Shares and Loss Shares. I think I lost a few people along the way, and wound up with an idea that was much too complicated to explain. And if you can’t explain an idea, it won’t get far.
So I went back to the drawing board and devised an easier way to determine appropriate Win Share baselines for each player. Baselines are crucial for an adequate interpretation of Win Shares, as explained in the Loss Shares article.
This baseline is the number of Win Shares an individual player would produce if he were an average player (batting, fielding and pitching), given his playing time. His actual Win Shares can be compared to the baseline to determine how many Win Shares he contributed, above or below an average player.
This average baseline, by the way, is certainly not the most appropriate baseline to use for player comparison—a replacement level baseline is much better. However, to calculate replacement, one first needs a sense of average. Replacement level will come later (if I dare!).
The method I’ve devised is based partly on a paper Charlie Saeger sent me called “expected Win Shares” and based partly on suggestions from Steve Rohde in the Historic Win Shares per PA commentary. Also, I should extend my usual debt of gratitude to Tangotiger, who suggested this approach in the first place.
So here’s what I did to establish the Win Shares Baseline for each player:
I selected different playing time factors for the four types of Win Shares.
- Plate Appearances per batter (with one adjustment, to be explained)
- Innings and Save-equivalent innings per pitcher
- Innings played per fielder (except pitchers)
- Plate Appearances per batting pitcher (National League only)
The plate appearances for each batter were adjusted to account for team-by-team differences in plate appearances per game. I had a little trouble getting my arms around this idea, but finally got there thanks to Steve Rohde’s help.
I then established the league baseline by dividing the league total of each Win Share type (batting, pitching, fielding) by the league total of the matching playing time factor (plate appearances, innings and save-equivalent innings, innings played).
Thanks to Paul’s and Charlie’s suggestions in the comments below, I have also established separate batting baselines for National League pitchers and batters. This gives a more sensible baseline for both batters and pitchers in the National League.
I then multiplied each individual player’s playing time of each type by the league average and Voila!—an average Win Shares baseline for each player, compiled across all types of Win Shares, and for all players.
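The steps above can be sketched in a few lines of Python. This is only a rough illustration, not the actual spreadsheet: the four players and all of their numbers are made up, the field names are hypothetical, and the sketch leaves out the team plate-appearance adjustment, save-equivalent innings, and the separate baseline for National League pitchers at bat.

```python
# Toy league: every number below is invented purely for illustration.
# Each pair links a Win Share type to its playing-time factor.
PAIRS = [
    ("bat_ws", "pa"),            # batting Win Shares per plate appearance
    ("pitch_ws", "ip"),          # pitching Win Shares per inning pitched
    ("field_ws", "field_inn"),   # fielding Win Shares per inning in the field
]

players = [
    {"name": "A", "pa": 650, "ip": 0, "field_inn": 1300,
     "bat_ws": 24.0, "pitch_ws": 0.0, "field_ws": 5.0},
    {"name": "B", "pa": 400, "ip": 0, "field_inn": 800,
     "bat_ws": 8.0, "pitch_ws": 0.0, "field_ws": 3.0},
    {"name": "C", "pa": 0, "ip": 200, "field_inn": 0,
     "bat_ws": 0.0, "pitch_ws": 14.0, "field_ws": 0.0},
    {"name": "D", "pa": 0, "ip": 70, "field_inn": 0,
     "bat_ws": 0.0, "pitch_ws": 4.0, "field_ws": 0.0},
]


def league_rates(players):
    """League Win Shares per unit of playing time, for each type."""
    rates = {}
    for ws_key, time_key in PAIRS:
        total_ws = sum(p[ws_key] for p in players)
        total_time = sum(p[time_key] for p in players)
        rates[time_key] = total_ws / total_time
    return rates


def baseline(player, rates):
    """Win Shares an average player would earn in this player's playing time."""
    return sum(player[time_key] * rate for time_key, rate in rates.items())


def actual(player):
    """The player's real Win Shares, summed across all types."""
    return sum(player[ws_key] for ws_key, _ in PAIRS)


rates = league_rates(players)
for p in players:
    wsaa = actual(p) - baseline(p, rates)
    print(f"{p['name']}: actual {actual(p):.1f}, "
          f"baseline {baseline(p, rates):.1f}, WSAA {wsaa:+.1f}")
```

One property worth noticing: because the rates come from league totals, the baselines add up to the league's total Win Shares, so WSAA sums to zero across the league.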
For each player, I’ve calculated his Win Shares (making some of the changes noted previously on this blog), his expected average Win Shares baseline, and the difference between the two (WSAA). Please note that WSAA is a MUCH better way to evaluate players than straight Win Shares totals.
One other note: If you’d like to play with replacement levels yourself, feel free to do so. You can simply copy this data into a spreadsheet, and multiply the expected average Win Shares by the factor you think most appropriate (80%, 50%, whatever). Hopefully, I’ll find time to investigate replacement levels later this offseason.
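For anyone who would rather script it than use a spreadsheet, here is a small sketch of that scaling. The function name and the two sample players are hypothetical, and the factors are just the examples mentioned above, not recommended replacement levels.

```python
def above_baseline(actual_ws, expected_avg_ws, factor=1.0):
    """Win Shares above a baseline set at `factor` times the average baseline.

    factor=1.0 gives WSAA; a smaller factor stands in for a guess at
    replacement level.
    """
    return actual_ws - factor * expected_avg_ws


# Two made-up players: X has more Win Shares, Y earned his in less playing time.
sample = {
    "X": {"ws": 25.0, "expected": 20.0},
    "Y": {"ws": 20.0, "expected": 12.0},
}

for factor in (1.0, 0.8, 0.5):
    ranked = sorted(
        sample,
        key=lambda name: above_baseline(
            sample[name]["ws"], sample[name]["expected"], factor
        ),
        reverse=True,
    )
    print(f"factor {factor:.0%}: {' > '.join(ranked)}")
```

With these invented numbers, Y ranks ahead of X against the average baseline, but X moves ahead once the baseline drops to 50% of average. The factor you choose matters mostly for comparing players with very different playing time.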
I believe this is an important step in Win Shares evolution. Notice, for instance, how the rankings change when you move from Win Share ranks to WSAA ranks.
Here are the American League (sortable) totals.
And here are the National League (sortable) totals.
I just discovered your site and all of your great articles on Win Shares, so I’m commenting here on a bunch of articles, not just this one. Hope that’s okay. Plus there’s a 2500 character limit, so I’m making this two posts.
Loss Shares - You compare Ty Wigginton and Jason Phillips and indicate that they have the same amount of WS but did it in different outs. If I understand your argument, you are saying that they shouldn’t be considered equal WS-wise because Ty made more outs than Jason. But there is already a correction being made in WS for the number of outs. Ty starts out 12 runs better than Jason and the WS system makes them equal *because* of the outs. You can argue about the .52 point but I don’t understand how you can argue that we need to take outs into consideration *again*.
Bonds Held Back by own Pitchers - I totally agree with this article. But I have a scenario for you that makes this a little unpalatable. Take a team that is so bad on offense that quite a few of its every-day hitters have negative claim points. This will result in the few hitters that have positive claim points being boosted quite a lot. As an example (which everyone can take with a grain of salt, since it’s not “real”), in a recent DMB 1969 simulation, the San Diego Padres were very bad offense-wise. Their only hitting stars were Nate Colbert, Ollie Brown and Al Ferrara. With negative claim points set to zero, these guys had 12, 9, and 8 WS respectively - respectable average seasons. Meanwhile, Cito Gaston, Larry Stahl, Tommy Dean, and Chris Cannizzaro were so bad that they get 0 WS because their claim points were low. When I don’t make this artificial cutoff, they get -8, -4, -2, -2 WS respectively. *BUT* because the other batters have to make up for their ineptitude, Colbert, Brown and Ferrara get boosted to 25, 18, 15 WS levels. That’s quite the jump and it looks strange. But if you carry the -WS logic through, I think it makes sense. Those negative claim point hitters are *SO* bad, that Colbert had to be almost-MVP-quality to allow San Diego to win as many games with their offense as they did.
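The boost described here falls out of the arithmetic when team batting Win Shares are split in proportion to claim points. Below is a minimal sketch of that effect. The claim points and the team total are invented (they are not the simulated Padres figures), and the function name is hypothetical.

```python
def allocate(team_ws, claims, zero_negatives=True):
    """Split team Win Shares among hitters in proportion to claim points.

    With zero_negatives=True, negative claims are set to zero first (the
    "artificial cutoff"); with False, negative claims are carried through.
    """
    used = [max(c, 0.0) for c in claims] if zero_negatives else list(claims)
    total = sum(used)
    if total <= 0:
        raise ValueError("total claim points must be positive")
    return [team_ws * c / total for c in used]


# Invented claim points: three good hitters and four well below the margin.
claims = [50.0, 38.0, 32.0, -20.0, -10.0, -5.0, -5.0]
team_batting_ws = 40.0

capped = allocate(team_batting_ws, claims, zero_negatives=True)
raw = allocate(team_batting_ws, claims, zero_negatives=False)

for c, a, b in zip(claims, capped, raw):
    print(f"claim {c:+6.1f}: cutoff {a:5.1f} WS, no cutoff {b:+5.1f} WS")
```

Both versions hand out the same team total. Without the cutoff, the negative hitters shrink the denominator, so every positive hitter's share grows.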
All in all, yours is a great site and I’ll be coming back for more articles, so keep them coming.
Here is part two of my post.
Overvaluing Relievers - I agree with many that in WS pitching tends to be undervalued for the sake of fielding, but I never knew about the starter/reliever problem until I read your article. I’ve always felt strange about the PCL-2 (W/L/S/Hd) points because they’re so artificial. And now you’ve given me a good statistical basis to dislike it! I’m going to look at Charlie’s claim point formulas and maybe use them instead for my WS calculations.
This article - I’m trying to understand your point about WSAA and I just don’t get it. I don’t understand why, in a specific season, I can’t compare players using their WS figure. I guess I’ll have to play around with your sortable charts to see.
Jean, thanks so much for stopping by. Sounds like you’ve done a lot of great work with Win Shares.
Your Park Factor formulas are essentially correct. I got very little difference when I made the change, but I did get some difference (less than a Win Share before rounding; sometimes a full Win Share once rounded). It’s still an important step to increase the validity between marginal runs and Win Shares. I’ll pull out my spreadsheet tonight to see if I can give you any more insight as to how I got a difference and you didn’t.
Regarding WSAA and Loss Shares (same concept), it’s really kind of simple and I should practice my explanation. If a player has created the same number of Win Shares as another player in less playing time, then he’s helped his team more. If Win Shares were based on true replacement level instead of 52%, then this wouldn’t be the case. But it’s not, so we need to make this adjustment. I guess, in a way, it is “arguing with the 52%.” Instead of trying to “correct” the 52%, I’ve chosen to add an average baseline.
Let me know if that helps.
Regarding the negative Win Shares, it’s hard to comment without seeing the data. But if a team had a ton of folks with negative Win Shares, then the correction would actually be even more important. It shouldn’t skew the positive Win Shares out of proportion with players on other teams—it should bring them in line.
My question would be, how do Colbert’s batting Win Shares compare to guys with similar stats in your league?
Thanks again for your comments. Please do keep dropping by. If you have other research ideas, or insights you’ve had yourself, feel free to share them.
Posted by studes
on 12/18 at 06:35 AM
Ah, studes, something I have always wanted to see, but never had the time or skill to show it to myself.
Jean, if you want to use my Claim Point formulae, I disavow the strikeout adjustment, based on further research. The individual strikeout makes so little difference that Nolan Ryan is probably shortchanged only a couple of Win Shares, career.
I have a problem with this statement: If a player has created the same number of Win Shares as another player in less playing time, then he’s helped his team more.
Isn’t that totally counter to the point of Win Shares, which is to generate one number that expresses the amount that a player has helped his team?
To me, Wigginton’s RC with more outs being considered equivalent to Phillips’ lower RC with fewer outs is just fine. I’ve always struggled with how to evaluate two players where one has more RC than the other, but the other has a better RC/27. Determining RC above Marginal Runs was the breakthrough for me and I don’t know why I didn’t think of it until I read the Win Shares book.
Basically, Batting WS is RCAA, so I question having WSAA.
Now, if you want to discuss changing the 0.52 factor, I’m all with you there. In fact, I’ve got an argument about the preponderance of the 0.52 factor in Bill James’ equations. But I’ll make that in another post. 8-)
In the spreadsheet that I created to calculate Win Shares, I made each occurrence of 0.52 a parameter. I actually believe that 0.52 is the wrong number for splitting WS between Offense and Defence. On a lead from Tangotiger, I plugged in the golden ratio conjugate, (sqrt(5)-1)/2 (about 0.618), and got much more reasonable numbers: numbers where pitchers have a better chance of making the MVP lists.
But you *CAN’T* take that same number and use it in the Marginal RC formula for Batting WS. I don’t understand why Bill used 0.52 there—I attribute it to a desire to use 0.52 everywhere. 8-)
I’m still trying to figure out what number we should use there—something closer to 0.25 seems better. Does anyone have an idea of what a suitable replacement value should be? That would eliminate the Wigginton/Phillips debate (only to make it a Phillips/someone else debate, I’m sure!).
Once I’ve got my fielding numbers calculated in my spreadsheet, then I hope to expand it to other years—I only have one year. And then I can talk with some sense of authority (yeah right!).
Two potentially simple algorithms for determining the batting threshold number:
1) Take the data from studes’ last post, separate out the guys who were signed as free agents and are making less than $500k (or some similar number), and take their performance as “replacement level.”
2) Pick your favorite system/website for projecting the major league performance of minor leaguers, and plug in the guys who were exposed to the recent Rule V draft. Those guys are about as free as they come, $350k.
Based on what people have said, I wouldn’t worry about nailing it down with too much precision; I doubt being off by 5% from the “ideal” number will make a difference to many people’s totals.
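As a rough sketch of the first algorithm: filter the cheap free-agent signings, pool their batting, and express it as a fraction of the league rate. Everything here is an assumption made for illustration, including the field names, the salary cutoff, the use of Runs Created per out as the performance measure, and all of the sample numbers.

```python
def replacement_threshold(players, lg_rc_per_out, max_salary=500_000):
    """Pooled RC per out of cheap free-agent signings, relative to league.

    Returns a fraction of league-average offense, i.e. a candidate for the
    number that would take the place of .52 on the batting side.
    """
    pool = [p for p in players
            if p["free_agent"] and p["salary"] < max_salary]
    outs = sum(p["outs"] for p in pool)
    if outs == 0:
        raise ValueError("no qualifying players with outs recorded")
    rc = sum(p["rc"] for p in pool)
    return (rc / outs) / lg_rc_per_out


# Invented sample: two cheap free agents, one star, one cheap non-free agent.
sample = [
    {"name": "P1", "salary": 350_000, "free_agent": True, "rc": 40.0, "outs": 400},
    {"name": "P2", "salary": 450_000, "free_agent": True, "rc": 20.0, "outs": 200},
    {"name": "P3", "salary": 9_000_000, "free_agent": True, "rc": 110.0, "outs": 420},
    {"name": "P4", "salary": 320_000, "free_agent": False, "rc": 70.0, "outs": 380},
]

threshold = replacement_threshold(sample, lg_rc_per_out=0.175)
print(f"replacement threshold: {threshold:.2f} of league average")
```

Pooling runs and outs before dividing weights each player by his playing time, which keeps a handful of tiny samples from dominating the estimate.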
In general, I think you will find the number to be well below 52%. However, it gets more interesting if you want to make it more GM-specific. For instance, in looking at the Kaz Matsui signing, the Mets had more options than the above algorithms would suggest.
Based on the way Bill James approached Win Shares, the two thresholds have to differ by exactly one (.52 and 1.52, for example). This is something I attempted to explain in the first article.
The reason .52*RS and 1.52*RA works is because the two cancel each other out when you put them into an equation, and you’re left with basic run differential. Bill James built this whole system on the premise that, within the normal bounds of baseball teams’ performance, run differential is a good predictor of won/loss record. I believe this was a valid way to think of it (though some would certainly debate the point).
.61 and 1.61 work. So do .9 and 1.9. But if you make them .25 and 1.61, the system doesn’t work, because then you no longer have RS-RA=a proxy for Wins/Losses.
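Here is a small sketch of why the pairing matters. It assumes the setup as I understand it from the book: offensive marginal runs are runs scored minus the offensive threshold times the league average, defensive marginal runs are the defensive threshold times the league average minus runs allowed, and the implied winning percentage is total marginal runs divided by twice the league average. The function names and the 750-run team are made up.

```python
def marginal_runs(rs, ra, lg_avg, off=0.52, dfn=1.52):
    """Total marginal runs for a team.

    Algebraically this equals (rs - ra) + (dfn - off) * lg_avg, so when the
    thresholds differ by one it is run differential plus one league average.
    """
    offense = rs - off * lg_avg
    defense = dfn * lg_avg - ra
    return offense + defense


def implied_win_pct(rs, ra, lg_avg, off=0.52, dfn=1.52):
    """Assumed conversion: marginal runs over twice the league average."""
    return marginal_runs(rs, ra, lg_avg, off, dfn) / (2 * lg_avg)


# A perfectly average team should come out at .500.
for off, dfn in [(0.52, 1.52), (0.61, 1.61), (0.9, 1.9), (0.25, 1.61)]:
    pct = implied_win_pct(750, 750, 750, off, dfn)
    print(f"thresholds {off:.2f}/{dfn:.2f}: average team implies {pct:.3f}")
```

With any pair that differs by one, the average team lands at .500 and everything else is driven by run differential. With .25 and 1.61, the same average team comes out well above .500, which is the sense in which the system stops working.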
Jean, if you want to play with the .52, then you should be in favor of WSAA. They’re addressing the same issue, which is that .52 was a relatively arbitrary number. It worked because it allowed most players to accumulate at least some Win Shares. It effectively acted like a zero base (at least in James’ mind).
Marginal runs are not the same as RCAA. They’re runs created above .52 of average, not above average. Average at least has some meaning. .52 doesn’t, really.
Here’s what WSAA does, in the Bonds/Pujols playing time debate. Bonds and Pujols created the same number of Win Shares (according to my latest iteration of the system). But Bonds did it in less playing time, right? So how should we think of that?
WSAA answers the question. It basically says that, if Bonds’ time missed were filled in by someone who played on an average level, then Bonds and that person added 27.8 Win Shares above average, vs. 24.9 for Pujols and an average person.
Also, remember that WSAA is not the end goal. It’s a step to get to a replacement level (which James called “Pandora’s trap”), which is the most correct way to answer the Bonds/Pujols question.
Go back and read the “Loss Shares” section of the Win Shares book, if you haven’t lately. Even James admits that the system is incomplete without them. Expected Average Win Shares serve as a proxy for Loss Shares. If you can follow the logic in my Loss Shares and Game Shares articles, you’ll see why.
Posted by studes
on 12/18 at 04:29 PM
By the way, there is no “Loss Shares” section of the book. Loss Shares are covered in the same section as Replacement Level (pages 107-109). Really, all I’m trying to do here is carry on James’ work, which he acknowledged needed to be done.
Posted by studes
on 12/18 at 07:37 PM
By the way, Jean, I went back to my home park factor spreadsheets to determine why I got some different numbers from applying the park factor differently and you didn’t.
I’m going to assume it’s because James calls for the total Runs Created to be rounded as a final step, before computing Marginal Runs. I’m also assuming that you didn’t round your XR calculations. This is the reason I sometimes got a difference and you didn’t, I think. It’s a minor technical detail.
Posted by studes
on 12/18 at 07:47 PM