The Baseball Graphs Blog
Saturday, February 26, 2005
Errors and Winning
I received an email from a high school baseball coach today:
I’m a new assistant high school coach and I am trying to relay the importance of not making errors to my infielders. Do you have or know where I can find a graph, chart, or statistic that shows the more errors you make the less likely you are to win?
You know, I tend to get so carried away with advanced baseball statistics that I sometimes forget the basics. And we should always remember the basics, right? So here’s a graph of every team in major league history, illustrating their fielding percentage (the share of total chances handled without an error, or one minus errors divided by total chances) and their winning percentage:
It seems to be pretty simple, right? If you make fewer errors, your fielding percentage goes up. And, as your fielding percentage goes up, you win more games. But take a closer look.
See, the triangles pretty much flatten out once fielding percentage reaches around .860, which implies that if you average fewer than four or five errors a game, errors don’t impact winning. But that has more to do with baseball history than anything. Those fielding percentages below .900 primarily occurred in the 1800’s, when some teams registered as many as ten errors a game. Those guys really played with a patch of leather on their hands instead of what you’d call a glove. Today, major league teams average fewer than one error per game. So let’s draw a different graph that corrects for this data problem.
I created an “error index” for each team, which basically compares each team’s errors per game to the average number of errors per game that year. This way, teams are compared to other teams in similar playing conditions. Here’s a bar graph of the index against the average winning percentage:
As the index goes up, the wins go down. Said differently, if you make more errors than your competition, you’re more likely to lose. Teams that made half as many errors as the competition averaged a winning percentage of .600. Teams that made 50% more errors than the competition averaged under .400.
It really is pretty simple after all.
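For anyone who wants to try this at home, here is a rough sketch of the error index described above. It assumes the index is simply a team's errors per game divided by the league-wide errors per game for the same season; the exact formula isn't spelled out in this post, so treat that as a guess. The `TeamSeason` record, its field names, and the sample numbers are all made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TeamSeason:
    # Hypothetical record layout; the field names are assumptions.
    year: int
    team: str
    games: int
    errors: int


def error_index(seasons):
    """Return {(year, team): index}; 1.0 means league-average errors per game."""
    # Total errors and games per season, used for the league-wide rate.
    league_errors = defaultdict(int)
    league_games = defaultdict(int)
    for s in seasons:
        league_errors[s.year] += s.errors
        league_games[s.year] += s.games

    index = {}
    for s in seasons:
        league_rate = league_errors[s.year] / league_games[s.year]
        team_rate = s.errors / s.games
        # Compare each team only to teams from the same season.
        index[(s.year, s.team)] = team_rate / league_rate
    return index


# Invented sample data, only to show the shape of the calculation.
sample = [
    TeamSeason(2004, "A", 162, 81),
    TeamSeason(2004, "B", 162, 162),
    TeamSeason(2004, "C", 162, 243),
]
result = error_index(sample)
for key, value in sorted(result.items()):
    print(key, round(value, 2))
```

An index below 1.0 means a team made fewer errors per game than its competition that year, and an index above 1.0 means it made more. Grouping teams by index and averaging their winning percentages would then give the bars in the graph.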
Friday, February 25, 2005
Baseball is a game steeped in history. That’s part of the fun of the game. That’s why I created my historical baseball graphs—to help people better understand and appreciate what’s happened in baseball’s past.
History has been presented on timelines for several hundred years now. One of the better-known early timelines was this one, created by Joseph Priestley in 1765:
You can click here for a larger version of the timeline. Hopefully, you can see the power of a timeline from this early example, with which you can see how the lives of famous people (at least, famous people in the 1700’s) overlapped with each other.
Now, let’s talk baseball. I often get confused about the early years of baseball, and which present-day teams relate to teams from the past. For instance, the New York Yankees began their existence in Baltimore—and they weren’t called the Yankees until the early 1910’s.
Thankfully, Alex Reisner developed a baseball timeline that handles this for you. Here’s a small version of the National League:
Can’t really read that, right? Sorry. I suggest you get the American League and National League timelines directly from Alex’s site in this PDF file.
Thursday, February 24, 2005
Replacing the Starting Pitchers
A couple of months ago, I posted some work regarding Win Shares replacement level. I won’t go into all the details of why I’m semi-obsessed with this, but it’s mostly due to a sick personality disorder. Ask my therapist.
Thanks to some excellent feedback (mostly pointing out my bad math), I updated the study for relief pitchers and position players, but never posted my final conclusions regarding starting pitchers. And I use the word “final” loosely.
I delayed the starting pitcher results because the methodology is more complex. Originally, I had found that starting pitchers have an extremely low replacement level of 40%. But Tango pointed out that starting pitchers are much more likely to be replaced during the year, compared to position players. And this created problems with the results.
So I went back to the data and placed pitchers in the first group (the “starting” starting pitchers) based on how many games they had started as of the end of May. These guys represent the true top level of starters as of the beginning of the year, and all other starters represent the starting pitcher “bench”. I included a pitcher in the “bench” group if he started at least two games, or one game if that was the only game he pitched.
And this change had the expected result, as the replacement level rose to 57% in the American League and 59% in the National League—about ten points below regular players and relievers.
So my final conclusion is that a baseline Win Shares level is 60% for starting pitchers, and 70% for everyone else. I plan to use these two figures in the 2005 Hardball Times Win Shares. As a reminder, these figures are multiplied by each player’s expected Win Shares to obtain a specific baseline for that player.
And I’m going to call it Win Shares Above Bench (or Baseline; either way, WSAB) instead of Win Shares Above Replacement. “Replacement level” has become a controversial term, and “bench level” is a more appropriate description of the analysis. Thanks to Tango for the idea.
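To make the arithmetic concrete, here is a minimal sketch of the baseline calculation. It assumes WSAB is simply a player's actual Win Shares minus his baseline, where the baseline is the percentage above (60% for starting pitchers, 70% for everyone else) times his expected Win Shares. How expected Win Shares are calculated isn't covered here, so the function just takes them as an input. The function name and the sample numbers are hypothetical.

```python
# Baseline percentages stated in the post.
BASELINE_PCT = {"starter": 0.60, "other": 0.70}


def wsab(win_shares, expected_win_shares, role):
    """Win Shares Above Bench: actual Win Shares minus the player's baseline.

    Assumes WSAB = actual - (baseline percentage * expected Win Shares).
    `role` is "starter" for starting pitchers and "other" for everyone else.
    """
    baseline = BASELINE_PCT[role] * expected_win_shares
    return win_shares - baseline


# Invented numbers for illustration only.
starter_wsab = wsab(win_shares=20, expected_win_shares=10, role="starter")
regular_wsab = wsab(win_shares=20, expected_win_shares=10, role="other")
print(starter_wsab, regular_wsab)
```

With the same actual and expected Win Shares, the starting pitcher comes out ahead of the position player, because his bench level is lower.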
Tuesday, February 22, 2005
Introducing Baseball Analysts
Rich Lederer and Bryan Smith have joined forces to create a new baseball website called Baseball Analysts. Rich and Bryan have been running successful blogs for a couple of years now, and this joint effort is sure to be a real winner. Congratulations, guys.
Sunday, February 20, 2005
Baseball Almanac Graphs
Did you know that the Baseball Almanac site now has graphs? Unfortunately, they use the misleading term “charts” to describe their graphs (they also call their data tables “charts,” so you can’t easily find the graphs through their search engine) and it must be said that their site organization needs a serious upgrade. I have a hard time finding anything there.
Monday, February 14, 2005
Graphics and Santana
I continue to comb the Internet for great examples of the ways data is presented. Here’s one: The Baby Name Wizard.
This blows me away. It’s designed in Java, I believe, and it’s an incredible and imaginative way of presenting the relative frequency of baby names over the years. Think of how powerful something like this might be with baseball stats. It makes me wish I had learned Java somewhere along the way.
As a reminder, you can find baseball graphs, with many of the same interactive features, at the Major League Charts site.
Meanwhile, the Twins today signed Johan Santana to a four-year deal worth $10 million a year, which is simply a good deal for the Twins. I believe there’s an option for a fifth year, as well.
Santana was going to be a free agent in two years, so the Twins get two years of arbitration-eligible Santana at a relatively high price (at least in the fifth year), but free agent Santana at a relatively low price.
The Twins do take on some risk signing a pitcher to a four-year contract. Presumably, that is why Santana signed at such a low salary. Compare his deal to Roy Oswalt’s recent deal with the Astros—Oswalt has three years to go before free agency, and he signed a two-year deal for $8.5 million a year.
Multi-year contracts during arbitration years probably bear some discussion. About half of all five-year arbitration players had multi-year contracts this past year, so it’s a common strategy. By signing Oswalt to a two-year deal, the Astros don’t give up a lot because arbitration-eligible players have a limited downside (their salary can only decrease 20% in one year, I believe) and they now have a limited upside as well, if Oswalt has another great year.
Whether this sort of deal typically plays out to the team’s or player’s advantage would make for an interesting study.
Sunday, February 06, 2005
Korean Baseball Graphics
The Red Reporter has posted links to a series of Korean baseball cartoons. I don’t know Korean at all, but as a big fan of cartoons and graphics, I just love this stuff. Find your favorite team and see if you can use the cartoons to guide your understanding of a new language.
Thanks to Batgirl, whose fans are busy translating the Twins’ cartoon.
Thursday, February 03, 2005
Mrs. O’Leary’s Curse
I’ve never been one to put much stock in the Billy Goat Curse of the Chicago Cubs. I’ve lived in Chicago for over twenty years, and I never heard of it until the Cubs made the playoffs in 2003. It was just an odd, historical footnote to the Cubs’ past. I personally think the story was pounced on by the media, as a way to compare and contrast the Cubs with the Red Sox, but it has never really held much currency for true Cubs fans. It’s not truly primal in nature. It doesn’t cause a shudder, nor speak to our dark nature. It’s a second-rate curse.
I have a better curse in mind—one that the media totally missed. The Curse of Mrs. O’Leary’s cow.
The National Association of Professional Base Ball Players was the very first professional baseball league, formed after the Cincinnati Red Stockings made it clear that professional baseball would be a big hit with the fans. The first ten teams were the Philadelphia Athletics, the Washington Olympics, the Washington Nationals (TWO teams in DC!), the New York Mutuals, the Cleveland Forest Citys (an odd oxymoron), the Fort Wayne Kekiongas (no clue), the Troy Haymakers, the Rockford Forest Citys (two teams with the same nickname? And what did it mean?), the Boston Red Stockings and the Chicago White Stockings.
Suffice it to say that the Chicago team, called the White Stockings, was actually the original manifestation of today’s Cubs. The granddaddy of our lovable, Sosa-less losers. They would later be known as the Colts and the Orphans (after Cap Anson was fired), and the name Cubs would not be given to them until 1902. But the curse that would cause so much future rue was inflicted that very first year.
The NA teams played many other teams, not just those in the NA, but it was only the NA games that counted in the standings. The goal was for each team to play every other team in the league five times, but that didn’t work out. Still, they played the season as much as possible, and each team was supposed to finish its schedule by November 1st (remember that the next time you think the current postseason goes on for too long), at which time the first professional league baseball champion would be crowned.
A close race developed between Philadelphia, Boston and Chicago. In fact, the Chicago team was in the lead on October 8, 1871, the day the city caught fire. I won’t go into all the details of the Chicago fire, but it almost literally wiped the city out. The fire spread so fast that people couldn’t outrun it. Some intrepid folks jumped into open graves in Lincoln Park in order to let the fire pass over them. And the White Stockings’ ballpark, uniforms and equipment did not survive. They may have been buried with the other ashes that formed the landfill now known as Streeterville, by the Magnificent Mile.
The team had to play their final games on the road. Without a home, certainly demoralized by the devastation at home, they lost every game and the pennant to the Athletics.
It was two years before the White Stockings were able to play in the National Association again. Their final years in the NA were lackluster, though they did go on to be perhaps the key founding club of the National League in 1876, and they had some great teams over the next two or three decades.
But the die was cast. The scourge had been revealed. Temporary successes couldn’t suppress the formidable power of the Chicago Fire. It’s time to acknowledge the origin of the Cubs’ true curse.