Saturday, August 23, 2014

Talent in baseball is not normally distributed. It is a pyramid. For every player who is 10 percent above the average player, there are probably twenty players who are 10 percent below average. - Bill James.

Note: RPG is Runs Per Game, and is based on my formula discussed in a previous blog post.

I love the work of Bill James, but I believe that this is one of the dumbest things that he has ever said. I believe that he is correct about the distribution of talent among players in the major leagues, and yet I believe that baseball talent is distributed normally FOR HUMANS ON EARTH. In fact, if you pay players millions of dollars to play a game that is considered fun, so that almost any person on earth who could play the game would want to, what you will end up with is a group of people who are at or beyond +4.25 standard deviations on the normal curve. If this were true, what would the talent distribution in major league baseball (or any other sport where the players are paid millions of dollars, for that matter) look like? It would look like a pyramid. For every player who was 10 percent above average, there would be 20 players who were 10 percent below average. The fact that this is true does not dispute that baseball talent is normally distributed (among humans on earth); it helps to prove it.
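
To make the shape of that argument concrete, here is a minimal sketch in Python. It assumes talent is a standard normal variable measured in standard deviations, takes the +4.25 cutoff from the paragraph above, and splits everyone beyond the cutoff into bands 0.25 standard deviations wide. The band width and the helper names (`upper_tail`, `share_in_band`) are my own choices for illustration, and standard deviations of talent are not the same thing as "percent above average", so treat it as a picture of the idea rather than a measurement.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def share_in_band(lo, hi, cutoff):
    """Fraction of the population above `cutoff` whose z-score lies in [lo, hi)."""
    return (upper_tail(lo) - upper_tail(hi)) / upper_tail(cutoff)

CUTOFF = 4.25  # talent threshold taken from the post
WIDTH = 0.25   # band width in standard deviations (arbitrary, for illustration)

# Four consecutive bands starting at the cutoff.
bands = [(CUTOFF + i * WIDTH, CUTOFF + (i + 1) * WIDTH) for i in range(4)]
shares = [share_in_band(lo, hi, CUTOFF) for lo, hi in bands]

# Print a sideways bar chart: the widest bar is the band just above the cutoff.
for (lo, hi), share in zip(bands, shares):
    print(f"{lo:.2f} to {hi:.2f} SD: {share:6.1%} " + "#" * round(share * 50))
```

Each band holds a smaller share than the one before it, and the band just above the cutoff holds more than half of everyone who qualifies. That is the pyramid: a normal curve with everything below the cutoff sliced away.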

Here is the distribution of batting RPG using my formula. It covers batters from 1981-2013 with more than 250 at bats who are not pitchers.

Here is the same distribution with a trend-line-like curve running from 4.5 standard deviations to 5.29 standard deviations, starting at the highest point on this chart.

There is not the hard cutoff one might expect and hope for, but players play hurt, managers and GMs are not perfect judges of talent, and players don’t always play up to their talent level because of chance. Also, batting is the only thing these players are being rated on here. Some of the players below the .5 RPG mark are Mark Belanger and Ozzie Smith, shortstops who were valued mainly for their defense. The peak of the distribution is at .5 RPG, the bin of players with .5 and .51 runs per game. There are 271 player seasons at that level, 5099 above it and 3155 below it. It looks a little “normal-y,” but I stand by Bill James’s analysis and my own.
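
As a quick check on how lopsided those counts are, the short Python snippet below turns the three numbers from the paragraph above into shares of all player seasons. The variable names are mine; the counts are the ones quoted in the post.

```python
# Player-season counts quoted in the post (batters, 1981-2013, 250+ at bats).
at_peak, above, below = 271, 5099, 3155

total = at_peak + above + below
shares = {
    "at peak": at_peak / total,
    "above peak": above / total,
    "below peak": below / total,
}

print(f"total player seasons: {total}")
for label, share in shares.items():
    print(f"{label:>10}: {share:.1%}")
```

A symmetric bell curve would put about the same number of seasons on each side of the peak. Here the side above the peak is clearly heavier than the side below it.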

If you want to see something that looks a bit more like a normal distribution, look at RPG for pitchers all time:

And with a trend line:

I mention all of this because it seems clear that pitcher RPG is normally distributed, and the pitcher RPG distribution is a good way to determine, by era, the quality of play. If the RPG of the pitchers is higher, it means that the rest of the league is not as good. Pitcher RPG went up during World War II. It has been dropping for most of the history of baseball.

Because of this, after adjusting for ballpark effects and for the number of runs scored during that year, I make one more adjustment: I set pitcher RPG to zero. This is what is charted above. I feel that it makes it possible to compare 2013 to 1923 to almost any year on record. (I am a little leery of comparing 1800s baseball to current times. I don’t have a hard cutoff date.) The effects of doing this seem to pass “the sniff test” to me. Babe Ruth’s 1923 compares very favorably to Barry Bonds’s 2002, or any other season in the history of baseball, but the average player in 1923 is not as good as the average player in 2013. Perhaps in 1923 the cutoff for being in the Big Leagues was +4.00 standard deviations above average instead of +4.5 today.
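
For readers who want to see what "setting pitcher RPG to zero" could mean in practice, here is a rough Python sketch of one possible reading: within a single season, subtract the pitchers' average RPG from every player's RPG, so that pitchers become the zero point. The post does not spell out the exact calculation, so the function name `rebase_to_pitchers`, the use of a simple mean, and all of the numbers are assumptions made up for illustration. They are not real statistics, and the ballpark and run-environment adjustments are assumed to have been applied already.

```python
from statistics import mean

def rebase_to_pitchers(player_rpg, pitcher_rpg):
    """Shift every RPG value so the pitchers' mean for the season becomes zero."""
    baseline = mean(pitcher_rpg)
    return {name: rpg - baseline for name, rpg in player_rpg.items()}

# Made-up placeholder values for one season, NOT real statistics.
season_pitchers = [0.10, 0.14, 0.18]
season_batters = {"batter_a": 0.50, "batter_b": 0.64}

adjusted = rebase_to_pitchers(season_batters, season_pitchers)

for name, rpg in adjusted.items():
    print(f"{name}: {rpg:+.2f} RPG relative to pitchers")
```

Under this reading, the same raw RPG is worth less in a season where the pitchers hit better, which matches the idea that good-hitting pitchers are a sign of a weaker league.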
