Writing this blog, I intend to continue explaining exactly how and where I get my stats from, but I also want to get started on one of the more fun things I have in mind, which is to blog every single season of Major League Baseball from 1871 to the present.
The rules in previous years were, of course, not the same as they are today, but the statistical record is complete enough for my purposes. In 1871 players did not use fielding gloves and made many more errors than players are charged with today, but they kept track of who made the errors. The pitchers threw underhand and pitched every game unless they were injured, but the record still shows who was pitching and what the stats were against them. The statistical rules for deciding who the best players were seem to apply to that era almost as well as they do today.
As appears on my website (but not on this blog), all stats are adjusted for quality of play, and my way of judging quality of play is by pitchers' batting average. The assumption is that pitchers, in terms of batting, represent average batters: if you or I (at ages 20-40) were to train every day to hit a baseball and attempt to hit one day in five, and the person reading this has never been paid to hit a baseball, we would perform about the same as major league pitchers. This assumption bears out pretty well. With pitchers we do not see the dotted-i graph that we see when we look at batters' statistics, with a mode high above an otherwise normal-appearing curve, a steeper slope to the left of the dot of the i, and a shallower slope to the right. An example of a dotted-i selection graph appears in the previous post. This chart is also in the previous post.
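If you want to see that shape for yourself, a quick histogram along these lines would show it; the DataFrame batting and its columns (runs_per_game, at_bats, is_pitcher) are placeholders for illustration, not my actual data.

import matplotlib.pyplot as plt

def plot_dotted_i(batting):
    # Histogram of qualified batters' runs per game, with the pitchers'
    # average marked as the "normal" baseline the dot sits above.
    batters = batting[(~batting["is_pitcher"]) & (batting["at_bats"] >= 250)]
    pitcher_avg = batting[batting["is_pitcher"]]["runs_per_game"].mean()
    plt.hist(batters["runs_per_game"], bins=50)
    plt.axvline(pitcher_avg, linestyle="--", label="pitchers' average")
    plt.xlabel("Runs per game")
    plt.ylabel("Player-seasons")
    plt.legend()
    plt.show()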
Another measure of quality of play is the standard deviation of the players in the league. As the quality of play increases, the difference between the average and the elite player decreases, and the standard deviation becomes smaller. Here is pitching RPO, unadjusted (for the numbers I use, pitcher RPO is set to 0), graphed together with the standard deviation. The correlation coefficient of the two series (player RPG for players with 250+ AB, and all pitcher hitting) is .8601.
The blue line is pitchers' Runs Per Game and the orange is the standard deviation for batters with 250+ at bats.
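For concreteness, here is a rough sketch of how that correlation could be computed with pandas. The DataFrame batting and its columns (season, runs_per_game, at_bats, is_pitcher) are placeholders, not the actual layout of my spreadsheets.

import pandas as pd

def season_quality_series(batting: pd.DataFrame) -> pd.DataFrame:
    # For each season, pair the pitchers' average runs per game with the
    # standard deviation of runs per game for batters with 250+ at bats.
    pitcher_rpg = (batting[batting["is_pitcher"]]
                   .groupby("season")["runs_per_game"].mean())
    batter_sd = (batting[(~batting["is_pitcher"]) & (batting["at_bats"] >= 250)]
                 .groupby("season")["runs_per_game"].std())
    return pd.concat({"pitcher_rpg": pitcher_rpg, "batter_sd": batter_sd},
                     axis=1).dropna()

# both = season_quality_series(batting)
# print(both["pitcher_rpg"].corr(both["batter_sd"]))  # I get about .86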
As I said parenthetically, the way I adjust for this is to set pitchers' RPO to 0: for each player in each season, I subtract that season's average pitcher RPO from the player's RPO. The result is that Babe Ruth's 1923 is still one of the top seasons of all time (behind a couple of pre-1900 seasons and Barry Bonds's 2002), and the top seasons are a good mix of old-time and recent seasons, as one would expect if one is picking evenly among top players who are 5+ standard deviations from normal (normal being pitchers, not other batters). If one selects the 2,196 best batting seasons of all time (giving 15.43 batters per season), the selected seasons cluster near the present. There are more people in the United States today, and more people in the world who want to play professional baseball, so this is also what one would expect. There were fewer players in the past and the game was relatively less popular, so there are far fewer top seasons back then. When I select my all-star teams, this is reflected in those numbers: the number of players I select will be a rolling average of the number of top seasons. (Except for the years 1943-45, when the quality of play did go down; for those seasons I will take an average.)
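To make the adjustment and the selection concrete, here is a minimal sketch of what I described above, again with placeholder column names (season, rpo, at_bats, is_pitcher); the 250+ AB qualifier and the five-season rolling window are illustrative assumptions rather than fixed parts of the method, and the special handling of 1943-45 is only noted in a comment.

import pandas as pd

def adjust_and_count_top_seasons(batting: pd.DataFrame,
                                 n_top: int = 2196,
                                 window: int = 5) -> pd.Series:
    # Set the average pitcher to 0 season by season: subtract each
    # season's mean pitcher RPO from every player's RPO.
    baseline = batting[batting["is_pitcher"]].groupby("season")["rpo"].mean()
    adjusted = batting.copy()
    adjusted["adj_rpo"] = adjusted["rpo"] - adjusted["season"].map(baseline)

    # Take the n_top best adjusted batting seasons among non-pitchers.
    qualified = adjusted[(~adjusted["is_pitcher"]) & (adjusted["at_bats"] >= 250)]
    top = qualified.nlargest(n_top, "adj_rpo")

    # Count how many top seasons fall in each year and smooth the counts
    # with a rolling average; those smoothed counts would size each
    # season's all-star team.  (1943-45 would instead get a simple
    # average for those war years.)
    counts = top.groupby("season").size()
    counts = counts.reindex(range(counts.index.min(), counts.index.max() + 1),
                            fill_value=0)
    return counts.rolling(window, center=True, min_periods=1).mean()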
I do not at this time adjust pitchers' records because I have no evidence that pitchers have become better over time. I guess that is a subject for another blog post.