Tuesday, July 25, 2017

Better Information

Over the previous three seasons, 2014-2016, Jose Altuve posted a hit average of .304, the best in the majors for players with at least 200 games started. Hit average is the component used in the daily Beat the Streak posts. It is hits divided by plate appearances. Sometimes better batters have lower hit averages, because a high number of their PAs end in walks. Think of the difference between Ted Williams and Joe DiMaggio. Williams was the better batter in 1941, but DiMaggio was the one with the long hit streak. Williams’s hit average was .305, DiMaggio’s was .310. DiMaggio had a few more hits, but Williams had nearly double the walks. All those walks were good, but they were wasted opportunities for hits. If Williams goes 0 for 1 with three walks, the hit streak is over, even though Williams had a really good game.
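For anyone who wants to check the arithmetic, here is a minimal sketch of the hit average calculation. The function name `hit_average` is my own, and the 1941 hit and plate appearance totals are the commonly listed season figures, typed in by hand as an assumption rather than pulled from a database.

```python
def hit_average(hits: int, plate_appearances: int) -> float:
    """Hits divided by plate appearances (not at-bats)."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return hits / plate_appearances


# 1941 season totals (assumed from standard references, entered by hand).
williams_1941 = hit_average(hits=185, plate_appearances=606)
dimaggio_1941 = hit_average(hits=193, plate_appearances=622)

print(f"Williams {williams_1941:.3f}, DiMaggio {dimaggio_1941:.3f}")

# The game line from the text: 0 for 1 with three walks is four plate
# appearances and zero hits, so it counts fully against hit average.
print(f"0 for 1 with three walks: {hit_average(0, 4):.3f}")
```

Because walks stay in the denominator, a patient hitter can be the better batter and still come out behind on this measure.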

So in general, a higher hit average should mean a higher probability of getting a hit in a game. This spreadsheet shows a graph of the percentage of games with a hit as a function of hit average, and there is a nice linear relationship. The r-squared is .775. A few players stand out, however. For example, Altuve gets a hit in 77.9% of his games in this time period, while Daniel Murphy, with just a .283 hit average, gets hits in 77.6% of his games. Altuve had 177 games with one hit, 190 games with multiple hits. Murphy had 166 games with 1 hit, 142 games with multiple hits. So Altuve bunches his hits, while Murphy spreads them out more.
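One rough way to see the bunching is to compare each player against a simple baseline. The sketch below is not the model behind the Beat the Streak posts. It assumes every plate appearance is an independent trial with success probability equal to the hit average, and it assumes four plate appearances per game. Both assumptions, and both function names, are mine. The game counts are the ones quoted above.

```python
def p_hit_in_game(hit_avg: float, pa_per_game: float = 4.0) -> float:
    """Chance of at least one hit if each PA is an independent trial.

    Assumption: independence and a fixed PA count per game (4 by default).
    """
    return 1.0 - (1.0 - hit_avg) ** pa_per_game


def multi_hit_share(one_hit_games: int, multi_hit_games: int) -> float:
    """Fraction of a player's hit games that were multi-hit games."""
    return multi_hit_games / (one_hit_games + multi_hit_games)


# (hit average, one-hit games, multi-hit games), all taken from the post.
players = {
    "Altuve": (0.304, 177, 190),
    "Murphy": (0.283, 166, 142),
}

for name, (hit_avg, one_hit, multi_hit) in players.items():
    baseline = p_hit_in_game(hit_avg)
    share = multi_hit_share(one_hit, multi_hit)
    print(f"{name}: baseline {baseline:.3f}, multi-hit share {share:.3f}")
```

Under those assumptions the baseline comes out a few points higher for Altuve than for Murphy, and more than half of Altuve's hit games are multi-hit games against a bit under half for Murphy. That is the bunching in numeric form, though the fixed four plate appearances is a crude stand-in for real playing time.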

There are other interesting players out there, like Anthony Rendon and Kris Bryant, who have low hit averages but get hits in a high percentage of games. I’m going to try to build this into the model and see if there is a predictive improvement.



from baseballmusings.com http://ift.tt/2v685er
