Tuesday, April 4, 2017

Beat the Streak Picks

Here are the top picks my programs produced for use in Beat the Streak. An earlier post explains most of the ideas behind the calculations, another shows tests on the Neural Network (NN), and a third discusses an NN that includes the ballpark. I recently updated the models, and the results of those tests are in a separate post.

For 2017, I am just going to publish the Log5 hit averages and the NN probabilities with parks factored in.

First, the Log5 Method picks:

0.278 — Corey Seager batting against Clayton Richard
0.278 — Jose Altuve batting against Hisashi Iwakuma
0.272 — Eduardo Nunez batting against Patrick Corbin
0.272 — DJ LeMahieu batting against Zach Davies
0.264 — Charles Blackmon batting against Zach Davies
0.264 — Justin Turner batting against Clayton Richard
0.262 — Javier Baez batting against Adam Wainwright
0.262 — Nolan Arenado batting against Zach Davies
0.261 — A.J. Pollock batting against Johnny Cueto
0.260 — Adrian Gonzalez batting against Clayton Richard
0.260 — Hunter Pence batting against Patrick Corbin
0.260 — Buster Posey batting against Patrick Corbin

For those who are new and don’t want to follow the links, I am calculating hit average, which is hits per plate appearance (hits/PA). I take the three-year average (2015-2017) and regress it toward the 2017 league mean for batters with fewer than 600 PA:

(Hits + (LGAVG * (600-PA)))/600

So if LGAVG is .230 and someone went 25 for 100 over the last three seasons, their regressed hit average would be (25 + (.230 * 500))/600, or .233. For the current season, I regress anyone with fewer than 200 PA, using 200 in place of 600. So if the same person went 2 for 4 in the opening game, their 2017 regressed hit average would be (2 + (.230 * 196))/200, or .235. I do the same thing for the pitchers. I then figure the Log5 for this season, the Log5 for three years, and average the two to get the number above.
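A minimal sketch of the regression step in Python. The function and parameter names are illustrative assumptions, not taken from the actual programs, and the sketch assumes that anyone at or above the PA threshold is left unregressed.

```python
def regressed_hit_average(hits, pa, league_avg, threshold_pa):
    """Regress a hit average (hits/PA) toward the league average.

    Players below threshold_pa have the missing plate appearances
    filled in at the league rate. Players at or above the threshold
    keep their raw hit average (an assumption of this sketch).
    """
    if pa >= threshold_pa:
        return hits / pa
    return (hits + league_avg * (threshold_pa - pa)) / threshold_pa


# Three-year example from the text: 25 for 100 with LGAVG = .230
three_year = regressed_hit_average(25, 100, 0.230, 600)
print(f"{three_year:.3f}")  # 0.233
```

The current-season version is the same call with 200 as the threshold, for example `regressed_hit_average(2, 4, 0.230, 200)`.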

So early in the season, players with good three-year hit averages will dominate, along with someone like Corey Seager, who has been off the charts in a short time in the majors.
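To make the blending of the two Log5 numbers concrete, here is a hedged sketch. The post does not spell out its Log5 formula, so this assumes the common form that uses a league-average baseline. The names `log5` and `blended_log5` and all of the input values are hypothetical, and a single league average is assumed for both time frames.

```python
def log5(batter_avg, pitcher_avg, league_avg):
    """Log5 matchup estimate with a league-average baseline.

    batter_avg is the batter's hit average, pitcher_avg is the hit
    average the pitcher allows, league_avg is the league hit average.
    """
    hit = batter_avg * pitcher_avg / league_avg
    no_hit = (1 - batter_avg) * (1 - pitcher_avg) / (1 - league_avg)
    return hit / (hit + no_hit)


def blended_log5(season, three_year, league_avg):
    """Average the current-season and three-year Log5 estimates.

    season and three_year are (batter_avg, pitcher_avg) pairs of
    regressed hit averages.
    """
    return (log5(*season, league_avg) + log5(*three_year, league_avg)) / 2


# Hypothetical regressed inputs for one batter/pitcher matchup
estimate = blended_log5(season=(0.235, 0.231),
                        three_year=(0.280, 0.240),
                        league_avg=0.230)
print(f"{estimate:.3f}")
```

After only a game or two, the current-season averages sit very close to the league mean for everyone, so the season Log5 barely separates players and the three-year term drives the ranking.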

Next, the NN with Park picks. These use the same input values as the Log5 program above, but instead of calculating a Log5 hit average, the NN calculates the probability of the batter getting a hit in the game:

0.278, 0.722 — Jose Altuve batting against Hisashi Iwakuma.
0.272, 0.713 — DJ LeMahieu batting against Zach Davies.
0.278, 0.713 — Corey Seager batting against Clayton Richard.
0.272, 0.705 — Eduardo Nunez batting against Patrick Corbin.
0.261, 0.703 — A.J. Pollock batting against Johnny Cueto.
0.264, 0.699 — Charles Blackmon batting against Zach Davies.
0.257, 0.695 — Yunel Escobar batting against Sean Manaea.
0.262, 0.689 — Nolan Arenado batting against Zach Davies.
0.250, 0.688 — Jean Segura batting against Lance McCullers.
0.260, 0.686 — Buster Posey batting against Patrick Corbin.

The first number is the Log5 hit average as calculated above, for a sanity check. The second number is the probability of the batter getting a hit in the game. So the NN concludes that, of the players the program knows about, Jose Altuve is the most likely to get a hit. Altuve dominated the leader boards last season. Note that the best players in this list have about a 25 to 30 percent chance of not getting a hit, which is why putting together a long streak, even with the best information, is a very tough task.
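A rough illustration of how tough that task is: the sketch below multiplies out a streak under the simplifying assumptions that each game is independent and that the pick has the same hit probability every day. Neither assumption comes from the NN, and the function name is hypothetical. The 57-game target is the length needed to win the Beat the Streak contest.

```python
def streak_probability(per_game_prob, games):
    """Chance of getting a hit in every one of `games` consecutive picks,
    assuming independent games with a constant per-game probability."""
    return per_game_prob ** games


# Use the top NN probability from the list above for every game
top_pick = 0.722
for length in (10, 30, 57):
    print(length, f"{streak_probability(top_pick, length):.2e}")
```

Even with a pick as strong as the top one above available every single day, a 57-game streak comes out on the order of one in a hundred million under these assumptions.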

I’ll do one more sanity check. Altuve has faced Iwakuma a number of times. Over their careers, Altuve is 16 for 38 against Iwakuma with one walk and one hit by pitch, a .400 hit average. Since the start of 2015, Altuve is 6 for 11 with a walk against Iwakuma, a .500 hit average. Note that the NN makes no use of the actual batter vs. pitcher stats. Often, they don’t exist, or exist in sample sizes too small to be meaningful. Based upon the hit averages of the two players, the NN says that Altuve should hit this pitcher well. In this case he does.

Posey hits Corbin well, too, but he hits a lot of pitchers better.

A final note. You will not find any Tigers or White Sox players here. I do not track rosters, so I use the last team each batter played for this season to decide which batters are facing the opposing starters. Since those two teams have not played yet, I don’t have data for them.



from baseballmusings.com http://ift.tt/2nzNbw4
