Saturday, August 12, 2017

Hidden Streaks

An article at 538 discusses using a hidden Markov model to predict when pitchers are on hot and cold streaks. I had some experience with hidden Markov models in previous jobs, and I like the idea. It’s also quite likely that what they found is an injury indicator:

Salazar’s situation is not unusual. Many transitions between hot and cold streaks correspond with injuries. Pitchers on our list took 59 trips to the DL in 2016, and in 28 of those cases, they went through a notable frigid period in the two weeks before their injury took them out of the game. The chance of that many pitchers going cold by chance just before a DL stint is very slim, so it’s likely that our method can also detect injuries. In particular, we found evidence that clusters of several slow pitches in a row are associated with a hurt pitcher.
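
To get a feel for the "very slim" claim, here is a rough back-of-the-envelope sketch. The 59 and 28 come from the quote; the base rates are my own hypothetical guesses at how often a pitcher is cold in any given two-week window, and I assume the DL trips are independent. The article does not say how they did their calculation, so this is not their method.

```python
from math import comb

N_DL_TRIPS = 59      # from the quoted passage
N_COLD_BEFORE = 28   # from the quoted passage


def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical base rates: the chance a pitcher is in a cold spell during any
# given two-week window. These are assumptions, not figures from the article.
for base_rate in (0.10, 0.20, 0.30):
    tail = binom_tail(N_DL_TRIPS, N_COLD_BEFORE, base_rate)
    print(f"base rate {base_rate:.2f}: P(at least 28 of 59) = {tail:.2e}")
```

How slim the chance is depends heavily on that base rate, which is why I would like to know how often their model calls a pitcher cold in general.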

Every year I become more convinced that players experiencing a poor season without going on the disabled list are fighting some kind of minor injury or ailment. If something like this can alert teams early, severe injuries might be avoided. That is, teams could rest the pitcher before the stress caused by the minor injury reaches a breaking point.

I do have questions, however:

  1. It’s not clear to me how they trained the models. At some point the data needs to be segmented into examples of hot and cold streaks. This can be done by hand or automatically. Also, did the training and test data overlap? Usually you want to train on one set of data and test on another.
  2. They do a lot of data clean-up. Nothing wrong with that, but I do worry that bias sometimes creeps in, so that data that would hurt the model gets thrown out.
  3. They only model fastballs. That could be enough, but if a slump is caused by losing the mechanics of a slider, say, do those slumps get detected?
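
For anyone who has not worked with these models, here is a minimal sketch of how a two-state hidden Markov model labels pitches as hot or cold once it has parameters. It uses Viterbi decoding over fastball velocities. Every number in it (the means, spreads, and transition probabilities) is a hypothetical value I picked by hand; how 538 actually estimated theirs is exactly what question 1 asks, and this is not their implementation.

```python
import math

# Hypothetical two-state model. All parameters are made up for illustration.
STATES = ("cold", "hot")
START = (0.5, 0.5)                   # initial state probabilities
TRANS = ((0.9, 0.1), (0.1, 0.9))     # TRANS[a][b] = P(next is b | current is a)
MEAN_MPH = (92.0, 94.0)              # assumed mean fastball velocity per state
STD_MPH = (1.0, 1.0)                 # assumed velocity spread per state


def log_gauss(x, mean, std):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi(velocities):
    """Return the most likely hidden state label for each pitch."""
    if not velocities:
        return []
    n = len(STATES)
    # Best log score of any path ending in each state after the first pitch.
    score = [
        math.log(START[s]) + log_gauss(velocities[0], MEAN_MPH[s], STD_MPH[s])
        for s in range(n)
    ]
    back = []  # back[t][s] = best previous state if pitch t+1 is in state s
    for v in velocities[1:]:
        prev, score, pointers = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + math.log(TRANS[r][s]))
            pointers.append(best)
            score.append(
                prev[best] + math.log(TRANS[best][s]) + log_gauss(v, MEAN_MPH[s], STD_MPH[s])
            )
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]


print(viterbi([94.5, 94.2, 93.9, 91.8, 91.5, 92.1]))
```

With these made-up parameters the high self-transition probabilities make the states sticky, so a sustained run of slow pitches flips the label while a single slow pitch in the middle of fast ones does not. Note that velocity is the only input here, which is the limitation question 3 raises.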

I’m interested to see how this works going forward. If they can start predicting injuries before they happen, it would be a huge boon to the sport.

I also want to comment that the “Hot Hand” may be a poor term for this. It may just be healthy/unhealthy as the dividing point, rather than hot/cold.



from baseballmusings.com http://ift.tt/2fAncGE
