Racing is better with more data – a real-world example

Last night the headline “Breeders’ Cup Turf Sprint 2016: Favor Outside Posts” caught my eye: “Oh, a data post!” While it does have some useful information, including a link to a 2012 discussion on PaceAdvantage, it is not a data post. It is, however, a great example of how racing could be better for all of us with more accessible data.

With no disrespect meant to the author, imagine how much more useful that post would be if he could’ve gone to Equibase (or Santa Anita’s site) and performed a search along the lines of “winners at 6.5 furlongs on the turf at Santa Anita from 1991 – 2016”. Assuming that one of the data points returned was post position, it’s just a matter of sorting by post position and counting to determine which posts have produced the most winners (plus a little easy math to assign a win percentage to each post position, if desired). The whole exercise would take less than 10 minutes (a little more with percentages, but not much!). Without that kind of search, and barring a crowdsourced “let’s compile data on the downhill turf course!” party, it is an incredibly arduous task to discover something that could otherwise take less than 10 minutes.
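To give a sense of how little work the counting is, here is a minimal Python sketch. Everything about the data is an assumption for illustration: the record layout and the field names `winner_post` and `field_size` are hypothetical, the sample rows are made up (not real results), and the win percentage assumes every post from 1 through the field size had exactly one starter, which ignores scratches.

```python
from collections import Counter


def post_position_stats(races):
    """Count wins and starts per post position and compute a win percentage.

    `races` is an iterable of dicts with the hypothetical keys
    'winner_post' (post position of the winner) and 'field_size'
    (number of starters in that race).
    """
    wins = Counter()
    starts = Counter()
    for race in races:
        wins[race["winner_post"]] += 1
        # Assumption: posts 1..field_size each had exactly one starter.
        for post in range(1, race["field_size"] + 1):
            starts[post] += 1

    stats = []
    for post in sorted(starts):
        pct = 100.0 * wins[post] / starts[post]
        stats.append((post, wins[post], starts[post], round(pct, 1)))
    return stats


# Made-up rows purely for illustration; not real race results.
sample = [
    {"winner_post": 1, "field_size": 3},
    {"winner_post": 3, "field_size": 3},
    {"winner_post": 3, "field_size": 4},
    {"winner_post": 4, "field_size": 4},
]

for post, won, started, pct in post_position_stats(sample):
    print(f"post {post}: {won} wins from {started} starts ({pct}%)")
```

Dividing wins by starts, rather than by total races, matters because the outside posts are only occupied in larger fields; a raw count of winners alone would understate them.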

I can’t say this enough: Hats off to Keeneland for making their racing data freely available in an easy-to-use format. Because of it, I was able to put together a similar post about how post positions perform on the turf course at various distances, but with counts and win percentages for each post.

The BC post and PA thread also discuss salient points, such as how horses in the outside posts are also aided by early speed, and the horse-for-the-course angle. Those angles, too, could be investigated further with easy access to usable data!

Also, if you missed last week’s lively discussion of data… (including links and tweetstorms!).