Release the Data: Chorus Edition
You may or may not have seen the Release the Data post by Dan at @thorotrends, an open letter to Equibase imploring them to essentially offer historical data for free. Paulick jumped aboard the #releasethedata train, usurping Thorotrends’ traffic by posting something about it at Paulick Report.
The original post, while welcome, well-reasoned and appropriately addressed to Equibase, if a little long, failed to mention that the topic, making racing data programmatically available, is one that’s been discussed far and wide since at least 2005. I don’t necessarily agree with everything in his post. For example, I think Equibase can and should make SOME of their racing data available for free to anyone via APIs, and make other data more accessible by making it affordable and available via APIs. Your opinion may vary.
In the past when these points have been raised, they’ve been met with “but the historical data is freely available as PDF charts” or “you can just use Stats Central.” Yes, that’s true, and that’s sort of the point. Stat-heads like @thorotrends can’t do any fancy number crunching with PDFs or Stats Central, and groovy programmers like @robinhowlett can’t make apps with PDFs or Stats Central.
Quibbling on specific points aside, with the help of Jessica Chapel I’ve collected all the previous instances I could find on this topic to help strengthen the point and add to the chorus, so to speak. If some brave soul wanted to wade through the various forum sites and compile discussions I’d be happy to add them.
So, without further blather, here’s everything I could find on the topic of “releasing the data.”
Update: Also read this post by Jessica at Railbird
Did you post something about freeing racing data or have a link to add? Send it to me at @exactamundo!
Blog Posts (be sure to read the comment threads where available!):
Jessica Chapel in 2005:
“Racing must recognize soon the power of the medium and figure out how to use it to the sport’s advantage. I’m talking about making more information easily available online (look at all the stats, summaries, and player biographies baseball provides on MLB.com), making it easier for new fans and the curious to find a way into playing the horses (this means going beyond just past performance chart tutorials and freeing the quantities of historic data hidden behind paywalls), and embracing blogs and RSS.”
Jessica Chapel in 2007:
“What the industry also hasn’t figured out is the benefit of making charts and statistics widely, easily, and freely available. Baseball has been pushed along by the growth of fantasy games, as well as the game’s stats nerds, and the obsessive and intelligent analysis of baseball data in both pursuits can be credited with — I think it’s fair to argue — keeping the sport in the mainstream even during dark days such as the 1994 strike and the ongoing steroids scandal. It doesn’t seem unreasonable to conclude that racing could be similarly buoyed if it removed the hassle of finding any information at all about races run more than a week ago.”
Jessica Chapel in 2008:
“Free data and historical stats, that’s the way to build the fan base.”
Jessica Chapel in 2009:
“If you look back to 1990 and see what information was available and how it was made available, we’ve accomplished a lot,” Equibase president Hank Zeitlen tells Paulick, and that might be true — but it’s not enough.
Ray Paulick in 2010:
“Can you imagine how dead Fantasy Football would be in college dormitories if the NFL protected its statistical information the way Equibase does?”
Matt Gardner in 2012:
“Even if you don’t believe that the sabermetric revolution in baseball is everything it’s made out to be, I think most would admit that it’s provided a fascinating way to look at the game that never existed before. Think of how many very smart people have used statistical analysis to change how fans look at the game. Could the same thing happen to horse racing if it were easier for its most die-hard fans to analyze the sport further?”
Tweets:
@o_crunk @r2collective re:Bloomberg Baseball tool – I started building something similar for racing. Biggest problem? Equibase.
— thorobase (@thorobase) March 10, 2010
Just got a quote from Brisnet to buy all their Comprehensive Charts files; $15,000. That's 50+% off by the way.
— thorobase (@thorobase) March 11, 2010
@o_crunk @EJXD2 @irish_1 count me as in favor of a jockey club / equibase API for chart info. apps are fine but data is better.
— dana byerly (@superterrific) June 1, 2011
API, API! RT @EJXD2: My tenure with @TTimes has concluded. Now, who wants to talk horse racing data?!!
— dana byerly (@superterrific) July 13, 2011
Friendly reminder: racing should continue being jealous of baseball in the data dept http://t.co/gjf1bRR
— dana byerly (@superterrific) September 8, 2011
@tbchat APIs allow developers to easily access data, that means they can build things with the data 1/2
— dana byerly (@superterrific) December 7, 2011
@tbchat the easier it is for developers to build things with data, the more things get built with the data. in this case more apps 2/2
— dana byerly (@superterrific) December 7, 2011
@NJDerek yep, when data is easily accessible from a programmatic POV, cool things can be done w/it. and not only app, sites too.
— dana byerly (@superterrific) December 7, 2011
@NJDerek data owners still also retain full control of usage, so API doesn't equal "giving it away for free", up to data owner.
— dana byerly (@superterrific) December 7, 2011
@pressthepace condensed comment preview: “nice points” & “wish the industry would free data available via API for ppl to build on”
— dana byerly (@superterrific) January 14, 2012
That last link is remarkable. Bloomberg's API is intended to "spur innovation" and allowing others "an alternative" to proprietary tech.
— o_crunk (@o_crunk) February 1, 2012
This is where the world is going. In racing we have a shit down of data locked behind walls of no innovation and no plan. Dead world.
— o_crunk (@o_crunk) February 1, 2012
@BklynBckstretch @o_crunk making them available programmatically versus pdf is a start
— dana byerly (@superterrific) February 1, 2012
@BklynBckstretch I'll add that I'm a happy user of historical PDF charts but IMO industry should be looking at APIs more than apps @o_crunk
— dana byerly (@superterrific) February 1, 2012
@BklynBckstretch because APIs would explode app growth @o_crunk
— dana byerly (@superterrific) February 1, 2012
Proof that API does not equal "giving away data for free". Personal use fees for Befair, commercial also available http://t.co/UP9KqMQ4
— dana byerly (@superterrific) April 16, 2012
@pressthepace @jnchapel it's funny that we've been talking about dusting it off… & that the imaginary racing API would be great for it!
— dana byerly (@superterrific) September 10, 2012
Trakus could be so much more – efficent data distribution, an open API for developers, etc and this is what they come up with?
— o_crunk (@o_crunk) September 13, 2012
Trakus could be the group that leads industry out of the .pdf past performance dark ages. But here's jockey efficiency ratings, have fun!
— o_crunk (@o_crunk) September 13, 2012
@raypaulick Attempted to procure national apprentice rider standings a few weeks ago. Told it would be $500 to get top 5 list.
— Marcus Hersh (@DRFHersh) October 8, 2013
@heylaserbeam @DRFHersh @raypaulick This type of thing, data hoarding & gouging for simple data,was what drove me away from racing for yrs
— David (Neighborhood Guy) (@bellringerwins) October 8, 2013