
A couple of weeks ago, my darling 11-year-old son, who is an avid cyclist too and a member of a local club, was leaving for scouting camp. Now scouting camp means: no TV, no internet, no mobile, no tablet, no... nothing... just enjoy life with your scouting buddies and have fun. And while my 11-year-old was really looking forward to that, he was also in a bit of a panic, as it meant that he would miss the first 10 stages of the Tour. He made me promise that I would keep track of the results for him, which of course I dutifully took to heart. But not without doing something graphy with it :)) ... So here are a couple of things that I looked at and experienced.
Getting the Tour de France data
My go-to source of Tour de France data is a local TV station's website: Sporza Tour. They have a very handy results section that has a bunch of data, and as it so happens, it's pretty easy to export that data into a Google Spreadsheet, which I have published over here. There are three tabs in the sheet:
- a "riders" tab which includes basic information about the 198 riders and their 22 teams.
- a "sporza" tab, which actually imports data from the Sporza website on a daily basis, using the illustrious "ImportHTML" function. It's a bit of a hack, as I basically had to reverse engineer the URLs of the results pages on the Sporza website, and then use the function to get the data automatically. Works though.
It's a pretty great summary of the current situation in the Tour. Now all I need to do is get that data into Neo4j to see what fun we can have with it in a graph. Of course, the good news is that Google Spreadsheets let you export a sheet as a CSV file really easily, and Neo4j can load data from CSV - so that should be easy enough.
That's, of course, what I will be doing in the next blogpost.
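In the meantime, here is a minimal Python sketch of the CSV route, just to sanity-check the data before it goes anywhere near a graph. Quite a bit of it is assumption on my part: the helper names (`sheet_csv_url`, `parse_rows`, `count_by`) are made up for illustration, the `name` and `team` column headers are hypothetical, the sample rows are placeholders rather than real riders, and the export URL pattern assumes the sheet is shared publicly.

```python
import csv
import io
import urllib.request


def sheet_csv_url(sheet_id, gid=0):
    """Build a CSV export URL for one tab of a Google Sheet.

    Assumption: the sheet is publicly shared. sheet_id and gid are
    placeholders that you would copy from your own sheet's URL.
    """
    return (
        "https://docs.google.com/spreadsheets/d/"
        f"{sheet_id}/export?format=csv&gid={gid}"
    )


def fetch_csv(url):
    """Download the CSV text (needs network access; not called below)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_rows(csv_text):
    """Turn CSV text with a header row into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def count_by(rows, column):
    """Count how many rows share each value of the given column."""
    counts = {}
    for row in rows:
        counts[row[column]] = counts.get(row[column], 0) + 1
    return counts


# Made-up placeholder rows; the real "riders" tab has its own columns.
SAMPLE_CSV = "name,team\nRider A,Team X\nRider B,Team X\nRider C,Team Y\n"

if __name__ == "__main__":
    rows = parse_rows(SAMPLE_CSV)
    print(count_by(rows, "team"))
```

To try it against a real sheet, you would replace `SAMPLE_CSV` with `fetch_csv(sheet_csv_url(...))` using your own sheet id, and adjust the column name passed to `count_by` to whatever header the tab actually uses.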
Cheers
Rik