Feeel the Heeeat

Sep 16, 2014 13:44


Pairing deep natural curiosity with the desire for a new programming challenge often produces great results. Such was the case with my recent addition of a Cycling Heatmap page to the biking section of my website.

I’ve always been a map geek, going all the way back to the neighborhood street maps I drew when I first started grammar school. I got my first handheld GPS back in 2000, well before such things were common, and before the government stopped intentionally inserting a random offset in order to make civilian GPS unnecessarily inaccurate.

Naturally, that interest also manifested in my career as a web consultant, where I used the Google Maps API for one client to display the locations of warehouses, delivery trucks, and customers, so that they could optimize their delivery routes.

As a cyclist, when I first saw the Strava Global Cycling Heatmap, I was pretty excited. For the first time, and for any location on the planet, we can see what roads cyclists (as a class) actually prefer to use!

Great data, but it also spurred my curiosity about my own road use. With years and years of cycling GPS files on hand, I have plenty of data; all I had to do was figure out the technical details of extracting it, summarizing it, and displaying it.


That’s where it got hairy, firstly because my cycling GPS no longer outputs user-friendly, text-based GPX files, but compressed binary files in Garmin’s proprietary FIT file format. It took a while, but I was able to hack Kiyokazu Suto’s Garmin::FIT module and fitdump Perl script to get the latitude-longitude points I needed out of Garmin’s obfuscated FIT files.
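One detail worth knowing if you try this yourself: the FIT format stores latitude and longitude as signed 32-bit integers in units called "semicircles" rather than as decimal degrees. Here is a minimal Python sketch of the conversion. The function name is my own invention for illustration, and your FIT decoder may already do this step for you.

```python
# FIT stores positions as signed 32-bit "semicircles":
# 2**31 semicircles correspond to 180 degrees.
SEMICIRCLES_PER_180_DEGREES = 2 ** 31


def semicircles_to_degrees(semicircles: int) -> float:
    """Convert a FIT position value to decimal degrees.

    Hypothetical helper name; not part of Garmin::FIT or any library.
    """
    return semicircles * 180.0 / SEMICIRCLES_PER_180_DEGREES


if __name__ == "__main__":
    # A quarter of the full range is 90 degrees (the North Pole).
    print(semicircles_to_degrees(2 ** 30))
```

Negative values come out as southern latitudes or western longitudes, so no separate hemisphere handling is needed.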

Then I had to find software to generate a heatmap. That seemed easy enough, but it took several tries to find one that could handle anything more than a minuscule quantity of data. I gave up on the Google Maps API Heatmap Layer, because it is limited to a pathetic 1,000 data points. I looked at Leaflet, Highcharts, Mapbox… Nothing looked promising.

Finally I went back to Google and discovered that if you pair the Google Maps API with a Google Fusion Table in just the right way, it will accept up to 100,000 data points, which is closer to what I needed. So I gave that a shot, even though the documentation for the known-inadequate Google Maps API heatmap layer was incestuously interwoven with the possibly-useful Google Maps API Fusion Table heatmap documentation, which caused a lot of unnecessary confusion.

Unfortunately, 100,000 points was only about one month of cycling data, so I had to write a script to further summarize my data before feeding it into a Fusion Table. Basically, I rounded my lat-long values from seven decimal places to just three and threw out consecutive duplicates, which reduced the dataset quite a bit.

It also had a side-effect of cleaning the data up. Since my GPS logs location once per second, points were more densely-packed when I was moving slowly, and more sparse when I was moving quickly. Plot that on a heatmap, and it would look like I rode more often in places where I went slowly! But rounding the lat-long values abstracted all those low-speed duplicate points down to one, which fortuitously made tracks display more evenly no matter what speed I rode at.
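The rounding and de-duplication described above can be sketched in a few lines of Python. This is an illustration rather than the script I actually ran; the function name and the sample coordinates are made up for the example.

```python
def summarize(points, places=3):
    """Round each (lat, lon) pair and drop consecutive duplicates.

    Hypothetical helper: `points` is any iterable of (lat, lon) floats
    in decimal degrees, in the order they were recorded.
    """
    summarized = []
    previous = None
    for lat, lon in points:
        rounded = (round(lat, places), round(lon, places))
        # Only consecutive repeats are dropped; returning to the same
        # spot later in the ride still counts as a new point.
        if rounded != previous:
            summarized.append(rounded)
            previous = rounded
    return summarized


if __name__ == "__main__":
    # Made-up sample: two slow-moving fixes, one move, then a return.
    track = [
        (42.3501234, -71.0801234),
        (42.3501299, -71.0801301),
        (42.3512345, -71.0812345),
        (42.3501234, -71.0801234),
    ]
    print(summarize(track))
```

In the sample, the first two fixes round to the same value and collapse into one point, which is exactly the effect that evens out slow and fast stretches of a ride.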

Finally I had to load my data into a Fusion Table. In the end, by rounding my data I was able to get 14½ months of rides into 98,870 points, representing all 122 rides I took from July 1 2013 through September 15 2014.

How pleased am I with the resulting map display?

Well, it satisfied my curiosity about where I ride, and it was also a fun way to brush up my cartographic programming chops. I don’t know if any other cyclists will care about or benefit from looking at my usual routes, but it would be neat if they did.

Overall, I’m happy with the result, but it certainly has some externally-imposed shortcomings, all ultimately traceable back to the fact that I had to squish all my data down into 100,000 points.

Because I had to round off my lat-long values, the tracks I only rode once can be seen when the map is zoomed out, but the points nearly all disappear when you zoom in!

If I had more control over the heatmap’s appearance and color-groupings, I could probably fix that, but because those heatmaps are generated on Google’s server rather than in the browser, Google provides virtually no options for customizing their appearance.

The rounding also becomes painfully obvious when you zoom in, as what appear to be linear tracks ultimately separate into evenly-spaced individual dots, just like looking at a halftone print under a magnifying glass. At high resolution, it becomes so ugly that I had to programmatically restrict the user’s ability to zoom in!
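To put a rough number on that dot spacing, here is a back-of-the-envelope Python sketch. It assumes about 111.32 km per degree of latitude and uses 42.36° as an approximate latitude for Boston; both figures are my approximations, and the function name is just for illustration.

```python
import math

# Approximate length of one degree of latitude, in meters (assumption).
METERS_PER_DEGREE_LAT = 111_320


def grid_spacing_m(lat_deg, places=3):
    """Return the (north-south, east-west) grid spacing in meters
    produced by rounding coordinates to `places` decimal places."""
    step_deg = 10 ** -places
    north_south = step_deg * METERS_PER_DEGREE_LAT
    # Lines of longitude converge toward the poles, so the east-west
    # spacing shrinks by the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


if __name__ == "__main__":
    ns, ew = grid_spacing_m(42.36)  # approximate latitude of Boston
    print(f"north-south: {ns:.0f} m, east-west: {ew:.0f} m")
```

Under those assumptions, three decimal places works out to dots roughly 111 m apart north-south and a bit over 80 m apart east-west, which is why the tracks break up into a halftone pattern once you zoom in to street level.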

Ugh! The tradeoffs and limitations give me the shivers. But at certain zoom levels, the result is pretty usable.

Did I learn anything new about my riding? Not that much, since I’m already pretty familiar with the roads I use.

I already knew that I spent a lot of time in Back Bay, on Mass Ave out to Lexington, doing the Quad loop around Concord and Carlisle, and also heading out Charles River Road to Watertown, or Beacon Street to Weston, and Glezen from there out to Sudbury.

I was pleased to see the presence of some new roads that I’ve added this year: the whole Dover loop, the Mystic Lakes route up to Winchester, and both Trapelo and Concord Ave through Waltham and Lincoln.

Of course, I’m equally amused by some routes that I haven’t done this year, but which appear thanks to the older 2013 data. That would include my former commute down to Quincy, which included climbing Dorchester Heights; hill repeats on Summit Ave in Brookline; and Virginia and Mill Street in Concord, a part of the standard Quad loop that I now usually skip.

I guess the only big, new revelation is that although I live within half a mile of the ocean, I never ride along the north or south shores! To find good seaside riding, I either have to go thirty miles north to Cape Ann, or fifty miles south to Cape Cod!

In addition to last year’s ride data, I would love to incorporate my GPS logs that go back another five years; that would change this map quite a bit. However, the rounding required to jam all that data down to 100k points would be so extreme that the map dots would no longer correspond to individual streets, so the display would wind up being completely worthless.

But for this exercise, I’m pretty happy that I was able to overcome the technical hurdles and produce the reasonably good result you see here, based on a good-sized clump of recent data. It’s a victory and an accomplishment in and of itself!

garmin, gps, maps, boston, strava
