It's amazing how far sixteen lines of c++ can get you.

Posted by robert on 24 September 2013 in English

So I was rubbing shoulders with the traditi-gis crowd on Tuesday & Wednesday last week for the FOSS4G GeoHack event at Nottingham University. The team that I gravitated towards was being run by DEFRA folks eagerly clutching copies of their Open Government Licensed air quality data and setting the task of creating an air-quality-aware router for cyclists, walkers or other health-conscious individuals.

The data is fairly interesting as it’s a mix of measured and modelled data, and the highest-resolution data provided was modelled specifically along a patchy network of major (OS-Meridian-derived from what I could tell) roads - so not necessarily the easiest thing to work with.

The plan from the table involved using postgis & pgRouting to route along the provided linestrings & build a web interface on openlayers for users. Now, OS Meridian isn’t known for being particularly routable at the best of times, and the patchy nature of the provided linestrings was making this look like a daunting problem. On top of that I personally was worried about the ambition of building from scratch a from-to routing interface on top of openlayers in the remaining time (I’ve spent two-day stints just working on single fiddly javascript bugs on web maps before). The main stumbling block seemed to be that everyone was running Windows on their laptops and so needed to negotiate use of a “server” with the requisite software installed for their purposes.

Now, see, I’m an openstreetmap dude. And I was a bit puzzled as to why we weren’t just using all the tools we already have. We’re quite spoiled in the openstreetmap community, having a largely complete & high quality, inherently routable road network in many countries and a decent array of web-fronted routing tools, even counting just the open source ones. And just because we were given the data based on OS Meridian, doesn’t mean we’re stuck with OS Meridian for the routing as long as we’re able to find a reasonable way to apply the measured data to the new road network. Yet people were still intent on re-doing all the legwork in patching together a melange of tools.

I’ve become quite familiar recently with Dennis Luxen’s OSRM (Open Source Routing Machine) through my work on GPS trace analysis in That Shouldn’t Be Possible™. The great thing about OSRM of course is that it comes with a pretty impressive web frontend that’s stupidly quick to set up and truly shows off the speed & interactiveness you can get with OSRM. OSRM’s also very customizable - the “profile” determining the weights given to each road is defined by a lua script. This gives a low barrier to entry to users wanting to create even fairly complex rules.

It just so happened that the hack I wanted to perform wasn’t quite something that the lua scripting would allow. I wanted to add an extra weight to some streets by edge, and the OSRM lua calls are really built around assigning weights to whole ways. But this is Free software (“open source” if you like). We don’t tend to let things like that stop us. So I hacked a call to a postgres database (using the wonderfully straightforward libpqxx) into the inner loop of OSRM’s extractor pre-processing tool, modulating the weight of each edge by the result of this call. Really, it was such an easy hack that the hardest bit was remembering how painful string manipulation is in c++. And almost as much effort went into getting the build system to cope with the new libpqxx dependency, but hey, that’s c++ for you.

As it was, I was left with a few options over how to apply the air quality data to the roads. In the end, I decided to use postgis to locate the nearest linestring in the air quality dataset for each edge and assign that air quality to it. If there’s no air quality linestring within ~100m of the target edge, we assume a reasonably high air quality. There are many other ways we could have done it involving various interpolation schemes, but this we decided was good enough for a proof of concept.
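For anyone who wants to see the shape of that rule without setting up postgis, here is a minimal sketch in plain C++. It is not the code from the hack (that went through a database query via libpqxx); it is an in-memory stand-in under stated assumptions: coordinates are already projected to metres, each edge is represented by its midpoint, pollution is a made-up index from 0 (clean) to 1 (worst), and the names `pollutionNear` and `modulatedWeight`, the default value 0.1 and the penalty factor 2.0 are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Assumption: planar coordinates in metres (e.g. already projected).
struct Point { double x, y; };

// Hypothetical stand-in for one air quality linestring.
struct AirQualityLine {
    std::vector<Point> points;  // linestring vertices
    double pollution;           // made-up index: 0 = clean, 1 = worst
};

// Distance from point p to the segment a-b.
double distanceToSegment(Point p, Point a, Point b) {
    const double dx = b.x - a.x, dy = b.y - a.y;
    const double len2 = dx * dx + dy * dy;
    double t = 0.0;
    if (len2 > 0.0) {
        // Project p onto the segment and clamp to its end points.
        t = ((p.x - a.x) * dx + (p.y - a.y) * dy) / len2;
        t = std::max(0.0, std::min(1.0, t));
    }
    return std::hypot(p.x - (a.x + t * dx), p.y - (a.y + t * dy));
}

// Distance from point p to the closest part of a linestring.
double distanceToLine(Point p, const AirQualityLine& line) {
    double best = std::numeric_limits<double>::infinity();
    if (line.points.size() == 1) {
        best = std::hypot(p.x - line.points[0].x, p.y - line.points[0].y);
    }
    for (std::size_t i = 0; i + 1 < line.points.size(); ++i) {
        best = std::min(best, distanceToSegment(p, line.points[i], line.points[i + 1]));
    }
    return best;
}

// Pollution of the nearest line within maxDistance metres, or a default
// meaning "reasonably high air quality" when nothing is close enough.
double pollutionNear(Point edgeMidpoint, const std::vector<AirQualityLine>& lines,
                     double maxDistance = 100.0, double defaultPollution = 0.1) {
    double bestDistance = maxDistance;
    double result = defaultPollution;
    for (const AirQualityLine& line : lines) {
        const double d = distanceToLine(edgeMidpoint, line);
        if (d <= bestDistance) {
            bestDistance = d;
            result = line.pollution;
        }
    }
    return result;
}

// Scale an edge's routing weight up as pollution rises (hypothetical formula).
double modulatedWeight(double baseWeight, double pollution, double penalty = 2.0) {
    assert(baseWeight >= 0.0);
    return baseWeight * (1.0 + penalty * pollution);
}
```

With a single air quality line running from (0, 0) to (100, 0) and a pollution index of 0.8, an edge midpoint at (50, 30) picks up 0.8, while one at (50, 500) falls back to the default. A linear scan like this is fine for a sketch; on a real road network you would want a spatial index, which is exactly what postgis gives you.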

I only had my little cheapo netbook with me and I spent 80% of the first day worrying about why my wifi card wasn’t working. Fortunately I was able to scrounge some network access from others for any pieces of the necessary toolchain I was missing, and I remembered I had the greater_london osm extract from geofabrik lying around (my little netbook is comfortably able to cope with London given enough patience). By the Wednesday morning I had a built & working demo London dataset. So there you go. Sixteen lines of c++. This is the power of Free software and open source that you don’t get anywhere else.

We presented our work along with a nice looking “my local air quality” map put together by others (including a latecoming strk who also provided the air quality overlay used in our OSRM hack) using cartodb, and came in third. Which is not bad considering the amount of work some of the teams had clearly put into their projects over the two days.

The (very hacky) code is on github @ branch pmp. Beware there are a good few hardcoded strings in there - it is only a proof of concept.

Air quality aware router based on OSRM

The problem naturally with this tool and its limited dataset is that it will always just try and route around the areas it has data for. Authorities will necessarily measure and model those areas that are likely to have poor air quality and the resultant routes will more or less just try their best to stick to backstreets. This is a data problem that I will leave people like Tom Chance to rant about. If we wanted to just route across streets for which data is present (as was probably the organizers’ intention), it would have been a one-line change to weight edges beyond n metres from an air quality linestring out of existence.
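To make that alternative concrete, here is a rough sketch of what such a rule could look like. It is an illustration only, not the code in the branch: the function name `restrictedWeight`, the pollution index (0 = clean, 1 = worst), the penalty factor and the use of an infinite weight are all assumptions of mine for the example.

```cpp
#include <cassert>
#include <limits>

// Hypothetical weighting rule: edges further than maxDistance metres from
// any air quality linestring are made effectively unroutable; the rest are
// penalised in proportion to a made-up pollution index (0 = clean, 1 = worst).
double restrictedWeight(double baseWeight, double distanceToNearestLine,
                        double pollution, double maxDistance = 100.0,
                        double penalty = 2.0) {
    assert(baseWeight >= 0.0);
    if (distanceToNearestLine > maxDistance) {
        // No data nearby: weight the edge out of existence.
        return std::numeric_limits<double>::infinity();
    }
    return baseWeight * (1.0 + penalty * pollution);
}
```

Whether an infinite weight is acceptable depends on the router; dropping such edges before they reach the graph would be the other obvious option.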

I left the GeoHack with a little bit of worry in the back of my mind as to why this solution wasn’t obvious to others. I think it’s that traditi-gis folks still see openstreetmap as a pretty tile (“base layer” as they would say) provider, and not the massive dataset and tool resource it is. This is possibly an attitude hangover from Google Maps, which for years people chose as their default “base layer”, but would give you nothing but tiles. No data. No fun. And many people see openstreetmap purely as a Google Maps replacement. It’s understandable. But we clearly still have a lot of awareness work to do amongst even our neighbour communities.

Comment from Tom Chance on 24 September 2013 at 08:10

Great stuff. You’re trying to put me through the fun of setting up OSRM, I see.

Although it only covers the Greater London area, you can use the same process I described in my blog entry to map average nitrogen dioxide concentrations onto the OSM linestrings. That will give you a complete dataset covering every road, path and monkey puzzle tree in the capital.

The same dataset (the LAEI) also contains data for particulate emissions, which would help in central London. In town the NO2 concentrations are basically way over the legal limits everywhere, but the particulate emissions are concentrated on main roads, so you could still find a slightly less bad route.

Anyhow, if someone did fancy taking this further, I think there would be interest in, for instance, helping schools to plan healthier walking routes.

Comment from AlexBainesB on 24 September 2013 at 09:17

Hey guys,

As I can’t code in C++ or anything else for that matter, I am humbled by what you are doing and feel a bit like a fan telling J.K. Rowling what ending to write for Harry Potter. But what would be really cool is an air quality geiger counter. Using your phone’s GPS to track your location, it estimates the dose of pollution you have received in your day.

At the point where you have had an unsafe amount of pollution it sends you a text warning, to let you know. At the end of the week it sends you an email summary, with charts showing when your dose spiked into the red, and maps showing your locations at those points.

If you could add your family members to it and elect to send other people a summary, then people could send their data to their MP and ask them what they are doing about it.

Although I have suggested the route planner myself, I think that is a bit backward. The route planner puts the onus on the individual to “plan” a safe route. A geiger counter makes it clear that it is the authorities’ responsibility to control the air pollution.

If you have a route planner, the movement/freedom of the people who are concerned is restricted first. What you want is everyone to have the same freedoms. When people have had their amber warning for the day they may choose to change their behaviour.

Also the route planner approach means people plan a “safe” route and don’t think about it again. A geiger counter will keep the issue in people’s minds and will keep them informed if their “safe” habits become infringed by more pollution. If a pollution geiger counter got built we could tweet it at politicians and doorstep them with it. You could tell Boris: You can use this App to help keep your family safe from London’s dangerous levels of pollution. Doing it that way, it might get news coverage, and a good uptake.

I spoke to a bloke two weeks ago who made loads of money selling an App. His was a stethoscope. The key to the app going well is getting lots of TV news coverage. The big thing in public health at the moment is for people to “know their numbers”. They mean cholesterol. But particulates are just as important.

In fact (thinking aloud): there is a guy at Imperial who wants to develop a public health intervention based around new technology. He might be interested in this.

Best wishes Alex

Comment from robert on 24 September 2013 at 10:05

Those are some interesting ideas there, Alex. I suppose the difference between the approaches is like that of bottom-up change and top-down change.

I know the idea of air-quality-aware routing has been kicking around for a while - in fact I think Martin from cyclestreets would love to be able to add it to their product if he had the right data (& time to add it amongst all his other feature requests).

As for c++, well, y’know, none of us were born knowing c++. You just kind of decide to sit down one day and figure it out. The conceptual wall between programmers and non-programmers is imaginary.
