It has been a few weeks since I wrote about the public beta release of Cygnus, the Telenav conflation engine for OSM data. Since then, I have been approached by a few folks who wanted to take it for a spin. One of them is longtime OSM contributor MikeN. He is preparing an import for Holt and Atchison counties in the U.S. state of Missouri. We worked together to clear some technical hurdles. Here's a report of what we (well, mostly he) did.
Mike obtained the source data for Holt and Atchison counties from their official GIS:
I obtained an updated road network from their official GIS, extracted and translated the tags, and followed up with a review against current aerials as well as checking for connectivity, glomming like road segments, and simplifying geometry. The final goal is to obtain permission to import, go through the import process steps, and merge new data onto the existing OSM data.
The next step was to convert the data into the OSM PBF format that Cygnus requires. This is when Mike got in touch with me to work through some technical difficulties:
Since Cygnus required the PBF format, I used Osmosis to convert. This failed because the nodes did not "have a version attribute as OSM 0.6 are required to have". I have learned from Martijn that OsmConvert works without a version attribute, and was able to verify this on my second county.
The next catch was that PBF doesn't accept negative node numbers. The simple workaround is to just use a text editor to remove the minus sign from ref='-. This seems a bit dangerous - would that file upload if accidentally selected? If so, many low numbered objects would be corrupted around the world. Hopefully, the conversion from OSM to PBF can be moved to the Cygnus chain so that it can accept zipped OSM since most users will start with .OSM data.
That is a great suggestion. On the one hand, we don't want to make Cygnus too easy to use. (Cygnus is in the end a tool to help with imports. It should never be easy to just import data into OSM. There are strict guidelines, and any tool to help with imports should make the user consider the process very carefully.) On the other hand, handling the conversion from JOSM XML (including the negative IDs) to valid PBF is a mechanical step that almost any Cygnus user would need to perform, so I would like to include that in a future version.
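Until that lands, the text-editor step could be scripted. What follows is only a rough sketch of the idea, not anything that ships with Cygnus: the function name make_pbf_friendly and the default version value are my own assumptions. It strips the minus sign from object IDs and references, adds the version attribute that Osmosis complained about, and refuses to continue if removing the sign would make two objects share an ID.

```python
import xml.etree.ElementTree as ET


def make_pbf_friendly(osm_xml: str, default_version: str = "1") -> str:
    """Return OSM XML with non-negative IDs and a version on every object.

    Hypothetical helper for illustration; not part of Cygnus or JOSM.
    """
    root = ET.fromstring(osm_xml)
    seen = set()
    for elem in root.iter():
        if elem.tag in ("node", "way", "relation"):
            # Same effect as the manual edit: "-5" becomes "5".
            new_id = elem.get("id").lstrip("-")
            key = (elem.tag, new_id)
            if key in seen:
                # e.g. the file held both node -5 and node 5.
                raise ValueError(f"ID collision for {elem.tag} {new_id}")
            seen.add(key)
            elem.set("id", new_id)
            # Add the version attribute only where it is missing.
            if elem.get("version") is None:
                elem.set("version", default_version)
        elif elem.tag in ("nd", "member"):
            # References must follow the IDs they point at.
            elem.set("ref", elem.get("ref").lstrip("-"))
    return ET.tostring(root, encoding="unicode")
```

Mike's warning still applies to the result: the IDs in the output now look like real OSM IDs, so the file is only good as input for the PBF conversion and should never be opened for upload.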
With that out of the way, we worked together to produce the Cygnus JOSM XML change file.
The result was that it did pick up the new roads and they appear to connect properly into the existing road network. The cases of modified geometry were also detected. And although the node placement was different, the rest of the roads were properly untouched.
It turned out that the number of changed / new roads was fairly minor. A future import would therefore not be too invasive. Here is the OSM base data versus the updates suggested by Cygnus for one of the counties Mike is working on:
In total, Cygnus suggested 68 updates: 31 entirely new geometries and 37 updated geometries. The updated geometries were mostly caused by connecting the new ways to the existing network, which adds a node to the existing way at each junction.
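If you want to produce a similar tally for your own change file, something along these lines might do. It is a sketch that assumes the file follows the usual JOSM conventions, where newly created objects carry negative IDs and edited ones carry action='modify'. I have not checked it against every file Cygnus can produce, and summarize_changes is just a name I made up.

```python
import xml.etree.ElementTree as ET


def summarize_changes(josm_xml: str) -> dict:
    """Count new and modified ways in a JOSM-style XML string.

    Assumes JOSM conventions: negative id = new, action='modify' = edited.
    """
    root = ET.fromstring(josm_xml)
    counts = {"new": 0, "modified": 0}
    for way in root.findall("way"):
        if int(way.get("id")) < 0:
            counts["new"] += 1
        elif way.get("action") == "modify":
            counts["modified"] += 1
        # Ways with neither marker are untouched and not counted.
    return counts
```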
Mike is still working with the counties and the community to move this import forward. Working with Cygnus gave some good insights and will hopefully help prepare the actual import when it happens. This is pretty much an ideal use case for Cygnus, and I hope to see more of them.
Mike also has a wish list for Cygnus:
Future enhancements - In my case, I extracted the surface tags from the GIS source. It would be interesting to have more control over tag merging - such as taking surface tags from the 'new' ways if there is no current surface tag. And in the case of renamed roads, to be able to give the new name a priority for the merge.
I already discussed this with my team and this is high on our list of Cygnus improvements - together with support for POI type nodes.
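To make the request concrete, here is roughly the kind of rule Mike is describing. This is purely an illustration of the idea and not how Cygnus works today or necessarily how it will work; the function merge_tags and its two parameters are hypothetical.

```python
def merge_tags(existing, new, fill_if_missing=("surface",), prefer_new=("name",)):
    """Merge tags of a 'new' way into the tags of an existing OSM way.

    Hypothetical policy for illustration:
    - keys in fill_if_missing are copied only when OSM has no value yet
    - keys in prefer_new are overwritten with the new value when present
    - every other existing tag is left alone
    """
    merged = dict(existing)
    for key in fill_if_missing:
        if key not in merged and key in new:
            merged[key] = new[key]
    for key in prefer_new:
        if key in new:
            merged[key] = new[key]
    return merged


# Made-up example tags for a renamed road with no surface tag in OSM.
osm_tags = {"highway": "residential", "name": "Old Mill Rd"}
gis_tags = {"highway": "unclassified", "name": "Mill Road", "surface": "gravel"}
print(merge_tags(osm_tags, gis_tags))
```

With these example tags, the merged way keeps the existing highway value, takes the surface from the GIS data, and gets the new name.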
Get in touch with me if you are ready to give Cygnus a try with local data you have!