Following recent discussions on OSM mailing lists about tag homogenisation it struck me that there probably wasn’t a good summary anywhere of the tools that people use to change the tags in OSM data into something that they can use. You might wonder why on earth we need to do this, given that OSM has natural language names for everything, but unfortunately many words used in OSM might have different meanings around the world, such as “city” and “highway”. However, this isn’t a diary entry about that - it’s a brief summary of what I’m aware of and have used for converting the “super detailed” tagging in OSM into something more appropriate for e.g. something rendering OSM data.
The documentation for osmosis tag transforms is here. I wrote a diary entry about how I use this method to change the “name” language of Welsh-speaking places in Wales to Welsh, and Scots Gaelic-speaking places in Scotland to Scots Gaelic.
It’s pretty simple to use but it’s a once-off conversion designed for when you might use osmosis at data load - it’s not especially useful within some other process.
The main alternative for “things that you used to do with osmosis” these days is osmium, which is available both as a library and as a standalone tool. I’m not aware that it supports tag transformation directly, but it wouldn’t surprise me if it did (it can do almost anything else).
Lua tag transformation in osm2pgsql is documented here. Out of the box, osm2pgsql contains a “filter_tags_generic” function that does a bit of filtering but doesn’t actually change any tag synonyms.
For a simple “change tag X to Y” example, have a look here in the lua tag transform definition that I use for the map at the top of this page. There you can see various “not very much used” tags (and in some cases “really shouldn’t be used”, such as “highway=unsurfaced”) changed to a “lowest common denominator guess”.
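As a rough illustration of the idea (in Python here, rather than the Lua that osm2pgsql actually runs), a “change tag X to Y” transform boils down to a lookup table applied to each object’s tags. The function name and the target values in the table are assumptions made up for this sketch; they are not copied from the real style file.

```python
# Hypothetical synonym table: (key, rarely used value) -> more common value.
# The replacement values are illustrative guesses, not the real style's choices.
SYNONYMS = {
    ("highway", "unsurfaced"): "track",
    ("highway", "byway"): "track",
}


def normalise_tags(tags):
    """Return a copy of tags with rarely used values replaced by common ones."""
    result = dict(tags)
    for key, value in tags.items():
        replacement = SYNONYMS.get((key, value))
        if replacement is not None:
            result[key] = replacement
    return result


if __name__ == "__main__":
    print(normalise_tags({"highway": "unsurfaced", "name": "Green Lane"}))
```

Tags that are not in the table pass through unchanged, so the rendering code only ever has to deal with the common value.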
Elsewhere in that file you’ll see some tag values that you won’t see in OSM at all. These get handled in the rendering like this and the benefit is that you get rendering code that’s easier to understand - each “logical thing” has its own rendering code and it isn’t cluttered by lots of further tests either in the .mss or the .mml that has to say all the way through “is it X or Y”.
One of the things that the maps here display is England and Wales’ “public rights of way” (tagged in OSM with a “designation” tag). I use a series of tag transformations to display different legal statuses in different colours, and paths without that legal status in grey.
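A minimal sketch of that kind of transformation, again in Python for illustration only: the “designation” values are ones used in OSM, but the synthetic “highway” values they are mapped to, the set of path-like highway values and the function name are all invented for this example.

```python
# OSM "designation" values for rights of way in England and Wales, mapped to
# made-up synthetic highway values that a renderer could colour separately.
DESIGNATION_TO_CLASS = {
    "public_footpath": "prow_footpath",
    "public_bridleway": "prow_bridleway",
    "restricted_byway": "prow_restricted_byway",
    "byway_open_to_all_traffic": "prow_boat",
}

# Assumption: only these highway values are treated as "paths" in this sketch.
PATH_LIKE = {"footway", "path", "bridleway", "track"}


def classify_path(tags):
    """Rewrite highway=* on path-like ways according to their designation."""
    result = dict(tags)
    if tags.get("highway") in PATH_LIKE:
        # Paths with no recognised designation get one catch-all class,
        # which the renderer can then draw in grey.
        result["highway"] = DESIGNATION_TO_CLASS.get(
            tags.get("designation"), "path_no_designation"
        )
    return result


if __name__ == "__main__":
    print(classify_path({"highway": "footway", "designation": "public_footpath"}))
    print(classify_path({"highway": "path"}))
```

Each synthetic value can then have its own simple rendering rule, instead of every rule testing “highway” and “designation” together.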
osm2pgsql is typically used when creating a database for rendering or doing geocoding from. Alternatives exist, such as imposm. I’m not aware of alternatives that support tag transformation, but that doesn’t mean that there aren’t any.
mkgmap is typically used by people converting OSM data into something you load onto Garmin devices (eTrex, Nuvi, etc.). If you download an OSM-based map for Garmin from somewhere, the chances are that it was created with mkgmap.
Garmin’s internal format has a series of hex codes for everything, so essentially everything in a Garmin style definition is a series of tag transformations (like this one for “points”).
In that example you’ll sometimes see two different things at the left-hand-side converted to the same Garmin feature (e.g. “amenity=conference_centre” and “amenity=convention_center”) and tests combined with an OSM tag (e.g. “amenity=fuel & shop=convenience”).
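The logic of those rules can be sketched in Python (this is not mkgmap’s style syntax, just an illustration of the matching). The hex codes below are placeholders rather than real Garmin type codes, and the rule order and “first match wins” behaviour are assumptions made for this sketch.

```python
# Each rule is (required tags, Garmin-style type code).
# The codes are placeholders for illustration, not real Garmin values.
RULES = [
    # A combined test: both tags must be present for this rule to match.
    ({"amenity": "fuel", "shop": "convenience"}, 0x0001),
    ({"amenity": "fuel"}, 0x0002),
    # Two different spellings converted to the same feature.
    ({"amenity": "conference_centre"}, 0x0003),
    ({"amenity": "convention_center"}, 0x0003),
]


def garmin_type(tags):
    """Return the code of the first rule whose conditions all match, else None."""
    for conditions, code in RULES:
        if all(tags.get(key) == value for key, value in conditions.items()):
            return code
    return None


if __name__ == "__main__":
    print(hex(garmin_type({"amenity": "convention_center"})))
```

Because the first matching rule wins here, the more specific combined test has to come before the plain “amenity=fuel” rule.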
I wrote a diary entry a while back explaining how to do very simple changes to a Garmin map style.
OsmAnd is probably the most powerful, most customisable OSM renderer that almost no-one thinks of.
Out of the box you get the ability to download .obf files from OsmAnd’s servers (up to a limit if you go for the free version) or you can create your own maps using OsmAndMapCreator.
What’s less well known is that it’s fairly simple to change the rendering style to something that you have created (see here) and that that can be done independently of the .obf file loaded. The tricky bit is that OsmAnd’s default style definition is a bit of a monster, as it seems to contain “all the style variants that OsmAnd can display” with lots of if/then/else logic for e.g. “is it currently night” and “are we in a car” etc. You can however add your “extra style rules” to a separate file to keep it simpler.
In addition, you can also perform tag transformations directly in OsmAndMapCreator - see here for the sort of thing that already exists, for example.
I’m currently working on a version of the two OsmAnd rendering files that performs the same “designation” processing as the online maps that I use, though that’s some way off. It uses the same method as the osm2pgsql style.lua: change the things that I am interested in to a new tag name based on other OSM tags, ensure that gets added to the .obf file, and then render it as required.
There are lots of others that I haven’t used - this list post mentions the routing engine Valhalla, and I’m sure there are others.
Comment from SK53 on 7 December 2019 at 10:06
Osmfilter also provides a fairly lightweight way to change tags. I haven’t used this myself, but syntactically it may be a bit easier than the Osmosis ones.
Lua with osm2pgsql & mkgmap are great, but more tools to transform within OSM formats so that rules don’t have to be rewritten for each consumer would be nice. I’ve used osmosis tag transforms for what Robert Whittaker calls “Ghosts” (shops which have closed but are still present on OSM & need surveying), but the idea of writing 10s of such transforms in the XML syntax is rather daunting.
Comment from SomeoneElse on 7 December 2019 at 10:18
I’ve actually used osmfilter quite a lot myself - surprised I forgot about it! One downside is that you need to convert to a .o5m file before using it and then back again afterwards; another is that I found (and other people have found) it to be a bit tricky to get exactly the right filter in place if you only want to make a couple of changes.
Comment from PierZen on 13 December 2019 at 19:11
To export OpenStreetMap data to the UNOCHA Humanitarian Portal, I group OSM data by thematics such as health, education, etc. For such an Extract-Transform-Load system, we need to ensure that we cover the various thematics well (we identify all the key-value combinations) and validate key-values to ensure quality.
ogr2ogr lets me load the OSM Planet extract for a country into PostGIS. SQLite-SpatiaLite can be used for smaller projects, and it recognizes PostGIS functions. Otherwise I use PostgreSQL-PostGIS.
This way it is easy to test and validate the data. For the DR Congo planet file, I filter data using a series of conditional statements. But this could be done using a translation table:
key       value     thematic
amenity   hospital  Health
building  hospital  Health
amenity   clinic    Health
In the example above, we could also create a generic feature variable to group various key/values that represent the same feature (for example, hospitals).
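As a hedged sketch of how such a translation table might be applied (in Python rather than the SQL that would run in PostGIS), the rows above become a dictionary lookup. The extra “feature” column is the generic feature variable mentioned above; its values and the function name are assumptions for illustration.

```python
# Translation table: (key, value) -> (thematic, generic feature).
# The first three columns come from the table above; the generic
# "feature" names are hypothetical.
TRANSLATION = {
    ("amenity", "hospital"): ("Health", "hospital"),
    ("building", "hospital"): ("Health", "hospital"),
    ("amenity", "clinic"): ("Health", "clinic"),
}


def classify(tags):
    """Return (thematic, feature) for the first matching tag, or None."""
    for key, value in tags.items():
        match = TRANSLATION.get((key, value))
        if match is not None:
            return match
    return None


if __name__ == "__main__":
    print(classify({"building": "hospital", "name": "Example Hospital"}))
```

Objects for which the lookup returns None can be listed separately, which gives a simple way to spot key/value combinations that the table does not yet cover.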
This is quite important for supporting projects such as the response to the Ebola outbreak in the north of DR Congo, where we need to identify all features, even those with misspelled tags. Tag classification ensures that we don’t miss features. It also helps validate the data with a summary of key/value combinations for the country.
Comment from PierZen on 13 December 2019 at 19:20
I repeat the block for the unformatted table example