Made-up names and how to avoid them
Posted by SomeoneElse on 15 August 2017 in English (English). Last updated on 11 August 2021.
There’s recently been a thread on the talk-gb mailing list where someone has decided that, despite previous custom and practice there, the “name” field in both English- and Welsh-speaking areas of Wales should be a compound of both the English and Welsh names. No-one says “I’m climbing up Snowdon / Yr Wyddfa today”; they’ll use one name or the other, not both together.
In the Welsh-speaking areas the Welsh names are more likely to be used; in the English-speaking areas the English names. It’s not a hard-and-fast rule; this peak in the Black Mountains is referred to about equally by both the Welsh and English names, despite it being in a predominantly English-speaking area.
Wikipedia gives an idea of Welsh-language take-up here. That’s a bit broad-brush; for example I don’t think there’s an isogloss between Carmarthenshire and Swansea where people gain/lose the ability to speak Welsh.
So how is it possible to extract data from OSM with the Welsh name in the Welsh-speaking areas and the English name in English-speaking ones, both when creating e.g. a rendering database for the first time and when updating it as people update OSM? Firstly we’ll just consider the “loading the database” part.
There are a couple of possible solutions to the problem. I used “osmosis”, which has a handy “tag transform” feature. The Welsh one is here; the English one is similar.
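For readers who haven't seen an osmosis tag transform file, here is a rough sketch of what a "use the Welsh name" rule could look like. It is not the file linked above: the file name transform_cy_sketch.xml and the rule itself are assumptions for illustration, based on the TagTransform plugin's documented match/output structure. The idea is that any object with a non-empty "name:cy" keeps all its tags but has "name" overwritten with the Welsh value.

```shell
# Write a hypothetical transform file. The file name and the rule are
# illustrative assumptions, not the author's actual transform_cy.xml.
cat > transform_cy_sketch.xml <<'EOF'
<?xml version="1.0"?>
<translations>
  <translation>
    <name>Welsh names</name>
    <description>Use name:cy as name where it exists</description>
    <match>
      <tag k="name:cy" match_id="name_cy" v=".+"/>
    </match>
    <output>
      <copy-all/>
      <tag from_match="name_cy" k="name" v="{0}"/>
    </output>
  </translation>
</translations>
EOF
```

An English equivalent would presumably match on "name:en" instead. Objects that have no "name:cy" tag should pass through unchanged, so they keep whatever "name" they already had.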
Very roughly, the Welsh-speaking area of Wales corresponds to this area. That’s not perfect, but it’s not a bad approximation for a rectangle. I downloaded the latest Welsh data from Geofabrik and cut that area out of it:
osmosis --read-pbf wales-latest.osm.pbf --bounding-box left=-4.82 bottom=52.02 right=-3.34 top=53.69 --write-pbf wales_cy_before.pbf
Convert the “Welsh-speaking” part to names based on “name:cy”:
osmosis --read-pbf wales_cy_before.pbf --tag-transform transform_cy.xml --write-pbf wales_cy_after.pbf
Create a copy of the larger file with names based on “name:en”:
osmosis --read-pbf wales-latest.osm.pbf --tag-transform transform_en.xml --write-pbf wales_en_latest.pbf
Merge the two together (do it this way around and the “Welsh” file seems to take precedence):
osmosis --read-pbf wales_cy_after.pbf --read-pbf wales_en_latest.pbf --merge --write-pbf wales_merged.pbf
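The four osmosis steps above can be chained in one small script. This is only a sketch: the run helper, the build_merged function and the DRY_RUN switch are names made up here, not part of osmosis. By default it just prints the commands so you can check them; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch: chain the extract, transform and merge steps from the post.
# DRY_RUN, run() and build_merged() are illustrative additions only.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_merged() {
    src="$1"   # the full extract, e.g. wales-latest.osm.pbf

    # 1. Cut out the roughly Welsh-speaking rectangle.
    run osmosis --read-pbf "$src" --bounding-box left=-4.82 bottom=52.02 right=-3.34 top=53.69 --write-pbf wales_cy_before.pbf
    # 2. Switch names in that rectangle to name:cy.
    run osmosis --read-pbf wales_cy_before.pbf --tag-transform transform_cy.xml --write-pbf wales_cy_after.pbf
    # 3. Switch names in the whole extract to name:en.
    run osmosis --read-pbf "$src" --tag-transform transform_en.xml --write-pbf wales_en_latest.pbf
    # 4. Merge, with the Welsh file listed first as in the post.
    run osmosis --read-pbf wales_cy_after.pbf --read-pbf wales_en_latest.pbf --merge --write-pbf wales_merged.pbf
}

build_merged wales-latest.osm.pbf
```

Because of "set -eu", a real run (DRY_RUN=0) stops at the first osmosis command that fails rather than merging stale files.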
And load:
osm2pgsql --create --slim -d gis -C 2500 --number-processes 2 -S openstreetmap-carto.style --multi-geometry --tag-transform-script ~/src/SomeoneElse-style/style.lua wales_merged.pbf
The “osm2pgsql” command to use obviously varies depending on the data and the style used; I’m using this Lua tag transform (that’s unrelated to the “osmosis” tag transforms described above) and this map style.
The result is that this road now displays as “Stryd Fawr” and this road as “Glasfryn Road”. Success!
Edit: There’s an automatic script to do this (for the style I use) here. I’ve updated that to use a .poly file (thanks to SK53 for that). Interestingly that includes St Davids in the “Welsh” part - so Glasfryn Road is now “Ffordd Glasfryn”! I’ve also used Scots Gaelic names for the far northwest of Scotland, and the results can be seen here.
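As a rough illustration of the .poly approach (not the actual file from SK53 or the linked script), the sketch below writes a polygon file that just reproduces the rectangle used earlier, then cuts the extract with osmosis's --bounding-polygon task instead of --bounding-box. The file name welsh_area_sketch.poly is made up; a real polygon would follow the language boundary more closely.

```shell
# Hypothetical polygon file: a rectangle matching the bounding box used earlier.
# Each coordinate line is "longitude latitude"; the ring ends where it started.
cat > welsh_area_sketch.poly <<'EOF'
welsh_area_sketch
1
   -4.82   52.02
   -3.34   52.02
   -3.34   53.69
   -4.82   53.69
   -4.82   52.02
END
END
EOF

# Only attempt the extract if osmosis and the input file are actually present.
if command -v osmosis >/dev/null 2>&1 && [ -f wales-latest.osm.pbf ]; then
    osmosis --read-pbf wales-latest.osm.pbf \
        --bounding-polygon file=welsh_area_sketch.poly \
        --write-pbf wales_cy_before.pbf
fi
```

The rest of the process (transform, merge, load) would stay the same; only the "cut the Welsh area out" step changes.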