For years, an issue with Kurdish language, Arabic script, and OpenStreetMap tiles has been on my radar. In 2023 I got OSM to update Noto fonts on the tile server, but Google has moved their latest changes to individual repos.
I’m still workshopping a PR for that, but in the meantime I wanted to check whether OSM needs more of the language-specific Noto fonts. Back in spring 2019 I did a mini survey of where Unicode blocks were used around the OSM world.
Today I added Python scripts to check Planet PBF files (specifically name and alt_name tags on nodes) and find usage across Unicode blocks.
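As a rough illustration of the counting step (not the actual scripts), here is a minimal sketch that assumes the name and alt_name values have already been pulled out of the PBF file into plain strings. The `BLOCKS` table is a small hand-picked subset of the ranges in Unicode's Blocks.txt, and `block_of` and `count_blocks` are hypothetical helper names.

```python
from collections import Counter

# Partial (start, end, block name) table copied from Unicode's Blocks.txt.
# A real survey would load the whole file; this subset is only for illustration.
BLOCKS = [
    (0x0000, 0x007F, "BASIC_LATIN"),
    (0x0080, 0x00FF, "LATIN_1_SUPPLEMENT"),
    (0x0100, 0x017F, "LATIN_EXTENDED_A"),
    (0x0370, 0x03FF, "GREEK_AND_COPTIC"),
    (0x0400, 0x04FF, "CYRILLIC"),
    (0x0590, 0x05FF, "HEBREW"),
    (0x0600, 0x06FF, "ARABIC"),
    (0x07C0, 0x07FF, "NKO"),
    (0x1200, 0x137F, "ETHIOPIC"),
    (0x2C00, 0x2C5F, "GLAGOLITIC"),
    (0x2D30, 0x2D7F, "TIFINAGH"),
    (0x4E00, 0x9FFF, "CJK_UNIFIED_IDEOGRAPHS"),
    (0xAC00, 0xD7AF, "HANGUL_SYLLABLES"),
    (0xE000, 0xF8FF, "PRIVATE_USE_AREA"),
]


def block_of(ch):
    """Return the block name for one character, or "UNKNOWN" if not in the table."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "UNKNOWN"


def count_blocks(names):
    """Count how many names use each block (each name counts once per block)."""
    counts = Counter()
    for name in names:
        counts.update({block_of(ch) for ch in name})
    return counts


if __name__ == "__main__":
    sample = ["Baška", "Ⰱⰰⱎⰽⰰ", "Αθήνα"]
    for block, total in count_blocks(sample).most_common():
        print(block, total)
```

Counting names rather than characters keeps one long name from dominating a block's total. Reading the tags out of a Planet file would need a PBF library on top of this.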
There are names with the Latin alphabet and frequently associated characters (superscripts and subscripts, dingbats, diacritics, IPA, half-width, old italic, runic, spacing modifiers, punctuation, emoticons/emoji, and symbols from math, music, currency, and maps).
Africa has: TIFINAGH, ARABIC (supplements and presentation forms), CYRILLIC, ETHIOPIC, NKO, HEBREW, CJK, HANGUL, and GREEK.
Asia has: CYRILLIC, GREEK, HEBREW, ARABIC, SYRIAC, COPTIC, ETHIOPIC, BALINESE, JAVANESE, CJK + YI + BOPOMOFO + KANGXI, HANGUL, MONGOLIAN, TIBETAN, THAI, MYANMAR, LAO, KHMER, ARMENIAN, GEORGIAN, THAANA, SINHALA, TAMIL, ORIYA, BENGALI, GURMUKHI, GUJARATI, DEVANAGARI, KANNADA, MALAYALAM, OL_CHIKI, and TELUGU.
For the Americas, OSM already includes fonts for Cherokee and Canadian Aboriginal Syllabics.
Those two scripts, along with OGHAM, TAGBANWA, and BAMUM, were misused in Asia. The one instance of TAGALOG script was a little uncertain. I removed an Apple logo because it’s from the Private Use Area.
The current font download script is pretty good, and includes additional fonts (Adlam and Tai Viet) which aren’t actively used.
The one alphabet I will recommend adding is Glagolitic. Stone letters have appeared in several locations around Baška, Croatia (street view, street view 2). It also gets misused on tourist-site binoculars (presumably using Ⰹ to represent their shape), and I’d previously seen it in the Canary Islands.
I am considering setting up a script that checks the weekly edit downloads for common errors and suspicious Unicode blocks.
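Such a check might look something like the sketch below. It is only an assumption about how the script could work: it uses the first word of each character's Unicode name as a stand-in for its script, treats the general category `Co` as Private Use, and takes (element id, name) pairs that would have to be extracted from the edit download beforehand. The watch list and the function names are hypothetical.

```python
import unicodedata

# Hypothetical watch list, based on the scripts found misused in this survey.
SUSPICIOUS_SCRIPTS = {"OGHAM", "TAGBANWA", "BAMUM", "CHEROKEE", "CANADIAN"}


def reasons_to_review(name):
    """Return a sorted list of reasons why a name deserves a manual look."""
    reasons = set()
    for ch in name:
        # Category "Co" covers Private Use characters such as the Apple logo (U+F8FF).
        if unicodedata.category(ch) == "Co":
            reasons.add("PRIVATE_USE")
            continue
        # Character names usually start with the script, e.g. "OGHAM LETTER BEITH".
        first_word = unicodedata.name(ch, "").split(" ")[0]
        if first_word in SUSPICIOUS_SCRIPTS:
            reasons.add(first_word)
    return sorted(reasons)


def flag_edits(edits):
    """Map element id -> reasons, for (element_id, name) pairs that need review."""
    flagged = {}
    for element_id, name in edits:
        reasons = reasons_to_review(name)
        if reasons:
            flagged[element_id] = reasons
    return flagged


if __name__ == "__main__":
    edits = [(1, "Café"), (2, "\uf8ff Store"), (3, "\u1681\u1682")]
    print(flag_edits(edits))
```

A flagged name is only a candidate for review. Cherokee or Canadian syllabics are perfectly valid in the Americas, so a real script would also need to take the element's location into account.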