SimonPoole's Diary

Update name suggestion index

Posted by SimonPoole on 17 September 2017 in English. Last updated on 22 September 2017.

A couple of weeks back I regenerated the data in the “Name Suggestion Index” from a current planet dump, adding a largish number of new entries. The index is used by iD and Vespucci to generate canonical spellings for well-known brands and to apply the correct presets at the same time.

Naturally the raw list contains a lot of nonsense, so there are two ways to reduce the noise to an acceptable level: first, a list of (mis)spellings that are mapped to a canonical value, https://github.com/osmlab/name-suggestion-index/blob/master/canonical.json, and second, a filter, https://github.com/osmlab/name-suggestion-index/blob/master/filter.json, that removes names we don’t want, for example “Bank” for banks.
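To make the first mechanism concrete, here is a minimal sketch of how mapping spelling variants to a canonical value merges their counts. The table contents and the function names are invented for illustration, and the real canonical.json may well be laid out differently.

```python
from collections import Counter

# Hypothetical variant -> canonical table, invented for this example.
# It is NOT copied from canonical.json, whose layout may differ.
CANONICAL = {
    "McDonalds": "McDonald's",
    "Mc Donald's": "McDonald's",
}


def canonicalize(name: str) -> str:
    """Return the canonical spelling for a known variant, else the trimmed name."""
    cleaned = name.strip()
    return CANONICAL.get(cleaned, cleaned)


def count_names(raw_names):
    """Count names after canonicalization so variants accumulate on one entry."""
    return Counter(canonicalize(n) for n in raw_names)


counts = count_names(["McDonalds", "McDonald's", "Mc Donald's", "Subway"])
```

Here the three spellings end up on a single entry, which is the point of the mapping: one well-populated suggestion instead of several weak ones.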

Previously you could drop names only globally. Now you can drop them specifically for a type of object. For example, in older versions anything with the name “Casino” was dropped. Now only casinos with the name “Casino” are. (That was the reason why, in earlier versions, the suggestions didn’t work for the French supermarket chain of that name.)
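The difference between the old and new behaviour can be sketched as follows. This is only an illustration under assumed data structures: the dictionary layout, the "key/value" strings used for object types and the function names are hypothetical and do not reflect the actual format of filter.json.

```python
# Old behaviour (assumed shape): one global set of names dropped for every object type.
GLOBAL_DROP = {"Casino"}

# New behaviour (assumed shape): names dropped only for a specific object type.
PER_TYPE_DROP = {
    "amenity/casino": {"Casino"},
    "amenity/bank": {"Bank"},
}


def keep_global(object_type: str, name: str) -> bool:
    """Old rule: the object type is ignored, the name is dropped everywhere."""
    return name not in GLOBAL_DROP


def keep_per_type(object_type: str, name: str) -> bool:
    """New rule: the name is dropped only if listed for this object type."""
    return name not in PER_TYPE_DROP.get(object_type, set())


# The supermarket chain is lost under the old rule and kept under the new one.
old_result = keep_global("shop/supermarket", "Casino")
new_result = keep_per_type("shop/supermarket", "Casino")
```

With the per-type rule, a casino simply named “Casino” is still filtered out, while a supermarket of the same name stays in the suggestions.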

The index is not perfect, mainly because it is not country-specific (and creating such an index would be, IMHO, too much work). But it works quite well, even given its limitations.

Now, why am I writing this? The update added a lot of names in non-Latin scripts and other new entries that need to be checked to see whether they are actually useful. Considering that iD is used by the majority of new mappers, improving the index has a direct effect on the quality of their contributions.
