We have all used building:levels and alt_name without giving them a second thought. Why are these keys built that way? Why not levels:building? To me, it looks like there is a rule for building composite keys.
ref is the basic tag for storing a reference number. For a reference number in some third-party table, we add a suffix: ref:third_party. That is because the new tag still contains a reference number. We have all such numbers in ref* keys. The rule of thumb is, the meaning of a value is defined by the basic tag before the suffix. ref:third_party is still a ref, and source:maxspeed is a source.
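The rule of thumb for suffixed keys can be sketched in a few lines of Python. This is only an illustration: basic_tag is a name made up for this post, not part of any OSM library, and it covers only the suffix case, not the namespaces discussed further down.

```python
def basic_tag(key: str) -> str:
    """Return the part of a composite key before the first colon.

    For suffixed keys, this is the basic tag that defines
    what the value means.
    """
    return key.split(":", 1)[0]


for key in ("ref", "ref:third_party", "source:maxspeed"):
    print(key, "->", basic_tag(key))
# ref -> ref
# ref:third_party -> ref
# source:maxspeed -> source
```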
Sometimes we cannot use suffixes for historical reasons. That is the case with name, where name:en and other suffixes are used for names in other languages. So for names we build composite keys by prepending a qualifier with an underscore: local_name or place_name. These are still names, but the order is reversed compared to the colon notation.
Of course, an underscore is also used for multi-word keys: public_transport and admin_level.
Then, there are namespaces. The best known is addr:, with addr:housenumber and so on. Without a suffix, the addr key has no meaning. The same goes for contact: and turn:. Namespaces mark a group of tags that have the same meaning and similar value formats, and that are usually described on a single wiki page.
Some namespaces are used for tying properties to a part of the object described by the main tag, or for adding more specific properties of that object. For example, building:* tags describe attributes of a building, and we also have roof:type and fire_hydrant:type. These words most often appear on the same object as a key or a value, e.g. building=yes or amenity=fire_hydrant, but they can also mean a part of the structure denoted by these tags: buildings almost always have a roof.
The definition of a namespace is very vague, and some people mistake basic tags for namespaces. For example, we have 2.6k addr tags in the database. Sometimes people try to impose a prefix on a set of established and well-used tags to group them: it improves sorting in editors and allows for introducing many more similarly-named keys without “polluting” the namespace-less set of tags. That is what happened with the “contact:” prefix: it is rare to see imports using “phone” and “website” tags without it.
Suffix or a namespace?
Telling a basic tag with a suffix apart from a namespace of the second type is harder. For example, which would be correct, building:height or height:building? roof:height or height:roof? This depends on four things:
- Which of the basic tags for each of the parts is used more often, and hence is expected to come first? In this case, building is used 28 times more than height. The roof key is virtually unused.
- Which of these parts is more commonly used as a namespace? height: is used as a namespace for only three popular (more than a hundred usages) keys, none of which is globally spread. For building:, the number of prefixed keys with more than a hundred usages is around 120, for roof: — around 30.
- When removing the suffix, will the value be meaningful for the basic tag? It definitely won’t be for building=100 m and roof=100 m, but will be for height=100 m.
- Will the basic tag without a suffix have the same meaning for the kind of objects with other similarly namespaced keys? In case of buildings, height would be enough without a suffix, and these tags are pretty widespread. But roofs are parts of buildings, so you would have either a suffix or a namespace.
So, for building height you would use a plain height key because of the fourth point. But for roof height, you would choose roof:height because roof: is commonly used as a prefix, as per the second point, unlike height:.
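The way these questions settle the two height examples can be written down as a rough sketch. It encodes only points 2 and 4, the ones that decide these two cases; points 1 and 3 are left out. Part and choose_key are hypothetical names, and the prefix counts are the approximate figures quoted in the list above.

```python
from dataclasses import dataclass


@dataclass
class Part:
    name: str
    namespaced_keys: int  # popular keys that use "name:" as a prefix (point 2)


def choose_key(feature: Part, prop: Part, plain_key_is_enough: bool) -> str:
    """Pick a key for a property of a feature, using points 4 and 2 only."""
    # Point 4: the plain key already has the right meaning on this kind of object.
    if plain_key_is_enough:
        return prop.name
    # Point 2: the part more commonly used as a prefix comes first.
    if feature.namespaced_keys >= prop.namespaced_keys:
        return f"{feature.name}:{prop.name}"
    return f"{prop.name}:{feature.name}"


# Approximate counts of prefixed keys with more than a hundred usages.
height = Part("height", 3)
building = Part("building", 120)
roof = Part("roof", 30)

print(choose_key(building, height, plain_key_is_enough=True))   # height
print(choose_key(roof, height, plain_key_is_enough=False))      # roof:height
```

Whether the plain key is enough is still a human judgement, which is why it is passed in as an argument rather than computed.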
A case against brand:wikipedia
The reason for this post is the recent import of thousands of brand:wikipedia and brand:wikidata tags. I argue that the better choice would be wikipedia:brand and wikidata:brand, for the same reason as source:maxspeed and ref:whatever.
I accept the introduction of separate tags for an object and its brand: we can have two links, one for the McDonald's brand and one for a single notable restaurant under that brand. That covers item 4 in the above list, and item 2 is not applicable, since neither wikipedia nor brand has been used as a namespace. But points 1 and 3 are in favour of wikipedia:brand: the value is still a Wikipedia article, and it is processed similarly to the value of the wikipedia tag. And we have four times more wikipedia keys than brand keys.
To conclude, I suggest we do a mass-retagging of these imported or automatically processed keys before this mistake creeps into the wiki. Either wikipedia:brand or brand_wikipedia would be better options.
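For illustration, the per-object part of such a retagging could look like the sketch below. It works on a plain dictionary of tags and is not tied to any editing tool; RENAMES and retag are names invented here, and the sample tag values are made up. A real mass edit would also need the usual discussion and review.

```python
# Proposed renames: keep the basic tag first, put the qualifier in the suffix.
RENAMES = {
    "brand:wikipedia": "wikipedia:brand",
    "brand:wikidata": "wikidata:brand",
}


def retag(tags: dict[str, str]) -> dict[str, str]:
    """Return a copy of the tags with the brand:* keys renamed.

    A key is left alone if the target key already exists,
    so no existing value is overwritten.
    """
    result = dict(tags)
    for old, new in RENAMES.items():
        if old in result and new not in result:
            result[new] = result.pop(old)
    return result


# Made-up example object.
before = {
    "amenity": "fast_food",
    "brand": "McDonald's",
    "brand:wikipedia": "en:McDonald's",
}
print(retag(before))
```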
In some cases we failed to notice composite keys in proposals that are built contrary to the norm described here. Now you have to do some non-obvious tagging, which requires looking for the correct keys in the wiki:
- bridge:name instead of bridge_name (like old_name)
- source:ref, though the correct key source_ref is used 10 times more often. Note that ref:source would not be entirely correct, since you should be more specific in the suffix. source=tmnt with ref:tmnt=1 would be the correct choice, better than source_ref=1.
- The whole section on *:wikipedia, prompted by this edit. Thankfully, we have only 20k of these keys, including the imported brand:wikipedia, so there is still time to fix this.