Thought experiment: What if values had no keys?

Posted by SwiftFast on 16 June 2017 in English (English)

I've been contemplating the various OSM tagging disputes and disagreements, and I thought of this: If you take 100 regular people to a park with benches, grass, trees, and some gravel paths, and ask them what they're seeing, they will probably all say "benches, grass, trees, gravel paths". There would be close to a 100% consensus and no major disputes.

So why is OSM tagging different? Our problem is that the system forces us to categorize those physical objects into abstract, artificially constructed buckets called "keys", and, naturally, this categorization is subjective and different mappers will want different schemes. Should grass be put in the "landuse" bucket, or the "natural" bucket? Wait, what if a landuse=military has grass? Maybe we should create a new bucket called "landcover" and put grass there? How do we reconcile all that with man_made? What do we do with the preexisting tags? And on and on.

Why not abandon the buckets?

Clearly, many tags do require key=value pairs, e.g. name or website. But these are usually non-physical and undisputed. In our hypothetical utopia, a node can have both traditional tags and value-only tags. The former are usually abstract, the latter usually physical.

I am well aware this requires API changes. I don't know if it's worth it, I know it has shortcomings, and I don't know if it's practical any time soon, or ever, but I thought I'd share for the sake of discussion, brainstorming, and so on.

Here are some tagging examples:

format: key=value, key=value ... [value, value, value, ....]

  • A grassy park with some picnic tables: [park, grass, picnic_tables] (If someone wants finer detail in the future, they can remove picnic_tables from the area and add picnic_table to specific nodes.)
  • Plaza with concrete floor: name="The Plaza", website="http://etc" [plaza, concrete] (nowadays: Is plaza man_made or landuse or amenity? Let me check the Wiki one more time...)
  • Military base with trees and grass on a sandy beach: name="base1" [militaryBase, trees, grass, sand] (nowadays: disputed tagging mess involving landcover, landuse, man_made, surface, natural)

As a bonus, this resolves the "double values" problem. If a shop sells both flowers and watches, by all means: name="John's flowers & watches", [florist, watches] (Or perhaps [shop, florist, watches])
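For the sake of discussion, here is a minimal Python sketch of how an element could carry both kinds of tags. Everything in it is an assumption for illustration: the `Element` class, its `tags` and `values` fields, and the `find` helper are hypothetical names, not part of any OSM API or library.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical map element with both kinds of tags."""
    # Traditional key=value tags (name, website, ...).
    tags: dict = field(default_factory=dict)
    # Value-only tags (park, grass, florist, ...); a set, so order is irrelevant.
    values: set = field(default_factory=set)

    def has(self, value: str) -> bool:
        """True if this element carries the given value-only tag."""
        return value in self.values


def find(elements, value):
    """Return every element that carries the given value-only tag."""
    return [e for e in elements if e.has(value)]


# The examples from the post, written in this hypothetical model.
shop = Element(
    tags={"name": "John's flowers & watches"},
    values={"florist", "watches"},
)
base = Element(
    tags={"name": "base1"},
    values={"militaryBase", "trees", "grass", "sand"},
)

grassy = find([shop, base], "grass")  # only the military base matches
```

Because the value-only tags form a set, the "double values" case needs no special handling: the shop simply carries both `florist` and `watches`.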

Comment from Math1985 on 16 June 2017 at 07:40

I fully agree.

Note that we have different categorisation systems in OSM:

  • The color scheme of the icons used in the default rendering (for example, blue is a transport category which does not exist as a key classification)
  • The categorisation system in OsmAND (which for example has a finance category)
  • The way JOSM categorises features
  • The Flosm categorisation system

All of them use different categorisation systems, which shows that a central one-size-fits-all categorisation system does not make sense.

Fortunately new tag look-up systems, such as in iD, which by default don't show the key names at all, largely resolve this problem.

Comment from Zverik on 16 June 2017 at 08:21

Good idea, but it would involve not only an API change (quite a small one, actually), but also modifying all the wiki pages and all the editors.

d1g proposed something like this two years ago in the Russia forum:

Comment from Piskvor on 16 June 2017 at 08:39

4, 3, 1, 2, 50, 70, asphalt, no, yes|psv, designated. What did I mean there? (Hint: it's a way ;))

Comment from SwiftFast on 16 June 2017 at 09:29

@Piskvor I clearly said that traditional tagging is required for some values. Numbers fall into that category.

Comment from rorym on 16 June 2017 at 14:44

One advantage of the OSM key-value system is that you can use new values, and data consumers will be able to simplify them by just using the key. So if you're working with buildings, you don't have to have a special rule for building=gazebo and can just generalise it to building. Your approach wouldn't allow that.

This opens up the idea of going the other way. Currently we have two-item tagging: key and value. You're suggesting one- or two-item tagging: key and value, or just value. But maybe we should have multiple-value tagging, so you could have as many levels as you need. You could tag a church as religion-christian-catholic, or a shop as shop-clothes-male-suits (rather than our current shop=clothes, clothes=male, and then I dunno how you'd tag suits). This way someone could simplify as much as they want: shop, or shop-clothes, or shop-clothes-male, etc.
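A rough Python sketch of how a data consumer might simplify such multi-level tags. The hyphen separator and the `prefixes` and `simplify` helpers are assumptions for illustration only; a real scheme would need a separator that cannot appear inside a value.

```python
def prefixes(tag: str, sep: str = "-") -> list:
    """List every level of generalisation of a multi-level tag, broadest first."""
    parts = tag.split(sep)
    return [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]


def simplify(tag: str, depth: int, sep: str = "-") -> str:
    """Keep only the first `depth` levels of a multi-level tag."""
    return sep.join(tag.split(sep)[:depth])


levels = prefixes("shop-clothes-male-suits")
# ['shop', 'shop-clothes', 'shop-clothes-male', 'shop-clothes-male-suits']

coarse = simplify("religion-christian-catholic", 1)  # 'religion'
```

A consumer that only cares about shops would match on the first level and ignore the rest.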

I don't think this'll happen though, the current system is good enough. ;)

Comment from SwiftFast on 16 June 2017 at 16:24

@rorym, I would say there is a difference between, say, "building" and "landuse".

The former is not what I would describe as an "artificial bucket", it's actually a useful, real world piece of data, while the latter is an artificial construct with little use on its own (this is why we see many building=yes but rarely landuse=yes).

Since "building" is a useful attribute, the building could be tagged ["building", "gazebo"], and a generic building can be tagged ["building"] rather than building=yes. This is why I suggested [shop, florist, watches]. The shop part is useful for a data consumer, even if it doesn't care about or doesn't recognize florist,watches.
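A small Python sketch of that data consumer, assuming a hypothetical `KNOWN_VALUES` set and `recognised` helper (neither exists in any real tool):

```python
# Values this hypothetical consumer understands; everything else is ignored.
KNOWN_VALUES = {"building", "shop"}


def recognised(values):
    """Return the value-only tags the consumer understands, in their original order."""
    return [v for v in values if v in KNOWN_VALUES]


gazebo = recognised(["building", "gazebo"])             # ['building']
flower_shop = recognised(["shop", "florist", "watches"])  # ['shop']
plain = recognised(["building"])                        # ['building']
```

The gazebo and the generic building end up treated the same way, which is the generalisation that building=* gives today.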

Comment from SwiftFast on 16 June 2017 at 16:25

I agree that the current system is probably good enough :)

Comment from Tordanik on 16 June 2017 at 16:59

I also tend to find the "artificial buckets" annoying. Not only do they make tags harder to remember and cause discussions about which bucket to put something in almost every time we invent a new tag, they can also lead to misunderstandings about the meanings of a tag. Particularly infamous in this regard is the "natural" key, which has a tendency to make people assume that they are only allowed to tag "natural water" or "natural peaks" with those tags.

In theory, it would be possible to get rid of the problem without an API change by introducing a special key such as "feature". The values of that key would then be the feature's type such as "park" or "farmyard". Replace the arbitrary small buckets with a single big bucket. But it would still be a massive change that is unlikely to ever happen, sadly.

Comment from SwiftFast on 16 June 2017 at 19:11

@Tordanik, that would be a problem, because a single bucket allows a single value, and sometimes you need more. Think: leisure=park, surface=grass.

@rorym alternatively, one might argue that "building" is not an "artificial bucket", so the traditional key-value system would still be used for buildings.

Comment from SwiftFast on 16 June 2017 at 19:15

I like rorym's hierarchy system.

Comment from Warin61 on 17 June 2017 at 05:16

Nice idea. Unfortunately it would still need buckets for properties like name, elevation, width, height, access.

Then you would need more buckets to separate things like post codes from council boundaries, or country boundaries from state boundaries.

Then you would need to be able to identify what the value 'trees' might mean: is this just a ground cover, or is it a forestry land use?

While the present OSM bucket brigade is full of contradictions, most of them created by the never-ending addition of both keys and values, hopefully OSM will evolve to resolve them.
