OpenStreetMap

History of all Tags

Posted by tyr_asd on 31 August 2016 in English (English)

TL;DR: head over to http://taghistory.raifer.tech/ for usage graphs of arbitrary OSM tags over time (by number of OSM objects).

In OpenStreetMap, tags define what an object is. Whether it is a mountain, a river, a house, or a postbox: every map feature has its own tag (or set of tags).

OSM doesn't have a fixed set of object categories. Over time, an increasingly faceted and diverse set of features got mapped in OSM, and so the number of different tags grew. At the same time, the tagging of a specific thing sometimes changes: features that used to be mapped with one tag get newer, better and more refined tags. That's OpenStreetMap evolving.

Of course, OpenStreetMap is also still growing, but not all the tags are getting more widely used at the same pace: For example, while it's quite possible that most of the world's railway stations are already mapped in OSM, there are still many juicy pastures left to be mapped out there.

a friendly goat

While there are superb tools for exploring the current state of all tags used in OSM (Taginfo most notably, but also the Overpass API to some extent), until now it was quite difficult to get a good picture of how the data evolved. For example, questions like when a specific tag first came into use, when an obsolete tag got replaced by a different one, or which tags have gained traction lately are difficult to answer with OSM's current tool set.

For some of these questions, people have programmed their own solutions, each answering one specific question, like how many kilometres of Italy's roads were in OSM over time (link), or how many buildings have been mapped in Austria (link). Similarly, the OSM-Analytics platform has recently started to provide such statistics for arbitrary regions for a limited set of map features (currently one can choose between buildings and roads, but there are plans to add more in the near future). What all of those tools have in common is that they can't handle the full variety of tags that's so essential in OSM.

To fill the gap between tools like taginfo (where the full variety of OSM's tags is so beautifully visible – stay tuned for Jochen's talk at SOTM in a couple of weeks!) and the more specialized tools like osm-analytics, I've created taghistory, which lets you get a historical usage graph for each of OSM's tags (with daily granularity) and compare different tags against each other:

highway=ford vs. ford=yes

The tool is currently at a very early stage; there are many things to do and improvements to be made. It's also important to note that the historical usage of a tag is currently defined only as the respective number (count) of OSM objects! As with the statistics produced by taginfo, this metric is subject to some limitations, most notably that one cannot directly compare the number of tags used for different linear and polygonal features such as roads, land cover, etc., because such features are typically divided up into many OSM objects of different sizes. For example, an existing road may be split into two pieces when a new turn restriction is added, which increases the count of each of the tags used on the road (even obsolete ones) by one in the OSM database. That means that one needs to pay close attention when comparing tags that are typically used on such features, even when comparing subtags that are typically used on the same kind of parent object (e.g. different values of the highway tag).
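To make the splitting effect concrete, here's a minimal sketch in Python. It is only an illustration: the dictionary layout for a way and the helper names `count_tags` and `split_way` are made up for this example and are not how taghistory or the OSM database work internally.

```python
from collections import Counter


def count_tags(ways):
    """Count how many objects carry each key=value tag (object-count metric)."""
    counts = Counter()
    for way in ways:
        for key, value in way["tags"].items():
            counts[f"{key}={value}"] += 1
    return counts


def split_way(way, at_index):
    """Split a way at the given node index (hypothetical helper).

    Both halves share the split node and keep a full copy of the tags,
    which is what happens when an editor splits a road.
    """
    nodes = way["nodes"]
    first = {"nodes": nodes[: at_index + 1], "tags": dict(way["tags"])}
    second = {"nodes": nodes[at_index:], "tags": dict(way["tags"])}
    return [first, second]


# One road, made up for illustration, carrying a current and an obsolete tag.
road = {
    "nodes": [1, 2, 3, 4, 5],
    "tags": {"highway": "residential", "created_by": "JOSM"},
}

before = count_tags([road])
after = count_tags(split_way(road, 2))

print(before)
print(after)
```

Nothing about the road itself changed, yet every tag on it is now counted twice, including the obsolete created_by tag that nobody touched.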

That being said, have lots of fun digging into the depths of OSM tags' history. Here's the link to the tool again: http://taghistory.raifer.tech/ (and the link to the project's source code repository and issue tracker: https://github.com/tyrasd/taghistory). What's your favourite tag? I find the created_by graph quite interesting:

history of the usage of the created_by tag

Comment from mvexel on 31 August 2016 at 21:33

Very cool!! Interesting to explore the data that way. It's fun to try and recreate what happened, for example here:

centerturnlane

My guess:

* People started mapping ways with center_turn_lane=yes
* Someone decided that those tags needed to go away and wrote a bot
* Mappers decided to use it anyway :)

Oh but wait:

center-centre

New guess:

* Someone decided that the correct spelling was centre_turn_lane and wrote a bot to rename the tags
* Mappers decided to use center_turn_lane anyway :)


Comment from tyr_asd on 1 September 2016 at 06:58

Math1985 has more interesting examples on his osm diary page: http://www.openstreetmap.org/user/Math1985/diary/39404


Comment from Alecs01 on 1 September 2016 at 19:00

Excellent tool, thanks!


Comment from d1g on 10 September 2016 at 14:43

tyr, I had the idea to include Google results for an "amenity=public_building" query

e.g.

Full timeline:

2006-03-24 wiki: amenity=public_building added to map features

2007-10-16 JOSM: amenity=public_building added

...

2015-11-15 JOSM: office=administrative, office=government added

2016-03-02 wiki: office=administrative and amenity=public_building deprecated

2016-04-01 JOSM: amenity=public_building dropped and deprecation warning added

These should be drawn as vertical lines with numbers, where every number is linked to an external resource, to see if there were any mistakes during the discussion.

We (or Math1985) shouldn't have to fiddle with the wiki or any other resource to see when a tag was added/removed/mentioned for the first time.

We definitely need such a tool.


Comment from d1g on 10 September 2016 at 15:08

For example, "payment:troika" was discussed deep in the public transport thread http://forum.openstreetmap.org/viewtopic.php?pid=596732#p596732

http://taginfo.openstreetmap.org/keys/payment%3Atroika#overview

There are no discussions of it on the tagging list, on the wiki, or anywhere else in OSM.


Comment from d1g on 10 September 2016 at 15:25

... but wait, if you search more, "payment:troika" is used in OsmAnd already https://github.com/osmandapp/OsmAnd-resources/commit/996c7727287ebadbea0919d83be4a2d4fa8adccc#diff-bc091b281dee9cb9288fad5990fe5538

and was discussed in some more minor discussions at other channels


Comment from GRUBERND on 15 September 2016 at 19:04

lovely tool. i guess you are using stats from a database analysis. how about counting the nodes associated to way/polygon objects instead of the objects themselves? this would totally eliminate the statistical jumps through splitting, merging and other operations.


Comment from tyr_asd on 15 September 2016 at 21:11

Funny idea, that could indeed partially improve the issue with split ways. Still, a proper solution would have to track the actual length and/or area of the respective objects.


Comment from Jojo4u on 30 September 2016 at 12:24

Under which licence do the generated charts stand?


Comment from tyr_asd on 2 October 2016 at 10:05

@Jojo4u: You're free to do everything you want with the generated charts as long as you comply with ODbL's minimal requirement for produced works, i.e. citing OSM as the data source. A link back to this blog article and/or the website taghistory.raifer.tech is very much appreciated, though. :)


Comment from joost schouppe on 9 November 2016 at 14:25

Would it be hard to implement permalinking to the charts one makes?


Comment from tyr_asd on 10 November 2016 at 17:14

@Joost, probably not too hard. There's already a ticket on github for that, where any progress will be documented: https://github.com/tyrasd/taghistory/issues/6


Comment from Polarbear on 16 December 2016 at 21:30

How often is the demo site http://taghistory.raifer.tech/ updated? It seems to be stuck at some time in October or so?


Comment from tyr_asd on 17 December 2016 at 22:48

Sorry, there are currently no updates! :( I've been looking into doing updates via Overpass' augmented diffs, but I've run into some issues which need to be resolved upstream before it can work (see links in https://github.com/tyrasd/taghistory/issues/10). The alternative of reprocessing the history dump every week or month is currently also not really an option because of my limited computing resources.

