The return of the OSM rank table

Posted by bdiscoe on 3 March 2018 in English (English).

To follow up on my previous post, I did some further work on generating a table of OSM node/way ranks and putting it online.

The data that’s there right now is from today (2018-03-01) and the deltas are vs. 2 weeks ago (2018-02-12).

Standard disclaimer: last-modified rank is only vaguely related to contribution. There is no way at all to measure the actual quality or value of contributions across users, because it’s subjective and users are very different from each other. However, this table can be very useful for an individual mapper to see how their amount of contribution changes over time, and to identify, for example, accounts that are moving up rapidly, which usually indicates they are doing an import. Similarly, if your rank moves down, it can mean that someone (correctly or not) has modified or deleted your mapping work.

For those curious about the technical mess that’s currently involved, here is what I did:

  1. Download the weekly planet file from Planet OSM (39 GB); this takes around 12 hours.
  2. Run a small Linux C++ app that uses Osmium to parse the PBF and generate a CSV of users along with the number of nodes and ways of which they are the last modifier.
  3. On Windows, run SQLiteStudio to ingest that CSV as a table in a database.
  4. Run a C++ app that uses SQLite to query the database and generate the HTML output.
  5. FTP that HTML up to a server.
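
For anyone who wants to experiment with the middle of that pipeline without the C++ tooling, here is a rough Python sketch of steps 2 to 4 using only the standard library. It is not the code behind the table. The function names, the CSV columns (user, nodes, ways), the table names "current" and "previous", and the choice to rank by node count are all assumptions made for illustration, and the planet parsing is replaced by a plain iterable of (kind, user) pairs standing in for the latest version of each object.

```python
import csv
import html
import io
import sqlite3
from collections import Counter


def count_last_modifiers(objects):
    """Count nodes and ways per last-modifying user.

    `objects` is a hypothetical stand-in for the parsed planet file: an
    iterable of (kind, user) pairs, one per object, kind "node" or "way".
    """
    nodes, ways = Counter(), Counter()
    for kind, user in objects:
        if kind == "node":
            nodes[user] += 1
        elif kind == "way":
            ways[user] += 1
    return [(u, nodes[u], ways[u]) for u in sorted(set(nodes) | set(ways))]


def write_csv(rows, fileobj):
    """Write the per-user counts with an assumed header row."""
    writer = csv.writer(fileobj)
    writer.writerow(["user", "nodes", "ways"])
    writer.writerows(rows)


def load_csv(conn, fileobj, table):
    """Ingest a counts CSV into SQLite (table name must be trusted)."""
    conn.execute(
        f"CREATE TABLE {table} (user TEXT PRIMARY KEY, nodes INTEGER, ways INTEGER)"
    )
    reader = csv.DictReader(fileobj)
    conn.executemany(
        f"INSERT INTO {table} VALUES (?, ?, ?)",
        ((r["user"], int(r["nodes"]), int(r["ways"])) for r in reader),
    )


def ranked(conn, table):
    """Map user -> (rank, nodes, ways); rank 1 has the most nodes (assumed)."""
    cur = conn.execute(
        f"SELECT user, nodes, ways FROM {table} ORDER BY nodes DESC, user"
    )
    return {u: (i, n, w) for i, (u, n, w) in enumerate(cur, start=1)}


def rank_table(conn, current="current", previous="previous"):
    """Rows of (rank, user, nodes, ways, delta); positive delta = moved up."""
    now, before = ranked(conn, current), ranked(conn, previous)
    rows = []
    for user, (rank, nodes, ways) in now.items():
        old = before.get(user)
        delta = None if old is None else old[0] - rank
        rows.append((rank, user, nodes, ways, delta))
    return rows


def to_html(rows):
    """Render the rank rows as a bare HTML table."""
    out = ["<table>",
           "<tr><th>Rank</th><th>User</th><th>Nodes</th><th>Ways</th><th>Change</th></tr>"]
    for rank, user, nodes, ways, delta in rows:
        change = "new" if delta is None else f"{delta:+d}"
        out.append(f"<tr><td>{rank}</td><td>{html.escape(user)}</td>"
                   f"<td>{nodes}</td><td>{ways}</td><td>{change}</td></tr>")
    out.append("</table>")
    return "\n".join(out)
```

To use it, write one CSV per planet snapshot, load the older one as "previous" and the newer one as "current" into the same SQLite connection, then pass the result of rank_table() to to_html(). Users who appear only in the newer snapshot get "new" in the Change column instead of a number.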

Comment from mmd on 4 March 2018 at 17:24

Given how much discussion there was on data privacy issues with regard to Pascal Neis’ HDYC tool (with the result of requiring people to log in and only providing data for one user at a time), it’s really intimidating that you manually analyzed the mapping behavior of each and every single top 2000 contributor (!) and put it up out there in public as a “Notes” column.

I’m pretty sure things like “sadly deceased Nov. 2016” don’t belong in such a list. Also, I find comments like “badly overnoded import of Fukuoka woods” or “very badly inefficient waterways” not that helpful.

Any chance you remove this information altogether?

Comment from Zverik on 4 March 2018 at 19:30

Comments like this make me immediately go and save the webpage. Take it as an opinion table of a user, not as an official OSMF user ranking. To me, such personal touches are what make a table useful.

Comment from bdiscoe on 5 March 2018 at 05:56

Thanks, Zverik.

mmd, I’m not sure what you mean by “not that helpful”. The table is my attempt to understand who is editing, how much, where and when; this is potentially useful for other people too. If we are trying to understand why some accounts have made 24 million nodes, noting that these include geometrically inefficient waterways clarifies the mystery. If we are trying to understand why an account suddenly stopped editing, then noting that the mapper is no longer with us clarifies the mystery. If we are trying to estimate how long it will take to manually clean up the USA TIGER or NHD imports, then the number of nodes/ways last modified by those import accounts gives us a metric of that progress! You may find them “not helpful” to you, but they are certainly helpful to others. It is indeed the result of years of study of publicly-open data by effectively anonymous accounts, so there is nothing “intimidating” about it, surely? There is not even much “opinion” in it, unless you consider a statement like “low-quality imports are bad” to be an “opinion”.

Comment from dieterdreist on 15 March 2018 at 16:56

@mmd, deceased people are not subject to any privacy regulations or directives. There is no privacy for the dead. If there are any privacy concerns with this list, they are certainly not in the comment you cite.
