OpenStreetMap

OpenStreetMap active users

Posted by pnorman on 7 January 2016 in English.

This is a repost from my blog

Periodically, people claim that OpenStreetMap has over 2 million active users, but what does this mean? That figure is the total number of accounts, including those who never edited, those who left long ago, spammers, and actual active contributors.

The closest metric to a standard is active users over the last 30 days. Although we can’t get that number directly, we can approximate it by looking at the changeset dump and analyzing it with ChangesetMD and some SQL.

The SQL is fairly simple.

SELECT COUNT(DISTINCT user_id) AS active_users, date::date
  FROM osm_changeset
  JOIN generate_series('2007-01-01'::timestamp, '2015-12-31'::timestamp, '1 day') AS d(date)
    ON (created_at <= d.date AND created_at > d.date - '30 days'::interval)
  GROUP BY date
  ORDER BY date ASC;

OpenStreetMap was around before 2007, but the way data was stored was different so changeset dumps aren’t reliable that far back.
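The join condition in the query amounts to a sliding 30-day window that ends on each day of the series. As a rough illustration only, the same count could be sketched in Python, assuming the changesets are already in memory as (user_id, created_at) pairs. That layout and the function name `active_users_by_day` are hypothetical and not part of ChangesetMD.

```python
from datetime import date, datetime, timedelta


def active_users_by_day(changesets, start, end, window_days=30):
    """Count distinct users with a changeset in the window ending on each day.

    Mirrors the query's join condition:
    created_at <= day AND created_at > day - window.
    Unlike the SQL inner join, days with no matching changesets
    are kept here with a count of 0 instead of being dropped.
    """
    window = timedelta(days=window_days)
    counts = {}
    day = start
    while day <= end:
        # Set comprehension gives the DISTINCT user_id behaviour.
        users = {
            user_id
            for user_id, created_at in changesets
            if day - window < created_at <= day
        }
        counts[day.date()] = len(users)
        day += timedelta(days=1)
    return counts


# Made-up sample data: (user_id, created_at)
sample = [
    (1, datetime(2015, 1, 1, 12, 0)),
    (1, datetime(2015, 1, 20, 9, 30)),
    (2, datetime(2015, 1, 15, 18, 0)),
    (3, datetime(2015, 2, 10, 8, 0)),
]
counts = active_users_by_day(sample, datetime(2015, 1, 31), datetime(2015, 2, 20))
for day, n in sorted(counts.items()):
    print(day, n)
```

This version rescans every changeset for every day, which is fine for a toy example but far too slow for the full changeset dump. The database query is the practical route.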

Taking the resulting file and a bit of gnuplot magic gives us a graph.

[Figure: OpenStreetMap active contributors]

There’s been a steady upward trend with strong seasonal variation. This makes sense, since mapping is an outdoor activity.

Why does this matter? Besides accuracy, it’s important to be using a meaningful number when looking at growth.

Another reason is comparison with other membership numbers. We want the OpenStreetMap Foundation to be representative of OSM contributors, but we need to measure that against a realistic number. The OSMF has 700 members, or 2-3% of active OSM contributors. This could use improvement, but is in the normal range for foundations. If we incorrectly measured against the total number of accounts, we’d get about 0.03%, an absurdly wrong number.
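The arithmetic behind those two percentages can be checked with a short Python sketch. The 700 members and the 2 million accounts come from the post. The 25,000 active contributors is an assumption, taken from the rough 30-day figure quoted in the discussion, so the results are approximate.

```python
def share_percent(members, population):
    """Return members as a percentage of population."""
    return 100.0 * members / population


osmf_members = 700          # from the post
total_accounts = 2_000_000  # "over 2 million" accounts, rounded down
active_30d = 25_000         # assumption: rough 30-day figure from the discussion

print(f"vs. active contributors: {share_percent(osmf_members, active_30d):.1f}%")
print(f"vs. all accounts: {share_percent(osmf_members, total_accounts):.3f}%")
```

With these inputs the first ratio falls inside the 2-3% range, while the second is nearly two orders of magnitude smaller, which is why the choice of denominator matters.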

Discussion

Comment from BushmanK on 7 January 2016 at 03:46

Thank you for another valuable bit of statistics. However, since you’ve got your hands on this sort of data, could you go a bit further and plot some additional graphs:

  • a breakdown of a single year (2015, for example);
  • the accumulated number of people who created an account but didn’t contribute anything within the following several months (it’s hard to guess how many months it makes sense to wait before assuming that an account was abandoned immediately after it was created).

It would also be interesting to know how many users had no edits at all.

Comment from imagico on 7 January 2016 at 09:26

That is in line with the Active contributors per month on:

http://wiki.openstreetmap.org/wiki/Stats#Contributor_Stats

and No. of active members last 30 days on:

http://osmstats.neis-one.org/

In principle, although the choice of a 30-day window is perfectly reasonable, as you have to pick one solution, it would be nice to have a more detailed spectrum of the contributor activity w.r.t. frequency, such as one week, 30 days, three months, six months, and a year. The data we have here right now is:

  • one week: ~10k
  • 30 days: ~25k
  • year: ~150k with ~50k recurring

which indicates a significant number of users who contribute regularly but less often than monthly.

Ideally such stats should also include users who contributed only in changeset discussions and with notes - it would be especially interesting to know if these are primarily a domain of otherwise active mappers or if there is a distinct group of users who primarily engage in discussion only and do not perform edits themselves.

Comment from SimonPoole on 7 January 2016 at 19:59

Paul, did you remove zero-edit changesets? Up to at least March 2011 they are quite a significant distorting factor; see http://www.openstreetmap.org/user/SimonPoole/diary/23352.

Comment from pnorman on 8 January 2016 at 00:29

Ideally such stats should also include users who contributed only in changeset discussions and with notes - it would be especially interesting to know if these are primarily a domain of otherwise active mappers or if there is a distinct group of users who primarily engage in discussion only and do not perform edits themselves.

I’ve got the discussion data in the same DB, so it would be easy to run. My impression is that discussions won’t make a big difference, but there are more note-only contributors. Unfortunately, I don’t have notes in a PostgreSQL DB so can’t check this.

Paul did you remove zero-edit changesets?

No. I’d have actually preferred to generate a graph based on user logins or other activity, but for obvious reasons this isn’t possible.

In principle, although the choice of a 30-day window is perfectly reasonable, as you have to pick one solution, it would be nice to have a more detailed spectrum of the contributor activity w.r.t. frequency
