Statistical data of the Dutch OSM mappers.

Posted by marczoutendijk on 5 February 2016 in English (English)

To improve the commitment of new mappers and to help them overcome the obvious beginner's problems when they start to map, the Dutch community (after discussion in the user forum) started to welcome new mappers as soon as they had made their first edit (in the Netherlands) on the map. To find out who the new mappers were, I used this RSS feed, provided by Pascal Neis.
This welcome program started on the 1st of August of 2015 and continues to this day. It is run by me and as such is a one-man task.
During this process I became curious about the mapping behaviour of the mappers and started to collect some data about their activity:

  • when did they start their user account?
  • when did they start to map?
  • how many edits did they do?
  • and much more

Soon I realized that I needed more data (over a longer time span) to get a better insight and so I contacted Pascal Neis and asked him to provide me with the relevant data, dating from some years back. After some startup problems with the data - not all the mappers seemed to be present in the data - I started my research with a dataset that contained the following data:

  • userID
  • username
  • date of registration
  • date of first edit in the Netherlands
  • date of their latest edit
  • number of changesets

First results

The dataset I have used for my research contained 3205 mappers that have done a first edit in the Netherlands between 1-1-2014 and 29-1-2016.

On first inspection of the data, it surprised me to see that some mappers did their first edit 7 years after they had created an account! This, then, was the first thing to investigate: how many days (after registration) pass before the first changeset is created?
Next I investigated how many days passed before the mapper did his latest (and very often his last) edit.
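
A minimal sketch of how these two measures could be computed from the dataset fields listed above. The records are made-up examples, not rows from the real data, and measuring the second interval from the first edit (rather than from registration) is my assumption.

```python
from datetime import date

# Hypothetical records: (user_id, registered, first_edit, latest_edit).
# These are illustrative values, not rows from the real dataset.
mappers = [
    (1, date(2014, 3, 1), date(2014, 3, 1), date(2014, 3, 1)),
    (2, date(2008, 5, 10), date(2015, 6, 2), date(2015, 6, 2)),
    (3, date(2014, 7, 4), date(2014, 7, 20), date(2016, 1, 15)),
]


def days_until_first_edit(registered, first_edit):
    """Days between creating the account and the first changeset."""
    return (first_edit - registered).days


def active_span_days(first_edit, latest_edit):
    """Days between the first and the latest edit (0 means a single mapping day)."""
    return (latest_edit - first_edit).days


delays = [days_until_first_edit(r, f) for _, r, f, _ in mappers]
spans = [active_span_days(f, l) for _, _, f, l in mappers]

# Share of mappers who started mapping on the day they registered.
immediate_share = sum(d == 0 for d in delays) / len(delays)

# Mappers whose first edit is (so far) also their last.
hit_and_run = [uid for (uid, _, f, l) in mappers if f == l]
```

A span of 0 days only says that the first and latest edit fall on the same day; whether that is really the last edit cannot be known from the data.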

We see that most mappers (77%) create an account and start to map immediately, but 4% of the mappers waited more than 3 years before they did a first edit. It is striking that for most of those mappers (68%) this first edit is also their last! These are the so-called “hit-and-run” mappers.
“Last edit” is of course hard to determine, because they might return some day in the future and do another edit, but experience so far doesn’t suggest that.
Of course it is difficult to draw conclusions from a rather small dataset, but it nevertheless seems not too far from the truth to conclude that OSM mapping is basically in the hands of a small group of dedicated people.

When are you a regular mapper?

If I look at my own status, I have got the label “a crazy mapper”, whatever that means; once every three days (on average) I’m mapping: adding new things, fixing errors, searching for errors, etc. But even if you add/change/correct things once every three months, you’re a regular mapper.
The number of days since your last edit is a good measure of your status. See the next table:

This table shows that 138 mappers (4%) did edit something but did not return for a period of more than 730 days (2 years) after this edit. This is the maximum my dataset can reveal (because it spans 2 years and one month), and it is possible that some of those mappers will return in the future, but it is not very likely.
1411 mappers (44%) did their latest edit more than 1 year ago and still another 25% of the mappers did not return to mapping within (at least) 6 months.
One might say that for the majority of the mappers it is a one-time-only affair: probably fixing something in their own area (missing names, shops, houses, etc.) and then never returning.
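
As a rough sketch, the grouping by days since the latest edit could be done like this. The snapshot date is taken from the end of the dataset period; the bucket names, the 182-day cut-off for "6 months" and the choice of non-overlapping buckets are my assumptions, and the example dates are invented.

```python
from collections import Counter
from datetime import date

# End of the period covered by the dataset (29-1-2016).
SNAPSHOT = date(2016, 1, 29)


def inactivity_bucket(latest_edit, snapshot=SNAPSHOT):
    """Classify a mapper by the number of days since the latest edit.

    The thresholds follow the periods mentioned in the text; the labels
    and the non-overlapping layout are assumptions for illustration.
    """
    days = (snapshot - latest_edit).days
    if days > 730:
        return "more than 2 years"
    if days > 365:
        return "1 to 2 years"
    if days > 182:
        return "6 months to 1 year"
    return "6 months or less"


# Invented latest-edit dates, one per mapper.
latest_edits = [
    date(2014, 1, 1),
    date(2015, 1, 1),
    date(2015, 7, 1),
    date(2016, 1, 1),
    date(2016, 1, 20),
]

counts = Counter(inactivity_bucket(d) for d in latest_edits)
```

Because the dataset spans just over two years, the "more than 2 years" bucket can only contain mappers whose first edit fell in the first weeks of 2014.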

A good measure of your mapping activity is the number of changesets you have done, and that is what is in the next table.
(Showing # changesets, # mappers in numbers and as %, sum of group left to it.)

This table shows:
1225 mappers created 1 changeset
10 mappers created (each!) more than 1000 changesets

And 82% of the mappers created between 1 and 9 changesets. From the graph it is obvious that this is almost a perfect example of an exponential curve.
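
A small sketch of how mappers could be grouped by number of changesets, and how the percentages follow from the counts. The group boundaries are an assumption chosen to match the figures mentioned above (the text says "more than 1000", so where exactly 1000 belongs is a guess), and the list of changeset counts is invented. Only 1225 and 3205 come from the dataset.

```python
from bisect import bisect_right
from collections import Counter

# Lower edge of each group; the edges are an assumption for illustration.
EDGES = [1, 2, 10, 100, 1000]
LABELS = ["1", "2-9", "10-99", "100-999", "1000+"]


def changeset_group(n):
    """Return the label of the group that a changeset count falls into."""
    if n < 1:
        raise ValueError("every mapper in the dataset has at least one changeset")
    return LABELS[bisect_right(EDGES, n) - 1]


def percent(part, total):
    """Share of the total as a whole percentage."""
    return round(100 * part / total)


# Invented changeset counts, one per mapper.
changeset_counts = [1, 1, 3, 9, 10, 250, 1500]
groups = Counter(changeset_group(n) for n in changeset_counts)

# From the figures in the text: 1225 of the 3205 mappers made one changeset.
single_share = percent(1225, 3205)
```

With the numbers from the text, the mappers with a single changeset make up about 38% of the 3205 mappers in the dataset.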


In the Netherlands we have an active (albeit small) community of mappers, and there is no indication that this community is different (statistically) from the complete set (2 000 000+) of OSM mappers (see links below). It is also clear, however, that the results we get from the different datasets are not always easy to understand. Only after at least one more year might we get results that show whether the welcome program we run in the Netherlands has improved the participation of the Dutch mappers!

some useful links:

Comment from SimonPoole on 5 February 2016 at 21:42

There are regularly mappers that return after numerous years of no activity (numerous as in more than half a decade). Obviously not a large number, but then OSM was a -lot- smaller then.

Comment from Harry Wood on 5 February 2016 at 23:52

There’s that “Long tail” shape again. I did a whole talk about this at SOTM2014

The welcome program is a great idea. Maybe the effect on the stats would be to “help users up the curve” thereby making the long tail and the elite spike less spiky. Or maybe not. It might also encourage more users to come along, making it longer. Either way, giving new users a good welcome will help build an OpenStreetMap community. It’s only logical.

There was some talk of having an OSMF working group dedicated to the idea, but I don’t think it was established in the end.

I wonder whether the new changeset discussions feature might be useful for welcoming people. This is a strange idea really, because just sending an OSM message is more appropriate, but… changeset discussions are public, which changes the dynamic somewhat. What if we establish a convention of welcoming somebody on their 1st changeset? Then we can all collaborate on making sure new users get welcomed in a friendly way without accidentally duplicating messages and effort.

Comment from Glassman on 6 February 2016 at 03:20

Great research.

My welcome program initially targeted just mappers in an area that could likely attend an OSM Mapping Party. Starting this year, I’ve expanded it to state wide. The next step is to create a survey for new users. A survey would give us their perception against the hard data that you reported.

At some point we need to find a better method of welcoming new users.


Comment from joost schouppe on 6 February 2016 at 14:53

When you do try to make a statistical comparison between people who did and did not get the message, be sure to use an appropriate statistical technique. Most simple would be to just compare something like number of people with a changeset between six months and a year after their first changeset (making sure this amount of time has passed for all cases studied). That way, you obviously lose a lot of cases that are still too young. The alternative is to use survival analysis. There even is a Kaplan Meier toolkit for Excel.
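
For illustration only, a minimal pure-Python sketch of the Kaplan-Meier estimator. The durations and the rule for deciding that a mapper has "stopped" (as opposed to being censored, meaning still active when the data ends) are hypothetical and would have to be defined for the real comparison.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival estimate.

    durations: days from first edit until the mapper stopped, or until the
               end of the data for mappers who are still active.
    observed:  True if the mapper stopped (event), False if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    pairs = sorted(zip(durations, observed))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        events = 0
        removed = 0
        # Handle all mappers sharing the same duration together.
        while i < len(pairs) and pairs[i][0] == t:
            events += 1 if pairs[i][1] else 0
            removed += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Both stopped and censored mappers leave the risk set after t.
        at_risk -= removed
    return curve


# Hypothetical example: five mappers, two of them still active (censored).
durations = [30, 60, 60, 90, 120]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
```

Running this separately for the welcomed and the non-welcomed group gives two curves that can be compared without throwing away the mappers who started only recently.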

Re the surveying idea: could be very interesting, but the response rate will likely be very very small. Response is often a function of interest in the subject, and our study group are people with proven disinterest.

In the Belgian new contributor welcoming thing, we also register what their first edit was. Might be interesting over time to see if that has any predictive value.

I don’t know if we should automate this kind of thing too much. I believe the value of a message like this has much to do with how personal it is. What would be nice over time is to have a central way to register who is getting welcoming messages and who isn’t. That way, people watching their own area of interest would not duplicate messages already sent by someone watching a larger area. And it would be a nice tool to highlight areas that - aren’t - getting messages.

Comment from Harry Wood on 18 February 2016 at 15:45

‘central way to register’ or the welcome messages could just be public. Hence my suggestion to welcome people on their first changeset (use public changeset discussions)
