OpenStreetMap

Contributor numbers revisited and empty changesets galore

Posted by SimonPoole on 20 July 2014 in English (English)

Some of you may have seen my past blog posts on contributor numbers (May 2013, December 2013), which looked at our contributor growth based on analysing the regular changeset dumps.

While I was fairly sure that the numbers were correct, I knew that I hadn't taken one potential systematic problem into account: users that had signed up and produced only empty changesets. It could be argued that such users at least tried to contribute and should be counted, but opinions may differ on that. It was clear that any effect on current trends would be minimal: all modern editors will typically only allow you to save if you have actually changed something, implying that an empty changeset can only happen in an error situation (for example an editing conflict, or a crash of the application).

Here are the corrected graphs:

Removing the empty changesets lowers the overall number by roughly 25'000 users; however, as expected, it does not affect the general trend. At the end of June 2014 we had an accumulated total of 421'701 contributors, increasing at a rate of around 8'000 per month.

Compared to the previous ones, the following graphs show that most of the effect of the change is in 2009, which as we know is the year changesets were introduced, so some issues there are not entirely unexpected.

By mid-2014 we have already had more than 90'000 active contributors, so we can expect a substantial increase in the year total if the trend continues.

Back to the empty changesets. This is when, and by how much, removing them changed the numbers:

and the same relative to the new contributors per month:

(Note: the above is the difference between the old numbers, which included zero-edit changesets, and the new ones. In some cases this simply caused users to be counted one month later, which explains the single negative month.)
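To make the counting rule concrete, here is a minimal sketch of how the old and new monthly numbers could be derived and compared. It is not the script used for the graphs; it assumes the changeset dump has already been reduced to hypothetical `(uid, created_at, num_changes)` records, and the function name and sample data are made up for illustration.

```python
from collections import Counter

def new_contributors_per_month(changesets, skip_empty=True):
    """Count each user once, in the month of their first counted changeset.

    changesets: iterable of (uid, created_at, num_changes) records, where
    created_at is an ISO 8601 timestamp string (assumed record layout).
    """
    first_month = {}
    for uid, created_at, num_changes in changesets:
        if skip_empty and num_changes == 0:
            continue  # ignore changesets that contain no edits
        month = created_at[:7]  # "YYYY-MM"
        if uid not in first_month or month < first_month[uid]:
            first_month[uid] = month
    return Counter(first_month.values())

# Made-up sample data, purely illustrative.
sample = [
    (1, "2009-05-03T10:00:00Z", 0),   # empty first attempt ...
    (1, "2009-06-01T09:00:00Z", 12),  # ... first real edit a month later
    (2, "2009-05-10T12:00:00Z", 0),   # only ever produced an empty changeset
    (3, "2009-05-20T08:00:00Z", 4),   # normal first edit
]

old = new_contributors_per_month(sample, skip_empty=False)
new = new_contributors_per_month(sample, skip_empty=True)
difference = {m: old[m] - new[m] for m in sorted(set(old) | set(new))}
print(difference)
```

In this toy data user 1 moves from May to June once the empty changeset is ignored, so the difference for June is negative, which is the same effect as the single negative month in the graph.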

As already mentioned, all changesets before May 2009 were generated when changesets were introduced. I haven't been able to determine what caused the large numbers from May to November 2009; however, the changesets in question do not have a created_by tag and I suspect that they were created after the fact, by some kind of mechanical process. Nobody that I asked can remember; maybe a reader can shed some light on this.

After November 2009 most of the empty changesets were created by Potlatch 1, and a well-founded suspicion is that they were caused by P1 live mode. This continues up to April 2011, when Potlatch 2 was made the default editor; since then the absolute numbers have remained stable in the range of 100-200 users per month.

Now it is important to remind ourselves what we are looking at: the numbers are new users that signed up, tried to edit, failed and never had a successful edit after that.

In other words, users that wanted to participate that we lost (the total number of empty changesets is far higher, however for now I'll assume that regular contributors are more tolerant of things going wrong than first-timers).

Done deeds are done deeds and there is nothing we can do about users that we have already lost; what we can do, however, is try to improve the situation going forward.

I've produced some numbers on which editors the users were using and nearly all (yes, including JOSM) turn up. However, the majority of the users affected are using iD (which is not a surprise given that it is the default editor) and, big surprise, "Go Map!" at nearly the same level. Given that "Go Map!" has a far smaller user base than iD (editor stats), this indicates that there may be a real problem with the app (this has in the meantime been resolved). Naturally, given that iD is what potential new contributors usually use, further investigation is warranted there too.
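A per-editor breakdown of this kind could be sketched roughly as follows. This is not the code behind the numbers above: the `(uid, created_by, num_changes)` record layout, the prefix list in `KNOWN_EDITORS` and the sample values are all assumptions for illustration, and real created_by strings vary more than this.

```python
from collections import Counter, defaultdict

# Hypothetical prefixes used to group created_by strings into editors.
KNOWN_EDITORS = ("iD", "JOSM", "Potlatch 1", "Potlatch 2", "Go Map!!")

def editor_family(created_by):
    """Map a raw created_by value to a coarse editor name."""
    if not created_by:
        return "(no created_by)"
    for name in KNOWN_EDITORS:
        if created_by.startswith(name):
            return name
    return "other"

def editors_of_lost_users(changesets):
    """Tally editors for users whose changesets are all empty.

    changesets: iterable of (uid, created_by, num_changes) in time order.
    Each such user is counted once, under the editor of their first changeset.
    """
    by_user = defaultdict(list)
    for uid, created_by, num_changes in changesets:
        by_user[uid].append((created_by, num_changes))
    tally = Counter()
    for rows in by_user.values():
        if all(n == 0 for _, n in rows):
            tally[editor_family(rows[0][0])] += 1
    return tally

# Made-up sample data, purely illustrative.
sample_rows = [
    (1, "iD 1.4.0", 0),
    (1, "iD 1.4.0", 0),
    (2, "Go Map!! 1.0", 0),
    (3, "JOSM/1.5 (7000 en)", 5),  # has a real edit, so not counted
    (4, None, 0),
]
tally = editors_of_lost_users(sample_rows)
print(tally)
```

To compare editors fairly, such a tally would still need to be set against each editor's total user base, which is the point made above about "Go Map!" and iD.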

Comment from mcld on 20 July 2014 at 14:14

Empty changesets are possible?? Gosh that surprises me.

This must add bloat to the database and to various dumps without any benefit, surely? Should we consider perhaps that the server should return an error when something attempts to record an empty changeset?


Comment from SimonPoole on 20 July 2014 at 14:55

The bloat aspect is negligible, even though currently more than 10% of all changesets created are empty.

From an API point of view, while the empty changesets could be discarded, they can't be completely stopped from happening. The way to think of changesets is more: "setting a start marker", "adding edits", "setting a stop marker".
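As a rough illustration of that three-step model, here is a toy in-memory sketch. It is not the real server or a client for it; the class and method names are hypothetical, and the only link to the actual API 0.6 is that the three steps correspond to its changeset create, upload and close calls.

```python
class ChangesetStore:
    """Toy in-memory stand-in for the server side (hypothetical, not the OSM API)."""

    def __init__(self):
        self._next_id = 1
        self.changesets = {}

    def create(self, tags):
        """Set the start marker: open a changeset that has no edits yet."""
        cid = self._next_id
        self._next_id += 1
        self.changesets[cid] = {"tags": dict(tags), "num_changes": 0, "open": True}
        return cid

    def upload(self, cid, changes):
        """Add edits to an open changeset."""
        cs = self.changesets[cid]
        if not cs["open"]:
            raise ValueError("changeset already closed")
        cs["num_changes"] += len(changes)

    def close(self, cid):
        """Set the stop marker."""
        self.changesets[cid]["open"] = False

    def empty_changesets(self):
        """Closed changesets that never received an edit."""
        return [cid for cid, cs in self.changesets.items()
                if not cs["open"] and cs["num_changes"] == 0]

store = ChangesetStore()

ok = store.create({"created_by": "example editor"})
store.upload(ok, ["node 1", "way 2"])
store.close(ok)

# The editor opens a changeset, but the upload never happens
# (conflict, crash, ...), so only the two markers remain.
failed = store.create({"created_by": "example editor"})
store.close(failed)

print(store.empty_changesets())
```

Because the start marker is set before any edit arrives, the server cannot know at creation time whether a changeset will end up empty; it could only clean up afterwards.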


Comment from bryceco on 20 July 2014 at 19:27

Go Map!! silently creates an empty changeset during first-time password validation. It is invisible to the user and doesn't imply that the user was attempting to edit and failed. (This "feature" will be fixed in the next release.)


Comment from davespod on 21 July 2014 at 10:59

Seems to me there is a good news story here. The massive reduction in the number of "unsuccessful users" in April 2011 appears to coincide with Potlatch 2 becoming the default editor. I am willing to bet this is causation, rather than just correlation. It looks like a massive step forward was made in terms of solving this problem with Potlatch 2.

iD became the default editor in August last year, and apart from a very brief blip upwards, it looks like the number stayed pretty much the same. I suspect, though, if we looked at some other metrics (perhaps number of users coming back to edit again?), we would see indications that new users do get on better with iD.

There do seem to be indications that a permanent drop in the number of "unsuccessful users" took place near the beginning of this year (albeit not dramatic). I wonder what caused that. Were there any significant changes to iD at that point?


Comment from SOSM on 21 July 2014 at 13:32

@davespod

I haven't been following iD development all too closely, but an OSM editor is one of those things that tend to mature fairly fast in real use. It is clearly not out of the question that there have been significant improvements in iD itself that are relevant to this discussion (there have naturally been a lot of improvements in general), but it could just as well be improved browsers.

If I get around to it, I'll produce some numbers on all empty changesets vs. editors given that any issues will affect all users and not just first timers.

But as you say, it is good news that the issues have been dropping.

