OpenStreetMap

Will the DWG block us all one day? - Part II

Posted by SunCobalt on 26 May 2018 in English (English)

As suggested in my previous post "Will the DWG block us all one day?", I took a deeper look into it. I scraped all 1934 user blocks available last Friday, together with some user information.

As already indicated, the DWG is blocking more and more users (no judgement intended), probably in line with the increasing interest in OSM and the influx of trolls and paid editors.

In May 2015 a user registered more than 100 accounts with names of public transport stations in Berlin and was blocked. In October 2017 around 60 accounts related to an education session called COMS2200A were blocked. During April 2018, 29 accounts from GlobalLogic India caused trouble and were blocked after a complaint in the osm.org user blogs.

dailyActivita

If the trend continues, 2018 will be another record year of user blocks.

MedianBlocks

On the other hand, the median block duration is at a very low level. (In case someone does not like annualisation, here is a YTD May chart.)
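For readers who want to reproduce this kind of figure, here is a minimal sketch of how a year-to-date count could be annualised and how a median duration per year could be computed. It is not the code used for the charts above: the function names, the choice of hours as the duration unit, the day-based scaling and the sample values are all assumptions made for illustration.

```python
from datetime import date
from statistics import median


def annualise(ytd_count: int, as_of: date) -> float:
    """Scale a year-to-date count to a full-year estimate by elapsed days."""
    start = date(as_of.year, 1, 1)
    end = date(as_of.year + 1, 1, 1)
    elapsed = (as_of - start).days + 1  # include the as_of day itself
    return ytd_count * (end - start).days / elapsed


def median_duration_by_year(blocks):
    """blocks: iterable of (year, duration_in_hours) -> {year: median}."""
    by_year = {}
    for year, hours in blocks:
        by_year.setdefault(year, []).append(hours)
    return {year: median(values) for year, values in sorted(by_year.items())}


# Made-up sample values for illustration only, not the scraped data.
sample = [(2017, 24), (2017, 0), (2017, 96), (2018, 0), (2018, 24)]
print(median_duration_by_year(sample))
print(round(annualise(100, date(2018, 5, 25)), 1))
```

Simple linear scaling like this assumes blocks are spread evenly over the year, which a mass block of many accounts on one day would distort.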

If you look at who is blocked, you can see that the vast majority are newbies. A mapper is considered new if the account age at the block date is less than 6 months. (A mapper who is blocked more than once could fall into both categories.) However, blocks of experienced mappers are also slowly rising on average.

BlocksByType
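The six-month rule can be written down in a few lines. This is only a sketch of the rule as described: the function name is made up, and approximating "6 months" as 183 days is an assumption, since the exact cut-off is not stated.

```python
from datetime import date

NEW_MAPPER_DAYS = 183  # assumption: "6 months" approximated as 183 days


def mapper_type(account_created: date, block_date: date) -> str:
    """Classify a blocked mapper by account age at the date of the block."""
    age_days = (block_date - account_created).days
    return "new" if age_days < NEW_MAPPER_DAYS else "experienced"


# Hypothetical account blocked twice: it lands in both categories.
created = date(2017, 1, 1)
print(mapper_type(created, date(2017, 3, 1)))
print(mapper_type(created, date(2018, 3, 1)))
```

Because the classification is made per block rather than per account, the same mapper can be counted once as new and once as experienced.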

The average block duration (chart in months) seems to be unrelated to the mapper type. The spike in 2014 is mainly caused by the 100-year block of user Sorein. In 2016 emacsen blocked some new users for 10 years, which caused the other spike. Example

BlockDurationType

[irony] Just in case there are some troubles with the DWG (I am looking into it for a friend), you may want to know whom to avoid. [/irony]

DurationbyDWGMember

The chart shows the average block length in months. The bubble size represents the number of blocks made. DWG stands for the OSMF Data Working Group account. That account blocked 6 accounts for 9 years, all belonging to one criminal case. The other outlier is emacsen, who, as mentioned earlier, blocked someone forever. So let's leave these outliers aside.
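The two numbers behind each bubble (average block length and number of blocks) could be derived roughly as follows. This is a sketch under stated assumptions, not the code used for the chart: durations are assumed to be available in hours, a month is taken as 730 hours, and the member names and values are invented.

```python
from collections import defaultdict

HOURS_PER_MONTH = 730.0  # assumption: 365 days * 24 h / 12 months


def summarise_by_member(blocks):
    """blocks: iterable of (member, duration_in_hours).

    Returns {member: (mean_duration_in_months, number_of_blocks)}.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for member, hours in blocks:
        totals[member][0] += hours
        totals[member][1] += 1
    return {
        member: (total / count / HOURS_PER_MONTH, count)
        for member, (total, count) in totals.items()
    }


# Invented members and durations for illustration only.
sample_blocks = [("member_a", 730), ("member_a", 2190), ("member_b", 0)]
print(summarise_by_member(sample_blocks))
```

A mean is sensitive to single very long blocks, which is why one multi-year block can dominate a member's average.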

DurationbyDWGMember2

[irony] In case you are in trouble, you should avoid Peda, SomeoneElse and woodpeck if you are a new mapper. Try pnorman instead. If you have been around for a while, SomeoneElse seems to be a better choice than pnorman and woodpeck. [/irony]

Comment from escada on 30 May 2018 at 05:23

Since your last two graphs do not take into account the reason for blocking users, they are useless imho. Maybe woodpeck deals with mappers that destroy a lot of data or in a large area.

I see this as a blame game for the DWG; I think you should have focussed on the reason why people get blocked. But as you wrote "Just in case there are some troubles with the DWG (I am looking into it for a friend)", your investigation seems biased. If you had revealed the reason why your friend is blocked (did she/he import from copyrighted sources? did she/he vandalise the work of others? did she/he add advertisements?), I might have an idea why you are doing this research.

Comment from SunCobalt on 30 May 2018 at 06:25

Since your last two graphs do not take into account the reason for blocking users, they are useless imho.

Mikel Maron suggested a chart that shows who applied the blocks. I found it interesting too. If you find it useless, just disregard it.

I tried to get data for the block reasons. The reason is in a free-text area. However, I found one example with no text at all, another one was completely in Russian, and so on. If you find a way to get the reasons without going through all 2000 blocks manually, please let me know.

Maybe woodpeck deals with mappers that destroy a lot of data or in a large area.

I am assuming the same, i.e. that there are reasons why block durations differ.

I see this as a blame game for the DWG; I think you should have focussed on the reason why people get blocked. But as you wrote "Just in case there are some troubles with the DWG (I am looking into it for a friend)", your investigation seems biased. If you had revealed the reason why your friend is blocked (did she/he import from copyrighted sources? did she/he vandalise the work of others? did she/he add advertisements?),

I will add [irony] [/irony] tags to my text for you. :/

I might have an idea why you are doing this research.

Ah, that's what you mean when speaking about "blame game"

Comment from escada on 30 May 2018 at 08:02

Sorry, I missed the irony, but I know that some people have disputes with some members of the DWG regarding reverts and possibly blocks as well. Without further context it is hard to know whether you are doing this as a neutral observer or as someone taking sides with the blocked person.

Indeed, I'm not interested in who did the block, but in who was blocked and who(*) requested the block. I assume the DWG is not acting on its own, but is processing requests from the community to block people. So perhaps the people who block a lot just process more external requests than the others.

Some people in the above list only served on the DWG for a short period (and are no longer part of it), so it is normal that they placed fewer blocks than others.

I like the comment you gave to explain the spikes in the "Mean Block Duration" chart. I'm missing that kind of additional information on the last two graphs, and that's why they are "useless": they just give hard numbers without context, and the numbers are not explained.

(*) not the actual person, but was it a group consensus, was it 1 person asking it, that kind of stuff.
