Don't know what to think of this research

Posted by escada on 18 June 2015 in English (English)

Sometime in April, I bought a smartphone and installed OsmAnd on it. During my first ride with it, I discovered that someone had tagged a stretch of a highway with maxspeed=50. I noticed it because OsmAnd suddenly warned me that I was speeding.

The same day I changed it back to the normal 120 and I left a changeset comment. Today I got a reply to that comment (in Dutch):

"Deze werd in OSM geplaatst voor een onderzoek naar de temporele kwaliteit van OpenStreetMap. Alle gemaakte fouten, die nog niet verbeterd werden door de gemeenschap, worden vandaag verbeterd."

The translation is something like:

"This was placed in OSM for research into the temporal quality of OpenStreetMap. All the mistakes made that have not yet been corrected by the community will be corrected today."

Any thoughts ?

Comment from Glenn Plas on 18 June 2015 at 11:11

This is a terrible way to 'test' the community. So we have 3rd parties introducing easter eggs into OSM for whatever reason. maxspeed is an important tag, it should not be messed with in this way. How is this measuring quality anyway ?

Comment from Vincent de Phily on 18 June 2015 at 11:57

How long did they leave those deliberate mistakes in ?

Figuring out how long mistakes remain in OSM (or any other data set) is a noble cause, it helps build (or destroy) confidence in the data.

But introducing a deliberate error in a dataset that is used live and in snapshot form by countless different actors is a Really Bad Idea. A less harmful way to measure things is to look at changesets that fix honest-mistake errors and draw conclusions from that. Finding relevant changesets is harder, but the alternative method used by these Dutch people is IMHO not acceptable.

Comment from escada on 18 June 2015 at 12:05

The one that I found on April 1, 2015 was introduced on March 18, 2015. Assuming all other mistakes were introduced at the same moment, they were in the database for 3 months.

Comment from escada on 18 June 2015 at 12:07

For the record, the mistake I found was in the Flemish part of Belgium, so they might be Belgians as well.

Comment from Richard on 18 June 2015 at 12:44

Assuming the "research" quoted is academic, I wonder what the ethics committee of their university might make of this...

Comment from SimonPoole on 18 June 2015 at 13:32

@escada could you please forward a link to the changeset in question to the DWG so that the account can be permanently banned.

Comment from nebulon42 on 18 June 2015 at 13:39

@Richard In technology-related research, ethics is still often met with a "How do you spell that?" attitude. So I doubt that an ethics committee even exists. The UK might be different, though. Of course something like this should be discouraged, and I also think that the DWG should be involved in that.

Comment from escada on 18 June 2015 at 13:43

As SimonPoole asked, I informed the DWG of the changeset in question and the user name. I had already informed the user about this diary entry via the changeset comments.

Comment from Glenn Plas on 18 June 2015 at 17:31

Thank you @woodpeck. Clear signal, his account is probably a dud however. I don't think this research was very academic in nature, it sounds like an idea a minor would execute. Too bad it went unnoticed for so long.

Comment from joost schouppe on 18 June 2015 at 19:08

I just heard that it was a university student in my hometown Ghent who did this. He did it as part of research for a master's thesis comparing the quality of OSM and other data sources for navigation purposes.

Comment from dcp on 19 June 2015 at 06:01

IMHO he should not get his master's degree, just because of his behavior. An academic does not destroy the work of others!

Comment from joost schouppe on 19 June 2015 at 06:14

That said, I used to run a little IFTTT project to collect the new contributors in Belgium. Not only did I welcome the mappers with some basic tips and links, I also did a quick review of their first changeset. I couldn't continue this because I was on the road for a year, and unfortunately no one picked it up completely while I was gone. If we had still been doing that, almost certainly all the mistakes would have been fixed within the week. Also, we wouldn't have to be doing all this hating, we would have just said: dude, this is not the way to go. Let's find another way to test your hypothesis.

That said, it's a pity this guy didn't just contact the OpenStreetMap community first. It's not that hard to create a pseudo-experimental setting to measure the speed of correction of certain types of mistakes, and the results would have been much more significant. It's a bit like going to a neighborhood and vandalizing some public property, just to measure how quickly local government fixes stuff in different neighborhoods.

Comment from escada on 19 June 2015 at 08:44

@Joost, if he/she had contacted the community, we would have been warned and might have been more alert to changes and/or actively started looking for them. So he/she couldn't do that without influencing the study they wanted to make.

It would be more correct to see e.g. how fast an actual speed limit change was added to OSM. But of course that would mean they should know about the change from the beginning.

You might wonder what the next study is: steal something from a shop and see how fast the police can find you? And if you are not caught after 3 months, bring back the goods? :-)

There was a time when I watched new people in my area as well, but unfortunately I do not do that anymore for various reasons (time, among others). Though I regret now that I didn't investigate this user's changesets better.

Comment from SomeoneElse on 21 June 2015 at 15:11

The tricky thing here is detecting the problems. Of course, there are lots of QA tools around, and they are very useful at detecting "geometrically unfeasible" data (that in most cases is created by new users getting the hang of the editing tools, not vandals). The problem is that the only thing that will spot something like this as an error:

is someone who's familiar with the area and knows that "that does not exist". So we need to encourage the new editors (the same ones making the mistakes now!) to continue mapping and become the OSM users familiar with an area spotting any deliberate vandalism in the future.

The DWG can help with blocks etc. as needed, but if problems aren't spotted due to a lack of local OSMers, no-one (including the DWG) will know that there's a problem. Even when some changes do look "unlikely" (as has happened recently with some changes in both North and South America) the lack of OSMers on the ground means that "yes, those changes do look wrong, but we can't say for sure".

  • Andy (a DWG member but not writing on behalf of it here)

Comment from Glenn Plas on 23 June 2015 at 09:01

We are talking about a change of maxspeed on a single road (or is it 2?). How this can be representative is beyond me... The sample size is way too small to be significant.

It sure felt more like a virtual geocache game in OSM: "I've hidden a needle, now you figure out which haystack it's in."

Comment from joost schouppe on 13 July 2015 at 13:01

Glenn, they did introduce more than just a few needles in the haystack, I believe about 40.

