OpenStreetMap

CLC06 Import Cleanup

Posted by Abbe98 on 5 January 2015 in English.

When I first got involved in OpenStreetMap, all my mapping was done in the iD editor, which offers a great experience for new users: it is easy to work with and makes tagging easy. Even so, my first experience with iD and OSM was terrible, because of an old, low-quality forest import that covers a huge area, shown below.

OSM screenshot

Most of the import is two massive multipolygons with tons of relations. In iD it was impossible to merge other areas into the multipolygons, and the forests had to be heavily remapped. Other mappers’ ways of dealing with this have led to two massive multipolygons with broken relations, resulting in rendering issues. The first image below shows the quality of the import; the second shows a rendering issue in iD as a result of broken relations (not related to the reverse inner bug).

JOSM screenshot.

rendering issue in iD.
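
One way a multipolygon relation can end up "broken" is when its member ways no longer join up into closed rings. As a rough illustration (not taken from the import itself), the Python sketch below flags nodes where a ring is left open. The function name and the input format, a list of ways that are each a list of node ids, are assumptions made for this example. This is only a necessary check: it does not catch wrong inner/outer roles or self-intersections.

```python
from collections import Counter


def find_open_ring_nodes(member_ways):
    """Return the node ids at which a ring of a multipolygon is left open.

    member_ways is a hypothetical input format: a list of ways, each way
    being the ordered list of its node ids.
    """
    endpoint_count = Counter()
    for nodes in member_ways:
        if len(nodes) < 2:
            continue  # a way with fewer than two nodes cannot be part of a ring
        # A closed way has the same first and last node, so it adds 2 to that node.
        endpoint_count[nodes[0]] += 1
        endpoint_count[nodes[-1]] += 1
    # In closed rings every way end meets another way end, so counts are even.
    return sorted(node for node, count in endpoint_count.items() if count % 2 == 1)


# Two ways that join into one closed ring, then two ways that leave a gap.
closed_ring = [[1, 2, 3], [3, 4, 1]]
broken_ring = [[1, 2, 3], [3, 4, 5]]
print(find_open_ring_nodes(closed_ring))
print(find_open_ring_nodes(broken_ring))
```

An empty result means every way end is matched by another way end; any node ids returned are places where a mapper would need to look for a missing or detached member way.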

Below are two screenshots from Overpass Turbo, showing what the imported data looks like in the OSM database. The first image shows both ways and relations; the second shows only relations (multipolygons).

Ways and relations.

Relations
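
For anyone who wants to look at similar data themselves, a query along these lines could be pasted into Overpass Turbo. This is a minimal sketch in Python that only builds the Overpass QL text. The tag `landuse=forest` and the bounding box values are assumptions for illustration; the import may well use other tags, and the queries behind the screenshots above are not shown in this post.

```python
def build_multipolygon_query(south, west, north, east, key="landuse", value="forest"):
    """Build an Overpass QL query for multipolygon relations in a bounding box.

    The default key/value pair is an assumption; adjust it to the tags
    actually used by the data you want to inspect.
    """
    # Overpass expects the bounding box as south,west,north,east.
    bbox = f"{south},{west},{north},{east}"
    return (
        "[out:json][timeout:60];\n"
        f'relation["type"="multipolygon"]["{key}"="{value}"]({bbox});\n'
        "out body;\n"   # the relations themselves, with tags and members
        ">;\n"          # recurse down to the member ways and their nodes
        "out skel qt;"  # geometry only for those ways and nodes
    )


# Placeholder bounding box, chosen only as an example.
print(build_multipolygon_query(58.9, 16.0, 59.1, 16.4))
```

Dropping the last two statements (`>;` and `out skel qt;`) returns only the relations, without the geometry of their member ways and nodes.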

The imported multipolygons have become an issue for the OSM ecosystem. They make it hard for developers and designers to use the data, because applications such as TileMill and Mapbox Studio have rendering issues with it. They are also an issue for new mappers, who have to remap major areas (in iD this means dragging nodes), can’t merge smaller areas such as lakes, and have to deal with relations.

I’m looking into solutions for splitting the two major multipolygons into pieces along major roads (as they should be anyway) or, in a worst-case scenario, deleting them. If anyone has an idea about how we should deal with this, please share your knowledge.

There are other imports like this one in Sweden, some of them also from EEA Corine Land Cover 06 (Link to latest version of the dataset), but I haven’t looked into them as much. I think that we need forest imports, at least here, but we need to be able to work with them from our usual tools such as JOSM and iD.

Any thoughts?

Link to related changesets

Location: Hillersta, Katrineholms kommun, Södermanland County, Sweden

Discussion

Comment from Sanderd17 on 5 January 2015 at 22:40

This is why imports are considered bad in many cases: they don’t take maintenance into account. First of all, there’s no community built around them; secondly, the data structures are sometimes hideously complicated.

The multipolygon relation itself isn’t too great either: inner and outer have no meaning on a sphere, and you notice that when small mistakes cause those reversed inners. But it’s the best we have now.

If you really want to work with multipolygon relations of that size, you should use JOSM, use it to split the relation, and take out small forests again and again.

In the other case, I don’t think anyone should oppose just deleting a bad import, when manual quality is received in return.

Comment from Abbe98 on 5 January 2015 at 22:59

The thing about deleting the two major multipolygons is that the import is two years old and a lot of work has been put into it since. Splitting it would keep the good parts from being removed as well.

Comment from jremillard on 7 January 2015 at 19:26

Land use imports should be broken up into smaller relations when they are uploaded. If you are handy with software, you might be able to pull it out, break it up, delete the old data, then re-import it. You could also propose just deleting the entire thing. We should not be afraid of deleting imported data that is causing problems. You can decide what you think is best, then post an email to your local mailing lists and see what people think.

Comment from jremillard on 7 January 2015 at 19:37

If you are looking for ideas on fixing it, you could write a Python script that takes an OSM file as input and uses Shapely to chop up the relation into a bunch of smaller relations, outputting an OSC file. It would probably be ~25 hours of work.
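
To give a feel for the chopping step only, here is a minimal sketch under several stated assumptions. It does not use Shapely; it swaps in a plain-Python Sutherland-Hodgman clip against square grid cells so that it runs without extra libraries. It handles a single outer ring with no inner rings, treats coordinates as planar, and does not read OSM files or write OSC output. All function names are made up for this example. Concave rings can produce pieces joined by zero-width slivers along cell edges, which a real script would need to clean up.

```python
import math


def ring_area(ring):
    """Unsigned area of a ring given as (x, y) tuples, last point not repeated."""
    total = 0.0
    for i, (x2, y2) in enumerate(ring):
        x1, y1 = ring[i - 1]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def clip_to_half_plane(ring, axis, bound, keep_greater):
    """One Sutherland-Hodgman pass: keep the part of the ring on one side of a line."""
    def inside(point):
        return point[axis] >= bound if keep_greater else point[axis] <= bound

    def crossing(p, q):
        # Only called when p and q lie on different sides, so no division by zero.
        t = (bound - p[axis]) / (q[axis] - p[axis])
        point = [p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])]
        point[axis] = bound  # pin the clipped coordinate exactly onto the line
        return tuple(point)

    clipped = []
    for i, current in enumerate(ring):
        previous = ring[i - 1]
        if inside(current):
            if not inside(previous):
                clipped.append(crossing(previous, current))
            clipped.append(current)
        elif inside(previous):
            clipped.append(crossing(previous, current))
    return clipped


def clip_to_box(ring, min_x, min_y, max_x, max_y):
    """Clip a ring against the four sides of an axis-aligned box."""
    sides = ((0, min_x, True), (0, max_x, False), (1, min_y, True), (1, max_y, False))
    for axis, bound, keep_greater in sides:
        ring = clip_to_half_plane(ring, axis, bound, keep_greater)
        if not ring:
            break
    return ring


def split_into_grid(ring, cell_size):
    """Chop one outer ring into pieces, one per grid cell that it overlaps."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    columns = math.ceil((max(xs) - min(xs)) / cell_size)
    rows = math.ceil((max(ys) - min(ys)) / cell_size)
    pieces = []
    for column in range(columns):
        for row in range(rows):
            x0 = min(xs) + column * cell_size
            y0 = min(ys) + row * cell_size
            piece = clip_to_box(ring, x0, y0, x0 + cell_size, y0 + cell_size)
            if len(piece) >= 3 and ring_area(piece) > 0:
                pieces.append(piece)
    return pieces


# A 4 x 4 square chopped into 2 x 2 cells.
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
for piece in split_into_grid(square, 2.0):
    print(ring_area(piece), piece)
```

A regular grid is the simplest cut to show. Cutting along roads or other real features would give more natural boundaries, but it needs the geometry of those features as extra input.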

Comment from RM87 on 7 January 2015 at 22:47

You can split the forest relations along the roads, railways, cutlines and power lines. It is manual work, but at least it improves the data.

For example, this area in Estonia had several big forest areas imported from Corine: http://www.openstreetmap.org/#map=16/59.3301/24.2921 . Some of them have been split along the roads, railways and power lines to make the relations a little bit more editable.

If the forest is not fully coniferous and you have either nrg imagery available or a lot of time for outdoor mapping, then you can split the forests further by separating coniferous, mixed and deciduous forests.

Comment from Abbe98 on 12 January 2015 at 09:56

So 25 hours, that’s the time it would take me to remap it by hand…

I decided to delete the main multipolygons. I have deleted one so far; it took some hours because of all the conflicts. Now I’m mapping the removed forests by hand in smaller parts. When I have mapped the removed forests, I will remove the second multipolygon and do the same.

Note that my cleanup just applies to the two main multipolygons, as the smaller areas/multipolygons can be handled in editors such as iD.
