OpenStreetMap

Fixing multipolygons for the renderer

Posted by Harry Wood on 4 September 2018 in English (English)

Just thought I would re-iterate something imagico blogged about. Some important features may have disappeared from the map near you. You should check! Use the OSMInspector "Areas" view:


I think the pink things on there are generally the more serious data bugs, and they're worth fixing because each one could be a big building, a local park, or some other important feature which may have recently disappeared. This is due to a change in the renderer. It's an improvement, but it means the renderer is now stricter about these data bugs, so we need to fix them! (Yes, we're fixing for the renderer, but these were worth fixing before, because other systems will always have struggled with this data.)

But these bugs generally involve the dreaded multipolygon relations, which are not the easiest thing to get your head around. I took a look just now and found quite a variety of different problems. Some fairly easy (e.g. this one just needed an "inner" way joining up properly to make the whole building re-appear)


...some not so easy. Problematic relations can be overlapped by other ways. Sometimes the multipolygon isn't necessary at all (created by mistake perhaps). In that case we should take the chance to convert it to a simple closed way, to make life easier for everyone ...but that can be a fiddly process in itself.

So I'm inclined to say we need pro-mappers to attack this task. But being "pro" is all relative. Most mappers are probably like me. I'll happily wade in and try to tackle such things up to a point, but then sometimes I hit such a tangle of data that I just throw up my hands and think to myself "either somebody has thought carefully about this data, and there's a reason it's like this, or somebody's made a mess, or maybe both... but I can't tell", at which point I'm quite happy to leave it and move on to find an easier bug to fix.

So get stuck in! There's quite a lot to fix:

I wonder if there's a way of knowing where the BIGGEST multipolygon bugs are.

Comment from imagico on 4 September 2018 at 17:46

I wonder if there's a way of knowing where the BIGGEST multipolygon bugs are.

That depends on how you quantify a multipolygon bug.

If you mean the most complex multipolygon rendered in maps that was ever broken, or that is currently broken, you probably need to look at islands. The Great Britain MP is >680k nodes.

The most complex multipolygon rendered with a color fill is likely Lake Huron (390k nodes) - but there are a number of other lake polygons with fairly similar complexity. These break quite frequently.

I once called the Merowe Reservoir MP the most broken multipolygon in the database because at that time it contained more than a hundred errors. Most of these were noded self intersections though - which osmium can handle. Have not checked how many of those are still left.

In most cases of difficult-to-fix broken multipolygons the best advice to give is probably: split it into smaller parts which are easier to deal with. Large lakes and islands are the exception here since they are by convention always mapped as a single MP. Here it would be best if experienced local mappers kept an eye on those.

Comment from Harry Wood on 4 September 2018 at 21:26

hmm yeah. Coastlines. That old problem. They're big and they break all the time.

I suppose I'm after a measure of the "most significant" disappearing objects. Could be a fuzzy measure of various factors, but... easier judged manually I suppose

Comment from Warin61 on 4 September 2018 at 23:19

Hi, The ones I strike that 'I cannot fix', I find after a day or two thinking about it .. I can fix. Just takes me some time to ponder the problem. If you're not local you may not have the knowledge of OSM resources that can help with broken admin boundaries .. these get broken when someone 'improves' a shared way.

The other OSMinspector tool I use is routing .. this indicates places where routing will not work ... they term it major roads <1m separation .. those I go after. Usually merging two nodes or joining a node to a way does the job there. The other one to watch out for is duplicate ways .. those are not so easy to fix but resolve routing issues.

Comment from Alan Bragg on 6 September 2018 at 00:38

Nice post. Got me looking at my area. I soon found the OSM Relation Analyzer at http://ra.osmsurround.org which helped me a lot.

Comment from Harry Wood on 14 September 2018 at 07:55

Last night I did a bunch of fixing, and as I look back at OSM inspector today, I can see that I succeeded in my aim, to fix all the pink errors in London (within the M25). This is satisfying.


...but among this sampling of multipolygon relations I think I noticed another automated check we might do. I noticed some multipolygons which were created as two or more ways with the 'outer' role, together forming a simple area. That was unnecessary, as the area could be represented as a normal way. We should "simplify down" to a way in these cases.

The criteria for this kind of "unnecessary" are possibly a bit complicated, but can be determined automatically. We'd need to check that none of the parts have other tags/memberships (or if they do have tags, they would need to all be the same tags). Also, tricky things like tagging involving oneway directional ways might be a legitimate reason for splitting the area up into a multipolygon. Also, if a multipolygon area has very many nodes (e.g. big lakes) then that's a legitimate reason. But otherwise... it can be just a way, and it should just be a way.
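To make those criteria a bit more concrete, here is a rough sketch of how such a check might look. It's only an illustration, not an existing tool: the data layout (a relation as a dict with "members" and "tags", ways as dicts with "nodes" and "tags"), the function names and the shared_way_ids parameter are all made up for the example. The 2000 figure is the usual OSM limit on nodes per way. The sketch does not attempt the oneway/directional tagging case at all, so that would still need a human look.

```python
from collections import Counter

MAX_WAY_NODES = 2000  # usual OSM limit on nodes in a single way


def forms_single_ring(node_lists):
    """Return True if the ways chain end to end into exactly one closed ring."""
    if not node_lists or any(len(nodes) < 2 for nodes in node_lists):
        return False
    # In a simple ring every end node is used by exactly two way ends.
    ends = Counter()
    for nodes in node_lists:
        ends[nodes[0]] += 1
        ends[nodes[-1]] += 1
    if any(count != 2 for count in ends.values()):
        return False
    # Walk from the first way until we arrive back at its start node.
    remaining = list(node_lists[1:])
    start, end = node_lists[0][0], node_lists[0][-1]
    while end != start:
        nxt = next((n for n in remaining if end in (n[0], n[-1])), None)
        if nxt is None:
            return False
        remaining.remove(nxt)
        end = nxt[-1] if nxt[0] == end else nxt[0]
    # Leftover ways mean there is more than one ring.
    return not remaining


def is_unnecessary_multipolygon(relation, ways, shared_way_ids=frozenset()):
    """Hypothetical check: could this multipolygon be one plain closed way?

    shared_way_ids (an assumption of this sketch) lists ways that are also
    members of other relations and therefore must be left alone.
    """
    members = relation["members"]
    if not members or any(role != "outer" for _, role in members):
        return False
    way_ids = [way_id for way_id, _ in members]
    if any(way_id in shared_way_ids for way_id in way_ids):
        return False
    # Parts must be untagged, or all carry exactly the same tags.
    tag_sets = [ways[way_id]["tags"] for way_id in way_ids]
    if any(tags != tag_sets[0] for tags in tag_sets):
        return False
    node_lists = [ways[way_id]["nodes"] for way_id in way_ids]
    if not forms_single_ring(node_lists):
        return False
    # Joining k ways into one ring drops k - 1 duplicated join nodes.
    merged_nodes = sum(len(n) for n in node_lists) - (len(node_lists) - 1)
    return merged_nodes <= MAX_WAY_NODES
```

For example, a relation whose two untagged 'outer' members are the node lists [1, 2, 3] and [3, 4, 1] would be flagged, while the same relation with an 'inner' member, or with differently tagged parts, would not.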

I wonder how many of these unnecessary multipolygons there would be. Maybe the map is totally littered with them.

Comment from woodpeck on 18 September 2018 at 06:43

Re. "unnecessary multipolygons", there's another class that is just as unnecessary but even more difficult to spot, and that's when someone draws, say, three disjunct but close patches of landuse=forest and puts them all in one multipolygon relation. I have heard practitioners of this technique argue that it saves space since "landuse=forest" only needs to be stored once, but in my mind this is really not worth the added complexity we burden mappers with when doing that. The only situation in which this might make sense is if the three patches of forest form a named entity together.

Comment from Harry Wood on 18 September 2018 at 07:33

Oh yeah. I came across one of those while fixing. I guess this is a peculiarity of landuse (/landcover/natural) tags actually. I mean as soon as you add a name tag, or any "top level" object tagging (e.g. amenity=school / shop=supermarket), it becomes correct to use a multipolygon, per "One feature, one OSM element". But if it's solely tagged landuse (and maybe some other property tags) then it's nicer to stick with ways.

Comment from Verdy_p on 19 September 2018 at 23:30

I've seen disjunct but close patches end up in the same multipolygon through the progressive improvement of surrounding smaller polygons, which are detached from an old, larger but less detailed polygon. As the polygon is very large, it's hard to redraw it completely from scratch without breaking it and creating large holes. Especially with iD, this can happen at any time and is hard to see when it occurs. Only JOSM users will notice that these multipolygons can be separated, but that requires reapplying the tags of the multipolygon down to the components before they can be removed as "outer" members of the multipolygon. Only once all patches have been separated and no members remain can the now empty multipolygon be removed. But it will often be recreated if one component is still large and still needs improving. So basically only small patches are removed and separated, and large multipolygons remain even if there is only one member left in them, because this member way will continue to be split again to detail it further.

So I'm not surprised to see that multipolygons with only one member remain. This is a minor problem and not a real issue, but a sign that further work needs to be done (and is sometimes pending) to detail the content. This is not done immediately because mappers frequently have long lists of things to do and intermediate steps are necessary: trying to do everything at once creates a case where "todo lists" tend to explode exponentially, never ending, for an initial modification that was intended to be much more local and not to have such a huge impact that it had to recursively process all the touching polygons covering a larger and larger area...
