As mentioned in my last entry, I wrote a tool using Osmium to parse PBF files and look for inefficient ways, i.e. ways that would lose hundreds of nodes without changing shape if you ran simplify on them. I’d been running it on small countries and US states, but this evening I tried it on a PBF of all of North America, and here is the prize-winner for the most bloated, wasteful way: a small dirt road between some houses and the coastal wetlands in Nova Scotia, Canada:
That’s 2000 nodes, or one every 5.6 centimeters.
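For readers who want to try a similar check, here is a minimal sketch of the idea in Python. It is not the Osmium-based tool described above: it assumes the way's nodes have already been projected to planar (x, y) coordinates in meters, it uses Douglas-Peucker as the simplification step (an assumption about what "simplify" does), and the names `simplify`, `removable_nodes`, and the 10 cm default tolerance are hypothetical choices for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b; all are (x, y) in meters."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment (e.g. a closed way whose ends coincide).
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Iterative Douglas-Peucker; returns the points that are kept."""
    if len(points) < 3:
        return list(points)
    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]
    while stack:
        first, last = stack.pop()
        max_dist, index = 0.0, None
        for i in range(first + 1, last):
            d = point_segment_distance(points[i], points[first], points[last])
            if d > max_dist:
                max_dist, index = d, i
        # Keep the farthest point only if it deviates more than the tolerance.
        if index is not None and max_dist > tolerance:
            keep[index] = True
            stack.append((first, index))
            stack.append((index, last))
    return [p for p, k in zip(points, keep) if k]


def removable_nodes(points, tolerance=0.1):
    """Number of nodes that simplification would drop from a way."""
    return len(points) - len(simplify(points, tolerance))


# Synthetic example: 2000 nodes spaced 5.6 cm apart on a straight line,
# about 112 m long in total.
straight = [(i * 0.056, 0.0) for i in range(2000)]
print(removable_nodes(straight))  # 1998: only the two end nodes are needed
```

A real way is rarely perfectly straight, so the count depends heavily on the tolerance; ranking ways by `removable_nodes` is one plausible way to surface the worst offenders.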
By the time you read this, I’ll have cleaned up way 85927697. I’d also like to make an offer: if you are an experienced editor who focuses on a specific part of the world and would like me to run my tool on the extract for your region, I can send you a list of the worst ways for you to clean up. Let me know!
Also, a word to importers: please, please make sure you check your data for this kind of mess BEFORE you upload. In this case it was Steve in Halifax importing CanVec in 2010, but similar things are being uploaded all around the world, all the time. (I know because my tool finds them!)
Comment from gileri on 4 December 2015 at 19:47
Nice catch!
Comment from imagico on 4 December 2015 at 20:02
The Import Guidelines already contain a remark concerning this:
Of course CanVec - the scourge of OSM…
Comment from Vincent de Phily on 7 December 2015 at 09:23
I was surprised not to find any overnoding checks in the main QA tools (OSMI, Osmose, Keepright…). Maybe you could try to integrate your code into one of these, to reach a wider audience?
Comment from stephan75 on 7 December 2015 at 16:16
Indeed, I would also appreciate it if you could make your toolchain public.
And thumbs up for your efforts so far!!!
Comment from baditaflorin on 8 December 2015 at 12:43
I also have some scripts that detect this kind of error. Did you put the code somewhere?
Comment from scruss on 12 December 2015 at 23:35
Of course it would be a CanVec way. Most of Northern Canada is horrid land use squares, all at the maximum node limit, splitting up lakes and real features. It’s almost impossible to edit, and likely never will be corrected.
Comment from mikelmaron on 13 December 2015 at 06:35
@bdiscoe nice work. would love to see a TileReduce version of this, to make it easy for anyone to run repeatedly in any country