As mentioned in my last entry, I wrote a tool using Osmium to parse PBF files and look for inefficient ways, that is, ways that would lose hundreds of nodes if you ran simplify on them, without changing shape. I'd been running it on small countries and US states, but this evening I tried it on a PBF of all of North America, and here is the prize-winner for the most bloated, wasteful way: a small dirt road between some houses and the coastal wetlands in Nova Scotia, Canada.
That's 2000 nodes, or one every 5.6 centimeters.
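To make "would drop hundreds of nodes and not change shape" concrete, here is a minimal sketch of the idea in Python. It is not the actual tool, which is built on Osmium and reads PBF directly. The sketch assumes the way's nodes have already been projected to planar coordinates in meters, it uses Douglas-Peucker as a stand-in for whatever simplify your editor runs, and the names `simplify`, `removable_nodes` and the 5 cm tolerance are all hypothetical choices for illustration.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (planar units, e.g. meters)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment (e.g. a closed way): distance to the single point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, tolerance):
    """Iterative Douglas-Peucker: return the points that must be kept."""
    if len(points) < 3:
        return list(points)
    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]
    while stack:
        first, last = stack.pop()
        worst, worst_dist = None, tolerance
        for i in range(first + 1, last):
            d = point_segment_distance(points[i], points[first], points[last])
            if d > worst_dist:
                worst, worst_dist = i, d
        if worst is not None:
            # This node changes the shape by more than the tolerance: keep it
            # and recurse into both halves.
            keep[worst] = True
            stack.append((first, worst))
            stack.append((worst, last))
    return [p for p, k in zip(points, keep) if k]

def removable_nodes(points, tolerance):
    """How many nodes simplify would drop at the given tolerance."""
    return len(points) - len(simplify(points, tolerance))

# A made-up straight track: 2000 nodes spaced 5.6 cm apart (about 112 m long).
line = [(i * 0.056, 0.0) for i in range(2000)]
print(removable_nodes(line, 0.05))  # 1998
```

A way scores badly when `removable_nodes` is large relative to its node count. Real ways are in longitude/latitude, so they would need projecting to meters first, or a tolerance expressed in degrees.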
By the time you read this, I'll have cleaned up way 85927697. I'd also like to make an offer to anyone else: if you are an experienced editor who focuses on a specific part of the world and would like me to run my tool on the extract for your region, I can send you a list of the worst ways for you to clean up. Let me know!
Also, a word to importers: please, please make sure you check your data for this kind of mess BEFORE you upload. In this case it was Steve in Halifax importing CanVec in 2010, but similar things are being uploaded all around the world, all the time. (I know because my tool finds them!)