Simon Poole’s recent user diary pointed out a disturbing fact: only 5.2 million US addresses exist in OSM. According to research, there are at least 113 million residential and 5.6 million commercial buildings in the US that need addressing. Given that only about 4% of the US has been addressed in 10 years, it should take us (based on a purely linear model) roughly 240 more years to address the remainder of the currently extant buildings. Even if the curve is closer to exponential, it will still take decades.
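The linear estimate above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes one address per building and a constant mapping rate over the 10 years, and the function name `linear_years_remaining` is made up for illustration.

```python
# Figures quoted in the post, in millions.
addresses_in_osm = 5.2
residential_buildings = 113.0
commercial_buildings = 5.6
years_elapsed = 10  # "in 10 years"


def linear_years_remaining(done, total, years_so_far):
    """Years needed to finish the rest at the average rate seen so far."""
    rate = done / years_so_far  # addresses (millions) per year
    return (total - done) / rate


# Assumption: every building needs exactly one address.
total_buildings = residential_buildings + commercial_buildings
coverage = addresses_in_osm / total_buildings
remaining = linear_years_remaining(addresses_in_osm, total_buildings, years_elapsed)

print(f"coverage so far: {coverage:.1%}")
print(f"years remaining at a linear rate: {remaining:.0f}")
```

With the unrounded figures this gives a coverage of about 4.4% and a bit under 220 years for the remainder, somewhat below the rounded estimate but still measured in centuries.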

Furthermore, unlike many European countries, the US isn’t particularly pedestrian friendly. Hand surveying massive rural sections of the US is unrealistic. What’s more, these rural counties are going to be the last (least progressive) to open up addressing data for use in OSM.

Right now, probably 99% of US addresses are from large scale imports: DC, Atlanta, NYC, San Diego, Seattle, Chicago. Progressive metropolitan areas are nearly the only ones opening up addressing data for OSM. Sure, a few counties in Minnesota beg to differ, but they are the exception, not the rule.

The fact is that the usability of the map in the US is seriously detracted without addressing data. Furthermore, our community is too small to support surveys of this scale. Our per capita number of OSM users who contribute locally is orders of magnitude below that of European countries, especially Germany.

But we have the data. Sure, the address ranges included in TIGER aren’t perfect, but they are in line with the CANVEC address ranges used in Canada. From my low level comparison with Google data in a rural town, addresses were off by 10-20 feet max. More rural areas increased the uncertainty level, but it is significantly better than literally nothing. Nor are they particularly difficult to export, as the tags are simple to interpret. If we get better data, deletion is a breeze with a simple search and select deletion method in JOSM.
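For readers unfamiliar with address ranges: a range only says that, for example, numbers 100 to 198 lie along a given street segment, and a position for a single house number has to be estimated by interpolation. A minimal sketch of that idea follows, assuming planar (projected) coordinates and ignoring odd/even sides and the offset from the road centerline. The function and parameter names are hypothetical and are not TIGER field names.

```python
import math


def interpolate_address(house_number, range_start, range_end, line):
    """Estimate a point for house_number along line, a list of (x, y) vertices.

    range_start is assumed to sit at the first vertex and range_end at the last.
    """
    low, high = sorted((range_start, range_end))
    if not low <= house_number <= high:
        raise ValueError("house number outside the address range")

    # Fraction of the way along the street, by house number.
    if range_start == range_end:
        fraction = 0.5
    else:
        fraction = (house_number - range_start) / (range_end - range_start)

    # Walk the polyline until the target distance is reached.
    lengths = [math.dist(a, b) for a, b in zip(line, line[1:])]
    target = fraction * sum(lengths)
    for (a, b), seg_len in zip(zip(line, line[1:]), lengths):
        if seg_len > 0 and target <= seg_len:
            t = target / seg_len
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        target -= seg_len
    return line[-1]


# Number 149 in the range 100-198 lands halfway along a straight 100 m segment.
print(interpolate_address(149, 100, 198, [(0.0, 0.0), (100.0, 0.0)]))
```

Because real houses are not evenly spaced, an interpolated point like this is an estimate, which is why the positional error grows where lots are large or irregular.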

Many of the old guard of OSM protest this, saying the address ranges aren’t good enough. The fact is that perfect is the enemy of good. We need something, but right now 96% of the US doesn’t have addressing. Steve Coast understands that OSM can’t come anywhere close to being as good as Google/Bing/etc. without addresses. We need TIGER address ranges in OSM as soon as possible.

Comment from lyx on 11 April 2015 at 21:23

Sure, having addresses in OSM is nice. Unfortunately, having them won’t make much of a difference regarding the usability of OSM in the US, especially the rural parts, because what we need here first is fixing road geometry and classification: all over the rural US we have thousands of “residential roads”, most of which are in reality agricultural/forestry/mining tracks; many are unclassified roads, some are tertiary highways, and very few are really residential roads. It would also be important to get some information on road surface into the database, at least whether the road is paved or not. So, if everyone who wants to import addresses into OSM would first pick a rural county and fix up the TIGER data there before proceeding with an address import (maybe just for that single county?), we would make a lot of progress IMHO.

Comment from Nakaner on 12 April 2015 at 07:43

I fully agree with lyx.

America still suffers from an old, bad import called TIGER which has not been fixed yet. Importing the next TIGER data set would be as effective as spraying Roundup on every American mapper (if mappers were plants).

OSM is not a data storage center for public GIS data. If someone needs addresses for the whole US, he/she should download the TIGER address data and add it to his/her PostGIS database. Because TIGER is public domain, this is no licensing problem.

“From my low level comparison with Google data in a rural town, addresses were off by 10-20 feet max. More rural areas increased the uncertainty level, but it is significantly better than literally nothing.”

I believe that Google uses TIGER addresses in large parts of the U.S. They might have added a random error on each address coordinate to be able to prove copyright infringement.

As I wrote in other blog posts: people map if they miss something. If they miss no data, they do not map, i.e. nobody misses data in rural areas. I think that Frederik’s classification of the U.S. OSM community is not as bad as the answers on the Talk mailing list indicate. (There are some people who fit neither category A nor category B.)

(a) A project for hackers and couch potatoes who trawl their county web pages and other sources to look for stuff they could “upload” to OSM (because it’s such a big country and nobody could possibly, yadda yadda yadda)

(b) A project for people who roll up their sleeves, travel to places of humanitarian crises, and help those in need by creating maps where the government hasn’t done their job well.

Comment from SimonPoole on 12 April 2015 at 09:51

Just two comments:

  • given the lesson that should have been learned from the original TIGER import (just because a dataset looks reasonable in one place doesn’t mean it isn’t absolute rubbish in a different corner), it is unlikely that you will find a consensus for a wholesale import
  • there is nothing stopping anybody from using brain + TIGER address ranges as a reference to add house numbers; perhaps OSM-US could provide the data in a digestible form (aka overlay or similar)

Comment from woodpeck on 12 April 2015 at 21:27

What exactly do you mean by: “The fact is that the usability of the map in the US is seriously detracted without addressing data.”? If you enter a US address into OSM’s Nominatim geocoder for searching today, TIGER address data will be queried in addition to OSM; so if the address is in TIGER, you will find it, even without going through the exercise of an import. Some third-party software might rely on OSM data exclusively and not implement a TIGER fallback like Nominatim does, but should we really pollute our database with TIGER address ranges just because a few software vendors can’t be bothered to take the extra step? - Let’s try and have high-quality, precise address data in OSM (i.e. not TIGER import), and for everything else fall back to TIGER externally like we do today.
