How well mapped is a town in OSM? There are many ways to answer that question depending on how you define “well mapped-ness” (or just mappedness). One simple approach is to look at the number of points of interest (POI) mapped. My assumption is that POI are a higher order feature: mappers establish a base layer of roads first, then move on to enriching the map with POI. So having more POI in an area could be a proxy for mappedness. A benefit is that it’s easy to measure. All you need is the POI data from OSM, and town / city boundaries. I took a stab at it. Here’s a preview:
Prepare the data
We need POI from OSM as the quality indicator. As a proxy for POI, I use nodes that have a
name tag. This lets me sidestep the curious OSM attribute hierarchy that would require selecting subsets of
amenity and a few other attribute keys from the data.
There’s two ways to go about this. For smaller areas you could use Overpass, using a query like this one:
area[name="Salt Lake City"]->.a;
For larger areas, it’s better to work from an OSM PBF data file, which is the route I took. Download one from your favorite source, like BBBike or Geofabrik. Then use osmium to filter:
osmium tags-filter utah-latest.osm.pbf n/name -o utah-namednodes.osm.pbf
Regardless whether you took the Overpass or PBF route, you now have an OSM file with named nodes.
How you can get place / city boundaries will depend on where you are. In some areas, OSM itself may be a great source, check Wambacher’s boundaries site. In the U.S., official place boundaries are easily available from Census, which is where I got them.
Loading the data
To do the needed spatial operations, you could use QGIS, but I like PostGIS. So I load the OSM and place data into a spatial datbase.
First I prepare an OSM ‘snapshot’ database:
createdb osm &&
psql -d osm -c create extension hstore &&
psql -d osm -c create extension postgis &&
psql -d osm -f pgsnapshot_schema_0.6.sql
pgsnapshot_schema_0.6.sql file is part of the
osmosis distribution and contains the database schema.
osmosis to load the data:
osmosis --rb utah-namednodes.osm.pbf --wp database=osm
The TIGER place boundaries come in shapefile format so I use
shp2pgsql to load those into a separate table:
shp2pgsql -s 4326 -D -I tl_2018_49_place.shp > places.sql &&
psql -d osm -f places.sql
Okay, data ready. Now for the processing.
Querying the data
What we want is a measure of POI density. We need two things for each place boundary:
- The number of POI contained in the place boundary
- The area size
Let’s add two columns to the place boundary table to hold that information, and a third column to hold the density result:
alter table tl_2018_49_place add column poicount integer;
alter table tl_2018_49_place add column areakm2 float;
alter table tl_2018_49_place add column poidensity float;
With a smarter query, you could skip the two intermediary tables.. Now let’s fill the columns with the data we need:
update tl_2018_49_place set poicount = (select count(o.*) from nodes o, tl_2018_49_place p where st_within(o.geom, p.geom) and p.name = tl_2018_49_place.name);
update tl_2018_49_place t set areakm2 = st_area(st_transform(t.geom, 2850))/1000000;
update tl_2018_49_place t set poidensity = poicount / areakm2;
I use a meter-based projection appropriate for the area so I can get the area size in km^2, but strictly that is not necessary because we’re looking for relative results.
For good measure, I loaded both tables into QGIS, split the density values into quantiles, and used those buckets for visualization of the results, see above. As expected, the cities and towns on the Wasatch Front, where we have a fair number of active mappers, score well in mappedness, whereas smaller towns in rural areas generally score poorly. My QGIS project file is here.
In a follow up post, I will go into how the result could be improved. I would appreciate your feedback as input for that post.