OpenStreetMap

How many contributors are active in each Country?

I recently put together this visualization of users editing per Country, along with some other basic statistics. The analysis was done with tile-reduce and osm-qa-tiles. I’m sharing my code and the procedure here.

Users by Country

This interactive map depicts the number of contributors editing in each Country. The Country geometries are in a fill-extrusion layer, allowing for 3D interaction. Both the height of each Country and its color scale with the number of editors. Additional Country-level statistics, such as the number of buildings and kilometers of roads, are also computed.

Procedure

These numbers are all calculated with OSM-QA-Tiles and tile-reduce. I started with the current planet tiles and used this Countries geojson file for the Country geometries to act as boundaries.

Starting tile reduce:

tileReduce({
  map: path.join(__dirname, '/map-user-count.js'),
  sources: [{name: 'osm', mbtiles: path.join("latest.planet.mbtiles"), raw: false}],
  geojson: country.geometry,
  zoom: 12
})

In this case, country is a geojson feature from the countries.geo.json file. I ran tile-reduce separately for each Country in the file, creating individual geojson files per Country.

The map function:

var distance = require('@turf/line-distance')

module.exports = function(data, tile, writeData, done) {
  var layer = data.osm.osm;

  var buildings = 0;
  var hwy_km    = 0;
  var users = []

  layer.features.forEach(function(feat){
  
    if (feat.properties.building) buildings++; 
  
    if (users.indexOf(feat.properties['@uid']) < 0) {
      users.push(feat.properties['@uid'])
    }
  
    if (feat.properties.highway && feat.geometry.type === "LineString"){
      hwy_km += distance(feat, 'kilometers')
    }
  });
  done(null, {'users': users, 'hwy_km': hwy_km, 'buildings' : buildings});
};

The map function runs on every tile and returns a single object with the summary stats for that tile. For every object on the tile, the script first checks if it is a building and, if so, increments the building counter. Next, it checks if the user who made this edit has been recorded yet for this tile. If not, it adds their user id to the list. Finally, the script checks if the object has the highway tag and is indeed a LineString object. If so, it uses Turf.js to calculate the length of this highway and adds that to a running total of road kilometers on the tile.

After doing this for all objects on the tile (Nodes and Ways in the current osm-qa-tiles), it returns an object with an array of user ids and total counts for both road kilometers and buildings.
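To try the per-tile logic without osm-qa-tiles or Turf.js installed, here is a minimal sketch that runs on a plain array of GeoJSON features. The names `summarizeTile` and `haversineKm` are hypothetical, and the haversine distance (with an assumed Earth radius of 6371 km) is only a stand-in for the Turf.js call, so lengths may differ slightly from the real map function.

```javascript
// Great-circle distance in kilometers between two [lon, lat] positions.
// Stand-in for Turf's line-distance so this sketch has no dependencies.
function haversineKm(a, b) {
  var R = 6371; // mean Earth radius in km (assumed value)
  var toRad = Math.PI / 180;
  var dLat = (b[1] - a[1]) * toRad;
  var dLon = (b[0] - a[0]) * toRad;
  var h = Math.pow(Math.sin(dLat / 2), 2) +
          Math.cos(a[1] * toRad) * Math.cos(b[1] * toRad) *
          Math.pow(Math.sin(dLon / 2), 2);
  return 2 * R * Math.asin(Math.sqrt(h));
}

// Hypothetical helper: summarize one tile's features like the map function does.
function summarizeTile(features) {
  var users = new Set(); // a Set ignores repeated user ids
  var buildings = 0;
  var hwy_km = 0;

  features.forEach(function (feat) {
    if (feat.properties.building) buildings++;
    users.add(feat.properties['@uid']);

    if (feat.properties.highway && feat.geometry.type === 'LineString') {
      var coords = feat.geometry.coordinates;
      // Sum the length of each segment of the line
      for (var i = 1; i < coords.length; i++) {
        hwy_km += haversineKm(coords[i - 1], coords[i]);
      }
    }
  });

  return { users: Array.from(users), hwy_km: hwy_km, buildings: buildings };
}
```

Using a Set here replaces the indexOf check; the returned object has the same shape as the one passed to done.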

Back in the main script, the instructions for reduce are as follows:

.on('reduce', function(res) {
  users = users.concat(res.users)
  buildings += res.buildings;
  hwy_km += res.hwy_km;
})

The list of unique users active on any given tile is added to the users array keeping track of users across all tiles. If users have edited on more than one tile, they will be replicated in this array. We’ll deal with this later.

The running building and kilometers of road counts are then updated with the totals from each tile.
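As an alternative sketch of the same accumulation, the user ids could be kept in a Set so that duplicates never pile up across tiles. The helpers `createTotals` and `addTileResult` are hypothetical names, not part of tile-reduce; the final count should match what concatenating and de-duplicating at the end produces.

```javascript
// Hypothetical accumulator mirroring the 'reduce' handler, but with a Set
// so a user who edited on many tiles is only stored once.
function createTotals() {
  return { users: new Set(), buildings: 0, hwy_km: 0 };
}

// Fold one tile's result ({users, buildings, hwy_km}) into the running totals.
function addTileResult(totals, res) {
  res.users.forEach(function (uid) { totals.users.add(uid); });
  totals.buildings += res.buildings;
  totals.hwy_km += res.hwy_km;
  return totals;
}
```

Inside the reduce handler this would be called as `addTileResult(totals, res)`, and `totals.users.size` gives the distinct user count at the end.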

Ultimately, the last stage of the main script writes the results to a file.

.on('end', function() {
  var numUsers = _.uniq(users).length;

  fs.writeFile('/data/countries/'+country.id+'.geojson', JSON.stringify(
    {type: "Feature",
     geometry: country.geometry,
     properties: {
       uCount: numUsers,
       hwy_km: hwy_km,
       buildings: buildings,
       name: country.properties.name,
       id: country.id
      }
    }),
    // Current Node.js versions require a callback for fs.writeFile
    function(err) { if (err) throw err; }
   )
});

Once all tiles have been processed, this function uses lodash to remove all duplicate entries in the users array. The length of this array now represents the number of distinct users with visible edits on any of the tiles in this Country.

Using JSON.stringify and the original geometry of this Country that was used as the bounds for tile-reduce, this function creates a new geojson file for every Country with a properties object of all the calculated values.
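For testing the output shape without writing to disk, the feature construction can be pulled into a small function. This is only a sketch: `buildCountryFeature` is a hypothetical name, the sample Country is made up, and a Set is used in place of lodash for the distinct count.

```javascript
// Hypothetical helper: assemble the per-Country output feature.
function buildCountryFeature(country, users, hwy_km, buildings) {
  return {
    type: 'Feature',
    geometry: country.geometry,
    properties: {
      uCount: new Set(users).size, // distinct user ids, like _.uniq(users).length
      hwy_km: hwy_km,
      buildings: buildings,
      name: country.properties.name,
      id: country.id
    }
  };
}

// Made-up sample input, for illustration only
var sampleCountry = {
  id: 'XXX',
  properties: { name: 'Example' },
  geometry: { type: 'Polygon', coordinates: [[[0, 0], [1, 0], [1, 1], [0, 0]]] }
};
var sampleFeature = buildCountryFeature(sampleCountry, [7, 8, 8, 7, 9], 12.5, 3);
var sampleJson = JSON.stringify(sampleFeature); // string that would be written to the file
```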

Visualizing

Once the individual Country geojson files are created, the following python code iterates through the directory and creates a single geojson FeatureCollection with each Country as a feature (the same as the countries.geo.json file we started with, but now with more properties).

import json
import os

countries = []

# Load every per-Country feature written by the tile-reduce script
for file in os.listdir('/data/countries'):
  with open('/data/countries/'+file) as f:
    countries.append(json.load(f))

with open('/data/www/countries.geojson','w') as out:
  json.dump({"type":"FeatureCollection",
             "features" : countries}, out)

Once this single geojson FeatureCollection was created, I uploaded it to Mapbox and then used mapbox-gl-js with fill-extrusion and a data-driven color scheme to make the Countries with more contributors appear taller and more red, while those with fewer contributors are shorter and closer to yellow/white in color.

Here is a sample of that code:

map.addSource('country-data', {
  'type': 'vector',
  'url': 'mapbox://jenningsanderson.b7rpo0sf'
})

map.addLayer({
  'id': "country-layer",
  'type': "fill-extrusion",
  'source': 'country-data',
  'source-layer': 'countries_1-1l5fxc',
  'paint': {
    'fill-extrusion-color': {
      'property':'uCount',
      'stops':[
        [10, 'white'],
        [100, 'yellow'],
        [1000, 'orange'],
        [10000, 'orangered'],
        [50000, 'red'],
        [100000, 'maroon']
      ]
    },
    'fill-extrusion-opacity': 0.8,
    'fill-extrusion-base': 0,
    'fill-extrusion-height': {
      'property': 'uCount',
      'stops': [
        [10, 6],
        [100, 60],
        [1000, 600],
        [10000, 6000],
        [50000, 30000],
        [100000, 65000]
      ]
    }
  }
})

The current implementation uses two visual channels (height and color) for the user count. This is redundant, and the data-driven styling could easily be modified to represent the number of buildings or kilometers of roads instead, simply by changing the property value to buildings or hwy_km and adjusting the stops array.
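One way to make that swap less error-prone is to generate both paint properties from a single list of stops. The sketch below assumes a hypothetical helper named `extrusionPaint`, and the breakpoints chosen for buildings are invented for illustration rather than derived from the data.

```javascript
// Hypothetical helper: build fill-extrusion paint for any numeric property.
// Each stop is [value, color, heightInMeters].
function extrusionPaint(property, stops) {
  return {
    'fill-extrusion-color': {
      property: property,
      stops: stops.map(function (s) { return [s[0], s[1]]; })
    },
    'fill-extrusion-opacity': 0.8,
    'fill-extrusion-base': 0,
    'fill-extrusion-height': {
      property: property,
      stops: stops.map(function (s) { return [s[0], s[2]]; })
    }
  };
}

// Illustrative stops for the buildings property (made-up breakpoints)
var buildingsPaint = extrusionPaint('buildings', [
  [1000, 'white', 600],
  [100000, 'yellow', 6000],
  [10000000, 'red', 60000]
]);
```

The result could be passed as the paint object in map.addLayer, or applied to the existing layer one property at a time with map.setPaintProperty.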

To show more information about a Country on click, the following is added:

map.on('mousemove', function(e){
  var features = map.queryRenderedFeatures(e.point, {layers:['country-layer']})
  map.getCanvas().style.cursor = (features.length>0)? 'pointer' : '';
});

map.on('click', function(e){
  var features = map.queryRenderedFeatures(e.point, {layers: ['country-layer']})

  if(!features.length){return};
  var props = features[0].properties

  new mapboxgl.Popup()
    .setLngLat(e.lngLat)
    .setHTML(`<table>
      <tr><td>Country</td><td>${props.name}</td></tr>
      <tr><td>ShortCode</td><td>${props.id}</td></tr>
      <tr><td>Users</td><td>${props.uCount}</td></tr>
      <tr><td>Highways</td><td>${props.hwy_km.toFixed(2)} km</td></tr>
      <tr><td>Buildings</td><td>${props.buildings}</td></tr></table>`)
    .addTo(map);
});
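If any property might be missing or contain characters that are special in HTML, the popup markup could be built by a small function instead of an inline template. This is a sketch under that assumption; `escapeHtml` and `popupHtml` are hypothetical names and not part of mapbox-gl-js.

```javascript
// Escape characters that are special in HTML before inserting text into markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: build the popup table from a feature's properties.
function popupHtml(props) {
  var km = Number(props.hwy_km);
  var rows = [
    ['Country', props.name],
    ['ShortCode', props.id],
    ['Users', props.uCount],
    // Fall back to '?' when hwy_km is missing or not numeric
    ['Highways', (isFinite(km) ? km.toFixed(2) : '?') + ' km'],
    ['Buildings', props.buildings]
  ];
  return '<table>' + rows.map(function (r) {
    return '<tr><td>' + escapeHtml(r[0]) + '</td><td>' + escapeHtml(r[1]) + '</td></tr>';
  }).join('') + '</table>';
}
```

In the click handler, the result would be passed to setHTML as `popupHtml(props)`.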

Much of this code is based on these examples

Location: Goss-Grove, Boulder, Boulder County, Colorado, 80309, United States

Discussion

Comment from GOwin on 29 June 2017 at 12:26

Have you considered normalizing the user numbers with their country’s area?

Comment from Tomas Straupis on 29 June 2017 at 13:07

Normalising by each country’s population would also be interesting, giving the proportion of mappers in the country.

Comment from Jennings Anderson on 29 June 2017 at 16:23

Thanks GOwin and Tomas, these are both great ideas. This is also a general concern with some tile-reduce comparisons, that tiles are not all the same area. Normalizing (especially between tiles) if doing side-by-side comparisons by area is important.

At the Country level, I wonder how these numbers may change. My hunch is that the US will go down and Countries like Germany will grow.

Comment from PierZen on 7 July 2017 at 13:49

A recent blog post showed that 50% of Canada’s population occupies only 2% of the territory. To take account of large territories like Canada, where there is almost no population in the northern parts, a good comparison would be the contributors per city area in the main cities.

Comment from ethylisocyanat on 10 July 2017 at 13:47

You know, that it is a cartographic sin to stain countries by absolute values?

Better display this data relative to population or land size!
