Global Validation Procedure

Posted by Lisa Marie Owen on 4 August 2015 in English.

I’ve just finished a cleanup and validation of field survey data collected in Bangassou, Central African Republic as part of the Missing Maps Project, and I have drafted a procedure for the global validation of OSM projects. I would love some input from others who have completed similar projects so I can refine the procedure and publish it on the OSM Wiki.

Global Validation of a Project Area

  • Define the area to be validated – this will ideally be predefined in a gpx file
  • Setup
    • In JOSM, open the gpx file containing your area border
    • Zoom in to an “acceptable” size area and click File–>Download from OSM. Make sure the “Download as new layer” box is checked. There will be a message in the bottom right corner of this screen telling you if the area selected is too big for the server. Sometimes it will say the size is okay, but you will get a server error when you try to download – this means that the area was too big and you need to try again with a smaller area.
    • Once this is downloaded, go to a different area within your border and repeat with the “Download as new layer” box UNCHECKED. Do this until OSM data is loaded for your entire area.
    • Go to Imagery and select the appropriate provider for your project (usually Bing)
  • Scroll across the map and trace broad areas of problem imagery (low resolution or obscured by clouds). Tag the problem, the imagery provider and the date, like “name=low-res Bing image border as of August 4 2015”
  • Check major roads for tagging consistency. Refer to the OSM wiki for guidance on tags. Highway Tag Africa and individual country pages are helpful.
  • Look at the list of villages by going to Presets–>Search for objects by preset–>Geography/Places/Village
  • Scan the list to identify naming issues, tagging issues, duplicates, or other errors. For example, you may see “Village” as part of the name, a village split and labelled 1 and 2, spelling variations for the same village, or overt spelling mistakes.
  • Repeat for other features as needed, also checking the Tags/Memberships panel for consistency
  • Make sure all villages have limits traced using landuse=residential (not name=residential) and that the whole village area is a single object rather than having neighborhoods traced separately.
  • Run the validation tool within JOSM and address the problems it finds
  • Upload changes to OSM
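
Part of the list scan described above (names containing “Village”, numbered splits, spelling variants) could be pre-screened with a script before the manual review. The following is only a minimal sketch, assuming the village names have already been copied out of JOSM into a plain Python list. The function name `flag_name_issues`, the 0.85 similarity threshold and the sample names are illustrative assumptions, not part of the procedure.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return re.sub(r"\s+", " ", name.strip().lower())


def flag_name_issues(names, similarity=0.85):
    """Return (name, reason) pairs for names that deserve a manual look."""
    issues = []
    for name in names:
        # "Village" should not normally be part of the name itself.
        if re.search(r"\bvillage\b", name, flags=re.IGNORECASE):
            issues.append((name, "contains the word 'Village'"))
        # A standalone number often means one village was split into parts.
        if re.search(r"\b\d+\b", name):
            issues.append((name, "contains a number (possible split village)"))
    # Compare every pair of distinct names to catch spelling variants.
    for a, b in combinations(sorted(set(names)), 2):
        ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if ratio >= similarity:
            issues.append((f"{a} / {b}", "similar names (possible duplicate)"))
    return issues


# Hypothetical sample names, for illustration only.
sample = ["Example Village", "Samplea 1", "Samplea 2", "Otherton"]
for name, reason in flag_name_issues(sample):
    print(f"{name}: {reason}")
```

The pairwise comparison is quadratic in the number of names, which should be fine for a single project area but would need a different approach for very large lists. Anything the script flags still needs a human decision.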
Location: Nabarka, Mbomou, Central African Republic


Comment from Stalfur on 4 August 2015 at 20:40

Data quality and overview are something that I’ve been trying to get a better handle on.

For example this is an automated evaluation of the status of Mbomou region

My bots query Overpass for settlements (city/town/village/hamlet) and assign them to regions using Nominatim. Overpass is then used to see whether a road reaches each settlement (a track doesn’t count at the moment), and whether there are residential streets, buildings, amenities (church, school, etc.) and so on. The bots will only mark green for the road network; for the other values they mark yellow (partial) or red (none). It is up to humans to upgrade an evaluation to green, for example if all buildings are on the map.
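
To make the idea concrete, here is a rough sketch of what such a settlement query and the bot-side rating could look like. It is not the actual bot code: the function names, the feature labels, the rule that any count above zero is “partial”, and the bounding box are all assumptions made for illustration. The query string uses standard Overpass QL and is only built here, not sent to a server.

```python
def build_settlement_query(south, west, north, east, timeout=60):
    """Build an Overpass QL query for settlement nodes inside a bounding box."""
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout}];\n"
        f'node["place"~"^(city|town|village|hamlet)$"]({bbox});\n'
        "out body;"
    )


def bot_status(feature, count):
    """Rate one feature of a settlement the way the comment describes.

    Assumed rule: nothing found is red; the road network may be marked
    green by the bot; everything else stays yellow until a human reviews it.
    """
    if count == 0:
        return "red"
    if feature == "road_network":
        return "green"
    return "yellow"


# Illustrative bounding box only, not the real boundary of any region.
print(build_settlement_query(4.0, 22.0, 5.0, 23.0))
print(bot_status("road_network", 1))
print(bot_status("buildings", 12))
print(bot_status("streets", 0))
```

Keeping the upgrade to green out of `bot_status` for everything except the road network mirrors the split between automated and human evaluation.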

For example, Bakouma has been evaluated as being linked to the road network, with no streets or buildings but with an amenity. Looking at the Bing and Mapbox previews it is easy to see why there are no streets, so I logged in as an OSM user and updated its Imagery status to red.

In the Mbomou list there is a link to the right that allows you to evaluate the imagery for the region, using 1 2 and 3 as shortcuts for the quality of imagery.

For more mature examples check out Höfuðborgarsvæðið or Norðurland eystra in Iceland, where the quality difference due to lack of imagery and/or data is evident. A more African example comes from Kgalagadi in Botswana.

This is a beta tool but I have high hopes for it and any input is appreciated. This could be useful not just to give automated overviews but also for spotting those nameless villages (bottom of list as no name), multiple settlement nodes (imports gone wrong) and lastly as a sort of a task distributor, more on that on the github pages.

Comment from Alan Bragg on 5 August 2015 at 10:41

Nice work. I like your procedure. It prompted me to check out “Presets–>Search for objects by preset”. This JOSM search will be very useful to me, and your blog got me to try it for the first time. Thanks for writing this diary entry.

Comment from Lisa Marie Owen on 6 August 2015 at 12:59

@Stalfur - that looks great! What an interesting project. Disastermappers at University of Heidelberg have developed an imagery checking tool that would be a great tie-in, though it is being used right now for identifying where mapping efforts are needed. Here is a link to its current version:

I noticed the tool picks up village names that have been removed (when the villages were combined). Will these remain or disappear when it is updated?

Thanks for sharing! I look forward to seeing its further development and use.

@Alan Bragg - Glad to be of help!

Comment from Stalfur on 6 August 2015 at 14:38

A previous PyBossa imagery effort was the inspiration for my imagery evaluator; I believe they are complementary.

Names or nodes that are removed are planned to be archived in some way. Many nodes need fixing: names corrected or added, geographical location fixed (many imports miss the villages by hundreds of meters) and so on.

Can you give me some examples of removed village nodes? So far the script has just been run once to gather settlement nodes but the plan is to run it at regular intervals to be up to date on missing/none/partial status - leaving the evaluation of being good to humans as before.

Thanks for the message, commenters don’t get an update when there is a reply, only the blog author.

Comment from Lisa Marie Owen on 6 August 2015 at 14:50

Examples of removed village nodes: Yongo Fongo 1 Suite, Yongo Fongo 2, Yongo Fongo 3 were all combined, as they were all taken as GPS points in the field to mark the extent of the same village, but were input to OSM as separate villages originally. This was one of my tasks within the global validation - combining both names and residential area outlines. Other names were also cleaned up, as many of them contained Pk numbers, which indicate how many kilometers from Bangassou the villages are and are used interchangeably with village names. I put Pk’s as alt_name where available.
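
For anyone doing a similar cleanup, moving Pk numbers out of the name could be pre-processed along these lines. This is only a sketch under stated assumptions: it assumes the Pk appears in the name as “Pk” followed by digits (for example “Pk 12” or “Pk12”), which may not match how every name was entered, and the function name `split_pk` and the sample names are made up for illustration. The result should be reviewed by hand before it goes anywhere near OSM.

```python
import re

# Assumed format: "Pk" (any case), optional dot/space, then digits.
PK_PATTERN = re.compile(r"\bpk\s*\.?\s*(\d+)\b", re.IGNORECASE)


def split_pk(name):
    """Split a raw village name into name / alt_name tags.

    If a Pk number is found it is moved to alt_name; whatever is left
    becomes the name. Names without a Pk number are returned unchanged.
    """
    match = PK_PATTERN.search(name)
    if not match:
        return {"name": name.strip()}
    # Remove the Pk part, then tidy leftover spaces and punctuation.
    cleaned = PK_PATTERN.sub(" ", name)
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" -,()")
    tags = {"alt_name": f"Pk {match.group(1)}"}
    if cleaned:
        tags["name"] = cleaned
    return tags


# Hypothetical names, not taken from the survey data.
for raw in ["Sampleville Pk 12", "PK12 (Sampleville)", "Pk 7", "Otherton"]:
    print(raw, "->", split_pk(raw))
```

A name that consists only of a Pk number ends up with just an alt_name here, which leaves the decision about the primary name to the mapper.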

Comment from Stalfur on 6 August 2015 at 15:00

I will keep this area in mind as I write the re-importer. Will keep you posted once I have something more to show.

Comment from Lisa Marie Owen on 6 August 2015 at 15:01

Thanks Stalfur!

Comment from drolbr on 14 August 2015 at 17:01

Concerning the original validation procedure: If you want to download the data in one go, or at least in larger chunks, you can use the mirrored_download plugin. It adds an extra menu item, “Download from mirror …”, that fetches the data from the Overpass API instead of the main API. That saves load on the main API and, as a side effect, increases download speed.
