OpenStreetMap

Why I am against the wholesale import of administrative boundaries from any 3rd party source for the Philippines

Posted by maning on 11 August 2015 in English (English)

Disclaimer: This is specific to the Philippines, not a general OSM issue.

Administrative boundaries (admin_level=*) are among the most difficult data to collect in OSM. They defy the on-the-ground rule: one cannot just go out and start surveying admin boundaries with a GPS. On the other hand, we see the importance of having admin boundaries in our database. We can define town/city limits. It improves geocoding. The maps look nice. Humanitarians need them because they can plan and allocate resources according to administrative jurisdictions during a crisis. The only logical way to have this in OSM is to get them from various sources and do an import.

The most comprehensive source we found for the Philippines is the freely available GADM. This website has a comprehensive collection of administrative boundary data, free of charge, down to the smallest administrative units for many countries including the Philippines. Over the years, I tried to track down the provenance of GADM’s PH data. My geo-forensic skills led me to people saying that the PH dataset originated from our national mapping agency. So, it seems very authoritative; why don’t we just import it? I say NO and here’s why (again, I am referring only to the PH situation; GADM data may be good in other countries, and I have the utmost respect for the maintainers of the site for sharing this with the public).

The license is incompatible. Period. End of discussion. Eugene discussed this years ago in our mailing list.

Even if the license is compatible, the data quality is REALLY bad. Again, from Eugene’s mail to the list, here’s a screenshot comparing OSM and GADM boundaries in Quezon City.

qc_gadm

If the above arguments do not dissuade you, consider this story.

Administrative boundaries imply a level of authoritativeness in the data. Ordinary users of the data may take it as exact and definitive. For example, during the immediate relief operations after Typhoon Haiyan in the Philippines, OSM volunteers across the world participated in the remote mapping response. We generated a tremendous amount of building, road, and landuse data, which was used by many international response agencies. Since we lacked admin boundaries in the affected areas and could not map them remotely, response agencies resorted to using 3rd party admin boundaries (GADM is one of them) as an overlay on OSM-derived basemaps. These maps were used by humanitarian agencies to organize relief operations. When I went to the affected areas months after the typhoon, local authorities were complaining that, in some instances, international agencies were insisting on their own maps, particularly the village boundaries, as the basis for allocating relief supplies. Local authorities would insist that these boundaries were wrong and did not exist in reality. Of course, response and relief planning should depend on other factors and not just on a map, but a wrong map exacerbates an already confusing situation.

So am I saying that we shouldn’t be adding admin boundaries in the Philippines? No. What I’m saying is that any 3rd party admin boundaries covering the whole country that you stumble upon, either from the internet or directly from various agencies, are wrong. No comprehensive data exist in the PH.

What can we do now?

Talk to your local government authorities. You may find good data from specific local government units. If you ever get hold of such data, let us know, and if it is good enough, we can start the process of adding it to OSM.

Go out and survey! Map places as nodes. Oftentimes, a place node (place=town,city,village) is enough. For finding places in OSM, would you want a geocoder to give you a wrong boundary/polygon, or a node/point that tells you exactly that this place exists in reality? In many cases, this is what I’m trying to do. Here’s a screenshot of village nodes we mapped in a remote town in Leyte. Compare that to the inaccurate (by 3 km south) imported boundary of the town. Unless we get a better source to replace this boundary, I would rely more on the place nodes we surveyed with the local community.

burauen
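
To put a number on a mismatch like this, one could check which surveyed place nodes fall outside the imported polygon and measure the gap. Below is a minimal Python sketch of that idea. The coordinates, village names, and the square "boundary" are all made up for illustration (they are not the real Burauen data), and `haversine_km` and `point_in_polygon` are hypothetical helper names, not part of any OSM tool.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def point_in_polygon(lat, lon, ring):
    """Ray-casting test; ring is a list of (lat, lon) vertices.

    Treats coordinates as planar, which is adequate for a town-sized area.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        lat_a, lon_a = ring[i]
        lat_b, lon_b = ring[(i + 1) % n]
        # Does this edge straddle the node's latitude?
        if (lat_a > lat) != (lat_b > lat):
            lon_cross = lon_a + (lat - lat_a) * (lon_b - lon_a) / (lat_b - lat_a)
            if lon < lon_cross:
                inside = not inside
    return inside


# Hypothetical imported town boundary (a simple square) as (lat, lon) pairs.
boundary = [(10.90, 124.80), (10.90, 124.95), (11.00, 124.95), (11.00, 124.80)]

# Hypothetical surveyed village nodes.
place_nodes = {
    "Village A": (10.95, 124.85),   # inside the imported polygon
    "Village B": (10.873, 124.90),  # south of the polygon's southern edge
}

outside = [name for name, (lat, lon) in place_nodes.items()
           if not point_in_polygon(lat, lon, boundary)]

# Gap between Village B and the southern edge (latitude 10.90) at the same longitude.
gap_km = haversine_km(10.873, 124.90, 10.90, 124.90)

print("Nodes outside the imported boundary:", outside)
print("Gap to the southern edge (km):", round(gap_km, 1))
```

A check like this only flags a disagreement between the surveyed nodes and the polygon. Deciding which of the two is right still needs local knowledge.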

I know that sometimes it is very frustrating. OSM is heralded as one of the best freely available sources of geographic data, yet we lack the most basic admin boundaries over many areas. Imports might be the immediate solution, but I ask PH mappers not to do this. I’m fine with missing boundary data for now; in time we can improve this the OSM-community-way. By importing, we may be propagating more errors rather than helping improve our map.

Location: Burauen, Leyte 2nd District, Leyte, Eastern Visayas, 6516, Philippines

Comment from SimonPoole on 11 August 2015 at 10:41

I’m not sure why you are assuming that GADM data has an acceptable licence for OSM (no issue with the rest of your post though).

Please see: http://wiki.openstreetmap.org/wiki/Contributors#GADM_.28Global_Administrative_Areas.29

Comment from redsteakraw on 11 August 2015 at 14:28

I would say there should be tiles with just the road shapes and names to be used as a reference for mappers. This was done for United States mappers with the more recent TIGER maps.

Comment from MapMakinMeyers on 11 August 2015 at 16:40

I agree with Maning. GADM is junk in my opinion. I have reached out to Dr. Hijmans (http://desp.ucdavis.edu/people/robert-j-hijmans) numerous times and have spoken on the phone with him with regard to the data and sources. He does not feel the need to cite his sources. He does not know where all the data is from!? And he is impossible to work with. You could send him 50 e-mails and he will not reply (yes, I did this and he does not reply…).

At one point there was a license mentioned for the data: it was a non-commercial one.

I think talking with the proper government agency in the country the boundaries belong to is best, whatever the country. If a nation does not have quality data with a proper license, then users should take action and go out and map.

Comment from Rovastar on 11 August 2015 at 19:49

Disclaimer: I am not local to this and have not looked at the data, just your interpretation. License issues aside, I am unsure of your objections here.

The picture that you had for the “really bad” example (and I presume you picked a particularly bad case to make your point) I do not think is bad at all. It is mostly a little offset that is all. If you realigned them it would be much clearer. A semi-mechanical, human-verified import could easily overcome this. Even if you do not do this, comparing and contrasting the dataset with OSM data can flag areas that need to be investigated by other means, e.g. on the ground (if that is at all possible with admin boundaries), etc.

It appears you haven’t considered these possibilities.

Also, I am a little confused about the objection to adding something that is roughly/nearly/high percentage correct instead of the alternative which is, well, nothing at all. This is an ongoing issue with most opendata projects I know of: having something that is not 100% correct and improving it over time, versus waiting until it is perfect before adding it.

Comment from seav on 11 August 2015 at 21:23

@SimonPoole, I don’t know where you read that Maning says that GADM’s license is compatible. He did say that “The license is incompatible.”

@Rovastar, I’m the “Eugene” mentioned by Maning, and I actually tried to remove the “offset”. You can’t do that for the whole dataset with any consistency. If you go several kilometers north of the area shown above, the offset changes drastically, as seen in the map below.

Northern Caloocan GADM vs. OSM
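
The point about inconsistent offsets can be illustrated with a small check: take a few matched control points from both datasets, compute the displacement at each, and see how much the displacements differ from one another. If a single shift could realign the data, the spread would be close to zero. This is a minimal Python sketch with made-up coordinates (not the actual Quezon City or Caloocan data); `offset_m` and `offset_spread_m` are hypothetical helper names, and the metres-per-degree conversion is a rough approximation.

```python
import math

M_PER_DEG_LAT = 111_320.0  # approximate metres per degree of latitude


def offset_m(osm_pt, other_pt):
    """Approximate (east, north) displacement in metres from osm_pt to other_pt."""
    lat1, lon1 = osm_pt
    lat2, lon2 = other_pt
    north = (lat2 - lat1) * M_PER_DEG_LAT
    # Degrees of longitude shrink with the cosine of the latitude.
    east = (lon2 - lon1) * M_PER_DEG_LAT * math.cos(math.radians((lat1 + lat2) / 2))
    return east, north


def offset_spread_m(pairs):
    """Largest deviation (metres) of any displacement from the mean displacement."""
    offsets = [offset_m(a, b) for a, b in pairs]
    mean_e = sum(e for e, _ in offsets) / len(offsets)
    mean_n = sum(n for _, n in offsets) / len(offsets)
    return max(math.hypot(e - mean_e, n - mean_n) for e, n in offsets)


# Hypothetical matched control points: (OSM position, third-party position).
# Case 1: every point is shifted by the same amount, so one translation fixes it.
consistent = [
    ((14.60, 121.00), (14.601, 121.001)),
    ((14.65, 121.05), (14.651, 121.051)),
    ((14.70, 121.02), (14.701, 121.021)),
]

# Case 2: the shift changes from place to place, so no single translation works.
inconsistent = [
    ((14.60, 121.00), (14.601, 121.001)),
    ((14.65, 121.05), (14.653, 121.048)),
    ((14.70, 121.02), (14.699, 121.0205)),
]

print("Spread, consistent case (m):", round(offset_spread_m(consistent), 1))
print("Spread, inconsistent case (m):", round(offset_spread_m(inconsistent), 1))
```

A spread of hundreds of metres means that realigning one area would push another area further out of place.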

Comment from maning on 11 August 2015 at 22:26

“I’m not sure why you are assuming that GADM data has an acceptable licence for OSM”

@Simon, thanks for the link.
That is partly the reason why I’m saying the license is incompatible (paragraph 4). Borders from selected countries were given explicit permission, so others may get the impression that GADM is kosher for OSM. I should have added that link in my post.

“It is mostly a little offset that is all.”

@Rovastar, that image is actually one of the “best” ;). In other areas it is worse. As Eugene mentioned in his reply, this is not a consistent offset issue. The second image in my post, of Burauen, Leyte, has a boundary offset by as much as 3 km from the farthest village south. The imported boundary of this area closely resembles the GADM data, although it might not come directly from GADM. As I said, there may be other sources, but all of these data are similar to GADM in terms of quality.

“adding something that is roughly/nearly/high percentage correct instead of the alternative which is, well, nothing at all”

In most cases, I agree with the “roughly/nearly/high percentage correct instead of nothing at all” mantra except for admin boundaries. There are various factors in play such as maintaining relations integrity, too few mappers, etc., but more importantly, unlike a road or a building, mappers cannot easily correct them through satellite imagery or field surveys with a GPS.

Comment from SimonPoole on 12 August 2015 at 11:33

@maning & @seav the wording is simply a bit ambiguous. First you cite a third party saying that the licence is incompatible, and then you continue with “Even if the licence is compatible ….” It really isn’t, and the permission we got for a handful of countries was granted after the fact, in dire circumstances.

To be very clear: anybody who imports GADM data -now- -will- get caught and the data -will- be removed immediately.
