OpenStreetMap

Finally, the project to import the municipal limits is finished. It took longer than expected because of some of the issues that arose along the way. First, I would like to thank all the members of the team, and also the volunteers who helped the team when doubts and issues arose.

Importing official data into OSM seems like a simple task at first, and up to a certain extent it is when the amount of data is small. However, when we talk about all the municipal limits of a whole country, as in this case, problems arise.

First problem: data availability. Before we started this project (prior to 2014), the geographic data from the Mexican Government weren’t open. It was possible to get them for academic or personal use, but using them in a project like OpenStreetMap meant overcoming a series of legal gray areas that no team would be willing to deal with unless it had the resources to tackle them. The best example is Google: the Google Maps team has used Mexico’s official geographic data (that is, the databases from the INEGI) over the years. However, before these data were declared open data by the Mexican Government, you needed a series of legal and licensing agreements that required analysis by a legal team, so just setting up a cooperation project with the government meant considerable expense. That’s the kind of deal only a big corporation like Google could afford to fund. As good as the information from INEGI might be, no individual or team from the OSM community would have the means to cover the costs of a licensing deal just to import data into OpenStreetMap. In fact, that was one of the reasons we initially thought about creating an NGO to push the geographic open data agenda within the Mexican Government.

However, and somewhat surprisingly, the government decided to release a lot of its data as open data, including the geographic data. How this happened, and the factors that allowed it to happen, would need another post, but if you’re interested in a bit of the history you could watch the talk given by [] from the INEGI when he was invited to speak at State of the Map LatAm in 2014 in Chile.

Once the data was declared open by the government, there was no longer a need for an entity whose agenda was to push for the opening of the government’s geographic data. We could then focus on the next objective: taking advantage of the richness of the INEGI data in OSM. This is, up to a point, a perfect example of why governments should release their data as open data. Had the government’s own initiative not come to fruition, our efforts would have been spent on lobbying and political activism instead of working with the data and using them for the benefit of citizens (which in the end is one of the objectives of the work the INEGI does). Besides, there was no certainty that all that lobbying would ever have achieved its final objective of opening the data, so it’s really fortunate that this happened organically within the institution itself and within the Mexican government. Governments and institutions that manage geographic data in other countries could take note of this.

Second issue: data conversion and conservation. Uploading data in an import is not that simple. You must follow a series of procedures to preserve the existing information, regardless of its validity, in order to avoid destroying valuable data previously contributed by OSM users. It’s really complicated to validate automatically which information is valid and which isn’t for a whole country, unless there’s a very clear and specific reason, such as data going against the OSM tagging rules, which are the only guide for determining what data should be kept. Even that is not enough, though, so you should make a backup before the import.

For starters, the boundary data doesn’t come out of the shapefile ready to be uploaded to OSM. It’s in an INEGI-specific projection, and it’s not topologically ready to be converted from shapefile format to OSM format: it needs to be reprojected, transformed, and properly split at the common boundaries. This is an important step to avoid duplicated ways, and we found out it will be a real challenge for more granular boundaries such as the colonias boundaries.
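
To give an idea of what splitting at the common boundaries involves, here is a minimal sketch in Python. It is not the tooling the team used. It assumes each boundary is a closed ring of coordinates and that neighboring polygons share exactly identical vertices along their common border, which real shapefile data may not guarantee. The function name `shared_edges` and the two sample squares are hypothetical.

```python
from collections import defaultdict

def shared_edges(polygons):
    """Map each undirected edge to the set of polygon ids that use it."""
    owners = defaultdict(set)
    for pid, ring in polygons.items():
        # Walk consecutive vertex pairs of the closed ring.
        for a, b in zip(ring, ring[1:]):
            edge = tuple(sorted((a, b)))  # same key in either direction
            owners[edge].add(pid)
    return owners

# Two hypothetical square "municipalities" that share one side.
polygons = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)],
}

owners = shared_edges(polygons)

# Edges used by more than one polygon should become a single way
# referenced by both boundary relations, instead of two stacked ways.
common = {edge: ids for edge, ids in owners.items() if len(ids) > 1}
```

In this toy case only the side between the two squares ends up in `common`. With real data, tiny coordinate differences between neighbors would have to be snapped together first, which is part of what makes the more granular boundaries harder.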

Just as an example, once converted to an OSM-compatible format, the limits file had more than 1,700,000 elements, which can’t just be uploaded in a single changeset. It was necessary to split this huge file into several changesets so that the OSM API would accept the upload, and also to validate errors on a state-by-state basis. That meant an enormous amount of time spent on manual work, such as verifying the admin_centre and separating non-boundary data that had been merged into boundary ways, because by replacing the limits and then adding back the backed-up data we could end up uploading errors from the restored data, which would be a disservice to the OSM map. The detection of such errors was done with the help of some scripts, but whether something was really an error was most of the time judged by the team, based on the OSM editing guidelines. That took considerably more time than a 100% script-based validation would have, and was of course exposed to human error.
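
The splitting itself can be sketched roughly like this in Python. This is only an illustration, not the scripts we used. The limit of 10,000 elements per changeset is an assumption (the real value is whatever the OSM API reports), and the names `chunk`, `plan_changesets` and the sample states are hypothetical. A real split would also have to keep each way together with the nodes it references, which this sketch ignores.

```python
from itertools import islice

MAX_ELEMENTS = 10_000  # assumed per-changeset limit, check the API's own value

def chunk(elements, size=MAX_ELEMENTS):
    """Yield successive lists of at most `size` elements."""
    it = iter(elements)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def plan_changesets(elements_by_state, size=MAX_ELEMENTS):
    """Return (state, changeset number, element count) tuples, state by state."""
    plan = []
    for state, elements in sorted(elements_by_state.items()):
        for n, batch in enumerate(chunk(elements, size), start=1):
            plan.append((state, n, len(batch)))
    return plan

# Tiny hypothetical example with a limit of 10 elements per changeset.
demo = plan_changesets(
    {"Colima": range(25), "Aguascalientes": range(10)}, size=10
)
```

Grouping by state first keeps each changeset reviewable on its own, which matches the state-by-state validation described above.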

In the case of Mexico’s municipal limits, we’re talking about 2,457 limits. Some limits were already present in selected states, where it was clear that they had been imported from the INEGI but never documented, and hence were invalid. In those cases we replaced the limit with the updated INEGI data, including the identifier from the INEGI source, to make it easier to update these limits in the future whenever the source updates them. It also works the other way around: the INEGI can retrieve this data from OSM to identify where it should update its mapping efforts in the field, based on what the OSM community is updating.

Third problem: idiosyncrasies. Before we started the import we knew, from reading about the experiences of previous imports, that some people are never happy with imports, either because they don’t share the same objectives as the importing team or because they have their own interpretation of the philosophy of OpenStreetMap, in which they consider themselves owners and guardians of the data they contributed and assume that no one else should touch or modify it. This notion that whoever tries to improve OSM through a large-scale import is destroying or damaging OSM has been well known within the community since the times of the TIGER import in the United States. The reality is that OSM wouldn’t be what it is today without the importing efforts, or perhaps it would, only it would have taken longer to get there. Errors might be made, but if that’s the cost of improving the map at a reasonable speed then I think it’s a reasonable tradeoff, as long as the errors are fixed.

Opinionated and polarizing views will always exist in open source communities. Some folks will be demonized just for being part of a corporation or for having their own agenda, but in reality everyone has their own agenda; the point is whether those agendas contribute to the improvement of the project.

In order to keep this import current and to help the INEGI update the cartography of municipal limits, we set out to list every municipal boundary in Mexico’s OSM Wiki (thanks Irk_Ley!). This way the INEGI, and actually anyone, will be able to track which boundaries are being modified according to local legislation, something users with legal backing can do when a boundary differs from the one INEGI set.

Finally, I would like to thank everyone on the import team, noting that it wasn’t only people from Telenav: we also received a great deal of technical support from the HOT OSM team and from local OSM mappers from Mexico, Puerto Rico and Europe. I don’t mention names because they’re all already listed in the import wiki ;)
