Is the OpenStreetMap Rails App Appropriate for Other Data Sets?

Posted by mikelmaron on 19 June 2013 in English (English)

I've begun technical advising on the next iteration of a collaborative mapping project, to collect, discuss and disseminate data and stories on deforestation in Democratic Republic of Congo. I first began thinking about it over a year ago, with this write up on Moabi and GeoWeb challenges, and have since helped WWF iterate on the old platform with Sigaptaru. Right now, I'm at the offices of IIASA, a global scientific research institute situated outside Vienna, in a castle. A few weeks from now, I'll be in Kinshasa, challenging most all of my assumptions about the project.

Right now, I want to challenge some of the technical direction this project is taking, and I hope you can help. I may be in danger of seeing every map through an OpenStreetMap lens; though it's possible I may be on to something.

Naturally, I want this project to be based on existing, active open source projects. Architecturally, we favor focused components integrated through appropriate data sharing and APIs, over a monolithic system. In Moabi, this means good functional separation between data management, and exploration, presentation and communication, with links back to dig in if desired. We've started talking about this as the difference between the kitchen and dining room (though our kitchen would be open plan, and anyone can come in and cook. The analogies are endless). There are groups of folks already collaborating on creating and sharing DRC data, but "bilaterally" and without a full community or tool for coordination. The data collected in Moabi is relatively specialized, like details of mineral rights concessions, REDD+ project boundaries, field surveys, artisanal logging sites, proposed road projects, etc.

So the model of collaboration and collection of geographic data does have some parallels to OpenStreetMap (if a smaller potential contributor base), but the data itself does not always fit into what's appropriate for OpenStreetMap. Does it then make sense to fork the OSM rails port, and build off the same code base? And further, to model the community interaction and experience of working within this amazing project for the past 8 years? OSM is certainly the most successful mapping collaboration out there. MediaWiki, the software behind Wikipedia, is reused a tremendous amount. Perhaps there's a similar dynamic possible. The USGS took this approach when looking to work on the National Map. And it's not just the rails app, but the leverage of the entire architecture of planet dumps, map tiles and indexing, the great new iD editor, a well developed import process, and flexible, consensus-based discussions of representations in tags. As the OSM app and associated tools get better, a system based off it can improve in parallel. The National Park Service has been working on beautiful and usable ways to distribute their maps as Park Tiles, and we could model that as well.

The question is, what's missing in OSM that's part of the Moabi requirements, how hard would it be to build these new features, how difficult will it be to maintain a branch, and how many of these features are useful to OSM core?

Here are some of the departures from current OSM that we've discussed. There needs to be some internal definition of a feature layer, and we may need permissions or at least focus on particular tags on particular features for particular user groups (i.e. some fields may be imported but need not be edited in Moabi). Perhaps the way iD is representing forms for objects could be useful here. Imports need to be properly marked and sourced; this might mean flagging user accounts as import or bot accounts. Collaborators need a space to discuss the focus of data campaigns, which could be a geographic region and/or a topic, and a way to set a number of data goals and monitor progress on those goals. Some of those goals could mean a kind of quality control. This could be an implementation of Groups, as we started during the SOTMUS code sprint, integrated with an enhanced Tasking Server. A group might have a leader. Feature layers are published as tile sets, which are composited together by the tile server in different ways. A kind of CMS could power the front end, displaying various maps, highlighting useful work, telling stories behind the data, sharing out.

What's the alternative to building off OSM? Well, I guess that would be GeoDjango. Not a bad start, and flexible.

Would love to hear thoughts from those of you who are experienced with building web mapping applications, developers of the OSM website, and anyone else who thinks this is a great idea or the first step towards painting ourselves into an OSM-shaped corner.

Location: Neu-Guntramsdorf, Guntramsdorf, Bezirk Mödling, Lower Austria, 2353, Austria

Comment from ebwolf on 19 June 2013 at 16:03

Machine tags... At USGS, I started exploring machine tags as a way of managing importation metadata. For instance, tags in OSM are simple key=value pairs. When USGS data was imported into OSM, you'd see tags like (working from memory and it's been a while):


These tags provided a relational link back to the feature in the Geographic Names Information System (GNIS) database. But these tags provided no informational content to OSM editors and users and are frequently removed. I've also been told by some very active OSM contributors that the presence of these tags led them to believe that they shouldn't edit the feature, and much of the data imported from GNIS has been left untouched.

A machine tag would look a little different:


The same feature may also have a tag like:


Which would essentially make OSM not just a data conflation environment but also an ersatz lookup table across spatial databases. The part before the ':' establishes the "schema" that the tag participates in along with the non-machine tags.

Unfortunately, the OSM platform is too "egalitarian". You can create machine tags like this without touching the platform, but the key element is missing: the OSM platform needs a permissions system to show these tags only to certain users, or to allow data to be displayed based on a filter on these tags.
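A rough sketch of what such a filter could look like, in Python. This is not something the OSM platform provides; the role name, the `HIDDEN_NAMESPACES` table, the `usgs:internal_id` key and the `visible_tags` helper are all made up for illustration.

```python
# Hypothetical table: which roles may see tags in a given namespace.
# Namespaces that are not listed here are visible to everyone.
HIDDEN_NAMESPACES = {
    "usgs": {"usgs_staff"},
}


def namespace_of(key):
    """Return the part of a key before the first ':', or None for plain keys."""
    prefix, sep, _ = key.partition(":")
    return prefix if sep else None


def visible_tags(tags, roles):
    """Return only the tags that a user holding `roles` is allowed to see."""
    shown = {}
    for key, value in tags.items():
        allowed = HIDDEN_NAMESPACES.get(namespace_of(key))
        # No entry means the namespace is public; otherwise one role must match.
        if allowed is None or allowed & set(roles):
            shown[key] = value
    return shown


# Illustrative feature; the tag values are placeholders, not real data.
feature = {
    "name": "Example Creek",
    "waterway": "stream",
    "usgs:internal_id": "placeholder",
}
print(visible_tags(feature, roles=[]))              # machine tag hidden
print(visible_tags(feature, roles=["usgs_staff"]))  # machine tag shown
```

The same check could run in the API layer or in the editor; where it belongs depends on whether the tags only need to be kept out of sight or actually protected from edits.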

One of the "hacks" I made in Potlatch2 and elsewhere was to hide some tags that the USGS used internally. You can see these tags in the planet file or if you look at the history. But you cannot change them in the OSM platform. These hidden tags were a key part of the back-end system using some Python, FME and ArcGIS that was used to manage the "process" of producing authoritative data from user contributions.

There are other platforms being developed. But I think OSM Railsport still has the richest, most mature feature set. One difficulty of using OSM is that it ties you to a community that can be challenging to work with.

Comment from wonderchook on 20 June 2013 at 05:12

@ebwolf: One thing we've done is link things the other way. To the OSM ids. Which isn't perfect either since people can delete the feature and re-add it or split it or any other number of things that could change the OSM id. Though it has worked fine for what we have been doing so far.

@mikel: I wonder whether the issue is that the OSM code base has been aimed entirely at one base. I think there have been forks of it, but the modifications have been for specific projects rather than with the goal of creating a geodata platform that could be deployed for a number of uses.

I think also the social aspects you are suggesting would make the rails application more useful for other types of communities. Meaning I would think the goal of using the codebase for another project would be to build a community around data. Currently that would be really hard, since all the other community tools that osm uses would also have to be set-up.

Comment from mikelmaron on 20 June 2013 at 16:29

@ebwolf: Machine tags, a good idea. I'm also thinking it could help with attribute level focus/access; much of the enhanced data is not going to fit in standard tags, and can organize group focus around that. Definitely permissions/roles are going to be something to add, so that's something to investigate ... is it relatively straightforward to add to the current app, or is the notion of open so deep, that it would become a sink.

@wonderchook: I've been thinking a lot about enhancing the social functions directly in the rails app, we started hacking on it during the SOTM US sprint day. So possibly, this project provides some support to really advance on those features. Question is, how much of what's needed for Moabi is also needed for OSM.

The other platforms I'm thinking about are Cartaro and GeoNode. There you have GeoServer on the back end, and then Drupal or Django for presentation. Just GeoDjango might be good enough too, to start from. Interested to hear more opinions on other options definitely.

Comment from JimmyRocks on 20 June 2013 at 23:00

Issues with your system

It sounds like you want to base socialization off of datasets. I could envision this being implemented as having a discussion attached to each node, way, or area in the database. Flickr is linking images with OSM data in what could be a similar way.

Machine tags / hidden tags

@ebwolf brought this up as a way to make sure your tags stay in the larger openstreetmap system.

The hidden tags would prevent people from deleting them or getting confused on what to do with them.

The way that the USGS system works, it basically removes the ability to use the Advanced tab in Potlatch2. This is one way to restrict what the users can edit, see, and remove.

@wonderchook brings up some good points on why this could create a problem. The USGS system is so locked down that we don't see a lot of issues, but it is a possibility.

If you're going to use tags specific to your project, I would just add them in your project with a prefix and combine them with standard tags as well, like a machine tag:

"moabi:water_rights", "group_1"
"waterway", "river"

If you do conflate your information with the larger OpenStreetMap dataset, you can leave your tags in. But someone can go in and delete your extra tags.
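A small Python sketch of that idea, using the two example tags above. The `split_tags` and `restore_project_tags` helpers and the `name` value are hypothetical; the point is only that a prefix makes it cheap to tell project tags from standard ones and to put the project tags back if someone upstream has removed them.

```python
def split_tags(tags, prefix="moabi"):
    """Separate project-prefixed tags from standard OSM tags."""
    project, standard = {}, {}
    for key, value in tags.items():
        if key.startswith(prefix + ":"):
            project[key] = value
        else:
            standard[key] = value
    return project, standard


def restore_project_tags(local_tags, upstream_tags, prefix="moabi"):
    """Take standard tags from upstream, but keep the project's own prefixed tags."""
    project, _ = split_tags(local_tags, prefix)
    _, standard = split_tags(upstream_tags, prefix)
    return {**standard, **project}


# The project's copy of a feature, using the example tags from this comment.
local = {"moabi:water_rights": "group_1", "waterway": "river"}

# Assumed upstream copy: a mapper added a name and deleted the extra tag.
upstream = {"waterway": "river", "name": "Example River"}

print(split_tags(local))
print(restore_project_tags(local, upstream))
```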

I think for future imports to OSM from the USGS GNIS database, I will try to maintain a list of the node ids that are associated with the GNIS ids so that people can delete them in OpenStreetMap, but the information will not be lost. This is something you could keep track of in your system as well.

Pros for using OpenStreetMap

  • If your data license is compatible, information can be imported back into the main OSM database
  • There are a lot of great tools out there that can easily be used with your platform (JOSM, iD, mobile apps)
  • The planet.osm file itself has a lot of great tooling built around it.
  • A lot of issues with OSM have been brought up and discussed, and at least partially solved. This means you can spend more time mapping and less time worrying about problems.

Cons for using OpenStreetMap

  • It forces you to use the OpenStreetMap data model
  • It may be quicker to create tools that edit the map with other platforms.
  • Although there are easy ways to convert OpenStreetMap/Planet data, it still might be easier to use a system that supports a more standard format natively.

Comment from migurski on 21 June 2013 at 04:45

One interesting outcome will be to gauge the suitability of the OSM tool chain for data at a different scale, like forest outlines vs. roads. How many of the tools assume street-scale in all things, and enforce that in minimum zoom levels for editing (iD) or high-precision data output (planet XML)?

Comment from mikelmaron on 21 June 2013 at 08:41

@JimmyRocks There probably won't be much that's useful for OSM, but perhaps. For certain data layers, we can choose the license, but other imports might be restricted by source.

Maybe I'm not too concerned about the data model, perhaps I'm just used to it. Certainly it will be weird for importers, who will want OGC standards I expect. Editing, well I think iD is one of the best out there for straightforward editing.

@migurski Good point. We could relax those zoom level restrictions in iD, dependent on the data layer being edited. What do you mean by "tools assume street-scale and enforce that in high-precision data output"?

If anything, I'm most concerned about defining some notion of layers in OSM, and in adding permissions.

Comment from migurski on 1 July 2013 at 21:04

It’s a less important point, but the XML output uses seven digits of precision after the decimal point, and for data sets like land use it might be more appropriate to use 3-4. Less important, though.
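To put rough numbers on that, here is a quick Python calculation. It treats the Earth as a sphere with a mean radius of 6,371 km, which is an approximation, and the sample latitude (roughly that of Kinshasa) is just an illustrative choice.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def metres_per_step(decimals, latitude_deg=0.0):
    """Approximate ground size of the smallest coordinate step.

    Returns (north_south, east_west) in metres for coordinates written
    with `decimals` digits after the decimal point.
    """
    step_deg = 10 ** -decimals
    metres_per_degree = 2 * math.pi * EARTH_RADIUS_M / 360
    north_south = step_deg * metres_per_degree
    # Lines of longitude converge towards the poles, so east-west steps shrink.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


for decimals in (7, 4, 3):
    ns, ew = metres_per_step(decimals, latitude_deg=-4.3)
    print(f"{decimals} decimals: about {ns:.3g} m north-south, {ew:.3g} m east-west")
```

On this approximation, seven decimals resolves to about a centimetre, four to roughly 11 m, and three to roughly 110 m.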

Comment from nfgusedautoparts on 19 March 2014 at 14:12

i generally like the idea of using : id to break things out. to protect things against arbitrary editing, i have found that just sticking a README tag in (in all caps, as most editors will sort that to the top of the tags list) works for getting mappers' attention. i generally use the text to indicate why i want their attention; for this use case something like "warning: this object participates in external links for , do not delete without first contacting " might work out. i mention this because i'll wager that getting a permission system in OSM may be a very difficult political sell.

Comment from nfgusedautoparts on 20 March 2014 at 01:42

having thought about this a little more, instead of permissions how about an "externally linked" tag supported by the editors, with appropriate warnings about modification of objects carrying such a tag? probably a much easier sell than the notion of access controls in the OSM database.
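a rough python sketch of the check an editor could run before letting someone modify an object. no editor does this today as far as this thread goes; the `externally_linked` key name, the `edit_warnings` helper and the message text are made up, and only the README convention comes from the earlier comment.

```python
def edit_warnings(tags):
    """Return the messages an editor could show before an object is modified."""
    messages = []
    # README convention: the tag value itself is the note for the mapper.
    if "README" in tags:
        messages.append(tags["README"])
    # Hypothetical "externally linked" tag that editors would agree to honour.
    if tags.get("externally_linked") == "yes":
        messages.append(
            "This object is referenced by an external database; "
            "check before changing or deleting it."
        )
    return messages


# Illustrative object; the tag values are placeholders.
linked = {"waterway": "river", "externally_linked": "yes"}
plain = {"waterway": "river"}

print(edit_warnings(linked))
print(edit_warnings(plain))
```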

Comment from mikelmaron on 20 March 2014 at 10:54

@nfgusedautoparts: they may work. but since our use is outside of OSM proper, we have more latitude in implementing new features.
