Recent diary entries
We all know the OSMF has a small membership compared to the mapping community. Worse, it is skewed towards certain countries. You might hear people say Germans and US Americans dominate the membership.
In a perfect world, all countries would have a similar participation rate in OpenStreetMap, and mappers from all countries would participate in the OSMF to the same degree. The first is not something that is easily changed, but the second should clearly be our ambition. In France, a discussion about this led to a recruitment campaign with the explicit goal of rebalancing. This piqued my curiosity. Oh, and it increased membership by 90 people (from a mere 42).
But numbers without context are of little use. Guillaume (user Stereo) supported the French with membership statistics, and hooked me up with a list of membership by country for all countries. Now I'd say OSMF membership would ideally reflect the mapping community - skewed slightly for countries with a bigger data-user community. Unfortunately, no thorough statistics about mappers by country are available. Fortunately, there is something close to it: "the number of daily mappers" in a country. It is not a perfect measure:
- hdyc has a de facto "estimated home location". This would allow taking some of the random noise out of the equation
- "active mappers" (people with at least three months with a mapping day) might be a better measure
- in some cases, it's impossible to find the country of people who only map abroad
But it's close enough, and it's available. Pascal Neis, deserving his last name as usual, sent me a list of all countries with the average number of daily mappers over the last 12 months.
Numbers with context
The easiest useful measure in this context is the membership rate, which allows comparing countries. For the world, you get 26 OSMF members for every 100 daily mappers. This rate varies from 0 in Tanzania to 94 in the US. On a map, it looks like this:
For greater detail, I made a table:
First we have the number of active mappers and the number of OSMF members.
Then we look at the membership rate. The column "expected membership" gives you the number of members you would expect if all countries were the same. The representation rate is 100 when a country is perfectly average. Germany scores 156, which means that for every 100 expected OSMF members, they actually have 156. A country like Russia is heavily underrepresented with just 29 OSMF members for every 100 we would expect.
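The three measures above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the script used for the table: the function name `membership_stats`, the country labels and all numbers in the example are made up.

```python
def membership_stats(daily_mappers, members):
    """Compute the three measures per country.

    daily_mappers, members: dicts keyed by country (hypothetical input format).
    """
    # World membership rate: total members divided by total daily mappers.
    world_rate = sum(members.values()) / sum(daily_mappers.values())
    stats = {}
    for country, mappers in daily_mappers.items():
        if mappers == 0 or world_rate == 0:
            continue  # rates are undefined without mappers or members
        actual = members.get(country, 0)
        # Members we would expect if this country behaved like the world average.
        expected = mappers * world_rate
        stats[country] = {
            "membership_rate": 100 * actual / mappers,      # members per 100 daily mappers
            "expected_members": expected,
            "representation_rate": 100 * actual / expected,  # 100 = perfectly average
        }
    return stats


# Made-up numbers, purely for illustration.
example = membership_stats(
    daily_mappers={"A": 100, "B": 300},
    members={"A": 40, "B": 60},
)
```

In this toy example the world rate is 25 members per 100 daily mappers, so country A (40 members where 25 are expected) is overrepresented and country B (60 where 75 are expected) is underrepresented.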
For privacy purposes, countries with fewer than 1 or 2 OSMF members are shown with "-1" instead of their actual values.
One more thing
I should probably have been working on my position statement for the OSMF Board election, since I am one of the candidates. But since participation, growth and diversity are some of my core interests I just couldn't stop myself.
- Update: corrected the table screenshot
OpenStreetMap Notes make it easy to share some info about mistakes or missing data in OpenStreetMap. It's a very open system, allowing for posting and commenting from third party apps and even without logging in.
I'm working on an exhaustive analysis of how notes are used. Here are some preliminary findings.
Notes are often personal
While the wiki doesn't encourage using Notes as a personal to-do list, very many Notes get closed by the original poster. In fact, of all Notes that were posted by a logged in contributor, that are at least 90 days old, and are already closed, 50% were closed by the original mapper. The vast majority of those "self-closed" Notes (96%), never saw interaction. Notable though, is that IF there was some sort of interaction, there was a reaction by the original contributor in almost 90% of the cases.
One of the things that I want to investigate, is if these people who are closing so many of their own Notes, are also up to closing other Notes. What I would like to know is if the Maps.me surge in Notes created more note-closers, and if so if they were new to notes or not.
Maps.me is huge
When Maps.me implemented Notes into their app, the impact was huge. Basically, the number of Notes doubled overnight. The graph shows the monthly opened notes. In the first months of Maps.me Notes, the added notes were about the same as all other notes together!
Since its peak, the number of Maps.me Notes has been steadily going down, while the number of other notes is going up. My method of identifying Maps.me notes is looking for "#mapsme" or "Maps.me" in the opening comment, so maybe this string isn't included anymore in some versions?
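As a rough illustration of that identification rule, here is a minimal Python sketch. The function name `is_mapsme_note` is hypothetical, and matching case-insensitively is an assumption on top of the rule described above; the actual analysis scripts are written in SPSS.

```python
def is_mapsme_note(opening_comment):
    """Return True if the opening comment looks like it came from Maps.me.

    Assumption: the match is case-insensitive. Notes without an opening
    comment (None or empty string) are treated as not from Maps.me.
    """
    text = (opening_comment or "").lower()
    return "#mapsme" in text or "maps.me" in text


# Made-up comments for illustration.
samples = ["This shop is closed now #mapsme", "Road is missing here", None]
flags = [is_mapsme_note(s) for s in samples]
```

A rule like this can only undercount: any app version that stops adding the marker string would silently move its notes into the "other" category.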
And we can't keep up, right!?
Of course, this surge had a negative impact on the closing rates of Notes. However, the overall statistics that we are used to seeing, paint a somewhat overly negative image.
A tiny bit of methodology. I'm interested in how we handle incoming Notes. That means you need to look at rates, not raw numbers. One way to do that, is to define a goal, and see if we're attaining it. For example, the graph below takes the goal "Notes should be closed within 90 days". (note: I did not take account of the re-opening of Notes here.) That of course means you have to wait till a Note is at least 90 days old before you make a decision on the stat.
The graph shows that until the Maps.me surge, we had a slightly downward trend, but usually at least 60% of Notes were closed within 90 days. The "overall" line shows that this rate went steeply down as Maps.me surged. However, this is mostly caused by the Maps.me notes having a much worse performance. The closing rates of app-free notes did go down a bit, but in most months stayed close to 60%.
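The "closed within 90 days" goal can be computed along these lines. This is a sketch in Python under stated assumptions, not the SPSS code behind the graph: the function name `closed_within_rate` and the `(opened, closed)` input format are hypothetical, only the first closing time is considered (re-opening is ignored, as in the text), and notes are grouped by the month in which they were opened.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def closed_within_rate(notes, as_of, days=90):
    """Percentage of notes per opening month that were closed within `days`.

    notes: iterable of (opened, closed) datetimes; closed is None if still open.
    Notes younger than `days` at `as_of` are skipped, since they have not
    yet had their full window.
    """
    window = timedelta(days=days)
    totals = defaultdict(int)
    on_time = defaultdict(int)
    for opened, closed in notes:
        if as_of - opened < window:
            continue  # too young to judge
        month = opened.strftime("%Y-%m")
        totals[month] += 1
        if closed is not None and closed - opened <= window:
            on_time[month] += 1
    return {m: 100 * on_time[m] / totals[m] for m in sorted(totals)}


# Made-up notes for illustration.
notes = [
    (datetime(2016, 1, 5), datetime(2016, 1, 20)),  # closed in time
    (datetime(2016, 1, 10), datetime(2016, 6, 1)),  # closed too late
    (datetime(2016, 1, 15), None),                  # never closed
    (datetime(2016, 12, 20), None),                 # too young, skipped
]
rates = closed_within_rate(notes, as_of=datetime(2017, 1, 15))
```

Running the same function separately on Maps.me notes and on all other notes would give the split lines described above.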
Those annoying Maps.me users?
The reason for a slower closing rate might be simple: Maps.me Notes usually aren't made by people who close their own Notes. So a hypothesis I want to check is, if you split Notes between "made by people who tend to just work on their own Notes" and "Notes made by people who just want to share some information", is there still a difference between Maps.me and other notes?
One of the main complaints about Maps.me users is that they don't respond to comments. Of course, there can only be something to respond to, if their Note first gets commented. For that to happen, the Note has to be non-anonymous and commented by someone else who was logged in. To my surprise, this is a relatively rare situation. Less than 4% of all existing Notes were made by logged in users and responded to by other logged in users.
In this rare situation, there is a reaction from the original contributor in about 30% of the cases. If the Note was made through osm.org or an unknown app, this rises to about 40%. However, if the Note was made with Maps.me, response rate was only about 10%.
Since Navmii makes only anonymous notes, this rate can't be counted for that app. The recent StreetComplete notes get a 26% response rate in similar circumstances. So there is clearly something going on with Maps.me. Some further measures to increase comparability might be necessary, for example excluding Notes from heavy mappers. But before we saddle the horses, don't forget what the actual number is we're talking about. In 2016, if Maps.me users were exactly like the general population, there would be just 2500 more reactions from Maps.me original contributors, on a total of 123.797 Maps.me notes that year.
So the lack of communication might be relatively minor compared to the "abuse" of Notes by people who don't know what they are meant for. If someone would like to do some text analysis to see if automatic classification of usefulness is possible, I could offer a file that is ready for consumption. (the Notes dump is relatively easy to parse though)
A bit harder to analyse, would be the hypothesis that the difference is because Maps.me notes tend to contain some info that is hard to ignore (like "this place is closed now"), but hard to verify ("did you really mean Place X"? If you don't trust the contributor, and they don't answer, you have to go check).
In other news:
What on earth happened here? From May to July 2017, anonymous comments went absolutely through the roof, and then just as quickly went down to normal again. Most seemed to have been nonsense comments. Does anyone know what happened here?
Work in progress
I've been dreaming of a global database of local basic statistics about OpenStreetMap for years. That turned out to be a little over-ambitious for me. Hence my retreat to local statistics for Belgium only. Analyzing Notes is a kind of proof of concept. This article is a little side to the process of publishing statistics about Notes for the world, but also all continents, countries and "regions" in the world. Since these are the tools I'm fluent in, I wrote the analysis scripts in SPSS. They are available on Github. It should be relatively straightforward to translate to other, more open languages. Publishing will be on the Swing.eu platform. Unfortunately closed source payware, but they donated a version for OSM analysis purposes.
Shape this project
I usually start off projects like this with a clear goal in mind, then lose track of that goal completely because of all the interesting side streets I find. Working towards the Swing.eu platform helps shape that a bit. But what kind of questions would you like to see answered? I intend to make it easy to play around with the numbers yourself, but maybe there are some aspects I'm missing entirely. Let me know! Feel free to post here or as a Github Issue.
This is my position statement for the December 2017 board election.
Where I'm coming from
Since joining OpenStreetMap, I’ve found myself on a slippery slope of ever stronger engagement to the project. Not only have I been mapping at least every other day, I’ve grown into being a community organizer. At first I was mostly interested in South America, where it felt like OSM has a much larger niche to fill than in Europe. I didn't start off as an open source and open data enthusiast, but as someone crazy about maps. As a sociologist and data analyst I was fascinated by the data and the people behind it. I liked the way OpenStreetMap could solve real problems, and enjoyed being part of those solutions. Riding on the tails of Jorieke Vyncke (current Missing Maps coordinator) and Ben Abelshausen (awesome OSM routing developer and tireless organizer), it always seemed logical that we should build the map together.
Building OpenStreetMap Belgium, we worked with crafty local mappers, while supporting like-minded people around the world. We helped the development of what tools we could make with our local community, instead of complaining about the lack of global solutions. We had beers (and tea) together, to put faces to the usernames. We worked on humanitarian mapping to build consciousness about the project and to create a network of volunteers. We worked with open data - as just another tool to improve OSM. We worked with local government and other organizations to increase our visibility - using their networks for exposure, instead of having to build one from scratch. We worked on larger events - they would have been a failure without Ben’s sense of responsibility and the network of volunteers we grew during the mapathons. We worked on our online presence, with a nice website, a single point of contact for questions about OSM in Belgium, a newsletter. Finally, we are finalizing becoming a local chapter. We felt how more and more people started taking the project seriously.
This approach has been quite successful, and I would like to apply that experience to the entire project. It means focusing on growing the number of volunteers and letting them grow in their roles. It means sharing the work as much as possible, but still making sure things get done. I think I have been instrumental in that process, and I believe I could help the OSMF realize more of its plans. That’s why I am a candidate for the Board Elections.
Priorities as a Board Member
As an OSMF Board Member, I would devote most of my energy to growing the community. For me that means supporting local volunteers - which is why I am enthusiastic and impatient about microgrant and local chapter and event support plans. There are many relatively simple things that could help community builders everywhere (here's a collection of ideas I worked on). As a board member I would focus on finding more such ideas - and turning them into realities. That would need growing the OSMF and the Working Groups, finding more people to share the work. And it requires helping the OSM community be a more friendly place - not just a fun place to map, but also a fun group to be part of.
Developing local communities. We should try to offer more to local organizers. Some basic tools and functionalities, maybe a network for best practices. Local mappers should have more tools available to monitor the successes and failures in their community. We need those organizers to come to SotM, even if they can’t afford to (and offer our help actively). At SotM, full focus on local successes and failures should be obvious. A small amount of money can go a long way - we could do more to help.
OpenStreetMap needs as diverse people as possible. We should take the time to think about solutions, even if they seem to be impossible.
- How can we reach more non-English speakers, and how do we break the dominance of those that are fluent in English? In Belgium, a third language was part of the solution to surmount the language divide between French and Dutch speakers. But even so, we noticed we were excluding those who weren’t so good at foreign languages. So we include all three languages - it’s a lot of work, but it is necessary.
- Communication isn’t just about languages, but also about bridging the gap between cultures. We should be aware of the gaping holes in our understanding of each other, be it based on education, culture or gender. We should be more active in helping people learn to do this, and avoid pointless arguing, especially on media that encourage that.
- Even though OSM is a deadly serious thing, we should never forget that people contribute because it’s fun. A new mapper picks it up because they enjoy fixing that first mistake. Just as important is that an advanced mapper can keep enjoying the hard work they do. It should be just as enjoyable to become more active in the community.
We should realize we don’t know ourselves all that well. Still we have endless discussions based on assumptions, or just our own personal experience. To have more meaningful discussions, we need more facts. That means more analysis and research on how we function as a community. There are plenty of researchers both within and outside our community who are interested, let’s talk to them. My work experience might help here, as I work in government data research, mainly turning data into actionable info. Between the mapping and organizing, I've devoted a lot of time to OpenStreetMap analysis (see my profile for an overview). I intend to expand that and actively work on the relation between OSM and science.
We should be aware of the risk of running out of steam. Every year, most people who picked up mapping, stop mapping. That’s normal, but it also means we need to keep growing in order to survive. The same people who enjoyed mapping from a blank slate, might not be the people who enjoy fixing mistakes or adding ever smaller details. We need to keep looking for new ways to engage people, to spot the new use cases that were unrealistic dreams just a few years ago. At the OSMF Board, we should actively seek out new use cases for OSM, especially if they have the potential to grow the mapping community.
That said, I don’t think we should change all that much. We are rightly proud of being able to achieve so much with so little structure. We should not sacrifice our criticism or our open way of working in exchange for anything else. There are almost always win-win compromises possible. It’s just a matter of creativity and good will to find them. There is still an enormous potential for growing our community. Most people still haven’t heard about us. As the project matures, many use cases are only now becoming possible. Many minds are only now opening up to the option of open collaboration. We can grow, and we can do it together.
For more about how we’re building OSM Belgium, check out OSM.be
For more about me, check out my OSM profile
Peter Mooney, Frank Ostermann and I first met at a workshop about Crowdsourcing in National Mapping in Leuven. There were people from national mapping agencies from around Europe, who came to talk about their experience of working with crowdsourcing. I talked about the crowdsourcer’s perspective. It was a bit frustrating to be the only OSM-community representative, as I know that we’re defined by many points of view. With Peter and Frank the conversation soon went to the science aspect of that same relation. Professional scientists find it hard to talk to OSM, and OSM people find it hard to talk to scientists. We believe we can do better. And we want to do something about it. Rather than just start doing stuff, we want to invite you to discuss this with us. Below is our line of thinking, written by the three of us together.
This initiative is based on our observations that there is room for improvement in the interactions between the academic research and OSM communities. On the one hand, the OSM community often learns late (or never) about research results generated from academic research on OSM. For example, the OSM wiki pages on academic research are likely not to be up-to-date (with the majority of entries from the years 2010/2011, and little after 2014), but nevertheless quite cluttered, containing many non-English entries, and therefore difficult to search effectively. On the other hand, the academic research community has often little information on what are important concerns for the OSM community. As a result, very often academic research is carried out on OSM in complete isolation from the OSM community itself. There has been substantial interest from the academic research community into OSM since at least 2006/2007. This interest shows no signs of abating. One must acknowledge that the incredible success story of OSM is an intriguing source of potential research for academics.
Our initiative has therefore two main objectives:
- Improve communication structurally and in a sustainable way between both the OSM community and the academic research community. This includes communication about research needs within OSM, communication of research results from the academic community to the OSM community, and shared goals and interests.
- Learn about the interests and needs of the OSM community to enable co-created research
Our approach has two stages: First, this blog post aims to deliver some basic information on what we plan, why we want to go forward with it, and how we hope to reach our objectives. Further, it aims to gather feedback from the OSM community through comments, and invites members of the OSM community to contribute, and propose ideas for research studies. As a second stage, we envision a more structured survey that proposes research ideas based on suggestions from this blog post’s comments, e.g. through voting or multiple-choice questions, that offers some open questions to allow for free-form comments, and asks for ideas on how to keep any wiki pages on research ideas and results more up-to-date. However, we are open to suggestions for different approaches!
We aim for the following outcomes, to be shared with both academic research and OSM communities:
- A short report or evaluation of the procedure itself.
- Publish the highest voted or most often requested research ideas on the OSM community pages.
- Establish a mechanism that allows these ideas to be updated and any results to be fed back, e.g. through finding champions or supporters from both the OSM and the academic research communities, or linking with a discussion board on OSM research.
Some more info on why we are doing this: Academics/Researchers must write papers and do research as a key component of their ‘day job’. OSM community members want to continue to make the OSM map/database even better, map new things, write OSM software, etc. There surely exist some research problems that the OSM community is interested in investigating - these research problems could also be of great interest to the Academic/Research community. This provides great potential for a collaborative platform between Academia and the OSM community to work on problems of mutual interest. Moreover it provides the potential for a new form of collaboration where the results of the research are directed back to the OSM community for discussion and debate BEFORE they are published in academic journals or conferences. We believe that this vision of co-created research between the two communities will be of interest to everyone involved.
Joost Schouppe, OSM Belgium
Frank Ostermann, ITC, Faculty of Geo-Information Science and Earth Observation, University of Twente, The Netherlands
Peter Mooney, Dept of Computer Science, Maynooth University, Ireland
Note: this post was announced on the talk mailing list here
Over the past year, the Belgian community was involved in organizing 10 mapathons. It is an incredibly easy thing to do, once you have the documentation in order. And once you realize you should do as little as possible - just find people who have a location and a recruiting network.
Some time ago, Pascal Neis wrote an article about new mappers recruited through classic channels, Maps.me and humanitarian mapping. I asked and got a changeset dump of all the people who participated in our mapathons.
Here's some stats about that.
Overall, 1925 unique mappers participated in our mapathons, of which 328 were new mappers.
First, did we manage to turn them into returning mappers? Well... As could have been predicted by Pascal's depressing numbers: not really. The data used was from December 2016. You can clearly see that the percentage having more than one mapping day drops as we approach December. That simply means you need to wait a bit before you can do a decent analysis.
Say we give people 3 months, then we only look at the edits from September and before. We got 23% of people to map more than once! 10% mapped 3 days or more. Unfortunately, that's even slightly worse than the international average. Maybe we just worked for a more difficult audience :)
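A calculation of this kind could look roughly like the Python sketch below. It is an illustration under assumptions, not the analysis actually run on the changeset dump: the function name `retention` and the input format (a dict from mapper to the set of dates with at least one edit) are hypothetical, and "giving people 3 months" is interpreted here as only counting mappers whose first mapping day is at least 90 days before the date of the dump.

```python
from datetime import date, timedelta


def retention(mapping_days, dump_date, min_age_days=90):
    """Share of mappers who came back, among those who had enough time to do so.

    mapping_days: dict mapper -> set of dates with at least one edit.
    """
    cutoff = dump_date - timedelta(days=min_age_days)
    # Keep only mappers whose first mapping day is old enough.
    cohort = [days for days in mapping_days.values() if days and min(days) <= cutoff]
    if not cohort:
        return None
    n = len(cohort)
    return {
        "mappers": n,
        "pct_returning": 100 * sum(len(d) > 1 for d in cohort) / n,    # more than one mapping day
        "pct_three_plus": 100 * sum(len(d) >= 3 for d in cohort) / n,  # three mapping days or more
    }


# Made-up mappers for illustration.
example_days = {
    "a": {date(2016, 3, 1)},
    "b": {date(2016, 3, 1), date(2016, 3, 8)},
    "c": {date(2016, 5, 1), date(2016, 5, 2), date(2016, 6, 1)},
    "d": {date(2016, 11, 20)},  # started too recently, excluded
    "e": {date(2016, 4, 1)},
}
result = retention(example_days, dump_date=date(2016, 12, 15))
```

Leaving out the recent starters is what removes the artificial drop towards December.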
We usually tell people to map something in their own neighborhood before starting on the mission. Less than 21 of them did so. And in fact, only 4 of the 328 have more than one Belgian mapping day. As a comparison, we had 2059 people mapping for the first time in Belgium in 2016.
Even if that all sounds thoroughly depressing, it should be noted that organizing mapathons still is a great way to build a community, even if it doesn't show in these numbers. The mapathon movement was crucial in turning mappers into organizing volunteers. Especially the two interuniversity mapathons (with 200 participants last year and over 300 this year) are momentum-building moments. For the State of the Map in Brussels, we somehow managed to recruit 20 Belgian mappers to help out. That would have been impossible without the mapathons.
Apart from that, the constant confrontation with people who don't have any idea about OpenStreetMap, is a stark reminder that we should all keep up the missionary work.
This wiki page has a nice collection of stats on editor popularity. The data is up to date, but the graphs aren't. I'm not a big fan of the logarithmic scale either.
So here's one graph to tell the main story.
I focused on "the big editors" to keep the graph simple. If you want more detail, just head over to the wiki page.
You can read the graph horizontally, showing first the distribution of changesets, then number of unique contributors, then total edits. On the left, market share. On the right absolute numbers.
There are some very clear patterns there. I really like how you can connect the dots for contributors of the "default editor" at the time: first Potlatch, then Potlatch2, then iD. All three of them reached 80% market share at their peak. But iD went down in relative terms because of Maps.me. That could only happen if Maps.me editors don't use iD much. That's a good thing, as it shows they are new mappers. And it's a bad thing, as it shows that we haven't (yet) succeeded in getting them more deeply involved in OSM.
To make some of that more clear, here's three more charts. Changesets per contributor show that JOSM users are quite productive. There's also a very clear growth path for JOSM users. Merkaartor has a similar pattern. Maps.me hardly shows, with just 4 changesets per contributor.
Some changesets are bigger than others. JOSM changesets are the biggest. Potlatch2 changesets are somewhere in the middle, and iD changesets are quite small. The average Maps.me changeset has only 2 changes.
So what's the overall productivity of contributors? Here JOSM is quite extreme.
Note that this doesn't say anything about quality or amount of work. For example a JOSM changeset editing thousands of objects could have been made in minutes. Someone could have surveyed a day to collect ten POIs and map them with iD.
As one of the few remaining Potlatch users, I had to make this graph too:
As Potlatch2 lost the status of default editor, the remaining users became ever more productive. That makes sense, because "low engagement" contributors won't find the way to that editor. So the only relevant numbers are those for 2011 and 2012. And compared to that, the low numbers for iD are striking. Low numbers may mean that more people with less motivation can be pushed to make at least one edit, so you can call that a success. This is the argument for calling Maps.me editing a huge success. But it can also mean that the editor isn't as inviting to work on more stuff than just the thing you wanted to do. Anyway, a much deeper analysis would be necessary to draw any conclusions on that. You'd have to take account of previous mapping experience, later shifts to JOSM, and possible differences between 2011 and 2016 newbies, to name just a few controls. Also: the numbers are rising every year, even as it remains the editor for new contributors.
And then there's the good old Potlatch 1 of course. There's only one reason to open that ugly duckling: go to a place where you think something was deleted, press U, and you can see and recover it. It is amazing that no other editor has a similar feature that makes this so simple.
You can download the cleaned up data here (dropbox).
When OpenStreetMap started, open geodata was basically unavailable. Some governments were quicker than others to release their data. And so some places had huge imports from the start. Whether that was a good idea or not is slowly becoming irrelevant: the map is too full for big new imports anyway. Imports are ever more exercises in conflation: merging sources and using them to validate and improve existing OSM data. The good news is that often the same tools used for the "initial" import can be used for keeping the data up to date. Continuous synchronization between datasets changes the relation between data provider and OSM.
For a government, a complete and reliable OSM becomes a more valid tool for their projects. The synchronization processes we set up, can form the basis for an extra quality assurance (QA) channel for governments. It might even convince some agencies that there is little to be won by managing some of their data on their own.
To try and capture this changing relation, I started a thread on the talk mailing list. Mikel suggested creating a Wiki page on the subject: here it is. Meanwhile, several people have improved upon it!
During the course of the research for that page, I met Tomas Straupis. I wanted to share what he told me about what they do exactly with government data, and what their relationship is with the government.
Interview with Tomas Straupis
Here's a general idea what we're doing in Lithuania.
Government has datasets d1, d2... dn. OSM has one big dataset O which could be split into datasets o1, o2... om. We take datasets dx and oy which could be mapped (have similar data, like placenames, roads, lakes, rivers, etc.)
Automated importing to either direction is impossible (or not wanted by both sides). Government datasets need strict accountability (sources, documents) and responsibility. OSM has different data and simply overwriting it with government data would be bad in a lot of ways.
So the way integration between OSM and government (and actually any other datasets) is done is by synchronisation - checking for differences and taking action (mostly manual) on them on both datasets. By doing a comparison both government and OSM datasets are improved. The point here is that government datasets usually use official (document) source to update data. OSM uses local knowledge to update data. None of these methods are perfect, so synchronisation/comparison helps to get most/best of both. (as a separate note: here comes OSM strength that everything is in one layer - it is much harder to have a road going through a lake or building or having a street A with address B along it. Government datasets are usually separate and controlled by different institutions, so doing such topology checks is much more difficult there)
For this to work government must open datasets and appoint a working contact point where information about problems in government dataset could be sent and there this information is ACTUALLY used and feedback given.
Do you have more info on the projects, and the software/queries you use?
All info is in Lithuanian... Maybe google translate can help with the links to Lithuanian blog site I will provide below (if not - just tell me I will write the general idea in English).
All OSM data is imported to postgresql database using osm2pgsql and that is used for comparison/synchronisation.
We're doing two types of comparison/synchronisation: 1. POI (point data, for some types of polygons centroid could be used) 2. Road (multi-vector data)
For POI synchronisation we have an ugly but functional universal comparison mechanism. We convert external data to xml file with lat, lon and some properties (or external source provides us information in xml for example via web-service). Then we provide mapping of this external data to OSM data. So having external data, mapping and OSM data we can create reports of differences.
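To make the idea of "external data + mapping + OSM data = report of differences" a bit more concrete, here is a small Python sketch. It is not the Lithuanian community's code (which, as described further on, is built on postgresql/postgis and php); the function names, the input format, the 100 m matching radius and the example data are all assumptions made for illustration.

```python
from math import asin, cos, radians, sin, sqrt


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    r = 6371000  # mean Earth radius in metres
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def compare_pois(external, osm, key_map, max_dist_m=100):
    """Report differences between external POIs and OSM POIs.

    external, osm: lists of dicts with 'lat', 'lon' and other properties.
    key_map: external property name -> OSM tag key (the "mapping").
    """
    report = []
    unmatched = list(osm)
    for ext in external:
        # Candidate OSM POIs within the matching radius.
        near = [o for o in unmatched
                if distance_m(ext["lat"], ext["lon"], o["lat"], o["lon"]) <= max_dist_m]
        if not near:
            report.append(("missing_in_osm", ext, None))
            continue
        match = min(near, key=lambda o: distance_m(ext["lat"], ext["lon"], o["lat"], o["lon"]))
        unmatched.remove(match)
        # Compare only the properties named in the mapping.
        if any(ext.get(k) != match.get(v) for k, v in key_map.items()):
            report.append(("value_differs", ext, match))
    report.extend(("missing_in_external", None, o) for o in unmatched)
    return report


# Made-up data for illustration.
external = [
    {"lat": 54.68720, "lon": 25.27970, "title": "Castle"},
    {"lat": 55.00000, "lon": 24.00000, "title": "Museum"},
]
osm = [
    {"lat": 54.68725, "lon": 25.27975, "name": "Old Castle"},
    {"lat": 54.90000, "lon": 23.90000, "name": "Church"},
]
report = compare_pois(external, osm, key_map={"title": "name"})
```

Each line of such a report still needs a human decision about which side is right, which fits the mostly manual synchronisation described above.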
Try automatic translating these two entries to get a general idea: https://blog.openmap.lt/2015/03/14/lankytinu-vietu-sinchronizavimas-i/ https://blog.openmap.lt/2015/04/18/lankytinu-vietu-sinchronizavimas-ii-dalis/
To compare road data, road shapes files are loaded to postgresql using shp2pgsql and then some queries are executed to find differences. Once again general idea is in this blog which you can try to translate: https://blog.openmap.lt/2015/09/22/keliu-numeriu-ir-dangu-lyginimas/
So basically we use postgresql/postgis and php. If you have more specific questions - I'm ready to answer them or send the code, just it is a dirty code as I'm a google copy/paste "programmer"... :-)
Does the government use your input, and how? Is there something structural? Or just mailing them and hoping they care?
Lithuania is a small country, everybody knows everybody :) Now we occasionally drink beer with "government" guys working with gis data. So we know they do change the data. They also give us feedback which data sets are "more important" for them, so we can prioritise comparing those. This way both sides are happy and thankful for help.
Additionally each month we take new/updated government data and do new comparison, so we can see that data has actually been updated.
From more or less "legal" perspective. This central government agency for gis data allows submitting error reports online for registered users (registration is free and open to anybody - http://www.geoportal.lt - created according to EU directive on spatial data). And they must check and give feedback in 20 days. We (OSM) are in somewhat different level - we mail directly to responsible group. One of the reasons for that is that they physically cannot fix all errors we report in 20 days, sometimes there are too many of problems, additionally they know report comes from a "trusted" source.
As per "structure". For point type geometry (for example place names) we currently create a google doc online, where both sides write comments and status of errors. When everything is fixed - we take new updated government data and recreate that google doc.
For roads it is per-case mailing of coordinates and notes... But there is no reason why that could not be done in more "structural" way...
An important point here is that OSM data can contain "bad/incorrect" data entered by mappers without enough experience. We do not want to make the government GIS people sort/filter out such errors. So we go through all the differences ourselves and only send those which we think really are errors. This is the main reason why we cannot simply run the queries "automatically" and send the results to the government people. There are no "technical/IT" obstacles to sending mismatches automatically.
About amount of work
The initial comparison of a specific dataset usually produces a large number of differences. Some of those are actual differences, some are caused by different ways of entering data. So the initial amount of work is usually high, both for updating data and for fine-tuning the comparison rules. After that only a small amount of work is expected, because the comparison simply notifies one side about a change in the other side's data.
A note from Andrius Balčiūnas, Head of IT department at GIS-Centras
Georeferenced data is created from orthophotos, but the data changes much more often than the orthophotography is updated (currently every 4 years in Lithuania). The OSM community notices the changes much faster. Therefore collaboration with OSM, and the use of their data for error checking, allows us to achieve higher data quality and relevancy. As this data is later used in national registries, cadastres and information systems, the OSM community helps to improve not only the specific data set, but the content quality of the whole national spatial data infrastructure. An important thing to note here is that such a collaboration means that even a small road segment or other improvement of OSM data by a community member could later appear in official government data.
A note on the ODbL license, and dealing with it. Government can use our error reports to start their own mapping process, but they can't just copy our features. Do you know what they do at your government services?
Two points here:
Government is not using/copying any features from OSM. They get reports about problems and this simply attracts their attention on specific features in their datasets. By using their own sources they fix the problem. It cannot be done in any other way, because all changes/all data in official dataset must have an approved/reliable source. OSM triggers the process, OSM does not give any data.
Any database consists of numerous facts (features/records). Only the whole database can be protected by law. Single facts cannot be protected. If any database is publicly accessible, anybody can look at some facts (place name, street name, hotel name etc.) in that database. Then those facts become facts they know/have in their brain. They can use them to update/insert such data in any other database, irrespective of the permissions of the original database. I'm not a lawyer. This is what I've heard from lawyers here in Lithuania. So in practice this means I can take this and that from ANY publicly accessible database (even google), as long as I do not take so much of the database that it is no longer just "some facts", but "a considerable part of the database". The big question here is only what counts as a "considerable part of the database"...
P.S. 2nd point makes map "easter eggs" almost pointless...
Several people have written on the subject before: when you look at something like the evolution of road network length in OSM, the shape of the curve can tell you something about how complete the network is (on the condition that there are enough local mappers).
This graph shows this evolution for the main roads in Flanders.
You can clearly see that the larger roads were mapped faster than the smaller roads. (note: there is a bug in the OSM-history-importer which prevents deleted objects from being removed from a snapshot. This could explain the continued slight growth of main roads. When people improve roads, they will often delete small portions of them.)
Assuming they are all kind of complete now, you can show the evolution of length as a percentage of current length. This shows quite clearly that there are "mapping priorities": the 60% completion mark comes much sooner for motorways than it does for tertiaries.
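For those who want to try this on their own region, the calculation could look roughly like the Python sketch below. The lengths are invented numbers, not the Flanders data, and the function names are my own; the only idea taken from the text is "length as a percentage of current length" and the 60% mark.

```python
# Invented yearly network lengths in km, purely for illustration.
LENGTHS = {
    "motorway": {2007: 300, 2008: 700, 2009: 850, 2010: 900, 2016: 1000},
    "tertiary": {2007: 100, 2008: 900, 2009: 2500, 2010: 4000, 2016: 5000},
}


def completion_curve(length_by_year):
    """Express each year's length as a percentage of the most recent length."""
    latest_year = max(length_by_year)
    current = length_by_year[latest_year]
    return {year: 100.0 * km / current for year, km in sorted(length_by_year.items())}


def first_year_reaching(curve, threshold_pct):
    """Return the first year in which the curve reaches the threshold, or None."""
    for year in sorted(curve):
        if curve[year] >= threshold_pct:
            return year
    return None


MARKS = {
    road_type: first_year_reaching(completion_curve(series), 60.0)
    for road_type, series in LENGTHS.items()
}
```

Note that this treats the latest snapshot as 100%, which is exactly the "assuming they are all kind of complete now" caveat.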
While this all sounds quite obvious, it really isn't if you look at the map of road evolution in Flanders. From the very beginning of mapping, contributors have been interested in small roads as well as main roads.
Full size link. Colors: black: main roads, yellow: minor roads, green: slow roads.
If we extend our view to a wider range of roads, we can see that the main roads in general got mapped first, but minor roads soon came to dominate over them. Service roads, tracks and paths (footway, path, steps, bridleway, pedestrian) tell their own story.
(Note: construction and proposed roads are removed from further graphs. I checked taginfo for alternative tagging styles, but they are also quite rare)
Because these last types of roads haven't reached their final form yet, we'll show the yearly growth rate. As this growth was explosive in the first years, we'll start in 2012.
The graph clearly shows that main roads and minor roads aren't really growing anymore. The graphs for service roads, paths and tracks, however, only seem to level off in 2014: in fact, paths and tracks go up in 2016. In turn, that means there is a lot of mapping left to do. It is surprising to me that this holds for tracks too, as they can be mapped more easily from aerial imagery only. Open data sources of paths and high resolution aerial imagery (both provided by AGIV) could explain the uptick in the mapping of paths and tracks. Other explanations might be successful relations with the GR and Trage Wegen organisations, or increased contribution triggered by data use.
Network growth versus amount of work
One more thing I do want to share now is the amount of work that is being done. While network completeness was achieved quite fast for main roads, that does not mean that people stopped caring after it was finished. In the animated map of primaries, trunks and motorways below, gray means "existing" and black means "been worked on this month".
These edits can be anything, but here are two examples: work on naming roads and on speed limits. From the beginning of the project, most residential roads were mapped with a name. Length of unnamed residentials started decreasing as soon as 2012. It will likely never reach zero, as many small bits and pieces are hard to assign to any one street. Also, there are in fact roads that do not have a name.
For speed limits, the proportion that has a limit is much lower. Total length of untagged roads only started decreasing in 2014. This tagging is probably slower because it isn't as important for routing and is sometimes seen as a consequence of road classification and location.
These graphs compare the added length for main road types (right) and the number of edits by road type (left). It is quite clear that mapping new roads peaked as early as 2008, but the amount of work done on these roads has in fact only gone up until 2014.
(Note: here, the number of edits is the sum of the number of days a certain way has been edited. The category in which it shows is the last main tag for that day.)
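In case the definition in that note is hard to follow, here is a small Python sketch of the counting rule as I read it. The version records are invented, and the tuple layout (way id, day, highway value) is an assumption rather than the format of the processed data.

```python
from collections import Counter

# Invented way versions in chronological order: (way_id, day, highway value).
VERSIONS = [
    (1, "2014-03-01", "tertiary"),
    (1, "2014-03-01", "secondary"),  # same way, same day: counts once, as secondary
    (1, "2014-03-05", "secondary"),
    (2, "2014-03-01", "primary"),
]


def count_edit_days(versions):
    """Count one edit per way per day, filed under the last highway tag of that day."""
    last_tag = {}
    for way_id, day, highway in versions:
        # Later versions on the same day overwrite earlier ones.
        last_tag[(way_id, day)] = highway
    return Counter(last_tag.values())


EDIT_DAYS = count_edit_days(VERSIONS)
```

So a way that is touched five times in one afternoon counts as one edit, which keeps heavy "save often" editors from dominating the numbers.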
These two graphs show the type of changes for primary and tertiary roads. Traditionally, geometry changes are the most important. As time goes by, their importance starts to lower, and editing tags becomes more important.
In a more general sense, this holds true too. The amount of edits peaks much later than the adding of new roads. In fact, for most road types, it doesn't seem to go down at all.
As usual, I'm torn between answering more and more questions with the data, or scaling it up to more areas. Luckily, for your basic statistics needs, more and more options are finally popping up. See the road statistics provided by Mapbox, Steve Coast or the Missing Maps.
In the case of road network completeness, some efforts have been made to compare current OSM length to CIA stats to measure map completeness. This is problematic, because even if governments have decent stats, they are by their own local definition. Hence the comparison might be off. In the case of Flanders, we have a single, very good source for road lengths. One of the things I want to do next is to compare local lengths in OSM and official data. This could show us where OSM is probably not finished yet. But you can also calculate this based on the shape of the curves we've seen before. If both approaches give similar results, that would clearly imply that you do not need external data sources to evaluate OSM data completeness.
Another thing is that we have noticed many new mappers first starting to map local paths. I'd like to see if this is a real evolution.
By focusing on road length, you measure both network completeness and level of detail. But neither very well. From a perspective of network completeness, you would have to discount things like cycleways that are mapped as separate ways, or only count dual carriageways once. An analysis detecting really new geometries would do that. I'm planning to do something like that "soon". On the other hand, from a perspective of level of detail road length lacks subtlety. Take the example of cycleway networks. You would have to count all highway=cycleway, but also all the roads that have cycleway tags as part of the cycle network too.
But I told myself not to write articles that are too long to read in one go :) I might have failed.
Bonus: more animated maps
Because they are fun to make and to watch, here are some more animated maps.
Overlaying OSM on top of official road data (Wegenregister), to show where the map is complete
Focusing on "slow roads" (in green)
All data in this article copyright OpenStreetMap contributors, free to reproduce anywhere if source included. Download processed data here.
In my quest to understand the growth of OSM, I had a little fun today.
I took the 1/1/2017 full history dump for Brussels and I extracted a shapefile with all the versions of all the highway=* that ever existed.
Then I wanted to visualize it to see if there was a pattern in how the roads get mapped: "first real roads, then paths" or "everything all the time". So I styled the paths clear green, the roads thin black and used a gray background for the current highways. Then I rendered a slide for every month.
It looks really cool, because it doesn't just show the chaos of our growth. As the black roads are drawn slightly transparent and the monthly slide shows every version of the road in that month, "active areas" show up in heavy black. I think it's really pretty.
On the occasion that it was a featured image in the Weekly OSM, I made a new version without a gray background and with a more logical image size.
While building the program for State of the Map, the program committee had to say no to several people who wanted to talk about their local community – their successes and their challenges. As a kind of compensation, we added a local communities panel (video) and a local chapters congress to the program.
But during the preparation, I also got a lot of feedback from people who couldn’t make it to State of the Map: money, accidents, visa. I got feedback from Brian Pangle (UK), Felix Delattre (Nicaragua), Clifford Snow (US/Seattle), Marco Antonio Frias (Bolivia), Redon Skikuli (Albania), Mohamet Lamine Ndiaye (Senegal), Yantisa Akhadi (Indonesia) and Michal Palenik (Slovakia). Most of them didn’t have a chance to be on the panel, or even make it at all.
Some of their ideas did make it to the Local Chapters Congress, and helped put things in motion. For example, finally we have the option to follow comments on Diary posts! And there’s talk of putting some money into OSM.org website development for things like massive local messaging, which was a recurring theme there. That might involve helping Gravitystorm’s project to simplify the OSM.org codebase, as that would make contributing code that much easier. Also the idea to allow OSMF membership without payment was mentioned, which was an obvious frustration during the Local Chapters Congress.
What is important to me, is that it goes to show that focused community action can shift the focus of our dev team to issues that would otherwise be lower on their priorities list. I hope we can repeat efforts like this at the next SotM, hopefully even stronger.
This post does two things. First, it will give you, the local community builder, a lot of ideas about things you could do to work on a tighter and larger community. Second, it tries to set an agenda. It offers you several ideas which you could adapt, promote or realize.
There are three subjects:
What are our main dilemmas when organizing our communities
What kind of tools do we need to build community
What stuff are we doing now, that actually works
It was entirely built around the answers from the people mentioned above, plus our own experience here in Belgium.
Community builders' dilemmas
Relatively little feedback on this, looks like we’re a confident bunch. But there are some interesting points.
The challenge of mobilizing mappers: too soft vs too hard. We’re all volunteers, and if you push too hard, you’ll push people away. But if you don’t take action and keep it up, you’ll never get beyond three people at your activities.
Building a local community means making decisions. Is it acceptable to offer financial rewards? Do we focus on finding the "mapping nerds" who create huge amounts of data? Or do we need to adapt to less obvious groups - people who often can’t even read a map, but have excellent local knowledge?
Being local means embracing local culture. But we also want OSM to have a unified voice and a unified data model. And what do we do with well-intentioned outside help, who bring their own funding but also their own ideas and priorities?
Where the global community can help
In the answers, local communication needs were a top priority. The mailing lists, forums and IRC are good for reaching hard core mappers. But the large majority of contributors aren't there. So how do you reach the local mapper who isn’t active anywhere on these channels?
We need an easy way to contact local mappers
When you want to organize a local activity, you need external tools like Pascal’s mappers around me. Or you could query Overpass and make a little list of who has been working on that area. Just collecting the info takes a long time, and then you have to send messages one by one. It is impossible to send a message to all your OSM contacts if you just have their username. Allowing otherwise is obviously not without risk, so some anti-spam measures have to be implemented from the start.
We need to connect the new mappers
It is very labour intensive to connect new mappers to their local communities. Several people running a program to send a message to every new mapper in their region have given up, even as this cool little website makes the work a bit easier. In Belgium, we use welcome.osm.be . It is a simple user interface which takes the New Mappers feed from Pascal Neis and makes it easy to send people a standard welcome message. One is defined as "Belgian" based on the location of their first changeset, which is good enough as a proxy for home region.
The message itself focuses on our communication channels, apart from giving some basic mapping tips. The advantage of using a tool is that you can share the workload, and can see who has been welcomed already. Of course, looking at changesets and giving some pointers is very useful – but a lot of work. It also thanks you for your contribution, and gives you someone to contact in case of doubt. It gives a human face to the map. This is something that could be entirely automated within the OSM.org ecosystem – a centralized system with the content provided by the local communities. This would not be an alternative to the Welcome Message you get on subscription, but a complementary message on first edit. Otherwise, it wouldn't be possible to guess everyone's location.
We need a lively community diary stream
Several of us commented on the impossibility of subscribing to comments on Diary posts, which leads to discussion rapidly dying down. This has now been implemented! Over a year ago, after some rather discouraging help, I opened a ticket on GitHub to request this feature. Markus Heidelberg did make a Chrome/Firefox plugin to fix the same problem. It confused me a bit that someone would make an external tool, rather than fix the problem itself. Markus was kind enough to explain that it’s much simpler to write a separate bit of code than to integrate something into our osm.org website. Another argument for everyone to help modernize that codebase. But that won’t fix everything, because people do speak many different programming languages.
Anyway, the ticket remained open for almost a year, and it was only when the idea got wider support during SotM that we got the attention of our programmers. The pull request shows that even a “simple” feature like this is absolutely not straightforward to integrate. It looks like it took quite a bit of effort from Mikel, Ilya, Andy and Tom to do this. Thank you guys!
Still, we could do more to make communications easier. For example, you still need to be a bit of a nerd to find a way to follow the official blog. A subscribe button, anyone? But even to find this blog is a challenge. I find it strange that there are no direct links from the osm.org landing page to subdomains like help, forum., irc. and blog.osm.org .
We need to help new mappers gain experience
Becoming a mapper is not easy. When you often explain OSM to new mappers, you start to realize how many little things you’ve learned over the years. The more developed the map, the harder it will become. Attention for documentation, and making help easier to find will become ever more important. But a human touch might help too.
Godfather program A recurrent idea to help new mappers is to start a kind of “godfather” program. It might be as simple as sending a welcome message to new mappers, personalized with some tips about better mapping of what they added. But you could go further, and coach people as they grow. You would need some reward for that, because it would reduce your own mapping time. So imagine a HDYC not of your own mapping, but of the people you helped.
#reviewmychange OSM is easy for very confident people: you have to believe that little old me is capable of improving this big map made by so many people. At humanitarian mapathons, it is often a relief to people that their work will be reviewed. But why not add a simple feature to the iD editor to mark your own work as “please review”. It could be as simple as adding a hashtag #pleasereview to the changeset comment, and making a little tool that collects and geocodes these changesets into a simple website for follow-up.
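To show how little is needed for the collecting part, here is a rough Python sketch. No such tool exists as far as this post is concerned; the changeset records are invented, and the field names simply mimic the bounding box that changeset metadata carries. Instead of real geocoding, it just takes the centre of that bounding box.

```python
import re

# Matches the hashtag as a whole word, regardless of case.
REVIEW_TAG = re.compile(r"(?<!\w)#pleasereview\b", re.IGNORECASE)

# Invented changeset records; field names are an assumption for this sketch.
CHANGESETS = [
    {"id": 101, "user": "newmapper", "comment": "Added shops #PleaseReview",
     "min_lat": 50.5, "max_lat": 51.5, "min_lon": 4.25, "max_lon": 4.75},
    {"id": 102, "user": "oldhand", "comment": "Fixed turn restrictions",
     "min_lat": 50.0, "max_lat": 50.5, "min_lon": 3.0, "max_lon": 3.5},
    {"id": 103, "user": "other", "comment": "see #pleasereviewers list",
     "min_lat": 51.0, "max_lat": 51.5, "min_lon": 5.0, "max_lon": 5.5},
]


def review_requests(changesets):
    """Return id, user and bounding-box centre of changesets asking for review."""
    requests = []
    for cs in changesets:
        if REVIEW_TAG.search(cs.get("comment", "")):
            centre = ((cs["min_lat"] + cs["max_lat"]) / 2,
                      (cs["min_lon"] + cs["max_lon"]) / 2)
            requests.append({"id": cs["id"], "user": cs["user"], "centre": centre})
    return requests


REQUESTS = review_requests(CHANGESETS)
```

The hard part is obviously not this filter, but getting the hashtag into the editor and finding people willing to do the follow-up.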
A toolbox for local communities
This is a broad concept, but here are some examples of what that could mean:
A little money can go a long way. In the US, it can help you set up a local Meetup group. In Africa or Latin America, a microgrant would be enough to pay for internet access, a mapping device and transport costs. If we’re capable of getting free pizza for our mapathons, we should be able to do this too.
A local web presence is something several people commented on as being very useful. Could we have a local community website starter kit, as easy to set up as a Maptime chapter?
Could we build communication and tracking tools (new mappers, QA, stats) built on admin boundaries instead of bounding boxes?
Things that work
A central theme on the answers about things that work, is that none of them are easy. It takes time, it takes effort, and the impact can often be quite disappointing.
Some long-time mappers even believe that we’ve reached our potential: everyone who is interested in OpenStreetMap knows the project by now, so there is little to be won by reaching out. This is typical for a swarm organisation: it’s only those who are at the edges of the swarm that see the growth. It is the networks of the newer people that will help you grow – not your own.
All the more reason to learn about things that have worked for others. This chapter talks about how to grow your community, but also about community consolidation. You might have a lot of people working on the map, but who have never done anything but add info to the map. Minimal community engagement is necessary: how else will they keep their mapping habits in line with the wider community? And of course, they are the first place to look when you want to do stuff to grow your community.
When it comes to engaging existing mappers, there is no alternative for real life meetings. Even though we’re an online community, it is personal contacts that build ties. And these are the ties you need to turn mappers into organisers.
A good place to start, is by watching changesets and commenting on them. It’s one of the few ways of getting to know the people who add data but aren’t active anywhere else.
Adapting to different communication styles is essential. If you’re only using mailing lists, don’t be surprised that the level of engagement stays flat. Take the Bolivian talk e-mail list that had about two active members for years. Then Bolivia started a Telegram supergroup and suddenly there’s 40 members, of which at least a dozen are quite active. Here in Belgium we adopted Slack during the State of the Map, and it’s still quite active for more informal communication and quick questions.
But of course, having many channels makes things complicated. Especially if what works in one country doesn’t in the next. It will be a lot of work to find the right channel and to get people in the channel that's best for them. An adapted welcome message makes it easier to integrate new mappers.
Where the local map is already relatively complete, there is little enthusiasm for mapping parties. The quaint model of going out collecting data and then mapping over a beer attracts far fewer people than other activities. But in places where the map is still quite basic, it can be very successful in building engagement and getting attention.
Doing exciting stuff, as Felix Delattre puts it, is effective to find new people. By doing something completely new and unheard of, you can create a lot of excitement about OpenStreetMap. In Nicaragua, being the first to create an online and paper map with all the bus routes in the capital can do that for you. The exposure this gives you has an effect beyond the original mapping community that made the project possible in the first place.
Lacking big projects like this, showing real life use cases is an obvious way to connect to your audience once you get their attention. If you know your public, focus on what you know they could use. If you don't, show the diversity of cool stuff you can do with OSM.
You need a way out of your inner circle. Engage outside organisations. You are basically tapping into existing networks, rather than building one from scratch. For example, connecting with “data science” people, but also local government, entrepreneurs, IT people. Working together with Trage Wegen has introduced many new mappers to OSM over the last two years in Belgium. This is an organisation focused on the threatened little paths and tracks that connect our messy towns and villages to the sparse open space. The people who support them are passionate about this subject, and it’s not that hard to take their passion for “slow roads” and turn it into a mapping passion, since a mapped path is harder to make disappear.
Especially in developed countries, Meetup seems to be a useful tool for creating events. Clifford Snow did an entire session on the subject (video). These events can be as small as a bar hangout, but it can also be used for much larger events. It is quite easy to start a group. As an organizer you have an idea how many people to expect, and Meetup does all the hard communication work for you (maintaining contact list, sending out reminders, thanking for showing up).
Meetup is very local: it will suggest groups to hang out with based on both your location and your other Meetup groups. So you will get a lot of subscriptions from people already active on Meetup, but not yet very interested in OSM. And you will almost automatically find meetup groups which have similar interests, where you might go and talk about OSM.
There are some challenges though. Meetup realizes the value of their network, and so you need to pay to be an organization on their website. Prices depend on the country (3 €/month in Belgium, 15 $ in the US). In practice, this is paid by the very motivated organizers themselves. As there is no free alternative, it might be an idea for central OSM organisations to provide this money instead. The impact is clear, and the investment is minimal. I would dare say that without Meetup, there would probably not have been a State of the Map in Belgium this year.
Both Belgium and Seattle talked about using Humanitarian Mapping as a recruitment tool. It helps attract people who would otherwise not be interested in OpenStreetMap, and gives you a chance to introduce them to the wider project too. It’s also a place to turn your hardcore mappers into volunteers. There are well defined tasks to do, like organizing, promoting, giving talks, making documentation, validating data or helping out individual mappers. That makes it easy to become a volunteer. The repetition of events gives them the opportunity to grow into ever more complex tasks.
This will sound controversial to a lot of people, but imports can be a recruiting tool too. Clifford and Jeff Meyer talk about how they used an import to grow their community here. Imports aren’t easy, and having an ‘import party’ is usually a bad idea. But good imports are possible, and they provide an opportunity to recruit more technically oriented people who would balk at the idea of tracing thousands of buildings.
So, what else?
What dilemmas do you want to talk about? What do you think about the proposed needed tools? What worked for you or your local community? How can we make the life of new community builders easier?
And most of all, how do we keep the momentum we seemed to have during and after SotM 2016?
Recently, I wrote about how you could use government road data to improve OpenStreetMap. Here's a move in the other direction.
As an employee of the city of Antwerp, I was involved in the recent 'validation' of the Road Registry (Wegenregister) for our city. This registry is managed by the central Flemish government, but final responsibility for the content is with the municipality. Validation means the central government gives us a new dump for us to check for errors. This way of working is only a temporary situation: in the future, we will be live editing in the central database itself.
There's an amazing amount of cleanup left to do, but we decided to focus on the completeness of the main road network. Before, we did this by comparing with our own city registry of roads. But that is not being updated anymore. So for the first time, we used OpenStreetMap for the validation. Using FME, we identified roads which exist in OSM, but not in the Road Registry. We excluded service roads and "slow roads" (paths, tracks, cycleways), as these are less of a priority right now.
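We did this in FME, and this is not that model. But to give an idea of the principle in open source terms, here is a simplified Python sketch under my own assumptions: coordinates are in a projected system in metres, a road counts as "missing" when at least half of its vertices lie more than 15 m from any Registry line, and the excluded classes follow the service and "slow road" types mentioned above. All data is invented.

```python
import math

# Classes left out of the check: service roads and "slow roads".
EXCLUDED = {"service", "track", "cycleway", "footway", "path",
            "steps", "bridleway", "pedestrian"}


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b (planar, metres)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_lines(p, lines):
    """Distance from p to the nearest segment of any line; infinity if no lines."""
    return min(
        (point_segment_distance(p, a, b)
         for line in lines for a, b in zip(line, line[1:])),
        default=math.inf,
    )


def unmatched_osm_roads(osm_ways, registry_lines, tolerance_m=15.0, min_share=0.5):
    """Ids of OSM roads whose vertices are mostly far away from every Registry line."""
    result = []
    for way in osm_ways:
        if way["highway"] in EXCLUDED:
            continue
        far = [p for p in way["coords"]
               if distance_to_lines(p, registry_lines) > tolerance_m]
        if len(far) / len(way["coords"]) >= min_share:
            result.append(way["id"])
    return result


# Invented example data.
REGISTRY = [[(0, 0), (100, 0)]]
OSM_WAYS = [
    {"id": 1, "highway": "residential", "coords": [(0, 2), (50, 3), (100, 2)]},
    {"id": 2, "highway": "residential", "coords": [(0, 200), (100, 200)]},
    {"id": 3, "highway": "service", "coords": [(0, 500), (50, 500)]},
]
CANDIDATES = unmatched_osm_roads(OSM_WAYS, REGISTRY)
```

A brute-force loop like this is fine for a demo but would need a spatial index for a whole city, and the candidates still need manual review.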
Next time, we will also look at roads that are in the Road Registry, but not in OSM. In some cases, the lack of a road in OSM is really an indication of an error in the Registry. For example when a road has been closed, and the government somehow missed that. This is more work, because the Road Registry contains a lot of little bits of "roads" that are really just driveways. Because nobody cares about them, they aren't in OSM. But they are quite hard to filter out from the Registry data.
The cleaned up dataset of roads that are in OSM and not in the Registry was really quite limited. Only 138 situations needed manual review. Of those cases, 32 were a simple matter of slightly different geometry. For example when OSM mapped the road as a polygon, which we didn't really take into account. We identified 33 cases where the Road Registry was clearly wrong. Then there were 31 cases that looked like they shouldn't have been in the selection anyway: they are private driveways, parking aisles, tramways. About half of those needed a fix in OSM. But the "tramways" were actually dedicated bus roads on top of tramways.
Most of the "mistakes" detected in OSM were caused by larger geometry issues. Sometimes the centerline of a road is debatable, but in most of these cases OSM could be improved, sometimes vastly. These were most often roads that hadn't been touched in years. Only in a couple of cases was OSM really vastly wrong. This happened when the city reorganized streets, and somehow, nobody noticed. Most striking was the Troonplaats, which is a quite popular square. In several cases, OSM had already been corrected in the month or two between data download and final analysis (though to be honest, some of those were fixes of mine). A few mistakes were caused by errors in or outdated road classification.
There was one striking case (pictured above), where we were convinced OSM was wrong, but we apparently missed a big change in the road geometry. Fortunately there was a Mapillary sequence, of course one of the 1.1 million pictures uploaded by filipc. Even though the aerial photography in Flanders is excellent and recent, the only place this road shows up is on the OSM map.
Legal stuff (edit)
As Stereo pointed out in the comments, OSM cannot be copied by a non-ODbL source. I always translated the license of OSM as "if you merge your private data with OSM data, you have to open up your data". But that's not correct, it should be: "if you merge your data with OSM data, you have to open up your data AND prohibit anyone from ever making it private again". In this case, the Flemish government allows (and explicitly wants) TomTom and Google to take official data and use it to improve their private data.
Because of that, us government workers are not allowed to copy features from OSM. But there is a precedent: the New York City government uses OSM to track changes to their buildings as imported into OSM. I'll trust their research that ODbL does not exclude using OSM to detect errors, if you then proceed to do your own surveying before making changes to your own dataset. This is also what the License Working Group believes, as Simon Poole (thanks!) pointed out in the comments. I understand this bit of text was supposed to have landed in the Legal FAQ page, so I went ahead and did that. Please revert if this is inappropriate.
The ODbL always made sense to me, and it kind of still does. Say I was to download all of OSM to my own server, and redistribute it under a more open license. Then someone else could just take that data and close it off. But this case does help me understand those who aren't very happy about this license a bit more. In the case of government, it means you can't -really- integrate OSM into your processes. For example, you couldn't take OSM, validate it with your own data and redistribute the result under the license of your choice.
Have a look
You can have a look at the cases here. There's a bit of work left on the cases with a difference in geometry. The easiest way to get the Road Registry into your editor is with this (slightly outdated) WMTS:
You can contact me to get the FME models we used to identify these roads - they aren't very complicated. You could easily do similar things in open source software.
TL;DR: Government road data, processed to help you map roads in Flanders, Belgium. All the tiled layers are available for use in your favorite editing software.
About the data
The Flemish government has a large project to measure most stuff you find in the public domain, the GRB (Dutch). The data is measured to incredible accuracy, but the project is not focused on maximum recency. Update frequency is once or twice a year. When it comes to roads, only those that need an official streetname are included.
That's a bit limited for some purposes, so they started the Wegenregister (Registry of Roads). The idea is that all roads are included, also "slow roads" (paths and tracks), private roads and even future roads. They started off with the centerlines of roads from the GRB and enriched it with National Geographic Institute (NGI) data for smaller roads. It isn't quite finished yet: a lot of local governments must still validate the data, and there is no automatic procedure in place to feed new GRB roads to the database. So you can expect some of the "future roads" to be quite present. The NGI data is also of varying quality: it is quite complete and has generally good geometry, but it can be quite outdated.
The scope of the Wegenregister is to offer a complete road network, not navigable data. It does not include anything like access restrictions, detailed lane info or max speeds. It does contain road surface information. It is divided into segments, which go from one junction to the next. An existing segment is only split when a new road is added. That means segment IDs are relatively stable. If a segment has a change of attribute somewhere, this is dealt with by dynamic segmentation. Basically, that means you have a table saying stuff like "from meter 0 to 100 asphalt, from meter 100 to end concrete".
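To make the dynamic segmentation idea concrete, here is a minimal sketch in Python. It assumes a made-up table for one segment (the real Wegenregister table layout, field names and the 250 m segment length are not taken from the dataset) and looks up the surface at a given distance along the segment.

```python
from bisect import bisect_right

# Hypothetical dynamic-segmentation table for one road segment:
# (start_meter, surface), sorted by start_meter. Each entry applies
# until the next start, or until the end of the segment.
SURFACE_EVENTS = [(0.0, "asphalt"), (100.0, "concrete")]
SEGMENT_LENGTH = 250.0  # assumed length in meters


def surface_at(events, segment_length, meter):
    """Return the surface found `meter` meters from the start of the segment."""
    if not 0.0 <= meter <= segment_length:
        raise ValueError("position is outside the segment")
    starts = [start for start, _ in events]
    # Index of the last entry whose start is at or before `meter`.
    index = bisect_right(starts, meter) - 1
    return events[index][1]


print(surface_at(SURFACE_EVENTS, SEGMENT_LENGTH, 40.0))
print(surface_at(SURFACE_EVENTS, SEGMENT_LENGTH, 180.0))
```

The segment geometry and its ID never change in this scheme; only the small attribute table does, which is why the IDs stay stable.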
Finding missing roads
I did some quick visual checks in my own mapping neighbourhood, and I did find a LOT of missing roads. Some forest paths, several small alleys connecting backyards to the street, some graveyard paths, some driveways. I would say 95% of the missing paths/roads still existed, and about 75% were worth mapping in OSM.
Enough to warrant some closer inspection.
It is open data with an OSM-compatible licence, which you can download through a website. First I tried FME, as we have processes in this software at my day job to do similar analysis that I could reuse. Alas, it didn't scale well for larger data. QGIS, after some trial and error, did the job without a problem. The main processing operations took about 36 hours on my not-fancy-at-all laptop.
First I took the OSM road data (as a shapefile, from Geofabrik), saved it in our local projection and buffered it by 7 meters. Then I used difference to find the parts of the Wegenregister that were outside of that buffer. Next I threw out segments of under 10 meters (unless they were entirely outside of the buffer). I also calculated the percentage outside of the buffer. The result is A LOT of segments (220.000 out of one million), which are either missing in OSM or have a very different geometry.
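The analysis itself was done with the buffer and difference tools in QGIS. As a rough illustration of the same idea, the sketch below uses plain Python and swaps in a simpler technique: it samples points along an official segment and counts how many lie more than 7 meters from every OSM line. It assumes coordinates in a projected system in meters and at least one OSM line; all function names and the sample geometries are made up.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the straight piece a-b (planar, in meters)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Projection of p onto a-b, clamped to the ends of the piece.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def sample_line(line, step):
    """Yield points roughly every `step` meters along a polyline."""
    for a, b in zip(line, line[1:]):
        piece_length = math.hypot(b[0] - a[0], b[1] - a[1])
        n = max(1, math.ceil(piece_length / step))
        for i in range(n):
            t = i / n
            yield (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
    yield line[-1]


def share_outside_buffer(official_line, osm_lines, buffer_m=7.0, step=1.0):
    """Fraction of sampled points farther than buffer_m from every OSM line."""
    points = list(sample_line(official_line, step))
    outside = 0
    for p in points:
        nearest = min(
            point_segment_distance(p, a, b)
            for line in osm_lines
            for a, b in zip(line, line[1:])
        )
        if nearest > buffer_m:
            outside += 1
    return outside / len(points)


osm = [[(0.0, 0.0), (100.0, 0.0)]]          # one straight OSM road
matching = [(0.0, 3.0), (100.0, 3.0)]       # official road 3 m away
missing = [(0.0, 20.0), (100.0, 20.0)]      # official road 20 m away
print(share_outside_buffer(matching, osm))
print(share_outside_buffer(missing, osm))
```

This brute-force version compares every sample point with every OSM line, so it would be far too slow for a million segments without a spatial index. It is only meant to show what "percentage outside of the buffer" measures.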
Sharing the results
The result is still a shapefile of over 60 megabytes, so nothing you can just put on umap. Luckily, it is quite easy to make a TMS service from a shapefile using Mapbox Studio. These services can be used in a little leaflet map like this one, but can also be added in iD or JOSM.
Make sure you open the layers (button top right): you can use three background maps, see the whole Wegenregister, add Strava and see the OSM road network more clearly overlaid.
Mind you, I DO NOT want you to just get out your editor and start copying these features. There are several reasons why a road might be missing in OSM, some good, some bad:
- private roads in a forest: can't see them, can't survey them
- it doesn't exist anymore
- it isn't a road (but for example a dedicated tramway)
- it doesn't exist yet (indicated in yellow in the complete dataset)
- the OSM road is newer than the extract I used
- geometry is so different that segments don't match
[EDIT: thanks to tyr_asd you can now copy the URL to share your current view :)]
But you don't need to go out surveying for every single change either. In the map I provided, you can combine the view of missing Wegenregister roads with aerial photography, OSM gpx and Strava gpx layers. If they all point in the same direction, you can be quite sure that OSM is wrong and Wegenregister is right.
URLs for mapping
These URLs can be added in JOSM, iD and Osmand. In iD, click the layers button (right-hand side of the screen), then click on the magnifying glass next to Custom or 'Aangepast' to insert one of the URLs. To use this in Osmand, check my previous diary entry on Strava (only works for layers containing .png). If you use JOSM, you know things like this :)
Wegenregister, missing roads only:
Strava, all data:
Strava, recent data only (seems to be hard to re-use)
You can download the entire dataset from the AGIV website (note: this link dies when they publish a new version. Just look for "wegenregister" in their catalog). And here is the entire dataset of missing Wegenregister roads as a shapefile. Use QGIS to extract your local area of interest. Save as GPX to add it to Osmand and go out mapping. Of course, you already have the Strava layer enabled in Osmand :)
I can also provide just the bits of Wegenregister that are outside of the buffer, just ask.
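For readers who have not worked with tiled layers before: the URLs in this section are templates, and the editor fills in a zoom level and tile numbers for whatever part of the map you are looking at. The sketch below shows the usual web map tile arithmetic. The template is a made-up placeholder on example.org, not one of the layers above, and the exact placeholder names vary by editor.

```python
import math

# Hypothetical template; the real layer URLs are the ones listed in this section.
TILE_TEMPLATE = "https://example.org/tiles/{z}/{x}/{y}.png"


def lonlat_to_tile(lon, lat, zoom):
    """Convert a WGS84 position to web map tile numbers (x, y) at a zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y


def tile_url(template, lon, lat, zoom):
    """Fill in the template for the tile that contains the given position."""
    x, y = lonlat_to_tile(lon, lat, zoom)
    return template.format(z=zoom, x=x, y=y)


# A position in Flanders at zoom 14.
print(tile_url(TILE_TEMPLATE, 3.72, 51.05, 14))
```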
Better mapping practices
Now imagine you've checked your whole mapping neighborhood. The map will stay red, at least till the next update of the process. But what about the roads that you surveyed and concluded were invalid Wegenregister roads? They should be removed too. I'm not quite sure how to go about that.
- We could tell the government. And they might actually listen, but by the time the road is removed from the dataset, three more mappers might have analysed the same segment.
- We could build a list of "untrue" Wegenregister roads and remove these from analysis. There are quite stable unique identifiers available, but it would mean everybody should refer to the same list when marking something in Wegenregister as untrue.
- We could map non-existing roads in OSM (ooh, taboo!), analogous to the not:name tag that was used in the UK to mark that the official name for a road was wrong. I was tempted into something similar in this case, where a path is indefinitely closed off, but still quite existent (as seen from the street and aerial photography)
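The second option above, a shared list of "untrue" Wegenregister roads, is simple to apply once such a list exists. A minimal sketch in Python, assuming the analysis output keeps the stable segment ID for every candidate (the field names and ID values here are made up):

```python
# Hypothetical shared list of Wegenregister segment IDs that surveys showed
# do not exist on the ground. The ID values are invented for illustration.
INVALIDATED_IDS = {200417, 200533}


def drop_invalidated(candidates, invalidated_ids):
    """Keep only candidate segments that nobody has marked as untrue."""
    return [c for c in candidates if c["segment_id"] not in invalidated_ids]


# Made-up output of the "missing roads" analysis.
candidates = [
    {"segment_id": 200417, "pct_outside_buffer": 100.0},
    {"segment_id": 200418, "pct_outside_buffer": 62.5},
    {"segment_id": 200533, "pct_outside_buffer": 88.0},
]

for segment in drop_invalidated(candidates, INVALIDATED_IDS):
    print(segment["segment_id"], segment["pct_outside_buffer"])
```

The filtering is the easy part. The hard part remains the one described above: getting everybody to read from and write to the same list.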
Seizing an opportunity
I know the Belgian heavy mappers like to work on stuff, but I think this might be a nice opportunity for expanding the community a little more. I've noticed how small paths and local trails are really something that can still attract new mappers. The Flemish Trage Wegen organisation is behind that for a large part, and I sense we could work together with them on a project like this. It is also very similar to the local "inventarisations" they do.
It is a very well defined task, it is repeatable, all the tools and pitfalls can be explained quite easily. Moreover, local governments could be contacted with a very clear proposal - to help them solve a problem they would have to solve themselves pretty soon anyway.
I see two main options, which are possibly conflicting.
Option one: a maproulette challenge or Canadian style crowdsourcing tool. It's nice and easy, but it might be a little too simplistic for this task. The Canadian style tool would probably allow us to generate a vast error report for the Flemish government, which is quite cool. Microtasking like this is not compatible with the extensive local surveying which we need when the reality isn't very clear though. But it might make the job a little lighter for those working on Option Two.
Option two: we set up a Belgian tasking manager (as in an instance of tasks.hotosm.org) and divide the job. It allows for very specific instructions, providing the analysed Wegenregister as imagery to people who have never used iD before and makes it really easy to track progress. Time-out for the tile you picked should probably be changed from two hours to a couple of days though :)
One thing I've learned from working on Missing Maps, is that you need to use an existing network to recruit new mappers. You need an easy, repeatable task to make the work easier on OSM supporting volunteers. And you have an opportunity to take their passion (in this case "helping poor people") and try to channel it into a passion for OpenStreetMap. Change MSF for local government, mapping buildings with mapping roads, and a passion for doing good with a passion for local paths, and there you are.
Working on it
To make such a project possible, we should probably set up an online service doing something similar to my analysis. So newly mapped roads in OSM are removed from the "to map" list, as well as invalidated Wegenregister roads.
My analysis is more a proof of concept than anything else. It would be interesting to go further. For example, one could make a map with just roads that have a different name in OSM than the official name. Or just focus on the planned roads. Or suggest surfacing information for inclusion.
It would of course be nice if it were easy to take the Wegenregister geometry and apply it to the OSM data, but that might be a little too much of a challenge right now.
If you feel like working on such a project, get in touch, start on your own, or come to the SOTM Hackathon in Brussels.
So I've been using the Strava data quite a bit recently. I knew the service from before, but then it was quite empty. The tip came from our übermapilliariator Filip when I was making too many notes mapping a nearby forest.
Strava for forest trails
I have mapped a lot of trails in Flemish forests. We're a densely populated piece of land, with very little forest (in fact, our environment minister literally said that "the purpose of a tree has always been to be cut down"). But even here, I have hardly ever visited a forest where all forest paths were mapped.
It requires local surveying as paths below trees are completely invisible, and we tend to do a better job mapping stuff you can see on sat pics... But even when you do go out to the woods, the resulting GPS tracks can be of bad quality. Strava to the rescue! Several million trips by hiking and biking-nerds are mashed together to give a clear indication of where people run and bike.
The easiest way to use it is with the [Strava iD editor](strava.github.io/iD/), which comes preloaded with the layers you need. I often switch off the satellite imagery to improve visibility of the tracks. This iD version also contains the Slide tool, which lets you adjust geometry to the available tracks. I haven't had very satisfying results with that myself though. In Belgian forests, you can basically zoom in anywhere and find missing tracks. (For JOSM instructions, see the wiki)
Strava and surveying
Of course, you still have to combine this with some satpic reading skills, other sources and/or local knowledge. For example, when Strava, Wegenregister and Groteroutepaden GPX all point in the same direction, you can be pretty sure there's a path present.
I did spot some situations where people seem to be running straight through a meadow where no path is visible. And the standard view does not take into account time. Sometimes, clear changes are visible over time, see this experiment. So just looking at the global heatmap might get you mapping former paths.
Strava in Osmand
If you don't have other sources, or just want to go hiking somewhere you suspect mapping is incomplete, you can add this layer to Osmand. It will help you find paths with bad geometry, and help you find unmapped paths. Vague lines on the map, combined with a visible trailhead, can be enough to verify the existence of the path. So you can add many more paths with just one survey. Note: I hid all polygons and road details on my view, which helps keep the map readable.
In the tradition of the app, the feature is well hidden. First of all, you need to have the "Online maps" plugin enabled. This is just a setting, no downloads required. Standard available layers include "Microsoft Earth" satpics and online OSM maps.
Strava isn't standard. To add it as a layer, you need to open the "Map source" menu, available under map settings. Scroll down till you find "Define/edit". The URL example is with blue lines. You can find more about this URL on the wiki
Now your standard Osmand map is replaced with some blue lines. Great! Re-open the Map Source to get your "Offline vector maps" back. Now you can add the Strava layer as an Underlay or Overlay map. In the example above, I used it as an underlay with the basemap completely opaque. Forests (and other polygons) were switched off - but that does make for increased visibility.
(Note: I already contacted the Osmand Google Group with a feature request to make adding custom tiles just a little easier to use)
Why do we map? It's a question in every OSM mapper interview, and it's often a bit confronting. We do it because we like it, but why do we like it? And in the case of many of us, why do we spend such an enormous amount of time on it?
After a brief exchange with a self-proclaimed GIS dinosaur, I felt the need to remind myself exactly what it is I like about OpenStreetMap. I noticed that both for her and me mapping really became part of our identity. It was almost like discussing refugees or social fraud.
This article is very personal. If you like the same things as I do, you're bound to like OSM. But you might like OSM for a completely different set of reasons. If you want a much larger frame of thinking, like why the world needs OpenStreetMap, that's explained somewhere else.
It doesn't wait for anyone
it does what you make it do
it doesn't make big plans about what it will do in the future, it simply does what it can now
There is nothing OpenStreetMap does perfectly. However, you can change that at will. Do you want it to have all the hedges in your town? Just look at the data model, adapt and extend it if needed, and add them to the map. There's your perfect map of local hedges. Now show off your work and get other people hedging.
OSM does not make big plans about what it will be doing in the future. Instead, it simply does what it can now. I like this mentality. If enough of us follow, we actually accomplish big things [like having more roads mapped in most countries of the world than the CIA believes there are]. But we do it without having wasted money and time on big studies.
We'll have us studying ourselves or have other people collect the funds to study us, thank you very much.
OSM does not tell you what to do. There are no "priorities", so no one has to set them. There is only leadership by example.
It has what I need
Here are some of the things OSM can do, and no-one else can:
detailed hiking trails of my town
shapefiles of all regions of all countries in the world (that I can use for what I want)
a map of the paved roads of South America
a map of the remaining Inca trails (and download straight to your GPS)
I'm sure for some of these cases, some universities might have better data somewhere. I'm sure some governments will have (plans for) better data. But I live now, and I use data I can find.
It knows no borders
Yes, we now have a reference dataset for roads in Flanders, and it has 40 cm accuracy. But look somewhere else if you need hiking trails, and get another dataset entirely if you need Brussels (which is physically entirely within Flanders). Yes, they will fix that. In the future. I like the present.
For something as "simple" as an address, there's a service being built on top of all open datasets of the world. But it's a patchwork, empty in many many places. Flanders, by the way, is only there because someone from the OSM community added the AGIV CRAB dataset.
While a service like this might just be the future for the use of authoritative data, it still poses some problems. What happens when government funds dry up? Their service dies, or the data quality starts degrading. There will be no OSM community around to take over their jobs - as communities get built around the actual mapping of things.
Maybe by the time OpenAddresses has a reasonable level of completeness, the OSM community will have learned to integrate external data and find a way to update it with both government and crowdsourced inputs. At that moment, government will have to adapt to a reality where they have to look at OSM inputs as much as to their official procedures. Maybe at that point, some politicians will look at their budgets and think: "We have a crowdsourced free dataset, which we use to keep an expensive infrastructure up to date. Couldn't we just use that data and use a fraction of our current resources to help keep that dataset up to date?".
But even if OpenAddresses works for addresses, it would still mean you'd have to find the best project for your use case, for every use case you have. I like having just one repository, where people with very diverse needs and interests are forced to interact.
Governments make set-in-stone definitions of what will be in the dataset. But that's like a planned economy. It adapts to the needs of the past, not the future. It's perfect at producing bakelite fixed phones, but could never invent the cellphone. If I want to go to a building inside a large private area, only OSM will get me there. The private roads are not government managed, so not on the map. The buildings might have no separate address, just an unofficial name or reference. Even if they did, governments would probably start mapping them in separate silos, then think about integrating them. But OSM maps what our mappers are interested in, and the data is integrated by default.
If you need data which has a clear definition, you'll probably be best off using government data. If you value flexibility, you'll probably be better off with OSM data.
Example: - black: government data where trails are out of scope - red: government data, including paths - blue: OSM
Note how there is no way to exit the park in the south-west. Oops! You will not see this kind of error in OSM, as this is one of the most important paths if you want to actually use the park. In fact, this trail was added back in 2008. On the other hand, we missed the service road north east of the path. But who is going to miss that?
It's a challenge to our economic model
Did I mention we're low cost? Based on my back-of-the-envelope calculations, the entire OSM map of Belgium would have cost about 3 million euros at local labour costs. There is no overhead whatsoever, as that is funded by the OSM Foundation. Imagine showing the current OSM map of Belgium to a minister in 2007, and saying you will make this with 3 million and nine years' time. OK, you might feel obliged to fund a server drive once every few years, maybe donate 20.000 euro? They would probably laugh in your face.
Honestly though, they would also laugh in your face if you explained that agricultural lands would be mapped in -almost- all of Flanders and just some random parts of Wallonia, oh, and, sometimes with a distinction between meadows and crop fields, sometimes just as one category.
But how did we do so much in so little time? Maybe because of our messy data model - where you go in to correct a street name and wind up fixing ten different mistakes. Maybe because we only do the work when we feel like it and we stop when we're tired of it. If you work for someone else, this part of your job is often only a fraction of your time. The majority of the time is used for such things as administration, meetings, evaluation and procrastination. This shows a bit of the Utopian vision behind projects like OSM. Imagine a society where people have the time to do what they want to do. How much more time would you spend on useful side projects like OSM if you didn't need to do the hours somewhere? This is an optimistic answer to the fears for the workless society some people see arriving. Add in a basic income and you're all set. It's an idea popular within the Pirate Party movement, which by coincidence is exactly the same kind of organization as OSM: a swarm.
OSM to me is one big experiment - and I love being part of it.
It's empowering (and fun and quick)
I don't just value using good maps. I value using my own maps. My wife and I once volunteered in an area where hiking guides tried to monopolize the region for their own. We happily started creating and using our own maps to empower ourselves and independent tourists. Creating a map from scratch is a powerful experience. Where the map isn't empty, I like to be able to fix the map myself. I like the feeling of seeing my fix appear on the map - for everyone else to use. I like that I don't have to wait for anyone else to fix it for me. I like how even a densely mapped place becomes partly "mine" by adding that restaurant I went to. It's a primal thing, almost like putting a graffiti in a public bathroom - we're all tempted. Tools like Pascal Neis's Your OSM Heat Map tempt you to stamp your name onto your local area - or to the places you traveled to.
It is empowering when you spot a mistake in OSM and fix it the same day. It is not when you spot the same mistake in government data, have to make an official note, and see it fixed a full three months later. When using OSM data, you dig just as deep as you want to. Using someone else's data, you are delegated your role and that's where it ends.
As a data user, if you use official sources and something goes wrong, you can sue someone. If you use OSM, you can fix the issue and prevent similar future issues.
It broadens the horizon
Using a GPS unit can make you lazy. It can lessen your map-literacy. But often my wife will look at me - are we going left instead of right because left is unmapped? I don't follow the plan, I go to the places where I'm most likely to find something new. At every crossroads, I will take the road that hasn't been mapped yet. Using an OSM based navigation app, you're not just navigating: you're looking out for improvements the whole time. This is especially true while hiking, where one long walk can result in one large changeset.
It's the community, stupid
Google Maps is a company trying to get you or your data to work for them. Governments reluctantly involve citizens in predefined roles. But OSM is a community of people. Rough edged at times, but incredibly helpful - even if you ask stupid questions.
Though I started mapping alone, it wasn't until I met other mappers at the Meetups in Gent that I really became involved. As OSM is such a chaotic community, there is nothing like talking to people to get a feeling for it.
As you learn, you start to teach - made easy with the beautiful help site. OSM is an ecosystem of people with the most diverse interests, and having diverse people work together is the perfect recipe for creativity and progress.
TLDR: Scroll down for some "pretty" maps showing paved and unpaved roads. In between is a wall of text about how and why I made these.
Waiting for a paved/unpaved road map
So I've been waiting for someone to make a useful map for navigating South America for quite some time. When you want to drive from A to B in South America, there is one essential piece of information you want: is the road paved or unpaved. When you want to travel slow and enjoy using your 4x4, you want the unpaved roads. When you feel sympathy for your kidneys or your car, you tend to stick to bitumen. Either way, you need to know.
Surprisingly, there are hardly any maps available that show this. Paper maps are hopelessly out of date even for basic road network completeness. OSM to the rescue! The road network completeness is pretty impressive considering the relatively small OSM communities there. And even the surface tags are mostly mapped - and I can tell from experience: generally correct.
So use the Humanitarian style. That only shows road surface when you zoom in. You tend to make route planning decisions from far away. I'm in Lima and I want to go to Titicaca over Cuzco, that's zoom level 8. I don't want to zoom in to level 11 to see which roads are paved. Also, default rendering is "paved", so you can't tell the difference between paved and untagged roads. As finding an unpaved road in reality is a nastier surprise than the other way around, it would be better to switch it around.
So post an issue to the main style maintenance. Well, someone did that two-and-a-half years ago. And even with the recent road rendering shakeup, nothing changed to address the issue just yet. One of the problems for mere mortals is that you have to develop the solution yourself, then hope the maintainers (and the community) accept it. And for that to happen, your solution should play well with the rest of the style.
If there's one thing I've learned from the OpenStreetMap community, it is that if you want something to happen, you should do it.
While on the road myself, I used Osmand as a solution. Osmand has road surface and even smoothness rendering. You can tweak viewing so that it almost works for lower scale viewing. I tried editing the style myself, but I found zero documentation as to how to do it, and my simple tests did not work at all. I'm also not sure the needed data is even in the generalized world basemap which one would have to use.
Getting exactly the OSM data you need is hard, until you discover Overpass Turbo. It really is a tool that makes querying OSM data accessible to the non-programmer. This was the best solution I could find while travelling myself. Using this Overpass-Turbo query I downloaded just the paved roads as a GPX. Then move it to the right Osmand folder, change the standard rendering of GPX files and voila, you have a tool for half a country. Just make sure you don't accidentally use the GPX for routing :)
While this helped for me, it's hardly a good solution for less nerdy people (yes, I know, compared to other people here I know next to nothing about computering).
So I've been experimenting with different solutions, when whining at issue trackers didn't help. First try was to use the same GPX I used in Osmand as a layer in Umap. For example, when collecting information about paved roads in Bolivia. In this case, downloading a snapshot of data and uploading it to Umap was a good solution. The idea was showing the amount of roads added with the project, so different versions of the same query are overlayed there. (you can even download the data for just one country with a query like this)
But I did run into some limitations. When I tried a map of the whole of South America, the amount of data was becoming a problem. First of all, downloading it with Overpass Turbo crashed my browser. The nice people at help.openstreetmap.org were able to offer a solution: though it isn't obvious, you can actually download OSM data with Overpass-Turbo without rendering it in your browser.
Loading this much data in Umap wasn't really an option. The site would tend to crash as you uploaded. And it doesn't really work for a user either, as you have to wait for all the data to download and there seems to be an issue with getting background tiles when using larger datasets. And another issue: if you want to use the map as a tool for mappers too, you need to use live OSM data. Surface tags get added every day, and I'm not one to go update the map often. Luckily, there are some articles on how to use Overpass Turbo directly within Umap. But unfortunately, the needed queries are simply too big to use at the scale I wanted. There is an idea circulating to use an intermediate solution between live and uploaded data, which might actually become reality.
Retreat to QGIS
When the question about travelling on paved roads in South America kept creeping up on some forums I'm active on, I tried again. I thought I'd try and make an example of what I want to do in QGIS. The shapefiles provided by Geofabrik are only available country by country, and they seemed like overkill for my goal. So I revisited the download-without-render and adapted the query to return the highways for the whole of South America. Not to download too much at once, I split between highway types (example for primary roads).
Getting the data read into QGIS is straightforward - once you know how. The one thing that wasn't obvious to me was that the "Export raw data" option in Overpass-Turbo isn't readable by QGIS by default. You have to change the desired data type in the query to XML from the standard JSON. By the way, you can also change it to CSV if you want to do things like get a list of all named roads in a place.
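The actual queries were linked from the original post. As a stand-in, here is a minimal sketch in Python of what such a query can look like: one highway class, a bounding box, and the output type set to XML so QGIS can read the result. The bounding box is a rough guess at South America, the function name is made up, and the endpoint is the public overpass-api.de instance. The sketch only builds the query and the URL, it does not send anything.

```python
from urllib.parse import urlencode

# Public Overpass API instance; other instances exist.
OVERPASS_ENDPOINT = "https://overpass-api.de/api/interpreter"

# Rough bounding box for South America (south, west, north, east); an assumption.
SOUTH_AMERICA = (-56.0, -82.0, 13.0, -34.0)


def build_highway_query(highway_type, bbox, out_format="xml", timeout=900):
    """Build an Overpass QL query for one highway class inside a bounding box."""
    if out_format not in ("xml", "json"):
        # CSV output needs an explicit column list, which this sketch skips.
        raise ValueError("this sketch only handles xml and json output")
    south, west, north, east = bbox
    return (
        f"[out:{out_format}][timeout:{timeout}];\n"
        f'way["highway"="{highway_type}"]({south},{west},{north},{east});\n'
        "out body;\n"
        ">;\n"
        "out skel qt;"
    )


query = build_highway_query("primary", SOUTH_AMERICA)
print(query)
# URL that would return the raw data without rendering it in a browser map.
print(OVERPASS_ENDPOINT + "?" + urlencode({"data": query}))
```

Running it once per highway class (motorway, trunk, primary, secondary) mirrors the split between highway types described above.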
QGIS is an amazing GIS program that easily beats the un-free alternative ArcGIS when it comes to reading different file formats and rendering large datasets. But you can't just drag and drop OSM files, unfortunately. As I found out using the Learn OSM pages about QGIS, it is not complicated. You don't even need a plugin. Just go to Vector>OpenStreetMap>Topology from XML. This creates a Spatialite database from your OSM file. Then Vector>OpenStreetMap>Topology to Spatialite lets you create a layer with just the tags you want.
This is where the power of QGIS becomes quite apparent. Secondary roads up to motorways for the whole of South America are rendered in a few seconds - and this is 700 megabytes of vector data. It took me a little while to understand how defining drawing styles works in QGIS. Surface tagging is complicated, as the distinction paved/unpaved is in the same tag as detailed information about what kind of pavement or lack thereof is used. But it's easy to make a set of rules.
Paved (blue): surface = 'asphalt' OR surface = 'concrete' OR surface = 'concrete:plates' OR surface = 'paved' OR surface = 'paving_stones' OR surface = 'sett' OR surface = 'paving_stones:30'
Unpaved (dotted red): surface = 'unpaved' OR surface = 'dirt' OR surface = 'grass' OR surface = 'gravel' OR surface = 'ground' OR surface = 'sand' OR surface = 'earth' OR surface = 'pebblestone'
I could have added things like "asphalt;concrete" or "pavimentado" to that style to use as much possible data. But I don't want to clean data with the visualization - I'll go clean the actual data.
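The same rules can be written down outside QGIS too, for example to count how many roads fall in each class. A minimal sketch in Python, using exactly the value lists above; the function name and the "unknown" label for everything else are my own choices for illustration:

```python
# Value lists copied from the QGIS style rules above.
PAVED = {
    "asphalt", "concrete", "concrete:plates", "paved",
    "paving_stones", "sett", "paving_stones:30",
}
UNPAVED = {
    "unpaved", "dirt", "grass", "gravel",
    "ground", "sand", "earth", "pebblestone",
}


def surface_class(tags):
    """Classify a road by its surface tag: paved, unpaved or unknown."""
    value = tags.get("surface")
    if value in PAVED:
        return "paved"
    if value in UNPAVED:
        return "unpaved"
    # Untagged roads and unlisted values such as "asphalt;concrete" end up here
    # on purpose, so they stand out as data that still needs cleaning.
    return "unknown"


print(surface_class({"highway": "primary", "surface": "asphalt"}))
print(surface_class({"highway": "tertiary", "surface": "gravel"}))
print(surface_class({"highway": "trunk"}))
```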
Once you have defined these three types, you can play with rendering quite easily. Adapted to trunk roads, you can save the style as a file and load it to another layer quite easily. Just change the width a bit, and you are starting to build a style. (A way to simplify this for re-use would be to download all the main-road data in one Overpass query, and add a "highway=* AND ..." rule to the lay-out style, so you can do all the rendering within one QGIS layer. This render rule would then be shareable as just one file.)
Look here, maps!
The maps I was able to produce so far are definitely useful. They helped me map surface tags for several hundred kilometers I had driven but not mapped yet. But once you use the Openlayers plugin to add a background map, it quickly becomes apparent how hard it is to style complicated data. The gray, which is quite intuitive as a color for the unknown, is the same as border colors. Blue is the same as rivers. Red becomes unreadable to 10% of men if on a green background.
A useful map: say you're planning to do a little tour to Argentina, starting and ending in Santiago de Chile. The road to Mendoza looks fine (I added the missing bit by now), but make sure not to take that primary unpaved road for the last part. While driving South, do a little detour to the east. When driving back to Chile, make sure you calculate some extra time as you need to do a small unpaved part. If you want to drive the coast in Chile, take into account that there are some missing links. You're probably better off driving a bit to the north before heading west.
An ugly borderline useful map: say you want to drive from Caracas to Ushuaia. You can't really make out which road to take yet, but it is quite obvious that you do have some options to stick to bitumen if you want to. Biggest problem is Colombia, where very few roads have the surface tag.
Using data is cleaning data
A data issue: the national communities have made some very different decisions about their national road tagging. In Chile, unpaved roads are almost always tertiary at most, even if they are important. Trunk roads are hardly used at all. In Peru, nationally managed roads are trunk, even if you really need a Land Cruiser to make it. In Colombia (and Ecuador to a lesser degree), surface tags seem to be considered unnecessary, as everyone knows all main roads are paved anyway. Ecuador explicitly uses road quality to decide on road classification - surface tags are therefore largely redundant.
This makes styling a lower scale map quite hard. It would be nice if everyone followed the OSM philosophy that road classification should reflect importance of the road above all else. In Europe, simple rules work, because road quality and importance correlate strongly. But in South America, in some countries they do, and in others they don't. Argentina did a great job mapping surface, so it is possible to make a good road map there. But as long as no major map style takes this tag into account at low zoom levels, you still have a large risk of sending people to the unpaved trunk road when there is a paved primary road available for the same trip. Data usability, in my opinion, trumps logical simplicity.
Maps that explicitly use the surface tag are of course the best motivation for mappers to add this info. Hopefully I can get some hints on moving forwards. Otherwise I'm already quite happy showing some of you the quality of the data that's already there - mapped even though there is so little immediate reward.
EDIT: A mapper's tool
After a comment from PlaneMad below, I had another look at his diary, and found this gem. With just a bit of fooling around, you can use Overpass Turbo to actually style your output a bit. So I made this map, which shows you the live data rendered in a way to highlight paved, unpaved, undefined or incorrect roads. We had some ITO maps available already, but this solution is fun as it gives instant gratification (any update is reflected within minutes if not seconds) and can easily be tweaked to show the roads or level of detail that interest you.
Getting it online
Back to the main problem: how to share a map like this. While QGIS has a little tool for converting a project to Leaflet, the amount of data involved here excludes that as an option. But even using the built-in Print Composer didn't result in anything presentable. One would have to fine-tune the rendering exactly to the desired scale to make it work. The Openlayers background fails to render properly in the outputs. So far, the best way to make a pretty map out of this has been to just take a screenshot.
The only thing that would probably work is using something like Mapbox. But Mapbox doesn't come with live Overpass connectivity, and the vector data I would like to use is way too big for my free account. I asked Mapbox for suggestions, and was referred to the QA tiles. But I don't think that's a real solution, as you would still have to upload the data and update manually. So the only real solution would be to have Mapbox include the surface tag in their "roads" layer. There I go again, asking other people to solve my problems :)
Give me a shout if you want to try something similar and think I could be of help. Or even better, tell me what I could try next.
8900 people. That's all it took to make one of the best maps available of Belgium. (*1)
I don't believe there's a decent way to count labour hours, but here's a rough number: 61 labour years, assuming 200 days worked a year, 8 hours a day (*2). Considering Belgian labour prices, I'd guess that represents at least 3.000.000 euros.
I started doing these statistics after someone assumed that the southern/Francophone part of Belgium was underrepresented in OpenStreetMap. There's nothing as fun as being able to check these things. Some numbers I published before: it looks like the Dutch speaking part is mapped in more detail.
But the best simple proxy of map quality seems to be contributor density. So where are the contributors at?
Well, they're in Flanders.
It would be silly to stop there: there are more people in Flanders. You could divide them by area, but I believe the amount of data needed to map something is more dependent on people than on space. The Sahara is quite large, but you'll never need as much data to map it as you would for little old Belgium. So here's the same graph, in contributors per million inhabitants:
And there you go: the Flemish are the laggards, Brussels and Wallonia lead. This is really counterintuitive. I started out ignoring this, but it kept nagging in the back of my head. Remember how data density is higher in Flanders.
Then I thought about how one of the most productive mappers in the world lives in Flanders. So what would happen if we just exclude this one guy?
Turns out 44% of all nodes in Flanders were mapped by one person. In Brussels too there is one person who added about 30% of all nodes. Wallonia simply doesn't have someone like this, with the top contributor adding "just" 10% of all nodes. So I made the same graph, but without the number one contributor in each region.
Suddenly, we're all the same. Try and make our politicians believe that!
So that goes to show that even in a densely mapped country like Belgium, one person can still make all the difference.
That takes us back to basic community statistics in Belgium. Here's the number of active contributors per year per region. The bumps in the curve in Brussels are probably because of the small size of the region - just over a million inhabitants.
If we take into account people with at least 5 sessions (active on at least five different days in a year), the numbers drop steeply. Wallonia is clearly number one here, with Brussels and Flanders quite a bit lower.
When it comes to recruiting new mappers, Flanders comes in last.
Do people cross borders? Well yes. To define "home", I first took a subset of people with at least five sessions in Belgium over all years. Then I simply looked at the region they had most sessions in. Of course, you will have some foreign people this way. It leaves us with 83 Brussels mappers, 995 in Flanders and 675 in Wallonia. Of the Brussels mappers, fully 60% mapped at least 10% of the time across the border. Pretty logical of course, because it's small. Only 18% didn't ever cross over. In Flanders, the numbers are 28% and 50%. In Wallonia a similar 25% and 56%.
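The "home" rule described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the input layout (one region name per active day of a single mapper) and the function names are made up here, and the real numbers came out of the history database, not from code like this.

```python
from collections import Counter


def home_region(sessions, min_sessions=5):
    """sessions: list of region names, one entry per active day of one mapper.

    Returns the region with the most sessions, or None below the threshold.
    On a tie, the region that appears first in the list wins.
    """
    if len(sessions) < min_sessions:
        return None
    return Counter(sessions).most_common(1)[0][0]


def cross_border_share(sessions):
    """Share of a mapper's sessions that fall outside the home region."""
    home = home_region(sessions)
    if home is None:
        return None
    return sum(1 for region in sessions if region != home) / len(sessions)
```

A mapper with six sessions in Brussels and two in Flanders gets Brussels as home and a cross-border share of 0.25, so this mapper would count among those who mapped at least 10% of the time across the border.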
I've been working towards creating these kinds of numbers for all regions in the world and dump them into a statistical platform. It'll be some time till I can realize that...
Here's a link to some of the data I used
*1. Well, actually, a bit more by now: I used the history dump of January 2015.
*2. I counted every active day per user as one labour hour. It's just a number I made up. You can make up your own if you want. The number of sessions (total number of active days of all contributors) is 97.270.
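For anyone who wants to make up their own number, the arithmetic behind the 61 labour years is short enough to write down. The function name is made up; the defaults are the assumptions stated in the text (one hour per session, 200 days a year, 8 hours a day).

```python
def labour_years(sessions, hours_per_session=1, days_per_year=200, hours_per_day=8):
    """Convert a session count into full-time labour years."""
    return sessions * hours_per_session / (days_per_year * hours_per_day)


# 97.270 sessions is the figure from footnote 2.
print(round(labour_years(97_270), 1))  # 60.8, which rounds to the 61 quoted above
```

Change hours_per_session if you think an active mapping day is worth more (or less) than one hour of work.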
Three weeks of @mapillary mapping. Most eventful day: aggressive Porsches overtaking, goats on the road, snow avalanche, overtaking Porsches with an accident
Just back from a three week road trip, mostly in Italy (here's the complete GPS track in a pretty umap, obviously already available for mapping purposes). Just before leaving, I got a mail from Mapillary asking how come I stopped mapping with them. I explained how I use my smartphone for both navigation and Mapillary, but you can't do both at the same time. This is an Android limit: an app is not allowed to take pictures while in the background. There was an idea to get around that by making an Osmand plugin, but there doesn't seem to be progress on that. Anyway, I mentioned I do have a second phone I could use, just no mount. So for the second time, they sent me one of their perfect little smartphone mounts. Of course, now I had a moral obligation to be Mapillary mapping the whole trip.
This is how we ride:
In three weeks, you take mostly boring shots. Half of any picture is asphalt, which doesn't help. But the last real travelling day was pretty cool. Got illegally overtaken by a group of Porsches, goats on the road (more behind the curve!), did a 2500 meter mountain pass, shot a minor snow avalanche (move forward two pics for full effect), saw a group of Porsches having a minor accident (schadenfreude all around). All in a day's work! Those Porsches did catch up with us again, while we were cooking a nice dinner on the side of the road.
Here are some lessons learned.
You need a willing co-driver, or stop from time to time. I did have some app stability issues, you need to check the orientation of the camera from time to time, etc. It was probably device-specific, but it took me a while to get the settings right. No background threading of pictures, no Osmand running in the background. That seemed to do it, even for full size pictures.
You need a good camera. Smartphone cameras tend to vary in quality by quite a large margin. My OnePlusOne did reasonably well, my wife's Samsung S5 was poor indeed.
You need a clean window. This is harder than it sounds. On bright days, you get bugs. On gray days, you have raindrops. Some specks are hardly visible with the naked eye, but act as a kind of lens and make ugly spots. Mostly, it's just irritating reflections that mess up pictures. So I was thinking, maybe one should try to put a polarising filter on the lens?
You need plenty of disk space. Yes, you can take small size pictures, but resolution does have its advantages, especially for road signs. And the Italians have A LOT of those. Not a problem with my OnePlusOne (64 gig memory), but close to ridiculous with the Samsung S5: in theory 12 gig, but in practice you can be happy if you have 2 gig spare space. And on a longer road trip, you are going to need some separate storage anyway. I took 80 gig of pictures in total, so I had to keep moving pictures to my laptop. Which isn't as easy as it sounds, as we didn't have 220 volts that often. You can just move pictures back and forth between your smartphone and external storage. When you put the pictures back in the proper folder, the app recognizes them. Just don't forget that Mapillary assumes you don't want to keep a copy of the pictures yourself. They are automatically deleted from the device as you upload them.
You need a device dedicated to Mapillary. You can't run it in the background, you have to leave the device in place for as much as possible.
You need good weather. On rainy days and in bad light conditions you get a lot of bad pictures. That proves to be a real dilemma for me. Bad pictures are better than no pictures, right? I don't want to pollute the Mapillary database with ugly pictures, but on the other hand, even on a bad picture you can often make out what the traffic sign says. And there is always some info: number of lanes, guard rails, bus stops. Who knows what info you are deleting that someone might find useful? And who knows when the next photographer will be there?
And you need time: reviewing 60.000 pictures is always going to take a while, no matter how quickly you go through them. Ideal for those half-asleep train rides back and forth to work. So it will take some time before all the pictures are online.
After you come back, you need bandwidth. I have a monthly quota of 100 gig and about 80 gig of pictures to upload. So I'll have to spread them out somewhat. If you have even larger sets, I believe snail mail will be the faster and cheaper option. As everybody knows, no wired connection beats the bandwidth of a pigeon with a flash drive.
OSM quality in Italy: pretty good!
The occasional new roundabout is missing, but quite a lot of POIs are there, most forests are mapped, even most trails seem to be mapped. Of course, there's always something to improve. For example, max speeds are often missing or wrong. A lot of fixing is simple (wrong one-ways you noticed, simple mess-ups), but often it isn't. Italy has a huge amount of old towns and villages, and these cannot be mapped properly from aerial pictures. There are just too many little alleys, often underneath houses. Not even GPS will help you there. So you either need to print out maps or use a mobile mapping app and get a local data plan.
Hiking and Mapillary
We did do a lot of little hikes, but I didn't take any pictures on those. That really is a different speciality. You need proper gear, as walking around taking pictures the whole time is neither easy nor fun. And it would quickly kill the battery. I asked my wife if she would still travel with me if I wore something like this. She seemed to be OK with that, surprisingly. So maybe we'll have to look into that. On some of the trails we did, a backpack like that would have been rather impractical though.
Only for some Bolivian roads do we know whether they are paved or not. There are several tools to check this information, such as these ITO maps. It can also be visualized in Osmand. But no map style shows this road quality at a very low zoom level. So I made this little map that shows it nice and clear.
State of the map, 21/9
State of the map, 26/9 (blue = new since 21/9)
What it shows above all is that a lot is missing. We can't use the ABC's information, because it lacks an open data license, and also because it isn't always correct. For example, they show the road from Potosi to Tarija as "under construction", although only a few hundred meters are actually under construction. That's why we are asking for your help. Do you know which roads are paved in Bolivia? You can correct it yourself, or you can point us to the missing parts. Showing it on the map is easier than describing it. With this example you can move the start and end point of the paved stretch, or you can search for the places between which the road is paved. When you're done, copy the URL and paste it as a comment below, send it to my Twitter user or send it to joost.schouppe at gmail.com.
The map I made doesn't update automatically, as that works extremely slowly with Overpass Turbo. But I will update it every so often, hopefully we'll see a big change! Or if you have no patience, you can always see the up-to-date version here.
Mapping with your help
In 24 hours, we mapped the roads Potosi-Uyuni, Potosi-Villazon, Santa Cruz-Yacuiba and Santa Cruz - Puerto Quijarro. That's already 1600 kilometers more asphalt for Bolivia. Joost delivers :) There is still a lot missing. Check here whether there are more paved roads missing.
On 26/9: another 600 kilometers, with the roads Rurrenabaque-Yucumo and Trinidad-Santa Cruz
What still needs clarifying
Villamontes - Paraguay border: only one part is missing, is it really missing?
Sucre: is there really only asphalt towards Potosi?
Old Cochabamba - Santa Cruz road: I know there is a 130 km part without asphalt, but it seems there are parts that still need mapping > ALREADY MAPPED
San Ramon - San Ignacio de Velasco - San Matias: supposedly old asphalt, lots of potholes. Right? > ALREADY MAPPED
San Ignacio de Velasco - San José de Chiquitos: paved or not? > ALREADY MAPPED
Somehow, I was able to not worry about multipolygons until recently. You see, if you want to cut up the planet into little pieces according to administrative borders, you are bound to meet those. One expects a place to have a simple border, forming a long closed line. Reality is more complicated. My home country Belgium is a fine example. Brussels is a simple polygon. But Brussels is also a hole cut into Flanders, the northern region. So Flanders is a multipolygon. You need to know the shape of the larger area, the shape of the smaller area within it, and the fact that you need to exclude this inner area. And then there is that extra non-connected bit in the east, Voeren. We also have the relatively famous Baarle-Hertog, which has bits of Holland within bits of Belgium within Holland. Nothing a multipolygon can't do on a Wednesday afternoon.
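To see why software has to know about the hole and the extra bit, here is a small pure-Python sketch of a "is this point in the region?" test for a multipolygon. It uses the standard even-odd ray casting rule. The data layout (a list of outer rings, each with its list of holes) and the function names are assumptions made for this illustration, and the toy coordinates in the usage note are not real borders.

```python
def in_ring(point, ring):
    """Even-odd ray casting test for a point against one closed ring."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Count edges that cross a horizontal ray going right from the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def in_multipolygon(point, parts):
    """parts: list of (outer_ring, [inner_rings]) tuples."""
    for outer, inners in parts:
        if in_ring(point, outer) and not any(in_ring(point, h) for h in inners):
            return True
    return False
```

With a toy "Flanders" made of a 10 by 10 square with a small square hole in the middle (Brussels) and a separate small square to the east (Voeren), a point in the hole is reported as outside, while a point in the separate bit is inside. A tool that only reads the first outer ring gets both of those wrong.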
However, a lot of software can't handle multipolygons. One of those is the otherwise amazing osmpoly-export QGIS plugin [UPDATE: since March 2016, it does handle it!]. I used that one to convert my shapefile (OGR) archive to the POLY file format I needed for the History importer. POLY is a standard in the OSM community. I mostly use programs with a user interface, so the QGIS plugin was my tool of choice to build a dataset of all the regions in the world based on OpenStreetMap (part of my larger project). And my sloppiness means that these pretty statistics for test-case Flanders were based on this not so pretty image:
I only found out because I learned how easy it was to extract shapefiles from the database created by the amazing OSM history importer. And it was only under the stimulation of the similarly amazing Ben Abelshausen, using his virtual machine, that I actually gave it a shot. Creating a shapefile of all the highways valid on January 1st, 2015 is as simple as this:
$ pgsql2shp -f /home/joost/Documents/test/highways -h localhost -u USERNAME -P PASSWORD USERNAME "SELECT id AS osm_id, tags->'highway' AS highway, geom AS way FROM hist_line WHERE '2015-01-01' BETWEEN valid_from AND COALESCE(valid_to, '9999-12-31') AND tags->'highway' LIKE '%'"
(Note: the $ sign is just there for show, never actually copy it)
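The heart of that query is the date filter: a version of an object counts if the chosen day falls between its valid_from and its valid_to, where a missing valid_to means "still current". The same logic as a Python sketch, with a made-up function name, and using plain dates where the database may well store timestamps:

```python
from datetime import date

FAR_FUTURE = date(9999, 12, 31)  # stand-in for "still valid", as in the SQL


def valid_on(day, valid_from, valid_to=None):
    """True if a version with this validity interval was current on `day`."""
    end = valid_to if valid_to is not None else FAR_FUTURE  # same idea as COALESCE
    return valid_from <= day <= end  # BETWEEN is inclusive on both ends
```

So a road version created in June 2014 and never replaced passes the test for 2015-01-01, while one that was replaced in December 2014 does not.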
Of course there is a solution for the multipolygon problem. It just ain't as easy as a QGIS plugin. For me, that is. There are some tools listed at the Polygon Filter File Format wiki page. What we need is the ogr2poly.py script.
And that's where the wiki seems to stop. It refers to a subsite where you can download it. Within the .py file, the only thing it says about using it is this: Requires GDAL/OGR compiled with GEOS.
There are some tutorials around, I'll try to write this with the absolute beginner in mind. After reading a bit, I decided to try on my virtual Ubuntu machine. The first steps will probably be similar in Windows, but probably not the solutions.
First, you need to know that .py means that this is a Python script. That means you will need Python installed in order to be able to run things. Simple check: go to the command line and type "python". If you don't have it yet, you can download Windows installers here. Because it's open source, you can choose between about 100 different versions. I'd go with the first one. On Linux systems, it seems to be preinstalled most of the time.
Next, install GDAL/OGR. You can check if you already have it by typing "ogrinfo" in the command line. I didn't, so I installed it with the help of this nice little manual, which did the trick:
$ sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable && sudo apt-get update
$ sudo apt-get install gdal-bin
Then the .py file also said it needs geos. I checked, typing "geos-config" in the command line. It seemed just fine.
So it was time to try the actual script. This guide said something about that, though I didn't really follow it. I just put the .py script into a new folder "OGRtoPOLY" in my home directory. Note: in the graphical user interface, it looks like OGRtoPOLY is a subfolder of /home. However, the "real" directory would be /home/username/subfolder. The following command did access the .py file in my case. I put the shapefile and all its collateral files in this same directory.
$ python /home/joost/OGRtoPOLY/ogr2poly.py /home/joost/OGRtoPOLY/europeregions.shp
But of course, that still returned an error: I needed osgeo. I tried following the instructions here, entering these commands:
$ sudo add-apt-repository ppa:ubuntugis/ppa
$ sudo add-apt-repository ppa:grass/grass-stable
$ sudo apt-get update
$ sudo apt-get install grass70
That ran error-free after I replaced grass70 with just grass. Python still returned the same error. More googling told me to do this:
$ sudo apt-get install python-gdal
$ sudo apt-get install gdal-bin
And we struck oil.
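Looking back, all of those checks can be done in one go. A small sketch, assuming the same names used above ("python", "ogrinfo" and "geos-config" as command line tools, "osgeo" as the Python module that was missing); the function name is made up, and on other systems the tools may be called differently.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Report which of the tools and modules used above can be found."""
    found = {}
    for tool in ("python", "ogrinfo", "geos-config"):
        found[tool] = shutil.which(tool) is not None
    # osgeo is the module the script complained about before python-gdal was installed.
    found["osgeo"] = importlib.util.find_spec("osgeo") is not None
    return found


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name:12} {'found' if ok else 'MISSING'}")
```

Anything reported as MISSING points to the install step above that still needs doing.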
The script allows for clever naming of the output files (one poly file for each feature). It can simplify geometry and create a buffer to make sure all the data you need really is in there. You can find the options for that by searching the .py file for "Setup program usage". For example, this command returned all the poly files I needed with names "europeregions_xxxx.poly", where xxxx is the feature's attribute idNUM. Output files were just dropped in my home folder, I saw no way to change this.
$ python /home/joost/OGRtoPOLY/ogr2poly.py /home/joost/OGRtoPOLY/europeregions.shp -f idNUM
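If you want to check the resulting files by eye, it helps to know what a POLY file looks like. The sketch below writes one by hand, following the layout as I understand it from the wiki page: a name line, then numbered sections of coordinates, each closed by END, with a "!" in front of the section name for holes, and a final END for the file. This is not how ogr2poly.py does it internally; the function name and data layout are made up for illustration.

```python
def to_poly(name, parts):
    """Format (outer_ring, [inner_rings]) parts as POLY file text.

    Rings are lists of (lon, lat) tuples.
    """
    lines = [name]
    section = 0
    for outer, inners in parts:
        for ring, is_hole in [(outer, False)] + [(r, True) for r in inners]:
            section += 1
            # A leading "!" marks a section that must be subtracted (a hole).
            lines.append(("!" if is_hole else "") + str(section))
            for lon, lat in ring:
                lines.append(f"    {lon:E}    {lat:E}")
            lines.append("END")  # closes this section
    lines.append("END")  # closes the file
    return "\n".join(lines) + "\n"
```

A region with one hole should therefore give you a file with a section "1", a section "!2" and three END lines. If the "!" section is missing for something like Flanders, the hole got lost along the way.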
I hope this helps. If you can clarify some of the stranger things I stumbled upon, let me know. If you think this info could be of better use somewhere else, do copy-paste or let me know what to do. If you're trying to do the same and run into trouble - sorry, can't help you! Just kidding, I'll try.
This week, I found a couple of villages in Bolivia (my area of greatest interest) that had the name "aldea" (Spanish for "hamlet"). Looking closer, I found more than 600 in the country. Finding errors like this is easy with Overpass Turbo. It has a wizard, where you enter name="aldea" and that's it. Thanks to the OSM Argentina Twitter account, I knew you can search within a country, not just in a bounding box. Here is the result. I left a couple of villages to show it.
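For reference, a query of this kind can also be written by hand instead of through the wizard. The Python sketch below only builds the Overpass QL text. The function name is made up, restricting the search to nodes is an assumption (the places in this case were nodes), and the query the wizard generates may look different.

```python
def build_query(country_code, key, value, timeout=60):
    """Return Overpass QL that finds nodes with key=value inside a country."""
    return (
        f"[out:json][timeout:{timeout}];\n"
        f'area["ISO3166-1"="{country_code}"][admin_level=2]->.searchArea;\n'
        f'node["{key}"="{value}"](area.searchArea);\n'
        "out body;\n"
    )


print(build_query("BO", "name", "aldea"))
```

The printed text can be pasted into Overpass Turbo. "BO" is the ISO country code for Bolivia.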
Obviously, the "name" tag is not meant for describing what something is. The nodes were already classified as place=hamlet, village, etc., so the name didn't carry any extra information either. They were mostly untouched nodes, from remote mapping I imagine - it's not that someone replaced the real name. I consulted a bit with the Bolivian community, and we decided to clean up right away.
As a Potlatch 2 mapper, I had no idea how to fix this in JOSM. Every time I give JOSM a try, I get discouraged within 15 minutes. I know, it's my problem.
I had seen a couple of uses for Level0, and it seemed useful for this little job. It was even easier than expected. Once you've run the query in Overpass Turbo, you can export in different formats. And one of those is exporting directly to Level0. The only thing left to do is give it permission to use your OSM account. I copied the text to Notepad++ and did a "Find and Replace" from "name = aldea" to "fixme = needs a name". You save it, and boom, 500 villages fixed (max 500 things per edit!).
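The same find and replace can be done with a few lines of Python instead of Notepad++. This is a sketch under assumptions: it expects the Level0 text with one "key = value" tag per line, and the function name is made up. It only replaces lines that are exactly "name = aldea", so a real name that merely starts with "aldea" is left alone.

```python
def replace_placeholder_names(level0_text):
    """Swap 'name = aldea' lines for a fixme tag, leaving other lines alone."""
    out = []
    for line in level0_text.splitlines():
        if line.strip() == "name = aldea":
            # Keep the original indentation of the tag line.
            indent = line[: len(line) - len(line.lstrip())]
            out.append(indent + "fixme = needs a name")
        else:
            out.append(line)
    return "\n".join(out)
```

The result still has to be pasted back into Level0 and uploaded by hand, and the 500 objects per edit limit mentioned above still applies.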
I know you shouldn't do a mass edit and just leave it at that. So I thought about creating a Maproulette task to check the villages - maybe there is a duplicate node nearby, like one of the first I found. After reading the "simple" guide to creating a Maproulette task, I changed my mind. But I remembered seeing a little guide for making tasks in Potlatch. So I made a new query to find the freshly fixed villages. I couldn't import them as a task in Potlatch in GPX format, but once exported to GeoJSON it just worked. I reviewed a few, and it's interesting for finding unmapped roads. But not for finding the names, unfortunately.
I already knew Overpass Turbo is excellent. I already knew that together with Umap it can be used to make pretty maps that use live OSM data, like this map of the breweries of my country. And now I see that together with Level0 it can be a tool to turn people discouraged by JOSM into power users.
I didn't even need the Help this time.
One important thing: you can only make changes like this after consulting with the mappers. I didn't do that this time, as it seemed a fairly simple case. But that's not allowed. I'm very sorry. I did it in a case where the documentation is very, very clear about how to map. But in many cases it isn't that clear - and you shouldn't force your opinion about the Right Way to Map on others without consulting.