
A Local Knowledge Dilemma? - A Data-Driven Alert for OSM

Posted by nukeador on 5 December 2023 in English. Last updated on 6 December 2023.

This is a cross post from the HOTOSM blog.

As Community Strategist and Research Lead at HOT, I would like to take a closer look with you all at the evolving landscape of OpenStreetMap (OSM) contributors, especially in the context of local knowledge and its crucial role in our mapping efforts.

Summary

Our recent study reveals a trend in local knowledge contributions in OpenStreetMap: a small but dedicated group of local mappers, only about 3% of the contributors mapping in a given area, is responsible for approximately 75% of the detailed mapping contributions.

This significant finding underscores the vital role of local knowledge and expertise in creating comprehensive and accurate maps, especially in humanitarian and unmapped/under-mapped regions. Despite a general decline in new OSM contributors, the impact of this core group of local mappers remains profound and indispensable for the future of the project.

We would like to engage researchers and mapping communities to uncover the implications of these numbers and the opportunities to use them to better support mappers.

The Spark of Inquiry: Simon Poole’s Analysis

Our journey began with Simon Poole’s important observation: a 20% drop in new OSM contributors. This sparked intense discussions within our team and motivated us to investigate further, particularly focusing on regions where HOT is actively involved.

Our findings confirmed a consistent decline in the number of contributors in most of the 33 countries analyzed over the past five years. Intriguingly, however, the volume of mapped elements, like buildings and roads, has been on the rise. This disconnect between contributor numbers and mapping activity led us to look more closely at the nature of these contributions.

Why Understanding Local Contributions in OSM Matters

Grasping the dynamics of local contributions to OpenStreetMap is more than just number crunching – it’s about ensuring that maps reflect the lived realities of communities worldwide.

In regions facing humanitarian crises or high poverty levels, local knowledge in mapping becomes invaluable. Accurate maps created with local insights can significantly help in delivering effective aid and developing sustainable solutions. Our focus on this aspect underscores the need to nurture and support local mapping communities.

Pioneering Methodologies: Towards a Better Understanding

One of our main challenges was to distinguish between local and remote contributions. With the support of Caleb Fagunloye, our Data Analytics and Insights Intern, we developed a pilot methodology focusing on data contributions indicative of ground surveying or field mapping. This innovative approach, though not without its limitations, allowed us to isolate mapping contributions that are likely to come from local knowledge.

We took Rebecca Firth’s illustrative humanitarian mapping framework and isolated the contributions in levels 2-4, essentially excluding edits to the map that could be made using satellite imagery.

We then looked at the users (usernames) who made these changes to try and understand who was adding local knowledge to OSM in these countries in 2022.
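As a rough illustration of this step, the sketch below keeps only elements that carry at least one "local knowledge" tag key and counts them per username. The tag keys and the record layout are hypothetical placeholders chosen for the example; they are not the tag list or data format used in the study.

```python
from collections import Counter

# Hypothetical subset of tag keys treated as indicators of local knowledge.
# The list used in the actual analysis is different and longer.
LOCAL_KNOWLEDGE_KEYS = {"amenity", "opening_hours", "addr:housenumber", "building:material"}


def count_local_knowledge_changes(elements):
    """Count, per user, the elements that carry at least one listed tag key."""
    counts = Counter()
    for element in elements:
        if LOCAL_KNOWLEDGE_KEYS & set(element["tags"]):
            counts[element["user"]] += 1
    return counts


# Made-up sample records: the last editor and the tags of each element.
sample = [
    {"user": "alice", "tags": {"amenity": "school", "name": "X"}},
    {"user": "alice", "tags": {"building": "yes"}},
    {"user": "bob", "tags": {"addr:housenumber": "12"}},
    {"user": "carol", "tags": {"highway": "residential"}},
]
changes_per_user = count_local_knowledge_changes(sample)
```

In this toy sample only one of alice's two elements and bob's single element are counted; carol's road edit is excluded because it could plausibly have been traced from imagery.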

Key Insights: The Role of Local Champions

Average: Percentage of contributors and changes they made

  • Our analysis showed that a small proportion of contributors (~3%) were responsible for the majority of local knowledge changes (~75%). This highlights the significant impact of a few highly active local OSM champions.
  • However, this also points to a potential vulnerability in terms of sustainability and depth of community engagement. What happens if these key contributors reduce their activity?

The table below shows the full data for some of the countries we analyzed. Note how just a few contributors are responsible for most of the changes:

| Country | Total changes to elements (2022) | # contributors who made these changes | % (#) of contributors responsible for 50% of the changes | % (#) of contributors responsible for 75% of the changes | % (#) of contributors responsible for 95% of the changes |
|---|---|---|---|---|---|
| Nepal | 50239 | 713 | 0.4% (3) | 1.8% (13) | 12% (86) |
| Senegal | 2338 | 172 | 1.7% (3) | 7% (12) | 43.6% (75) |
| Kenya | 7415 | 313 | 1% (3) | 2.6% (8) | 28% (87) |
| Mexico | 38556 | 1078 | 0.5% (5) | 2.6% (28) | 21.1% (227) |

Some of the numbers here were very surprising. For example, in each of Kenya, Senegal and Nepal, just three people were responsible for 50% of all the local knowledge changes to OSM in that country in 2022.
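The "% (#) of contributors responsible for N% of the changes" figures can be derived from a list of per-user change counts. The sketch below is one way to do it, not necessarily how the published notebook does it; the function names and the sample counts are invented for illustration. Only the Nepal figures (713 contributors, 3 and 13 top contributors) come from the table.

```python
def contributors_for_share(change_counts, share_pct):
    """Smallest number of top contributors whose changes reach share_pct % of the total."""
    counts = sorted(change_counts, reverse=True)
    total = sum(counts)
    running = 0
    for n, count in enumerate(counts, start=1):
        running += count
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= share_pct * total:
            return n
    return 0  # only reached when there are no contributors


def percent_of_contributors(n, total_contributors):
    """Express n contributors as a percentage of all contributors, to one decimal."""
    return round(100 * n / total_contributors, 1)


# Made-up per-user change counts with a heavy head and a long tail.
sample_counts = [50, 25, 10, 5, 5, 3, 1, 1]
summary = {pct: contributors_for_share(sample_counts, pct) for pct in (50, 75, 95)}

# Nepal row from the table: 3 and 13 contributors out of 713.
nepal_row = [percent_of_contributors(n, 713) for n in (3, 13)]
```

For the sample counts, one contributor covers 50% of the changes, two cover 75% and five cover 95%. The Nepal percentages come out at 0.4 and 1.8, matching the table.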

Forward Path: Expanding Research and Engagement

  • We welcome improvements to the methodology! The more solid it is, the better our understanding of the OSM community landscape will be.
  • Our early research has opened avenues for more comprehensive analysis, especially of the long-tail contributions of casual mappers and of social science / anthropological explorations.
  • We still need to run this analysis in other locations and to analyze the evolution and trends over time.
  • HOT will continue its peer-to-peer support for individuals and communities so that they can implement collective and collaborative actions, improve resources and skills, and use technology that empowers and facilitates these local knowledge edits.
  • We have also published a notebook with the code to replicate this user extraction and analysis, for the countries and years of your interest. It shouldn’t take more than 15-20 minutes to get some results.

Limitations

The pilot methodology above is far from perfect. For example, we know that it is possible to add ‘local knowledge’ data to the map remotely (MapRoulette campaigns and imports are two examples).

The tag list used for the analysis also has some known flaws, such as road names being excluded even though they are likely to indicate local knowledge.

It is also really important to say that a Kenyan or Mexican mapper who only does mapping using satellite imagery is still a very valid member of that community of contributors! Although we did this research because we believe in the value of local knowledge in the map, it is not a judgment on other mapping methods!

Also note that the analysis includes people who are mapping in their capacity as employees of a corporation or organization, who tend to contribute in high volumes. Taking this into account in a follow-up study and finding ways to exclude them from the numbers might provide a more realistic picture of paid and unpaid contributions.

Open questions from SoTM EU presentation

This analysis was presented by Pete Masters during the recent State of the Map Europe 2023 and some interesting discussion followed.

  • The situation was familiar to members of more mature OSM communities. Those that now have many very active local knowledge contributors recognised that, in the past, a small number of very committed mappers did the majority of the field mapping and surveying.
  • Could these ‘core mappers’ end up being gatekeepers and discourage newer mappers from developing their OSM contribution?
  • Where should we allocate resources in supporting local knowledge contributors: people new to OSM, mappers who have shown an inclination to add local knowledge, or the core mapper group? Do efforts tend to focus on new mappers to the detriment of other groups?
  • Why do mappers do what they do? One hypothesis is that people contributing high quantities of local knowledge data do so because they have a purpose for that data. Is this true? If not, what drives core mappers?
  • How does this analysis look spatially? Does a core mapper in Nepal mean that their home town is very detailed with towns further away increasingly lacking in local knowledge?
  • How does 2022 compare with 2023 or 2021? This is a snapshot, not a trend at the moment.

A Call to Action

Our journey doesn’t end here. We see this as a stepping stone towards a more extensive, nuanced analysis of OSM contributions. We invite community leaders, social scientists, and OSM enthusiasts to join us in this endeavor. Your insights and expertise are invaluable in shaping the future of open mapping.

Discussion

Comment from ImreSamu on 5 December 2023 at 15:10

… Join our #research chat room on Matrix or Slack.

The Slack is only for HOTOSM Members.

Join HOTOSM on Slack
The email address must match one of the domains listed below ( @hotosm.org )
Please try another email.

Don’t have an @hotosm.org email address?
Contact the workspace administrator at HOTOSM for an invitation.

Comment from nukeador on 5 December 2023 at 23:06

Sorry about that. Slack invites are messy, which is why we set up bridged Matrix rooms that anyone can read or register to participate in.

Comment from kucai on 6 December 2023 at 03:41

In my experience, ‘local knowledge’ sometimes just means that they map it from Google Street View. :)

Comment from pedrito1414 on 6 December 2023 at 10:35

Just FYI, this was also presented by me on Rubén’s behalf at State of the Map EU: https://www.youtube.com/watch?v=a9X7hkHTGDw

Comment from tyr_asd on 7 December 2023 at 18:44

Hey Rubén. Thanks for sharing the code of your analysis. However, from what I can see, the results it produces are not quite accurate:

  • First, you seem to be using a non-history OSM extract as the base of your study. By doing this you lose all but the last change to a particular OSM feature when it has more than one change. This is going to bias the results towards contributors with many edits, as they are more likely to overwrite a change from another mapper. It will also result in an undercount of both the total number of changes and the number of mappers.
  • A second thing I noticed is that you’re only considering OSM features mapped as nodes, while most of the tags you are looking at are typically also found on ways (e.g. amenity=school, amenity=hospital, etc.) and some are even almost exclusive to OSM ways (e.g. building:colour, roof:shape, etc.). My educated guess is that this omission will not incur a large bias towards any mapper type, but it would make comparing countries with each other more difficult (e.g. when different countries have different mapping conventions for either tagging addr:housenumber on the building outline way or as a separate node).
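The first point can be illustrated with a toy example. The edit history below is entirely made up, and the record layout (element id, version, user) is an assumption for the sketch; it only shows why a latest-version extract undercounts both changes and mappers when a heavy editor touches features last.

```python
# Made-up edit history: (element_id, version, user).
history = [
    ("n1", 1, "casual_a"),
    ("n1", 2, "heavy"),
    ("n2", 1, "casual_b"),
    ("n2", 2, "heavy"),
    ("n3", 1, "heavy"),
]


def latest_versions(records):
    """Keep only the highest version of each element, as a non-history extract does."""
    latest = {}
    for element_id, version, user in records:
        if element_id not in latest or version > latest[element_id][0]:
            latest[element_id] = (version, user)
    return [(element_id, version, user) for element_id, (version, user) in latest.items()]


def summarize(records):
    """Return (number of changes, number of distinct mappers)."""
    return len(records), len({user for _, _, user in records})


full_changes, full_mappers = summarize(history)
snapshot_changes, snapshot_mappers = summarize(latest_versions(history))
```

With full history there are 5 changes by 3 mappers; the snapshot keeps only 3 changes, all attributed to the single mapper who edited each feature last.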

I’ve re-run the calculations on proper OSM history data (using the OSHDB) and came to the results posted below. Interestingly, while the absolute values are off by roughly a factor of two to three, the percentages in the results are not affected by that much.

Also, I took the liberty to add a few additional rows to the table for extra context:

| Country | Total changes to elements (2022) | # contributors who made these changes | % (#) of contributors responsible for 50% of the changes | % (#) of contributors responsible for 75% of the changes | % (#) of contributors responsible for 95% of the changes |
|---|---|---|---|---|---|
| Nepal | 92,020 | 1,127 | 0.5% (6) | 2.3% (26) | 13.1% (148) |
| Senegal | 5,027 | 273 | 1.8% (5) | 7.3% (20) | 37% (101) |
| Kenya | 19,851 | 548 | 1.3% (7) | 4.2% (23) | 25.7% (141) |
| Mexico | 79,959 | 1,707 | 0.9% (15) | 3.7% (63) | 22.8% (390) |
| Estonia | 41,029 | 379 | 1.6% (6) | 3.4% (13) | 15.3% (58) |
| Lithuania | 177,378 | 479 | 0.4% (2) | 1.0% (5) | 4.8% (23) |
| Germany | 5,556,545 | 26,610 | 0.4% (107) | 1.9% (499) | 12.8% (3,414) |
| Austria | 397,851 | 4,171 | 0.6% (27) | 2.8% (116) | 17.8% (743) |
| Italy | 608,203 | 7,322 | 0.6% (43) | 2.4% (177) | 17.3% (1,264) |
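The "factor of two to three" can be checked directly against the four countries that appear in both tables. The short calculation below uses only the totals from the post and from the re-run; the variable names are just for this sketch.

```python
# (total changes, contributors) per country, taken from the two tables.
original = {
    "Nepal": (50239, 713),
    "Senegal": (2338, 172),
    "Kenya": (7415, 313),
    "Mexico": (38556, 1078),
}
rerun = {
    "Nepal": (92020, 1127),
    "Senegal": (5027, 273),
    "Kenya": (19851, 548),
    "Mexico": (79959, 1707),
}

# Ratio of history-based totals to snapshot-based totals, rounded to two decimals.
change_ratio = {c: round(rerun[c][0] / original[c][0], 2) for c in original}
contributor_ratio = {c: round(rerun[c][1] / original[c][1], 2) for c in original}
```

The change totals grow by a factor of about 1.8 (Nepal) to 2.7 (Kenya), while contributor counts grow by roughly 1.6 to 1.75.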

Comment from pedrito1414 on 7 December 2023 at 19:44

Martin, this is great feedback, thank you! I’ll share it with the community managers in HOT and also add an addendum on this to the post on the HOT blog…

Comment from nukeador on 7 December 2023 at 19:52

Thanks, can you share the changes to the notebook code to get to those results?

My understanding was that ways shouldn’t be counted, to avoid including building and road edits.

Is there a link to the OSHDB dataset per country as in Geofabrik?

Thanks!

Comment from tyr_asd on 8 December 2023 at 19:11

My code is now online. But I’m afraid that there is currently no downloadable OSHDB dataset available from https://downloads.ohsome.org/OSHDB/ to easily replicate this. :(

The ways would only be counted here if they match the given list of tag keys: So, road edits would not be counted, and building edits would only be included if they also have a tag like building:material.

PS: I’ve now also played around with the filters and have an extended version where changes are only considered when the respective change edits (or creates) at least one of the mentioned tags. The idea is to exclude large-scale data clean-up operations where only tags are touched which are not in the list of “to be considered local knowledge” tags (like this example changeset). You can find the results attached to the code snippet. Interestingly, the relative numbers in these results are again in the same ballpark as the numbers of the “simpler” analysis (maybe very slightly tending a bit more towards the “long tail”). I guess the most likely explanation is that the different mapper “types” (i.e. local knowledge mapper vs. data quality issue fixer) all follow a similar distribution, with few heavy mappers and many mappers in a long tail. In retrospect, that should not be considered all too surprising, given the nature of OSM. 😅
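The stricter filter described in the PS can be sketched as a comparison between the tags before and after a change. This is a minimal illustration and not the OSHDB code referred to above; the tag keys are a hypothetical subset, and treating tag deletions as not counting is an assumption made here.

```python
# Hypothetical subset of the "local knowledge" tag keys.
LOCAL_KNOWLEDGE_KEYS = {"amenity", "opening_hours", "addr:housenumber", "building:material"}


def touches_local_knowledge(tags_before, tags_after, keys=LOCAL_KNOWLEDGE_KEYS):
    """True if the change creates or modifies at least one listed tag.

    tags_before is None (or empty) when the element is newly created.
    Assumption: removing a listed tag does not count as a local knowledge change.
    """
    tags_before = tags_before or {}
    return any(
        key in tags_after and tags_after[key] != tags_before.get(key)
        for key in keys
    )


# A new school node, an updated opening time, and a name-only clean-up edit.
created = touches_local_knowledge(None, {"amenity": "school"})
edited = touches_local_knowledge({"opening_hours": "Mo-Fr"}, {"opening_hours": "Mo-Sa"})
cleanup = touches_local_knowledge(
    {"amenity": "school", "name": "st mary"},
    {"amenity": "school", "name": "St Mary"},
)
```

Here the first two changes count, while the clean-up edit does not: the element carries a listed tag, but the change itself only touches the name.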
