This past week, the 2019 HOT Summit was followed by State of the Map in Heidelberg, Germany. First, a big thank you and congratulations on a job well done to the organizing committee and everyone in Heidelberg who made these events possible!
I had the opportunity to both lead a workshop at the HOT Summit on Thursday and participate in the academic track at State of the Map on Sunday. I’m writing this post to share a few resources and results from these talks, compiled all in one place.
1. HOT Workshop: Hands On Experience Extracting Meaningful OSM Data by Using Amazon Athena with AWS Public Datasets
This workshop was designed to show the analytical power of Amazon Athena with a large dataset like OSM. The workshop description was as follows:
Learn how to use Amazon Athena with AWS Public Datasets to query large amounts of OSM data and extract meaningful results. We will explore the maintenance behavior of contributors after HOT mapping activations and learn how the map gets maintained, what happens after validation, if the data grows stale, and if a local community emerges. This 200-level workshop is hands-on and requires familiarity with SQL. Familiarity with data science tools such as Python and Jupyter Notebooks is helpful, but not required. Sample code will be made available in a state that participants can modify to ask their own questions of the data.
Grace Kitzmiller (AWS) & Jennings Anderson (University of Colorado Boulder)
The workshop included 10 prepared Jupyter Notebooks that contained all of the code to parse the results of an Athena query and generate a number of graphs and maps, such as the following graph which shows the cumulative number of users who have edited in Tacloban, Philippines.
This shows that the number of editors has grown steadily since 2012 (a fairly consistent slope); however, the overall trend was heavily affected by a ‘step’ of nearly 400 people resulting from the disaster mapping for Typhoon Haiyan.
As another example, here is a visualization built with KeplerGL showing the impact on the map in Puerto Rico of disaster mapping for Hurricane Maria (a sample of 10,000 edits):
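The counting behind a cumulative-editors curve like this one can be sketched in a few lines of Python. This is a minimal illustration, not the workshop's notebook code: the `cumulative_editors` function, the `(user, date)` input shape, and the sample rows are all assumptions made up for the example. It counts each editor once, on the day of their first edit, and then takes a running total.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def cumulative_editors(edits):
    """Return [(day, total editors seen so far)] from (user, day) pairs.

    Hypothetical helper: each user is counted once, on their first edit day.
    """
    first_seen = {}
    for user, day in edits:
        # Keep the earliest day on which each user appears.
        if user not in first_seen or day < first_seen[user]:
            first_seen[user] = day

    # Number of brand-new editors per day, then a running total over time.
    new_per_day = Counter(first_seen.values())
    days = sorted(new_per_day)
    totals = accumulate(new_per_day[d] for d in days)
    return list(zip(days, totals))


# Made-up sample rows standing in for query results.
sample_edits = [
    ("alice", date(2012, 5, 1)),
    ("bob", date(2013, 11, 9)),
    ("alice", date(2013, 11, 9)),
    ("carol", date(2013, 11, 9)),
    ("dave", date(2014, 2, 3)),
]
curve = cumulative_editors(sample_edits)
```

A day on which many people edit for the first time, as in a disaster mapping activation, shows up as a jump between consecutive totals, which is the ‘step’ in the graph.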
These are just two examples of the many figures and maps featured in the workshop that can be generated for most of the regions where humanitarian mapping has occurred.
You can find detailed instructions on how to recreate this workshop and run the material locally here.
2. SOTM Presentation: Corporate Editors in the Evolving Landscape of OpenStreetMap: A Close Investigation of the Impact to the Map and the Community
This marked the second year of the Academic Track at State of the Map. Thanks to the hard work of the OSM Science community, the proceedings of this track have been published here. Included is an abstract discussing my latest research on organized editing—specifically corporate editing—in the map. You can watch the full presentation here.
Last spring, we (coauthors Dipto Sarkar and Leysia Palen) wrote an article that investigated the quantities and characteristics of corporate editing teams in OpenStreetMap. The visualization above shows the aggregate summary of this activity.
My current research investigates more deeply the impact of corporate editors (or other organized editing groups) and their interactions with other mappers. This requires examining the complete history of the map and breaking it down to individual edits, as visualized below:
Edits from non-paid editors (pink) and paid editors, primarily Kaart (green & yellow).
Or this visualization of Facebook’s activity in Thailand:
If we zoom in on a particular area, we can see that Facebook’s edits are filling in the map between two previously mapped areas (in pink).
This graph shows consistent editing activity from Facebook in 2018, followed by a few major events from non-paid editors in Eastern Thailand. This may lend credence to the notion of corporate map-seeding, where data teams start the map in an area and then non-corporate editors fill it in.
Here’s another (quite different) example showing how Amazon Logistics is editing the map in Dallas, Texas. Presumably they are adding valuable navigation-oriented ground-truthed data from their delivery network into the map:
There are a few more examples in the presentation that I talk through, identifying potential interaction patterns between organized editing groups and other mappers. Please leave a comment on this post if you have any questions.
Extra: Preparing for OSM Geo Week.
OSM Geography Awareness Week will be here before we know it! I did not present this at the conference, but I find it interesting nonetheless. This is a visualization showing the impact of this event, derived from OSM changesets:
This particular visualization technique is a recreation of results from this paper by Daniel Bégin et al.
How to read this:
- The yellow along the steep diagonal represents all one-time contributors.
- Faint vertical lines represent geoweeks that resulted in mappers sticking around.
- Horizontal lines represent mappers who had previously edited OSM and made their last edit during a geoweek.
- The purple at the top represents mappers with a significant amount of editing experience who have edited during an osmgeoweek and continue to edit frequently.
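The per-mapper quantities behind a plot like this can be sketched as follows. This is a rough illustration rather than the code used for the figure. It assumes each mapper is placed by the dates of their first and last changesets, which is one reading of the guide above, and it treats a mapper whose first and last dates coincide as a one-time contributor. The function names and sample changesets are hypothetical.

```python
from datetime import date


def first_last_edits(changesets):
    """Map each user to (first edit day, last edit day) from (user, day) pairs."""
    spans = {}
    for user, day in changesets:
        first, last = spans.get(user, (day, day))
        spans[user] = (min(first, day), max(last, day))
    return spans


def one_time_contributors(spans):
    """Users whose first and last edit fall on the same day (assumed definition)."""
    return sorted(user for user, (first, last) in spans.items() if first == last)


# Made-up changesets: two mappers start during a geoweek, one ends there.
sample_changesets = [
    ("ana", date(2017, 11, 14)),
    ("ben", date(2017, 11, 14)),
    ("ben", date(2019, 6, 2)),
    ("cy", date(2015, 3, 1)),
    ("cy", date(2017, 11, 15)),
]
spans = first_last_edits(sample_changesets)
one_timers = one_time_contributors(spans)
```

Under these assumptions, "ana" would sit on the diagonal, "ben" would contribute to a vertical line at the geoweek where he started, and "cy" to a horizontal line at the geoweek where her editing ended.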
Thanks for reading! Please leave a comment with any questions you may have.