
ngumenawesamson's Diary


To support organizations that use OpenStreetMap data for disaster response, the HOT Data Team is strengthening our data quality and fitness measures.

Several teams at HOT, including the Data Team, Technology & Innovation Team, and the Regional Hubs, are collaborating to develop resources, tools, skill sharing, and community feedback mechanisms that will be avenues for data creators and data users to collaborate to improve OpenStreetMap data quality.

Data Team:

The HOT Data Team presented the top 10 data quality issues in a lightning talk at State of the Map 2022 in Florence. We categorize these data quality issues into three main categories:

Semantic Accuracy

  • Tagging
  • Tasking Manager project consistencies

Positional Accuracy

  • Spatial offsets
  • Feature tracing inconsistencies
  • Logical consistencies of map features

Completeness

  • Temporal inconsistencies
  • Road network inconsistencies
  • Completeness of health facilities
  • Completeness of public service data for sustainable communities
  • Administrative boundaries

The Data Team is also defining use cases and data quality metrics. Measuring data quality starts with identifying core datasets for each of our impact areas. Examples include highways and health facilities for Public Health; water & sanitation, transportation, and education for Sustainable Cities & Communities; and waterways, buildings, and highways for Disasters & Climate Resilience.

We then evaluate the use cases and the metrics for assessing the quality of each dataset, which enables us to identify ways of improving data quality.

Technology & Innovation Team:

The Technology & Innovation Team is implementing automated tools for measuring OpenStreetMap data quality.

One tool in development combines data quality metrics, defined data quality issues, and data models to provide baseline tagging information. This includes approved tags for the impact area dataset and primary and secondary attributes that create a basis for detecting and querying live data quality issues. We have seen the tool play a valuable role in tracking quality issues by generating real-time data quality reports in the ongoing Türkiye-Syria mapping activation.

The tool uses the defined data quality metrics of bad tag values, missing tags, and incomplete tags to validate and improve semantic data quality in OpenStreetMap.

Semantic validation
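To make the three semantic metrics concrete, here is a minimal sketch of how a feature's tags could be checked against a baseline. It is not the HOT tool's implementation: the `BASELINE` structure, the function name `semantic_issues`, and the choice of required keys and approved values are all assumptions made for illustration.

```python
# Hypothetical baseline for one dataset: required keys and approved values.
# The real tool's data model is not described here, so this is illustrative.
BASELINE = {
    "required": ["amenity", "name"],
    "approved_values": {"amenity": {"hospital", "clinic", "pharmacy"}},
}


def semantic_issues(tags, baseline=BASELINE):
    """Return a sorted list of (issue, key) pairs for one feature's tags."""
    issues = []
    for key in baseline["required"]:
        if key not in tags:
            # The key is absent altogether.
            issues.append(("missing_tag", key))
        elif not str(tags[key]).strip():
            # The key is present but carries no value.
            issues.append(("incomplete_tag", key))
    for key, approved in baseline["approved_values"].items():
        value = str(tags.get(key, "")).strip()
        if value and value not in approved:
            # The value is present but not on the approved list.
            issues.append(("bad_tag_value", key))
    return sorted(issues)
```

For example, `semantic_issues({"amenity": "Hospital"})` reports a bad tag value for `amenity` (tag values are case sensitive) and a missing `name` tag.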

For positional data quality aspects, the tool uses the defined metrics of bad geometry, overlaps, duplicates, and orphan nodes to validate the geometry of mapped features.

Geometry Validation
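The geometry metrics can be sketched in the same spirit. The example below assumes a simplified in-memory model (node ids mapped to coordinates, way ids mapped to lists of node ids) and hypothetical function names; it covers orphan nodes, duplicate nodes, and one simple kind of bad geometry, and is not how the HOT tool itself is built.

```python
from collections import defaultdict


def orphan_nodes(nodes, ways, node_tags):
    """Untagged nodes that are not referenced by any way."""
    used = {ref for refs in ways.values() for ref in refs}
    return sorted(n for n in nodes if n not in used and not node_tags.get(n))


def duplicate_nodes(nodes, precision=7):
    """Groups of node ids that share the same rounded coordinates."""
    by_coord = defaultdict(list)
    for node_id, (lat, lon) in nodes.items():
        by_coord[(round(lat, precision), round(lon, precision))].append(node_id)
    return sorted(sorted(ids) for ids in by_coord.values() if len(ids) > 1)


def degenerate_ways(ways):
    """Ways with fewer than two distinct nodes (one form of bad geometry)."""
    return sorted(w for w, refs in ways.items() if len(set(refs)) < 2)
```

Tagged standalone nodes (a cafe mapped as a point, for instance) are deliberately not treated as orphans, since they carry information of their own.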

What the tool will look like:

Before mappers upload data, data quality measurement will be available for them to see in the Tasking Manager, helping them review their mapping and find issues. The tool will offer the mapper tips on how data quality issues can be detrimental to the impact of their mapping.

As mappers edit the map and upload their changes to OSM, the tool will provide live data quality assessments to the project managers and mappers within the Tasking Manager.

Live data quality assessment

Regional Hubs:

The regional Open Mapping Hubs are working on their individual approaches to data quality. They are publishing summaries of their plans, sharing their definitions of data quality, common data quality issues and requirements for their region, the tools they use, and open mapping best practices for regional mappers.

The Open Mapping Hub - Asia-Pacific has already published their approach to improving data quality on the OpenStreetMap Wiki. As one of the ways to implement their approach, they will be activating a road validation campaign for some of their priority countries.

Coming up next:

The Data, Tech, and Community teams are working with the regional hubs to compile a Global Data Quality Strategy that will guide teams across HOT in implementing standards for improving the quality of data produced by HOT and community partners.

Stay tuned for the call for public review of the HOT Global Data Quality Strategy soon!

Our Top 10 Data Quality Aspects

Posted by ngumenawesamson on 25 July 2022 in English. Last updated on 2 August 2022.

Background.

Even as OSM rapidly grows in content and contributors, its credibility remains one of the main concerns for authoritative users. The fact that it is made by volunteers can limit trust in the value of this free data source within traditional GIS communities. At HOT, we have prioritized the top 10 data quality aspects that we want to minimize. These aspects fall under three categories: Positional Accuracy, Semantic Accuracy, and Completeness.

We arrived at the top 10 list through a number of consultations with the Data Quality Working Group, representatives from open mapping communities, and associates from the HOT Regional Hubs.

This information is shared for reference by all OpenStreetMap data contributors and users across the open mapping ecosystem, data quality associates at the HOT Regional Hubs, HOT partners, and other communities that engage with OpenStreetMap.

HOT is prioritizing these aspects and working to minimize or eliminate them through Hub-centered community engagement in the form of trainings, collaboration with partners, and the development of tools that can be used to improve the quality of mapping.

There are many other issues affecting the quality of OSM data; however, our top 10 data quality aspects are:

1. Spatial offsets.

2. Temporal consistencies.

3. Feature tracing inconsistencies.

4. Road network consistency.

5. Completeness of health facilities.

6. Completeness of public service data for sustainable communities.

7. Administrative boundary inconsistencies.

8. Tagging.

9. Logical consistencies of map features.

10. Tasking Manager project consistencies.

===============================================================

1. Spatial Offsets:

An offset is the degree of deviation of an object from its intended position.

Category: Positional Accuracy

Possible Sources:
  • OSM contributors that don’t recognize or know how to mitigate imagery offsets.
  • New mappers that are not aware of the offsets.
  • Different tile offsets in a given project on the TM.
  • Use of different satellite images while mapping in the same vicinity.
  • Low accuracy from mobile data collection tools as a result of obstacles while collecting data.
Examples on how we can address this:
  • Strengthening and monitoring validation teams
  • Extended training sessions on offsets for newcomers and validators.
  • Stressing offsets in the instructions section of the TM
  • Check the history of the existing feature and compare it with the different satellite images available.
  • Advise mappers (especially JOSM editor users) to remove offsets when they exit a project that requires an offset.
Rationale:
  • Spatial offset is one of the leading positional accuracy issues. It originates from the misalignment of the satellite images used during desktop digitization, which results in features being placed at positions that deviate from their true location.
  • Spatial offsets result in overlapping features, which affects the positional accuracy of OSM data. For example, buildings overlapping roads, public facilities in the middle of roads, buildings in water bodies.
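As an illustration of measuring "the degree of deviation", the sketch below computes the ground distance between a traced position and a reference position (for example a GPS trace) using the haversine formula. The function names and the 5 m default tolerance are assumptions for the example, not a HOT standard.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres


def offset_metres(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 positions (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def exceeds_tolerance(traced, reference, tolerance_m=5.0):
    """True when a traced (lat, lon) lies further than tolerance_m from the reference."""
    return offset_metres(*traced, *reference) > tolerance_m
```

A validator could run such a check on a handful of well-surveyed reference points in a project area to decide whether the imagery needs an offset correction before mapping starts.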

2. Temporal consistencies:

When the capture date of imagery cannot be determined, or the date range is too broad for the capture metadata to be useful, mapping can become inconsistent.

Category: Completeness

Possible Sources:
  • Using outdated imagery, often because it is hard to identify which imagery source is the most recent.
  • Data derived from AI/ML via feature extraction can be based on a mosaic of scenes with different capture dates. It can be very hard to verify the recency of data in these datasets (whether they’re buildings, roads, or something else)
  • Data collected by mappers using survey tools can be seen as “wrong” because it can’t yet be seen on imagery.
Examples on how we can address this:
  • Developing training, guidance or technical solutions to imagery capture date determination (for project creation)
  • Developing/delivering training for mappers to determine imagery recency
  • Work with imagery providers on technical solutions
Rationale:
  • When outdated imagery is used, especially in areas where more recent mapping has occurred, mappers may delete valid data because it doesn’t appear in the imagery.
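One simple safeguard against deleting valid data is to compare each feature's last edit date with the imagery capture date before removing anything. The sketch below assumes features are plain dictionaries with `id` and `last_edit` fields and that the capture date may be unknown (`None`); these names are hypothetical.

```python
from datetime import date


def review_before_delete(features, imagery_capture_date):
    """Ids of features that should not be deleted on the basis of imagery alone.

    A feature edited after the imagery was captured may simply be newer than
    the picture. When the capture date is unknown, every feature is flagged.
    """
    if imagery_capture_date is None:
        return [f["id"] for f in features]
    return [f["id"] for f in features if f["last_edit"] > imagery_capture_date]
```

With imagery captured on 1 June 2019, a building last edited in May 2021 would be flagged for review, while one last edited in January 2018 would not.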

3. Feature tracing inconsistencies:

(overlapping buildings with buildings, buildings with highways, point features on buildings).

Overlapping features are a common data quality issue in OSM. The most visible case is when buildings overlap highways and compromise the spatial properties of these features. In practice, buildings should not overlap highways unless the overlap genuinely exists on the ground. In other cases, point features fall in the middle of highways.

Category: Positional Accuracy

Possible Sources:
  • Offsets resulting from differences in satellite image alignments.
  • Compromised accuracy in data collected through mobile devices where the device experienced an obstacle during the data collection exercise. This causes point features to deviate from their expected positions.
  • Differences in coordinate systems with data conflated from official sources. Different agencies collect data in different coordinate systems, and in case of conflation the conversion from UTM to WGS84 can create a shift in the position of the features on the ground.
  • Overlapping data from duplicated efforts.
  • Uploading AI generated buildings without syncing them with the available OSM data.
  • Low zoom levels in the iD editor, where no mapped data is shown at that level.
  • Blurry imagery in combination with beginner mappers
Examples on how we can address this:
  • Correcting offsets before mapping new features.
  • Data cleaning practices for mobile collected data before it is uploaded to OpenStreetMap.
  • Following the correct data conflation procedure while uploading AI generated data.
  • While mapping with the iD editor, make sure to zoom in to the highest practical level to be able to see whether the features are already mapped.
  • Good initial training and good feedback from validators.
Rationale:
  • During validation, more time is wasted trying to correct the overlapping features.
  • Building geometry inconsistencies affect the positional accuracy of buildings which can have an impact on assessing damage on buildings in case of disaster. Overlapping data reduces the geographical extent of the data and may result in wrong coverage analysis.
  • Duplicated buildings and multiple buildings mapped as one can mislead the estimation of the number of households that may require relief in case of a disaster, and the responding agency may derive wrong statistics from the data.

4. Road network inconsistency:

(Segmented highways with inconsistent tags, hierarchy, outdated highway tags, and surfaces, hanging roads)

Category: Completeness

Possible Sources:
  • Road mapping projects by beginner mappers with limited skill in snapping and road tagging.
  • Tracing of roads at low zoom levels in editing tools, especially the iD editor.
  • The size of projects and tasks can introduce errors: smaller projects can cause inconsistencies, and task size can mean that nodes fall outside a task and are therefore not visible.
Examples on how we can address this:
  • Training new mappers on the geometric properties of roads like connectedness.
  • Clear instructions on the features to be mapped and emphasized by project managers.
  • Experienced validation teams for road mapping projects.
  • Training for Project Creators.
  • Having local photos of the areas to map can help improve mapping; a wiki tool in JOSM allows you to bring in street view.
Rationale:
  • Broken roads pose a challenge during network analysis and road routing would be affected.
  • With inconsistent roads, navigation using OSM navigation tools becomes difficult. Inconsistent highway tags can lead to poor planning of response routes, and the responding agency may end up using a longer route or a non-motorable route.
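Broken or hanging roads can be spotted by checking connectedness: two ways belong to the same network when they share a node. The sketch below groups ways into connected components and lists the ways outside the largest one. The input model (way ids mapped to lists of node ids) and the function names are assumptions for illustration.

```python
from collections import defaultdict


def road_components(ways):
    """Group way ids into connected components; ways connect when they share a node id."""
    ways_at_node = defaultdict(set)
    for way_id, refs in ways.items():
        for ref in refs:
            ways_at_node[ref].add(way_id)

    seen, components = set(), []
    for start in sorted(ways):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            way_id = stack.pop()
            component.append(way_id)
            # Visit every way that touches any node of the current way.
            for ref in ways[way_id]:
                for neighbour in ways_at_node[ref]:
                    if neighbour not in seen:
                        seen.add(neighbour)
                        stack.append(neighbour)
        components.append(sorted(component))
    return components


def disconnected_roads(ways):
    """Way ids that are not part of the largest connected component."""
    components = road_components(ways)
    if not components:
        return []
    largest = max(components, key=len)
    return sorted(w for c in components if c is not largest for w in c)
```

Ways reported by `disconnected_roads` are candidates for review rather than definite errors, since a road on an island or one cut at a project boundary is legitimately separate.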

5. Completeness of health facilities:

OpenStreetMap (OSM) data is collected by volunteer remote mappers around the world who at times have limited resources to collect data at a large scale or to deploy surveys that capture attribute information about certain features. In the past 15 years, since the beginning of OSM, the health facility data produced by volunteers has not been collected evenly. Today, most data is concentrated in urban areas, so large areas are still underrepresented and/or the available datasets are not completely attributed.

Category: Completeness

Possible Sources:
  • Not enough resources to cover big scope health facility data collection projects.
  • Mapping using mobile applications that allow limited attribution like Organic Maps, Maps.Me.
  • Non standardized data models to be used during data collection activities.
Examples on how we can address this:
  • Encourage public health related projects in all OpenStreetMap communities through the Hubs.
  • Standardize health data models to be used during data collection for health facilities.
  • Ensure open participatory approaches for health data collection and mapping like including doctors and nurses in the project designs, special mapathons.
  • Referring to healthsites.io for complete health facilities data model.
Rationale:
  • Mapped health facilities are concentrated in urban areas, leaving rural health facilities unmapped, yet the entire public needs access to health facilities, whether urban or rural.
  • A gap in the completeness of health facility data affects spatial analysis for where the services are located. Any person responding to a woman in labor may prefer to see the nearest health facility that offers maternity services.
  • Attribute completeness also matters: for example, one would like to know the opening hours of a given health facility.
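Attribute completeness can be expressed as the share of mapped facilities that carry a non-empty value for each attribute of interest. The sketch below assumes each facility is a dictionary of OSM tags; the function name and the selection of keys are illustrative and not taken from any specific data model.

```python
def attribute_completeness(features, keys):
    """Share of features (0.0 to 1.0) carrying a non-empty value for each key."""
    total = len(features)
    if total == 0:
        return {key: 0.0 for key in keys}
    return {
        key: sum(1 for tags in features if str(tags.get(key, "")).strip()) / total
        for key in keys
    }


facilities = [
    {"amenity": "hospital", "name": "A", "opening_hours": "24/7"},
    {"amenity": "clinic", "name": "B"},
    {"amenity": "clinic", "name": "", "opening_hours": ""},
    {"amenity": "pharmacy"},
]
report = attribute_completeness(facilities, ["name", "opening_hours"])
```

Here half of the facilities have a name and a quarter have opening hours, which points to where survey effort is most needed.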

6. Completeness of public service data for sustainable communities:

Geographic and attribute data for public services in education, water points, and sanitation waste collection. As with health facilities, there have not been wide mapping projects that focus on these point features the way there have been for buildings and roads. In the case of Uganda, detailed mapping of social facilities was done in the refugee camps in northern Uganda and also in the mapping for clean streets of Kampala, where solid waste mapping was done in 2017. The remaining parts of the country have gaps in the social facility data.

Category: Completeness

Possible Sources:
  • Not enough resources to cover large scope mapping projects.
  • Mapping using mobile applications that allow limited attribution like Organic Maps, Maps.Me.
Examples on how we can address this:
  • Focusing on detailed point data collection beginning with cities, municipalities then to other towns through activation of data collection campaigns.
  • Considering available resources for data collection models for example Uganda Refugee mapping data model
Rationale:
  • As with health facilities, other public social facilities are mapped mainly in urban areas or in areas where HOT projects have taken place. There is a big gap in the distribution of mapped public facilities on OpenStreetMap.

7. Administrative boundaries:

Topological inconsistencies, broken relations, and outdated information.

Category: Completeness

Possible Sources:
  • Administrative changes in the boundaries of districts and sub districts render the available data in OSM outdated.
  • New mappers who may try mapping of boundaries in OSM and end up deleting boundary relations.
Examples on how we can address this:
  • Coordinating with the agencies responsible for mapping boundaries on updating administrative boundaries in OSM.
Rationale:
  • Sometimes new countries are created, districts are split, and municipalities are elevated to cities. As these changes happen, they should also be reflected on OpenStreetMap.

8. Tagging:

Objects in OSM are created by digitization and the attribution of tags. OSM does not provide a rigorous classification system for geographical objects. It just gives some recommendations and a set of predefined tags that can be used to define the objects. Thus, the final description attributed to the objects is defined by the mappers based on their knowledge about the object. This can lead to incorrect tag definitions, since it is sometimes difficult for new mappers to differentiate between objects that fall in similar classes.

Category: Semantic Accuracy

Possible Sources:
  • Misspelled tag values or capitalization of tags by experienced mappers who know what to do but are careless in application of tags.
  • Uploading information to OSM without undergoing a data cleaning and conflation process.
Examples on how we can address this:
  • Training new mappers on tagging and giving them access to all available OSM tagging resources.
  • Discourage uploading non official data to OSM.
Rationale:
  • Incorrect tag information can affect decision making where an agency relies on numbers to inform decisions.
  • An indoor corridor wrongly tagged as a tunnel might be included in a shortest path by navigators and routing applications like pgRouting.
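Misspelled or wrongly capitalized tag values can often be caught automatically by normalizing the value and comparing it with a list of known values. The sketch below uses Python's standard `difflib` for the fuzzy comparison; the short `KNOWN_HIGHWAY_VALUES` list, the function name, and the 0.8 similarity cutoff are assumptions chosen for the example.

```python
import difflib

# A small illustrative subset of highway values, not a complete list.
KNOWN_HIGHWAY_VALUES = [
    "residential", "tertiary", "secondary", "primary",
    "track", "footway", "unclassified",
]


def suggest_value(value, known=KNOWN_HIGHWAY_VALUES, cutoff=0.8):
    """Return the known value that best matches a possibly misspelled one, or None."""
    normalised = value.strip().lower().replace(" ", "_")
    if normalised in known:
        return normalised
    matches = difflib.get_close_matches(normalised, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A suggestion of `None` means the value is too far from anything on the list and needs a human decision rather than an automatic fix.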

9. Logical consistencies of map features:

There are map features that are positionally related to others and must be located within, on, or near other features.

Category: Positional Accuracy

Possible Sources:
  • Wrong tagging of features by beginner mappers for example tagging a railway station as a bus stop.
  • Uploading data collected from the field without cleaning it and checking the GPS position of the data.
  • Shift caused by differences in the coordinate systems.
  • Mapping/tracing features at low zoom level.
Examples on how we can address this:
  • Using a standardized coordinate system for data collection that is similar to the OpenStreetMap coordinate system.
  • Developing training materials for new mappers to guide them on OSM tagging.
Rationale:
  • Logically there are points that must be inside buildings. They include cafes, schools, pharmacies, and supermarkets, which should be located within the building polygons.
  • Some points are semantically related to the road network but must lie outside the road itself, such as bus stops, parking, street lamps, and traffic lights. These are usually located very close to roads but not on them, and they should not be located within buildings.
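The "point must be inside a building" rule can be checked with a standard ray-casting point-in-polygon test. The sketch below treats coordinates as planar (x, y) pairs, which is a reasonable approximation at building scale, and uses hypothetical function names and sample data.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True when the point lies inside the polygon ring."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def points_outside_buildings(points, buildings):
    """Ids of points that fall inside none of the building polygons."""
    return sorted(
        pid for pid, xy in points.items()
        if not any(point_in_polygon(xy, ring) for ring in buildings)
    )
```

The same test, inverted, can flag bus stops or street lamps that wrongly fall inside a building polygon.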

10. Tasking Manager project consistencies:

Issues with data quality can be directly derived from inconsistencies relating to the quality of projects created on Tasking Manager.

Category: Semantic Accuracy

Possible Sources:
  • Tasking Manager projects themselves can be a source of errors. These include errors from overlapping projects, unclear instructions on what should and shouldn’t be mapped, and an inappropriate level of difficulty.
  • Projects with dense existing mapped data can be compromised if the level of difficulty is set to beginner. In a building mapping project, roads should not be mapped because it may result in hanging roads mapped in one task and not in the adjacent tasks.
  • Unresponsive project managers leaving critical questions unanswered
Examples on how we can address this:
  • Ensuring all project creators have an appropriate understanding of good project creation and management.
  • Requiring, and delivering, a minimum level of participation and refresher training for all project creators.
  • A TM task instruction template with project information in fixed places, e.g. first the required imagery, second the required mapping, third whether an offset applies, etc.
Rationale:

Projects can be troublesome for mappers (and validators) for many reasons, such as:
  • Permissions and/or difficulty not appropriate for the actual difficulty of mapping.
  • Instructions not complete, confusing, misleading or just ‘too much text’ for mappers to follow.
  • Wrong or not the most appropriate imagery set as default.
  • Asking for too many, or dissimilar features; and/or task size not appropriate for features requested.

My 214 Mapping Days of 2021

At the beginning of 2021, one of my targets was to upload at least one changeset every day until the end of the year. I have not managed to hit my target. Nevertheless, this year still emerges as the year with my highest number of contribution days.

OSM Contributions

Mapping Days

With my GIS background, contributing to OpenStreetMap has become one of my hobbies because the efforts invested in adding features to the map are reflected in the number of humanitarian challenges that are being solved through the availability of Open Spatial Data. Amidst all the challenges like low internet connectivity, expensive data bundles, limited access to digital devices, and others, the inspiration is derived from the support and opportunities that OpenStreetMap Uganda and the Humanitarian OpenStreetMap Team (HOT) have provided to the OSM Communities in Uganda. We appreciate.

My SOTMA Experience:

Earlier this year, I submitted my proposal to speak at the State of the Map Africa Virtual conference, and I was very excited when my proposal was accepted and I was given an opportunity to speak at the conference. To be honest, the year has been full of excitement, learning, and contributions, but also of sharing information on what the OpenStreetMap Uganda Community has been doing over the past years. The OpenStreetMap Uganda Community gave me an opportunity to spearhead a number of community-driven projects and, at the same time, to share with the rest of the world what we have achieved through those projects by presenting at the State of the Map Africa Conference. Besides the opportunity to present the Remote Mapping & Photo Mapping New Cities project, I also participated in a number of lightning talks at the HOT Summit and organized a watch party in Kampala where the rest of the community members were able to stream the conference on the big screen.

Presentation

Watch Party

I have been able to learn a lot from the presentations from the different speakers and copied a number of ideas that will be scaled to fit my community. A lot of thanks to the organizing committee of the State of the Map Africa conference, and all the partners that contributed to making SOTMA 2021 a successful conference.

Location: Kamwokya, Central, Kampala, Central Region, Uganda

FOSS4G2018 Experience.

Posted by ngumenawesamson on 19 September 2018 in English.

The unforgettable experience at 2018 FOSS4G in Dar es Salaam, Tanzania.

## Introduction.

Excitement started from the day I received an email that I had won the travel grant to the 2018 FOSS4G in Dar es Salaam. It all began when I submitted my paper to be considered for presentation at the 2018 FOSS4G. Clean Streets Kampala is a MapUganda project running in Kampala City with the major aim of creating open data to make the city clean. My paper was subjected to voting and it emerged as one of the accepted presentations at the conference, with #374 as its identity. My presentation can be found here

## The journey to Dar es Salaam.

It was the best experience to leave my country for the first time to attend my first international conference. A team of youth mappers and I left Kampala on Saturday August 25th by bus for Dar, and the journey was fun. We arrived in Dar es Salaam on the morning of August 27th. We checked in to our hotel and then went to the conference center, where we were welcomed to the conference and given tags, t-shirts and other stuff.

welcome to FOSS4G

## The conference.

The 2018 FOSS4G was the first FOSS4G conference I attended, and the experience is unforgettable because of the many friends I made and the workshops and presentations I attended.

Friends

The conference organization was on point, and the most amazing part of the conference was the use of the Attendify app, which made the whole conference paperless. There were also a lot of new things to learn during the workshops and presentations, like how different geospatial practitioners are using open source tools in their daily work, gender inclusiveness in OSM communities, new software, and many more. I also got a chance to attend the HOT Summit, where the teams presented their activities in Africa and beyond. My favorite presentations and workshops included:

* An interactive understanding of free and open source geospatial ecosystems.
* Modeling natural hazard in gvSIG
* Missing maps year 4!
* Esri Youth Mappers workshop at Ardhi University.

Youth mappers workshop at Ardhi University

## The social events

2018 FOSS4G social events were amazing.

1. The first social event, at the Badminton Institute, let me meet new friends like Esau from Tanzania and David Williams from California.
2. The Gala at the Golden Tulip hotel was full of fun, with a lot of dance moves from almost every attendee of the conference and also performances from the local dance groups.

Local dancers

3. The Travel grant program event at High Spirit was also interesting. It gave us a chance to play cards that were given out by Geocat.

Playing cards

## About Dar es Salaam

It feels good to be in Dar es Salaam. The cold breath of the coast of the Indian Ocean.

Dar es Salaam

The people were so hospitable and welcoming. There was always good service in the hotel where we were staying. I liked the sea food and coconut in Dar es Salaam. I got a chance to use the Mwendokasi public transport system, which I found so interesting, very fast and so cheap. Uber and Taxify were also available as transport options in Dar.

Thanks.

I would like to thank the organizers of the 2018 FOSS4G conference for their wonderful work and for the travel grant program. Special thanks also go to the HOT team and all other sponsors for making the conference happen. Asante sana

The 2018 FOSS4G left no one behind.

Location: Mnazi Mmoja, Mchafukoge, Ilala Municipal, Dar es-Salaam, Coastal Zone, 11106, Tanzania