Google Summer of Code 2023 Final Report: Landmark-based Navigation in Valhalla

Posted by Junzhen on 24 September 2023 in English.

Abstract

Most GPS navigation systems today rely heavily on street names and place names for route guidance. While this works well in some situations, it can be confusing and less user-friendly, especially when navigating unfamiliar areas or for people with language barriers.

This project aims to change the game by introducing landmark-based navigation into Valhalla, the OpenStreetMap (OSM) routing engine. Instead of hearing complex instructions like “Turn left onto Obmannamtsgasse,” users will get simple and intuitive directions like “Turn right at the Tesco supermarket.” This landmark-based approach makes navigation more accessible and user-friendly, reducing stress and improving the overall experience.

The high level implementation overview of this project is available here

Landmark Based Navigation Project - Github

Now let’s dive into the details!

Implementation Details

In simple terms, this project enhances Valhalla’s routing capabilities by bringing in landmarks as navigation support. We pull Points of Interest (POIs) from PBF files and store them as landmarks in a special intermediate storage. Then, we associate them to the map’s edges and store them into graph tiles. When clients call Valhalla services, these landmarks become part of the route directions and make navigation easier.

About Landmark Itself: Data Source and Selection Criteria

POIs are naturally supported in OSM PBF Format (“Protocolbuffer Binary Format”), a sort of digital map format. OSM uses “tags” in the form of “key=value” to describe places and things on the map. One special key, “amenity”, helps us find places that are useful or important for folks like visitors and residents. We use this key to select landmarks.

To pick the best landmarks for navigation, we look for ones that are clearly visible, very specific, easy to distinguish, and easy to understand. You can dig into the nitty-gritty analysis here. Based on these criteria, we’ve selected 18 types of POIs as landmarks. These include places like restaurants, gas stations, banks, clinics and more – all things that come in handy when you’re on the road.

Milestone 1: Setting up an Intermediate Storage for Landmarks

Now that we know where and how to select landmarks, we need to build a storage space to store them. Following Valhalla’s conventions, we opted to build a separate SQLite database to serve as the hub for storing landmarks data. This database was designed with a straightforward schema which consists of an id, landmark name, landmark type, and geographical coordinates represented by longitude and latitude. Spatial index is added to support and optimize spatial queries.

Database interactions implemented include classic putting and fetching. We also made sure to support spatial queries - getting all records in a given bounding box, i.e. getting all landmarks that are located within a defined geographic rectangle.

Landmark Intermediate Storage PR1 - database

Landmark Intermediate Storage PR2 - interaction

What’s cool is that we used the C++ pimpl idiom to implement our landmark database API. It is a powerful one especially for API design, allowing the designer to offer up the API interface without exposing the implementation details, effectively giving the API a private implementation.

Landmark Intermediate Storage PR3 - pimpl

Milestone 2: Parse Landmarks From Data Source to Intermediate Storage

With a designated storage now in place, our attention turned to the task of parsing landmarks from their data source, namely the PBF files. We confronted a design decision concerning the parsing process: whether to embed it within the tile-building pipeline or to develop a separate process capable of enriching an already established graph tile when necessary. We went with the second option because it gives us more flexibility for both testing and execution, especially since this is a prototype for a major feature.

“valhalla_build_landmarks”, the separate executable we build, is capable of extracting and parsing POIs from OSM PBF files and storing them as landmarks in the landmark SQLite database.

Landmark Parser PR

Milestone 3: Enhance Tile Storage to support landmarks

Once we’ve gathered all the data on landmarks, the next step is to incorporate this information into the graph tiles that actually fuel our routing system. To make room for landmarks, we made updates to the data structures within these graph tiles. Specifically, we tweaked the edge info data structure to accommodate associated landmarks which are encoded and stored as tagged values.

It’s worth mentioning that it took us some time to find and resolve the problem caused by null bytes. It’s also one of the triggers why we refactored the edgeinfo.cc file.

Enhance Tile Storage to Support Landmarks PR

Milestone 4: Associate Landmarks with Edges and Add to Tiles

The challenge at hand is how do we decide which landmarks should be paired with which edges. It is not so straightforward given the fact that all we have is the graph tiles and a database that contains all landmarks, relevant and irrelevant ones. Here’s how we cracked it: First, we pinpoint the area of interest on the graph tile and fetch all the landmarks within that zone. Then, for each landmark, we call Loki::Search to locate nearby edges. We make these search results into pairs, with each pair consisting of an edge ID and a landmark ID. Once we’ve got this sorted, merge and sort these pairs by edge ID. This step reveals which landmarks should be associated with each edge. Finally, we update the graph tiles to store these associations between edges and landmarks in the edge info.

As both searching for correlations between landmarks and edges and updating graph tiles involve substantial amounts of data processing, we’ve designed some mechanism to distribute workload evenly across threads in both cases. One more thing: updating graph tiles is much trickier than generating new ones. It requires an in-depth understanding of the data structures nestled within the tiles, and involves a great deal of manipulation on edge info offset. Currently, we are updating these offsets in an effective yet somewhat slow manner, which unfortunately introduces a performance bottleneck. Hopefully we can find a way to optimize it in the future.

Update Tile Storage to add Landmark Associations PR

Milestone 5: Update maneuver generation to add landmarks

In the navigation world, a “maneuver” refers to a specific action or instruction that a driver or traveler needs to perform at a particular point along their route. These maneuvers can range from turns and merges to exits and more. Think of them as your GPS’s way of saying, “Hey, make a right here!” In this milestone, we updated the maneuver generation process where a routing result is built into user-facing maneuvers, allowing landmarks to come along together.

Although route results coming from the routing service already include landmark data, we feel it’s not enough. So, we went the extra mile by incorporating additional information. This includes the distance from each landmark to the maneuver point and whether a landmark was on the same side of the road as the maneuver direction. The former is pretty intuitive - using the closest landmark to the maneuver point leads to better direction support. The latter means if the maneuver is to turn left then we should use some landmark on the left side as well, and vice versa. It is particularly handy when a traveler encounters wide urban streets or in case something blocking sights.

We managed to get these two additional pieces of information, and updated our Protobuf to handle the new data structure in maneuvers. Now, every maneuver comes with landmarks which have information about name, type, longitude, latitude, distance, and even which side of the road the landmark buddy is on.

Update Maneuver generation PR

Time to step into the last milestone!

Milestone 6: Update narrative generation to add landmark support

In Valhalla, the narrative generation process converts maneuvers into human readable text. Our final goal in this milestone is to introduce landmark-based phrases into specific instructions. To achieve that we fine-tuned the narrative generation process to support phrases that incorporate landmarks. We then added these phrases to support landmark replacements, specifically in the English locale, for certain instructions including “turn”, “sharp”, “bear” and “uturn”. We even took linguistic nuances into account when crafting these new phrases.

As a result of the last milestone, landmark-based navigation reached its full functionality. Travelers can now receive routing instructions like “Turn left at the supermarket Kaufland” or “Make a sharp right onto Bahnhofstrasse at the bank Credit Suisse.”

Update narrative generation PR

Future Improvements

Even though we’ve made significant progress in the feature of landmark-based navigation, this is just the beginning, and there is still room for enhancements in both design and technical aspects. Here, we highlight some key areas for potential improvements, and you can find more detailed tasks marked as “TODO” in our code base.

Design choices

Optimal Landmark Association Distance

One question to consider is whether our current 75-meter cutoff for associating landmarks with edges is ideal. This distance acts as a strict boundary beyond which landmarks are not going to be correlated with any edges. While this helps reduce interference and enhances guidance, we should consider whether it’s too restrictive or too lenient. Besides, introducing a user-configurable option for the cutoff distance could provide more flexibility.

Expanding Landmark Selection Criteria

Currently, we focus on the “amenity” key when parsing landmarks from PBF files. However, there are other key-value pairs that contain valuable information. For instance, the “historic” key with a value of “monument” could be particularly useful for guidance, especially when the landmark has a descriptive name. Expanding our criteria for selecting landmarks could enhance the user experience.

Handling Different Landmark Types

Among the 18 types of landmarks we’ve chosen, some are uniquely identifiable by type alone, while others require both type and name for clear recognition. For instance, in urban areas, it’s common for multiple restaurants to be clustered together. In such cases, combining the name and type of the landmark is crucial for unambiguous navigation instructions. We’ve analyzed each landmark type (see comments in the code) and need to devise strategies to implement these analyses for an improved user experience.

Technical details

Batch Retrieval for Landmark Database

We’ve implemented batch retrieval, dynamically generating query statements based on input size. However, this feature isn’t currently utilized in our code. To boost processing speed, it would be beneficial to support fix-sized batch retrieval and use it for landmark retrievals during tile updates.

Fetching Landmarks with Level 2 Tiles

Currently, we rely on bounding boxes from level 1 tiles to fetch relevant landmarks. However, these level 1 tiles are not always available. In cases where only level 2 tiles cover a region, we should still be able to access landmarks. Implementing this feature would ensure comprehensive landmark coverage in various scenarios.

Accelerating Graph Tile Updates

Updating graph tiles to incorporate landmarks, as opposed to generating entirely new tiles, presents some challenges. Specifically, optimizing the update process for the edgeinfo_offset_map, which is an unordered map, is crucial. Our current method, while effective, can be slow. Investigating performance enhancements in this area would be valuable.

Thread Balancing for Tile Updates

An associated challenge with tile updates is balancing thread workload. Since updates involve moving entries one by one, the workload isn’t solely dependent on the number of edges or landmarks; it also considers the location of edges within the entire unordered map. This makes it challenging to predict and balance threads in advance. Addressing this thread balancing issue is critical, as it has a significant impact on performance. A holistic optimization approach for both the graph tile update process and thread management would be great.

Acknowledgements

This is my first time contributing to an open source project and participating in a large-scale C++ project with real world data. I would like to express my sincere gratitude to both of my mentors Kevin Kreiser (kevinkreiser) and Nils Nolde (nilsnolde) for their amazing support, helping me when I was confused, guiding me when I was lost. As a toddler in the open source community, I would always remember the discussions we had, the pair programming we did, and the bugs we debugged together.

Contributing to Valhalla has been a fantastic learning experience. I’ve dived deep into modern C++, data structures, algorithms, and geographical information systems. I’ve gained a profound understanding of how to operate within the context of a large-scale project. Beyond that, I am surprised to find and deeply captivated by the remarkable versatility of Valhalla and how its implementations enable such adaptability. I’m excited to keep working on the landmark-based navigation feature, hoping it will eventually make life easier for folks like me. This is just the beginning, and I can’t wait to see where it takes us!

References

Tag, taginfo, key value pairs in OSM data

A detailed document of directions with landmark support in Valhalla

Improvments and bugs in this feature

Location: Hochschulen, Altstadt, Zurich, District Zurich, Zurich, Switzerland

OpenStreetMap

Junzhen's Diary