NaPTAN Aberdeen Import Summary

Posted by TheEditor101 on 16 July 2019 in English. Last updated on 24 February 2021.

The Import

Yesterday I performed an import of NaPTAN bus stop data for the Aberdeen area. The motivations behind this import were:

  1. To greatly increase the coverage of bus stop data in the area.
  2. To get some experience importing data (hence the very specific and simple scope of this import).
  3. To establish a method of importing NaPTAN data since the original import was abandoned (and all of this very useful data has been sitting unused).

Results

Increased Stop Coverage

The import changesets can be found by looking at user naptan_import_aberdeen.

In total, 63 regions of the Aberdeen area were imported as separate changesets. Across those, a total of 1357 bus stops were imported, of which 875 (64%) were new and 482 (36%) were conflated with existing stops.

The Aberdeen region (the city centre) had the most stops, with 154 in total, while the Countesswells and Footdee regions tied for the fewest with just 1 stop each.

Experience Gained

I think the most valuable experience I’ve gained from this import was in consulting the community and following the import guidelines. The guidelines are quite daunting when you first see them, and honestly they were a little off-putting for me personally because I’ve been fatigued by aspects of the OSM community in the past. Thankfully, the UK community are quick to respond on the mailing lists and very helpful and constructive.

If I hadn’t consulted the mailing lists then I wouldn’t have known how useful and important some of the NaPTAN data fields are. Originally I was of the mindset that only the naptan:AtcoCode was necessary to import since it establishes a link between the OSM data and the NaPTAN data. However, after actually performing the import I can already say that the other fields helped me to distinguish between stops in close proximity with the same name.

Method Established

Overall I’m pretty satisfied with the method I followed. Converting the desired data into OSM XML was trivial using the script I wrote and it would be very simple to extend the script to do more if needed.
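
For anyone curious what such a conversion might look like, here is a minimal sketch. It is not the script I used. The CSV column names (ATCOCode, CommonName, Indicator, Latitude, Longitude), the tag mapping and the sample rows are all assumptions for illustration, so check them against the actual NaPTAN export and the agreed tagging before relying on anything like this.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed mapping from CSV column names to OSM tag keys (illustrative only).
TAG_MAP = {
    "ATCOCode": "naptan:AtcoCode",
    "CommonName": "name",
    "Indicator": "naptan:Indicator",
}


def stops_to_osm_xml(rows):
    """Build an OSM XML string with one new bus stop node per input row."""
    root = ET.Element("osm", version="0.6", generator="naptan-sketch")
    for new_id, row in enumerate(rows, start=1):
        # Negative IDs mark nodes that do not exist on the server yet.
        node = ET.SubElement(
            root, "node",
            id=str(-new_id), lat=row["Latitude"], lon=row["Longitude"],
        )
        ET.SubElement(node, "tag", k="highway", v="bus_stop")
        for column, key in TAG_MAP.items():
            value = (row.get(column) or "").strip()
            if value:  # skip empty fields rather than writing blank tags
                ET.SubElement(node, "tag", k=key, v=value)
    return ET.tostring(root, encoding="unicode")


# Made-up sample rows; real NaPTAN data has many more columns.
SAMPLE_CSV = """ATCOCode,CommonName,Indicator,Latitude,Longitude
TEST0001,Example Road,opp,57.1200,-2.1300
TEST0002,Example Road,,57.1201,-2.1302
"""

osm_xml = stops_to_osm_xml(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(osm_xml)
```

The output can be saved as a .osm file and opened as its own layer in JOSM, which keeps the imported data separate from the existing data until conflation.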

The conflation process is what’s really of interest though, as it was the main challenge of this import. I decided that, rather than trying to automate the process, it would be more beneficial to conflate the imported data manually in JOSM. After working with it, I can say that NaPTAN data as a whole is pretty accurate. However, there were a couple of situations I ran into which required extra care.

The most common situation is that NaPTAN stops can be displaced from existing nodes by more than 50 meters. I used the JOSM conflation plugin as specified in my import plan, and its default distance setting for conflation is 30 meters. I found that bumping this up to 50 meters caught the majority of conflations, but it still wasn’t perfect. As seen in the centre of the image below, two of the stops on Garthdee Road are not matched to their corresponding existing nodes (blue arrows show matches).

[Image: Example]

I wouldn’t ever advise relying on the plugin alone and recommend always manually scanning through all of the stops to look for possible duplicates or other issues. My process evolved into:

  1. Run conflation matching.
  2. Toggle the import data layer visibility on/off to check for unmatched stops near existing nodes (because the data is on different layers, it’s easy to spot any nearby bus stop nodes coloured grey).
  3. Zoom to those stops and manually check whether there should be a match. Most can be determined from aerial imagery or a duplicate name, and if all other stops nearby are matched, then the remaining stop must match. For further comparison, the naptan:Indicator tag helps since it tells you where the stop is relative to a nearby street/feature.
  4. Re-position either the new data or the old data to the correct position so that they’re close together and will be caught by conflation matching.
  5. Re-run conflation matching and then go through the actual conflation process.
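
The manual scan in steps 2 and 3 could be given a head start with a small script. The following is only a rough sketch and not what the JOSM conflation plugin does internally: it lists imported stops whose nearest existing stop is further away than the matching distance but still close enough to be worth a look. The dictionary layout, the review_candidates name and the 150 meter review limit are all assumptions for illustration; only the 50 meter matching distance comes from the import itself.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    radius = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def review_candidates(imported, existing, match_limit_m=50.0, review_limit_m=150.0):
    """List imported stops whose nearest existing stop is too far to be
    matched automatically but close enough to deserve a manual check.

    review_limit_m is an assumed value, not something from the import plan.
    """
    flagged = []
    if not existing:
        return flagged
    for stop in imported:
        # Distance from this imported stop to every existing stop.
        distances = [
            (haversine_m(stop["lat"], stop["lon"], node["lat"], node["lon"]), node)
            for node in existing
        ]
        distance, nearest = min(distances, key=lambda pair: pair[0])
        if match_limit_m < distance <= review_limit_m:
            flagged.append((stop["ref"], nearest["ref"], round(distance)))
    return flagged


# Made-up coordinates purely for demonstration.
imported = [
    {"ref": "A", "lat": 57.1200, "lon": -2.1300},
    {"ref": "B", "lat": 57.1300, "lon": -2.1300},
]
existing = [
    {"ref": "n1", "lat": 57.1210, "lon": -2.1300},
    {"ref": "n2", "lat": 57.13002, "lon": -2.1300},
]

flagged = review_candidates(imported, existing)
for imported_ref, existing_ref, metres in flagged:
    print(f"check {imported_ref} against {existing_ref} ({metres} m apart)")
```

Anything this flags still needs a human decision using imagery, names and naptan:Indicator. The script only narrows down where to look.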

Another issue you may have missed in the image above involves the two stops furthest west. If you’re paying close attention, you’ll notice that the matched stops are on opposite sides of the road. This is something else to look out for when scanning the stops. It isn’t always an issue, because sometimes the NaPTAN data is simply positioned on the wrong side of the road. However, in this case they were matched incorrectly, which I determined by checking the naptan:Indicator field (again, the fix is to move the data closer to the stop it should match before re-running conflation matching).

Looking Forward

I hope that the completion of this import inspires other areas of the UK to make use of NaPTAN data too. It’s by no means perfect, but once the data is there it becomes much easier to see the imperfections and fix them compared to creating the data from scratch (and even with imperfections it’s still quite accurate). I would suggest that being local to the area is very beneficial to this import process, since conflation is easier when you’re already familiar with where the stops are.

The NaPTAN dataset should only become more accurate over time, so I would like to extend my script to support re-imports by matching the naptan:AtcoCode of existing nodes and pre-emptively conflating the data. Additionally, I’d be curious to investigate importing other types of data from the NaPTAN database.
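
As a rough idea of how that pre-matching might work, the sketch below indexes existing nodes by their naptan:AtcoCode tag and splits incoming stops into those that already have a counterpart and those that are new. Nothing here exists in my script yet. The function name, the dictionary layout and the sample codes are hypothetical.

```python
def split_by_atco_code(existing_nodes, incoming_stops):
    """Pair incoming stops with existing nodes that share a naptan:AtcoCode.

    Returns (matched, unmatched): matched is a list of (existing, incoming)
    pairs and unmatched is a list of incoming stops with no counterpart.
    """
    by_code = {
        node["tags"]["naptan:AtcoCode"]: node
        for node in existing_nodes
        if "naptan:AtcoCode" in node.get("tags", {})
    }
    matched, unmatched = [], []
    for stop in incoming_stops:
        code = stop.get("tags", {}).get("naptan:AtcoCode")
        node = by_code.get(code)
        if node is None:
            unmatched.append(stop)  # new stop, still needs normal conflation
        else:
            matched.append((node, stop))  # candidate for an in-place update
    return matched, unmatched


# Made-up nodes purely for demonstration.
existing_nodes = [
    {"id": 101, "tags": {"highway": "bus_stop", "naptan:AtcoCode": "TEST0001"}},
    {"id": 102, "tags": {"highway": "bus_stop"}},  # mapped by hand, no code
]
incoming_stops = [
    {"id": -1, "tags": {"naptan:AtcoCode": "TEST0001", "name": "Example Road"}},
    {"id": -2, "tags": {"naptan:AtcoCode": "TEST0002", "name": "Example Road"}},
]

matched, unmatched = split_by_atco_code(existing_nodes, incoming_stops)
print(len(matched), "already present,", len(unmatched), "left to conflate")
```

Stops in the unmatched list would still go through the distance-based conflation described earlier, since existing nodes mapped by hand won’t carry a naptan:AtcoCode.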

Leave a comment or send me a message if you have any questions or thoughts!

Location: City Centre, Aberdeen City, Scotland, United Kingdom