Audio Mapping first steps

Posted by mvexel on 21 October 2015 in English.

I have long wanted to try audio mapping, especially since I moved to the US and started spending more time in a car.

When you are driving, there are not many ways to record what you see in a way that makes it easy to map later. One way is to use Mapillary, but the sheer amount of information can be overwhelming. A picture every 5 seconds means 180 images to go through on a short, 15-minute drive. It also means handling over 300 MB of image data.
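As a quick sanity check of those numbers, here is the arithmetic spelled out. The variable names are just for illustration; the per-image size is simply what the quoted 300 MB works out to, not a measured value.

```python
# Back-of-the-envelope check of the photo mapping numbers quoted above.
drive_minutes = 15
seconds_per_photo = 5
total_megabytes = 300  # figure quoted in the post, not measured here

photos = drive_minutes * 60 // seconds_per_photo
mb_per_photo = total_megabytes / photos

print(photos)                  # 180
print(round(mb_per_photo, 2))  # 1.67
```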

So, audio mapping. I have a tiny recorder that weighs almost nothing, has built-in space for almost 70 hours (!) of recording, and runs for weeks on a set of AAA batteries.


So I thought I’d use that. Reading about audio mapping, I quickly learned that there are a few strategies. One is to record one long audio track. This way you don’t have to push buttons a lot while driving, which is nice. Another is to make lots of tiny recordings, one for everything you notice and want to map later.

The second method seemed more appealing to me, even if it means pushing the record button a lot while driving. I was envisaging a strategy and result similar to photo mapping. You record a GPS track as usual, and record audio clips when needed. Once home you would load both into JOSM. You would then see a string of markers along your GPS track. Each marker represents an audio clip. Play - map - play - map!

This is more or less how it turned out, but I needed to do some thinking and programming first.

Hurdles to jump

First, my recorder can only write MP3 files. JOSM will only load WAV files. So I needed a way to easily batch convert the files while preserving the file time stamps.
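A minimal sketch of what such a batch conversion could look like. This is not the tool described in this post: it assumes the ffmpeg command-line program is installed, and the function names (wav_path_for, copy_mtime, convert_directory) are made up for this example.

```python
import os
import shutil
import subprocess
from pathlib import Path


def wav_path_for(mp3_path: Path, out_dir: Path) -> Path:
    """Return the output path: same base name, .wav extension, inside out_dir."""
    return out_dir / (mp3_path.stem + ".wav")


def copy_mtime(src: Path, dst: Path) -> None:
    """Give dst the same access and modification times as src."""
    st = os.stat(src)
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))


def convert_directory(in_dir: Path, out_dir: Path) -> list:
    """Convert every MP3 in in_dir to a WAV in out_dir, keeping timestamps.

    Assumption: ffmpeg is available on PATH and does the actual decoding.
    """
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for mp3 in sorted(in_dir.iterdir()):
        # Match .mp3 and .MP3 alike; recorders differ in how they name files.
        if mp3.suffix.lower() != ".mp3":
            continue
        wav = wav_path_for(mp3, out_dir)
        # -y overwrites an existing output file without prompting.
        subprocess.run(
            ["ffmpeg", "-y", "-i", str(mp3), str(wav)],
            check=True,
            capture_output=True,
        )
        # The new file would otherwise carry the time of conversion.
        copy_mtime(mp3, wav)
        converted.append(wav)
    return converted
```

Calling convert_directory(Path("mp3"), Path("wav")) would then fill the wav directory with one WAV file per MP3, each carrying the modification time of its source file.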

Second, you have the same offset issues as with photo mapping. Because you use different devices to record the audio and the GPS, there will invariably be some difference between the internal clocks of the two. You need to resolve this before loading GPS and audio clips into JOSM, or the time shift will result in misplaced audio markers. For photo mapping, there is a built-in function to adjust this offset in JOSM. For audio clips, there is no such thing.
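One way to deal with the offset is to note what both clocks show at the same moment, and then shift the file timestamps by the difference. The sketch below assumes the timestamp that matters is the file's modification time; clock_offset and shift_mtime are hypothetical helper names, not part of any existing tool.

```python
import os
from datetime import datetime, timedelta
from pathlib import Path


def clock_offset(gps_time: datetime, recorder_time: datetime) -> timedelta:
    """Offset to add to recorder timestamps so they line up with GPS time."""
    return gps_time - recorder_time


def shift_mtime(path: Path, offset: timedelta) -> None:
    """Move a file's modification time by offset (negative moves it earlier)."""
    st = os.stat(path)
    new_mtime = st.st_mtime + offset.total_seconds()
    os.utime(path, (st.st_atime, new_mtime))


# Example: at the same instant the GPS showed 14:03:10 and the recorder 14:02:25.
offset = clock_offset(
    datetime(2015, 10, 20, 14, 3, 10),
    datetime(2015, 10, 20, 14, 2, 25),
)
print(offset.total_seconds())  # 45.0
```

With that offset in hand, calling shift_mtime on each clip moves its timestamp 45 seconds later, so it matches the GPS clock.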

So I decided to write a small tool that does both these things, with a name in the best tradition of cryptically named OSM tools. You can learn more about it at the GitHub repository. In brief, running this tool on a directory of MP3 files will give you a directory with corresponding WAV files with proper (optionally offset) timestamps.


OSMTracker can also record short audio clips at the press of a button! I used it a lot when I had an Android phone.

The fun part!

With the WAV files handy, all you need to do is load your GPX file into JOSM, and add the audio clips. Curiously, this works a little differently than with images. You load the GPX file through the File > Open menu, but then you need to right-click the GPS layer to get access to the ‘Import Audio…’ function.


You can then select your WAV files, which should show up as markers neatly along your GPS trace.
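To get a feel for why the timestamps matter so much, here is roughly what placing a marker by time involves: find the two track points around the clip's timestamp and interpolate between them. This is an illustration only, not JOSM's actual code; position_at and the track layout are made up for the example.

```python
from bisect import bisect_left
from datetime import datetime


def position_at(track, when):
    """Linearly interpolate (lat, lon) at time `when`.

    track: list of (datetime, lat, lon) tuples sorted by time.
    Returns None if `when` falls outside the track's time range.
    """
    times = [t for t, _, _ in track]
    if not track or when < times[0] or when > times[-1]:
        return None
    i = bisect_left(times, when)
    if times[i] == when:
        return track[i][1], track[i][2]
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    # Fraction of the way from the earlier point to the later one.
    f = (when - t0) / (t1 - t0)
    return lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0)


# Two made-up track points ten seconds apart, and a clip halfway between them.
track = [
    (datetime(2015, 10, 20, 14, 0, 0), 40.000, -111.870),
    (datetime(2015, 10, 20, 14, 0, 10), 40.002, -111.870),
]
marker = position_at(track, datetime(2015, 10, 20, 14, 0, 5))
```

A clip timestamp that is off by a few seconds lands the marker at a different point between (or beyond) those track points, which is exactly the misplacement the offset correction is meant to prevent.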



You can then click on the individual markers and enter the awkward world of listening to your own voice pointing out addresses, businesses, street names and other map-worthy observations.

Once you get in the groove, you can use the audio menu and the shortcut keys F4-F9 to quickly navigate through your clips.


Great, that wasn’t too difficult. Let me know how you audio-map, or if this quick walk-through makes you want to try it too!

Location: East Liberty Park, Salt Lake City, Salt Lake County, Utah, 84105, United States


Comment from ViriatoLusitano on 21 October 2015 at 02:47

Thanks! This will make noting all the road surfaces and speed limits ever more easy!

Comment from SimonPoole on 21 October 2015 at 07:02

Hmm, audio mapping was my preferred method of gathering data more than 5 years back when I started with OSM. It is bearable for larger features (name of the road you are currently on) and quiet environments (non-moving or enclosed) but doesn’t really work due to:

  • positional errors (I still now and then run into stuff that I mapped back then which is “one-off”)
  • spelling (anything with a name is dicey)

Comment from Richard on 21 October 2015 at 07:47

Good stuff.

I’m convinced that in this age of Siri, Cortana etc. we should be able to do automatic voice recognition for mapping. I’ve experimented a little with an iOS app and Simon added some interesting-looking (sounding?) functionality to Vespucci.

Comment from SK53 on 21 October 2015 at 11:10

I still use audio mapping but rarely try & get it to match my GPS track: partially because I have an older digital dictaphone with only around 1-2 hours storage, and secondly because I found the voice-activated feature more practical, which effectively dropped the timestamps.

Transcription of files and/or direct editing OSM whilst listening to the audio is still cumbersome.

As for automated recognition, I feel that there is great scope for this in validating existing OSM data so as to effectively re-survey places. It is quite difficult to spot changes in densely mapped places, and some kind of map=>speech in conjunction with simple recognition of a very simple set of speech responses might help.

Comment from mvexel on 21 October 2015 at 13:41

Simon – the positional accuracy of the marker is tricky, I quickly found that out also. My first tests that I describe I actually did on foot, but the positional uncertainty would increase with speed. So my thinking is that in a driving context, you would need to focus on ‘big’ features such as lane configuration and turn restrictions at intersections. That is what I hope to test soon.

Richard – agreed, transcribing voice notes is very 2006. Thanks for the Vespucci reminder. I had not used it for a while because I don’t have an Android phone, but it looks like it has come a long way.

SK53, and Richard – Validating with simple (yes/no?) responses to questions is something I see for a ‘Mobile Maproulette’. I would love it if someone built that :)
