calculate ways instead of drawing or: a look at "osm-makeroads"

Posted by malenki on 30 October 2012 in English (English)

(Hint: first read, then install. Maybe you won't need this tool at all)

Since my last holidays once again resulted in tons of GPX logs, I once more looked for a way to reduce the time I spend tracing them. Last year I tried Average tracks but wasn't too satisfied, although I can't recall the exact reasons now.

Never mind; this year I remembered having read about osm-makeroads in the German weekly OSM notices.

The readme seems helpful at first sight, but (at least for me) there were some traps. Since R complained about some missing stuff, I had to additionally install libproj-dev and gfortran. On Debian, the tools are installed as follows:
apt-get install libproj-dev gfortran r-cran-maptools gpsbabel libgdal-dev

The readme's command for installing the R libraries did not work on my system. I figured out that I could install them successfully as a normal user by opening the R shell (typing "R" in a terminal emulator) and installing the libraries one by one with these commands:

install.packages("princurve",dependencies = TRUE)
install.packages("rgdal",dependencies = TRUE)
install.packages("geosphere",dependencies = TRUE)

Now I could run the script. Only - how...?

First, I saved the files in a folder where I store binaries and scripts; let it be /bin here.
A bit of googling and guessing gave me the line that starts the script:
R -e "source('/bin/process.R')"
Now it complained that it could not find makeroads.R, so in process.R I replaced the line
source('makeroads.R')
with one that gives the full path to makeroads.R.

Now I could finally run the program. Yeah! But I was a little surprised to find out that it only calculates ways from data it can fetch from OSM for a defined bounding box. (Yes, you may call me an inattentive manual reader.) Since the calculated result looked good and I had already invested quite some time, I decided to go on.

process.R also defines the bounding box for which the script fetches the GPX data. One region with untraced logs was near Logatec. To get the four coordinates I used the changeset URL, but the export tab is just as useful. Be aware that the order of the coordinates in the file is not the OSM default left-top-right-bottom.
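Since the coordinate order is the easy thing to get wrong here, a small standalone Python sketch (not part of osm-makeroads) can help: it names the four edges of a bounding box regardless of which two corners were copied, and then emits them in whatever order a file expects. The function names and the coordinate values are made up for illustration; the order process.R actually expects has to be read from the file itself.

```python
from typing import Dict, Sequence, Tuple


def normalize_bbox(lon_a: float, lat_a: float,
                   lon_b: float, lat_b: float) -> Dict[str, float]:
    """Name the four edges of a bbox given any two opposite corners."""
    return {
        "left": min(lon_a, lon_b),
        "bottom": min(lat_a, lat_b),
        "right": max(lon_a, lon_b),
        "top": max(lat_a, lat_b),
    }


def in_order(bbox: Dict[str, float], order: Sequence[str]) -> Tuple[float, ...]:
    """Return the edge values in the requested order of edge names."""
    return tuple(bbox[name] for name in order)


# Illustrative values only (roughly the Logatec area), not the bbox from this post.
bbox = normalize_bbox(14.25, 45.95, 14.20, 45.90)
print(in_order(bbox, ("left", "bottom", "right", "top")))
print(in_order(bbox, ("left", "top", "right", "bottom")))
```

Writing the edges down by name first and only then arranging them makes it harder to swap top and bottom by accident.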

osm-makeroads needs quite some time to calculate average ways, and multithreading is not possible. For the above-mentioned bounding box the calculation ran for about 18(!) hours on an AMD Phenom II X2 550 processor, and then there was a power outage. sigh
I will not give it a second try.

Conclusion: osm-makeroads is for me not really helpful.

(edit) When I wrote this entry, the bounding box contained 2.9 MB of GPX data with 22342 trackpoints. Here is the file with that data.
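To get a feeling for how much data a GPX file holds before feeding it to a long calculation, the trackpoints can be counted with a few lines of Python. This is only a sketch using the standard library; the sample GPX snippet and the function name are made up for illustration, and a real file would be read from disk instead.

```python
import xml.etree.ElementTree as ET


def count_trackpoints(gpx_text: str) -> int:
    """Count <trkpt> elements in a GPX document, ignoring the XML namespace."""
    root = ET.fromstring(gpx_text)
    # Parsed tags look like "{namespace}trkpt"; compare only the local name.
    return sum(1 for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "trkpt")


# Tiny hand-written sample, not real log data.
SAMPLE = (
    '<gpx version="1.1" xmlns="http://www.topografix.com/GPX/1/1">'
    "<trk><trkseg>"
    '<trkpt lat="45.90" lon="14.20"/>'
    '<trkpt lat="45.91" lon="14.21"/>'
    '<trkpt lat="45.92" lon="14.22"/>'
    "</trkseg></trk>"
    "</gpx>"
)

print(count_trackpoints(SAMPLE))
```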

Comment from Chris Lawrence on 31 October 2012 at 06:26

Hello! Glad to see someone using osm-makeroads. I have been working on a few speed improvements the last couple of days, but there are still issues to be worked on of course - I definitely want to add some multicore support, so I've been refactoring the code to make that possible.

Even so I'm a bit surprised it took 18 hours on a Phenom II X2; I have an Athlon II X4 630 (which should be similar performance-wise). I'll download the GPX data you used and see what the bottleneck is; it may be memory usage rather than CPU speed (I haven't paid much attention to memory usage yet).

There is a poorly documented method to bring in GPX files directly (avoiding the need to download it from the API), if you use getOSMtracksFiles() instead of getOSMtracks(). You can also rearrange the bbox setup lines to the order you're comfortable with (and getOSMtracks doesn't mind if you accidentally swap left/right and top/bottom).

I'll comment on the fit quality issues on your Oct 31 blog.

Comment from Chris Lawrence on 31 October 2012 at 06:42

Sorry, misunderstood - thought the Oct 31 post was using osm-makeroads. I'll definitely investigate what's going on in the GPX data you provided.

Comment from malenki on 31 October 2012 at 07:27

Hi Chris,
thanks for having a look here.

I hope you realise the 18 hours were limited by a power blackout and not a successful finish of calculation. ;) I don't think there was an issue with memory usage. The box on which I ran the script has 8GB and there was always at least 1GB free.
I had already rearranged the bbox setup lines but didn't mention it. I think people fiddling with scripts will realise that possibility on their own. On the other hand, it's no big issue.

Calculating tracks from the API has its advantages. When there are already some logs, I'd prefer to include all of them instead of only the two or four I could create on my own.
One trap, though, is that the bbox can have a different size than the permalink to the map shows when using the export tab. Maybe the script went a bit too far to the southeast and choked on the data of the town of Logatec. I'll give you the coordinates I used as soon as I am back on the box where I tested the script.

Regarding yesterday's post: maybe I shouldn't have used nearly the same topic.

Comment from Chris Lawrence on 31 October 2012 at 19:24

I've been working on the code a bit using the data from the link above. I've fixed a few performance issues and a few real bugs so far... but it's still slow. I really need to do some optimizations in the path search code as well. I've never run it on so much data before (typically, areas substantially smaller than what the OSM editing API will allow, with fewer source traces), so this is a bit of a learning experience! This feedback has really helped improve the code and is starting to push me to improve its documentation.

The good news is that once I have the bugs worked out and some more optimizations done I think parallelizing the remaining computation-intensive code (mainly calls to dist2Line, which involve a lot of floating point math) will be fairly easy.

Comment from malenki on 31 October 2012 at 21:49

Happy I could help. :) (And glad you optimize the code.)

3 MB of GPX data is not much data for me. After last year's bike trip I had more than 20 MB; after a trip by car that finished some days ago, about 54 MB.

Though most parts are from already mapped roads, there are some kilometres left to map. In 2011 I just took the log of the better device, converted, cleaned and uploaded it when there were some dozen km of new highway...

If I had a go at such a part with makeroads, I'd have to do the calculations with a lot of tiny bboxes, since one single big bbox could cause too much "by-catch" of GPX data, which would uselessly increase the duration of the calculation.
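One way such tiny bboxes could be produced is to walk along one's own track and put a small box around every few points. The following Python sketch does that under a few assumptions: points are (lon, lat) pairs, the margin is given in degrees, and the function name and sample values are made up for illustration. It is not part of osm-makeroads, and the resulting boxes would still have to be fed to the script one by one.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]               # (lon, lat)
BBox = Tuple[float, float, float, float]  # (left, bottom, right, top)


def track_bboxes(points: Sequence[Point], chunk_size: int,
                 margin: float = 0.0) -> List[BBox]:
    """Cover a track with small bboxes, one per chunk of consecutive points.

    Each chunk also includes the first point of the next chunk, so that
    neighbouring boxes touch and no stretch of the track is left out.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    boxes: List[BBox] = []
    for start in range(0, len(points) - 1, chunk_size):
        chunk = points[start:start + chunk_size + 1]
        lons = [lon for lon, _ in chunk]
        lats = [lat for _, lat in chunk]
        boxes.append((min(lons) - margin, min(lats) - margin,
                      max(lons) + margin, max(lats) + margin))
    return boxes


# Made-up track, for illustration only.
track = [(14.0, 45.0), (14.1, 45.1), (14.2, 45.2), (14.3, 45.3), (14.4, 45.4)]
for box in track_bboxes(track, chunk_size=2):
    print(box)
```

A smaller chunk size gives more but tighter boxes, which should reduce the by-catch at the cost of more separate runs.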
