Testing planet import

Posted by mmd on 10 October 2022 in English.

For the first time in many years, I decided to repeat a series of planet imports using the Overpass API. As I didn’t want to spend too much money on hardware, I rented a commodity Intel machine for about 40€/month, with two data-center SSDs and 64 GB of ECC RAM.

Much to my surprise, the import using the official release took about 33.5 hours, which is at least 10 hours longer than I expected. I was able to improve the runtime a bit by tweaking a few configuration settings, such as enabling lz4 compression everywhere and increasing a chunk size parameter, but still ended up at 26.5 hours.
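For reference, the tuning described above would look roughly like this on the command line. This is only a sketch: the flag names follow the Overpass `update_database` tool but may differ between versions, and all paths are placeholders.

```shell
# Hypothetical invocation of a planet import with compression tuning.
# --compression-method / --map-compression-method select lz4 for the
# data and map files; --flush-size is the chunk size parameter that
# trades memory for runtime. All paths are placeholders.
bunzip2 < planet-latest.osm.bz2 \
  | ./bin/update_database \
      --db-dir=/opt/overpass/db/ \
      --meta \
      --compression-method=lz4 \
      --map-compression-method=lz4 \
      --flush-size=16
```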

I continued testing with my own experimental Overpass fork, which includes support for PBF, multithreading, and many other changes under the hood. Initial measurements looked quite promising, with 10.5 hours total runtime. After some further analysis and improvements to some data structures, the import took 7 hours and 23 minutes. Peak memory consumption was still quite acceptable at 22 GB. I tried different settings to achieve lower memory consumption, at the cost of longer processing time (e.g. 8 hours and 13 GB peak memory).

Depending on compression settings, the final planet database was in the range of 230-265 GB.

Detailed results are available on this wiki page:

That’s all for today.


Comment from pnorman on 12 October 2022 at 08:11

Does overpass not read the history PBFs for historical data?

Comment from mmd on 12 October 2022 at 19:12

Historical object versions are a challenging topic. I reran some tests, starting with a 2012 planet and subsequently applying daily diffs in PBF format. I ended up processing the years 2012-2017 at about 600x speed (1 day = 600 OSM days), batching the diffs in packages of 3 days. Using the official release is much slower and works with XML files only.
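As a rough illustration, the XML-based diff workflow in the official release is driven by two helper scripts along these lines. The script names follow the stock Overpass distribution; the replication URL, sequence number, and paths below are illustrative placeholders.

```shell
# Hypothetical sketch: fetch daily .osc diffs and apply them to an
# existing Overpass db with the stock scripts (XML-based, hence
# slower than a PBF-based pipeline).
DB_DIR=/opt/overpass/db
DIFF_DIR=/opt/overpass/diffs
REPL_URL=https://planet.openstreetmap.org/replication/day/

# Download diffs starting at a given replication sequence number ...
./bin/fetch_osc.sh 3400 "$REPL_URL" "$DIFF_DIR" &
# ... and apply them to the database, keeping object metadata.
./bin/apply_osc_to_db.sh "$DIFF_DIR" 3400 --meta
```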

Using a full history planet instead doesn’t work. The importer wasn’t designed with this use case in mind and uses way too much memory.

Comment from PierZen on 14 October 2022 at 18:24

Hi mmd, for those who are not familiar with these programming tools, could you describe what type of import is done exactly with the OSM planet file - moving it into an Overpass db, or anything else?

Comment from mmd on 15 October 2022 at 14:03

The goal of the import is to set up a new Overpass db using a recent OSM planet file. As usual, the import process writes all nodes, ways and relations to disk, and includes object metadata (user, timestamp, object version number, …). Once the import has finished, you can use the db to run some queries.
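Once the db is in place, a query can be run locally against it. A minimal sketch, assuming a local installation with the `osm3s_query` tool; the db path and the bounding box in the query are placeholders, and `out meta;` returns the stored metadata (user, timestamp, version) mentioned above.

```shell
# Hypothetical example: run an Overpass QL query against the
# freshly imported local database via osm3s_query.
echo '[timeout:25];
node["amenity"="drinking_water"](50.7,7.0,50.8,7.2);
out meta;' | ./bin/osm3s_query --db-dir=/opt/overpass/db
```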

In case you don’t want to go through this process, there are also clone databases available for download (see docs for details).
