
pyosmium 4.0.0 released

Posted by lonvia on 22 September 2024 in English. Last updated on 7 October 2024.

I’d like to announce the release of the new major version 4 of pyosmium.

pyosmium was originally created as a thin Python wrapper around the osmium library, a fast and flexible C++ library for reading, writing and processing OSM data. With the new version 4, pyosmium adds a convenience layer on top which gives the library a more pythonic feel and speeds up processing considerably.

The most important new features are:

  • Iterative processing. OSM files can now be iterated through with a simple for loop. (Processing a file using handler functions is still possible.)
  • Filter functions make it possible to quickly skip over uninteresting parts of the file, so pyosmium scripts can now be used with large OSM files without pre-filtering.
  • Writers with automatic reference completion. One of the challenges of writing OSM files is that for every way and relation written, you usually also need to write the nodes and members. pyosmium now implements writers which help with that.

Here is an example of how to quickly find the most frequent tags used together with amenity=school using the new iterative syntax:

import osmium
from collections import Counter

tag_counter = Counter()
total = 0

for o in osmium.FileProcessor('planet.osm.pbf')\
               .with_filter(osmium.filter.TagFilter(('amenity', 'school'))):
    tag_counter.update([tag.k for tag in o.tags if tag.k != 'amenity'])
    total += 1

for tag, cnt in tag_counter.most_common(10):
    print(f"{cnt:6d} ({cnt*100/total:5.2f}%) {tag}")

Running this on a full OSM planet file takes less than 5 minutes on a 12-core machine with 128GB RAM.
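Handler functions remain available in version 4. As a rough sketch, the same count written in handler style could look like the following. It assumes that osmium.apply() accepts a file name together with any Python object that has node, way and relation callbacks, which is how the version 4 API appears to work; with older releases you would subclass osmium.SimpleHandler and call apply_file() instead. The names SchoolTagCounter and count_school_tags are made up for this illustration.

```python
from collections import Counter


class SchoolTagCounter:
    """Handler-style counter: one callback is invoked per OSM object."""

    def __init__(self):
        self.tag_counter = Counter()
        self.total = 0

    def _count(self, o):
        # No filter runs in front of the handler here, so the
        # amenity=school test has to be done by hand for every object.
        if o.tags.get('amenity') == 'school':
            self.tag_counter.update(tag.k for tag in o.tags
                                    if tag.k != 'amenity')
            self.total += 1

    def node(self, n):
        self._count(n)

    def way(self, w):
        self._count(w)

    def relation(self, r):
        self._count(r)


def count_school_tags(filename):
    """Run the handler over a file (hypothetical helper)."""
    # Imported here so the class above can be used without pyosmium.
    import osmium

    counter = SchoolTagCounter()
    # Assumption: apply() takes a file name and a duck-typed handler.
    osmium.apply(filename, counter)
    return counter
```

Calling count_school_tags('planet.osm.pbf') would return the counter object, whose tag_counter and total attributes can be printed as in the loop above. Since every object in the file is handed to a Python callback, this variant is likely to be noticeably slower than the filtered FileProcessor loop.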

Or perhaps you want to create a thematic extract of schools:

import osmium

with osmium.BackReferenceWriter('schools.pbf', 'planet.osm.pbf',
                                overwrite=True) as writer:
    for o in osmium.FileProcessor('planet.osm.pbf')\
                   .with_filter(osmium.filter.TagFilter(('amenity', 'school'))):
        writer.add(o)

This is done in about 13 minutes. For comparison, osmium-tool’s tags-filter needs about 10 minutes for the same task on the same machine.

There are many more small improvements and additions. For a complete list of changes, have a look at the release notes. The improved documentation now comes with a cookbook section with documented examples to get you started.

Discussion

Comment from Mateusz Konieczny on 7 October 2024 at 17:11

https://docs.osmcode.org/pyosmium/v4.0.0/ and https://docs.osmcode.org/pyosmium/v4.0.0/cookbooks/ linked in this post are 404ing

iterative syntax looks nice! But handler functions and preprocessing are still needed if you are interested in location of objects, right?

Comment from Mateusz Konieczny on 7 October 2024 at 17:12

user manual link in https://github.com/osmcode/pyosmium/releases/tag/v4.0.0 is also refusing to work

Comment from Mateusz Konieczny on 7 October 2024 at 17:13

Maybe it is related to 4.0.0 being not listed and 4.0.1 being listed at https://docs.osmcode.org/pyosmium/ ?

Comment from lonvia on 7 October 2024 at 19:09

Thanks for the hint. Fixed the links.

“But handler functions and preprocessing are still needed if you are interested in location of objects, right?”

No. Have a look at https://docs.osmcode.org/pyosmium/latest/user_manual/03-Working-with-Geometries/ to learn how to create geometries when working with the FileProcessor.
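A minimal sketch of that approach, assuming the with_locations() option of FileProcessor and the WKTFactory class from osmium.geom as described in the linked manual. The function names way_linestrings and school_ways_as_wkt are hypothetical and only serve the illustration:

```python
def way_linestrings(objects, factory):
    """Yield (way id, geometry) for every way in an iterable of OSM objects."""
    for o in objects:
        if o.is_way():
            # The geometry must be built inside the loop, while the
            # object handed out by the processor is still valid.
            yield o.id, factory.create_linestring(o.nodes)


def school_ways_as_wkt(filename):
    """Return a generator of WKT outlines of ways tagged amenity=school."""
    # Imported here so the generator above can be used without pyosmium.
    import osmium
    import osmium.geom

    fp = (osmium.FileProcessor(filename)
          .with_locations()  # cache node positions while reading
          .with_filter(osmium.filter.TagFilter(('amenity', 'school'))))
    return way_linestrings(fp, osmium.geom.WKTFactory())
```

Iterating over school_ways_as_wkt('planet.osm.pbf') would then give pairs of way id and WKT string in a single pass over the file. A way that references nodes missing from the file may raise an error when the geometry is created, so extracts with incomplete ways need extra handling.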
