Are you familiar with the OpenStreetMap Statistics created by Piebro? It’s a handy tool that analyzes OSM changeset files and creates graphs from various perspectives.

[Image: user_statistics_japan]

The project’s README also indicates that analysis can be performed for individual countries or regions. I analysed the Japanese region in my local environment, so I’ll summarize the procedure.

[Image: region_japan]

My Environment

  • macOS
  • Python 3.9

Data Preparation

Clone Piebro's git repository to your local machine:
$ git clone https://github.com/piebro/openstreetmap-statistics.git
$ cd openstreetmap-statistics

Install osmium-tool using Homebrew:
$ brew install osmium-tool

Create and activate a virtual environment:
$ python3 -m venv .venv
$ source .venv/bin/activate

Install the necessary dependencies:
$ pip3 install -r requirements.txt

Obtaining the OSM Changeset File

Piebro’s README downloads the file via torrent. Torrent puts less load on the OSM servers, so prefer it if you can. To download the file directly instead:

$ rm -f *.osm.bz2
$ wget -N https://planet.openstreetmap.org/planet/changesets-latest.osm.bz2

Obtain the Poly File for the Target Region

Download the poly file for the target region from the Geofabrik site.

$ wget -N https://download.geofabrik.de/asia/japan.poly
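
For reference, a .poly file is plain text: a name line, then one or more ring sections of longitude/latitude pairs, each closed by END, with a final END closing the file. A section name starting with "!" marks a hole. Below is a minimal sketch of reading that format in Python. The function name parse_poly and the returned dictionary layout are assumptions made for illustration; they are not part of Piebro’s project or of the extraction script used later.

```python
from typing import Dict, List, Tuple

Ring = List[Tuple[float, float]]  # (lon, lat) pairs


def parse_poly(text: str) -> Dict[str, List[Ring]]:
    """Parse .poly text into outer rings and holes (hypothetical helper)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    rings: Dict[str, List[Ring]] = {"outer": [], "holes": []}
    current: Ring = []
    kind = None  # None while we are between ring sections
    for ln in lines[1:]:  # line 0 is the polygon name
        if kind is None:
            if ln == "END":  # the final END closes the file
                break
            # Section header: a leading "!" marks a hole
            kind = "holes" if ln.startswith("!") else "outer"
            current = []
        elif ln == "END":  # END closes the current ring
            rings[kind].append(current)
            kind = None
        else:
            lon, lat = ln.split()[:2]  # longitude comes first
            current.append((float(lon), float(lat)))
    return rings


# Usage (assumes japan.poly is in the current directory):
# with open("japan.poly") as f:
#     rings = parse_poly(f.read())
# print(len(rings["outer"]), "outer rings,", len(rings["holes"]), "holes")
```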

Extracting the Changeset File

Use the poly file to extract the target region from the changesets-latest file. The following script (written by Claude) does the extraction:

https://gist.github.com/nyampire/57360f29be2bff4d74d94e3c5bfa3237

Install shapely 1.8.0. Versions 2.0 and later seem to be incompatible with the dependent software.
$ pip3 install shapely==1.8.0
$ wget -N https://gist.githubusercontent.com/nyampire/57360f29be2bff4d74d94e3c5bfa3237/raw/1fa9ded4126b3282af273469225407847cb38ee7/parse-latest.py
$ python3 parse-latest.py [poly.filename] [input.filename] [output.filename]
Example:
$ python3 parse-latest.py japan.poly changesets-latest.osm.bz2 japan-changesets.osm.bz2

In this example, the script writes its output to japan-changesets.osm.bz2.
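
To illustrate the idea behind this kind of extraction, here is a minimal sketch that streams the changeset XML and keeps the changesets whose bounding-box centre falls inside a polygon ring. It is not the gist script. The function names are hypothetical, the bounding-box-centre rule is an assumption, it uses a plain ray-casting test instead of shapely, and it only yields matching changesets rather than writing a new .osm.bz2 file.

```python
import bz2
import xml.etree.ElementTree as ET
from typing import BinaryIO, Iterator, List, Tuple

Ring = List[Tuple[float, float]]  # (lon, lat) pairs


def point_in_ring(lon: float, lat: float, ring: Ring) -> bool:
    """Ray casting: toggle on every ring edge crossed by a ray to the east."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > lat) != (y2 > lat):  # edge spans the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def changesets_in_ring(stream: BinaryIO, ring: Ring) -> Iterator[dict]:
    """Yield attributes of changesets whose bbox centre lies in the ring."""
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # the enclosing <osm> element
    for event, elem in context:
        if event != "end" or elem.tag != "changeset":
            continue
        attrs = elem.attrib
        if "min_lon" in attrs:  # changesets without edits have no bbox
            lon = (float(attrs["min_lon"]) + float(attrs["max_lon"])) / 2
            lat = (float(attrs["min_lat"]) + float(attrs["max_lat"])) / 2
            if point_in_ring(lon, lat, ring):
                yield dict(attrs)
        root.clear()  # drop processed elements to limit memory use


def count_matches(path: str, ring: Ring) -> int:
    """Count matching changesets in a .osm.bz2 file, decompressing on the fly."""
    with bz2.open(path, "rb") as stream:
        return sum(1 for _ in changesets_in_ring(stream, ring))
```

Testing only the centre of the bounding box is a simplification: a changeset that spans several countries is assigned to whichever region contains its centre, or to none at all.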

Generating Files for the Notebook

The subsequent processing is similar to Piebro’s procedure. I omitted the pv command as it’s only used to monitor progress.

$ rm -rf temp
$ osmium cat --output-format opl japan-changesets.osm.bz2 | python3 src/changeset_to_parquet.py temp
$ python3 src/parquet_to_json_stats.py temp

Updating the Notebook

Run the following in the terminal. You can save it as a shell script, or simply paste it directly into the shell.

for notebook in $(find src/questions -name calculations.ipynb); do
    jupyter nbconvert --to notebook --execute "$notebook" --output calculations.ipynb
done

Starting the Local Server

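Start a local HTTP server to check the result. The pages must be served over HTTP: if you open the files directly, the browser blocks loading of the data files because of CORS restrictions, and the graphs won’t display.

$ python3 -m http.server 1010

Open http://localhost:1010 in your browser to view the graphs.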
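
The same server can also be started from a Python script, for instance if you want it to listen on localhost only. This is a minimal sketch using only the standard library. The helper name make_server is hypothetical, and port 1010 simply matches the command above.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory: str = ".", port: int = 1010) -> ThreadingHTTPServer:
    """Build a server for `directory`, reachable from this machine only."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (run from the repository root; blocks until stopped with Ctrl+C):
# make_server(".", 1010).serve_forever()
```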

Instead of starting an HTTP server, you should also be able to push to your own git repository and check the result on gh-pages.

Note, however, that after the steps so far the remote still points to Piebro’s repository. Change it to your own repository before pushing.


Discussion

Comment from Koreller on 31 July 2024 at 08:03

Very cool to see all those stats; it helps to see which editor is popular and which is creating more objects. I really appreciated seeing the corporate hashtag stats with the heat map.

The same kind of stats (or more!) by country would be fantastic ;)

Thank you!
