b-jazz's Diary

Analysis of Bounding Box Sizes Over the Last Eight Years

Posted by b-jazz on 5 August 2020 in English. Last updated on 16 August 2020.

Impetus

I recently read in one of the weekly OSM newsletters about a discussion thread on the OSM-talk mailing list. The thread was about having editors limit, or at least warn users about, edits that would result in an unusually large bounding box for the changeset. As someone who has made the mistake of accidentally editing nodes in entirely different parts of the country, and been horrified at the massive bounding box I created, I was curious how often this happens and what a typical bounding box size is for the average mapper.

Gathering the Data

I set about gathering the data on changesets and bounding boxes and picked the current month (July at the time) to look at. I found that there is a minutely “feed” of changeset data, which also includes the computed bounding box, in the replication/changesets directory on the planet.openstreetmap.org site (and luckily mirrored to a single place in the U.S. that I could use). Once my internet connection started glowing red from a day of transferring just a week’s worth of July data, I figured that was probably enough to get something useful up. I wrote a few lines of Python to uncompress the by-minute files, convert them into SQL statements, and start loading them into a Postgres/PostGIS database. (This came with a non-trivial detour to learn just enough about how to work with polygons and WKT, and how to calculate the area using the right spatial reference system.)
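For readers who want to try something similar, here is a minimal sketch of that pipeline's first steps. It is not the script I actually ran. It assumes each by-minute file is gzip-compressed XML with `changeset` elements carrying `min_lon`, `min_lat`, `max_lon`, and `max_lat` attributes, and that changesets without a bounding box should simply be skipped. The function names are made up for illustration, and the area is computed with a plain spherical formula in Python rather than in PostGIS, so the numbers will differ slightly from a proper spheroidal calculation.

```python
import gzip
import math
import xml.etree.ElementTree as ET

# Mean Earth radius in meters; a spherical approximation (assumption).
EARTH_RADIUS_M = 6371008.8

# Attribute names assumed to hold the bounding box on each changeset element.
BBOX_KEYS = ("min_lon", "min_lat", "max_lon", "max_lat")


def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Return the bounding box as a closed WKT polygon (lon lat order)."""
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON(({ring}))"


def bbox_area_km2(min_lon, min_lat, max_lon, max_lat):
    """Area of a latitude/longitude rectangle on a sphere, in square km."""
    dlon = math.radians(max_lon - min_lon)
    band = math.sin(math.radians(max_lat)) - math.sin(math.radians(min_lat))
    return EARTH_RADIUS_M ** 2 * dlon * band / 1e6


def parse_changesets(xml_bytes):
    """Yield (changeset_id, bbox) for every changeset that has a bounding box."""
    root = ET.fromstring(xml_bytes)
    for cs in root.iter("changeset"):
        attrs = cs.attrib
        if not all(key in attrs for key in BBOX_KEYS):
            continue  # no bounding box recorded for this changeset
        bbox = tuple(float(attrs[key]) for key in BBOX_KEYS)
        yield int(attrs["id"]), bbox


def read_replication_file(path):
    """Decompress one by-minute file and return its parsed changesets."""
    with gzip.open(path, "rb") as fh:
        return list(parse_changesets(fh.read()))
```

Each WKT string can then go into an SQL `INSERT`. On the database side, one option is to cast the geometry to PostGIS's `geography` type before calling `ST_Area`, which returns square meters instead of square degrees.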

First Look

The first graph I generated was a simple bar chart for the first week in July. I posted the following chart on the OSM US Slack server in a new channel that I created called #data-is-beautiful (named after a popular subreddit on reddit.com, the self-described “front page of the internet”).
