When working with OSM it’s generally fair to assume that textual data, like tag values, are encoded in UTF-8. Without this assumption, multilingual mapmaking would be almost impossible - custom fonts or browser settings would need to be specified for every language when displaying geocoding results, routing directions or map labels.
As part of the newly resurrected Engineering Working Group, I’m investigating ways to improve OSM’s software ecosystem. Localization is one of the EWG’s top tasks, and standardized text encoding is a prerequisite for it - yet OSM does not enforce any particular encoding as policy.
Where is the non-Unicode data?
The most obvious instance of non-Unicode in OSM is the Zawgyi encoding for Burmese text. For background on Zawgyi, see this post on the civil war between fonts in Myanmar.
The default Mapnik-based rendering on OpenStreetMap.org, openstreetmap-carto, uses Unicode fonts. Zawgyi-encoded tags appear obviously garbled on the map, with the dotted-circle mark ◌ (displayed when a combining mark lacks a valid base character) visible:
Myanmar officially adopted Unicode in 2019, but the migration requires both digital services and end user devices to adopt the new standard. OSM still has mixed encodings; this significantly limits its usefulness as a dataset, not only for mappers using Burmese, but for any global-scale data product - such as geocoders and basemaps - that touches Burmese text.
Zawgyi reuses the same Myanmar code point range as Unicode, so detecting Zawgyi-encoded text is not trivial. Google and Facebook have open sourced an ML-based model for this detection: see Facebook’s path from Zawgyi to Unicode - which estimates the probability that an input string is Zawgyi. I have created a list of all OSM name tags with >90% probability according to this model here:
The Osmium script to generate this list from a PBF extract is on GitHub.
Next Steps
- A high-quality conversion of non-Unicode data requires users proficient in the written script, ideally native speakers/readers. If you’re a Burmese reader and are interested in this task, please leave a reply.
- The ML model for Zawgyi detection is trained on longer text. Evaluate whether it is accurate enough for classifying short strings such as the place names in OSM.
- Identify what, if anything, should be done at the editor level to detect encodings. For a mapper whose device is set to Zawgyi system-wide, text encoding conflicts will be invisible.
- Does your language have text encoding problems in OSM? Another, less critical area is the issue of Han Unification (Unihan) characters, but solutions to that lie outside the design of Unicode.