OpenStreetMap

spwoodcock's Diary

Recent diary entries

ODK Entities for OpenStreetMap

Posted by spwoodcock on 4 April 2024 in English. Last updated on 13 April 2024.

ODK

For those that don’t know, ODK is an incredible suite of tools for field data collection.

The Field Mapping Tasking Manager (FMTM) leverages two of their tools to coordinate field tagging of map data:

  • ODK Central as the centralised server to store survey data.
  • ODK Collect as the mobile app for survey-based data collection on mobile phones (working very nicely in offline contexts too).

Entities

  • ODK Entities were introduced to Central in September 2023, in order to more easily track the same feature over time.
  • As a result, we have a nice way to store a feature, with geometry and properties, in ODK Central.
    • This could quite easily map to the OSM ID, feature geometry, and feature tags.
  • The geometry can then be selected in ODK Collect survey questions.
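
As a rough sketch of that mapping (not FMTM's actual code), the snippet below turns a GeoJSON building polygon into an Entity-like dictionary. The `osm_id` and `tags` property names, the label format, and the function names are all assumptions made for illustration. The geometry uses the ODK geo text format of "lat lon alt accuracy" points separated by semicolons.

```python
import json
from typing import Any


def geojson_polygon_to_geoshape(geometry: dict[str, Any]) -> str:
    """Convert a GeoJSON Polygon's outer ring to an ODK geoshape string.

    ODK geo fields store points as "lat lon alt accuracy", separated by ";".
    GeoJSON stores coordinates as [lon, lat], so the order is swapped.
    """
    if geometry["type"] != "Polygon":
        raise ValueError("Only Polygon geometries are handled in this sketch")
    outer_ring = geometry["coordinates"][0]
    # Altitude and accuracy are unknown here, so both are written as 0.
    return ";".join(f"{lat} {lon} 0 0" for lon, lat, *_ in outer_ring)


def osm_feature_to_entity(feature: dict[str, Any]) -> dict[str, Any]:
    """Map a GeoJSON feature (OSM ID, geometry, tags) to an Entity-like dict."""
    props = feature["properties"]
    osm_id = str(props["osm_id"])  # hypothetical property name
    return {
        "label": f"OSM {osm_id}",  # hypothetical label format
        "data": {
            "osm_id": osm_id,
            "geometry": geojson_polygon_to_geoshape(feature["geometry"]),
            # Tags are serialised to a single JSON string property here.
            "tags": json.dumps(props.get("tags", {}), sort_keys=True),
        },
    }
```

Storing the tags as one JSON string is only one option; each tag could equally be its own Entity property.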

select-from-map-polygon

How To Use Entities

Create an Entity List

Within ODK Central, a collection of Entities is called an Entity List (or dataset via the API).

Currently the only way to define an Entity List is to submit an XLSForm with an ‘entities’ tab defining the entity_id field:

entities-tab

And a ‘survey’ tab defining additional properties/fields for the Entities:

entities-survey-tab

Example generic Entity form

  • This will change in the future, with the possibility to create an Entity List / dataset via the Central API.

Populate Entities

Once the Entity List exists, you must populate the data:

Option 1: via the API. Manually uploading the Entity details.

Option 2: via CSV attachment. Entities can be uploaded as a CSV attachment to a form (in theory, although I have yet to make this work!).

Option 3: created automatically on submission by specifying a ‘geopoint’ or ‘geoshape’ field in your form (and collecting the location in the field).
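
For Option 1, a minimal sketch of building the upload request is shown below. The endpoint path and body shape (`uuid`, `label`, `data`) reflect my understanding of the ODK Central API and should be checked against the official API documentation before use. The function name, the example URL, and the bearer-token authentication are assumptions for illustration. The request is only constructed here, not sent.

```python
import json
import uuid
from typing import Any
from urllib.request import Request


def build_entity_request(
    base_url: str,
    project_id: int,
    dataset: str,
    label: str,
    data: dict[str, str],
    token: str,
) -> Request:
    """Build (but do not send) a POST request that creates one Entity."""
    # Assumed endpoint layout: verify against the Central API docs.
    url = f"{base_url}/v1/projects/{project_id}/datasets/{dataset}/entities"
    body: dict[str, Any] = {
        "uuid": str(uuid.uuid4()),  # client-generated Entity ID
        "label": label,
        "data": data,
    }
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the upload against a real server.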

Changes to FMTM

Until now, FMTM was frankly abusing ODK Central and had a structure like this:

FMTM Project 🔗 ODK Project

FMTM Task Area 🔗 ODK XForm (the survey)

FMTM Geometry 🔗 ODK XForm Attachments

This resulted in potentially hundreds of ODK XForms per project, which is not an ideal usage.

With Entities, the new mapping for FMTM is:

FMTM Project 🔗 ODK Project, ODK Entity List (Dataset), ODK XForm

FMTM Task Area 🔗 Group of ODK Entities (via task_id property)

FMTM Geometry 🔗 ODK Entity

This is a much more logical 1:1 mapping of the lowest level unit we are interested in: OSM features.
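
To illustrate the ‘group of Entities via task_id’ idea, here is a small sketch that groups Entity dictionaries by that property. The dictionary layout (`label` plus a `data` mapping) and the function name are assumptions; only the `task_id` property comes from the mapping above.

```python
from collections import defaultdict
from typing import Any


def group_entities_by_task(entities: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Return Entity labels keyed by the task_id stored on each Entity."""
    groups: dict[str, list[str]] = defaultdict(list)
    for entity in entities:
        groups[entity["data"]["task_id"]].append(entity["label"])
    return dict(groups)
```

With this layout a task area needs no ODK object of its own: it is simply whichever Entities share a `task_id`.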

A World of Possibilities

  • Now that Entities exist, it is possible to update Entity fields as features are mapped in the field.
  • For example, after tags have been added to an OSM feature in FMTM and mapping is complete, a ‘STATUS’ field could be updated for the Entity as ‘complete’. This would inform other users of ODK Collect that this building has already been mapped.
  • Even more intuitively, the feature could be ‘soft deleted’ after mapping, so that the geometry disappears entirely from other users’ phones. Or the Entity label could be updated with a big ✅.
  • There is a whole section on updating Entities from forms in the ODK documentation.
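
A hedged sketch of the status idea, working on plain dictionaries rather than a real ODK Central connection: the lowercase `status` property name, the dictionary layout, and both function names are assumptions for illustration.

```python
from typing import Any


def mark_mapped(entity: dict[str, Any]) -> dict[str, Any]:
    """Return a copy flagged as complete, with a tick prepended to the label."""
    updated_data = {**entity["data"], "status": "complete"}
    label = entity["label"]
    if not label.startswith("✅"):
        label = f"✅ {label}"
    return {**entity, "label": label, "data": updated_data}


def still_to_map(entities: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only the Entities that have not been marked complete."""
    return [e for e in entities if e["data"].get("status") != "complete"]
```

In a real deployment the update would come from a form submission or an API call, not from local dictionaries.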

So Where Does OpenStreetMap Come In?

  • The goal of FMTM is to add useful field-verified metadata (tags) to features on a map.
  • For example, we have the outline of a school, but wouldn’t it be great to know: how many stories it is, what the walls and roof are made from, how many people are typically inside the building during the day?
  • FMTM mostly uses OSM data for importing and updating these tags (it is also possible to use custom data that does not exist in OSM yet!).
  • ODK Entities are essentially a representation of these OpenStreetMap features that can be updated from detailed ODK surveys done in the field.
Location: Kalgodin, Ouagadougou, Kadiogo, Centre, Burkina Faso

Revolutionizing Field Mapping (with FMTM): Part 3

Posted by spwoodcock on 5 March 2024 in English. Last updated on 6 March 2024.

See pt1 of this series here.

See pt2 of this series here.

A Long Overdue Release

The goal with the Field Mapping Tasking Manager (FMTM) was to adopt an agile development style, making a new release around once per month.

3 months have passed since the last blog post and there was no FMTM release in between!

What gives??

Well, the team has been working extremely hard simplifying usage of FMTM and making it much more usable.

It’s been hard to find a good point to stabilise a release due to so many great and rapidly developed updates!

From this release onwards we plan to follow through with a new version increment every month. Look out for version 2024.4.0 next month.

Public Beta Now Live

  • The main goal we have been working towards is releasing a public beta this month.
  • The public beta for FMTM is now live on https://fmtm.hotosm.org
  • The idea is to have the public test out its functionality & definitely break a thing or two!
  • With your valuable feedback we want to make FMTM the best it can possibly be 🙏

splash

See full details of the release on Github

The main contributors to thank for this release are: @varun2948, @nrjadkry, @Sujanadh, @NSUWAL, @Prajwalism, @manjitapandey and an entirely new contributor @cordovez.

Using The Beta

Feel free to browse around and report any issues via Github or the HOTOSM Slack.

To create a new project, be sure to select the ‘FMTM Public Beta’ organisation while doing so, with the ‘Use Default ODK Credentials’ box checked.

default-org

This will grant you permissions normally reserved for organisation admins.

FMTM In Brief

For the uninitiated, I thought I would add a quick aside for how FMTM works:

  • Project is created in an area with three things:
    • Data extract: the features you want to map, say building polygons.
    • ODK XLSForm: the survey for mappers on the ground to fill out for each feature.
    • Task areas divided by feature count and linear features (e.g. rivers, roads).
  • Users assign a task area for themselves, and generate a QR code that is opened in ODK Collect.
  • User navigates to the feature and fills out the XLSForm survey, then submits.
  • The submissions are collected by ODK Central, which feeds the data back into FMTM, for cleaning, conflation with existing data, and pushing back to OSM.

As you can see, FMTM is built on top of OpenDataKit to actually collect the survey data about map features.

Conflation will be integrated in a future release through HOT’s conflator module by @robsavoye.

Release Highlights

There have been many, many changes in FMTM since the last release, a lot of them being refinements to the backend too.

Some of the less visible changes: fixing iOS usage, streamlining processes to reduce resource consumption, reducing code complexity for more sustainable development, implementing best practice security for login and session management, fully encrypted ODK credentials, using modern geospatial formats such as flatgeobuf for more efficient data usage and many more.

Full Integration of HOT’s raw-data-api

We have @kshitijrajsharma to thank mostly for this.

raw-data-api allows for near real-time data extraction from OSM, using an innovative data storage structure for very efficient usage.

When you click on ‘Generate Data Extract’ during project creation, this is the tool that works underneath.

raw-data-api

Streamlined Project Creation

Project creation has been significantly streamlined, especially for users that may not know what an ODK XForm is.

project-creation

User Roles

User roles have been implemented, but for the purpose of the beta are mostly disabled to allow full access to test.

Creation of organisations can be requested through a simple web form.

  • Organisation admin: can create projects.
  • Project admin: can modify projects and grant users permissions.
  • Field admin: has special rights to add mappers to a project and unblock tasks.
  • Validator: can validate and submit the final data to OSM.
  • Mapper: the default role for most.

In public projects, everyone has the role of mapper.

In private projects this must be explicitly granted.
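
One way to picture these rules is the sketch below. It assumes the roles form a strict hierarchy in which each higher role includes the rights of the ones below it. That ordering is my assumption and is not stated above, and none of these names come from the FMTM codebase.

```python
from enum import IntEnum
from typing import Optional


class Role(IntEnum):
    """Roles from the list above; the numeric ordering is an assumption."""

    MAPPER = 1
    VALIDATOR = 2
    FIELD_ADMIN = 3
    PROJECT_ADMIN = 4
    ORG_ADMIN = 5


def effective_role(granted: Optional[Role], project_is_public: bool) -> Optional[Role]:
    """Resolve a user's role: public projects default everyone to mapper."""
    if granted is not None:
        return granted
    return Role.MAPPER if project_is_public else None


def has_permission(role: Optional[Role], required: Role) -> bool:
    """True if the user holds the required role or a higher one."""
    return role is not None and role >= required
```

A user with no explicit grant resolves to `Role.MAPPER` on a public project and to `None` (no access) on a private one.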

Project UI

The project details page has a much improved user interface, with a listing of task activities and links to various information.

project-details

When the user is ready, they can click ‘Start Mapping’ to lock the task area and scan the QR code in ODK Collect:

qr-code

Submissions UI

A new shiny UI is being developed by the team at NAXA to view the final data submissions.

submissions-ui-1

submissions-ui-2

Project Editing

It is now possible to edit and delete existing projects.

project-edit

Monrovia Test

One of the first projects to use the FMTM Beta was the Liberia Slum Mapping Pilot Project.

Based in Monrovia, HOT and Slum Dwellers International (SDI), plus local partners FOLUPS and YMCA, undertook a mapping campaign for temporary urban settlements, which was very valuable in providing feedback on FMTM use cases.

Two Crucial Insights

  • Not all field mappers are necessarily OSM’ers, so we need to adapt.
  • Many projects in developing countries will not have the features mapped in OSM already, so we first need to get imagery (drone ideally), and integrate HOT’s Tasking Manager to map the features prior to on-the-ground tagging.

Note this is a perfect use case for AI-assisted mapping.

Drone Imagery Coverage

monrovia

Mapping Area

morovia-task-area

Generating Polygons From Drone Imagery

building-polygons

Get Involved

Contributing to FMTM with code, documentation, or ideas would be very welcome!

Software developers, technical documentation writers, software testers, or anyone with a general interest - feel free to get in contact.

Even just adding a star on Github helps to show your support 🙌

Future Posts

This was an update on some of the latest features of FMTM and future plans.

There is much that I could not cover, so please check Github for the latest updates!

I plan to make future posts about developments, technical deep dives, and usage of FMTM’s features.

See you next time.

Location: Patte d'Oie, Ouagadougou, Kadiogo, Centre, Burkina Faso

Leveraging PostGIS to Write Your FlatGeobuf Files

Posted by spwoodcock on 7 December 2023 in English. Last updated on 21 February 2024.

To GDAL or not to GDAL

GDAL is an incredible geospatial library and underpins so much of what we do, including our databases (PostGIS).

However, sometimes it might be a bit heavyweight for what we are trying to achieve.

Installing it as a base system dependency inevitably installs everything - there are no options.

image

Install size is especially important when building container images, that we want to be as small as possible for distribution.

GDAL in PostGIS

PostGIS uses GDAL for most of its geospatial processing, including reading and writing various geospatial file formats.

FMTM is starting to use FlatGeobuf format for various purposes (OSM data extracts, storing submissions).

It also uses a PostGIS database as part of the software stack.

So today I thought: why not just use the geospatial processing built into PostGIS for reading and writing flatgeobuf data?

The solution was surprisingly painless!

Database Access

First we need a way to access the database.

FMTM is using FastAPI and SQLAlchemy, so ideally we want to pass through and reuse the database session created when an endpoint is accessed.

To make this standalone, I also added functionality to create a database engine from scratch.

image

The nitty-gritty SQL

image

The function requires a FeatureCollection geojson.
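
Since the input has to be a FeatureCollection, a small helper can wrap other GeoJSON objects before they are passed in. This is a sketch on plain dictionaries, not the `geojson` package types, and `to_feature_collection` is a hypothetical helper name, not part of FMTM.

```python
from typing import Any

GEOMETRY_TYPES = {
    "Point",
    "MultiPoint",
    "LineString",
    "MultiLineString",
    "Polygon",
    "MultiPolygon",
    "GeometryCollection",
}


def to_feature_collection(obj: dict[str, Any]) -> dict[str, Any]:
    """Wrap a bare geometry or single Feature into a FeatureCollection."""
    geojson_type = obj.get("type")
    if geojson_type == "FeatureCollection":
        return obj
    if geojson_type == "Feature":
        return {"type": "FeatureCollection", "features": [obj]}
    if geojson_type in GEOMETRY_TYPES:
        feature = {"type": "Feature", "properties": {}, "geometry": obj}
        return {"type": "FeatureCollection", "features": [feature]}
    raise ValueError(f"Not a recognised GeoJSON object: {geojson_type!r}")
```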

I’m sure there is a much more efficient way to write this by nesting SQL SELECTs, but I was too lazy to debug and I find this approach quite readable, albeit slightly less efficient.

Using the code

An example of using in FastAPI:

image

Limitations

There is one glaringly obvious limitation of this approach: if reading the FlatGeobuf is implemented in the same way then we lose the benefit of its ‘cloud native’ encoding.

Reading requires downloading the entire file, passing to PostGIS, and returning a GeoJSON.

However, that was not the intended purpose of this workaround.

FlatGeobuf is primarily a format meant for browser consumption, with excellent support via the npm package.

So while the backend API can write data to FlatGeobuf without requiring dependencies, the frontend can then read the data if it’s hosted somewhere online (e.g. an S3 bucket).

Code

Apologies for the code screenshots: OSM Diaries does not support code syntax highlighting, nor spaces in code blocks.

Database code

import logging
from typing import Union

from sqlalchemy.engine import create_engine
from sqlalchemy.orm import Session

# Module-level logger used for the error message below
log = logging.getLogger(__name__)

def get_engine(db: Union[str, Session]):
	"""Get engine from existing Session, or connection string.
	If `db` is a connection string, a new engine is generated.
	"""
	if isinstance(db, Session):
		# Reuse the engine/connection the session is already bound to
		return db.get_bind()
	elif isinstance(db, str):
		return create_engine(db)
	else:
		msg = "The `db` variable is not a valid string or Session"
		log.error(msg)
		raise ValueError(msg)

SQL code

from geojson import FeatureCollection
from sqlalchemy import text
from sqlalchemy.orm import Session

def geojson_to_flatgeobuf(db: Session, geojson: FeatureCollection):
	"""From a given FeatureCollection, return a memory flatgeobuf obj."""
	sql = f"""
		DROP TABLE IF EXISTS public.temp_features CASCADE;
		CREATE TABLE IF NOT EXISTS public.temp_features(
			id serial PRIMARY KEY,
			geom geometry
		);
		WITH data AS (SELECT '{geojson}'::json AS fc)
		INSERT INTO public.temp_features (geom)
		SELECT
			ST_AsText(ST_GeomFromGeoJSON(feat->>'geometry')) AS geom
		FROM (
			SELECT json_array_elements(fc->'features') AS feat
			FROM data
		) AS f;
		WITH thegeom AS
		(SELECT * FROM public.temp_features)
		SELECT ST_AsFlatGeobuf(thegeom.*)
		FROM thegeom;
	"""
	# Run the SQL
	result = db.execute(text(sql))
	# Get a memoryview object, then extract to Bytes
	flatgeobuf = result.fetchone()[0].tobytes()
	# Cleanup table
	db.execute(text("DROP TABLE IF EXISTS public.temp_features CASCADE;"))
	return flatgeobuf
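
Because the function returns raw bytes, a cheap sanity check can be run on the result before it is stored or uploaded. The sketch below relies on my reading of the FlatGeobuf spec: files start with eight magic bytes, `fgb`, a major version byte, `fgb` again, then a patch version byte. `looks_like_flatgeobuf` is a hypothetical helper name.

```python
def looks_like_flatgeobuf(data: bytes) -> bool:
    """Check the 8 magic bytes at the start of a FlatGeobuf payload.

    The version bytes (index 3 and 7) are deliberately not compared, so
    files written by different spec versions still pass.
    """
    return len(data) >= 8 and data[0:3] == b"fgb" and data[4:7] == b"fgb"
```

This only confirms the header, not that the features inside are valid.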

Location: Kampung Padang, Kampung Bharu, Kuala Lumpur, 50400, Malaysia

Revolutionizing Field Mapping (with FMTM): Part 2

Posted by spwoodcock on 29 November 2023 in English. Last updated on 6 December 2023.

See pt1 of this series here.

First Release

We have been running various versions of FMTM for some time now, but released our first official version not long ago: v0.1.0.

FirstRelease

See the excellent release notes written by Susmina Manandhar, product manager for FMTM, for further details of features, bug fixes, and improvements.

Just to reiterate what a wonderful team of devs we have working on the project: @varun2948, @nrjadkry, @Sujanadh, @NSUWAL.

Special thanks to @JoltCode, a volunteer, who helped to modernise our frontend build tools. And to @robsavoye for his work on osm-fieldwork and osm-rawdata that underpin FMTM, plus guidance from the start.

Try it out!

I have been working to ease the installation process for organisations that need to run FMTM.

You can run a version of FMTM yourself with two commands!

curl -L https://get.fmtm.dev -o fmtm.sh

bash fmtm.sh

You will be prompted with a command line interface.

Although HOT will endeavour to run a ‘global’ instance of FMTM, in a similar vein to Tasking Manager currently, my main goal is that FMTM is so simple to deploy, that a workflow such as this can be envisioned:

  1. Requirement for field mapping arises.
  2. Organisation spins up an instance of FMTM.
  3. Project manager creates a project, divides tasks.
  4. Team goes out to map the features.
  5. Data is validated, conflated with existing OSM data, and exported / uploaded.
  6. FMTM instance is shut down.

Technical aside / mad idea section:

To run FMTM, currently all you need is a server and a domain name.

In the future, this could be streamlined even further with a technology such as Headscale, creating a virtual private mesh network for mappers to connect to.

HOT could potentially run the Headscale control plane, so all project managers would need to do is run FMTM on their laptop/pc, configure the VPN for their users, then connect phones to the VPN (using an easy mobile app) for mapping!

Mobile First

We realised during various tests with end users that the UI was far from ideal to operate via mobile phone.

Significant updates have been made to the UI design, adopting a mobile-first approach to ease field usage for mappers on the ground.

MobileView

We now also support map geolocation and orientation.

The project creation workflow has also undergone significant upgrades.

ProjectCreation

MBTile Output

Often mapping must be undertaken in areas with limited connectivity.

To still use a basemap to locate yourself, the MBTile format is useful. MBTiles can be generated directly in FMTM for download and use within ODK Collect.

MBTiles

Note: we are working with the developers of ODK Collect to make the process of loading MBTiles into the app a more seamless process.

Improved Deployment Process

As mentioned, HOT plans to maintain a series of FMTM servers for usage by the public.

In our newly improved deployment process, this is as follows:

The purpose behind having a staging server open to the public is to gather feedback and run public tests of new features.

Those wishing to use FMTM for a critical project should use the production server.

However, if you wish to have the latest features on FMTM as quickly as possible (the development pace is quite fast), but are willing to encounter the occasional bug, then feel free to use the staging server (and please provide feedback!).

The staging version should be updated every two weeks.

The stable version should be released every month.

Get Involved

Contributing to FMTM with code, documentation, or ideas would be very welcome!

Software developers, technical documentation writers, software testers, or anyone with a general interest - feel free to get in contact.

Future Posts

This was an update on some of the latest features of FMTM and future plans.

There is much that I could not cover, so please check Github for the latest updates!

I plan to make future posts about developments, technical deep dives, and usage of FMTM’s features.

See you next time.

Location: Prima Tanjung Business Centre, Tanjong Tokong, George Town, North-East, Penang, Malaysia

Vector Tile File Formats

Posted by spwoodcock on 18 September 2023 in English.

Storing map tiles in a single file is a common way to load basemaps on a map client.

There are a few formats available to do this, with different use cases.

Offline

mbtiles

  • A format created by Mapbox, but with a fully open spec.
  • Essentially an SQLite database linking to embedded tiled images.
  • The client interfaces with the database and loads each tile as required by the basemap.
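
To make the ‘loads each tile as required’ step concrete, here is a minimal sketch of a tile lookup using Python’s built-in sqlite3 module. It assumes the standard `tiles` table (or view) from the MBTiles spec, and that the caller passes XYZ (‘slippy map’) coordinates, which must be flipped because MBTiles stores rows in the TMS scheme. `get_tile` is a hypothetical helper name.

```python
import sqlite3
from typing import Optional


def get_tile(conn: sqlite3.Connection, z: int, x: int, y: int) -> Optional[bytes]:
    """Fetch one tile by XYZ coordinates, or None if it is not in the file."""
    # MBTiles uses TMS rows (origin at the bottom), so flip the XYZ y value.
    tms_row = (1 << z) - 1 - y
    row = conn.execute(
        "SELECT tile_data FROM tiles "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (z, x, tms_row),
    ).fetchone()
    return row[0] if row else None
```

The connection would normally come from `sqlite3.connect("basemap.mbtiles")`, where the filename is just an example.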

OsmAnd SQLite

  • Based on BigPlanet SQLite format.
  • Basically the same as mbtiles, but a slightly different database schema.

A small aside.

Sometimes it’s necessary to generate both mbtiles and OsmAnd format to view in different software, which is a pain.

There is an open issue in OsmAnd to support mbtiles format, but it’s not a priority for now.

Knowing that they are very similar file formats, I considered the possibility of accessing one SQLite database via another ‘wrapper’ SQLite database in a custom view. This view would map tables and fields from one database schema to the other, eliminating the need to store both tilesets for the same data.

Assuming you have an MBTiles table with the following schema:

CREATE TABLE mbtiles_table (
	zoom_level INTEGER,
	tile_column INTEGER,
	tile_row INTEGER,
	tile_data BLOB
);

And you want to create a view for an OsmAnd SQLite table with a schema like:

CREATE TABLE osmand_table (
	_id INTEGER PRIMARY KEY AUTOINCREMENT,
	x INTEGER,
	y INTEGER,
	z INTEGER,
	tile_data BLOB
);

You can create a view to convert between them like this:

CREATE VIEW osmand_mbtiles_view AS
SELECT
	NULL AS _id, -- Use NULL for auto-increment _id
	tile_column AS x,
	tile_row AS y,
	zoom_level AS z,
	tile_data
FROM mbtiles_table;

I couldn’t get this to work when testing, however (it may warrant further investigation).

If you find a solution, please do let me know!
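
One possible cause, offered as a guess and not as a tested fix: the two formats number tiles differently. MBTiles rows follow the TMS scheme (y counted from the bottom), and BigPlanet-style files are, as far as I know, expected to store zoom as 17 minus the zoom level. The sketch below builds a view with both conversions applied, using the simplified table and column names from the schemas above (the `_id` column is dropped). Whether OsmAnd will read tiles from a view and not a real table remains untested.

```python
import sqlite3


def create_osmand_view(conn: sqlite3.Connection, invert_zoom: bool = True) -> None:
    """Create a view exposing MBTiles rows with OsmAnd-style tile numbering."""
    # Assumption: BigPlanet-style files store zoom as 17 - z.
    zoom_expr = "(17 - zoom_level)" if invert_zoom else "zoom_level"
    # The y expression flips TMS rows (origin bottom) to XYZ rows (origin top).
    conn.execute(
        f"""
        CREATE VIEW osmand_mbtiles_view AS
        SELECT
            tile_column AS x,
            ((1 << zoom_level) - 1 - tile_row) AS y,
            {zoom_expr} AS z,
            tile_data
        FROM mbtiles_table
        """
    )
```

Setting `invert_zoom=False` keeps the zoom level unchanged, for files that do not use the inverted numbering.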

Online

PMTiles

  • A neat new format specifically aimed at cloud-optimising vector tile access (accessing an mbtile file over the web is very inefficient, full details).
  • Easily handles both large planet-scale datasets with millions of tiles and small-scale datasets. As a single file it is perfect for S3 object storage.
  • Uses HTTP RANGE requests to only download the tiles specified in a BBOX (not the entire file).
  • Compression, tile deduplication (no need to repeat that blue ocean tile…), an optimised internal structure to minimise size and number of requests when panning or zooming, and minimal overhead when requesting tiles (tiny initial request).
  • For public deployments it is recommended to run behind a CDN to both cache tile requests, and act as a proxy to a private S3 bucket (anonymous direct file download from S3 may incur large costs).
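
To show what a range request looks like in practice, the sketch below builds (without sending) a request for just the first bytes of a remote PMTiles file, and checks the magic bytes of a header. It assumes the version 3 layout as I understand it: a fixed 127-byte header starting with the ASCII text `PMTiles` followed by one version byte. The function names and example URL are hypothetical.

```python
from urllib.request import Request

PMTILES_HEADER_LENGTH = 127  # fixed header size in the v3 spec


def range_request(url: str, offset: int, length: int) -> Request:
    """Build a GET request for `length` bytes starting at `offset`."""
    # HTTP byte ranges are inclusive at both ends.
    last_byte = offset + length - 1
    return Request(url, headers={"Range": f"bytes={offset}-{last_byte}"})


def pmtiles_version(header: bytes) -> int:
    """Return the spec version from a PMTiles header, or raise ValueError."""
    if len(header) < 8 or header[:7] != b"PMTiles":
        raise ValueError("Not a PMTiles header")
    return header[7]
```

A real client follows the header with further range requests for the tile directory and then for individual tiles, so the full file is never downloaded.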

So What Should I Choose?

If the layer (likely a basemap) needs to work offline, then:

  • SQLite if the tool/app supports it, e.g. OsmAnd.
  • mbtiles for tools that require it, such as ODK Collect.

If working online, then PMTiles may be best:

  • A replacement for XYZ basemap tile servers (great for reducing load on the OpenStreetMap servers 🙏).
  • Creating custom basemaps from imagery and/or OSM exports.

Converting Between Formats

mbtiles –> OSMAnd

  • This is relatively easy due to both being SQLite files at their core.
  • The excellent Python utility by @tarwirdur mbtiles2osmand does this quite efficiently.

python mbtiles2osmand.py INPUT.mbtiles OUTPUT.sqlite3

  • I also ported this to Golang mb2osm, but have some work to do on improving performance. Feel free to contribute!

The advantage of using Golang here is to produce a statically compiled binary. This means that the single file does not require any external dependencies, or interpreter to run (unlike Python), making it more portable.

mbtiles –> pmtiles

  • The best choice for this would be go-pmtiles, by the creator of PMTiles.
  • Again, a single file binary program that can convert in one command.

pmtiles convert INPUT.mbtiles OUTPUT.pmtiles

Other formats –> pmtiles

  • In cases where you have other formats to convert first, e.g. directly from a database, GeoJSON, etc, tippecanoe (>v2.17) is the recommended tool.
  • The official example:

ogr2ogr -t_srs EPSG:4326 cb_2018_us_zcta510_500k.json cb_2018_us_zcta510_500k.shp

# Creates a layer in the vector tiles named “zcta”

tippecanoe -zg --projection=EPSG:4326 -o cb_2018_us_zcta510_500k_nolimit.pmtiles -l zcta cb_2018_us_zcta510_500k.json

Location: Kailash Chok, Lazimpat, Kathmandu-02, Kathmandu, Kathmandu Metropolitan City, Kathmandu, Bagmati Province, 21255, Nepal

Revolutionizing Field Mapping (with FMTM): Part 1

Posted by spwoodcock on 16 September 2023 in English. Last updated on 18 September 2023.

Manual Mapping

Since the inception of the Missing Maps project in 2014, the global community has achieved remarkable progress in digitally mapping communities that were once poorly mapped or entirely unmapped.

Shout-out to the incredible volunteers that contribute to this during regular mapathons🙏

mapathon

AI-assisted Mapping

HOTOSM’s fAIr project is on the verge of making this a reality. With an emphasis on open-source (ethical, responsible) models, region-specific training data, and iterative feedback, the entire globe will be mapped at record speed.

Watch this space.

fair

The Future of Mapping

With remote digitisation covered, there’s still a missing piece in the puzzle: attaching OSM tags to these features.

For that we need field-based verification.

Incorporating tags into the mapping data is crucial for enhancing its usability. Once collected, this data can be used in a wide range of projects, ranging from census data collection to humanitarian response.

Many of these tags would be difficult to extract without in-person verification: amenity type, number of floors, building materials, etc.

tags

Cool, tell me about FMTM then

While we already have field data collection tools, the real game-changer lies in having software for coordinating field mapping efforts.

The Field Mapping Tasking Manager (FMTM) does just that.

Mapping areas can be subdivided appropriately (i.e. not crossing rivers, or busy roads, and of suitable size), then allocated to a team of mappers in the field.

fmtm

The requirements for which tags should be collected during a mapping campaign can be customised using XLSForms (more on that topic in another post).

This was recently made possible by leveraging the newly developed “select from map” functionality of OpenDataKit (ODK), where mappers can select specific buildings to collect their data for.

More Details

Attention should be drawn to the fantastic open-source tools in this ecosystem such as StreetComplete. If you are mapping solo, StreetComplete may be the best tool. The main problem solved by FMTM is coordinating collaborative mapping.

Components

  • Frontend: React and OpenLayers, as a PWA.
  • Backend: FastAPI Python server.
  • Database: all data is stored and georeferenced in PostGIS.
  • Mobile data collection: ODK Collect is used for the actual field tagging.
  • Data server: Collect connects to an instance of ODK Central to aggregate and store data before processing.

Status Update

We have an alpha version of FMTM currently running mapping campaigns in a few different countries.

The workflow needs refinement and the codebase is constantly being updated.

Feel free to run it yourself and try it out!

Current features:

  • OSM login for attribution of edits.
  • Task boundary splitting using PostGIS.
  • Custom XLSForm usage for the ODK Collect questionnaire.
  • User access control and organisation management.
  • MBTile basemap generation for offline mapping.
  • Extraction of generated data in OSM format.
  • Remote connection to JOSM for advanced editing.
  • Real-time monitoring of task mapping status.

Collaborators

  • The task splitting algorithm was developed from a combination of internal HOT work & a hackathon organised by our partner in Nepal: NAXA.

  • NAXA has been helping to build out a large portion of the FMTM codebase.

  • Contributors from all around the world, as an open-source project.

Get Involved

Full disclaimer, I am currently employed by HOT to work on FMTM.

I made it here through my volunteer work for the project during the Turkey and Syria earthquakes in February this year.

Contributing to FMTM with code, documentation, or ideas would be very welcome - we are quite a friendly bunch!

Please also consider contributing to Nafundi’s excellent ODK ecosystem, without which this work would not be possible.

Future Posts

This was a high level overview of FMTM.

I plan to make future posts about developments, technical deep dives, and usage of FMTM’s features.

See you next time.

Location: Saphan Khwai, Phaya Thai Subdistrict, Phaya Thai District, Bangkok, 10400, Thailand