
Revolutionizing building import in Poland with AI

Posted by NorthCrab on 24 July 2023 in English. Last updated on 15 August 2023.

🗺️🦀 Hello to the OpenStreetMap community,

I am happy to announce my latest project, osm-budynki-orto-import — a fully autonomous building import tool currently in operation in Poland. This is my next step towards making OpenStreetMap (OSM) a more dynamic and efficient platform.

Dataset preview

This tool is designed to make the building import process simpler and more accurate. The system uses official building data in conjunction with orthophoto imagery to validate the data before importing it.

At the heart of this tool lies an advanced computer vision model with a precision as high as 99.7%. In my opinion, this exceeds what an average mapper achieves, providing a faster and more reliable way of mapping structures.

One unique feature of this tool is its ability to cross-reference historical OSM data to prevent re-importing previously deleted buildings. This ensures that once a building is removed from OSM, it will not reappear in future automatic imports.
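The post does not describe how this check is implemented, so the following is only a minimal sketch of the general idea, not the project's actual code. It assumes each building can be reduced to a bounding box and that a list of footprints of previously deleted buildings is already available; `BBox` and `filter_previously_deleted` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box in degrees (hypothetical helper type)."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes overlap when they overlap on both axes.
        return (
            self.min_lon < other.max_lon
            and other.min_lon < self.max_lon
            and self.min_lat < other.max_lat
            and other.min_lat < self.max_lat
        )


def filter_previously_deleted(candidates: list[BBox], deleted: list[BBox]) -> list[BBox]:
    """Keep only candidates that do not overlap any previously deleted building."""
    return [c for c in candidates if not any(c.intersects(d) for d in deleted)]


# Assumed example data: one candidate overlaps a deleted footprint, one does not.
candidates = [BBox(0.0, 0.0, 1.0, 1.0), BBox(5.0, 5.0, 6.0, 6.0)]
deleted = [BBox(0.5, 0.5, 1.5, 1.5)]
print(filter_previously_deleted(candidates, deleted))
```

A real implementation would more likely compare actual polygons and use a spatial index rather than a pairwise loop, but the filtering step has the same shape.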

The primary goal of this project is to significantly cut down on monotonous and repetitive tasks, freeing up the mappers to focus on intricate and high-value mapping assignments.

In the spirit of promoting transparency and collaboration, the complete project has been open-sourced on GitHub. Along with the codebase, I have also shared a CVAT-compatible dataset with 6000 classification entries, a resource which could be highly beneficial to other developers.

As always, your feedback plays a crucial role in the development and refinement of the project. If you find value in this project and wish to support its ongoing maintenance, you can find my donation information at https://monicz.dev/#support-my-work.

Give it a star on GitHub ⭐️.

Discussion

Comment from iWowik on 25 July 2023 at 05:50

99.7%

Good number. But as a physicist I know that all values should have units of measurement and an accuracy.

For example 99.7+-30%

And how is this value obtained?

Comment from NorthCrab on 25 July 2023 at 10:06

@iWowik I think you meant to ask for a confidence interval (based on the +-30% number you added). Percentage is unit-less and accuracy in this case is irrelevant as precision is more informative about the false positive error rate.

This number is the precision achieved on the holdout dataset, which contains 600 entries: 500 buildings (True) and 100 non-buildings (False). Using the method described at https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval, the 90% confidence interval is 99.3%-100%. It’s been some time since I last did this, so please excuse any mistakes. This is really not my field of expertise at the moment, and I might be completely off here.

The holdout validation code starts from here: https://github.com/Zaczero/osm-budynki-orto-import/blob/43cce7c88c898939fafb8f24087882ee4f88c0e9/model.py#L115.
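For readers who want to reproduce this kind of interval, here is a minimal sketch using the Wilson score interval, one of several methods listed on the linked Wikipedia page. The comment does not say which method was used or give the underlying counts, so the numbers below are purely hypothetical. Note that the denominator for precision is the number of positive predictions (true positives plus false positives), not the full 600-entry holdout set.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, trials: int, confidence: float = 0.90) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    # Two-sided z-score, e.g. about 1.645 for a 90% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts, NOT taken from the project's holdout results.
true_positives = 350
false_positives = 1
predicted_positive = true_positives + false_positives

low, high = wilson_interval(true_positives, predicted_positive, confidence=0.90)
print(f"precision = {true_positives / predicted_positive:.4f}")
print(f"90% interval = [{low:.4f}, {high:.4f}]")
```

Different methods (normal approximation, Wilson, Clopper-Pearson) give somewhat different bounds when the proportion is this close to 100%, so the result need not match the figures quoted above.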

Comment from simonschaufi on 25 July 2023 at 10:32

Hi, this is really cool. Are you aware of RapidEditor? I’ve written a blog post about it and how I map buildings in New Zealand with it: https://www.openstreetmap.org/user/simonschaufi/diary/401879

Comment from NorthCrab on 25 July 2023 at 10:38

@simonschaufi Yes, I am :-) But the primary goal here is slightly different. I want to create solutions that require little to no human intervention, so that mappers can be significantly more productive instead of doing the same repetitive tasks over and over again. RapiD still requires full human attention for every change, which makes it a more universal and very low-risk solution.

Comment from NorthCrab on 30 July 2023 at 14:10

“What do you even mean by accuracy?” I primarily mean the precision metric, but I use simpler terminology that everyday people understand better. The scoring code starts here: https://github.com/Zaczero/osm-budynki-orto-import/blob/e899e4c2e14bced34fde4be02c4bb9b674381b25/model.py#L143

“What is your IoU threshold?” None. The dataset is labeled in a way that answers the question: “is this building acceptable for an import as-is?”. The idea is to automatically import buildings which, from a human perspective, don’t require much modification.

With this approach, the model is able to import about 70% of valid buildings as-is. The remaining 30% require some modification from the model’s perspective and are not imported automatically. For clarity, 100% would be all buildings that are visible in the orthophoto imagery.
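To make the terminology in this thread concrete, here is a minimal sketch of how precision, recall, and accuracy differ for a binary "import as-is?" classifier. The 70% figure above reads like a recall-style number, though that is my interpretation rather than something the comment states. The confusion-matrix counts are invented for illustration and are not the project's actual results; `classification_metrics` is a hypothetical helper.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute precision, recall and accuracy from confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0 or tp + fp == 0 or tp + fn == 0:
        raise ValueError("counts must allow every metric to be defined")
    return {
        # Of everything the model accepted, how much was truly acceptable?
        "precision": tp / (tp + fp),
        # Of everything truly acceptable, how much did the model accept?
        "recall": tp / (tp + fn),
        # Share of all decisions (accept or reject) that were correct.
        "accuracy": (tp + tn) / total,
    }


# Invented counts for illustration only (NOT the project's confusion matrix).
metrics = classification_metrics(tp=350, fp=1, fn=150, tn=99)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With counts like these, precision stays very high while accuracy is much lower, because every acceptable building the model declines to import counts as an error for accuracy but does not affect precision. For an automatic import, a wrongly imported building (false positive) is the costly mistake, which is why precision is the metric emphasized in the thread.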

Comment from NorthCrab on 30 July 2023 at 14:10

^ Sorry for poor formatting, something has gone wrong.
