OpenStreetMap

qeef's Diary

Recent diary entries

The damn project helps mappers by dividing some big area into smaller squares that a human can map. This diary is about the work done in the third year since the publication.

And, to be honest, not much has been done. I had little time this year. Still, there are some interesting improvements.

Deployment and server

I will start with probably the most boring stuff: I worked on the documentation and tests. That is thankless work, but I believe it pays off in the longer term. In short: 35 files changed, 1675 insertions(+), 842 deletions(-), and you, as a mapper, should not see any difference before and after.

In parallel, I worked on refactoring the deploy. I have already moved some upkeep procedures and I will slowly continue the work.

Notathons

The most motivating thing for me, I think, is feedback with a request like “hey, we are working on this and we need that”. This time it came from the guys organizing notathons. We improved the damn plugin for JOSM to download notes automatically and periodically. Also, when using the plugin, the changeset comment is (finally!) set automatically based on the area information.

Get inspired by issues of similar projects

I have already written about the damn project point of view on some issues. From that diary, I think that this issue is solved by the option to Review newbie, or by the Map or review work of other mappers workflow.

Because there are not many issues with the damn project (I do not complain!) I sometimes look up interesting issues somewhere else. So, what is there?

The first two, here and here, deal with locking of multiple tasks (squares in damn) for mapping or validation (review in damn). Locking of multiple squares goes against the principle of “divide and map” and is therefore problematic, but there are valid use-cases. When you need to map multiple squares, merge them first (in the mappy client). You can do the same with the squares you want to validate, but it is probably a better idea to set the mapper whose work you want to review in the damn plugin for JOSM instead.

That second option is interesting. I have slightly extended the server’s API (CreateCommit it is), so when sending requests to map or review, it is possible to specify the mapper’s name. And it works for all types of requests. I mean… probably the most usable is “review recent [square] of mapper’s name”. But you can also “map nearest [square] of mapper’s name”. Just any combination of map/review recent/oldest/random/nearest can contain the mapper’s name.

Split task for validators is a perfectly valid idea, so I have implemented it. The interesting part here is what it took to implement it: I had to change four lines of code in a single file. Good design matters, in my opinion.

The last of the interesting issues is Revert All Tasks State by Specific Username. I will not implement such functionality. However, I am willing to revert a mapper’s work manually when it is reported by at least two validators with significant contribution to the area. By reverting a mapper’s work I mean adding new commits of the to map (or other) type to the specific area. It will be a single SQL query anyway.

Please, note that I have chosen only the issues I think are interesting and worth implementing.

Work for 2023

I will continue the work on the deploy repository; its current state is inconsistent with the documentation. I think I will perform another round of load testing next year, just to check that the performance is still good.

I need to work on web clients. I will start with the client for (first-time) mappers, for finished areas, or with the manager. The goal for the next year is to distinguish clients by what mappers need instead of what clients can do. When clients are ready and the wording somehow fixed, there are still translations to do.

That’s it. If the damn project helps you, feel free to use it. Keep mapping!

The inspiration for this diary comes from the email sent to the HOT mailing list. I must say that I’m not involved in Missing Maps anymore, so I don’t currently use the HOT Tasking Manager (TM). Also, I’m the author of the competing project. That’s the disclaimer.

I will start with the point at which the HOT Tasking Manager became unacceptable for me. That was when mappers began to be forced to provide their email addresses. The reason was: just 4% of the mappers shared it.

The purpose of a Tasking Manager is to divide a big area into smaller squares that a human can map. Then, let mappers communicate what they are working on by changing the states of the squares. So, TM helps with group mapping management.

However, the changes go to the OpenStreetMap. You don’t have to use Tasking Manager to update OpenStreetMap, but you (must) use OpenStreetMap when working with a Tasking Manager. OpenStreetMap itself provides communication channels for mappers, particularly changeset discussions and private messages.

Changeset discussion is used to discuss changes mappers do in the OpenStreetMap. Private messaging is used to send direct messages between mappers. In both cases a notification is sent by email.

We are almost there. So, why is the communication within the HOT Tasking Manager wrong? Because the HOT TM duplicates the communication about things it does not manage – changes to the OpenStreetMap. Because it allows group and automated messages/emails that are, by definition, depersonalized. Because it confuses beginner mappers about which communication channels are really important.

I would like to end with a proposal for the HOT Tasking Manager developers. Please, keep the functionality of the HOT Tasking Manager non-overlapping with the OpenStreetMap. Please, do leverage OpenStreetMap for the rest.

This diary post is inspired by the cleaning up after a task manager task. It shows how to do the clean-up steps for a Divide and map. Now. area.

Data quality matters. The proposal in Johnwhelan’s diary is to run the duplicated building script and the JOSM validator on the whole area when the area is finished. The rest of his diary deals with how to get the area’s border geometry and the OpenStreetMap data into JOSM.

Here are the steps to download the area’s geometry and the OpenStreetMap data when using the damn project:

  1. Load the area’s geometry by navigating to the area in the JOSM damn plugin, then click the Get area geometry button.

    It’s also possible to navigate to the area in the mappy (web) client, right-click on an arbitrary square, choose download area geojson, and open the downloaded file in JOSM. It’s a good idea to right-click on the created area.geojson layer and choose Convert to GPX layer to make it read-only, which gives the same result as using the JOSM damn plugin.

  2. Download the data from the Overpass API, which is the second tab of the Download map data … dialog. You can get the Overpass query by navigating to the area in the mappy (web) client, right-clicking on an arbitrary square, and choosing area overpass query. Then copy the query to the Overpass query: field of JOSM’s Download from Overpass API tab and click Download. Be sure you have enough RAM for big areas.

That’s all. It’s kind of funny that I’m writing this diary just a few days after my damn project developer’s “annual report”, but I didn’t make it sooner.

It’s two years since the Divide and map. Now. has been published. I would like to summarize the second year of the development.

What is it about? Divide and map. Now. – the damn project – helps mappers by dividing some big area into smaller squares that a human can map.

Why should I care? Divide and map. Now. is proven to handle a mapathon with 200 mappers. There are four clients available for mappers and multiple mapping workflows. There is a deployment guide for admins. You may create new or modify existing areas, use RSS to track areas’ changes, and check abandoned or overlapping areas with Python3 scripts.

In 2021 I’ve refactored the server and load tested it. There is the API documentation that has been stable for more than half a year now. The web clients were also refactored: the light is a text-only web client for beginners, the panel is for advanced mappers and looks like it’s integrated into the iD editor, and the mappy web client has a square-based graphical interface. The web clients include improved statistics that can show OpenStreetMap contributors that haven’t used Divide and map. Now. when mapping in the same area. The damn JOSM plugin was updated to the new API and loads temporary data stored on the server when available.

I’ve got inspired by the Tanga Building Footprints Import, mapping highways’ radars and mirrors, and documented the mapping workflows available in the different clients.

I’ve implemented a Python3 client with the scripts to find abandoned or intersecting areas based on the potential HOT tasking manager improvements and looked at the competing HOT Tasking Manager and SimpleTaskManager issues from the damn project point of view.

I’ve created the finished areas read-only service and announced the policy for finished areas.

There is some work for 2022. Web clients need translation, but I want to stick to the damn project philosophy when implementing it. Also, I want to refactor the damn deploy. Of course, the refactored guide must be at least as simple as the current one.

HOT Tasking Manager (TM), SimpleTaskManager (STM), and Divide and map. Now. (damn) are tools for collaborative mapping, with different philosophies and different approaches. In this diary, I discuss some issues of the first two from the perspective of the third one.

DISCLAIMER: I’m the damn project developer.

Which issues to consider: I filter out bug reports and issues that oppose the damn project philosophy. Then, I pick up issues I think are interesting and categorize them, describe the category, and provide some comments how the particular issues are or would be solved.

About naming: TM’s and STM’s project corresponds to damn’s area. TM’s and STM’s task corresponds to damn’s square. Where TM uses validation, damn uses review.

Solved by design

There are many issues of both managers that wouldn’t be issues for the damn project because of its design. Let’s quickly recap that design.

There are areas divided into squares. Any change to an area is stored as a commit. So, creating a new area leads to creating new squares as well as a new commit with the to map type for each square. During collaborative mapping, new commits of different types like locked or done are added with area and square identifiers, always updating a particular square of the area. It’s not possible to delete anything.
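The append-only design recapped above can be illustrated with a minimal Python sketch. The in-memory list, the `Commit` tuple, and the function names are hypothetical and exist only for this illustration; the real server stores commits in the database.

```python
from collections import namedtuple

# Hypothetical in-memory model of the commit log (not the server's code).
Commit = namedtuple("Commit", "cid aid sid type")


def add_commit(log, aid, sid, ctype):
    """Commits are only ever appended, never changed or deleted."""
    log.append(Commit(len(log) + 1, aid, sid, ctype))


def create_area(log, aid, square_ids):
    """Creating an area adds one 'to map' commit per square."""
    for sid in square_ids:
        add_commit(log, aid, sid, "to map")


def current_states(log, aid):
    """The state of a square is the type of its newest commit."""
    states = {}
    for commit in log:  # the log is ordered by cid, so later commits win
        if commit.aid == aid:
            states[commit.sid] = commit.type
    return states


log = []
create_area(log, 2351, [1, 2, 3])
add_commit(log, 2351, 2, "locked")
add_commit(log, 2351, 2, "done")
print(current_states(log, 2351))
```

The history of square 2 (to map, locked, done) stays in the log, while only the newest commit decides its current state.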

Already implemented

Some functionality requested is already implemented in the damn project, or the problem described has different solution.

  • STM: Allow custom Changeset comment – the changeset comment can be set in the manager (web) client when creating new area and updated afterwards.

  • STM: JOSM: Download data based on task geometry – the square’s GPX boundary as well as OpenStreetMap data are downloaded automatically when using JOSM damn plugin.

  • STM: Notice to be responsible while mapping a task – there is notice on each area’s page of the light (web) client just before the I will map … and I will review … links.

  • STM: Create sub-tasks to specify which part of a task is done – splitting a square creates four new squares with borders dividing the original one. New commits with to map type are added for new squares and new commit with splitted type is added for the original square.

  • STM: Chat within project – there was chat in the web client, then I dropped it during the refactoring, and finally re-added it. It’s not much, it’s mainly for demonstration, but it works.

  • STM: Feature: GeometoryCollection – (multi)polygons, linestrings, or points are allowed geometries for features in a FeatureCollection of an area being created. From each feature new square is created.

  • TM: Feature Request - Remember sort oder on tasks screen – the use-case is to help validators who validate the ‘Least recently updated’ squares. The damn project uses locking policies to solve the original issue–it’s possible to request reviewing of recent, oldest, random, nearest, or newbie square.

  • TM: Filter Tasks by Username to Include All Users - Not Just Most Recent Mappers – on the area’s statistics page, it’s possible to see all the commits, i.e. all the changes to the area and all its squares. It’s possible to filter all the commits of a mapper or all the commits of a square.

  • TM: Add active and stale filter options to Manage Projects page – for this purpose, the abandoned_areas.py Python script of the damn-client.py is used.

  • TM: Allow logged in users to create tasks without intervention – it’s the way to create areas according to the damn project philosophy.

  • TM: Define a setup to execute periodic functions – it’s the upkeep of the damn-deploy.

  • TM: undo undue split – it’s not possible to undo something in the damn project, but this issue has a solution. The issue, from the damn project perspective, is that a mapper splits the square, i.e. creates four new squares, a commit with to map type for each of the new squares, and a new commit with splitted type for the original square. Then, the mapper splits one of the new squares again, i.e. creates four new squares, a commit with to map type for each of the new squares, and a new commit with splitted type for the parent square. So the result is seven squares to map. In the mappy (web) client, it’s possible to merge squares, i.e. from the seven split squares create a new square with the same boundary as the original square, a corresponding commit with to map type for that square, and seven commits with merged type for the seven split squares.

  • TM: Track Split Tasks – showing all commits or square’s commits on the statistics page of an area does the trick.

  • TM: RSS feed of projects – see rss.damn-project.org.

  • TM: Submit task and jump right next to another one – with JOSM damn plugin, there is no need for opening the web browser (except for authentication for the first time.)

  • TM: Let mappers translate project description – this is my issue, contributing an idea to the HOT Tasking Manager. Translating the description by mappers is implemented in the light (web) client since the beginning.

  • TM: Improve integration with AI-assisted mapping – in a web client, iD and RapiD can be switched. On the own instance, the default editor can be set.
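The counting in the "undo undue split" item above can be checked with a small Python sketch. It treats squares as axis-aligned boxes and uses a bounding box as a stand-in for the merge; both are simplifying assumptions for illustration, not how the server works.

```python
def split(square):
    """Split a (min_x, min_y, max_x, max_y) box into four quarters."""
    x0, y0, x1, y1 = square
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    return [(x0, y0, mx, my), (mx, y0, x1, my),
            (x0, my, mx, y1), (mx, my, x1, y1)]


def merge(squares):
    """Bounding box of the squares; equals their union only when they tile it."""
    return (min(s[0] for s in squares), min(s[1] for s in squares),
            max(s[2] for s in squares), max(s[3] for s in squares))


original = (0.0, 0.0, 4.0, 4.0)
first = split(original)        # four new squares, each to map
second = split(first[0])       # one of them is split again
to_map = first[1:] + second    # squares that are still to map

print(len(to_map))                 # 7
print(merge(to_map) == original)   # True
```

Three squares remain from the first split and four come from the second, so merging the seven gives back the boundary of the original square.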

Boring (client) stuff the server is ready for

The server’s API is ready for a lot of things clients don’t do, because I’m not a frontend developer, nor enjoying work on the clients. The following issues could be solved easily, because it’s possible to download all the commits of an area and all the information is within these commits.

An example of what is done in the web client: the area’s statistics page shows the hours left when mapping or reviewing continues at the same speed as in the last hour/3 hours/day/week/overall.
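Such an estimate could be computed from the commits roughly as follows. This is a sketch under the assumption that the speed is simply the number of squares finished in the chosen time window divided by the length of the window; the function name and parameters are hypothetical, and the web client may compute it differently.

```python
def hours_left(squares_remaining, squares_finished, window_hours):
    """Estimate the hours left at the speed seen in the last `window_hours`."""
    if squares_finished <= 0:
        return None  # no progress in the window, so no estimate
    speed = squares_finished / window_hours  # squares per hour
    return squares_remaining / speed


# 30 squares finished in the last 3 hours, 120 squares still to map.
print(hours_left(120, 30, 3))
```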

Some ideas to get inspired

As mentioned in the beginning, I filter out bug reports, issues that oppose the damn project philosophy, and even more. Therefore, the list of issues in this diary is not, and is not meant to be, complete. However, it’s nice to get inspired. There are some interesting ideas in both managers I get inspired by:

The idea to map radars and mirrors of the Czech highways was raised in the mailing list of the Czech community. One of the comments asks how to track the work done, and that’s exactly the problem the damn project is meant to solve. (Note that the idea to map radars and mirrors of the Czech highways is still just an idea and, as far as I’m aware, no further steps have been taken yet.)

I already wrote two Get inspired by … diaries. These were the potential HOT tasking manager improvements and the Tanga Building Footprints Import. This diary discusses the use of the damn project for tracking the work done on highways.

Create area

The damn project helps mappers by dividing some big area into smaller squares that a human can map. How does this apply to highways?

It’s possible to upload two kinds of GeoJSON boundary files to the damn manager. If the FeatureCollection has a ‘name’ member, then the divide to squares function is NOT used. Moreover, I’ve updated the damn project to understand what to do with a feature containing a LineString geometry: create a “square” with the border around the LineString. (“Square” here means a square of an area of the damn project.)

So… when you download some highway=motorways within some interesting bounding box (I mean from the Overpass API in JOSM), save them as GeoJSON, and add the name member, you are able to upload the result as a GeoJSON boundary file to the damn manager.
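Adding the name member can be done in any text editor, or with a few lines of Python. The sketch below assumes that the member sits at the top level of the FeatureCollection; the value "motorways", the sample coordinates, and the function name are made up for the example.

```python
import json


def add_name(featurecollection, name):
    """Return a copy of the FeatureCollection with a top-level 'name' member."""
    if featurecollection.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    named = dict(featurecollection)
    named["name"] = name  # the value used here is an assumption
    return named


fc = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"highway": "motorway"},
        "geometry": {"type": "LineString",
                     "coordinates": [[14.40, 50.00], [14.50, 50.10]]},
    }],
}
print(json.dumps(add_name(fc, "motorways"), indent=2))
```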

Note that I experimented with joining ways to make larger squares of an area. The JOSM damn plugin then refused to automatically download the OSM data of a square because the square was too large, and the data had to be downloaded manually.

Mapping

The mappy client may help to visualize the area. An example is area 2352 and it looks way better with background image switched off. The JOSM damn plugin, light client, or panel using map random square/map nearest square are better options, though.

One more minor update to the damn project concerns splitting of the squares. I added split horizontally and split vertically commit types that are currently available in the mappy client. This is because splitting a highway square sometimes led to tiny squares in the middle of the highway square. (Imagine a highway from the bottom left to the top right of a square split into four sub-squares.)

Conclusion

It’s possible to track the work done for highways mapping. I know the damn clients could be (significantly) better. I still hope for someone to come and write their own client for the damn server API from scratch. That would be fantastic, at least because the development would become decentralized. Until then, the mappers have no option other than to suffer from my design.

And that’s about tracking the work done on highways. I wonder if logging the data and importing them with the idea of the last get inspired by … diary wouldn’t be a better option for this particular project.

As I’m not yet sure what will be included in the alpha version of the damn project, I’m getting inspired. This time by Tanga Building Footprints Import announced on the imports mailing list.

It looks like the buildings are already imported in the Tanga city, e.g. this one, but I’ve never done the import thing, so I can’t say.

In this diary, I’m going to introduce an example mapping workflow with the prepared data based on the Tanga Building Footprints Import using the damn project. Not everything is yet fully automated, so this diary is more like describing the proof of concept. However, I’m open to the needs of the future.

Create area

There is the dataset of the buildings available on the wiki page. I like the idea of a huge GeoJSON file that includes the features with the corresponding geometries. As I’m not the import guy, I’m confused with _key_s in the properties (I mean these underscores of each key.) I need only the building:yes tag, so I’ve renamed just those (note that notepad’s “Replace All” would do the same:)

sed -i 's/_building_/building/g' buildings.geojson

I’ll create a new area to work on. Therefore, I need the geometry of the whole area, where the buildings are. For that, I’ll use the damn-client.py scripts. (Shapely must be installed.)

cd damn-client.py
./convex_hull.py buildings.geojson > ch-buildings.geojson

The command took about a minute on my laptop. The ch-buildings.geojson can be used as the GeoJSON boundary file in the damn project’s manager.
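To show what a convex hull of the buildings means, here is a pure Python sketch using Andrew's monotone chain algorithm. It is not the code of convex_hull.py (which relies on shapely); the function names are hypothetical, and only the outer rings of Polygon features are considered.

```python
def cross(o, a, b):
    """Cross product of vectors OA and OB; positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def feature_points(featurecollection):
    """Collect the outer-ring vertices of all Polygon features."""
    points = []
    for feature in featurecollection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] == "Polygon":
            points.extend(tuple(xy) for xy in geometry["coordinates"][0])
    return points


print(convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]))
```

The inner point (1, 1) is dropped, and the four corners remain as the border that covers all the input points.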

Buildings of the squares

So I have an area to work on, divided into squares. Now, I need the GeoJSON files containing all the buildings of each square. I’ll use the damn-client.py scripts once again:

cd damn-client.py
./prepare_tmp.py https://server.damn-project.org 2351 buildings.geojson

Note that the original buildings.geojson file is used. The command took about 10 minutes on my laptop. The generated directory 2351 must be copied manually to the deployed version of the damn project. Uploading of the area’s directory is not yet automated.
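The idea of preparing the buildings of each square can be sketched in Python as follows. This is not the code of prepare_tmp.py. It assumes that squares are axis-aligned boxes, that coordinates are 2D, and that a building belongs to the square containing the average of its outer-ring vertices; the function names are hypothetical.

```python
def representative_point(polygon_feature):
    """Average of the outer-ring vertices (the closing vertex is dropped)."""
    ring = polygon_feature["geometry"]["coordinates"][0][:-1]
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def buildings_per_square(squares, buildings):
    """squares: {sid: (min_x, min_y, max_x, max_y)}; returns {sid: FeatureCollection}."""
    result = {sid: {"type": "FeatureCollection", "features": []}
              for sid in squares}
    for building in buildings["features"]:
        x, y = representative_point(building)
        for sid, (x0, y0, x1, y1) in squares.items():
            if x0 <= x < x1 and y0 <= y < y1:
                result[sid]["features"].append(building)
                break  # each building goes to exactly one square
    return result
```

Squares that end up with an empty list of features are the squares with no buildings.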

Squares with no buildings

This is the hacking part. I have the area with the convex hull border that covers all the buildings and is divided into squares. Consequently, some squares could contain no buildings. I want to mark all these squares as done, so I need to:

  1. Get all the area’s squares.
  2. Get buildings for each square.
  3. If there is no building, lock the square.
  4. When all building-free squares are locked, merge them, and mark the merged square as is done.

Doing this manually is possible in the mappy client, but crazy. The automation of this task can be done within the developer’s console (F12 in Firefox,) using the functions of the damn-client.js, though:

// Area identifier used throughout this diary.
var aid = 2351;
// Get all the squares of the area.
api.get_squares(function(squares) {
    for (var i in squares) {
        // Get the prepared (tmp) buildings of the square.
        api.get(
            function(r)
            {
                // No buildings in the square: lock it.
                if (r["features"].length == 0) {
                    api.post_commit(
                        function (r2)
                        {
                            console.log(r2["sid"] + " locked");
                        },
                        aid,
                        {"type": "lock", "sid": r["sid"]},
                        function() {},
                    );
                }
            },
            function() {},
            api.ep(
                "/area/" + aid
                + "/square/" + squares[i]["sid"]
                + "/tmp"
            ),
        );
    }
}, aid, function() {});

Now, I need to wait until all the squares with no buildings are locked. This took about 2 minutes on my laptop. Then, the last steps are: right click in the mappy client and merge, right click on the merged square and lock, and finally the last right click on the merged square and is done.

Mapping with JOSM

Everything is ready for the mapping. Please, note the limitations:

  • Do not use split and merge, otherwise the prepared data are not downloaded.
  • The prepared data are only available when using JOSM damn plugin.
  • I know, the JOSM damn plugin is the weakest point of the damn project.

I would recommend the following workflow:

  • From the damn plugin dialog, select the area to map.
  • Click map button to load some square and corresponding buildings.
  • Use mapathoner plugin to Select Duplicate Buildings.
  • Use todo plugin to keep track of them.

Do not save the work to the OpenStreetMap! This diary is about possibilities, not about the real work on the Tanga Building Footprints Import!

Conclusion

That’s it. Preparing the data is not fully automated, but the tools are there. The weakest point of the mapping workflow is the JOSM damn plugin which is ugly, but working. These are reasons why I’m talking about describing the proof of concept. However, that proof of concept is ready for the broader testing.

This diary is about load testing. I find load testing useful mainly for two reasons: 1. it shows multiuser access problems; 2. it shows how good the implementation is.

The idea is to simulate a mapathon. A mapathon is an event where multiple mappers map the same (big) area. To manage the work, the area is divided into smaller squares. Then, each mapper asks the server to map or review a square, works some time on the square, and finally asks the server to mark the square as needs mapping, needs review, or is done.

I use locust.io to load test the damn server. The test file is part of the repository. 80 % of the simulated users are mappers, 20 % are reviewers, and there are 2 testing areas. Each mapper/reviewer works for 30 - 60 seconds, then waits for 30 - 60 seconds, and then works again. All the mappers are spawned within the first minute. The test is stopped soon after there is 0 % to map for both of the areas.

damn-project.org runs on $5/month VPS with 1 GB RAM, single 2.2 GHz vCPU, and 25 GB SSD disk.

100 mappers handled

The first round of load testing gives the idea about the server after the refactoring. The average response time for 100 mappers is 270 ms. Not bad, I would say.

It’s a bit worse when testing 200 mappers, though. I’ve stopped the test after 20 minutes.

Some improvements

Two years ago, I decided that there must be some better solution for the problem of dividing a big area into smaller squares. I found that there is the PostGIS extension to PostgreSQL, so I decided to go with it. Two years later (now), I’ve discovered database indexes.

Also, I’ve read part of the documentation and set postgres parameters.

Finally, I switched gzip on.

200 mappers handled

Pretty small changes, but I had to figure them out. I noted my journey in this thread. Anyway, with the changes applied, I load tested again. The average response time of 275 ms for 200 mappers looks good.

Writing good code is green. Don’t waste energy.

At the beginning of the year, I was refactoring Divide and map. Now. I’ve already written about the clients and about the improvements. In this diary, I would like to share some details about what the server looks like now.

The damn project distributed architecture is depicted below. I’ll write about the api and db bubbles, where db is the PostgreSQL database with the PostGIS extension and the api is the python FastAPI application.

(Figure: damn project architecture)

Damn server API

Automatically generated documentation of the API is useful for the developers of clients. There are some not very interesting endpoints dealing with authentication and authorization, user updates, or squares’ GPXs. I will elaborate on /areas, /geometries, /area/AID, /area/AID/commits, and /area/AID/squares endpoints here.

Listing the areas and adding new areas is done through the /areas endpoint. However, the returned area differs from the area to be created: the returned area contains statistics, while the area to be created contains a FeatureCollection with the area’s geometry and the information about how to divide the area.

The /geometries endpoint is used in the Python client, particularly in the script detecting intersecting areas.

Getting information about an area and updating the area information is done through /area/AID endpoint, where AID is the area identifier. Currently, four digits are used for the area identifier. Yes, the number of areas of the damn project is limited to 8 999. I’m going to increase the limit to five digits – 89 999 – when needed.

There is /area/AID/squares endpoint providing the list of squares’ geometries for the area. Nothing more.

The endpoint where the real work is being done is /area/AID/commits. The idea is to get the area’s commits once, store the commits, and use ?since= query parameter in the next query to get just new commits. (JavaScript damn API library may help here.) Getting the commits helps to decide the square’s current state (e.g. to map or done,) compute statistics, or show the history of the area’s information.
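The "get once, then ask only for new commits" idea could look like this in Python. The endpoint path and the since parameter come from the text above; everything else is an assumption made for the sketch: that the response is a list of commits ordered from oldest to newest, that each commit has a date, and that this date is an acceptable value for since. The `fetch` callable stands for any HTTP GET helper that returns parsed JSON.

```python
from urllib.parse import urlencode


def commits_url(server, aid, since=None):
    """Build the URL of the area's commits, optionally with ?since=."""
    url = "{}/area/{}/commits".format(server.rstrip("/"), aid)
    if since is not None:
        url += "?" + urlencode({"since": since})
    return url


class CommitCache:
    """Keeps the commits already downloaded and asks only for new ones."""

    def __init__(self, fetch):
        self.fetch = fetch  # hypothetical callable: url -> list of commits
        self.commits = []

    def update(self, server, aid):
        # Assumption: the newest stored commit's date is a valid 'since'.
        since = self.commits[-1]["date"] if self.commits else None
        new = self.fetch(commits_url(server, aid, since))
        self.commits.extend(new)
        return new
```

The first call to `update` downloads everything; later calls transfer only what was added in the meantime.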

It is not possible to change a commit. It’s only possible to add a new one. And creating a commit is as simple as sending {"type": "map random"} to the server. The example says to the server: “Let me map some random square”. The server replies: “201: Created. Here is the area’s ID, square’s ID, and the square’s border.” The commit message is optional. It’s currently possible to map a recent, random, or nearest square, to review a recent, random, nearest, or newbie square, and to manually lock a square specified by the square’s ID. When unlocking the square, send the type of needs mapping, needs review, is done, or split along with the square’s ID, which is compulsory when unlocking. However, there is one exception. When multiple squares are locked, a commit with the merge type will create a new square from all the locked squares. There is no need for the square’s ID in that case. (Similar to areas, the returned commit has a different set of types than the commit to be created.)
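The rules for the commit to be created can be summed up in a small Python helper. The "type" and "sid" keys and the type names are taken from the examples in this diary; the "message" key and the helper itself are assumptions for illustration, so check the API documentation before relying on them.

```python
import json

# Unlocking types that need the square's ID, as described in the text.
NEEDS_SID = {"needs mapping", "needs review", "is done", "split"}


def commit_payload(ctype, sid=None, message=None):
    """Build the JSON body of a commit to be created."""
    if ctype in NEEDS_SID and sid is None:
        raise ValueError("'{}' needs the square's ID".format(ctype))
    payload = {"type": ctype}
    if sid is not None:
        payload["sid"] = sid
    if message:
        payload["message"] = message  # key name is an assumption
    return json.dumps(payload)


print(commit_payload("map random"))
print(commit_payload("is done", sid=1))
print(commit_payload("merge"))
```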

Source code structure

I’ve changed the file structure, too. However, it’s probably not very interesting. conf and db are unchanged. I’ve added Pydantic models and docstrings to the user file. square just gets the border from the database and returns it as GeoJSON or GPX.

api is the FastAPI file. This file is used as the MODULE_NAME of the uvicorn docker. As soon as FastAPI stops being the number one of the Python JSON API frameworks, just change this file.

I’ve only added Pydantic models and some docstrings to the area file. save, load, and update functions are still here.

I’ve moved all the lists of any kind to the list_of file. Database queries for the list of areas, geometries, area commits and squares, and user commits are stored here.

The last file is the new file. This file deals with the database queries for creating the commits. The original idea was to move here the queries that create something new in the database. However, when you recall the API, only a new area or a new commit is created. And it’s just cleaner to keep area.save where it is, in my opinion.

SQL queries

The database structure is not changed. Just to recall:

current_areas -- stores area information
- aid -- primary key
- tags -- the default changeset comment
- priority -- how to order the areas
- description -- JSON with {"lang code": "description", ...}
- instructions -- JSON with {"what": "link to wiki", ...}
- featurecollection -- GeoJSON
- created -- automatically for new areas, UTC now()

current_squares -- stores area's squares
- sid -- along with aid primary key
- aid -- foreign key to current_areas
- border -- geometry

current_commits -- stores area's commits
- cid -- primary key
- sid -- along with AID foreign key to current_squares
- aid -- foreign key to current_areas
- date -- automatically for new commits, UTC now()
- author -- foreign key for users.display_name
- type -- to map, to review, done, locked, ...
- message -- author's note

I’m going to share the interesting parts (in my opinion) of the queries. The complete code of the server and the database is available, of course.

Let’s start with guessing. What does the following query do?

WITH last AS (
    SELECT DISTINCT ON (sid) cid, sid, type
    FROM current_commits
    WHERE aid=$1
    ORDER BY sid, cid DESC
),
toreview AS (
    SELECT *
    FROM last
    WHERE type='to review'
    ORDER BY cid DESC
    LIMIT 1
)
INSERT INTO current_commits (sid, aid, author, type, message)
SELECT sid, $1, $2, 'locked', 'Working on review'
FROM toreview
RETURNING sid

This is non-refactored code, in fact. It just locks the recent square for review. Locking a random square uses RANDOM() instead of cid DESC, as you probably guessed.

To use some PostGIS, let’s lock the nearest square, hey?

WITH last AS (
    SELECT DISTINCT ON (cc.sid)
        cid, cc.sid, cc.aid, type, author, border
    FROM current_commits AS cc
    INNER JOIN current_squares AS cs
    ON (cc.aid=cs.aid AND cc.sid=cs.sid)
    WHERE cc.aid=$1
    ORDER BY cc.sid, cid DESC
),
mine AS (
    SELECT border
    FROM current_commits AS cc
    INNER JOIN current_squares AS cs
    ON (cc.aid=cs.aid AND cc.sid=cs.sid)
    WHERE cc.aid=$1
    AND author=$2
    ORDER BY cid DESC
    LIMIT 1
),
toreview AS (
    SELECT sid, cid, type
    FROM last, mine
    WHERE type='to review'
    ORDER BY ST_Distance(mine.border, last.border)
    LIMIT 1
)
INSERT INTO current_commits (sid, aid, author, type, message)
SELECT sid, $1, $2, 'locked', 'Working on review'
FROM toreview
RETURNING sid

The border is stored in the current_squares table, so that table has to be joined with current_commits. The interesting part is:

ORDER BY ST_Distance(mine.border, last.border)
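What ordering by distance does can be shown in plain Python for the simple case of axis-aligned rectangular borders. This is only a sketch: PostGIS computes the shortest distance between arbitrary geometries, while the hypothetical functions below handle (min_x, min_y, max_x, max_y) boxes only.

```python
import math


def bbox_distance(a, b):
    """Shortest distance between two axis-aligned boxes."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return math.hypot(dx, dy)


def nearest_to_review(mine, to_review):
    """mine: border of my latest square; to_review: {sid: border}."""
    if not to_review:
        return None
    return min(to_review, key=lambda sid: bbox_distance(mine, to_review[sid]))


mine = (0, 0, 1, 1)
to_review = {5: (3, 0, 4, 1), 7: (1, 1, 2, 2), 9: (5, 5, 6, 6)}
print(nearest_to_review(mine, to_review))  # 7, the square touching mine
```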

And the interesting part for the newbie’s square review request is:

WHERE type='to review'
AND (info::json->>'newbie')::timestamp > now()
ORDER BY RANDOM()
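The newbie condition reads a timestamp from a JSON document and compares it with the current time. A Python equivalent might look like the sketch below; it assumes the timestamp is stored in ISO 8601 format, and the function name is hypothetical.

```python
import json
from datetime import datetime


def is_newbie(info_json, now):
    """True while the 'newbie' timestamp stored in the info JSON is in the future."""
    newbie_until = json.loads(info_json).get("newbie")
    if newbie_until is None:
        return False  # like SQL: NULL > now() does not match
    return datetime.fromisoformat(newbie_until) > now


now = datetime(2021, 5, 1)
print(is_newbie('{"newbie": "2021-06-01T00:00:00"}', now))
print(is_newbie('{"newbie": "2021-04-01T00:00:00"}', now))
```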

The mappers mark newbies themselves, as I’ve explained in an older diary. In the same diary, I’ve mentioned the merging of the squares. However, I didn’t give details. It’s not rocket science. The main part is done by PostGIS again, by the ST_Union function in this case.

...
to_be_merged AS (
    SELECT border
    FROM current_squares cs, locked lo
    WHERE cs.sid=lo.sid
    AND cs.aid=lo.aid
),
merged AS (
    SELECT ST_UNION(border) as mb
    FROM to_be_merged
)
INSERT INTO current_squares (sid, aid, border)
SELECT f.msid + row_number() OVER () as sid, f.maid as aid, mb
FROM
    (
        SELECT MAX(cs.sid) as msid, MAX(cs.aid) as maid
        FROM current_squares cs, locked lo
        WHERE cs.aid=lo.aid
    ) as f,
    merged
RETURNING sid, aid

Here, locked is a query that returns all the squares locked by the author. (Note that the lock information is stored in current_commits, but the border is in current_squares. Also, some parts of the code are pointless.)

I saved the best for last. I’ve refactored the pgsql function for dividing up an area or a square – the st_divide function. No code snippet here; rather, look at the whole source code. The function takes the area geometry and the number of squares along the x and y axes, and returns a set of geometries of the divided area. (The function is overloaded, so the square’s dx and dy dimensions can be given instead of the number of squares in x and y.) I’ve also tuned the squares a little bit, so it’s possible to divide up the area into rectangles, diamonds, flat hexagons, pointy hexagons, or wall bricks.
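To give a feeling for the simplest case (rectangles), here is a minimal Python sketch of dividing a bounding box into nx times ny cells. It is not the st_divide function: it ignores the real area geometry, works on a plain bounding box, and the name divide_bbox is made up for this example.

```python
def divide_bbox(minx, miny, maxx, maxy, nx, ny):
    """Split a bounding box into nx * ny rectangles (minx, miny, maxx, maxy)."""
    dx = (maxx - minx) / nx  # width of one cell
    dy = (maxy - miny) / ny  # height of one cell
    cells = []
    for j in range(ny):      # rows, bottom to top
        for i in range(nx):  # columns, left to right
            cells.append((
                minx + i * dx,
                miny + j * dy,
                minx + (i + 1) * dx,
                miny + (j + 1) * dy,
            ))
    return cells


cells = divide_bbox(0, 0, 4, 2, nx=2, ny=2)
for cell in cells:
    print(cell)
```

A real implementation would presumably also drop or clip the cells that fall outside the area border; the sketch keeps all of them.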

Conclusion

The thing I like the most about the server from the API point of view is that the whole mapping workflow – map -> review -> done – is available through the one endpoint with the trivial messages of {"sid": 1, "type": "is done"}.
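As a small illustration of how trivial such a message is, the following Python sketch builds the request body. Only the {"sid": 1, "type": "is done"} shape comes from the text above; the helper name make_message is hypothetical, and the endpoint URL and authentication are deliberately left out because they are not described here.

```python
import json


def make_message(sid, commit_type):
    """Build the JSON body for one workflow step of one square."""
    return json.dumps({"sid": sid, "type": commit_type})


body = make_message(1, "is done")
print(body)
```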

The thing I like the most about the server from the database point of view is that everything is done by the SQL queries – the python application “just” translates JSON to SQL, in fact.

I liked Pete’s diary at first sight. However, when it was published, I had already decided on my next work. Last Thursday I finally found some time to look closer at this interesting topic. I spent the night working on it, and I will share the results in this diary.

First, I will not write about the HOT tasking manager. Second, these are my opinions. Third, understand damn in the following text as the shortcut of Divide and map. Now.

The summary of my understanding of the original diary follows. I’ll append my thoughts, though. I made a few toots during the night, which can serve to track the time effort.

What is TM?

Questions:

  • What is the TM?
  • What is the mapping process?
  • What is the origin of a project?
  • What is the ownership of a project?
  • Who is the TM for?

Problems:

  • New mappers don’t understand (because they don’t have answers.)
  • New mappers’ edits frustrate old mappers.

Consequences:

  • Discourage new mappers.
  • Learning process interruption.

Thoughts:

The damn project has an advantage here, because the name says exactly what the project does. Using the areas, squares, and commits notation improves clarity, too.

As there are multiple clients, each of them targets a different group of mappers. Therefore, it’s easier to improve the client for newbie mappers without limiting the experienced ones.

Each commit to the area is stored and shown when requested (statistics page,) improving the understanding of the area history.

Project creators

Original problem:

  • When TM was opened, the data quality was low due to poor project description.

Solution:

  • Onboarding process. Making sure that managers fulfill the expectations of having skills to:

    • engage local communities and contributors,
    • respond to questions and info requests,
    • ensure the documentation of the project is on the wiki (in the case of organised editing,)
    • ensure that standards are met,
    • ensure that the mappers are able to do what is requested.

Problems:

  • One manager for many many projects.
  • Responsibility for the onboarding process.
  • Onboarding process strengthening.

Thoughts:

The original problem is interesting here. With the damn philosophy and the code of conduct in mind, the solution must be different.

The damn project supports mappers by creating/maintaining/inventing tools that mappers may use for keeping the quality. In the OpenStreetMap, anyone can make a change, see it, and fix it. In the damn project, anyone can make a commit to an area, see all the commits, and make another commit to change the area.

Areas of interest for beginners

Problem:

  • Bad categorisation of the project.

Solution:

  • Onboarding process (see above) forbids categorisation of urban environment as the project for beginners.

New problem:

  • The solution works for one situation only.

Thoughts:

Allowing anyone (authenticated to OpenStreetMap) to change the area helps here. As any change to the area is a commit itself, reverting the change is easy peasy.

Project documentation

Problem:

  • Onboarding process does not emphasise the importance of the project documentation or link.

Thoughts:

The same as above. The openness helps here. The support should come only after the area is created. It’s similar to reviewers helping new mappers by saying “Hello, thanks for mapping, and by the way – please, square the buildings.”

The special case of changing the area is adding the translation of the area description. Since the beginning of the damn project, this particular case of changing the area is possible directly from the web client.

Unclosed projects

Problem:

  • Inactive projects hang around for a long time.

Consequences:

  • Incomplete data in the project’s area.
  • Issues for new projects with overlapping area.
  • Hard to find people involved in the project.

Thoughts:

I solved the technical part of this issue by the abandoned_areas.py script (see https://git.sr.ht/~qeef/damn-client.py)

The example output of the script:

$ ./abandoned_areas.py https://server.damn-project.org
Areas abandoned on https://server.damn-project.org since 2020-12-29T20:55:21.
In last 90 days:
- 7114 has only 1 commits
- 2245 has only 1 commits
- 7100 has only 2 commits
- 7102 has only 2 commits
Finished.

Anyone can change the area and see the area creator, so sending the message saying “Hi, the area is quite old, with low contribution rate, so I’m going to archive it in a week. Feel free to re-activate or send the message back if you need to keep it active.” is simple.
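The idea behind such a check can be sketched in a few lines of Python. This is not the abandoned_areas.py script: the 90-day window comes from the example output, while the threshold of two commits, the function name, and the commit dates are assumptions made for illustration (the area ids are borrowed from the output above).

```python
from collections import Counter
from datetime import datetime, timedelta


def low_activity_areas(area_ids, commits, now, days=90, max_commits=2):
    """Return {aid: count} for areas with at most max_commits in the window.

    commits is a list of (aid, datetime) tuples.
    """
    since = now - timedelta(days=days)
    counts = Counter(aid for aid, when in commits if when >= since)
    return {aid: counts[aid] for aid in area_ids if counts[aid] <= max_commits}


now = datetime(2021, 3, 29)
commits = [
    (7114, datetime(2020, 6, 1)),   # too old, outside the window
    (7114, datetime(2021, 3, 1)),
    (7100, datetime(2021, 2, 10)),
    (7100, datetime(2021, 2, 11)),
    (1, datetime(2021, 3, 20)),     # hypothetical busy area
    (1, datetime(2021, 3, 21)),
    (1, datetime(2021, 3, 22)),
]
result = low_activity_areas([7114, 7100, 1], commits, now)
for aid, count in result.items():
    print(f"- {aid} has only {count} commits")
```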

TM projects can overlap

Problem:

  • Not possible to check if projects’ areas overlap.

Consequence:

  • Multiple projects used by different groups of mappers mapping the same area.

Thoughts:

This is my favorite one. I had been looking forward to solving this issue for almost three months. It is solved by the intersecting_areas.py script (see https://git.sr.ht/~qeef/damn-client.py)

The example output of the script:

$ ./intersecting_areas.py https://server.damn-project.org
testing https://server.damn-project.org:
- 2229 intersects with 2253 of https://server.damn-project.org
- 2253 intersects with 2229 of https://server.damn-project.org

(Of course, this script works for multiple instances of the damn server, too.)
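A rough Python sketch of the pairwise check follows. It is not the intersecting_areas.py script: it compares plain bounding boxes instead of the real area geometries, the coordinates are invented, and the function names are hypothetical. Like the output above, it reports every intersection in both directions.

```python
def bboxes_intersect(a, b):
    """True if two boxes (minx, miny, maxx, maxy) overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def intersecting_pairs(areas):
    """areas maps area id to bounding box; returns ordered (aid, other) pairs."""
    return [
        (a, b)
        for a in areas
        for b in areas
        if a != b and bboxes_intersect(areas[a], areas[b])
    ]


areas = {
    2229: (0.0, 0.0, 2.0, 2.0),
    2253: (1.0, 1.0, 3.0, 3.0),
    7100: (10.0, 10.0, 11.0, 11.0),
}
pairs = intersecting_pairs(areas)
for a, b in pairs:
    print(f"- {a} intersects with {b}")
```

Bounding boxes can report an intersection where the real borders do not overlap, so a real check needs the actual geometries.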

Feedback

Problem:

  • Not easy, nor intuitive.

Thoughts:

There is a Join discussion link at the bottom of each web client page. The link forwards to the mailing list, so feedback is as simple as sending a new email.

From the statistics page, it’s possible to write an (OpenStreetMap) message to anyone involved in the area history.

Local mappers

Problem:

  • Find local contributors within project’s area and interact with them.

Thoughts:

This is the third and the last technical issue I found. I solved it in the web client’s statistics page. Along with all the commits and commit authors, it’s possible to get the list of local mappers – OpenStreetMap contributors that haven’t used Divide and map. Now. when mapping in the same area. The result of the overpass query that uses the convex hull of all the squares is compared with the list of commit authors.

[out:json];
(
    node(poly:"convex hull coordinates here");
    way(poly:"convex hull coordinates here");
    relation(poly:"convex hull coordinates here");
);
out meta;
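The two steps around the query (building the convex hull and comparing the contributors with the commit authors) can be sketched in Python. This is an illustration, not the web client code, which is not shown here. The points and user names are invented, and the helper names are hypothetical. The hull uses the standard monotone chain algorithm, and the poly string follows the Overpass convention of latitude and longitude pairs separated by spaces.

```python
def convex_hull(points):
    """Monotone chain convex hull of (lon, lat) points, counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def overpass_poly(hull):
    """Format hull points as 'lat lon lat lon ...' for the poly filter."""
    return " ".join(f"{lat} {lon}" for lon, lat in hull)


def local_mappers(osm_users, commit_authors):
    """Contributors seen in the OpenStreetMap data who made no commit."""
    return sorted(set(osm_users) - set(commit_authors))


corners = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]  # square corners, invented
hull = convex_hull(corners)
print(overpass_poly(hull))
print(local_mappers(["alice", "bob", "carol"], ["bob"]))
```

The user names would come from the user attribute that out meta adds to every returned element.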

Conclusion

According to my toots that track the time, I spent 1.5 hours reading, summarizing, and understanding the original diary. That is not quite fair to say, as I have read the diary many times since it was published.

I’m ok with spending 2 hours for each of the three technical issues. I spent more than 2 hours writing this diary.

The only thing I’m sorry about is that I’m three months late.

The state of the mapper's clients for the damn-project.org

Posted by qeef on 12 March 2021 in English (English). Last updated on 15 March 2021.

I’ve been refactoring the Divide and map. Now. for the last two months. (No surprise, I planned it.) The most beautiful work was done on the server and the database. I will cover that in a more technical diary later.

In this diary, I want to summarize what the clients look like now.

Client + panel

Lightweight client

The (lightweight) client keeps to the common mapping workflow: Map -> Review -> Done. It’s meant to load fast with minimum data transfer. It tries to avoid rushed clicks of newbie mappers by putting the phrases “needs more mapping”, “is ready for review”, or “split the square” into a sentence. Newbie mappers may click “let reviewers know”, as I’ve already written.

Panel

The panel is included in the (lightweight) client now. I’ll stick with that if it’s ok. I’ll re-implement it as a stand-alone client if anyone cares. The panel breaks the mapping workflow just a little bit. It allows the mapper to mark the square “needs mapping”, “needs review”, “is done”, or “split” whenever the mapper has finished mapping or reviewing. The panel should just speed up the mapping process for experienced mappers.

EDIT: I re-added the panel for iD as standalone client anyway.

JOSM damn plugin

Damn plugin

The damn JOSM plugin tends to be what the panel is for the iD editor. I just added the possibility to set the map/review actions. (I mean–request to map/review recent, random, nearest, or newbie.)

Mappy

Mappy

The mappy client is the newest client I’ve written. It’s heavy on data, as all the area commits and the background image are downloaded. Let’s be honest–the biggest area (in terms of the number of squares/commits) on the damn-project.org takes 3.79 seconds to load, downloading 1,195.14 KB. (On 2021-03-11 22:29, just Firefox’s network performance analysis.)

Ok, not all the commits. Only the last commit for each square. And not always, but just for the first time. Then, only new commits are downloaded from the server.
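The download strategy can be sketched in Python (the real mappy client is a web client, and its code is not shown here). Everything below is an assumption made for illustration: the server is replaced by two stand-in functions, and the commit id is used as the "what is new" cursor, which may differ from what the API really uses.

```python
def last_per_square(server_log):
    """Stand-in for the first download: only the last commit of each square."""
    last = {}
    for c in server_log:
        if c["sid"] not in last or c["cid"] > last[c["sid"]]["cid"]:
            last[c["sid"]] = c
    return list(last.values())


def commits_since(server_log, cid):
    """Stand-in for later downloads: only commits newer than cid."""
    return [c for c in server_log if c["cid"] > cid]


class SquareCache:
    """Client-side cache holding the last known commit of every square."""

    def __init__(self):
        self.last = {}
        self.newest_cid = 0

    def update(self, commits):
        for c in commits:
            known = self.last.get(c["sid"])
            if known is None or c["cid"] > known["cid"]:
                self.last[c["sid"]] = c
            self.newest_cid = max(self.newest_cid, c["cid"])
        return len(commits)  # how many commits were transferred


server_log = [
    {"cid": 1, "sid": 1, "type": "to map"},
    {"cid": 2, "sid": 2, "type": "to map"},
    {"cid": 3, "sid": 1, "type": "locked"},
]
cache = SquareCache()
first = cache.update(last_per_square(server_log))
server_log.append({"cid": 4, "sid": 2, "type": "locked"})
second = cache.update(commits_since(server_log, cache.newest_cid))
print(first, second)
```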

The mappy client lets the mapper do everything that’s possible: lock the square manually, mark the square for mapping/review/done, split the square, or merge the locked squares.

There is a background OpenStreetMap image. All the squares are visible. Zooming is done by pressing the left mouse button and moving the mouse. (Clicking the left mouse button zooms back to the default.) Right click opens the context menu… Just to list the controls of the mappy client.

Server API changed

One of the results of the refactoring is the change in the server API. I took this as the opportunity to fix (at least some) mess in the clients. Feel free to let me know if you miss something.

Divide and map. Now. helps mappers by dividing some big area into smaller squares that a human can map.

It’s been a year already since I announced the damn project, so it’s time to summarize the first year of development.

For mappers

I finally managed to make the manager manager-friendly. There is news for the client, too. Along with mapping random or recent squares, a mapper can choose to map the nearest square. Moreover, reviewers can review a newbie square.

I added background OpenStreetMap for the squares. When you click on a square, you may lock that square manually. When you lock multiple squares, you can merge them.

A mapper can share the link to an area or a square. A mapper can download the client and open it on the computer (saving 96KB of bandwidth next time.)

I fixed bugs of the damn plugin reported by #OSMWorldDiscord guys (Thanks again!) and added the “lock & open in JOSM” link to the client.

I wrote a new client. Wondering what the integration without integration to iD editor looks like? Open the panel and check it out.

For communities

Divide and map. Now. has a separate repository for the deployment. You need to change seven rows in the config files (five rows with security and OpenStreetMap OAuth tokens, one row with the domain, one with the email address) and create one empty file. And you are ready to go.

I was wondering what is missing in the basic setup. I mean – the minimum you need to run is the server. You probably want a client and manager, too. That’s all. You know – the damn project helps mappers …

However, you are part of a community changing the world. The client and manager are OK, but you probably want to tell others about the community.

So in the config file of the deployment repository, you change the eighth row specifying the Docker repository with your web page. As I’m not too fond of the templates for all the running instances, there is no restriction on the Docker repository. You will probably use some static web page generator with Nginx to serve the result.

In 2020

Based on the above, I am confident to say that the Divide and map. Now. is proven working.

Plans for 2021

A few things are waiting for 2021. I started the source code migration to sr.ht, and I need to move the plugin and the server there yet. I am going to improve server API documentation as it’s necessary for 3rd party clients. And I want to refactor the client. (Again, as refactoring never ends.) I have some ideas based on the feedback – at least better control of the zoomable map and a bit of performance tuning will happen.

I’m sure there will be unplanned work, too.

Divide and map. Now. -- square locking policy

Posted by qeef on 12 December 2020 in English (English). Last updated on 11 March 2021.

I am writing (again) about the Divide and map. Now. The idea of the project is to divide up a big area to organize mapping better. In this note, I will share how I was thinking about the mapping workflow from the beginning, how I finally improved it, and how I hesitated but broke the workflow in the end. And I am sure I did right.

The mapping workflow

The workflow copies a mapathon. Many guys are mapping one area. Few of them are reviewing the work of colleagues. They are locking squares of the area to avoid rewriting the data under their hands.

Since the beginning, it’s possible to lock random or recent square. Mapping guys lock random squares. The reviewing guys lock recent ones to provide feedback to the mappers as soon as possible.

Implementation details
----------------------

Feel free to skip code notes if you are not interested in the code.

There is some background I need to share before showing the SQL query: 

- The server is RESTful.
- Changes to squares work the same as git commits.

(It means that the first commit of any square is `to map`. Then mapper locks a random square adding the `locked` commit to the database. When finished, the mapper adds the `to review` commit. Then the `locked` commit is added again by a reviewer who finally adds the `done` commit.)

The important thing is that no one can delete a commit. Just add. Also, there is no square state. Square consists only of area identifier, square identifier, and border. If you need to know the square's "state," you need to look at the square's last commit.
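The append-only idea can be sketched in Python before looking at the SQL. This is an illustration only: the list stands in for the current_commits table, the helper names are hypothetical, and the author names are invented. The commit types are the ones named above.

```python
def add_commit(log, sid, author, commit_type):
    """Append a commit; nothing is ever updated or deleted."""
    log.append({"cid": len(log) + 1, "sid": sid, "author": author, "type": commit_type})


def square_state(log, sid):
    """The 'state' of a square is just the type of its last commit."""
    for commit in reversed(log):
        if commit["sid"] == sid:
            return commit["type"]
    return None


log = []
add_commit(log, 1, "manager", "to map")
add_commit(log, 1, "mapper", "locked")
add_commit(log, 1, "mapper", "to review")
add_commit(log, 1, "reviewer", "locked")
add_commit(log, 1, "reviewer", "done")
print(square_state(log, 1), len(log))
```

The whole history stays in the log, so statistics and reverts need no extra bookkeeping.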

Finally, the query to map random square is:

    WITH last AS (
        SELECT DISTINCT ON (sid) cid, sid, aid, type, author
        FROM current_commits
        WHERE aid=$1
        ORDER BY sid, cid DESC
    ),
    tomap AS (
        SELECT *
        FROM last
        WHERE type='to map'
        ORDER BY RANDOM()
        LIMIT 1
    )
    INSERT INTO current_commits (sid, aid, author, type, message)
    SELECT sid, $1, $2, 'locked', 'I am mapping'
    FROM tomap
    RETURNING sid

Lock the nearest square

The first enhancement I wanted to implement is locking the square that is the nearest to the one I just mapped. The reason is obvious – I guess that some mappers prefer continuous work. Jumping over the map randomly when the mapper found a nice area could be annoying.

Fortunately, with the PostGIS database, it’s not a big deal.

    WITH last AS (
        SELECT DISTINCT ON (cc.sid)
            cid, cc.sid, cc.aid, type, author, border
        FROM current_commits AS cc
        INNER JOIN current_squares AS cs
        ON (cc.aid=cs.aid AND cc.sid=cs.sid)
        WHERE cc.aid=$1
        ORDER BY cc.sid, cid DESC
    ),
    mine AS (
        SELECT border
        FROM current_commits AS cc
        INNER JOIN current_squares AS cs
        ON (cc.aid=cs.aid AND cc.sid=cs.sid)
        WHERE cc.aid=$1
        AND author=$2
        ORDER BY cid DESC
        LIMIT 1
    ),
    tomap AS (
        SELECT sid, cid, type
        FROM last, mine
        WHERE type='to map'
        ORDER BY ST_Distance(mine.border, last.border)
        LIMIT 1
    )
    INSERT INTO current_commits (sid, aid, author, type, message)
    SELECT sid, $1, $2, 'locked', 'I am mapping'
    FROM tomap
    RETURNING sid
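To show what ordering by distance means here, the following Python sketch picks the nearest square using the minimum distance between two axis-aligned rectangles. It is a simplification under stated assumptions: ST_Distance works on arbitrary geometries, while the sketch only handles boxes given as (minx, miny, maxx, maxy), and the coordinates and function names are invented.

```python
import math


def rect_distance(a, b):
    """Minimum distance between two axis-aligned rectangles; 0 if they touch."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)  # horizontal gap, if any
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)  # vertical gap, if any
    return math.hypot(dx, dy)


def nearest_square(mine, candidates):
    """candidates maps sid to rectangle; returns the sid closest to mine."""
    return min(candidates, key=lambda sid: rect_distance(mine, candidates[sid]))


mine = (0, 0, 1, 1)          # the square I mapped last
to_map = {
    1: (1, 0, 2, 1),         # shares an edge with mine
    2: (3, 0, 4, 1),
    3: (0, 5, 1, 6),
}
print(nearest_square(mine, to_map))
```

Neighbouring squares have distance zero, so after finishing a square the mapper continues right next to it whenever a neighbour is still free.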

Newbie mappers

No doubt, the option to review squares of newbie mappers is favorable.

I am for a pro-active, open approach. (Therefore, any mapper may translate the description of an area.) That is why I waited a long time before implementing the option to review a newbie’s square. I can’t say I like the idea that a newbie is someone with a small number of squares mapped. Nor do I think that an advanced mapper is someone who marked 43 986 squares for review. (In recent times, this could be discrimination.) And I wasn’t able to come up with something better.

The question is how to find a newbie mapper when the statistics are forbidden/unusable? The answer is a pro-active approach. In the last version of the client, I implemented the I am new -> Make myself newbie for next two weeks option. The locking policy for the newbie mapper’s square is then (again) only one SQL query.

    WITH last AS (
        SELECT DISTINCT ON (sid) cid, sid, type, author
        FROM current_commits
        WHERE aid=$1
        ORDER BY sid, cid DESC
    ),
    toreview AS (
        SELECT sid
        FROM last
        INNER JOIN users
        ON (last.author=users.display_name)
        WHERE type='to review'
        AND (info::json->>'newbie')::timestamp > now()
        ORDER BY RANDOM()
        LIMIT 1
    )
    INSERT INTO current_commits (sid, aid, author, type, message)
    SELECT sid, $1, $2, 'locked', 'Working on review'
    FROM toreview
    RETURNING sid
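The newbie flag itself is tiny. A Python sketch of the idea follows, assuming that the user’s info JSON holds a newbie key with the timestamp until which the mapper counts as a newbie. The query above only shows that the value can be cast to a timestamp, so the ISO format and the helper names are assumptions.

```python
import json
from datetime import datetime, timedelta


def mark_newbie(info_json, now, days=14):
    """Set the newbie flag to expire in two weeks; returns the new info JSON."""
    info = json.loads(info_json) if info_json else {}
    info["newbie"] = (now + timedelta(days=days)).isoformat()
    return json.dumps(info)


def is_newbie(info_json, now):
    """Mirror of (info::json->>'newbie')::timestamp > now()."""
    info = json.loads(info_json) if info_json else {}
    until = info.get("newbie")
    return until is not None and datetime.fromisoformat(until) > now


now = datetime(2020, 12, 12, 12, 0)
info = mark_newbie("{}", now)
print(is_newbie(info, now), is_newbie(info, now + timedelta(days=15)))
```

The flag expires by itself, so nobody has to remember to remove it.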

Breaking the workflow

Good enough for now. The workflow works when everything is all right. I mean – mappers map and send squares for review. Reviewers send them back if more mapping is needed. And mark the squares done otherwise.

But problems happen. And the workflow doesn’t consider making already done square ready for mapping again. Just in case you make a mistake.

The solution is the manual locking of a square. I am wondering why I thought it would introduce troubles. Of course, you can lock multiple squares, but why be afraid of it?

Moreover, when you lock multiple squares, you can merge them. (It means you add done commits for all your locked squares and create a new square to map from the borders of the done ones.) It’s not a hypothetical use-case. It’s implemented. Check the client.

EDIT: After the refactoring, the manual square locking is possible in the mappy client.

Conclusion

I wanted to implement more than random and recent square locking policy since the beginning. The nearest one was clear. And I finally found the approach I like for the newbie’s locking policy.

I acknowledge the importance of manual square locking. However, I don’t think it should be the preferred workflow. Manual locking introduces more control over the square than the average mapper needs. That’s why I hesitated, I think. Curiosity note: Implementation of the merge locked squares functionality took me about 10 hours from the decision to implement it to the merge commit.

There are two things I have in mind yet before the end of the year. First and most important, the damn JOSM plugin is reported broken. I need to fix it. The second is that I would like to make creating areas in the current manager a little bit more user friendly. The JSON format is not as comfortable as the HTML form.

It started by @gustavo22soares’ question on Mastodon (not archived.) My answer was:

@gustavo22soares Hi. If the question is about #damn – there is no delete area function. (It was not needed to demonstrate the concept and there is no consensus about “what after delete?”)

Regarding the name – feel free to use tags for this purpose.

After 2 weeks, there is still no delete area function. Sorry. However, we did something after all.

The damn client may be translated (and it is translated from English to Czech and Portuguese.) Areas may be filtered by tags. There is a button to switch between grid and list view. And finally, the damn client supports multiple OpenStreetMap web editors. (No, there is no support for JOSM. Use the damn plugin instead.)

The defaults can be changed easily when deploying – and it was the intention.

And the last thing: I updated the server. (Because I was not satisfied with the function returning the list of the areas.) After the update, I stressed the server a little bit. It was definitely NOT load testing, but I just could not resist.

I ran load testing with locust.io on the damn server, and I think it’s usable for a mapathon with 100 mappers.

The damn server is part of the Divide and map. Now. – the damn project I have published at the beginning of the year. And I am still interested in the performance. I was wondering if I can simulate a mapathon event.

Setup

I run the damn server on a $5 per month VPS of do.co. The server runs Debian 10.2 x64, 1 vCPU, and 1GB / 25GB Disk. There is nothing special about the deployment I described in the last diary.

For load testing, I prepared the area. I uploaded a big area (the JOSM measurement plugin says it’s 42 852 square km) with the damn manager, which split it into 654 squares.

Then, I simulated a mapathon in the locust file. The “participating” test mapper can choose from:

  1. Get the area and all its commits. (In fact, a real client wouldn’t download all the commits twice, since fetching only the commits “since” a given point is implemented for an area.)

  2. Request a random square for mapping. If the square is available, work on it for between 1 and 5 seconds and then stop with need more mapping (30 %), split the square (10 %), or ready for review (60 %). (In the brackets, there is a probability of stop reason used.)

  3. Request a random square for review. If the square is available, work on it for between 1 and 5 seconds and then stop with need more mapping (30 %), please, review again (10 %), or done (60 %).

When (1), (2), or (3) finishes, wait for between 1 and 2 seconds and choose again. (2) and (3) are chosen five times more often than (1).
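The choice logic described above can be sketched with the Python standard library alone. This is not the locust file used for the test: there are no HTTP requests, the waiting is not performed, and the check whether a square is available is left out. Only the weights (1 : 5 : 5), the 1 to 5 seconds of work, and the stop probabilities are taken from the description.

```python
import random

TASK_WEIGHTS = {"get area": 1, "map square": 5, "review square": 5}
MAP_STOPS = {"needs more mapping": 30, "split the square": 10, "ready for review": 60}
REVIEW_STOPS = {"needs more mapping": 30, "please, review again": 10, "done": 60}


def pick(weighted, rng):
    """Pick one key of the dict with probability proportional to its value."""
    names = list(weighted)
    return rng.choices(names, weights=[weighted[n] for n in names], k=1)[0]


def simulate_step(rng):
    """Return (task, seconds of simulated work, stop reason or None)."""
    task = pick(TASK_WEIGHTS, rng)
    if task == "get area":
        return task, 0.0, None
    work_seconds = rng.uniform(1, 5)
    stops = MAP_STOPS if task == "map square" else REVIEW_STOPS
    return task, work_seconds, pick(stops, rng)


rng = random.Random(42)  # fixed seed to make the run repeatable
steps = [simulate_step(rng) for _ in range(11000)]
share = sum(1 for s in steps if s[0] == "get area") / len(steps)
print(f"share of 'get area' steps: {share:.3f}")
```

With these weights, about one step in eleven is the expensive "get the area and all its commits" request.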

Testing 100 mappers

I tested for ten minutes. There were ten new mappers per second until a maximum of 100 mappers.

  • Median Response Time: 2900 ms
  • Average Response Time: 6052 ms
  • Min Response Time: 177 ms
  • Max Response Time: 70272 ms
  • Requests/s: 12.10

The total request count was 7 310. And there was no error!

Testing 500 mappers

I tested for ten minutes again. There were 50 new mappers per second until a maximum of 500 mappers.

  • Median Response Time: 16 000 ms
  • Average Response Time: 30 431 ms
  • Min Response Time: 175 ms
  • Max Response Time: 375 465 ms
  • Requests/s: 13.99

The total request count was 8 499. There were 2.35 failures per second.

I don’t like failures, so what were these errors? Connection refused, connection closed without response, and URL not found. I think that I just overloaded the $5 per month server.

Conclusion

First, I don’t think that the average response time of 6 seconds is fantastic. However, I don’t believe that mappers at mapathon work on a square for 1 to 5 seconds, either.

The damn server currently online is not ready for big loads. It looks like one vCPU can’t handle 500 mappers. However, the damn project aims at local communities. Moreover, I have no ambitions to run a production server on my own.

During load testing, I found some database access bugs and fixed them.

The last thing – I would like to see some similar (comparable) load testing of HOT Tasking Manager. I gave up on trying. I made the Tasking Manager run but had problems with login. And I have no idea how to run queries from 100 test mappers anyway. (No, I don’t think that 1000 times GET / is comparable load testing.)

At the beginning of the year, I released Divide and map. Now. – the damn project. I present it as the proof of concept that HOT Tasking Manager can be done better.

The damn project consists of multiple repositories – server, client, manager, plugin, and deploy. Damn deploy repository contains setup used for running instance of server, client, and manager. The only changes to master branch of the damn deploy repository are secrets in .env file and email address in traefik.yml.

Why is there a separate repository for deployment? It complies with the philosophy of do one thing and do it well. The team of administrators deploying the project shouldn’t care about the development of the server or any client.

Why am I writing this? I already wrote about the improvements to the client since the damn project release. And the development continues! I like to improve things. I think that deployment is essential. Finally, there is Ray Kiddy’s comment on gitter:

Thinking more about it, I am interested in seeing something that is easy to provision and get running. So I used “flexibility” as a code word for the ability to bring up new damn instances, at will, and without much hassle.

So I was thinking about how to deploy easier.

How to deploy

I believe that if I want to describe something, the best is to start from scratch. So, here we go.

Setting up virtual private server

At do.co, I created a new Debian 10 droplet, just $5/mo. At the time of publication, this droplet is already down, and the testing instances do not work. See https://www.damn-project.org/ for the available running instances that I am not going to shut down.

Then I added test-server.damn-project.org, test-client.damn-project.org, and test-manager.damn-project.org DNS A records pointing to the IP address of the droplet. Now, I can ssh root@test-server.damn-project.org.

Prerequisites

All the deployment is in damn deploy repository. The howto is in readme file. So, I will just follow the readme.

The first command failed (git clone ...). There is no git command in my test-server and no info in the readme. Fixed!

So, install the prerequisites:

apt update && apt install -y git docker docker-compose

and clone the damn deploy repository:

git clone https://gitlab.com/damn-project/damn_deploy.git
cd damn_deploy

Set up the environment

In env file (just link to .env), I am setting up:

DAMN_SERVER=test-server.damn-project.org

Then, I generate the passwords with dd if=/dev/urandom bs=8 count=8 | base64. Just take the right part of the output. 2>/dev/null for dd command may help. (You see that some knowledge of a command line is necessary. Sorry.)

POSTGRES_PASSWORD=Sjr0jqbhsjnzBEptfvvXMAfQs2mT5LFNnpOy1TSIR1xiMgb9szInRDtuBnqszzVMZXMVw5tsYmFw
JWT_SECRET=kX5s62Ecn0vju0h0V7Lyb63OC2RIz/eZND0T9stpEpwM0dyFPizq3LXLjxxSXQOug8Uj/URaF5NZ
SESSION_SECRET=lCuCrMSM8VHDhW3dQ9xPViu0osZXl3CJRqwv4YRJ2LaMVgRfX+05zp2t78oQrOe5L4pgTajbH68I
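If the dd and base64 pipeline feels too cryptic, roughly the same thing can be done in Python. This is an alternative sketch, not what the readme says: it reads 64 random bytes (the same amount as bs=8 count=8) and base64-encodes them. Unlike the base64 command, which wraps its output at 76 characters by default, the Python call returns the whole value on one line.

```python
import base64
import secrets


def generate_secret(nbytes=64):
    """Return nbytes of cryptographically strong randomness, base64-encoded."""
    return base64.b64encode(secrets.token_bytes(nbytes)).decode("ascii")


secret = generate_secret()
print(len(secret))  # 64 bytes always encode to 88 base64 characters
```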

Time for OpenStreetMap OAuth keys. See env file how to obtain. I go to https://www.openstreetmap.org/user/qeef/oauth_clients/new page. You need to use your own OpenStreetMap username. I fill in Name (test-server-damn-project) and Main Application URL (https://test-server.damn-project.org). Only read their user preferences is necessary.

When successful, I copy Consumer Key and Consumer Secret to the env file.

OAUTH_CONSUMER_KEY=qpMXnhHl8fozTxwYBE1J9GVI8RexdaDw7ES09c0F
OAUTH_CONSUMER_SECRET=KRAlgdPoF1HRWnBPZvyt5iEwrYJQlOaEo4XMZZ9P

NOTE: Do not forget that the testing instance is not running at the time of publication of this diary. Please, do not publish your passwords, secrets, and keys.

The last entries in the env file are the DNS names of the clients:

DAMN_CLIENT=test-client.damn-project.org

and

DAMN_MANAGER=test-manager.damn-project.org

I can stick with the default versions of the damn server and clients, so the environment configuration is done.

Finally, I set up the right email address in traefik.yml, and create acme.json with:

touch acme.json && chmod 600 acme.json

Autostart after the server restart

Here, I just copy and paste into the server’s terminal the Autostart with systemd section from the readme. I will not duplicate it.

Upkeep

I will again copy and paste the code from the Damn upkeep section. There is only one script in upkeep now. Every 15 minutes, it checks for squares that have been locked for more than two hours and unlocks them.
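The check itself can be sketched in Python. This is not the upkeep script: it only finds the stale locks among the last commits of the squares, with invented data and a hypothetical function name. How the unlock is written back to the database is not described here, so the sketch stops at finding the squares.

```python
from datetime import datetime, timedelta


def stale_locks(last_commits, now, max_age=timedelta(hours=2)):
    """Return sids whose last commit is a lock older than max_age."""
    return [
        c["sid"]
        for c in last_commits
        if c["type"] == "locked" and now - c["date"] > max_age
    ]


now = datetime(2020, 5, 1, 12, 0)
last_commits = [
    {"sid": 1, "type": "locked", "date": datetime(2020, 5, 1, 9, 0)},     # 3 hours
    {"sid": 2, "type": "locked", "date": datetime(2020, 5, 1, 11, 30)},   # 30 minutes
    {"sid": 3, "type": "to review", "date": datetime(2020, 5, 1, 8, 0)},  # not locked
]
print(stale_locks(last_commits, now))
```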

Test the test instance

It should be all now. I can check if the docker containers are running with docker ps.

Then, go to the test manager, authenticate to OpenStreetMap, add some areas, go to test client, authenticate to OpenStreetMap, choose some area, map some square, and so on.

I had one complication! I don’t know how, but the acme.json I created was a directory! I noticed it when I checked the test client page and a Security Alert showed up. I just removed the directory with rmdir acme.json, ran touch acme.json && chmod 600 acme.json again, and restarted with systemctl stop damn.service && systemctl start damn.service. Then I waited a while for the certificates, and that was it.

Conclusion

I knew what I was doing, so it’s not surprising I finished the deployment in an hour, including a phone call, readme fixes, coffee preparation, some chatting, fixing some unrelated scripts for an unrelated project, and writing this diary. (In fact, I spent additional half an hour on the diary review.)

The fun is that even though I am responsible for the damn project, I don’t remember how I did many things. Therefore, I wrote this step-by-step howto in the damn deploy readme. Just to know how to fix the issues I caused.

And I am not going to say that it’s awesome, because I am biased.

(Prague) online mapathon

Posted by qeef on 3 April 2020 in English (English).

So we organized an online mapathon. I want to share what we planned, how it worked, our feedback, and my thoughts about how we should continue.

When I say we, I mean the core team. By definition, the core team are the people who go to a pub after a mapathon. Maybe it would be better to say: The Missing Maps is mainly about mapping. However, somebody has to take care of the mappers… And that’s us, the core team.

In Prague, there are Missing Maps mapathons every last Tuesday of a month (we have an exception in December), and we were not willing to give up on March’s one.

What we planned

We found that we had about 30 registered participants. We do the introduction to mapping in iD editor, we teach the JOSM basics, and we explain how to validate to advanced mappers.

The plan was to have one global chat, video stream for introduction talk, video chat for mapping groups, and video stream again for final talk(s). All of the communication channels must be accessible without registration for participants. Moreover, video chat for mapping groups should be instantly creatable to maximize flexibility.

We needed leaders of mapping groups. Fortunately, there are plenty of capable mappers in the core team. The idea was to limit the capacity of a mapping group (to about 6 - 7 people) to keep the communication comfortable.

We got prepared with Introduction to Tasking Manager and iD editor video (in Czech) and written down mapping process (in Czech, too). The video length is limited to 5 minutes. I think it increases the probability that the participants would watch it.

Now about technologies. We used the Freenode IRC channel as the global chat. Participants can connect easily over Kiwi IRC client, just specifying their names. (You can add server and channel after /next/ in the link.) We used Zoom for the introduction and final talks. And we used Jitsi for video chat of mapping groups.

We knew the links to the global chat and the Zoom meetings in advance, so we sent them to participants along with the times of the Zoom meetings, a link to the mapping task, a link to the Introduction to Tasking Manager and iD editor video, and a link to the written mapping process.

We planned the mapathon a bit shorter than the regular one. Usually, our mapathon takes 3 hours. Our first online mapathon started at 6 pm, there was the introduction talk at 6.10 pm, we started the final talks at 7.40 pm, and the whole mapathon was over a bit after 8 pm.

The last thing – we used keepbot to help with automatically creating links to video chats for mapping groups. (It turned out that it didn’t work well. I will comment on that later.)

How it worked

So users started connecting. In total, there were about 45 guys on the global chat, including the core team. I was moderating IRC. With the help of the keepbot, I was regularly sending the most important links to the channel, because IRC does not keep a history, and I changed the topic of the channel whenever the time came for the introduction talk, mapping time, or the final talks. (The last change was PIVO, of course.)

At the end of the introduction talk, I explained that there would now be links to mapping groups on the global chat, and I asked mappers to join one of the mapping groups. Because the leaders of mapping groups had registered descriptions of their mapping groups with the keepbot, I was able to send automatically generated links to the video chats of all the mapping groups.

One mapping group was created ad hoc, for mappers who had no need to learn but wanted to chat. Two of the mapping groups were dropped because there was no need for them. There were some problems with screen sharing on Jitsi, however. At least two mapping groups moved to Skype. As the keepbot is a pretty simple script, I wasn’t ready for such a situation. Fortunately, the communication of the mapping group leaders towards mappers was brilliant, and there were no significant complications when moving to another video chat.

When the time was up, I changed the topic to final talks. These were on Zoom. The participants asked the speakers questions on the Zoom chat. Few people were left on the global chat, mainly the core team. I could say that we concluded the mapathon on Zoom, announcing the number of buildings mapped.

Finally, after the mapathon, we had a short chat and gathered fresh feedback from the core team.

Our feedback

This section is about feedback from us, the core team, towards us, the core team. We had no plan to take input from mappers on the first online mapathon. We will see how many mappers come to the next online mapathon, if one is organized.

I will start with the problems. The biggest one was too many communication channels. We had the global chat on IRC, but there was a chat on Zoom, too. (We couldn’t use Zoom for the whole mapathon, because of the time limit for non-paying accounts.) And it was confusing.

There were some problems with screen sharing on Jitsi. We tested it before the mapathon, but maybe we should spend more time on it.

We chose a lousy mapping task that was for intermediate mappers. Avoid such a task for your mapathon with beginners! They wouldn’t be able to map! But we reacted early and chose another one.

Also, we found out that there is no need to limit the mapping group capacity so strictly. About ten members in one mapping group is still okay.

The rest are smaller problems, but also noted. There is no chat history in IRC or Zoom. Sometimes a participant was disconnected from IRC. Zoom does not work on Firefox.

However, we can praise ourselves, too. The mapathon had flow. We solved the problems in real-time, without much hassle.

We designed the structure and kept it, more or less.

There were enough mapping group leaders. We had three mapping groups for beginners in Czech, one in English, one mapping group with an introduction to JOSM, one mapping group with an introduction to validation, and at least one group of just mappers. We were flexible in the sense of mapping groups.

We had informative communication towards the mappers. Not just on global chat, Zoom, or Jitsi video chats, but before the mapathon, too. (We sent as much information as possible in advance.)

My thoughts about the next online mapathon

I saved a few lines for my thoughts here. I tried to be as objective as possible. However, the diary is, of course, biased.

So what are my ideas for the next online mapathon? The biggest issue is the one with the communication channels. Let’s keep the structure as pure and open as possible. Therefore, I like IRC. I wouldn’t go for Zoom again. I would try to stream the Jitsi meeting on YouTube (with the chat disabled), share the link on IRC, and discuss everything there.

I am going to change the keepbot, too. The mapping group leaders should be able to change the link to their video chats as they wish. And I should extend the keepbot for a Questions and Answers session.

I released Divide and map. Now. – the damn project at the beginning of the year. Also, I presented the damn project at FOSDEM. I was pretty nervous, and I have to work on my presentation skills. The talk was too chaotic, I know. Anyway – now, I want to present changes in the damn client since the release.

Inspired by tasking manager stats, I implemented mapping and review rates. The damn project is ready for these upgrades. All kinds of statistics can be computed from the commits.
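To illustrate how a rate can be computed from commits, here is a minimal sketch in Python. The commit fields (`type`, `date`), the type names, and the definition of a rate as squares per hour inside a time window are my assumptions for this example, not the exact data model of the damn server.

```python
from datetime import datetime, timedelta


def rate_per_hour(commits, commit_type, since, until):
    """Count commits of one type in [since, until) and divide by the hours."""
    hours = (until - since).total_seconds() / 3600
    if hours <= 0:
        raise ValueError("until must be later than since")
    count = sum(
        1
        for c in commits
        if c["type"] == commit_type and since <= c["date"] < until
    )
    return count / hours


# Hypothetical commits: "needs review" closes mapping, "done" closes review.
commits = [
    {"type": "needs review", "date": datetime(2020, 4, 1, 10, 0)},
    {"type": "needs review", "date": datetime(2020, 4, 1, 10, 30)},
    {"type": "done", "date": datetime(2020, 4, 1, 11, 0)},
    {"type": "needs review", "date": datetime(2020, 4, 1, 13, 0)},
]
since = datetime(2020, 4, 1, 10, 0)
until = since + timedelta(hours=2)

mapping_rate = rate_per_hour(commits, "needs review", since, until)
review_rate = rate_per_hour(commits, "done", since, until)
```

With rates like these, one could estimate how long the rest of an area will take by dividing the number of remaining squares by the rate.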

The mapper can map or review a random square of an area. The selection of the square to map or review is server-dependent. I am going to implement something like “map the nearest square” in the future. However, I think that the decision about a square to map should still be on the server. The user should say something like: “Hey, I am willing to contribute with mapping something like this,” and the server replies: “Good, try this one.” I don’t think that the manual square lock is a good idea.
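The “ask the server, get a square” idea can be sketched in a few lines of Python. This is only an illustration; the function name, the dictionary layout of a square, and the state names are assumptions made for the example.

```python
import random


def pick_square(squares, wanted_state, rng=random):
    """Return a random square in the wanted state, or None if there is none."""
    candidates = [s for s in squares if s["state"] == wanted_state]
    if not candidates:
        return None
    return rng.choice(candidates)


# Hypothetical squares of one area.
squares = [
    {"id": 1, "state": "to map"},
    {"id": 2, "state": "to review"},
    {"id": 3, "state": "to map"},
    {"id": 4, "state": "done"},
]

to_map = pick_square(squares, "to map")
to_review = pick_square(squares, "to review")
```

Because the choice happens in one function on the server side, swapping the random choice for “the nearest square” later would not change what the client has to ask for.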

I moved the essential procedures to the damn client library and implemented a workaround for geometry with two or fewer points: such a square is simply marked done automatically.
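The workaround could look roughly like the following Python sketch. Representing the border of a square as a list of `(lon, lat)` tuples and the state names are assumptions for illustration only.

```python
def is_degenerate(border):
    """True when the border has two or fewer distinct points (no area to map)."""
    return len(set(border)) <= 2


def initial_state(border):
    """Degenerate squares are marked done right away; the rest wait for a mapper."""
    return "done" if is_degenerate(border) else "to map"


# A line segment (two distinct points) versus a proper triangle.
segment = [(14.40, 50.08), (14.41, 50.09), (14.40, 50.08)]
triangle = [(14.40, 50.08), (14.41, 50.08), (14.41, 50.09), (14.40, 50.08)]
```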

I sped up the whole client! (For developers: if you want, you may test it on your own – git checkout v0.1.0, then back to git checkout master.) Also, I changed the design slightly. I am not a designer, so I am making small changes as I feel like it.

I added a squares map. It indeed helps mappers to picture the remarkable work done.

Next is the mappers’ score. There is a list of mappers and reviewers with their corresponding number of squares done. And the last upgrade – it’s finally possible to log out (for developers: delete the JWT token stored as a cookie; the default cookie expiration is one year), and there are statistics for the user’s last week. It would be possible to load all the user’s commits and compute overall statistics, but that takes quite long.
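A score list like this is a simple count per author. The sketch below uses Python's `collections.Counter`; the commit fields (`author`, `type`) and the type names are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter


def score(commits, commit_type):
    """Return (author, number of squares) pairs, highest count first."""
    counts = Counter(c["author"] for c in commits if c["type"] == commit_type)
    return counts.most_common()


# Hypothetical commits: "needs review" ends mapping, "done" ends a review.
commits = [
    {"author": "alice", "type": "needs review"},
    {"author": "bob", "type": "needs review"},
    {"author": "alice", "type": "needs review"},
    {"author": "carol", "type": "done"},
]

mappers = score(commits, "needs review")
reviewers = score(commits, "done")
```

Restricting the input to the commits of the last week keeps the computation cheap, which is the trade-off mentioned above.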

Maybe I will find something interesting that I would like to implement yet. But I think that the next step is to make the damn deploy a little bit easier.

Divide and map. Now.

Posted by qeef on 1 January 2020 in English (English). Last updated on 18 December 2020.

What?

There is a place to be mapped. The area is usually too large for one guy. Involving a team of mappers is, therefore, evident. And there is an excellent approach to solving big problems called divide and conquer. Therefore Divide and map. Now. – the damn project.

The goal is to split a large area into squares a human can map. Then, lock a square, work on it, and mark it ready for review. Lock it again, review the square, and mark it as done. The project is here to avoid overwriting work between mappers.
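The lock, map, review cycle can be pictured as a small state machine. The following Python sketch is only an illustration under my own assumptions: the state names, the action names, and the `Square` class are made up for this example.

```python
from enum import Enum


class State(Enum):
    TO_MAP = "to map"
    LOCKED_FOR_MAPPING = "locked for mapping"
    TO_REVIEW = "to review"
    LOCKED_FOR_REVIEW = "locked for review"
    DONE = "done"


# Allowed transitions: (current state, action) -> next state.
TRANSITIONS = {
    (State.TO_MAP, "lock"): State.LOCKED_FOR_MAPPING,
    (State.LOCKED_FOR_MAPPING, "ready for review"): State.TO_REVIEW,
    (State.TO_REVIEW, "lock"): State.LOCKED_FOR_REVIEW,
    (State.LOCKED_FOR_REVIEW, "done"): State.DONE,
}


class Square:
    """Hypothetical square; only one user can hold the lock at a time."""

    def __init__(self):
        self.state = State.TO_MAP
        self.locked_by = None

    def apply(self, action, user):
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {action!r} a square that is {self.state.value}")
        if self.locked_by not in (None, user):
            raise PermissionError(f"square is locked by {self.locked_by}")
        self.state = TRANSITIONS[key]
        # Keep the lock only while somebody is working on the square.
        self.locked_by = user if action == "lock" else None
```

A second mapper who tries to lock an already locked square gets an error instead of silently overwriting somebody else's work, which is the whole point of the locking.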

If this sounds familiar, you probably know the HOT’s Tasking Manager. (HOT stands for the Humanitarian OpenStreetMap Team.)

Why?

There are issues with the HOT Tasking Manager. Mainly performance issues. Then, slowly forgetting about essential functionality. Unwanted updates, such as the requirement for an email address (I really hate this!). And only one repository for the whole project.

These are technical issues, however. What about community issues? Why Google Drive when there is the OpenStreetMap wiki? Why Slack, where you must be invited and logged in to (just!) see the history? Why all these sign-up Google forms to contribute? I will try to guess an answer – because HOT is a company (even if non-profit) and does not know how to be an open community.

Before you ask “If you are so smart, why don’t you contribute to Tasking Manager?”, I tried 1, 2, 3, 4. However, there is no real discussion about the future of the Tasking Manager. The future is already decided; openness means that you can see the source code and fix bugs. Nothing more.

Who?

Let’s talk about the potential damn community. It starts with mappers. I will split mappers into novices, experts, reviewers (or validators), and mentors. Mappers are the most critical and the largest group, so the tool they use should be efficient, stable, and comfortable.

Novice mappers perhaps have different needs than experts or reviewers. Mentors, on the other hand, probably need to be as close as possible to novices to help them. All of them use the tool differently.

The next are guys with a place to be mapped. Their goal is to point to an area, provide the necessary information, and let mappers do their job – completely different tool usage.

The last (but not least) part of the damn community are developers. They should communicate with the whole community and deliver the tool.

How?

I will try to explain how to create a tool that satisfies Who, does What, and avoids Why. The damn project is meant to be a proof of concept.

Novice mappers, along with mentors, probably use a web browser. Make a web application then. That could include web chat, maybe? However, remember the primary goal! Let novice mappers map and make mistakes. Let mentors reach the novices as quickly as possible and mark their mistakes efficiently.

Expert users use JOSM. A JOSM plugin is the only possible solution here. Maybe other users use a different specialized tool? Make a plugin for that tool, too.

Another web application can help guys publishing places to map. But not necessarily a web app. What about some script? Maybe with some support from the ever-popular machine learning? Why couldn’t creating areas to map be done by a computer? And then just confirmed by some (perhaps another) tool?

You need to serve all these clients with a centralized data store. That’s the backend server.

For each client, create a communication channel between users and developers, namely between novice users, mentors, and mapping web app developers. Then between JOSM users and JOSM plugin developers. Finally, between all the clients’ developers and backend server developers. Everyone involved should also be able to communicate with everyone else.

Always support new clients. One tool just can’t fit all mappers.

Damn project

The damn project is composed of multiple repositories. The central application is damn_server 5 source, a backend server written in Python using the async FastAPI framework.

There are damn_client (web application, 6 source) and JOSM damn_plugin 7 as main damn clients. There is damn_manager (web application, 8 source) for creating areas to map.

All the repositories are in the damn-project group on GitLab. There is a damn-project community chat on Gitter. When needed, new channels on Gitter will be created.

The damn project is a proof of concept. It suffers from performance issues because the damn server runs on a basic DigitalOcean droplet with 1 GB RAM, 1 vCPU, 1 TB transfer, and a 25 GB SSD disk. There are, of course, more issues.

How we mapped -4%

Posted by qeef on 16 August 2019 in English (English).

To be fair, we were validating. We validated 5% of the tasks and invalidated 4%. So we went over 9% of the tasks. In 2 hours. With the group of 5 people. And one of us was validating for the first time.

The point is that the validation is not fixing. The validation is approving.

I would like to note our small contribution from Prague. Inspired by London Missing Maps mappers, we started small mapathons this summer. These mapathons are for advanced mappers, we use JOSM, and this time, it was validation time. I went over multiple sources describing how to validate. I used the OSM wiki as a primary source. Then LearnOSM, MissingMaps, and I browsed multiple links pointed out by these sources. I attended the HOT validation training webinar. I found these sources bloated, so I stripped them down and picked out what I consider helpful.

There are fewer validators than mappers. If validators want to keep up with mappers, they need to validate faster than the mappers are mapping.

Our group was validating buildings with the following workflow:

  • Find out if you are going to invalidate the task, usually because of non-orthogonalized buildings, missing buildings, lousy geometry, or buildings mapped with bad imagery.

  • Orthogonalize all the buildings with 4 nodes:

    • Ctrl + F and find buildings nodes:4 inview.
    • Press Q.
  • Upload changes and invalidate the task with a comment explaining why.

Of course, if there is no problem and everything is mapped as described in the instructions, just browse the square and check that everything is really ok. The MarkSeen plugin can help here. If there is one building with wrong dimensions or bad rotation, fix it (Utilsplugin2 and X for resizing help here a lot). If you find a second one, invalidate. Validating is approving, remember?

When you are tagging a mapper in a message in Tasking Manager (in JOSM, you can see the list of authors with their % of contribution), do it only once. You will probably validate multiple squares of that mapper. He or she does not have to read 20 messages from you saying: “Please, orthogonalize buildings.”

The last thing is that this is not a validation guide. Every validator is an individual. Each square needs a different approach. Find out what works for you. But remember:

There are always more mappers than validators. Therefore, validation must be much faster than mapping.

During validation, you are not fixing. You are approving.