OpenStreetMap

qeef's Diary


The fourth year of the Divide and map. Now.

Posted by qeef on 2 January 2024 in English.

Welcome to the summary of work done in the fourth year of Divide and map. Now. – the damn project that helps mappers by dividing a big area into smaller squares that people can map together.

Four years ago, the damn project was developed to constructively criticize the HOT Tasking Manager. (HOT stands for the Humanitarian OpenStreetMap Team.) I see that the limitations of the HOT Tasking Manager persist, that it is not getting better, and that the developers of the HOT Tasking Manager can still benefit from constructive criticism.

In this diary, I will first recap the functionality and scope of the damn project. Then I will write about the work in 2023 and the work for 2024.

Functionality and scope

To understand the damn project, let us break it down into four main components, each of which has its own repository: the server, the web clients, the JOSM damn plugin, and the deployment guide. The deployment (for sysadmins) brings up the damn project. The damn server (for backend developers) contains the core service – JSON API to the PostGIS database. The JOSM damn plugin connects to the damn server so that mappers do not have to open the web browser to contribute to OpenStreetMap when using the damn project. Finally, the web clients repository (for frontend developers) contains the code base for multiple web clients.

The features available to mappers are divided into the JOSM damn plugin and the web clients, as the damn project serves multiple groups of mappers and recognizes these groups.

Novice mappers use the mapper client, which allows them to map some square. They can choose between the random, newest, oldest or nearest (to last) square of the area they are mapping. A lightweight web client for somewhat advanced beginners is the mapper’s panel – a slim web page that appears on the right side of the monitor and pops up the iD editor on the rest of the monitor. When using map-review-done square state transitions, it is not possible to review squares using mapper clients.
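The four ways of choosing a square can be sketched in a few lines of Python. This is only an illustration and not the damn server's code: the `Square` fields, the `pick_square` name and the use of a plain distance between square centres are all assumptions made for the example.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Square:
    sid: int       # square identifier (hypothetical field name)
    updated: int   # time of the last change, any increasing number
    x: float       # centre of the square
    y: float


def pick_square(squares, how, last=None, rng=random):
    """Pick one square to map; `how` is random, newest, oldest or nearest."""
    if not squares:
        raise ValueError("no squares left to map")
    if how == "random":
        return rng.choice(squares)
    if how == "newest":
        return max(squares, key=lambda s: s.updated)
    if how == "oldest":
        return min(squares, key=lambda s: s.updated)
    if how == "nearest":
        if last is None:
            raise ValueError("nearest needs the last mapped square")
        # Nearest to the square the mapper worked on last.
        return min(squares, key=lambda s: math.hypot(s.x - last.x, s.y - last.y))
    raise ValueError(f"unknown selection: {how}")


squares = [
    Square(1, updated=10, x=0.0, y=0.0),
    Square(2, updated=30, x=5.0, y=5.0),
    Square(3, updated=20, x=1.0, y=1.0),
]
last = Square(9, updated=0, x=0.9, y=1.2)

print(pick_square(squares, "newest").sid)         # 2
print(pick_square(squares, "oldest").sid)         # 1
print(pick_square(squares, "nearest", last).sid)  # 3
```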

Reviewers, experienced mappers who care about quality, will use the JOSM damn plugin in most cases. They can map or review (random, newest, oldest and nearest) squares. There is also the option to review newbie squares, but to enable this workflow, beginners must identify themselves in the mapper client. Next, reviewers can review the work of a particular mapper by requesting a (random, newest, oldest and nearest) square of that mapper (to map or review). Also, the JOSM damn plugin can automatically download notes, which is meant for notathons.

If JOSM is not an option, there’s the panel client – a lightweight web client for experienced mappers that can do almost everything the JOSM damn plugin can do. It also allows mappers to use a wide range of editors, including mobile editors like Vespucci, for field mapping.

Finally, area managers use the manager client to create and update areas. They also use the mappy client to view the area on the map and prepare it for mapping – for example, to split or merge squares. Area managers can also use the intersecting areas website and track their areas with RSS, which can be connected to the fediverse.

This is the summary of the functionality of the damn project, which enables multiple square state transitions and different mapping workflows, highlighting the needs of specific groups of mappers.

Divide and map. Now. is a collaborative mapping tool. Our goal is to develop the best collaborative mapping tool, but nothing more.

It is not possible to create campaigns in the damn project, as there are already established guidelines for organized editing campaigns in OpenStreetMap. We recommend following the guidelines for organized editing and using permanent links to the areas that are part of the campaign.

It is not possible to send messages to other mappers via the damn project. It is always better to use the existing infrastructure. So use direct OpenStreetMap messages or changeset discussions to notify a mapper.

It’s not possible to create groups in the damn project, but there’s the OpenStreetMap user groups wiki where we can get inspired.

Work in 2023

What has happened in 2023?

First of all, we have finished refactoring the damn deploy. The instructions are simpler and clearer. The subdomains have better names. There is a new www template. The web clients can now be rebuilt faster. Docker containers no longer need to be restarted after the web clients have been rebuilt. It is simpler to install systemd services. And load testing has shown, once again, that the deployment is performant.

There are some new API endpoints of the damn server and by default the list of commits or squares returns a maximum of 1024 items. These changes were necessary to support large areas, i.e. areas with 300 000 squares.
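For a client, a capped list means reading a large area in several requests. A minimal sketch of that idea follows; it assumes offset-and-limit paging, which the text above does not confirm, and `fetch_page` and `fake_fetch` are hypothetical stand-ins rather than damn server endpoints.

```python
from typing import Callable, Iterator, List

PAGE_LIMIT = 1024  # the default maximum mentioned above


def iter_all(fetch_page: Callable[[int, int], List[dict]],
             limit: int = PAGE_LIMIT) -> Iterator[dict]:
    """Yield every item by asking for consecutive pages of at most `limit`."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:  # a short page means there is nothing more
            return
        offset += len(page)


# Stand-in for a server holding 300 000 squares (the large-area case).
SQUARES = [{"sid": i} for i in range(300_000)]
calls = []


def fake_fetch(offset: int, limit: int) -> List[dict]:
    calls.append(offset)
    return SQUARES[offset:offset + limit]


total = sum(1 for _ in iter_all(fake_fetch))
print(total, len(calls))  # 300000 293
```

With these numbers the whole area takes 293 requests: 292 full pages of 1024 squares and one last page of 992.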

We have unified the code base of the web clients: all web clients now use a common repository. We have also added a permanent link for areas, which can be used, for example, in the changeset comments. When an area is finished and made read-only, its identifier will be reused for another area in the future. Therefore, it is better to use the permanent link in cases where a permanent link is expected.

There have been minor improvements to the JOSM damn plugin. We have added an option to load multiple imageries, including WMS ones, a button to manually lock squares and we now use JOSM’s Transifex for translations.

It’s not much, but the steady improvement is obvious.

Work for 2024

And what about the work for 2024?

OAuth 1.0a will be obsolete at some point in the future. No need to rush, but it will happen. We see this as an opportunity to rethink our approach. Currently, the damn server uses OAuth 1.0a for OpenStreetMap and sends a JWT to the (damn) client upon successful authorization. We are thinking about offloading authentication to another service so that the damn server really only serves one purpose. It would also be nice to have access to the OpenStreetMap messaging API from the damn clients so that mappers can discuss using changeset discussions. We still need to think about that.

Another challenge is large areas and detailed geometries (of squares). The load testing was performed for average areas with up to 1024 rectangular squares, which is the default (maximum number and shape) of squares that the damn project uses when creating new areas. We need to prove that large areas and detailed geometries work for the damn project or limit their creation. For example, loading the intersecting areas web page now takes up to 20 seconds! That’s not good!

A lower priority is the integration of the creation of background map images into the damn_upkeep of the damn server and the never-ending work on JavaScript web clients.

And finally – the last, the best, the most important – we listen to the ideas of the mappers. The issues raised by the community are always a priority for us. However, we are not so much interested in feature requests. We are interested in the problems that mappers have when using their workflow and we do our best to solve these problems and implement the workflows.

Dive into the HOT Tasking Manager codebase

Posted by qeef on 20 December 2023 in English.

This diary is about programming, as my diaries usually are. It is about code, particularly the backend codebase of the HOT Tasking Manager (TM). The TL;DR of this diary is: the HOT TM code is an unmaintainable mess and that will not change. The HOT TM developers inherited something they can do nothing about. Please prove me wrong if you can and end my frustration.

I will try to find out why the “Error when trying to split tasks” issue happens, so deep breath and dive.

I suspect the backend, so I will leave aside the note that it could be replicated in the Chrome browser only. I will start directly at TasksActionsSplitAPI in the backend/api/tasks/actions.py file, because that looks reasonable.

In the first try block I need to find out what that SplitTaskDTO is. I always forget what “DTO” means. The import line helps, and a look into backend/models/dtos/grid_dto.py says it is a “DTO used to split a task”. Thanks. But the code looks like a data structure, so I will just remember the purpose – plain data or the like.

Only the second try is left, so let’s see. The comment on the first line says it checks that the project exists. I am just curious how it’s implemented, so see backend/services/project_service.py to find the exists method. Side note – I completely fail to understand classes with static methods only. Oh, exists is just Project.exists. How is that implemented? Just curious… See backend/models/postgis/project.py to see that exists is… it looks like a database query. I don’t know SQLAlchemy, but note the first database request. I will number them, this one being #1.

We are back at the second try. It looks like the main work is done in split_task of backend/services/grid/split_service.py, so see that (static again) method. Well, heh, not funny anymore. But I am not going to give up easily this time. So split_task now.

Get the original task with the get (static) method from backend/models/postgis/task.py. At the time of writing, there is a # LIKELY PROBLEM AREA comment in this method. However, it looks like a simple database query again. Database query #2.

Next, we… are we really using shapely to get the geometry of the original task? It is in the database, isn’t it? So why not just request the database to give us the geometry in the format we want, like in… next line? There is #3 database query to fetch the geometry of the task? I think we already have enough original geometries of the original task, don’t we?

Next, check that the original task is big enough to be split. And that the task is locked. And that the user who locked the task is the user performing this request. Looks good.

We are going to _create_split_tasks. (Yes, all the methods of the SplitService are static methods.) Wait… why do we need the zoom level? And, ugh, what were these x and y of the original task? I thought that the task has a geometry we want to split. No, we certainly create split tasks from the original task, not from its geometry. The docstring here looks outdated, at least the parameters section, but it looks like it works differently for tasks that have a task geometry and those that don’t. Or… maybe… the comment says “If the task’s geometry doesn’t correspond to an OSM tile identified by an x, y, zoom …” what task would correspond to it? Shouldn’t tasks be independent of tiles? (Do I really understand what a tile is?) Let’s guess. I guess no task geometry corresponds to an OSM tile, so go to _create_split_tasks_from_geometry.

Cool, it looks like _create_split_tasks_from_geometry “Splits a task into 4 smaller tasks based purely on the task’s geometry rather than an OSM tile identified by x, y, zoom”. That’s my expectation. How to do that? We have #4 to get the task’s geometry as GeoJSON from the PostGIS database. But that’s not enough, we use shapely again to retrieve the task’s geometry, centroid and bounds. Good, I still get that. It looks like split_geometries is a list of… yeah, it uses _as_halves and I guess that the result is four geometries representing sub-geometries of the original task’s geometry. I need a break before converting the geometries into GeoJSON features.
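To make the idea of “four sub-geometries” concrete, here is a small Python sketch that splits a bounding box into four quadrants and returns them as GeoJSON features. It is a simplification and not the HOT TM code: it works on the bounding box only and cuts at the middle of the box, where the real method uses shapely with the task’s geometry and centroid. The function name is made up.

```python
def split_bbox_in_four(min_x, min_y, max_x, max_y):
    """Return four GeoJSON Polygon features covering the bounding box."""
    mid_x = (min_x + max_x) / 2
    mid_y = (min_y + max_y) / 2
    quadrants = [
        (min_x, min_y, mid_x, mid_y),  # lower left
        (mid_x, min_y, max_x, mid_y),  # lower right
        (min_x, mid_y, mid_x, max_y),  # upper left
        (mid_x, mid_y, max_x, max_y),  # upper right
    ]
    features = []
    for x0, y0, x1, y1 in quadrants:
        # Closed exterior ring, counter-clockwise.
        ring = [[x0, y0], [x1, y0], [x1, y1], [x0, y1], [x0, y0]]
        features.append({
            "type": "Feature",
            "properties": {},
            "geometry": {"type": "Polygon", "coordinates": [ring]},
        })
    return features


parts = split_bbox_in_four(0, 0, 2, 2)
print(len(parts))  # 4
```

Everything here happens in memory, on data that one query could have fetched.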

Good, back again. split_features is that list. For each split geometry we use shapely to convert it to a multipolygon and then database queries #5, #6, #7 and #8 to… make it GeoJSON, I guess? But for sure _create_split_tasks_from_geometry returns the list of GeoJSON features representing the split geometries.

Backtrack, and the list of GeoJSON features representing the split geometries is returned by _create_split_tasks as well.

Cool, we are back at split_task now. (It has been called from the second try block, just as a reminder.) We are now ready to “create new tasks from the new geojson” as the comment says. Ok. It looks like i is from now on the max task id available in the project. I am just curious… what does that get_max_task_id_for_project (static) method do? Uh, it’s database query #9. Got it.

Now there is a loop over the GeoJSON features representing the split geometries. As the first thing we do in this loop, we… what? We check that the split geometry overlaps with the original geometry? Using shapely again? Really? Really.

The next block increments our i (the max id of the project’s tasks) by one and uses the incremented value to create a new task with from_geojson_feature. The from_geojson_feature looks innocent and I will believe that. We have create for each new task and those are database queries #10, #11, #12 and #13 – we have four split geometries, remember?

Oh no. Do I really need to deal with the task history now? Uh, yes, at least to check if there are some database queries. What is that task_history anyway? Oh, we copy_task_history for each split geometry, and that copy_task_history I do not understand, and I am lost, and there are too many database queries. But no – not this time – don’t give up! Just pretend there is no task_history for now, and go on. I see update then, and those are database queries #14, #15, #16 and #17.

We are almost done, trying to delete the original task. If the deletion, which is #18, fails, there is some rollback and new tasks deletion and more of #…. and … raise! Generic error with no description. In the issue screenshot, it looks like there is missing information on error. I am not aware of how errors are reported, but I feel that this raise caused the error from the issue.

We are heading toward the end, doing our #18 to get the project, not sure if project.tasks.count() is another database query, not sure if .filter(... and .count(... are more database queries, but save surely is #19, and we return the list of these split tasks as DTOs back to the second try block, returning it as the reply to the POST request for the HOT TM API. Uff.

To conclude the dive – no, for me this is unmaintainable. Just imagine two mappers splitting their tasks at the same time. Whoever wins the max id i at #9, the other one loses. Do you see where I am heading? There are 19 database queries spread over I don’t know how many files. And that’s just for the split task action.
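The max id race can be replayed with a tiny in-memory model. This is a sketch of the interleaving I am worried about, not a reproduction of the HOT TM: the `Project` class and `split` function are made up, and whether the real database rejects the duplicate, or what transaction isolation is in use, is not something this diary verified.

```python
class Project:
    """In-memory stand-in for the tasks table of one project."""

    def __init__(self, task_ids):
        self.task_ids = list(task_ids)

    def max_task_id(self):
        return max(self.task_ids)

    def create_task(self, task_id):
        # Models a uniqueness check on the task id (an assumption).
        if task_id in self.task_ids:
            raise ValueError(f"duplicate task id {task_id}")
        self.task_ids.append(task_id)


def split(project, max_id, parts=4):
    """Create `parts` new tasks numbered after the max id read earlier."""
    for i in range(1, parts + 1):
        project.create_task(max_id + i)


project = Project([1, 2, 3])

# Both requests read the max id before either of them writes anything.
seen_by_a = project.max_task_id()
seen_by_b = project.max_task_id()

split(project, seen_by_a)       # creates tasks 4, 5, 6, 7
try:
    split(project, seen_by_b)   # tries to create task 4 again
    outcome = "both splits succeeded"
except ValueError as err:
    outcome = f"second split failed: {err}"

print(outcome)  # second split failed: duplicate task id 4
```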

Damn load testing for the third time

Posted by qeef on 3 September 2023 in English.

The damn deploy repository of the Divide and map. Now. has been refactored. And that’s a great opportunity for another round of load testing.

This is the third round of load testing, see the first and the second one if you are interested.

The load testing is a bit different from the last time. I performed load testing of new, freshly deployed damn project instance on $6/month VPS with 1 GB RAM, single 2.5 GHz vCPU, and 25 GB SSD. (The changes from the last time are that there is no more load testing of the “production” server, the price increased by $1/month, and shared_buffers is now 256 MB instead of 409 MB.)

The preparation for load testing on the server’s side, when the damn project is deployed, is just to run

docker-compose -f damn-deploy/gen.yml run --rm prepareloadtest

to create 1000 test users and 10 (load) testing areas in the database. For each run of load testing, the database has been deleted and created again with

systemctl stop damn-http.service
reboot
docker volume rm damn-deploy_damndb-volume
systemctl start damn.target
docker-compose -f damn-deploy/gen.yml run --rm prepareloadtest

commands. For each run of load testing, log the server’s utilization with

sar -o load-test-100 -A 15 $((4 * 61)) 1>/dev/null 2>&1

Then, from that file, you can generate data series and plot the graphs with

./get-info.sh 100
gnuplot plot1.pl
gnuplot plot2.pl

where the content of the corresponding files is

get-info.sh:

#!/bin/sh
# Turn the sar data file load-test-<users> into two-column series for gnuplot.
# Usage: ./get-info.sh <users>
set -eu

U=$1
F=load-test-$U

# sed squeezes runs of spaces so that cut can address columns,
# tail drops the three header lines and head drops the last (Average) line.
sar -f "$F" | sed 's/  \+/ /g' | cut -d' ' -f1,3 | head -n-1 | tail -n+4 > "cpu-user.$U"
sar -f "$F" | sed 's/  \+/ /g' | cut -d' ' -f1,5 | head -n-1 | tail -n+4 > "cpu-system.$U"
sar -f "$F" -r | sed 's/  \+/ /g' | cut -d' ' -f1,5 | head -n-1 | tail -n+4 > "cpu-memused.$U"
sar -f "$F" -b | sed 's/  \+/ /g' | cut -d' ' -f1,3 | head -n-1 | tail -n+4 > "io-read.$U"
sar -f "$F" -b | sed 's/  \+/ /g' | cut -d' ' -f1,4 | head -n-1 | tail -n+4 > "io-write.$U"
sar -f "$F" -n IP | sed 's/  \+/ /g' | cut -d' ' -f1,2 | head -n-1 | tail -n+4 > "received-datagrams.$U"

plot1.pl:

set grid
set xdata time
set timefmt '%H:%M:%S'
set format x '%H:%M'
set yrange [0:100]

plot 'cpu-user.100' u 1:2 w l t 'CPU %user' lc 'blue', \
     'cpu-system.100' u 1:2 w l t 'CPU %system' lc 'green', \
     'cpu-memused.100' u 1:2 w l t 'MEM %used' lc 'red'
set terminal png
set output 'cpu-100.png'
replot

plot2.pl:

set grid
set xdata time
set timefmt '%H:%M:%S'
set format x '%H:%M'

plot 'io-read.100' u 1:2 w l t 'I/O reads/s', \
     'io-write.100' u 1:2 w l t 'I/O writes/s', \
     'received-datagrams.100' u 1:2 w l t 'NET received/s'
set terminal png
set output 'io-100.png'
replot

On the computer used for load testing, clone the damn server repository and set the JWT_SECRET in the damn_server/conf.py file to the same value as on the (load tested) server. Then create a virtual environment and start load testing with locust as described in the README (but don’t forget to change the URL of the (load tested) server accordingly).

cd damn-server
python3 -m venv tve
. tve/bin/activate
pip install -r requirements.loadtest.txt
locust -f tests/mapathon.py -H https://current.DOMAIN_NAME -u 100 -r 10 -t 1h --headless --only-summary --html load-test-100.html

Load testing is performed by simulating a mapathon event for one hour. Given the number of mappers (e.g. 100), 64 % are newbie mappers, 16 % are advanced mappers, and 20 % are reviewers. Every mapper maps for 30 to 60 seconds and then waits for 30 to 60 seconds. Newbie mappers only map (recent, oldest, random, or nearest) squares. They may mark the square for review, mark it as still needing mapping, or split the square. Advanced mappers may, in addition, merge squares. Reviewers review the (recent, oldest, random, or nearest) squares and decide for each square whether it is done or needs more mapping. This logic is described in the locust file.
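To get a feeling for the numbers, the mix of mappers and the pace of one mapper can be worked out in a few lines of Python. This is plain arithmetic on the percentages and timings above, not the locust file itself; the function names are made up.

```python
def role_counts(mappers):
    """Split the simulated mappers into 64 % newbies, 16 % advanced, 20 % reviewers."""
    newbie = mappers * 64 // 100
    advanced = mappers * 16 // 100
    reviewer = mappers - newbie - advanced  # the rest, 20 %
    return {"newbie": newbie, "advanced": advanced, "reviewer": reviewer}


def cycles_per_hour(map_s=(30, 60), wait_s=(30, 60)):
    """Fewest and most map-then-wait cycles one mapper can finish in an hour."""
    longest = map_s[1] + wait_s[1]    # 120 s per cycle
    shortest = map_s[0] + wait_s[0]   # 60 s per cycle
    return 3600 // longest, 3600 // shortest


for n in (100, 200, 250, 300):
    print(n, role_counts(n))
print(cycles_per_hour())  # (30, 60)
```

So for 100 mappers there are 64 newbies, 16 advanced mappers and 20 reviewers, and each of them goes through 30 to 60 map-and-wait cycles during the hour.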

Well, I know, it’s probably not how the mapathoners do their mapping, but hey. It’s at least something. The following are the results.

100 mappers

  • Average response time: 75 ms
  • Average requests per second: 2.5
  • 95 percentile response time: 200 ms
  • The worst response time: 1.3 s

[Figures: CPU and memory utilization; I/O utilization; total requests per second; response times]

200 mappers

  • Average response time: 100 ms
  • Average requests per second: 5
  • 95 percentile response time: 310 ms
  • The worst response time: 7 s

[Figures: CPU and memory utilization; I/O utilization; total requests per second; response times]

250 mappers, first time

  • Error occurrences: 1
  • Average response time: 190 ms
  • Average requests per second: 6.2
  • 95 percentile response time: 410 ms
  • The worst response time: 31 s

[Figures: CPU and memory utilization; I/O utilization; total requests per second; response times]

250 mappers, second time

  • Error occurrences: 0
  • Average response time: 100 ms
  • Average requests per second: 6.2
  • 95 percentile response time: 310 ms
  • The worst response time: 5 s

[Figures: CPU and memory utilization; I/O utilization; total requests per second; response times]

300 mappers, first time

  • Error occurrences: 16
  • Average response time: 3.5 s
  • Average requests per second: 6.8
  • 95 percentile response time: 800 ms
  • The worst response time: 323 s

[Figures: CPU and memory utilization; I/O utilization; total requests per second; response times]

NOTE: Do not forget that shared_buffers is set to 256 MB from the total of 1 GB, so 80 % of the MEM means there is no free memory left.

300 mappers, second time

  • Error occurrences: 3
  • Average response time: 190 ms
  • Average requests per second: 7.4
  • 95 percentile response time: 470 ms
  • The worst response time: 35 s

[Figures: CPU and memory utilization; I/O utilization; total requests per second; response times]

And the conclusion? The most important thing is that a mapathon of 200 mappers can still be handled.


Divide and map. Now. – the damn project – helps mappers by dividing a big area into smaller squares that people can map together.

Divide and map. Now. deploy refactored

Posted by qeef on 18 August 2023 in English.

This diary is about how the Divide and map. Now. is deployed.

Divide and map. Now. consists of multiple parts like server, JavaScript clients, JOSM plugin or web page. Each part is clearly separated and has its own repository. All are integrated within the damn-deploy, which has its own repository, too.

The dataflow of HTTP services

I am not a sysadmin. When I worked on the refactor deploy, I kept Docker and Systemd and Debian. I simplified dockerfiles, removed unnecessary ones, removed unused SQL code, added systemd units and rewrote the “How to deploy” and “How to upgrade” in the README.

I added damn-www-template with the hugo static site generator, simple blogging theme and an example content. If you care to run your own instance, you may get inspired. But don’t get limited – only a dockerfile serving your web page is expected.

I upgraded Python from 3.7 to 3.11 by changing the version in the dockerfile. I upgraded PostGIS from 11 to 15, also by changing the version in the dockerfile. I upgraded from Debian 10 to Debian 12 by backing up the database and copying it to a new droplet (a VPS with the same parameters as the original one, 1 GB / 1 CPU and 25 GB SSD) where Divide and map. Now. was freshly deployed. It took longer to propagate the DNS change than to deploy Divide and map. Now.

Yeah. That was not a fair upgrade. However, I am not a sysadmin. The content of the database is the important thing and it’s good that backing-up and restoring works.



Missing Maps Mapathons Core Team

Posted by qeef on 18 July 2023 in English.

We have organized mapathons in Prague. We planned them, met on site, trained new mappers, mapped something and went to the pub. There we discussed and planned and exchanged our ideas. We called ourselves the core team. (I am only writing in the past tense because I am not there anymore; there is actually still a core team organizing mapathons in Prague.)

I was involved in Missing Maps CZ & SK from 2016 to 2020. I was involved in organizing mapathons. During that time I wrote and still maintain the mapathoner plugin. I like free software and I know that it is different from open source. I lobbied for openness. In 2020, I published Divide and map. Now. – the damn project and still maintain it. In 2022 I wrote simple hot intersecting areas and have not updated it since.

I am not sure if it’s time for this diary, but I want to write down my opinion about the core team and the community, because we have been talking about the community and the core team all along. And in my opinion, these two terms are often misunderstood.

DISCLAIMER: These are my own views. Please read this diary accordingly. I am in no way affiliated with OSMF, Missing Maps, HOT, MSF or the Red Cross.

Have you heard of WHAT, WHY, HOW and WHO questions? This is one of my favorite ways to discuss things.

Missing Maps’ WHAT and WHY are clear from their website. Let me put it another way: map the (vulnerable people of the) world because it’s a good thing. WHO is the community, the mappers. HOW is decided by the core team.

The community has a flat structure. The community is made up of mappers who vary in experience, commitment, time spent, frequency of mapping, and participation in mapathons. It makes no sense to build any kind of hierarchy on top of that, because neither says anything about leadership or management skills. (If you think the term “management” is too corporate, please remember that “self-management” is a well-known term that has nothing to do with corporations.) A flat structure means that everyone is equal – everyone can contribute as much as they want. The most important thing is that the community, WHO, understands and agrees on WHY and works on WHAT.

However, it is wrong to assume that there are no leaders or managers in the community. It only means that their role is different from the established roles in companies (including nonprofits): leaders inspire and managers support. Now I probably owe a couple of examples: first, encouraging a mapper who was attending their fifth (or so) mapathon to try to lead an iD training session; second, the discussion of communication channels with publicly available history for all and why that is good.

Someone has to care about the mappers and that is the core team. I like to say that the core team is the people who go to the pub after a mapathon. You know… that’s where we discuss and plan and share ideas… The core team is made up of the mappers in the community who give their time, energy and money to keep things going. It’s a thankless job. They work hard to be replaceable because that’s what the community needs. They only succeed if the community works without them; the community can not depend on one person.

What about leaders and managers, and the core team? They are orthogonal. You do not need to organize mapathons to inspire and support. (Mind you, you usually go to the pub anyway.) And you do not need to inspire and support to serve the community. (But you often do – by doing.)

The core team decides HOW WHAT is done. Sure there are guidelines on the Missing Maps website, but that’s just a checklist that is not for everyone. The core team influences tools, workflows, documentation, communication, presentations, knowledge, tricks, and the overall look and feel of mapathons.

It is possible to apply WHAT, WHY, HOW and WHO to the core team as well. So, WHAT does the core team do? It takes care of the mappers, of the community. WHY does the core team do that? Because without the core team, the community struggles with HOW to do its WHAT. WHO, again, is the core team? The people in the pub. And HOW does the core team do its job? Well… Core team members need to agree on HOW, both for themselves and for the community. I believe that a free map of the world should be created with free software. I believe that openness is very important. I believe that “the end justifies the means” is a wrong statement.

Finally, I would like to give a little tip on how you can expand the core team: support others, tell them that they are doing important work, that they are doing it well, and that they are perfectly capable of supporting others.

Let’s recap some work done on the damn project.

Divide and map. Now. – the damn project – helps mappers by dividing a big area into smaller squares that people can map together.

As outlined in Work for 2023, I have been working on improvements to the web clients. The consequences are better clients and easier deployment (which is not yet documented). Also, I have restructured and slightly rewritten the https://damn-project.org/ web page.

Changes to the web clients

A new client for beginner mappers is out, see mapper. I was thinking of how to better describe the map-review-done workflow. The original client has “Show mapping square workflow diagram”, which shows an ASCII art diagram of the square’s state flow when clicked. I had an idea to show an SVG figure (generated by dot) instead of ASCII art, because it is easier to generate, maintain, and translate. SVG is text, too, so I can bear that. But wait! SVG is an HTML element, isn’t it? So it’s clickable, isn’t it? So it can be made interactive, can’t it?

[Figure: Mapper web client of the damn-project.org]

I improved the listing of areas, particularly filtering and sorting. The short help for the filter is filter >minpriority <maxpriority created (since )until /sort author and tags. There are multiple ways areas can be sorted. The sorting can be included in the “filter” input, so it is possible to use links like https://mapper.damn-project.org/#filter=(2023%20%3E1%20/id to share areas created since the beginning of 2023 with priority two or more, ordered by completion status in descending order.
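The link only looks cryptic because of percent-encoding. A short Python sketch, using nothing but the standard library, shows what the filter input contains once decoded and how such a link can be put together again. The helper names are made up and the damn clients do this in JavaScript, not Python.

```python
from urllib.parse import quote, unquote

BASE = "https://mapper.damn-project.org/"


def decode_filter(url):
    """Return the text of the filter input carried in the URL fragment."""
    fragment = url.split("#", 1)[1]
    name, _, value = fragment.partition("=")
    if name != "filter":
        raise ValueError("no filter in the URL fragment")
    return unquote(value)


def encode_filter(text):
    """Build a shareable link; '(' and '/' are left as they are."""
    return BASE + "#filter=" + quote(text, safe="(/")


url = "https://mapper.damn-project.org/#filter=(2023%20%3E1%20/id"
print(decode_filter(url))             # (2023 >1 /id
print(encode_filter("(2023 >1 /id"))  # the same URL again
```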

Since the beginning, the damn project has had a grayscale color scheme. That is me being terrible at inventing color schemes. I recognize when I do not like one, but I am not able to come up with something I like. It’s that simple. Anyway, it is possible to change the colors in settings now. If you find some nice color scheme, please let me know!

[Figure: List of areas of the damn-project.org with weird color scheme]

There is a new Quality Assurance page. It contains the panel web client, daily updated RSS feeds, intersecting areas, and finished areas.

The panel is pretty much the same as the old one. I consider the biggest improvement to be the possibility to review a square of a particular mapper. That mapping workflow is possible in the JOSM damn plugin, too.

RSS feeds are updated daily. I have added new channels tracking the abandoned areas.

The intersecting areas web page shows which areas of the damn project intersect. It is also possible to test whether the GeoJSON you want to upload as the boundary of a new area intersects with any area of the damn project. The intersection is based on the bounding box.
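What “based on the bounding box” means in practice can be shown with a small Python sketch. It is an illustration only, with made-up function names and toy coordinates, and it is not the code of the intersecting areas page.

```python
def bbox(coords):
    """Bounding box (min_x, min_y, max_x, max_y) of nested GeoJSON coordinates."""
    xs, ys = [], []

    def walk(node):
        if isinstance(node[0], (int, float)):  # a single position [x, y]
            xs.append(node[0])
            ys.append(node[1])
        else:
            for child in node:
                walk(child)

    walk(coords)
    return min(xs), min(ys), max(xs), max(ys)


def bboxes_intersect(a, b):
    """True if the two boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# An L-shaped area and a small square sitting in the empty corner of the L.
l_shape = [[[0, 0], [4, 0], [4, 1], [1, 1], [1, 4], [0, 4], [0, 0]]]
notch_square = [[[2, 2], [3, 2], [3, 3], [2, 3], [2, 2]]]
far_square = [[[10, 10], [11, 10], [11, 11], [10, 11], [10, 10]]]

print(bboxes_intersect(bbox(l_shape), bbox(notch_square)))  # True
print(bboxes_intersect(bbox(l_shape), bbox(far_square)))    # False
```

The first result is the price of the shortcut: the small square does not touch the L-shaped area at all, but their bounding boxes overlap, so a bounding box test reports an intersection.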

Last are the finished areas, accessible read-only.

Deploying web clients

Every web page of the damn project is a static web page. You can download it (see the bottom of the relevant page), open the downloaded copy, and use it.

Of course you need an Internet connection, because you communicate with the damn server, but you may save a few hundred kilobytes next time, yay!

But! The real consequence is for administrators. Because every client is a static web page, deploying a new version literally equals copying one file.

Future work

Mappy is the next web client to be rewritten to the current damn (JavaScript) client codebase. I have also some feedback and ideas for the improvements.

I plan to include at least some kind of map into the mapper web client. When mappy client is ready, I will probably use some parts for that purpose.

Refactor deploy is still ongoing and slowly improving. Unfortunately, I will update the documentation as the last thing, sorry.

I had an idea. With the new codebase it is pretty simple to build a fixed filter into the web page client. I mean… the mapper or panel web page, but with no sorting or filtering option and only a predefined list of areas shown. This approach can be used for mapping campaigns. The following three commands need to be run on the generated index.html file to achieve that:

sed -i 's|var li = H.loa(r, fv);|var li = H.loa(r, "(2023 >1 /id");|' index.html
sed -i 's|H.h_loa_menu(fv)|H.h_my_menu()|' index.html
sed -i '/R.bind_loa();/d' index.html

Uhm, I should probably keep the sorting option, though.

The third year of the Divide and map. Now.

Posted by qeef on 1 January 2023 in English.

The damn project helps mappers by dividing some big area into smaller squares that a human can map. This diary is about the work done in the third year since the publication.

And, to be honest, not much has been done. I had little time this year. Still, there are some interesting improvements.

Deployment and server

I will start with probably the most boring stuff: I worked on the documentation and tests. That is thankless work, but I believe it pays off in the longer term. In short – 35 files changed, 1675 insertions(+), 842 deletions(-) and you, as mapper, should not see the difference before/after.

In parallel, I worked on the refactoring deploy. I have moved some upkeep procedures already and I will slowly continue the work.

Notathons

The most motivating thing for me, I think, is feedback with a request like “hey, we are working on this and we need that”. This time it was from the guys organizing notathons. We improved the damn plugin for JOSM to download notes automatically and periodically. Also, when using the plugin, the changeset comment is (finally!) automatically set based on the area information.

Get inspired by issues of similar projects

I have already written about the damn project’s point of view on some issues. From that diary, I think that this issue is solved by the option to Review newbie, or by the Map or review work of other mappers workflow.

Because there are not many issues with the damn project (I do not complain!) I sometimes look up interesting issues somewhere else. So, what is there?

The first two, here and here, deal with locking multiple tasks (squares in damn) for mapping or validation (review in damn). Locking multiple squares goes against the principle of “divide and map” and is therefore problematic, but there are valid use-cases. When you need to map multiple squares, merge them first (in the mappy client). You can do the same with the squares you want to validate, but it is probably a better idea to set the mapper whose work you want to review in the damn plugin for JOSM instead.

That second option is interesting. I have slightly extended the server’s API (CreateCommit it is), so when sending requests to map or review, it is possible to specify the mapper’s name. And it works for all types of requests. I mean… probably the most usable is “review recent [square] of mapper’s name”. But you can also “map nearest [square] of mapper’s name”. Just any combination of map/review recent/oldest/random/nearest can contain the mapper’s name.

Split task for validators is a perfectly valid idea, so I have implemented it. The interesting part here is what it took to implement it: I had to change four lines of code in a single file. Good design matters, in my opinion.

The last of the interesting issues is Revert All Tasks State by Specific Username. I will not implement such functionality. However, I am willing to revert a mapper’s work manually when it is reported by at least two validators with significant contributions to the area. By reverting a mapper’s work I mean adding new commits with the to map (or other) type to the specific area. It will be a single SQL query anyway.

Please, note that I have chosen only the issues I think are interesting and worth implementing.

Work for 2023

I will continue the work on the deploy repository; its current state is inconsistent with the documentation. I think I will perform another round of load testing next year, just to check that the performance is still good.

I need to work on web clients. I will start with the client for (first-time) mappers, for finished areas, or with the manager. The goal for the next year is to distinguish clients by what mappers need instead of what clients can do. When clients are ready and the wording somehow fixed, there are still translations to do.

That’s it. If the damn project helps you, feel free to use it. Keep mapping!

The inspiration for this diary comes from the email sent to the HOT mailing list. I must say that I’m not involved in Missing Maps anymore, so I don’t currently use the HOT Tasking Manager (TM). Also, I’m the author of the competing project. That’s the disclaimer.

I will start with the point at which the HOT Tasking Manager became unacceptable for me. That was when mappers began to be forced to provide their email addresses. The reason was: just 4% of the mappers shared it.

The purpose of a Tasking Manager is to divide a big area into smaller squares that a human can map, and then to let mappers communicate what they are working on by changing the states of the squares. So, TM helps with group mapping management.

However, the changes go to the OpenStreetMap. You don’t have to use Tasking Manager to update OpenStreetMap, but you (must) use OpenStreetMap when working with a Tasking Manager. OpenStreetMap itself provides communication channels for mappers, particularly changeset discussions and private messages.

Changeset discussions are used to discuss the changes mappers make in OpenStreetMap. Private messaging is used to send direct messages between mappers. In both cases a notification is sent by email.

We are almost there. So, why is the communication within the HOT Tasking Manager wrong? Because the HOT TM duplicates the communication about things it does not manage – changes to the OpenStreetMap. Because it allows group and automated messages/emails that are, by definition, depersonalized. Because it confuses beginner mappers about which communication channels are really important.

I would like to end with a proposal for the HOT Tasking Manager developers. Please, keep the functionality of the HOT Tasking Manager non-overlapping with the OpenStreetMap. Please, do leverage OpenStreetMap for the rest.

This diary post is inspired by the cleaning up after a task manager task. It shows how to do the clean-up steps for a Divide and map. Now. area.

Data quality matters. The proposal in Johnwhelan’s diary is to run a duplicate-building script and the JOSM validator on the whole area when the area is finished. The rest of his diary deals with how to get the area’s border geometry and the OpenStreetMap data into JOSM.

Here are the steps to download the area’s geometry and the OpenStreetMap data when using the damn project:

  1. Load the area’s geometry by navigating to the area in the JOSM damn plugin, then click the Get area geometry button.

    It’s also possible to navigate to the area in the mappy (web) client, right click on an arbitrary square and download area geojson, and open the downloaded file in JOSM. It’s a good idea to right click on the created area.geojson layer and Convert to GPX layer to make it read-only, which gives the same result as using the JOSM damn plugin.

  2. Download the data from Overpass API, which is the second tab of the Download map data … dialog. You can get the Overpass query by navigating to the area in the mappy (web) client, right clicking on an arbitrary square, and choosing area overpass query. Then copy the query to the Overpass query: field of JOSM’s Download from Overpass API tab and click Download. Be sure you have enough RAM for big areas.

That’s all. It’s kind of funny that I’m writing this diary just a few days after my damn project developer’s “annual report”, but I didn’t manage to do it sooner.

The second year of the Divide and map. Now.

Posted by qeef on 1 January 2022 in English.

It’s two years since the Divide and map. Now. has been published. I would like to summarize the second year of the development.

What is it about? Divide and map. Now. – the damn project – helps mappers by dividing some big area into smaller squares that a human can map.

Why should I care? Divide and map. Now. is proven to handle a mapathon with 200 mappers. There are four clients available for mappers and multiple mapping workflows. There is a deployment guide for admins. You may create new or modify existing areas, use RSS to track areas’ changes, and check abandoned or overlapping areas with Python3 scripts.

In 2021 I’ve refactored the server and load tested it. There is the API documentation, which has been stable for more than half a year now. The web clients were also refactored: the light is a text-only web client for beginners, the panel is for advanced mappers and looks like it’s integrated into the iD editor, and the mappy web client has a square-based graphical interface. The web clients include improved statistics that can show OpenStreetMap contributors that haven’t used Divide and map. Now. when mapping in the same area. The damn JOSM plugin was updated to the new API and loads temporary data stored in the server when available.

I’ve got inspired by the Tanga Building Footprints Import, mapping highways’ radars and mirrors, and documented the mapping workflows available in the different clients.

I’ve implemented Python3 client with the scripts to find abandoned or intersecting areas based on the potential HOT tasking manager improvements and looked at the competing HOT Tasking Manager and SimpleTaskManager issues from the damn project point of view.

I’ve created the finished areas read-only service and announced the policy for finished areas.

There is some work for 2022. Web clients need translation, but I want to stick to the damn project philosophy when implementing it. Also, I want to refactor the damn deploy. Of course, the refactored guide must be at least as simple as the current one.

HOT Tasking Manager (TM), SimpleTaskManager (STM), and Divide and map. Now. (damn) are tools for collaborative mapping, with different philosophies and different approaches. In this diary, I discuss some issues of the first two from the perspective of the third one.

DISCLAIMER: I’m the damn project developer.

Which issues to consider: I filter out bug reports and issues that oppose the damn project philosophy. Then, I pick the issues I think are interesting, categorize them, describe each category, and provide some comments on how the particular issues are or would be solved.

About naming: TM’s and STM’s project corresponds to damn’s area. TM’s and STM’s task corresponds to damn’s square. Where TM uses validation, damn uses review.

Solved by design

There are many issues of both managers that wouldn’t be issues for the damn project because of its design. Let’s quickly recap that design.

There are areas divided into squares. Any change to an area is stored as a commit. So, creating a new area leads to creating new squares as well as a new commit with the to map type for each square. During collaborative mapping, new commits of different types like locked or done are added with area and square identifiers, always updating a particular square of the area. It’s not possible to delete anything.
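A minimal Python sketch of this design, assuming nothing about the server's real code: the class and method names (Commit, CommitLog, create_area, current_state) are hypothetical and only illustrate the append-only idea, where the current state of a square is simply the type of its latest commit.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Commit:
    cid: int   # commit identifier, grows with every new commit
    aid: int   # area identifier
    sid: int   # square identifier
    type: str  # e.g. "to map", "locked", "done"


@dataclass
class CommitLog:
    """Hypothetical in-memory stand-in for the commits storage."""
    commits: List[Commit] = field(default_factory=list)

    def add(self, aid: int, sid: int, type_: str) -> Commit:
        # Commits are only ever appended, never changed or deleted.
        commit = Commit(len(self.commits) + 1, aid, sid, type_)
        self.commits.append(commit)
        return commit

    def create_area(self, aid: int, square_ids: List[int]) -> None:
        # A new area gets one "to map" commit for each of its squares.
        for sid in square_ids:
            self.add(aid, sid, "to map")

    def current_state(self, aid: int) -> Dict[int, str]:
        # The latest commit of a square decides its current state.
        state: Dict[int, str] = {}
        for commit in self.commits:
            if commit.aid == aid:
                state[commit.sid] = commit.type
        return state


log = CommitLog()
log.create_area(1000, [1, 2, 3])
log.add(1000, 2, "locked")
log.add(1000, 2, "done")
print(log.current_state(1000))
```

The whole history stays in the list, so statistics or a per-square history can be computed later from the same data.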

Already implemented

Some of the requested functionality is already implemented in the damn project, or the problem described has a different solution.

  • STM: Allow custom Changeset comment – the changeset comment can be set in the manager (web) client when creating new area and updated afterwards.

  • STM: JOSM: Download data based on task geometry – the square’s GPX boundary as well as OpenStreetMap data are downloaded automatically when using JOSM damn plugin.

  • STM: Notice to be responsible while mapping a task – there is notice on each area’s page of the light (web) client just before the I will map … and I will review … links.

  • STM: Create sub-tasks to specify which part of a task is done – splitting a square creates four new squares with borders dividing the original one. New commits with to map type are added for new squares and new commit with splitted type is added for the original square.

  • STM: Chat within project – there was chat in web client, then I dropped it during the refactoring, and finally re-added it again. It’s not much, it’s mainly for demonstration, but it works.

  • STM: Feature: GeometoryCollection – (multi)polygons, linestrings, or points are allowed geometries for features in a FeatureCollection of an area being created. From each feature new square is created.

  • TM: Feature Request - Remember sort oder on tasks screen – the use-case is to help validators who validate the ‘Least recently updated’ squares. The damn project uses locking policies to solve the original issue–it’s possible to request reviewing of recent, oldest, random, nearest, or newbie square.

  • TM: Filter Tasks by Username to Include All Users - Not Just Most Recent Mappers – on the area’s statistics page, it’s possible to see all the commits, i.e. all the changes to the area and all its squares. It’s possible to filter all the commits of a mapper or all the commits of a square.

  • TM: Add active and stale filter options to Manage Projects page – for this purpose, the abandoned_areas.py Python script of the damn-client.py is used.

  • TM: Allow logged in users to create tasks without intervention – it’s the way to create areas according to the damn project philosophy.

  • TM: Define a setup to execute periodic functions – it’s the upkeep of the damn-deploy.

  • TM: undo undue split – it’s not possible to undo something in the damn project, but this issue has a solution. The issue, from the damn project perspective, is that a mapper splits the square, i.e. creates four new squares, a commit with to map type for each of the new squares, and a new commit with splitted type for the original square. Then, the mapper splits one of the new squares again, i.e. creates four new squares, a commit with to map type for each of the new squares, and a new commit with splitted type for the parent square. So the result is seven squares with to map commits. In the mappy (web) client, it’s possible to merge squares, i.e. from the seven splitted squares create a new square with the same boundary as the original square has, a corresponding commit with to map type for that square, and seven commits with merged type for the seven splitted squares.

  • TM: Track Split Tasks – showing all commits or square’s commits on the statistics page of an area does the trick.

  • TM: RSS feed of projects – see rss.damn-project.org.

  • TM: Submit task and jump right next to another one – with JOSM damn plugin, there is no need for opening the web browser (except for authentication for the first time.)

  • TM: Let mappers translate project description – this is my issue, contributing an idea to the HOT Tasking Manager. Translating the description by mappers has been implemented in the light (web) client since the beginning.

  • TM: Improve integration with AI-assisted mapping – in a web client, iD and RapiD can be switched. On your own instance, the default editor can be set.
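The bookkeeping from the undo undue split item above can be replayed in a few lines of Python. This is only a sketch with hypothetical helper names (new_square, split, merge); it ignores locking and geometry and just counts commits, using the to map, splitted, and merged types mentioned in the list.

```python
from itertools import count

next_sid = count(1)
commits = []  # append-only list of (sid, type) pairs


def new_square():
    # Every new square starts with a "to map" commit.
    sid = next(next_sid)
    commits.append((sid, "to map"))
    return sid


def split(sid):
    # Splitting creates four new squares and marks the parent as splitted.
    children = [new_square() for _ in range(4)]
    commits.append((sid, "splitted"))
    return children


def merge(sids):
    # Merging marks every source square as merged and creates one new square.
    for sid in sids:
        commits.append((sid, "merged"))
    return new_square()


def squares_in_state(state):
    # The latest commit of each square decides its state.
    last = {}
    for sid, type_ in commits:
        last[sid] = type_
    return sorted(sid for sid, type_ in last.items() if type_ == state)


original = new_square()
first_level = split(original)
second_level = split(first_level[0])
to_map = squares_in_state("to map")
print(len(to_map), "squares to map after two splits")

merged = merge(to_map)
print(squares_in_state("to map"), "is the only square to map after the merge")
```

Nothing is undone here: the two splitted commits and the seven merged commits stay in the history, and only the newly created square is left to map.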

Boring (client) stuff the server is ready for

The server’s API is ready for a lot of things the clients don’t do, because I’m not a frontend developer and I don’t enjoy working on the clients. The following issues could be solved easily, because it’s possible to download all the commits of an area and all the information is within these commits.

An example of what is done in the web client: the area’s statistics page shows the hours left when mapping or reviewing at the same speed as in the last hour/3 hours/day/week/overall.
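How such an estimate can be computed from the commits is sketched below in Python. The formula is my assumption about a reasonable approach (remaining squares divided by the speed observed in the chosen window), not a copy of the web client's code, and hours_left is a hypothetical name.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def hours_left(finished_at: List[datetime], remaining: int,
               now: datetime, window: timedelta) -> Optional[float]:
    """Estimate the hours left if work continues at the speed seen in `window`."""
    recent = [t for t in finished_at if now - window <= t <= now]
    if not recent:
        return None  # no progress in the window, so there is no estimate
    window_hours = window.total_seconds() / 3600
    # speed = len(recent) / window_hours squares per hour
    return remaining * window_hours / len(recent)


now = datetime(2021, 6, 1, 12, 0)
# Made-up times at which squares were finished: 5, 20, 50, and 170 minutes ago.
done = [now - timedelta(minutes=m) for m in (5, 20, 50, 170)]

for label, window in (("last hour", timedelta(hours=1)),
                      ("last 3 hours", timedelta(hours=3))):
    print(label, hours_left(done, 30, now, window))
```

The same function covers the day, week, and overall variants by passing a different window.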

Some ideas to get inspired

As mentioned in the beginning, I filter out bug reports, issues that oppose the damn project philosophy, and even more. Therefore, the list of issues in this diary is not–and is not meant to be–complete. However, it’s nice to get inspired. There are some interesting ideas in both managers that I get inspired by:

The idea to map radars and mirrors of the Czech highways was raised in the mailing list of the Czech community. One of the comments asks how to track the work done–and that’s exactly the problem to be solved by the damn project. (Note that the idea to map radars and mirrors of the Czech highways is still just an idea and, as far as I’m aware, no further steps have been taken yet.)

I already wrote two Get inspired by … diaries. These were the potential HOT tasking manager improvements and the Tanga Building Footprints Import. This diary discusses the use of the damn project for tracking the work done on highways.

Create area

The damn project helps mappers by dividing some big area into smaller squares that a human can map. How does this apply to highways?

It’s possible to upload two kinds of GeoJSON boundary files to the damn manager. If the FeatureCollection has member ‘name’ then divide to squares function is NOT used. Moreover, I’ve updated the damn project to understand what to do with a feature containing a LineString geometry: create a “square” with the border around the LineString. (“Square” here means a square of an area of the damn project.)

So… when you download some highway=motorway ways within some interesting bounding box (I mean from Overpass API in JOSM), save them as GeoJSON, and add the name member, you are able to upload that as a GeoJSON boundary file to the damn manager.
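Adding the name member can be done by hand in a text editor, or with a few lines of Python as sketched below. The sample FeatureCollection and the value of name are made up for illustration, and add_name is a hypothetical helper; I only assume that name is a top-level member of the FeatureCollection, as described above.

```python
import json


def add_name(feature_collection: dict, name: str) -> dict:
    """Return a copy of the FeatureCollection with a top-level 'name' member."""
    if feature_collection.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    named = dict(feature_collection)
    named["name"] = name
    return named


# Made-up stand-in for the GeoJSON saved from JOSM.
exported = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"highway": "motorway"},
            "geometry": {
                "type": "LineString",
                "coordinates": [[14.40, 50.00], [14.50, 50.05]],
            },
        }
    ],
}

boundary = add_name(exported, "motorways")
print(json.dumps(boundary, indent=2))
```

With a real export, load the file with json.load, call add_name, and write the result back with json.dump.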

Note that I experimented with joining ways to make larger squares of an area. The JOSM damn plugin then refused to automatically download the OSM data of a square because the square was too large, and the data had to be downloaded manually.

Mapping

The mappy client may help to visualize the area. An example is area 2352 and it looks way better with background image switched off. The JOSM damn plugin, light client, or panel using map random square/map nearest square are better options, though.

One more minor update to the damn project concerns the splitting of squares. I added split horizontally and split vertically commit types that are currently available in the mappy client. This is because splitting a highway square sometimes led to tiny squares in the middle of the highway square. (Imagine a highway running from the bottom left to the top right of a square that is split into four sub-squares.)

Conclusion

It’s possible to track the work done for highways mapping. I know the damn clients could be (significantly) better. I still hope for someone to come and write their own client for the damn server API from scratch. That would be fantastic, at least because the development would become decentralized. Until then, the mappers have no option other than to suffer from my design.

And that’s about tracking the work done on highways. I wonder if logging the data and importing them with the idea of the last get inspired by … diary wouldn’t be a better option for this particular project.

As I’m not yet sure what will be included in the alpha version of the damn project, I’m getting inspired. This time by Tanga Building Footprints Import announced on the imports mailing list.

It looks like the buildings are already imported in the Tanga city, e.g. this one, but I’ve never done the import thing, so I can’t say.

In this diary, I’m going to introduce an example mapping workflow with the prepared data based on the Tanga Building Footprints Import using the damn project. Not everything is fully automated yet, so this diary is more a description of a proof of concept. However, I’m open to the needs of the future.

Create area

There is the dataset of the buildings available on the wiki page. I like that idea of huge GeoJSON file that includes the features with the corresponding geometries. As I’m not the import guy, I’m confused with _key_s in the properties (I mean these underscores of each key.) I need only building:yes tag, so I’ve renamed just those (note that notepad’s “Replace All” would do the same:)

sed -i 's/_building_/building/g' buildings.geojson
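The same rename can also be done on the parsed GeoJSON instead of the raw text, which avoids touching anything other than property keys. A small Python sketch follows; the sample feature and its values are made up, and strip_key_underscores is a hypothetical helper, not part of damn-client.py.

```python
import json


def strip_key_underscores(feature_collection: dict,
                          keys=("_building_",)) -> dict:
    """Rename the selected property keys in place, e.g. _building_ -> building."""
    for feature in feature_collection.get("features", []):
        props = feature.get("properties") or {}
        for old in keys:
            if old in props:
                props[old.strip("_")] = props.pop(old)
    return feature_collection


# Made-up sample; the real dataset is one huge FeatureCollection.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"_building_": "yes", "_source_": "survey"},
            "geometry": {"type": "Point", "coordinates": [39.1, -5.07]},
        }
    ],
}

renamed = strip_key_underscores(sample)
print(json.dumps(renamed["features"][0]["properties"]))
```

Only the keys listed in keys are renamed, so the other underscored keys stay as they are.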

I’ll create new area to work on. Therefore, I need the geometry of the whole area, where the buildings are. For that, I’ll use the damn-client.py scripts. (The shapely must be installed.)

cd damn-client.py
./convex_hull.py buildings.geojson > ch-buildings.geojson

The command took about a minute on my laptop. The ch-buildings.geojson can be used as the GeoJSON boundary file in the damn project’s manager.
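To show what is going on, here is a pure-Python sketch of a convex hull over all the coordinates of a FeatureCollection. It is not the convex_hull.py script (which relies on shapely); it swaps in Andrew's monotone chain algorithm, treats longitude and latitude as plain planar coordinates, and skips GeometryCollection geometries. All function names are hypothetical.

```python
def points_of(geometry):
    """Yield every (x, y) position of a GeoJSON geometry, whatever its nesting."""
    def walk(coords):
        if coords and isinstance(coords[0], (int, float)):
            yield (coords[0], coords[1])
        else:
            for part in coords:
                yield from walk(part)
    yield from walk(geometry.get("coordinates", []))


def cross(o, a, b):
    # Positive when o -> a -> b turns counter-clockwise.
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns the hull counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_feature_collection(fc):
    """Wrap the hull of all features into a one-polygon FeatureCollection."""
    pts = [p for f in fc["features"] for p in points_of(f["geometry"])]
    ring = convex_hull(pts)
    if len(ring) < 3:
        raise ValueError("not enough distinct points for a polygon")
    ring.append(ring[0])  # a GeoJSON ring repeats its first position
    polygon = {"type": "Polygon", "coordinates": [[list(p) for p in ring]]}
    return {"type": "FeatureCollection",
            "features": [{"type": "Feature", "properties": {},
                          "geometry": polygon}]}


# Two made-up square "buildings".
buildings = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "properties": {}, "geometry": {"type": "Polygon",
     "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]}},
    {"type": "Feature", "properties": {}, "geometry": {"type": "Polygon",
     "coordinates": [[[3, 3], [4, 3], [4, 4], [3, 4], [3, 3]]]}},
]}
hull = hull_feature_collection(buildings)
print(hull["features"][0]["geometry"]["coordinates"][0])
```

For a real import I would stay with the shapely-based script; the sketch is only meant to make the idea of the boundary file concrete.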

Buildings of the squares

So I have an area to work on, divided into squares. Now, I need the GeoJSON files containing all the buildings of each square. I’ll use the damn-client.py scripts once again:

cd damn-client.py
./prepare_tmp.py https://server.damn-project.org 2351 buildings.geojson

Note the use of the original buildings.geojson file. The command took about 10 minutes on my laptop. The generated directory 2351 must be copied manually to the deployed version of the damn project. Uploading of the area’s directory is not yet automated.

Squares with no buildings

This is the hacking part. I have the area with the convex hull border that covers all the buildings and is divided into squares. Consequently, some squares could contain no buildings. I want to mark all these squares as done, so I need to:

  1. Get all the area’s squares.
  2. Get buildings for each square.
  3. If there is no building, lock the square.
  4. When all building-free squares are locked, merge them, and mark the merged square as is done.

Doing this manually is possible in the mappy client, but crazy. The automation of this task can be done within the developer’s console (F12 in Firefox,) using the functions of the damn-client.js, though:

var aid = 2351;
api.get_squares(function(squares) {
    for (var i in squares) {
        api.get(
            function(r)
            {
                if (r["features"].length == 0) {
                    api.post_commit(
                        function (r2)
                        {
                            console.log(r2["sid"] + " locked");
                        },
                        aid,
                        {"type": "lock", "sid": r["sid"]},
                        function() {},
                    );
                }
            },
            function() {},
            api.ep(
                "/area/" + aid
                + "/square/" + squares[i]["sid"]
                + "/tmp"
            ),
        );
    }
}, aid, function() {});

Now, I need to wait until all the squares with no buildings are locked. This took about 2 minutes on my laptop. Then, the last steps are: right click in the mappy client and merge, right click on the merged square and lock, and finally the last right click on the merged square and is done.

Mapping with JOSM

Everything is ready for the mapping. Please, note the limitations:

  • Do not use split and merge, otherwise the prepared data are not downloaded.
  • The prepared data are only available when using JOSM damn plugin.
  • I know, the JOSM damn plugin is the weakest point of the damn project.

I would recommend the following workflow:

  • From the damn plugin dialog, select the area to map.
  • Click map button to load some square and corresponding buildings.
  • Use mapathoner plugin to Select Duplicate Buildings.
  • Use todo plugin to keep track of them.

Do not save the work to the OpenStreetMap! This diary is about possibilities, not about the real work on the Tanga Building Footprints Import!

Conclusion

That’s it. Preparing the data is not fully automated, but the tools are there. The weakest point of the mapping workflow is the JOSM damn plugin which is ugly, but working. These are reasons why I’m talking about describing the proof of concept. However, that proof of concept is ready for the broader testing.

This diary is about load testing. I find load testing useful mainly for two reasons: 1. it shows multiuser access problems; 2. it shows how good the implementation is.

The idea is to simulate a mapathon. A mapathon is an event where multiple mappers map the same (big) area. To manage the work, the area is divided into smaller squares. Then, each mapper asks the server to map or review a square, works some time on the square, and finally asks the server to mark the square as needs mapping, needs review, or is done.

I use locust.io to load test the damn server. The test file is part of the repository. There are 80 % mappers, 20 % reviewers, and 2 testing areas. Each mapper/reviewer works for 30 - 60 seconds, then waits for 30 - 60 seconds, and then works again. All the mappers are spawned within the first minute. The test is stopped soon after there is 0 % to map for both of the areas.

damn-project.org runs on $5/month VPS with 1 GB RAM, single 2.2 GHz vCPU, and 25 GB SSD disk.

100 mappers handled

The first round of load testing gives the idea about the server after the refactoring. The average response time for 100 mappers is 270 ms. Not bad, I would say.

It’s a bit worse when testing 200 mappers, though. I’ve stopped the test after 20 minutes.

Some improvements

Two years ago, I decided that there must be some better solution for the problem of dividing a big area into smaller squares. I found that there is the PostGIS extension to PostgreSQL, so I decided to go with it. Two years later (now), I’ve discovered database indexes.

Also, I’ve read part of the documentation and set postgres parameters.

Finally, I switched gzip on.

200 mappers handled

Pretty small changes, but I had to figure them out. I noted my journey in this thread. Anyway, with the changes applied, I load tested again. The average response time of 275 ms for 200 mappers looks good.

Writing good code is green. Don’t waste energy.

At the beginning of the year, I was refactoring the Divide and map. Now. project. I’ve already written about the clients and about the improvements. In this diary, I would like to share some details about how the server looks now.

The damn project distributed architecture is depicted below. I’ll write about the api and db bubbles, where db is the PostgreSQL database with the PostGIS extension and the api is the python FastAPI application.

damn project architecture

Damn server API

Automatically generated documentation of the API is useful for the developers of clients. There are some not so interesting endpoints dealing with authentication and authorization, user updates, or squares’ GPXs. I will elaborate on the /areas, /geometries, /area/AID, /area/AID/commits, and /area/AID/squares endpoints here.

Listing the areas and adding new areas is done through the /areas endpoint. However, a returned area is different from an area to be created. A returned area contains statistics, while an area to be created contains a FeatureCollection with the area’s geometry and the information about how to divide the area.

The /geometries endpoint is used in python client, particularly in the script detecting intersecting areas.

Getting information about an area and updating the area information is done through /area/AID endpoint, where AID is the area identifier. Currently, four digits are used for the area identifier. Yes, the number of areas of the damn project is limited to 8 999. I’m going to increase the limit to five digits – 89 999 – when needed.

There is /area/AID/squares endpoint providing the list of squares’ geometries for the area. Nothing more.

The endpoint where the real work is being done is /area/AID/commits. The idea is to get the area’s commits once, store the commits, and use ?since= query parameter in the next query to get just new commits. (JavaScript damn API library may help here.) Getting the commits helps to decide the square’s current state (e.g. to map or done,) compute statistics, or show the history of the area’s information.
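The get once, then ask only for news idea can be sketched in Python as below. The HTTP call is replaced by an injected fetch function so the sketch runs without a server; the shape of a returned commit (cid and date fields, ISO 8601 date strings) and the exact semantics of ?since= are my assumptions, and CommitCache is a hypothetical name.

```python
from typing import Callable, Dict, List, Optional

# A fetch function gets the value for ?since= (or None) and returns commits.
Fetch = Callable[[Optional[str]], List[dict]]


class CommitCache:
    """Keeps all commits seen so far and asks only for the newer ones."""

    def __init__(self, fetch: Fetch):
        self._fetch = fetch
        self.commits: Dict[int, dict] = {}  # keyed by cid to drop duplicates
        self.since: Optional[str] = None    # date of the newest commit seen

    def refresh(self) -> int:
        added = 0
        for commit in self._fetch(self.since):
            if commit["cid"] not in self.commits:
                self.commits[commit["cid"]] = commit
                added += 1
            # ISO 8601 strings of the same format sort chronologically.
            if self.since is None or commit["date"] > self.since:
                self.since = commit["date"]
        return added


# Fake server data standing in for GET /area/AID/commits?since=...
SERVER = [
    {"cid": 1, "sid": 1, "type": "to map", "date": "2021-03-01T10:00:00"},
    {"cid": 2, "sid": 1, "type": "locked", "date": "2021-03-01T10:05:00"},
]


def fake_fetch(since: Optional[str]) -> List[dict]:
    # A real client would send an HTTP request here instead.
    return [c for c in SERVER if since is None or c["date"] >= since]


cache = CommitCache(fake_fetch)
first = cache.refresh()
SERVER.append({"cid": 3, "sid": 1, "type": "to review",
               "date": "2021-03-01T10:40:00"})
second = cache.refresh()
print(first, second, len(cache.commits))
```

Deduplicating by cid means the cache works whether the server treats since as inclusive or exclusive.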

It is not possible to change a commit. It’s only possible to add a new one. And creating a commit is as simple as sending {"type": "map random"} to the server. The example says to the server: “Let me map some random square”. The server replies: “201: Created. Here is the area’s ID, square’s ID, and the square’s border.” The commit message is optional. It’s currently possible to map a recent, random, or nearest square, to review a recent, random, nearest, or newbie square, and to manually lock a square specified by the square’s ID. When unlocking a square, send the type of needs mapping, needs review, is done, or split along with the square’s ID, which is compulsory when unlocking. However, there is one exception. When multiple squares are locked, a commit with the merge type will create a new square from all the locked squares. There is no need for the square’s ID in that case. (Similar to areas, the returned commit has a different set of types than the commit to be created.)
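The rules of this paragraph fit into a small helper that builds the request body, sketched here in Python. Only {"type": "map random"}, the lock type, and the is done type with sid appear literally in my diaries; the spelling of the other type strings and the message field name are assumptions taken from the prose, and commit_body is a hypothetical function.

```python
import json
from typing import Optional

# Types where the server chooses the square (assumed spelling).
SERVER_CHOOSES = {
    "map recent", "map random", "map nearest",
    "review recent", "review random", "review nearest", "review newbie",
}
# Types that need the square's ID: the manual lock and all unlocking types.
NEEDS_SID = {"lock", "needs mapping", "needs review", "is done", "split"}
# Merging works on all squares locked by the mapper, so no ID is needed.
NO_SID = SERVER_CHOOSES | {"merge"}


def commit_body(type_: str, sid: Optional[int] = None,
                message: Optional[str] = None) -> dict:
    """Build the JSON body of a new commit, checking the square's ID rule."""
    if type_ in NEEDS_SID:
        if sid is None:
            raise ValueError(f"'{type_}' needs the square's ID")
    elif type_ not in NO_SID:
        raise ValueError(f"unknown commit type: {type_}")
    body = {"type": type_}
    if type_ in NEEDS_SID:
        body["sid"] = sid
    if message:
        body["message"] = message  # the commit message is optional
    return body


print(json.dumps(commit_body("map random")))
print(json.dumps(commit_body("is done", sid=1, message="All buildings mapped")))
print(json.dumps(commit_body("merge")))
```

A client would then send the body to the /area/AID/commits endpoint and expect the 201: Created reply described above.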

Source code structure

I’ve changed the file structure, too. However, it’s probably not very interesting. conf and db are unchanged. I’ve added Pydantic models and docstrings to the user file. square just gets the border from the database and returns it as GeoJSON or GPX.

api is the FastAPI file. This file is used as the MODULE_NAME of the uvicorn docker. As soon as FastAPI stops being the number one of the Python JSON API frameworks, just change this file.

I’ve only added Pydantic models and some docstrings to the area file. save, load, and update functions are still here.

I’ve moved all the lists of any kind to the list_of file. Database queries for the list of areas, geometries, area commits and squares, and user commits are stored here.

The last file is the new file. This file deals with the database queries for creating the commits. The original idea was to move here the queries that create something new in the database. However, when you recall the API, only a new area or commit is created. And it’s just cleaner to keep area.save, in my opinion.

SQL queries

The database structure is not changed. Just to recall:

current_areas -- stores area information
- aid -- primary key
- tags -- the default changeset comment
- priority -- how to order the areas
- description -- JSON with {"lang code": "description", ...}
- instructions -- JSON with {"what": "link to wiki", ...}
- featurecollection -- GeoJSON
- created -- automatically for new areas, UTC now()

current_squares -- stores area's squares
- sid -- along with aid primary key
- aid -- foreign key to current_areas
- border -- geometry

current_commits -- stores area's commits
- cid -- primary key
- sid -- along with AID foreign key to current_squares
- aid -- foreign key to current_areas
- date -- automatically for new commits, UTC now()
- author -- foreign key for users.display_name
- type -- to map, to review, done, locked, ...
- message -- author's note

I’m going to share the interesting parts (in my opinion) of the queries. The complete code of the server and the database is available, of course.

Let’s start with guessing. What does the following query do?

WITH last AS (
    SELECT DISTINCT ON (sid) cid, sid, type
    FROM current_commits
    WHERE aid=$1
    ORDER BY sid, cid DESC
),
toreview AS (
    SELECT *
    FROM last
    WHERE type='to review'
    ORDER BY cid DESC
    LIMIT 1
)
INSERT INTO current_commits (sid, aid, author, type, message)
SELECT sid, $1, $2, 'locked', 'Working on review'
FROM toreview
RETURNING sid

This is non-refactored code, in fact. It just locks the recent square for review. Locking of the random square uses RANDOM() instead of cid DESC, as you probably guessed.

To use some PostGIS, let’s lock the nearest square, hey?

WITH last AS (
    SELECT DISTINCT ON (cc.sid)
        cid, cc.sid, cc.aid, type, author, border
    FROM current_commits AS cc
    INNER JOIN current_squares AS cs
    ON (cc.aid=cs.aid AND cc.sid=cs.sid)
    WHERE cc.aid=$1
    ORDER BY cc.sid, cid DESC
),
mine AS (
    SELECT border
    FROM current_commits AS cc
    INNER JOIN current_squares AS cs
    ON (cc.aid=cs.aid AND cc.sid=cs.sid)
    WHERE cc.aid=$1
    AND author=$2
    ORDER BY cid DESC
    LIMIT 1
),
toreview AS (
    SELECT sid, cid, type
    FROM last, mine
    WHERE type='to review'
    ORDER BY ST_Distance(mine.border, last.border)
    LIMIT 1
)
INSERT INTO current_commits (sid, aid, author, type, message)
SELECT sid, $1, $2, 'locked', 'Working on review'
FROM toreview
RETURNING sid

Border is stored in current_squares table so it has to be joined with the current_commits. The interesting part is:

ORDER BY ST_Distance(mine.border, last.border)

And the interesting part for the newbie’s square review request is:

WHERE type='to review'
AND (info::json->>'newbie')::timestamp > now()
ORDER BY RANDOM()

The mappers mark newbies themselves, as I’ve explained in an older diary. In the same diary, I’ve mentioned the merging of the squares. However, I didn’t give details. It’s not rocket science. The main part is done by PostGIS again – the ST_Union function in this case.

...
to_be_merged AS (
    SELECT border
    FROM current_squares cs, locked lo
    WHERE cs.sid=lo.sid
    AND cs.aid=lo.aid
),
merged AS (
    SELECT ST_UNION(border) as mb
    FROM to_be_merged
)
INSERT INTO current_squares (sid, aid, border)
SELECT f.msid + row_number() OVER () as sid, f.maid as aid, mb
FROM
    (
        SELECT MAX(cs.sid) as msid, MAX(cs.aid) as maid
        FROM current_squares cs, locked lo
        WHERE cs.aid=lo.aid
    ) as f,
    merged
RETURNING sid, aid

Here, locked is the query that returns all the squares locked by the author. (Note that the locked information is stored in current_commits, but the border in current_squares. Also, some parts of the code are pointless.)

The best for the last. I’ve refactored the pgsql function for dividing up an area or square – the st_divide function. No code snippet here. Rather look at the whole source code. The function gets area geometry, number of squares in x and y axis, and returns a set of the divided area’s geometries. (The number of squares in x and y is overloaded by square’s dx and dy dimensions.) I’ve also tuned the squares a little bit, so it’s possible to divide up the area to rectangles, diamonds, flat hexagons, pointy hexagons, or the wall bricks.
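For a rough idea of the rectangle case only, here is a Python sketch that divides a bounding box into nx times ny cells. It is not the pgsql st_divide function: it works on a plain bounding box instead of the area geometry, does not clip the cells to the area, and does not cover diamonds, hexagons, or wall bricks. The name divide_bbox and its parameters are hypothetical.

```python
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)
Ring = List[Tuple[float, float]]


def divide_bbox(bbox: BBox, nx: int, ny: int) -> List[Ring]:
    """Divide a bounding box into nx * ny rectangles, row by row."""
    if nx < 1 or ny < 1:
        raise ValueError("nx and ny must be at least 1")
    min_x, min_y, max_x, max_y = bbox
    dx = (max_x - min_x) / nx  # width of one cell
    dy = (max_y - min_y) / ny  # height of one cell
    cells = []
    for j in range(ny):
        for i in range(nx):
            x0, y0 = min_x + i * dx, min_y + j * dy
            x1, y1 = x0 + dx, y0 + dy
            # Closed ring, counter-clockwise, as a GeoJSON polygon expects.
            cells.append([(x0, y0), (x1, y0), (x1, y1), (x0, y1), (x0, y0)])
    return cells


cells = divide_bbox((0.0, 0.0, 4.0, 2.0), nx=4, ny=2)
print(len(cells), "cells, the first one is", cells[0])
```

Passing the square's dimensions instead of the counts, as the overloaded variant does, would only change how dx and dy are obtained.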

Conclusion

The thing I like the most about the server from the API point of view is that the whole mapping workflow – map -> review -> done – is available through a single endpoint with trivial messages such as {"sid": 1, "type": "is done"}.

The thing I like the most about the server from the database point of view is that everything is done by SQL queries – in fact, the Python application “just” translates JSON to SQL.
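
To give an idea of what “translating JSON to SQL” can look like, here is a minimal Python sketch. It is an illustration under assumptions, not the server’s code: the function name commit_to_sql is hypothetical, the column list is taken from the INSERT statements shown in these diaries, and whether the server stores the message type verbatim is a guess.

```python
import json
from typing import Tuple


def commit_to_sql(body: str, aid: int, author: str) -> Tuple[str, tuple]:
    """Turn a JSON commit message into a parametrized INSERT.

    Assumption: the values are passed separately from the SQL text,
    so nothing from the JSON body is ever pasted into the query string.
    """
    msg = json.loads(body)
    sql = (
        "INSERT INTO current_commits (sid, aid, author, type, message) "
        "VALUES ($1, $2, $3, $4, $5) RETURNING sid"
    )
    params = (
        int(msg["sid"]),
        aid,
        author,
        str(msg["type"]),
        str(msg.get("message", "")),  # hypothetical optional field
    )
    return sql, params
```

Calling commit_to_sql('{"sid": 1, "type": "is done"}', 7, "qeef") returns the query text and the tuple of values to bind.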

I liked Pete’s diary at first sight. However, when it was published, I had already decided on my next work. Last Thursday, I finally found some time to look closer at this interesting topic. I spent the night working on it, and I will share the results in this diary.

First, I will not write about the HOT Tasking Manager. Second, these are my opinions. Third, read damn in the following text as short for Divide and map. Now.

A summary of my understanding of the original diary follows, with my thoughts appended. I made a few toots during the night, which can serve to track the time spent.

What is TM?

Questions:

  • What is the TM?
  • What is the mapping process?
  • What is the origin of a project?
  • What is the ownership of a project?
  • Who is the TM for?

Problems:

  • New mappers don’t understand (because they don’t have the answers).
  • New mappers’ edits frustrate old mappers.

Consequences:

  • Discourage new mappers.
  • Learning process interruption.

Thoughts:

The damn project has an advantage here, because the name says exactly what the project does. Using the notation of areas, squares, and commits improves clarity, too.

As there are multiple clients, each of them targets a different group of mappers. Therefore, it’s easier to improve the client for newbie mappers without limiting the experienced ones.

Each commit to the area is stored and shown when requested (on the statistics page), which improves the understanding of the area’s history.

Project creators

Original problem:

  • When TM was opened, the data quality was low due to poor project description.

Solution:

  • Onboarding process. Making sure that managers fulfill the expectations of having skills to:

    • engage local communities and contributors,
    • respond to questions and info requests,
    • ensure the documentation of the project is on the wiki (in the case of organised editing),
    • ensure that standards are met,
    • ensure that the mappers are able to do what is requested.

Problems:

  • One manager for many, many projects.
  • Responsibility for the onboarding process.
  • Strengthening of the onboarding process.

Thoughts:

The original problem is interesting here. With the damn philosophy and the code of conduct in mind, the solution must be different.

The damn project supports mappers by creating, maintaining, and inventing tools that mappers may use to keep the quality up. In OpenStreetMap, anyone can make a change, see it, and fix it. In the damn project, anyone can make a commit to an area, see all the commits, and make another commit to change the area.

Areas of interest for beginners

Problem:

  • Bad categorisation of the project.

Solution:

  • The onboarding process (see above) forbids categorising an urban environment as a project for beginners.

New problem:

  • The solution works for one situation only.

Thoughts:

Allowing anyone (authenticated to OpenStreetMap) to change the area helps here. As any change to the area is a commit itself, reverting the change is easy peasy.

Project documentation

Problem:

  • Onboarding process does not emphasise the importance of the project documentation or link.

Thoughts:

The same as above. The openness helps here. The support should come only after the area is created. It’s similar to reviewers helping new mappers by saying “Hello, thanks for mapping, and by the way – please, square the buildings.”

A special case of changing the area is adding a translation of the area description. Since the beginning of the damn project, this particular change has been possible directly from the web client.

Unclosed projects

Problem:

  • Inactive projects hang around for a long time.

Consequences:

  • Incomplete data in the project’s area.
  • Issues for new projects with overlapping area.
  • Hard to find people involved in the project.

Thoughts:

I solved the technical part of this issue with the abandoned_areas.py script (see https://git.sr.ht/~qeef/damn-client.py).

The example output of the script:

$ ./abandoned_areas.py https://server.damn-project.org
Areas abandoned on https://server.damn-project.org since 2020-12-29T20:55:21.
In last 90 days:
- 7114 has only 1 commits
- 2245 has only 1 commits
- 7100 has only 2 commits
- 7102 has only 2 commits
Finished.

Anyone can change the area and see the area creator, so it is simple to send a message saying “Hi, the area is quite old, with low contribution rate, so I’m going to archive it in a week. Feel free to re-activate or send the message back if you need to keep it active.”
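
Going back to the report itself, the idea behind it can be sketched in a few lines of Python. This is not the abandoned_areas.py source: the input format (pairs of area id and commit date), the function name, and the threshold of two commits are assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def abandoned_areas(
    commits: Iterable[Tuple[int, datetime]],
    now: datetime,
    days: int = 90,        # window taken from the "In last 90 days" output
    max_commits: int = 2,  # assumed threshold for "abandoned"
) -> List[Tuple[int, int]]:
    """Return (aid, recent commit count) for areas with little activity."""
    since = now - timedelta(days=days)
    counts = {}
    for aid, date in commits:
        counts.setdefault(aid, 0)  # remember areas with old commits only
        if date >= since:
            counts[aid] += 1
    quiet = [(aid, n) for aid, n in counts.items() if n <= max_commits]
    # Least active areas first, then by area id.
    return sorted(quiet, key=lambda pair: (pair[1], pair[0]))
```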

TM projects can overlap

Problem:

  • Not possible to check if projects’ areas overlap.

Consequence:

  • Multiple projects used by different groups of mappers mapping the same area.

Thoughts:

This is my favorite one. I had been looking forward to solving this issue for almost three months. I solved it with the intersecting_areas.py script (see https://git.sr.ht/~qeef/damn-client.py).

The example output of the script:

$ ./intersecting_areas.py https://server.damn-project.org
testing https://server.damn-project.org:
- 2229 intersects with 2253 of https://server.damn-project.org
- 2253 intersects with 2229 of https://server.damn-project.org

(Of course, this script works for multiple instances of the damn server, too.)
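
A much simplified version of such an overlap check can be written in plain Python. The sketch below compares only bounding boxes, so it can report false positives that a true geometry intersection would not. How intersecting_areas.py actually tests the geometries is not shown here, and all names in the sketch are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bbox(points: List[Point]) -> Box:
    """Bounding box of an area given as a list of border points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_overlap(a: Box, b: Box) -> bool:
    """True when the two boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def intersecting_areas(areas: Dict[int, List[Point]]) -> List[Tuple[int, int]]:
    """List (aid, other aid) pairs in both directions, like the output above."""
    boxes = {aid: bbox(points) for aid, points in areas.items()}
    return [
        (a, b)
        for a in sorted(boxes)
        for b in sorted(boxes)
        if a != b and boxes_overlap(boxes[a], boxes[b])
    ]
```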

Feedback

Problem:

  • Not easy, nor intuitive.

Thoughts:

There is a Join discussion link at the bottom of each web client page. The link forwards to the mailing list, so feedback is as simple as sending a new email.

From the statistics page, it’s possible to write an (OpenStreetMap) message to anyone involved in the area history.

Local mappers

Problem:

  • Find local contributors within project’s area and interact with them.

Thoughts:

This is the third and last technical issue I found. I solved it in the web client’s statistics page. Along with all the commits and commit authors, it’s possible to get the list of local mappers – OpenStreetMap contributors who have mapped in the same area without using Divide and map. Now. The result of the Overpass query that uses the convex hull of all the squares is compared with the list of commit authors.

[out:json];
(
    node(poly:"convex hull coordinates here");
    way(poly:"convex hull coordinates here");
    relation(poly:"convex hull coordinates here");
);
out meta;
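
The comparison can be sketched in Python as follows. The helper names are hypothetical, and the web client’s real code is not shown here. The sketch assumes points are (lon, lat) pairs, builds the convex hull with the monotone chain algorithm, formats it for the poly filter (which expects “lat lon” pairs separated by spaces), and relies on the user field that out meta adds to each element.

```python
from typing import Iterable, List, Set, Tuple

Point = Tuple[float, float]  # assumed order: (lon, lat)


def convex_hull(points: Iterable[Point]) -> List[Point]:
    """Monotone chain convex hull, counter-clockwise, no repeated end point."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def overpass_poly(hull: List[Point]) -> str:
    """Format hull points as the 'lat lon lat lon ...' string for poly."""
    return " ".join(f"{lat} {lon}" for lon, lat in hull)


def users_from_overpass(response: dict) -> Set[str]:
    """Collect user names from a parsed Overpass JSON response."""
    return {e["user"] for e in response.get("elements", []) if "user" in e}


def local_mappers(osm_users: Iterable[str], commit_authors: Iterable[str]) -> List[str]:
    """OpenStreetMap contributors in the area who made no damn commit."""
    return sorted(set(osm_users) - set(commit_authors))
```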

Conclusion

Going by my toots, I spent 1.5 hours reading, summarizing, and understanding the original diary. That is not quite fair to say, as I have read the diary many times since it was published.

I’m ok with spending 2 hours on each of the three technical issues. I spent more than 2 hours writing this diary.

The only thing I’m sorry about is that I’m three months late.

The state of the mapper's clients for the damn-project.org

Posted by qeef on 12 March 2021 in English. Last updated on 15 March 2021.

I’ve been refactoring the Divide and map. Now. for the last two months. (No surprise, I planned it.) The most beautiful work was done on the server and the database. I will cover that in a more technical diary later.

In this diary, I want to summarize what the clients look like now.

Client + panel

Lightweight client

The (lightweight) client sticks to the common mapping workflow: Map -> Review -> Done. It’s meant to load fast with minimum data transfer. It tries to avoid rushed clicks by newbie mappers by putting the phrases “needs more mapping”, “is ready for review”, or “split the square” into a sentence. Newbie mappers may click “let reviewers know”, as I’ve already written.

Panel

The panel is included in the (lightweight) client now. I’ll stick with that if it’s ok. I’ll re-implement it as a stand-alone client if anyone cares. The panel breaks the mapping workflow just a little bit. It allows the mapper to mark the square “needs mapping”, “needs review”, “is done”, or “split” whenever the mapper has finished mapping or reviewing. The panel should just speed up the mapping process for experienced mappers.

EDIT: I re-added the panel for iD as standalone client anyway.

JOSM damn plugin

Damn plugin

The damn JOSM plugin aims to be for JOSM what the panel is for the iD editor. I just added the possibility to set the map/review actions. (I mean – request to map/review recent, random, nearest, or newbie.)

Mappy

Mappy

The mappy client is the newest client I’ve written. It’s heavy on data, as all the area commits and the background image are downloaded. Let’s be honest – the biggest area (in terms of the number of squares/commits) on the damn-project.org takes 3.79 seconds to load, downloading 1,195.14 KB. (Measured on 2021-03-11 22:29, using just Firefox’s network performance analysis.)

Ok, not all the commits. Only the last commit for each square. And not always, but just for the first time. Then, only new commits are downloaded from the server.
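
The “first everything needed, then only new commits” idea can be sketched in Python, although the mappy client itself is not written this way. The class name, the fetch callable, and the commit dictionaries with cid, sid, and type keys are assumptions for the example. The fetch callable stands in for a hypothetical server call that returns commits newer than a given commit id.

```python
from typing import Callable, Dict, List


class CommitCache:
    """Keep the last known commit of each square and ask only for news."""

    def __init__(self) -> None:
        self.last_by_square: Dict[int, dict] = {}
        self.max_cid = 0  # highest commit id seen so far

    def update(self, commits: List[dict]) -> None:
        for commit in commits:
            self.max_cid = max(self.max_cid, commit["cid"])
            current = self.last_by_square.get(commit["sid"])
            if current is None or commit["cid"] > current["cid"]:
                self.last_by_square[commit["sid"]] = commit

    def sync(self, fetch: Callable[[int], List[dict]]) -> None:
        # The first call passes 0, so the server may answer with the last
        # commit of every square; later calls transfer only newer commits.
        self.update(fetch(self.max_cid))
```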

The mappy client lets the mapper do everything that’s possible: lock the square manually, mark the square for mapping/review/done, split the square, or merge the locked squares.

There is a background OpenStreetMap image. All the squares are visible. Zooming works by pressing and dragging the left mouse button. (Clicking the left mouse button zooms to the default.) Right click opens the context menu… Just to list the controls of the mappy client.

Server API changed

One of the results of the refactoring is the change in the server API. I took this as the opportunity to fix (at least some) mess in the clients. Feel free to let me know if you miss something.

The first year of the Divide and map. Now.

Posted by qeef on 1 January 2021 in English.

Divide and map. Now. helps mappers by dividing some big area into smaller squares that a human can map.

It’s been a year already since I announced the damn project, so it’s time to summarize the first year of development.

For mappers

I finally managed to make the manager manager-friendly. There is news for the client, too. Along with mapping random or recent squares, a mapper can choose to map the nearest square. Moreover, reviewers can review a newbie square.

I added background OpenStreetMap for the squares. When you click on a square, you may lock that square manually. When you lock multiple squares, you can merge them.

A mapper can share the link to an area or a square. A mapper can download the client and open it on the computer (saving 96KB of bandwidth next time.)

I fixed bugs of the damn plugin reported by #OSMWorldDiscord guys (Thanks again!) and added the “lock & open in JOSM” link to the client.

I wrote a new client. Wondering what integration without integration into the iD editor looks like? Open the panel and check it out.

For communities

Divide and map. Now. has a separate repository for the deployment. You need to change seven rows in the config files (five rows with security and OpenStreetMap OAuth tokens, one row with the domain, one with the email address) and create one empty file. And you are ready to go.

I was wondering what is missing in the basic setup. I mean – the minimum you need to run is the server. You probably want a client and manager, too. That’s all. You know – the damn project helps mappers …

However, you are part of a community changing the world. The client and manager are OK, but you probably want to tell others about the community.

So in the config file of the deployment repository, you change the eighth row specifying the Docker repository with your web page. As I’m not too fond of the templates for all the running instances, there is no restriction on the Docker repository. You will probably use some static web page generator with Nginx to serve the result.

In 2020

Based on the above, I am confident in saying that Divide and map. Now. is proven to work.

Plans for 2021

A few things are waiting for 2021. I started the source code migration to sr.ht, and I still need to move the plugin and the server there. I am going to improve the server API documentation, as it’s necessary for 3rd party clients. And I want to refactor the client. (Again, as refactoring never ends.) I have some ideas based on the feedback – at least better control of the zoomable map and a bit of performance tuning will happen.

I’m sure there will be unplanned work, too.

Divide and map. Now. -- square locking policy

Posted by qeef on 12 December 2020 in English. Last updated on 11 March 2021.

I am writing (again) about the Divide and map. Now. The idea of the project is to divide up a big area to organize mapping better. In this note, I will share how I was thinking about the mapping workflow from the beginning, how I finally improved it, and how I hesitated but broke the workflow in the end. And I am sure I did right.

The mapping workflow

The workflow copies a mapathon. Many guys are mapping one area. Few of them are reviewing the work of colleagues. They are locking squares of the area to avoid rewriting the data under their hands.

Since the beginning, it has been possible to lock a random or recent square. Mapping guys lock random squares. The reviewing guys lock recent ones to provide feedback to the mappers as soon as possible.

Implementation details
----------------------

Feel free to skip code notes if you are not interested in the code.

There is some background I need to share before showing the SQL query: 

- The server is RESTful.
- Changes to squares work the same as git commits.

(It means that the first commit of any square is `to map`. Then a mapper locks a random square, adding the `locked` commit to the database. When finished, the mapper adds the `to review` commit. Then the `locked` commit is added again by a reviewer, who finally adds the `done` commit.)

The important thing is that no one can delete a commit, only add one. Also, there is no square state. A square consists only of an area identifier, a square identifier, and a border. If you need to know the square's "state," you need to look at the square's last commit.
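
The “state is the last commit” rule can be shown with a tiny Python sketch. It works on an in-memory list of commits rather than on the database, and the function name and the dictionary keys are chosen for the example only.

```python
from typing import Dict, List


def square_states(commits: List[dict]) -> Dict[int, str]:
    """Return {sid: type of the commit with the highest cid}.

    Commits are only ever appended, so the "state" of a square is
    derived from its newest commit instead of being stored anywhere.
    """
    last: Dict[int, dict] = {}
    for commit in commits:
        known = last.get(commit["sid"])
        if known is None or commit["cid"] > known["cid"]:
            last[commit["sid"]] = commit
    return {sid: commit["type"] for sid, commit in last.items()}
```

For example, a square with the commits `to map`, `locked`, `to review` is reported as `to review`, while a square with only the initial commit stays `to map`.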

Finally, the query to map a random square is:

    WITH last AS (
        SELECT DISTINCT ON (sid) cid, sid, aid, type, author
        FROM current_commits
        WHERE aid=$1
        ORDER BY sid, cid DESC
    ),
    tomap AS (
        SELECT *
        FROM last
        WHERE type='to map'
        ORDER BY RANDOM()
        LIMIT 1
    )
    INSERT INTO current_commits (sid, aid, author, type, message)
    SELECT sid, $1, $2, 'locked', 'I am mapping'
    FROM tomap
    RETURNING sid

Lock the nearest square

The first enhancement I wanted to implement was locking the square nearest to the one I had just mapped. The reason is obvious – I guess that some mappers prefer continuous work. Jumping over the map randomly when the mapper has found a nice area could be annoying.

Fortunately, with the PostGIS database, it’s not a big deal.

    WITH last AS (
        SELECT DISTINCT ON (cc.sid)
            cid, cc.sid, cc.aid, type, author, border
        FROM current_commits AS cc
        INNER JOIN current_squares AS cs
        ON (cc.aid=cs.aid AND cc.sid=cs.sid)
        WHERE cc.aid=$1
        ORDER BY cc.sid, cid DESC
    ),
    mine AS (
        SELECT border
        FROM current_commits AS cc
        INNER JOIN current_squares AS cs
        ON (cc.aid=cs.aid AND cc.sid=cs.sid)
        WHERE cc.aid=$1
        AND author=$2
        ORDER BY cid DESC
        LIMIT 1
    ),
    tomap AS (
        SELECT sid, cid, type
        FROM last, mine
        WHERE type='to map'
        ORDER BY ST_Distance(mine.border, last.border)
        LIMIT 1
    )
    INSERT INTO current_commits (sid, aid, author, type, message)
    SELECT sid, $1, $2, 'locked', 'I am mapping'
    FROM tomap
    RETURNING sid

Newbie mappers

No doubt, the option to review squares of newbie mappers is favorable.

I am for a pro-active, open approach. (That is why any mapper may translate the description of an area.) It is also why I waited a long time before implementing the option to review a newbie’s square. I can’t say I like the idea that a newbie is someone with a small number of squares mapped. Nor do I think that an advanced mapper is someone who has marked 43 986 squares for review. (These days, this could count as discrimination.) And I wasn’t able to come up with anything better.

The question is: how do you find a newbie mapper when the statistics are forbidden or unusable? The answer is a pro-active approach. In the latest version of the client, I implemented the I am new -> Make myself newbie for next two weeks option. The locking policy for the newbie mapper’s square is then (again) only one SQL query.

    WITH last AS (
        SELECT DISTINCT ON (sid) cid, sid, type, author
        FROM current_commits
        WHERE aid=$1
        ORDER BY sid, cid DESC
    ),
    toreview AS (
        SELECT sid
        FROM last
        INNER JOIN users
        ON (last.author=users.display_name)
        WHERE type='to review'
        AND (info::json->>'newbie')::timestamp > now()
        ORDER BY RANDOM()
        LIMIT 1
    )
    INSERT INTO current_commits (sid, aid, author, type, message)
    SELECT sid, $1, $2, 'locked', 'Working on review'
    FROM toreview
    RETURNING sid
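
The newbie flag itself boils down to a timestamp comparison. The Python sketch below mirrors the `(info::json->>'newbie')::timestamp > now()` condition. The function names are hypothetical, and storing the timestamp as an ISO 8601 string is an assumption about the info field.

```python
from datetime import datetime, timedelta


def make_newbie(info: dict, now: datetime, weeks: int = 2) -> dict:
    """Return a copy of the user's info marked as newbie for some weeks."""
    updated = dict(info)
    # Assumption: the expiry is stored as an ISO 8601 string.
    updated["newbie"] = (now + timedelta(weeks=weeks)).isoformat()
    return updated


def is_newbie(info: dict, now: datetime) -> bool:
    """True while the stored newbie timestamp lies in the future."""
    until = info.get("newbie")
    return until is not None and datetime.fromisoformat(until) > now
```

Because the flag expires on its own, nobody has to remember to switch it off.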

Breaking the workflow

Good enough for now. The workflow works when everything is all right. I mean – mappers map and send squares for review. Reviewers send them back if more mapping is needed. And mark the squares done otherwise.

But problems happen. And the workflow doesn’t consider making an already done square ready for mapping again, just in case you make a mistake.

The solution is the manual locking of a square. I am wondering why I thought it would cause trouble. Of course, you can lock multiple squares, but why be afraid of it?

Moreover, when you lock multiple squares, you can merge them. (It means you add done commits for all your locked squares and create a new square to map from the borders of the done ones.) It’s not a hypothetical use-case. It’s implemented. Check the client.

EDIT: After the refactoring, the manual square locking is possible in the mappy client.

Conclusion

From the beginning, I wanted to implement more than the random and recent square locking policies. The nearest one was clear. And I finally found the approach I like for the newbie’s locking policy.

I acknowledge the importance of manual square locking. However, I don’t think it should be the preferred workflow. Manual locking introduces more control over the square than the average mapper needs. That’s why I hesitated, I think. A curious note: implementing the merge locked squares functionality took me about 10 hours, from the decision to implement it to the merge commit.

There are two more things I have in mind before the end of the year. First and most important, the damn JOSM plugin is reported broken. I need to fix it. Second, I would like to make creating areas in the current manager a little bit more user friendly. The JSON format is not as comfortable as an HTML form.

It started with @gustavo22soares’ question on Mastodon (not archived). My answer was:

@gustavo22soares Hi. If the question is about #damn – there is no delete area function. (It was not needed to demonstrate the concept and there is no consensus about “what after delete?”)

Regarding the name – feel free to use tags for this purpose.

After 2 weeks, there is still no delete area function. Sorry. However, we did something after all.

The damn client may be translated (and it is translated from English to Czech and Portuguese). Areas may be filtered by tags. There is a button to switch between grid and list view. And finally, the damn client supports multiple OpenStreetMap web editors. (No, there is no support for JOSM. Use the damn plugin instead.)

The defaults can be changed easily when deploying – and it was the intention.

And the last thing: I updated the server. (Because I was not satisfied with the function returning the list of the areas.) After the update, I stressed the server a little bit. It was definitely NOT load testing, but I just could not resist.