Tomas Straupis's Diary

On Edges

Posted by Tomas Straupis on 20 July 2021 in English (English).

I would like to share Daniel Huffman's article on edges, which is, as always, beautiful (and of course useful):

Such small details make such a large impact.

When preparing geodata for smaller scales, you need generalisation algorithms. The Douglas-Peucker and Visvalingam-Whyatt algorithms are usually used for lines (and polygons). Unfortunately, these are purely mathematical/geometrical algorithms, which do not take cartographic requirements into account.
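To illustrate what "purely geometrical" means here, below is a minimal sketch of Douglas-Peucker in plain Python. It is only an illustration, not the implementation used by any particular tool; the function names and the sample line are made up for this example. Note that the only criterion is distance from a chord: the algorithm has no notion of whether a bend is a characteristic feature worth keeping.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, tolerance):
    """Keep only vertices lying farther than `tolerance` from the chord."""
    if len(points) < 3:
        return list(points)
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [points[0], points[-1]]
    # Split at the farthest vertex and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right

line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
print(douglas_peucker(line, 0.5))  # [(0, 0), (2, 0), (3, 2), (4, 0)]
```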

The Wang–Müller algorithm tries to do exactly that: generalise lines (of natural objects) according to cartographic requirements, such as preserving or even exaggerating characteristic features.

The original Wang–Müller article describes the algorithm quite concisely, and (to my knowledge) there is no open source implementation of it. As a result it is impossible to use this algorithm on a wider scale, and a number of questions on how some particular aspects should work remain unanswered.

A student of Vilnius University, Motiejus Jakštys, has done a perfect job - he has not only implemented the main part of the Wang–Müller algorithm using open technologies, but also described the algorithm in his paper, which you can find here:

Besides describing the algorithm itself (in more detail than the original paper), this paper also includes recommended parameter values for different scales, a lot of pictures, and proposals for future work on finishing and refining the open implementation.

Wang–Müller for scale 1:50000

In the coming days the work of Motiejus will be forked into the Lithuanian Openmap repository, where work on this algorithm will continue. We hope to have a final implementation some time this year.

Topology rules

Posted by Tomas Straupis on 14 April 2021 in English (English).



Once in a while a discussion arises in OpenStreetMap: which polygon should be rendered above which? For example, we have a large forest and a lake inside it. What to do? Render water above the forest! Cool! But wait, we have a small island covered with trees in that lake, so rendering water on top of the forest hides that island. What to do? Render forest above the water!… Oh wait…

And you can change water and forest to a lot of others: residential, industrial, meadow, wetland, you name it.

The point is that this cannot be solved (well, not fully) by deciding “what has to be rendered above/below what”. But if you decide which objects may or may not intersect, which objects must always be covered by which, etc., then you have a better described ontology. Then you cut holes in the forest/water example above and get not only correct rendering, but also something often forgotten: correct analysis (for example, calculation of water/forest area).

We also have some very small patches of wood/grass inside residential/industrial areas, and we do not want to exclude those small patches from residential areas. Not only because it could make landuse geometry too complex, but because those small patches ARE still residential areas, although covered with something. What do we do with those? We define SEPARATE classes to be used above other classes. Say we define landcover=trees to be used above (and only above) residential, commercial or industrial polygons. The same could go for patches of grass or small basins/ponds. With water we define that small water objects like landuse=basin can be covered with the likes of landuse=residential, but natural=water (lakes), landuse=reservoir and waterway=riverbank cannot intersect with landuse=residential.


It is nothing new. It is a very well known thing in GIS - topology rules. There are a lot of topology rule types. I will not go through all of them, but the interested reader can find them in, say, A* documentation.


When those rules are defined (even a small part of them, as is done in Lithuania: Topology rules), it becomes very clear to everybody - mappers, creators of editors and QA tools, map makers - what goes above/below what.
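As a rough illustration only, the few rules mentioned in this entry could be written down as data and checked mechanically. This is a hypothetical sketch, not the actual Lithuanian rule set: the table names, the `check_overlap` function and the `landuse=` prefix on residential/commercial/industrial are my assumptions.

```python
# Hypothetical rule tables, built only from the examples in this entry.
# "Must be covered by": the class may lie above these classes and nothing else.
MUST_BE_COVERED_BY = {
    "landcover=trees": {
        "landuse=residential",
        "landuse=commercial",
        "landuse=industrial",
    },
}

# "Must not overlap": these pairs may never intersect.
MUST_NOT_OVERLAP = {
    frozenset({"natural=water", "landuse=residential"}),
    frozenset({"landuse=reservoir", "landuse=residential"}),
    frozenset({"waterway=riverbank", "landuse=residential"}),
}

def check_overlap(upper, lower):
    """Return None if `upper` may lie above `lower`, else a message."""
    if frozenset({upper, lower}) in MUST_NOT_OVERLAP:
        return f"{upper} must not overlap {lower}"
    allowed = MUST_BE_COVERED_BY.get(upper)
    if allowed is not None and lower not in allowed:
        return f"{upper} may only lie above {sorted(allowed)}"
    return None

print(check_overlap("landcover=trees", "landuse=residential"))  # None
print(check_overlap("natural=water", "landuse=residential"))
```

A QA tool would run such a check for every pair of intersecting polygons; finding the intersecting pairs is the job of a spatial database and is left out here.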

As an additional bonus, it probably has a cartographic benefit: those small patches of trees/grass above other larger landuse can easily (without complex generalisation calculations) be removed on mid-small scales. Check how small patches of trees are removed on smaller scales, removing clutter, while larger forest patches are still visible:

With trees on larger scales: With trees

And here without trees on smaller scale: Without trees

Lithuanian address cadastre information was opened in October last year. We started importing the data to OSM straight away. Most addresses (90-95%) could be imported automatically (taking care of existing addresses, putting an address on a building when there is only one candidate, removing any excess addresses). The remaining addresses were added semi-automatically using JOSM remote control. This increased the number of addresses from 300 thousand to 1.1 million.

Here is a video of the progress: Address import progress

But we quickly found out that importing addresses is the least difficult part. We have a number of QA rules in Lithuania, including:

  • When there is an address with a street S, a way with that name must exist in the vicinity
  • When there is an address with a city C, it must lie within the admin boundary of city C
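The two rules above could be sketched roughly like this. It is a simplified illustration with planar coordinates, not the actual Lithuanian QA code: the dictionary layout, the `max_distance` parameter and the node-only distance test (instead of distance to the way segments) are my assumptions.

```python
import math

def street_nearby(address, streets, max_distance):
    """Rule 1: a way named like the address street has a node within max_distance."""
    ax, ay = address["xy"]
    return any(
        street["name"] == address["street"]
        and any(math.hypot(ax - x, ay - y) <= max_distance for x, y in street["nodes"])
        for street in streets
    )

def point_in_polygon(point, ring):
    """Rule 2 helper: ray casting test, `ring` is a list of (x, y) vertices."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

address = {"street": "Gedimino", "city": "C", "xy": (10, 10)}
streets = [{"name": "Gedimino", "nodes": [(0, 0), (12, 9)]}]
city_boundary = [(0, 0), (100, 0), (100, 100), (0, 100)]

print(street_nearby(address, streets, max_distance=50))  # True
print(point_in_polygon(address["xy"], city_boundary))    # True
```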

Street name information is part of the dataset being opened, but it is much harder to import that data automatically, as ways have to be created, combined, split, moved etc. Another thing is that the street geometry in OpenStreetMap is far more precise, so in some cases only a human can correctly specify the geometry of the street. Therefore it was decided to go through ALL streets (highway=residential,unclassified,living_street) and add a name or noname tag. Several mappers participated, using a simple report which gave JOSM remote control access to the streets needing review closest to their starting point. Each mapper started from their own point. This way more than 25000 ways have been reviewed and the necessary data added.

Here is a video of name adding progress:

Admin boundary info was also in the opened dataset. It is also being fixed manually. Still in progress.

The British Cartographic Society has published a short free book about Cartography. It gives an overview of the purpose of cartography and its main principles. A quick but very worthwhile read for all mappers.

You can find it here:

Point symbols need generalisation as much as any other type of object, because in small scales they also start competing for space on the map. Usually (in popular internet maps) the most primitive way of point generalisation is chosen: points are classified, priorities are assigned to classes, and then a lower priority point is simply not rendered if a point with a higher priority is already rendered in the same space. If there is more than one point with the same priority in the same spot, the result is even worse - the decision is random.
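The primitive approach just described fits in a few lines, which probably explains its popularity. This is only a sketch: the dictionary fields, `min_distance`, and the convention that a lower number means a higher priority are my assumptions.

```python
import math

def drop_by_priority(points, min_distance):
    """Greedy placement: a point is skipped if an accepted point is too close."""
    accepted = []
    # Lower number = higher priority (assumption). Points with equal
    # priority keep their input order, which is effectively arbitrary.
    for p in sorted(points, key=lambda p: p["priority"]):
        if all(
            math.hypot(p["x"] - q["x"], p["y"] - q["y"]) >= min_distance
            for q in accepted
        ):
            accepted.append(p)
    return accepted

points = [
    {"name": "put_in", "priority": 2, "x": 3, "y": 0},
    {"name": "bridge", "priority": 1, "x": 0, "y": 0},
    {"name": "shop", "priority": 3, "x": 100, "y": 0},
]
print([p["name"] for p in drop_by_priority(points, min_distance=10)])
# ['bridge', 'shop'] - the put_in point silently disappears
```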

Such a solution is of course not ideal. There are a number of ways to generalise points: displacement, aggregation, symbol combination etc. Only one option, the one chosen for the Lithuanian river map, will be described in this entry.

The initial situation is that sometimes rivermap points are so close to one another that one of them is removed or they overlap:

Overlapping symbols

In the picture above, the bridge symbol is over the put_in/egress symbol. Both symbols are important - therefore we cannot remove or overlap either of them. The symbols have to be moved apart to fit both of them.

The order of symbols is also important to the reader of a river map. In this particular example it is important to know that the egress point is after the bridge (going down the river). Therefore we choose to combine the symbols into one and place them in order from left to right, as for reading - objects symbolised on the left side will be encountered before objects on the right.

We get a picture like this:

Combined symbol
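The ordering rule can be sketched as follows. This is a simplified illustration and not the code behind the river map: I assume each point already carries its distance along the river from the source, and `min_gap` (the distance below which symbols would collide) is a hypothetical parameter.

```python
def combine_symbols(points, min_gap):
    """Group points closer than min_gap along the river.

    `points` is a list of (symbol, distance_downstream) pairs. Each group is
    returned in downstream order, i.e. the order to draw left to right.
    """
    groups = []
    for symbol, dist in sorted(points, key=lambda p: p[1]):
        if groups and dist - groups[-1]["last"] < min_gap:
            groups[-1]["symbols"].append(symbol)
            groups[-1]["last"] = dist
        else:
            groups.append({"symbols": [symbol], "last": dist})
    return [g["symbols"] for g in groups]

river_points = [("egress", 1250), ("bridge", 1200), ("rapid", 5000)]
print(combine_symbols(river_points, min_gap=100))
# [['bridge', 'egress'], ['rapid']]
```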

A more attentive reader will notice that in this particular situation - when the river flows westwards (from right to left) - the direction of reading is the opposite of the geographical position of objects. But we have to understand that in other places the river could be flowing eastwards, northwards or diagonally - different mixes/rules of depiction would make it harder for the map reader to read the map correctly. Another important point - the main usage of these combined points is in an online map, where it would be difficult to change symbols depending on the direction of the map chosen by the map reader.

You can check point generalisation in Lithuanian river map

Videos of BalticGIT, SOTM Baltic 2020 are on-line:
SOTM Baltic was held in auditorium DELTA

Allan Mustard’s presentation „I’m Tired of Getting Lost!“ can be seen in auditorium ALFA (part 1) or you can watch the same presentation in last year's NACIS2019:

Waterbody labels

Posted by Tomas Straupis on 5 October 2019 in English (English).

Labels should intuitively “connect” to their respective object on a map, so that the map reader can “feel” a strong connection between the label and the waterbody (lake natural=water or reservoir landuse=reservoir). When calculating places for labels in a multi-scale (or vario-scale) map, we can distinguish three types of label placement:

  1. Multiple large labels - used in large scales where the waterbody is so large that a very large label could fit. Usually at that point the waterbody is so large that only part of it will be visible to the map reader. This means multiple labels could (should?) be placed, spaced according to an estimated average size of a map view: we would like one and only one label to be visible at a time. As label positions have to be calculated beforehand and maps are viewed on different devices with different map canvas sizes, precise calculations are impossible. (There is also an interesting question of whether such labels should be horizontal and, if not, how they should be positioned.)
  2. Curved labels - middle scale labels where label can be curved according to the geometry of a waterbody.
  3. Simple labels - small scale labels, where it is no longer possible to have curved labels (as they no longer fit into waterbody) and only simple straight line labels are possible.

Note that the type of label depends on a particular scale+waterbody combination. That is, on one scale you could have some waterbodies with large labels, some with curved and some with simple labels, as can be seen here:

Waterbody labels Topographic map of Lithuania

Here you can see a large curved label for lake Želva, smaller curved labels for lakes Gilužis and Trinktinis, and a simple label for lake Lenktinis.

The most interesting are curved labels. While calculating an approximate medial axis could look like a good way to get a curve to draw a label on, it has one major disadvantage - you cannot get information on how large your text could be (and what letter spacing you could use). One might think that adding a buffer to the line would indicate the possible size of the type, but you would still have problems with irregularly shaped waterbodies, where the label should be placed on the side where the waterbody is “large enough” for a label (think of a waterbody with the shape of an elongated triangle).

OpenTopoMap has used a very interesting algorithm which divides a waterbody into squares and then gets a curve for a label. You can read about it on Github. This solution can be extended by iterating through different sizes of squares: starting with larger ones - trying to fit large type - and then decreasing the size of the square, thus trying to fit smaller type. The result of such a calculation is seen in the picture above.

Such a solution fits most waterbodies. Even more interesting solutions are required for very irregularly shaped waterbodies, say lakes with a shape of U, E etc. Such waterbodies would probably require more than one label even in one view, as the connection between different parts might not be obvious. Note that there is no single accepted cartographic convention on whether one object should have one label or could have more than one.

Second step in building generalisation is building typification. We take centroid of a building which is deemed too small for a scale, try drawing a minimal acceptable rectangle (oriented to the nearest road) and check if there would be enough free space between it and other already accepted buildings. If so - typified building is accepted, otherwise it is thrown out.

Here is an example of buildings simplified and typified up to 20m:

Generalised buildings

Here original buildings are gray hashed, yellow ones - simplified buildings which are “large enough” for a scale, purple ones - typified buildings (note that some buildings are gone).

This way we can avoid drawing random noise/snow in small scale maps:

Building noise

And convert them to something which communicates information that there are small buildings:

Typified buildings

As always, you can look at how this works in topographic map of Lithuania. Typification starts at zoom level 15 (5m), then 10m at zoom 14, 20m at 13 and finally 40m at zoom 12.
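The accept/reject step of typification could be sketched as below. This is a deliberately simplified illustration, not the code used for the topographic map: the typified shape is an axis-aligned square instead of a rectangle oriented to the nearest road, "enough free space" is taken to mean a separation of at least `gap` along x or along y, and all names and parameters are hypothetical.

```python
def boxes_apart(a, b, gap):
    """True if boxes (minx, miny, maxx, maxy) are >= gap apart along x or y."""
    return (
        a[0] - b[2] >= gap or b[0] - a[2] >= gap
        or a[1] - b[3] >= gap or b[1] - a[3] >= gap
    )

def typify(small_centroids, accepted_boxes, size, gap):
    """Replace too-small buildings by minimal squares where there is room.

    small_centroids: (x, y) centroids of buildings too small for the scale.
    accepted_boxes: boxes of buildings that are large enough for the scale.
    Returns the typified boxes that were accepted, the rest are thrown out.
    """
    half = size / 2
    taken = list(accepted_boxes)
    typified = []
    for cx, cy in small_centroids:
        box = (cx - half, cy - half, cx + half, cy + half)
        if all(boxes_apart(box, other, gap) for other in taken):
            taken.append(box)
            typified.append(box)
    return typified

large = [(0, 0, 30, 30)]
small = [(40, 10), (70, 10), (75, 10)]
print(typify(small, large, size=20, gap=5))
# [(60.0, 0.0, 80.0, 20.0)] - only the second small building survives
```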

If buildings are to be placed on smaller scale maps, they must be prepared: simplified, then typified and finally aggregated/amalgamated.

Building simplification is not the same as line/polygon simplification (done with DP or VW algorithms). When simplifying a building, you want characteristic details to remain: for example, most buildings have square corners, and these must remain in the simplified version.

Example of building simplification:

Building simplification

Here dashed polygon - original building, yellow one - simplified to specified amount.

As you can see square angles have been preserved as well as larger details while smaller details have been removed.

The amount of simplification depends on the resolution of the screen/printer (if the size of a pixel is 10 meters, there is no point in trying to depict details smaller than 10 meters) as well as on legibility requirements - when too much detail is displayed, the map reader cannot clearly read the map. If unsimplified buildings are placed on a small scale printed map, they could be visible as some kind of sand or other pattern, not as buildings.
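The pixel-size argument is simple arithmetic: the ground size of one pixel is the scale denominator multiplied by the physical size of a pixel on the output device. A small sketch, where the function names and the example scale and DPI values are mine and not taken from the map:

```python
METRES_PER_INCH = 0.0254

def pixel_ground_size_m(scale_denominator, dpi):
    """Ground distance (in meters) covered by one device pixel."""
    return scale_denominator * METRES_PER_INCH / dpi

def worth_drawing(detail_m, scale_denominator, dpi):
    """A detail smaller than one pixel cannot be depicted."""
    return detail_m >= pixel_ground_size_m(scale_denominator, dpi)

# A 1:50000 map on a 96 dpi screen and on a 300 dpi printer:
print(round(pixel_ground_size_m(50000, 96), 2))   # 13.23
print(round(pixel_ground_size_m(50000, 300), 2))  # 4.23
print(worth_drawing(10, 50000, 96))               # False
```

Legibility requirements are usually stricter than this lower bound, so it only shows what is physically impossible to draw, not what is readable.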

Such building simplification helps not only with a very important criterion in cartography - legibility - but also with some technical details: building polygons have fewer vertices, therefore they are rendered faster. While the legacy technology of raster tiles does not suffer too much from more complex geometries, this is very important with the currently used vector tiles - buildings take up much less space (sometimes up to 50% less), so they are faster to transfer and use less CPU (and battery on mobile) to render.

You can check the difference in building shape in a live OpenStreetMap topo map (simplification kicks in at zoom levels below 14).

When two ways (two railways, or two parts of a motorroad) are close together, they start overlapping at some point when going to a smaller scale, like the railway in this picture: Railway mess

The cartographic fix for this is to make one way instead of two (or more). One of the simplest ways to do that is to create a buffer and then calculate its medial axis (standard functionality of PostGIS): Way generalisation (here blue lines - initial ways, black one - generalised)

This way the map is more legible, and for vector tiles it is also smaller - therefore faster to download, with less load on the CPU on the client side: Fixed railway Check the change live in the topographic map: at zoom 13 and above the ways are not generalised, below zoom 13 they are.

NOTE: The same generalisation operation has to be done with roads.

River barrier

Posted by Tomas Straupis on 8 July 2019 in English (English).

The direction of a way representing waterway=dam or waterway=weir is important, because you want to know which side is lower and which one is higher. The higher one is usually the side of landuse=reservoir, but sometimes reservoirs come in sequences, sometimes with very small separating parts:

Map with a dam
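Whatever convention is chosen for the direction of the way, a renderer or QA tool needs to know on which side of the directed way a given point (say, a point inside the reservoir) lies. A minimal sketch using the sign of the cross product, assuming planar coordinates with x growing east and y growing north (the function name is hypothetical):

```python
def side_of_way(start, end, point):
    """Which side of the directed segment start->end the point lies on."""
    (ax, ay), (bx, by), (px, py) = start, end, point
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if cross > 0:
        return "left"
    if cross < 0:
        return "right"
    return "on the line"

# A dam drawn from west to east, with a reservoir point north of it:
print(side_of_way((0, 0), (10, 0), (5, 3)))  # left
```

Reversing the way flips the answer, which is exactly why the direction of the way carries information.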