OpenStreetMap

Calculating label points with PostGIS

Posted by pnorman on 3 November 2020 in English.

A common task with OpenStreetMap data in PostGIS is to convert polygons to points to place labels. For simple polygons, the centroid can be used, but some shapes like C-shaped polygons, the centroid can lie outside the polygon, so ST_PointOnSurface is used. This function guarantees the point returned is within the polygon.

The only issue with ST_PointOnSurface is that it throws an exception on some invalid geometries. This isn’t a problem with a database created by a recent version of osm2pgsql which only creates valid geometries, but for older versions or other data loaders it’s unacceptable. This has lead people to writing wrapper functions that check the validity or catch the exceptions, but I’ve seen no benchmarking of the various options.

To benchmark the options, I loaded the planet data from 2020-10-12 and looked at named water polygons - those that matched ("natural" = 'water' OR waterway = 'riverbank') AND name IS NOT NULL. To make the system better reflect a tile server under load, I set max_parallel_workers_per_gather to 0 and jit to off. I then ran the query EXPLAIN ANALYZE SELECT function(way) FROM planet_osm_polygon WHERE ("natural" = 'water' OR waterway = 'riverbank') AND name IS NOT NULL;.

I tested with ST_Centroid, ST_PointOnSurface, ToPoint from postgis-vt-util, a function that checked validity before calling ST_PointOnSurface, a function that caught the exception from invalid geometries, and a function that used ST_Centroid for polygons with 4 corners and ST_PointOnSurface otherwise. The definitions are at the end of this post.

Function Time
ST_Centroid 277s
ST_PointOnSurface 408s
ToPoint 575s
point1 568s
point2 409s
point3 409s

Parallelism

I set max_parallel_workers_per_gather, but my test server has a lot of CPU cores. If I increased this value I was easily able to saturate my SSDs, and all queries took the same time. Still, even if you’re IO limited it’s a good idea to minimize CPU.

Conclusions

If you have a database with potentially invalid polygons, you should use a wrapper function that catches the exception rather than checks validity first. Although ST_Centroid is faster than ST_PointOnSurface, it’s not worth trying to use it in simple cases.

Function definitions

CREATE OR REPLACE FUNCTION public.topoint(g geometry)
RETURNS geometry
LANGUAGE plpgsql
IMMUTABLE PARALLEL SAFE
AS $function$
begin
    g := ST_MakeValid(g);
    if GeometryType(g) = 'POINT' then
        return g;
    elsif ST_IsEmpty(g) then
        -- This should not be necessary with Geos >= 3.3.7, but we're getting
        -- mystery MultiPoint objects from ST_MakeValid (or somewhere) when
        -- empty objects are input.
        return null;
    elsif (GeometryType(g) = 'POLYGON' OR GeometryType(g) = 'MULTIPOLYGON') and ST_NPoints(g) <= 5 then
        -- For simple polygons the centroid is good enough for label placement
        return ST_Centroid(g);
    else
        return ST_PointOnSurface(g);
    end if;
end;
$function$


CREATE OR REPLACE FUNCTION public.point1(g geometry)
RETURNS geometry
LANGUAGE sql
IMMUTABLE PARALLEL SAFE
AS $function$
SELECT CASE WHEN ST_IsValid(g) THEN ST_PointOnSurface(g) END;
$function$


CREATE OR REPLACE FUNCTION public.point2(g geometry)
RETURNS geometry
LANGUAGE plpgsql
IMMUTABLE PARALLEL SAFE
AS $function$
BEGIN
RETURN ST_PointOnSurface(g);
EXCEPTION WHEN OTHERS THEN
RETURN NULL;
END
$function$


CREATE OR REPLACE FUNCTION public.point3(g geometry)
RETURNS geometry
LANGUAGE plpgsql
IMMUTABLE PARALLEL SAFE
AS $function$
BEGIN
RETURN CASE WHEN ST_NPoints(g) <= 5 THEN ST_Centroid(g) ELSE ST_PointOnSurface(g) END;
EXCEPTION WHEN OTHERS THEN
RETURN NULL;
END
$function$

Discussion

Comment from pnorman on 3 November 2020 at 17:17

I did a quick check on the short-circuiting with buildings which are more likely to be low node-count polygons. point2 took 715s and point3 took 565s. This comes at the cost of point3 potentially returning points outside the polygon for very weirdly shaped polygons with four corners, e.g. those in a v shape.

Log in to leave a comment