Overpass: note to self
Posted by smootheFiets on 5 January 2020 in English. Last updated on 30 March 2020.
I finally figured out how to filter for thingies within, say, the province of Groningen, and to load everything into JOSM:
(this example extracts every node whose operator tag starts with Biblionet; turns out that’s the public libraries operated by Biblionet Groningen)
[out:xml][timeout:90];
//{{geocodeArea:Groningen}}->.searchArea; // would search within the entity called Groningen that the geocoder picks (the city)
area[admin_level=4]["name"="Groningen"][boundary=administrative]->.searchArea; // specify: within province (admin_level=4)
(
node[operator ~ '^Biblionet'](area.searchArea);
);
out meta;
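When a name is ambiguous like this, it can help to first list the administrative boundaries that carry it, together with their admin_level. A minimal sketch (not part of my original workflow), assuming the boundaries are mapped as relations tagged boundary=administrative with exactly this name; it searches worldwide and prints one CSV row per match, so the province (admin_level=4) can be told apart from the rest:

```
[out:csv(::id, "name", "admin_level")][timeout:90];
// all administrative boundary relations named exactly "Groningen"
relation[boundary=administrative]["name"="Groningen"];
out;
```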
Unclassified and residential highways within a place, say Ezinge, plus recursions up and down (partly from this link):
(One lesson I learned the hard way: it’s dangerous to merge and/or split ways in that type of sparse editing. Do make sure to “download parent ways / relations” in JOSM, and still be careful! It’s very easy to break relations such as bus lines / routes)
[out:xml][timeout:90];
{{geocodeArea:Ezinge}}->.searchArea;
(
way[highway~"^(unclassified|residential)$"](area.searchArea);
>;
);
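(._;rel(bn);); // add relations that have the found nodes as members
(._;rel(bw);); // add relations that have the found ways as members
(._;rel(br);); // add relations that have the found relations as members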
out meta;
Equivalently, but specifying that Ezinge is a ‘woonplaats’ rather than any random entity:
[out:xml][timeout:90];
area[admin_level=10]["name"="Ezinge"][boundary=administrative]->.searchArea;
(
way[highway~"^(unclassified|residential)$"](area.searchArea);
>;
);
(._;rel(bn););
(._;rel(bw););
(._;rel(br););
out meta;
Useful (for me) admin levels per country (ref):
- NL:
  - 3: country (European part)
  - 4: provincie
  - 8: gemeente
  - 10: woonplaats
- DE:
  - 2: country
  - 4: Bundesland
  - 5: Regierungsbezirk
  - 6: Kreis / Landkreis / kreisfreie Stadt
  - 7: Verbandsgemeinde / Samtgemeinde or similar
  - 8: Gemeinde
  - 9: Bezirk / Gemeindeteil with self-government (mit Selbstverwaltung)
  - 10: … without self-government (ohne Selbstverwaltung)
- FR:
  - 3: France métropolitaine
  - 4: regions (7 new ones)
  - 6: départements
  - 8: communes
And a query for sparse downloads of things I’m normally interested in while surveying (won’t download “parent relations” of ways, so need to download those manually before splitting/merging ways):
[out:xml][timeout:25][bbox:{{bbox}}];
(
node[highway];
way[highway];
node[amenity=bench];
node[amenity=bicycle_parking];
way[amenity=bicycle_parking];
node[amenity=waste_basket];
node[amenity=post_box];
node[leisure=picnic_table];
node[tourism=picnic_site];
node[amenity=recycling];
);
// recurse down
( ._; >; );
out meta;
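If the parent relations should come along automatically, the rel(bw) recursion from the Ezinge example can be appended. A sketch, shortened to the highway ways only and not tried on large bounding boxes; it adds the relations that have the found ways as members, but not those relations’ other members, so they will show up as incomplete in JOSM:

```
[out:xml][timeout:25][bbox:{{bbox}}];
(
way[highway];
);
// recurse down: nodes of the ways found so far
( ._; >; );
// recurse up: relations (e.g. bus routes) that have those ways as members
( ._; rel(bw); );
out meta;
```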
Human-readable output (not for import into JOSM)
Here’s a bit of Overpass goodness I learned while preparing changeset osm.org/changeset/79665597:
[out:csv(::id,"duration", "interval","name")];
area[admin_level=6]["name"="Pyrénées-Atlantiques"][boundary=administrative]->.searchArea;
(
//relation[duration](area.searchArea);
relation[interval](area.searchArea);
);
out;
Finds all relations with tag ‘interval’ in département Pyrénées-Atlantiques (64); turns out, that’s for bus lines.
The interesting part, though, is the first line: instead of outputting everything and the kitchen sink as XML (for direct import into JOSM), I only output the specified information: relation ID (so I can load the relation into JOSM if needed), and the specified tags duration, interval, and name.
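The csv setting takes a few more options. A sketch, assuming the form csv(fields; header flag; separator) from the Overpass QL documentation: ::type is another special field next to ::id, false suppresses the header line, and "|" replaces the default tab separator:

```
[out:csv(::type, ::id, "interval", "name"; false; "|")];
area[admin_level=6]["name"="Pyrénées-Atlantiques"][boundary=administrative]->.searchArea;
relation[interval](area.searchArea);
out;
```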
Another use case for this: usage statistics using regex. I found houses with addr:street tag starting in lower-case (“de Kalverweide”), inconsistent with the street name tagged as “De Kalverweide” (upper case). Lower-case street names seemed obviously wrong to me, so I optimistically changed the addresses in changeset 80142765 and alerted the ‘kadaster’ office. They replied quickly (!) and said I was wrong. The lower-case letter was right, and it’ll stay. That surprised me more than a little. Overpass to the rescue!
[out:csv(::id, "addr:street")][timeout:125];
{{geocodeArea:Nederland}}->.searchArea;
(
node["addr:street"~"^de .+"](area.searchArea);
//node["addr:street"~"^De .+"](area.searchArea);
);
out;
As written, the query finds all nodes (typically: house addresses) with addr:street starting with “de ” in “Nederland”; moving the comment marker to the other line finds the upper-case variant instead. ‘^’ means: start of string, ‘.+’ means: one or more occurrences of any character (so hypothetical street names ‘de ’ or ‘De ’ are not matched).
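Both spellings can also be matched in one go: a ,i after the regex makes the value match case-insensitive. A sketch along the lines of the query above; note that it would also match spellings such as “DE …”, and that the result gets large for a whole country:

```
[out:csv(::id, "addr:street")][timeout:125];
{{geocodeArea:Nederland}}->.searchArea;
// ",i" = case-insensitive match on the tag value
node["addr:street"~"^de .+",i](area.searchArea);
out;
```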
There are > 250,000 addresses starting in upper case De, many more than the ~35,000 addresses starting in lower case. The ratio is similar for ‘het ‘ vs. ‘Het ‘. Lower-case street names are exotic in this country but not unheard of. I still find them bizarre, but hey, OSM is about facts, not about my feelings about them. I’ve reverted my first changeset (80199834).
(Note: if I remember correctly, Overpass output can be restricted to just the number of occurrences, not the occurrences themselves. For large numbers of occurrences, that’s the way to go, otherwise output becomes unwieldy. I don’t recall how this works. Also, I prefer seeing the street names matched, so I can visually check that the regex does what it should be doing.)
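For the record, the statement for counting is out count, which prints only the number of elements in the result set instead of the elements themselves. A minimal sketch; the JSON output format is an arbitrary choice here:

```
[out:json][timeout:125];
{{geocodeArea:Nederland}}->.searchArea;
node["addr:street"~"^de .+"](area.searchArea);
// print only the totals, not the matched nodes
out count;
```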
Discussion
Comment from pangoSE on 6 January 2020 at 16:44
Nice to see others use overpass. I warmly recommend you to enable expert mode and download directly in JOSM using the wizard. Your searches can easily be typed in using the wizard: 1) operator~”^Biblionet” in Groningen (no results, not sure why) 2) highway~”unclassified|residential” in Ezinge (results in ways around Ezinge)
I use Overpass Turbo a lot, to download only what I’m interested in. This makes it possible to download a larger area than if I were to choose normal download.
Comment from smootheFiets on 7 January 2020 at 10:53
Thanks for your feedback! I didn’t expect to have an audience here, would’ve chosen my words more carefully if I had ;)
But since you’re asking: yes, I did try the wizard within JOSM. Good stuff. That’s how I learned the basics of Overpass coding. Your use case #1 got me started on the more advanced stuff: Groningen, unfortunately, is an ambiguous term, both a city and the province surrounding it. The city, apparently, takes precedence, which is why that search comes up empty; I need the province instead. That’s why I started messing with admin_levels.
While we’re talking, may I ask a related question? If I put a rough bounding box around my area of interest, does that make it easier or harder for the database to handle my request? Or does it not make a difference?
Say, I want to search for features within the province of Groningen. If I restrict the search for admin boundary with appropriate level and name to a bounding box, does that increase or decrease the workload on the server? It doesn’t matter much to me personally, just trying to be a good citizen / trying not to waste CPU time.
Comment from pangoSE on 8 January 2020 at 09:03
I suppose it depends on which area is larger, the bounding box or the area of the relation. I’m not familiar with the internals of the server, but it is open source so you could measure yourself if you wanted 😃