Well over a year ago I extracted all the amenity=pub objects for Great Britain. Nearly 860 keys are used across all the elements. I’ve spent some time delving into these keys, trying to classify them, and hopefully learn a bit about two things: the kinds of information people want to know about pubs; and why synonyms exist for certain keys and tags. I’ve been prompted by SomeoneElse’s list of building tags.
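As a rough illustration of the counting step, here is a minimal sketch. It assumes the extract is available as a list of elements in the shape the Overpass API returns as JSON (each element carrying a `tags` dictionary). The function name `count_keys` and the two sample pubs are invented for the example; they are not part of the original extract.

```python
from collections import Counter


def count_keys(elements):
    """Count how many elements use each tag key."""
    counts = Counter()
    for element in elements:
        # Elements without tags contribute nothing.
        counts.update(element.get("tags", {}).keys())
    return counts


# Invented sample standing in for the real amenity=pub extract.
sample = [
    {"type": "node", "tags": {"amenity": "pub", "name": "Example Arms", "real_fire": "yes"}},
    {"type": "way", "tags": {"amenity": "pub", "name": "Sample Inn", "fhrs:id": "0"}},
]

key_counts = count_keys(sample)
print(len(key_counts), "distinct keys")
for key, n in key_counts.most_common():
    print(key, n)
```

Run over the full extract, the length of `key_counts` would be the "nearly 860" figure, and the per-key counts give the usage numbers discussed later.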
A pub which I recently edited on OSM, adding real_fire=yes.
Pubs were among the earliest things to be mapped in detail in the UK. They are often an important part of community fabric, especially in rural areas. This was most apparent to me at Easter years ago in the small Welsh village where my sister had just moved. We attended the church service on Easter Sunday. The church was busy with several farmers in work clothes, having come straight from lambing. Once the service ended, it was as if the entire church had just decamped to the pub. In a typical Welsh way, the postmaster came up to my sister and told her “there’s a lady here who knows your aunt”. Thus my sister was introduced to the village community.
Pubs are good places to stop during hikes or cycle tours, and thus have always received attention from mappers. Many mappers also want to visit those pubs which serve good real ale and cider. This means pubs have an unusually rich range of tags.
I quickly ran through the keys and classified them as follows:
- Basic (18): amenity, name in all its varieties (but excluding language forms, see below), toilets etc.
- Enhanced (198): grouped into subcategories to do with alcoholic drinks, facilities for children, ambience, pub games, entertainment (music, satellite TV), wifi availability, facilities in the pub (seating, beer garden, floor covering - for walkers), LGBT-related, sanitation, opening hours, details of drinks available, other facilities (separate restaurant, overnight accommodation, but also see below) …. The CAMRA WhatPub website sets expectations as to the basic kind of information one might anticipate. Even keys with only a few values in this category are obviously legitimate, e.g., pub games (11, none more than 4 occurrences): pinball, darts, skittle_alley, pool_tables, bar_billiards, pool, billiards;snooker, billiards, table_football, gambling, games_room (BUT NOT dominoes, backgammon …). Keys in this group are nearly all of some value even when used sparsely. SomeoneElse uses several of these keys to display information on his Outdoor map style, see some of his diary entries for more.
- Address (38): tags related to the location address. A surprisingly large number of addr:village etc.
- Bilingual (10) & Multilingual (16): keys storing bilingual information (Welsh, English, Scots Gaelic). Mainly names and address keys. Multilingual tags relate to languages which are unlikely to be actually used or displayed at the pub, typically transliterations for Chinese, Korean, Japanese, Russian, Farsi etc.
- Accessibility (15) : wheelchair keys, details of the entrance, availability of lift, presence of steps etc.
- Food (47) : many pubs are “food-led” and offer a good range of meal options. There is an extensive set of keys capturing this range. Of particular interest were keys for different meal times using the opening_hours syntax (7 keys).
- Accommodation (9): for adding information where the pub has rooms for overnight stays. Such pubs are likely to have at most 6 rooms and food will be served in the pub. Anything larger is likely to be mapped as a hotel, with the pub part mapped separately.
- Other features (10): sometimes pubs contain or are used for other facilities: use as a polling station, presence of a post office, defibrillator or cashpoint. The leisure key is sometimes used for facilities which are usually tagged in a different way.
- Contact (41) : various tags for contacting the pub (email, phone etc). There are two distinct sets of keys for contacts, and I’m not entirely clear why the second was introduced (IIRC there is a Community thread about the topic). At least one key, contact:whatpub, looks silly. WhatPub is a CAMRA website (it also contains copyright data, and is unfortunately not compatible with OSM), so it’s just a URL, not a way to contact a pub.
- Payment (35): a total of 35 different keys.
- Covid19 (8): a small number of keys postfixed with “:covid19”
- Quirky (8): a mix of slightly silly tags, and some serious and unusual ones, mainly for distinctive features of the pub. One of these is mine. In 2014 Nick Whitlegg and I stopped in a pub in the village of L (Sussex) located on the Greensand Ridge. The son of the publicans had done a Geology degree, and a couple of local Geology maps were provided for customers to browse.
- Building (38): keys related to the building and building parts.
- Heritage (20): pubs are often, along with churches, among the oldest buildings in British towns and villages. It is unsurprising that many have protected heritage status. Additionally, CAMRA awards heritage status for pub interiors which reflect various eras of pub development.
- Lifecycle (71): pubs change name, or occupy buildings built for other purposes (churches, banks, shops). Some of these older characteristics are well-remembered in local communities, and often further afield: the Target pub closed in 1986 (AFAIK), but the associated roundabout is still called the Target Roundabout. There is little consistency with these: was:amenity, old_amenity, disused:amenity, etc. I suspect the main usefulness is to provide information to other mappers and not data for consumption by applications. In this case a variety of keys probably reflects locally developed conventions. There are nuances: a disused pub may be on the verge of reopening or have been demolished entirely, but the keys in use don’t seem to reflect this.
- Unclassified (47): keys that weren’t straightforward to classify and mainly have low usage. The values of each key need to be investigated to assign it to a category. For instance male & female (each used once) probably refer to toilet provision. Others, such as 1happycow:id, I have no idea about, although I presume it relates to a pub chain of some kind.
- Linked Data (17): most importantly wikidata and wikipedia tags, but also references to other external identifiers, such as UPRN, Food Hygiene ID, VAT number. Of these, food hygiene (fhrs:id) is the most popular, with 40% of pubs carrying the tag. In the first instance these can be useful for mappers, but usually the aim, not always realised, is to allow linked queries to obtain richer information. The FHRS ID allows use of external open data to detect change, as well as the ability to enrich OSM data.
- Information for mappers (43): comments, notes, links to photos and so on. Keys containing “note” dominate. Some of them do provide additional textual information which could be of use for users, and could possibly be better in a description:* key.
- Data Quality (6): mainly not:addr:postcode where the wrong postcode was stated in some external source (often Food Hygiene data, but sometimes the pub website), but there are a few isolated examples of not:other_key. The not:* notation was introduced when OSGB open data was first made available and we discovered quite a number of errors in street names (not:name), but it has proved useful to avoid copying incorrect data of other types from external open data.
- Meta (93, 69 related to source): keys related to the OSM data itself, mainly source:* tags on elements themselves, but also last update/check type keys.
- Errors (68): about 7.5% of keys could clearly be assigned to an error category: keys related to bus stops, probably originating through accidental merging of two elements (~20), additional unnecessary tags related to the Food Hygiene ID, uncorrected tags from Gregrs’s FHRS matching software, and clearly incorrect synonyms. Depending on your viewpoint many other tags might be considered errors, but as I’m fairly relaxed about synonymous tags, and believe the road to tag harmonisation starts with understanding their purposes, I want to avoid being overly judgemental.
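Several of the groups above are recognisable from the spelling of the key alone (addr:*, contact:*, payment:*, not:*, source:*, the “:covid19” postfix, and lifecycle prefixes such as was: and disused:). The sketch below shows how prefix rules could give a first pass at such a classification. It is only an illustration, not the method used for the counts above, which were assigned by hand; the function name `classify_key` and the exact rule set are assumptions.

```python
def classify_key(key):
    """Return a first-pass category for a tag key, based only on its spelling."""
    # Postfix and not:* rules come first so that a key such as
    # not:addr:postcode is never mistaken for an address key.
    if key.endswith(":covid19"):
        return "Covid19"
    if key.startswith("not:"):
        return "Data Quality"
    if key == "source" or key.startswith("source:"):
        return "Meta"

    prefixes = {
        "addr:": "Address",
        "contact:": "Contact",
        "payment:": "Payment",
        "was:": "Lifecycle",
        "disused:": "Lifecycle",
    }
    for prefix, category in prefixes.items():
        if key.startswith(prefix):
            return category

    # Everything else still needs a human to look at the values.
    return "Unclassified"


for key in ("addr:village", "not:addr:postcode", "contact:whatpub", "was:amenity", "real_fire"):
    print(key, "->", classify_key(key))
```

Rules like these only cover the mechanical cases. Keys such as old_amenity or real_fire carry no telltale prefix, so most of the Enhanced and Quirky groups would still need manual review.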
Back to the original question “Do we actually need 860 keys to learn about pubs?”
The simple answer is no! It is evident that in some of these groups there is substantial synonymy (payment, contact, lifecycle), which just makes data consumers’ lives harder, although this is most evident where the keys are probably not consumed by general purpose apps (lifecycle, informational & meta).
Only 110 keys are used more than 100 times each (130 more than 50 times, 238 more than 10 times). Only 23 of the keys in the enhanced category feature in these groups, mainly those related to beer. Many more of these could be added more systematically (e.g., a lot more pubs will have dartboards).
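Threshold figures like these fall out directly from per-key usage counts. A minimal sketch, assuming the counts are held in a `collections.Counter` keyed by tag key; the numbers below are invented purely to make the example run and do not come from the real extract.

```python
from collections import Counter


def keys_used_more_than(key_counts, threshold):
    """Return the keys whose usage count is strictly above the threshold."""
    return sorted(key for key, n in key_counts.items() if n > threshold)


# Invented counts for illustration only.
key_counts = Counter({"amenity": 500, "name": 480, "fhrs:id": 200, "real_fire": 60, "darts": 4})

for threshold in (100, 50, 10):
    keys = keys_used_more_than(key_counts, threshold)
    print(f"more than {threshold} uses: {len(keys)} keys")
```

Because each threshold is a strict "more than", the groups are nested: every key counted at 100 is also counted at 50 and at 10.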
Exploring how these keys are used suggests that perhaps a quarter (200 plus) of the total keys are of regular usefulness in mapping pubs. This is still a large total, but perhaps not a surprising one: what it demonstrates is that the OSM tagging scheme is infinitely extendable and that people use it when they see even the smallest value, perhaps only for themselves.
One of the most important things in allowing free-format tagging is that we can look at the low occurrence keys and identify unsuspected use cases. Overzealous tidying and normalising of keys (and tag values) may degrade a valuable aspect of OSM (in itself, also an unidentified use case). This is just one more reason why my usual preference is to normalise OSM data prior to consumption rather than directly in the main OSM database.
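Normalising prior to consumption can be as simple as rewriting synonymous keys in a copy of the tags, leaving the OSM data untouched. The sketch below uses the lifecycle synonyms mentioned earlier. Treating them as equivalent and picking disused:amenity as the canonical key are both assumptions made for the example; as noted above, the real keys may hide nuances, and a consumer would choose its own mapping.

```python
# Assumed mapping from synonymous keys to one canonical key.
LIFECYCLE_SYNONYMS = {
    "was:amenity": "disused:amenity",
    "old_amenity": "disused:amenity",
}


def normalise_tags(tags, synonyms=LIFECYCLE_SYNONYMS):
    """Return a copy of tags with synonymous keys folded into canonical ones."""
    # Keep every key that is not a known synonym, including canonical keys.
    normalised = {key: value for key, value in tags.items() if key not in synonyms}
    for key, value in tags.items():
        if key in synonyms:
            # A value already stored under the canonical key takes precedence.
            normalised.setdefault(synonyms[key], value)
    return normalised


original = {"name": "Example Arms", "old_amenity": "pub"}
print(normalise_tags(original))
print(original)  # the input dictionary is not modified
```

The low-occurrence keys stay in the database for anyone who wants to study them, while an application sees one consistent key.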
And to answer the original question: No, we don’t need 800-odd keys, but there’s a good case for over 200 (and that’s just looking in Great Britain).
Discussion
Comment from chris_debian on 31 December 2024 at 00:08
A really good article/ write up of an unknown (to me) issue. It sounds like this could be a really good basis for rationalising the tags. Even if we could get this down to 200 tags, that would potentially clear up a lot of mostly well intended, but inconsistent tagging.
As I read the article, and saw reference to ‘Satellite TV’, it made me think that we’re all familiar with that term, but I wonder in the UK, how useful that would be, with one of the main providers (Sky) now moving to the ‘content over broadband’ model, and literally dumping the satellite dishes.
Thanks for taking a huge amount of time to clearly articulate this tagging situation.
Chris (chris_debian)
Comment from Firefishy on 31 December 2024 at 01:57
Happy cow is a vegan online business directory / review site. Often establishments display they are registered, similar to CAMRA membership.
Comment from Awoobis on 2 January 2025 at 22:55
Great writeup!
Comment from TheSwavu on 4 January 2025 at 23:58
OK, we have a winner in Berlin (where else could it have been?). Mostly due to the hot mess that is the payment:* family.
Apparently this pub is payment:cash = yes, payment:coins = yes and payment:notes = no. Shouldn’t that be payment:cash = it depends?
Comment from Jez Nicholson on 6 January 2025 at 09:38
Brighton pubs commonly subcontract the food to a separate franchise within the premises. These now also provide Deliveroo, etc. takeaways making them even more-so a separate entity.