Comparing natural=heath with an ecological habitat classification for Wales

Posted by SK53 on 26 October 2021 in English (English). Last updated on 7 November 2021.

A few weeks ago TrekClimbing asked on talk-gb about tagging various types of vegetation. He has documented some of the more problematic ones on the wiki.

The usage of natural=heath in the UK has not been particularly consistent, and his examples confirmed this. The situation is not helped because large swathes of natural=heath were added by a single mapper. Although doubts were expressed at the time, no-one was confident enough to say that such mapping was wrong.

Classic dry heath at South Stack, Anglesey. Colourful late summer heath at South Stack.

I have been aware that in upland areas of Great Britain natural=heath includes things I would describe differently. Notably these include upland acid grassland used for rough grazing (mainly of sheep) and blanket bog.

It occurred to me that comparing OSM data with another source would quantify this impression. Fortunately, a suitable source exists for Wales: Phase 1 habitat data.

Comparison of OSM & Phase 1 heath polygons


Phase 1 is a base level habitat survey for the United Kingdom introduced in the 1970s by the JNCC. It is at a finer scale (minimum feature size) than Corine and largely meant for ground surveying rather than remote classification. The main habitat classes and most of the detailed classes are understandable for laypeople. It also has an open access manual, which is quite readable. There are even convenient fold-out sheets of typical plant species for Phase 1 survey of Heaths.

I maintain, in a somewhat desultory manner, a wiki page showing correspondences between OSM tags and a subset of Phase 1 codes. Phase 1 codes can be tagged directly using the plant_community key.

Wales was unusual in that the official nature conservation body at the time (CCW, now NRW) decided to survey the whole country. The Welsh data has been available since 2005, more recently under an Open Government licence. A number of counties in England also carried out area-wide surveys, but these are not public.

View from the summit of Moel Eilio, showing how grasslands are a feature of the mountains of North Wales. Upland grasslands seen from Moel Eilio.


I extracted all areas tagged with landuse or natural for Wales, transforming them to the British National Grid. Although most natural tags represent habitat classes, a few (peninsula, mountain_range) represent geographical concepts. These latter were excluded from further analysis.

The basic steps were to find all OSM polygons which intersected with the Phase1 heath categories (prefixed with “D”), and the converse, all Phase1 polygons which intersected with natural=heath. OSM tags were grouped into broader categories (woodland for natural=wood and landuse=forest; grassland for natural=grassland, natural=grass, landuse=meadow etc). The results are shown below:
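The grouping and area tabulation can be sketched roughly as follows. This is a minimal illustration rather than the code used for the analysis: it assumes the intersection areas have already been computed in a projected coordinate system, the mapping covers only the tags named above, and treating natural=fell as part of the heath group is an assumption. The names `BROAD_CATEGORY`, `broad_category` and `area_share_by_category` are hypothetical.

```python
from collections import defaultdict

# Broad categories for the tags named in the text. Anything else falls into
# "other". Grouping natural=fell with heath is an assumption of this sketch.
BROAD_CATEGORY = {
    ("natural", "wood"): "woodland",
    ("landuse", "forest"): "woodland",
    ("natural", "grassland"): "grassland",
    ("natural", "grass"): "grassland",
    ("landuse", "meadow"): "grassland",
    ("natural", "heath"): "heath",
    ("natural", "fell"): "heath",
}


def broad_category(key, value):
    """Return the broad category for an OSM tag, or 'other' if unmapped."""
    return BROAD_CATEGORY.get((key, value), "other")


def area_share_by_category(overlaps):
    """Sum overlap areas per broad category and return each category's share.

    `overlaps` is an iterable of (key, value, area) tuples, where `area` is
    an intersection area computed beforehand in a projected CRS.
    """
    totals = defaultdict(float)
    for key, value, area in overlaps:
        totals[broad_category(key, value)] += area
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {category: area / grand_total for category, area in totals.items()}
```

Feeding it one tuple per intersected OSM polygon gives the proportion of intersected area falling in each broad category.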

OSM tags covered by Phase 1 Heath ("D.*") polygons

Fortunately pretty much all Phase1 heath polygons correspond with natural=heath (or natural=fell, which may by now be a synonym), where any type of habitat or landuse has been mapped on OSM. As expected, the reciprocal relationship covers a range of habitat classes. The most significant, accounting for 90% of the area, were:

  • Grassland (Phase1 “B” classes)
  • Heath (Phase1 “D” classes)
  • Bracken (Phase1 “C.1.1” in the tall herb “C” group)
  • Wetlands (Phase1 “E” classes, most being E.1.6 and E.1.7: blanket bog)

OSM heath compared with Phase 1 codes

The last step was to take the most common Phase 1 classes (“B”, “C”, “D” and “E”) and find all OSM polygons overlapping these. I’ve chosen to represent these with an alluvial flow diagram (apologies for its somewhat rough and ready appearance: I haven’t mastered the available R packages (alluvial & ggalluvial) to allow finer control of elements).

Flows between detailed Phase 1 codes, OSM tags and Phase 1 generic classes. The original diagram is here, and an alternative (ggalluvial) version here.
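The table behind such a diagram is just a set of summed flows. The sketch below is in Python rather than the R used for the diagrams, and is only an illustration: it assumes each overlap is available as a (Phase 1 code, OSM category, area) record, and that the generic class is the leading letter of the code. The function names are hypothetical.

```python
from collections import Counter


def generic_class(phase1_code):
    """Return the leading letter of a Phase 1 code, e.g. 'E.1.6' -> 'E'."""
    return phase1_code.strip().split(".")[0].upper()


def aggregate_flows(records):
    """Sum area for each (detailed code, OSM category, generic class) flow.

    `records` is an iterable of (phase1_code, osm_category, area) tuples.
    Each distinct triple becomes one band in an alluvial diagram.
    """
    flows = Counter()
    for code, osm_category, area in records:
        flows[(code, osm_category, generic_class(code))] += area
    return dict(flows)
```

Each key of the returned dictionary corresponds to one band running across the three axes of the diagram, with the summed area as its width.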

A direct comparison with these Phase 1 codes shows a close correspondence of natural=heath and the relevant Phase 1 polygons.

natural=heath compared with selected Phase 1 codes


This analysis did not perform any extensive processing on either dataset. The following issues might affect the results:

  • Invalid polygons. The Phase1 data has a number of invalid polygons. I merely used ST_MakeValid, and excluded any not so repaired. Phase1 data was compiled some time ago and habitats have changed (most obviously with forestry plantations).
  • Overlapping OSM polygons. Despite the desire of some for each area of land to have a single landuse or natural polygon, there will always be overlaps (as natural=peninsula shows). For upland heathland typical overlaps are water bodies, and conifer plantations. These will have resulted in some overcounting although I believe the effect is very minor.
  • OSM tagging. Some polygons have both natural & landuse tags (e.g., natural=heath, landuse=military). A few of these are inconsistent (natural=wood, landuse=grass). I arbitrarily selected one of these values. A few typos were corrected but a polygon tagged natural=health had been “corrected” so many times I could not reliably restore the original mapper’s intent (it is on a remediated colliery spoil heap adjacent to a park: similar places exist in Notts and are part of Country Parks).
  • Mosaic Phase 1 polygons. These were not included in the analysis.
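The tag handling described in these notes might look something like the sketch below. It is illustrative only: the choice between natural and landuse was arbitrary in the analysis, so preferring natural by default is an assumption, and `habitat_tag` and `GEOGRAPHIC_NATURAL` are hypothetical names.

```python
# natural=* values treated as geographical concepts rather than habitats.
GEOGRAPHIC_NATURAL = {"peninsula", "mountain_range"}


def habitat_tag(tags, prefer="natural"):
    """Pick one (key, value) habitat tag from a polygon's tags, or None.

    `prefer` decides which key wins when both natural and landuse are
    present. Defaulting to 'natural' is an assumption of this sketch.
    """
    candidates = {}
    natural = tags.get("natural")
    if natural and natural not in GEOGRAPHIC_NATURAL:
        candidates["natural"] = natural
    landuse = tags.get("landuse")
    if landuse:
        candidates["landuse"] = landuse
    if not candidates:
        return None
    key = prefer if prefer in candidates else next(iter(candidates))
    return key, candidates[key]
```

For a polygon tagged natural=heath and landuse=military this returns the heath tag by default, and the military landuse if called with `prefer="landuse"`.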

Findings & Conclusions

The main finding is that various Phase1 habitat codes have a high correspondence with natural=heath. This is not true for some other codes, such as “B.4” grassland. This is a positive result, in that natural=heath usage seems to represent a coherent range of predominantly upland habitats. I would expect similar patterns to hold across much of Europe because the Corine code mapped to natural=heath includes Moorland.

I know this does not really answer the original queries, but by looking at how a tag is actually used on OSM I think it is possible to identify ways to resolve some of the issues.

Recommendations & Next Steps

I think I can formulate a simple set of recommendations & follow-up research:

  • Use a heath subtag. Given that natural=heath appears to represent a coherent group of habitats in Wales, and elsewhere in the British Isles, the obvious way to map the clearly distinct habitats represented by the tag would be through use of a heath subtag, for instance heath=blanket_bog or heath=bracken. There is already some usage on OSM, although dominated by the undocumented ‘heath=alvar’ (all on Gotland).

  • Check lowland heaths. Wales is lacking much in the way of lowland heaths, which are common in Southern England and seem to closely match expectations. So any implications for such heathland need to be considered.

  • Check natural=heath elsewhere in Europe. Much natural=heath will be dominated by Corine imports. Again I have the impression that this results in heath having a broader meaning on OSM than its classic ecological interpretation.

  • Compare Corine data for Wales. A corollary of the above is to make the same comparison for Wales but using Corine data.

  • Map missing areas of heath in Wales identified from Phase 1.

  • Document the sub-types of heath mentioned.

This last item is what I plan to do next.
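As a rough illustration of the subtag recommendation, Phase 1 codes could be used to suggest a heath=* value for an existing natural=heath polygon. This is a sketch only: the two values are the examples given above, the code-to-value table is an assumption drawn from the classes named in this post, and nothing here is established tagging. `SUBTAG_BY_PREFIX` and `suggest_heath_subtag` are hypothetical names.

```python
# Phase 1 code prefixes mapped to the example heath=* values from this post.
# The table is illustrative, not an agreed tagging scheme.
SUBTAG_BY_PREFIX = {
    "C.1.1": "bracken",
    "E.1.6": "blanket_bog",
    "E.1.7": "blanket_bog",
}


def suggest_heath_subtag(phase1_code):
    """Return a suggested heath=* value for a Phase 1 code, or None.

    Matching is by whole dotted components, longest prefix first, so
    'E.1.6.1' matches 'E.1.6' but 'E.1.60' does not.
    """
    parts = phase1_code.strip().split(".")
    for n in range(len(parts), 0, -1):
        prefix = ".".join(parts[:n])
        if prefix in SUBTAG_BY_PREFIX:
            return SUBTAG_BY_PREFIX[prefix]
    return None
```

Codes in the “D” group return no suggestion, which leaves them as plain natural=heath.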

Location: Betws Garmon, Gwynedd, Wales, United Kingdom

Comment from imagico on 5 November 2021 at 23:21

Very interesting read.

I am not too sure about the idea of supporting the extension of natural=heath beyond woody vegetation (i.e. using it for blanket bogs and other habitats with predominantly non-woody plants). I know this is pretty widespread practice - also outside the UK (see for example here) - but it kind of reduces the natural=heath alone to a rather insignificant meaning.

This is of course partly because the alternative in many cases (like for bracken) would be using natural=grassland + grassland=* - which in case of non-grass vegetation does not feel right to many mappers either.

So ultimately your approach might be the right one - just pick a reasonably intuitive secondary tag for the primary tag that happens to be most commonly used for the feature in question anyway (for whatever reason that might be) and establish that as the tagging to use. But in that case it would be important to have secondary tags for the full range of uses of the primary tag, in this case heath=* values for the common cases of heath in the strict sense, i.e. dwarf scrub habitats. Unfortunately neither for natural=heath nor natural=grassland secondary tags are well established so far.

Comment from SK53 on 7 November 2021 at 09:57

@imagico: thanks as ever for the comments.

I’m broadly in agreement, but am trying to find a pragmatic way out of a tagging impasse. One update is that blanket bog tagged as heath will be at variance with any Corine imported data: all the main blanket bog areas, such as the Migneint, have a wetland code in Corine (more on this later).

Comment from imagico on 7 November 2021 at 13:12

Tagging consistency of wetland=bog is also a serious issue by the way - it is widely used as a generic peat producing wetland tag rather than specifically for low nutrient acidic rain fed wetlands.

It seems to me for example what is imported in Norway as wetland=bog (which comes from a generic ‘mire’ classification in the source data) includes both fens and blanket bogs.

My guess is also that blanket bogs are likely to be frequently tagged natural=grassland because grass is often a significant (and visually dominating) component of the vegetation. And this practically can even be a more meaningful characterization in OSM because many of these will be perfectly walkable (as you’d expect from natural=grassland) while natural=wetland + wetland=bog often is not.

Comment from TrekClimbing on 7 November 2021 at 20:37

Wow! Thank you for taking the time to think all of this through and conduct such a detailed analysis. Am I reading your diagram correctly that almost everything tagged as wetland or marsh in OSM is considered improved grassland (B.4) in the Phase 1 data? I’m trying to figure out if that can be squared? I can see how B.5 marshy grassland would be tagged this way… I suppose you can get muddy parts of fields that are also grassy?
