alan_gr's Diary

Recent diary entries

In Part 3 I talked about changes in POI data in a small area over a period of two months. Most of those changes were due to mapping activity, rather than anything that happened in the real world over that period. But I’d like to use a similar approach to better understand how the real-world POIs are evolving.

As I mentioned in Part 3, until recently a significant proportion of POIs in this area had never been mapped in OSM. That means there is no point in the past at which OSM data is anywhere close to a complete set of POIs for this area. But we might be able to learn something from the POIs that were mapped some time ago.

I chose to look back 5 years, to August 2019. I guessed that was roughly the point at which the data was most accurate. Most POIs were added in 2017-2018, so while lots of POIs were missing, those existing in 2019 were probably still fairly accurate. This is how those specific POIs evolved over 5 years, ignoring anything newly created over that period:

| | POI count | as % |
|---|---|---|
| no major change | 135 | 57% |
| removed | 42 | 18% |
| changed POI type | 38 | 16% |
| changed name | 22 | 9% |
| total | 237 | 100% |

Plotting this on a map doesn’t reveal anything especially interesting, although zooming in does show a curious island of stability along the north side of Calle Ferrándiz. Somehow that row of shops has been almost immune to the changes that affected the rest of the area:

map of POIs with different colours indicating how they changed over the period

The blue dot in the middle of the street is rather poignant: it is the only public telephone in the 2019 data, not surprisingly nowhere to be found in 2024.

At a more granular level, there was a general decline in the number of clothes shops. Of 15 in the 2019 data, only 5 survived until 2024. At the other extreme are healthcare facilities including pharmacies and dentists: all 15 mapped in 2019 still exist, and 13 of them still have the same name. Otherwise, most types of POI changed at broadly similar rates.

So, 57% of POIs mapped in 2019 were still operating as the same type of POI with the same name five years later. That suggests that 10%-11% of POIs experienced a significant change (closure, change of use, rebranding) each year. In fact the rate may well be higher, as this doesn’t account for locations that changed more than once over the 5 years. I know from memory that these exist (some locations just seem to be “cursed”), but I don’t have any data to quantify this effect.
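
The 10%-11% figure is what you get if the five-year survival share is treated as a constant annual rate compounded over five years. Here is a minimal sketch of that arithmetic. The constant-rate model is an assumption made for illustration; the counts come from the table above.

```python
# Share of 2019 POIs with no major change by 2024 (from the table above).
survival_5y = 135 / 237

# Assumption: the same survival rate s applies every year,
# so that s ** 5 equals the five-year survival share.
annual_survival = survival_5y ** (1 / 5)
annual_change = 1 - annual_survival

print(f"5-year survival: {survival_5y:.1%}")
print(f"implied annual rate of change: {annual_change:.1%}")
```

Simply dividing the 43% that changed by five would give a lower figure, because it ignores that each year's changes apply to a shrinking pool of unchanged POIs.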

Of course all this ignores the many POIs that were not mapped until recently. In Part 3 I mentioned that these unmapped POIs were disproportionately located on quiet side streets. I have no idea if these would have increased or reduced the overall rate of change if they had been mapped from the start.

Setting aside all the limitations of the data, this does seem to indicate a fairly manageable rate of change. Now that the data is reasonably complete, and given the lack of space for new construction in this area, I expect relatively few completely new POIs in future. I think I should be able to keep up with an annual rate of change of 10% to 15% - say 40 to 60 data updates each year.
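
The "40 to 60 updates" estimate can be checked with a couple of lines. This sketch assumes the base is the 392 POIs in the current dataset reported in Part 3; the variable names are only illustrative.

```python
current_pois = 392                 # POIs in the late October 2024 dataset (Part 3)
low_rate, high_rate = 0.10, 0.15   # assumed annual rates of change

low_updates = round(current_pois * low_rate)
high_updates = round(current_pois * high_rate)

print(f"expected updates per year: {low_updates} to {high_updates}")
```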

And if I manage to do that, this kind of analysis will become more interesting in future years, as it will reflect real changes rather than irregular bursts of mapping activity.

Location: Cristo de la Epidemia, Centro, Málaga, Málaga-Costa del Sol, Malaga, Andalusia, Spain

As I mentioned in Parts 1 and 2, over the last couple of months I set out to systematically update Points of Interest in four adjoining barrios in the city of Málaga.

A few businesses really did open, close, or change hands over that short period. But the vast majority of the changes in the data reflect the OSM data “catching up” with reality: adding points that had never been mapped in OSM, or updating POIs that had changed in some way since they were last touched by an OSM mapper.

Adding leisure POIs

Closely examining changes over a period is a good way of revealing faulty assumptions. I noticed that three shop locations had disappeared from my data, even though I was sure there were still businesses at those addresses. They are all gyms now, and gyms are tagged as “leisure” - a tag I had completely ignored. I’ve now added leisure POIs. As with “amenity” tags, I excluded some high-volume tags such as “garden” and “swimming pool”. I now have 392 POIs in my current dataset, not 376 as I mentioned in previous diary entries.

Changes over the period

| | POI count | count of distinct feature tags |
|---|---|---|
| mid Aug 2024 | 267 | 96 |
| + newly created | 141 | 34 |
| - removed | (16) | (9) |
| late Oct 2024 | 392 | 121 |
| % change | +47% | +26% |

I mentioned in a previous post that I thought that POIs were reasonably well mapped in this neighbourhood. The numbers suggest that about two thirds of POIs were mapped two months ago. That seems respectable, but not great. There has not been any major expansion of retail space in this area recently. Some of the new additions are locations that were vacant or derelict when the area was first mapped in detail, but the majority have existed for a long time and were simply never mapped in OSM.
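
As a quick sanity check, the figures in the table reconcile as follows. This is a sketch using only the counts given above, and it treats the late October total as the full set of POIs, which is an assumption.

```python
start_count = 267   # POIs mapped in mid August 2024
created = 141       # newly created during the survey
removed = 16        # removed during the survey

end_count = start_count + created - removed
pct_change = (end_count - start_count) / start_count

# Assumption: the late October dataset is close to the true set of POIs.
share_mapped_at_start = start_count / end_count

print(end_count)
print(f"{pct_change:+.0%}")
print(f"{share_mapped_at_start:.0%}")
```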

Most of the removed POIs are now mapped with a “disused:” lifecycle prefix (and thus excluded from my data). These are premises that could still be used for retail, but are currently vacant. In a few cases the original POI has been deleted completely, where the building has been demolished or the retail space has been absorbed by a neighbouring shop.

Many of the POIs added and removed had unique tags, so the number of tags changes roughly in line with the number of POIs. But it increases at a slower rate, as we would expect - the more POIs we map, the more likely a new POI will be of a type already in the data.

Evolution of the POIs in the original dataset

Ignoring the newly added points, this is what happened to the 267 POIs already mapped in mid August:

| | number of points | as % |
|---|---|---|
| no major change | 209 | 78% |
| removed | 16 | 6% |
| changed POI type | 28 | 10% |
| changed name | 14 | 5% |
| total | 267 | 100% |

The changes of POI type are mainly genuine changes of use, often to something very different: a greengrocer to a coworking centre, a clothes shop to a beauty salon for pets, an electronics shop to a laundry. The name changes usually reflect a change in ownership or a significant rebranding. Only a small share of both types of change is due to improved tagging of a POI that did not change in reality.
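
One way to produce a breakdown like this is to compare two snapshots of the data, POI by POI. The sketch below is only an illustration, not the method actually used for the table. It assumes each snapshot is a dict keyed by a stable identifier and holding a hypothetical `(feature_tag, name)` pair, and that a POI whose type and name both changed counts as a change of type. The sample data is invented.

```python
from collections import Counter

def classify_change(old, new):
    """Classify one POI from the earlier snapshot.

    old and new are (feature_tag, name) tuples; new is None when the
    POI is absent from the later snapshot.
    """
    if new is None:
        return "removed"
    if old[0] != new[0]:
        return "changed POI type"   # assumption: type change takes precedence
    if old[1] != new[1]:
        return "changed name"
    return "no major change"

def summarise(before, after):
    """Count change categories for every POI present in the earlier snapshot."""
    return Counter(
        classify_change(poi, after.get(poi_id)) for poi_id, poi in before.items()
    )

# Invented sample snapshots, keyed by a hypothetical identifier.
before = {
    1: ("shop=greengrocer", "Example Fruit"),
    2: ("shop=clothes", "Example Clothes"),
    3: ("amenity=cafe", "Example Cafe"),
    4: ("shop=bakery", "Example Bakery"),
}
after = {
    1: ("office=coworking", "Example Cowork"),
    3: ("amenity=cafe", "Another Cafe"),
    4: ("shop=bakery", "Example Bakery"),
    5: ("shop=hairdresser", "New Salon"),   # newly added, so ignored
}

summary = summarise(before, after)
print(summary)
```

Matching on identifiers is a simplification: a shop that was deleted and redrawn as a new OSM element would show up as one removal plus one new POI.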

Almost 80% of mapped POIs were already up to date when I started checking (ignoring minor changes such as contact details or spelling errors). Again, this seems fairly respectable in isolation. But combined with the POIs that were missing completely, only 53% of POIs were mapped correctly. That’s a lot less than I expected when I started all this.
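
The two percentages in this paragraph use the same numerator but different denominators. A short sketch of the calculation, assuming the 392 POIs in the late October dataset stand in for the true total:

```python
up_to_date = 209        # "no major change" in the table above
mapped_at_start = 267   # POIs already mapped in mid August
mapped_now = 392        # POIs after the survey, assumed to be the true total

share_of_mapped = up_to_date / mapped_at_start   # "almost 80%"
share_of_all = up_to_date / mapped_now           # "only 53%"

print(f"{share_of_mapped:.0%} of mapped POIs were up to date")
print(f"{share_of_all:.0%} of all POIs were mapped correctly")
```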

Changes by POI group

Were some types of POIs better mapped than others?

| feature group | count new | count old | change | % change |
|---|---|---|---|---|
| amenity - food+drinks | 56 | 49 | 7 | +14% |
| amenity - general | 48 | 33 | 15 | +45% |
| craft - general | 14 | 3 | 11 | +367% |
| healthcare - general | 26 | 17 | 9 | +53% |
| leisure - general | 16 | 13 | 3 | +23% |
| office - general | 42 | 17 | 25 | +147% |
| shop - clothes | 15 | 17 | -2 | -12% |
| shop - food | 52 | 45 | 7 | +16% |
| shop - for the body | 51 | 24 | 27 | +113% |
| shop - general | 61 | 42 | 19 | +45% |
| tourism - general | 11 | 7 | 4 | +57% |
| total | 392 | 267 | 125 | +47% |
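
The change columns can be recomputed from the two counts. The sketch below copies the counts from the table; the dictionary layout is just one convenient way to hold them. Percentages are printed to one decimal place, so they may differ slightly from the whole-number figures in the table.

```python
# (count new, count old) per feature group, copied from the table above.
groups = {
    "amenity - food+drinks": (56, 49),
    "amenity - general": (48, 33),
    "craft - general": (14, 3),
    "healthcare - general": (26, 17),
    "leisure - general": (16, 13),
    "office - general": (42, 17),
    "shop - clothes": (15, 17),
    "shop - food": (52, 45),
    "shop - for the body": (51, 24),
    "shop - general": (61, 42),
    "tourism - general": (11, 7),
}

for name, (new, old) in groups.items():
    change = new - old
    print(f"{name:24} {change:+4d} {change / old:+8.1%}")

total_new = sum(new for new, _ in groups.values())
total_old = sum(old for _, old in groups.values())
total_change = total_new - total_old
print(f"{'total':24} {total_change:+4d} {total_change / total_old:+8.1%}")
```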

Most eateries (restaurants, cafes, bars) were already on the map, as well as most food shops (supermarkets, greengrocers, bakeries and so on).

Of the main groups, “for the body” stands out as undermapped. I have already talked about the high proportion of shops in the area that are hairdressers or beauty salons. The majority had not been mapped previously: the number increased from 15 to 38. In fact “shop=hairdresser” is now the most common single tag, well ahead of “amenity=restaurant” which was the leader until two months ago.

It’s also clear that offices and crafts were undermapped. The increase in these categories was spread across a wide range of POI types. I did notice an increase in businesses related to the sale and management of property (including vacation rentals), but that only accounts for at most 10 extra POIs. I could have mapped even more offices, but as I mentioned in a previous post, I generally didn’t add businesses identified only by a plaque in a doorway.

So, it seems that over the years, local mappers (including me) have tended to focus more on some categories of POI than others. That might be due to the nature of the POIs themselves: offices are often not very interesting, hairdressers tend to open and close rather frequently, mappers tend to add cafes where they eat themselves, and so on. But it might also have something to do with…

Location, location, location

map of POIs with different colours indicating how they changed over the period

Looking at the map, it’s clear that a lot of the new POIs are in the northwest corner of the area of interest. Apparently neither I nor other mappers ever paid enough attention to this group of streets. Until I started my systematic survey, I had thought of these streets as almost entirely residential. In fact they contain quite a few small retail businesses. To some extent the same applies to other side streets. It probably doesn’t help that I do most of my surveying at night, when many of these shops are completely shuttered and attract very little attention. Shops on the main streets tend to be a little more obvious even when they are closed.

This pattern inevitably influences the results I described above. It seems quite possible that hairdressers, crafts, and offices were undermapped not so much because of anything intrinsic to these businesses, but because they are more likely to be located away from the busiest streets.

In my next diary entry, I’ll look at how the POIs in this area evolved over a much longer period.

Location: Cristo de la Epidemia, Centro, Málaga, Málaga-Costa del Sol, Malaga, Andalusia, Spain

Mapping Local POIs Part 2: All the Brands, not All the Places

Posted by alan_gr on 28 October 2024 in English. Last updated on 29 October 2024.

Brands

As part of my systematic update of local POIs (see Part 1), I added the “brand” and “brand:wikidata” tags wherever I could identify them. Before I started I thought I might find quite a few POIs that I had not previously recognised as brands.

Spoiler: I was wrong, yet again. It’s good to have your preconceptions challenged by actual data. I guess.

It turned out that the reason I hadn’t heard of most of the brands displayed on POIs in this area, and the reason not many POIs had the brand tag before I started mapping, was that … they aren’t brands. It’s possible there are a few shops belonging to brands so obscure or localised that I couldn’t find any reference to them online, but I don’t think I can have missed many.

Out of 376 POIs, only 36 - less than 10 percent - are now tagged as brands.

map of POIs with multiple colours distinguishing different types of POI

Of the 36, about a dozen are banks and insurers, and the remainder are mainly shops. Quite a few of the shop brands are relatively local, with branches only in the city or province of Málaga. Perhaps that’s one reason that nine of them have no Wikidata entry. Of all the food and drink amenities, only one cafe is branded. And no, it’s not Starbucks, it’s Tejeringo’s - another local brand.
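
Counts like these can be derived directly from the tags. The sketch below assumes each POI is available as a plain dict of its OSM tags. The sample POIs and brand names are invented, and the Wikidata value is a placeholder rather than a real identifier.

```python
def brand_summary(pois):
    """Return (total POIs, branded POIs, branded POIs without brand:wikidata)."""
    branded = [tags for tags in pois if "brand" in tags]
    missing_wikidata = [tags for tags in branded if "brand:wikidata" not in tags]
    return len(pois), len(branded), len(missing_wikidata)

# Invented sample data; "Q_PLACEHOLDER" is not a real Wikidata ID.
sample = [
    {"shop": "supermarket", "brand": "Example Brand A", "brand:wikidata": "Q_PLACEHOLDER"},
    {"amenity": "cafe", "brand": "Example Brand B"},
    {"shop": "greengrocer", "name": "Example Fruit Shop"},
]

total, branded, missing = brand_summary(sample)
print(f"{branded} of {total} POIs branded, {missing} without brand:wikidata")

# The share reported above: 36 branded POIs out of 376.
reported_share = 36 / 376
print(f"{reported_share:.1%}")
```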

I find this interesting for reasons that go well beyond OpenStreetMap. I have often read that European cities are becoming increasingly homogeneous, that international brands have taken over, and that a shopping street in one city is hard to tell apart from a similar street in a different city. Yet the mix of shops in these neighbourhoods isn’t repeated in other parts of Europe - it isn’t even repeated in other cities in Spain. Sure, this isn’t the most fashionable or most touristy part of Málaga. But it’s not the most out-of-the-way either: you can easily walk from the Cathedral to the southern end of this area in 10 minutes.

What does all this mean for mapping in OSM? It certainly helped motivate me to map POIs in this area, and I hope it will continue to inspire me to keep the data up to date. Making a small fruit shop or dressmaker visible on the map feels more worthwhile, and certainly more interesting, than confirming that a large Spanish shopping mall has a branch of Zara. (It might be quicker to add “zara=no” to the rare shopping centre that doesn’t).

But that positive view implies its own negative. What of the many similar areas that don’t have local mappers to carry out a painstaking survey, one shop at a time? If 80-90 percent of POIs are not part of a brand, it reduces the chances of getting data in a uniform format that might speed up the mapping process.

All the Places

As I was carrying out my local mapping campaign, I read quite a lot of discussion on OpenStreetMap forums about the possibility of using data from All The Places (ATP) to help map POIs in OSM. ATP is a project to gather data from business websites, typically those operated by large brands, and make it available in a consistent format.

Could ATP have helped to speed up my survey? Not in its current form. It only contained three valid POIs in this area (three branches of DIA supermarket).

Could ATP help if more brands were added? The potential for this particular area seems limited. Adding a “spider” to scrape business information is not a trivial task, and would only be worthwhile for brands with a substantial number of branches. Of the POIs I surveyed, the major Spanish banks and insurers seem the most likely candidates. As I mentioned, most of the others are relatively localised brands, so the effort of developing a spider would probably be out of proportion to any mapping advantage.

All in all, I can’t see ATP ever covering more than 5 percent of POIs in this area. Of course it could be very different in a shopping mall full of international and national brands. But for this kind of neighbourhood of small local businesses, it looks like ATP will only ever be of marginal help.

In my next entry I’ll look at the process of getting from 258 to 376 POIs over two months.

Location: Cristo de la Epidemia, Centro, Málaga, Málaga-Costa del Sol, Malaga, Andalusia, Spain

Mapping Local POIs Part 1: How many hairdressers does one barrio need?

Posted by alan_gr on 28 October 2024 in English. Last updated on 29 October 2024.

Recently I have been trying to systematically improve OSM data about shops, businesses, and other Points of Interest in a small area of my city. Like many OSM contributors before me, I have been torn between the desire to make my local map as complete as possible, and the realisation of just how much time is needed to do that. My observations in the following series of diary posts may not be very original, but perhaps they’ll help me to clarify my thoughts on the subject.

Background

Málaga is divided into about 200 barrios or neighbourhoods, all with well-defined boundaries in OSM. I chose to focus on an area of four adjoining barrios slightly to the northeast of the historic centre. The main axis of this area, formed by Calle Victoria and Calle Cristo de la Epidemia, is about 1 km long (the green dashes in the map below). These main thoroughfares, and many of the smaller streets, are lined by apartment blocks with commercial premises on the ground floor. The hillier areas to the east are more purely residential. Most shops are quite small, many are sole traders, and there are no shopping malls.

map of the area discussed in the diary entry

To be clear, I am not claiming this area is representative of anything but itself. Even in the same city, there are neighbourhoods with denser concentrations of POIs due to shopping malls, and areas that are almost entirely residential with very few POIs.

Various OSM contributors have added POI data here over the years, with a particularly big effort in 2017. I have updated individual POIs from time to time when I noticed changes, but I was aware that I hadn’t been very diligent about this, so there was probably quite a lot of outdated data. Still, I reckoned POIs in the area were reasonably well mapped, and that it wouldn’t take too long to add some missing shops and update others.

Spoiler: I was wrong.

Complete data … almost

Over the period of the survey (about 2 months), the number of POIs increased from 258 to 376.

I will talk about the reasons behind that change in a future diary entry. But first, I want to look at the data as it is now. Often discussions of OSM POI data have to start from the assumption that the data is incomplete. Looking at “my” area just after a survey gives us a chance to consider an “almost complete” OSM dataset.

Why only “almost”? There are a few reasons:

* The definition of a POI is inevitably arbitrary. My definition is essentially “everything tagged in OSM as shop, amenity, office, healthcare, craft, and tourism, EXCEPT things I decided to exclude”. Those exceptions are mainly high-volume tags such as individual parking spaces, waste/recycling facilities, and street furniture such as benches.
* Some offices are identifiable only by a small plaque in a doorway, and I didn’t go out of my way to find all these (although I did try to confirm those already mapped).
* Local business owners thoughtlessly went on doing things like retiring and selling their businesses while I was in the middle of my surveying effort, so I was never quite sure of being up to date.
* I probably just got some things plain wrong.
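
The POI definition in the first point can be expressed as a small filter over an element's tags. This is a sketch only: the set of keys comes from the definition above, but the exclusion list is an assumption built from the examples mentioned (the full list of excluded tags is not given), and `feature_tag` is a hypothetical helper name.

```python
# Top-level keys that count as POIs under the definition above.
POI_KEYS = {"shop", "amenity", "office", "healthcare", "craft", "tourism"}

# Assumption: illustrative exclusions only, based on the examples in the text.
EXCLUDED = {
    ("amenity", "parking_space"),
    ("amenity", "waste_basket"),
    ("amenity", "recycling"),
    ("amenity", "bench"),
}

def feature_tag(tags):
    """Return a 'key=value' POI tag for an element's tags, or None if it is not a POI."""
    for key in sorted(POI_KEYS):   # sorted so the result is deterministic
        value = tags.get(key)
        if value is not None and (key, value) not in EXCLUDED:
            return f"{key}={value}"
    return None

print(feature_tag({"shop": "hairdresser", "name": "Example Salon"}))
print(feature_tag({"amenity": "bench"}))
print(feature_tag({"disused:shop": "clothes"}))
```

Because the lookup is on exact keys, an element tagged with a lifecycle prefix such as `disused:shop` is not matched and drops out of the dataset.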

Current POIs: Some numbers

POIs by type

The 376 POIs (according to my definition above) fall into these broad groups:

| feature_group | frequency | frequency_pc |
|---|---|---|
| amenity - food+drinks | 56 | 14.9 |
| amenity - general | 48 | 12.8 |
| craft - general | 14 | 3.7 |
| healthcare - general | 26 | 6.9 |
| office - general | 42 | 11.2 |
| shop - clothes | 15 | 4.0 |
| shop - food | 52 | 13.8 |
| shop - for the body | 51 | 13.6 |
| shop - general | 61 | 16.2 |
| tourism - general | 11 | 2.9 |

I have defined a few large groupings following Vespucci/JOSM presets, with everything else falling into “general”. Both “amenity - general” and “shop - general” cover a very diverse range of other POIs.

A plot of these points shows that POI types are fairly well mixed throughout the area. The mainly residential northeast corner is an exception - the POIs here are mainly tourist accommodation.

map of POIs with multiple colours distinguishing different types of POI

Tag distribution

The 376 POIs are mapped by 116 distinct feature tags (amenity=bank, shop=supermarket and so on). The most common single tag is “hairdresser” with 29 uses, some way ahead of “cafe”, “restaurant”, and “fast_food”. At the other end of the scale, 52 tags are used only once. Only 21 of the 116 tags are used more than 4 times.
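
Statistics like these fall out of a simple frequency count. The sketch below assumes the feature tag of each POI is already available as a `key=value` string. The sample list is invented and much smaller than the real dataset.

```python
from collections import Counter

def tag_stats(feature_tags):
    """Summarise how feature tags are distributed across a list of POIs."""
    counts = Counter(feature_tags)
    return {
        "pois": sum(counts.values()),
        "distinct_tags": len(counts),
        "most_common": counts.most_common(1)[0],
        "used_once": sum(1 for n in counts.values() if n == 1),
        "used_more_than_4": sum(1 for n in counts.values() if n > 4),
    }

# Invented sample: one feature tag per POI.
sample = (
    ["shop=hairdresser"] * 5
    + ["amenity=cafe"] * 3
    + ["amenity=bank", "shop=supermarket"]
)

stats = tag_stats(sample)
print(stats)
```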

Within the “office” grouping, “estate_agent” and “property_management” accounted for 17 POIs between them - possibly a reflection of the growing importance of vacation rentals.

Main streets v side streets

One thing that surprised me was the number of premises on side streets. I previously had the impression that most POIs were along the main axis I mentioned above, plus Calle Ferrándiz which branches off to the east.

Spoiler: I was wrong again.

In fact 55 percent of POIs were on other streets.

Even extending the definition of “main street” a bit further (roughly to anything classed as tertiary or above in OSM), a healthy 45 percent of POIs were on side streets.
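
The wider definition of "main street" can be sketched as a test on the `highway` class of the street each POI faces. How POIs were assigned to streets is not described here, so the sketch assumes that step has already been done and that each POI comes with a `highway` value. The sample values are invented.

```python
# highway=* classes treated as "main street": tertiary or above.
MAIN_HIGHWAY_CLASSES = {"motorway", "trunk", "primary", "secondary", "tertiary"}

def side_street_share(poi_highway_classes):
    """Share of POIs whose street is below tertiary in the highway hierarchy."""
    side = sum(1 for h in poi_highway_classes if h not in MAIN_HIGHWAY_CLASSES)
    return side / len(poi_highway_classes)

# Invented sample: the highway class of the street each POI is on.
sample = ["primary", "tertiary", "residential", "pedestrian"]
share = side_street_share(sample)
print(f"{share:.0%} of sample POIs are on side streets")
```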

How many hairdressers?

While surveying the zone I gradually started to become haunted by a feeling that I was spending most of my time mapping hairdressers, barbers, and beauty salons. (Many such places label themselves “Peluquería - Estética”, so any distinction between hairdressers and beauty salons is rather arbitrary). It felt quite appropriate that I ended up mapping the Association of Ladies’ Hairdressers (as an association, not a hairdresser).

It turns out that 38 POIs, about 10% of the total, are hair/beauty shops. Comparing with shops rather than all POIs, that’s more than 21% of shops - compared to 10% in OSM worldwide. At face value that supports my impression that this particular area is strangely oversupplied with hairdressers.
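
Both shares can be recomputed from figures already given. This sketch assumes "shops" means the four shop groups in the POIs by type table.

```python
hair_beauty = 38
all_pois = 376

# Shop groups from the "POIs by type" table: clothes, food, for the body, general.
shops = 15 + 52 + 51 + 61

share_of_pois = hair_beauty / all_pois
share_of_shops = hair_beauty / shops

print(f"{share_of_pois:.1%} of all POIs")
print(f"{share_of_shops:.1%} of shops")
```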

But there is another possibility: maybe it’s quite normal for one in five shops to be hair and beauty salons, and it’s the 10% global OSM figure that is misleading. Perhaps these businesses are typically small and run by sole traders, and get less attention from mappers than larger shops or brands. That was certainly the case in my area until my recent systematic mapping.

In my next diary entry I’ll look at brands … and their absence.

Location: Cristo de la Epidemia, Centro, Málaga, Málaga-Costa del Sol, Malaga, Andalusia, Spain