OpenStreetMap

OpenStreetMap Service Availability (2023-12-20 - 2024-01-20)

Posted by NorthCrab on 25 January 2024 in English. Last updated on 27 January 2024.

I have started an independent collection of OSM SLA statistics. Approximately once a month, I will publish my results with the aim of improving transparency about the reliability of OSM services.

I run the monitoring with uptime-kuma. I also verify connectivity with non-OSM services (to prevent false positives). The current configuration checks the availability of openstreetmap-website and openstreetmap-cgimap (API); tile layer availability is not currently included. The health-check resolution is set to 30 seconds, and the checks are executed from a single server in the Hetzner datacenter in Germany. For the endpoint to be marked unavailable, two consecutive checks must fail. This should be reasonably representative of the average user experience.
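To make the marking rule concrete, here is a minimal sketch of how a stream of check results could be turned into outage intervals. It is not uptime-kuma's code. The function name, the sample format, and two choices are assumptions made for illustration: an outage is counted from the check that confirms it (the second consecutive failure), and a check only counts as failed if the non-OSM control check succeeded at the same time.

```python
from typing import Iterable, List, Optional, Tuple

CHECK_INTERVAL_S = 30   # health-check resolution described in the post
FAILURES_REQUIRED = 2   # consecutive failures needed to mark an endpoint down

# Hypothetical sample format: (timestamp in seconds, target_ok, control_ok)
Sample = Tuple[int, bool, bool]


def find_outages(samples: Iterable[Sample]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs for confirmed outages, in time order."""
    outages: List[Tuple[int, int]] = []
    streak = 0                        # current run of consecutive failures
    down_since: Optional[int] = None  # set once the outage is confirmed
    last_ts: Optional[int] = None

    for ts, target_ok, control_ok in samples:
        last_ts = ts
        # Assumption: a failure only counts when the control check passed,
        # so problems on the monitoring side are not blamed on OSM.
        failed = (not target_ok) and control_ok
        if failed:
            streak += 1
            if streak == FAILURES_REQUIRED and down_since is None:
                down_since = ts
        else:
            if down_since is not None:
                outages.append((down_since, ts))
                down_since = None
            streak = 0

    # An outage still open when the data ends is closed at the last sample.
    if down_since is not None and last_ts is not None:
        outages.append((down_since, last_ts))
    return outages


if __name__ == "__main__":
    checks = [
        (0, True, True),
        (30, False, True),    # single failure: ignored
        (60, True, True),
        (90, False, True),
        (120, False, True),   # second consecutive failure: outage confirmed
        (150, True, True),    # recovery
    ]
    print(find_outages(checks))
```

Under these assumptions the shortest outage that can be recorded lasts one check interval, which matches the many 30-second entries in the details below.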

Summary

Total API downtime: 10 minutes and 37 seconds

API 31-day SLA: 99.976%

Total website downtime: 30 minutes and 6 seconds

Website 31-day SLA: 99.932%

Note that some features of the website require the API to be available as well.
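For readers who want to check the arithmetic, the percentages can be recomputed from the downtime totals. The sketch below assumes the window is exactly 31 days (2023-12-20 to 2024-01-20); the function name is made up for illustration. The results agree with the figures above when truncated, rather than rounded, to three decimal places.

```python
from datetime import timedelta


def availability_percent(downtime: timedelta, window: timedelta) -> float:
    """Share of the window during which the service was up, in percent."""
    # Dividing one timedelta by another yields a plain float ratio.
    return 100.0 * (1 - downtime / window)


# Assumption: the reporting window is exactly 31 days.
WINDOW = timedelta(days=31)

api = availability_percent(timedelta(minutes=10, seconds=37), WINDOW)
website = availability_percent(timedelta(minutes=30, seconds=6), WINDOW)

if __name__ == "__main__":
    print(f"API:     {api:.4f}%")
    print(f"Website: {website:.4f}%")
```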

Details

2024-01-02 11:30:00 - 2024-01-02 11:34:32

  • Total downtime: 4 minutes 32 seconds
  • 🌐 Website unavailable

2024-01-09 12:51:39 - 2024-01-09 12:53:10

  • Total downtime: 1 minute 31 seconds
  • 🌐 Website unavailable

2024-01-09 12:56:58 - 2024-01-09 12:59:28

  • Total downtime: 2 minutes 30 seconds
  • 🌐 Website unavailable

2024-01-09 13:07:18 - 2024-01-09 13:07:48

  • Total downtime: 30 seconds
  • 🌐 Website unavailable

2024-01-09 13:09:57 - 2024-01-09 13:10:27

  • Total downtime: 30 seconds
  • 🌐 Website unavailable

2024-01-09 13:16:36 - 2024-01-09 13:19:21

  • Total downtime: 2 minutes 45 seconds
  • 🌐 Website unavailable

2024-01-14 16:10:19 - 2024-01-14 16:11:04

  • Total downtime: 45 seconds
  • 🌐 Website unavailable

2024-01-14 16:15:55 - 2024-01-14 16:16:25

  • Total downtime: 30 seconds
  • 🌐 Website unavailable

2024-01-14 16:17:58 - 2024-01-14 16:18:29

  • Total downtime: 31 seconds
  • 🌐 Website unavailable

2024-01-14 16:20:02 - 2024-01-14 16:21:02

  • Total downtime: 1 minute
  • 🌐 Website unavailable

2024-01-14 16:25:02 - 2024-01-14 16:25:32

  • Total downtime: 30 seconds
  • 🌐 Website unavailable

2024-01-14 16:31:45 - 2024-01-14 16:32:30

  • Total downtime: 45 seconds
  • 🌐 Website unavailable

2024-01-14 16:44:43 - 2024-01-14 16:45:13

  • Total downtime: 30 seconds
  • 🌐 Website unavailable

2024-01-14 16:49:22 - 2024-01-14 16:49:52

  • Total downtime: 30 seconds
  • 🌐 Website unavailable

2024-01-14 16:50:54 - 2024-01-14 16:51:24

  • Total downtime: 30 seconds
  • 🌐 Website unavailable

2024-01-14 16:54:14 - 2024-01-14 16:54:59

  • Total downtime: 45 seconds
  • 🌐 Website unavailable

2024-01-14 17:00:18 - 2024-01-14 17:01:18

  • Total downtime: 1 minute
  • 🌐 Website unavailable

2024-01-14 17:11:08 - 2024-01-14 17:12:09

  • Total downtime: 1 minute 1 second
  • 🌐 Website unavailable

2024-01-14 17:14:44 - 2024-01-14 17:15:29

  • Total downtime: 45 seconds
  • 🌐 Website unavailable

2024-01-14 17:19:07 - 2024-01-14 17:19:37

  • Total downtime: 30 seconds
  • 🌐 Website unavailable

2024-01-14 17:21:42 - 2024-01-14 17:22:12

  • Total downtime: 30 seconds
  • 🌐 Website unavailable

2024-01-14 17:41:20 - 2024-01-14 17:43:35

  • Total downtime: 2 minutes 15 seconds
  • 🌐 Website unavailable

2024-01-14 18:33:53 - 2024-01-14 18:34:38

  • Total downtime: 45 seconds
  • 🌐 Website unavailable

2024-01-14 18:38:46 - 2024-01-14 18:39:16

  • Total downtime: 30 seconds
  • 🌐 Website unavailable

2024-01-14 18:48:46 - 2024-01-14 18:49:17

  • Total downtime: 31 seconds
  • 🌐 Website unavailable

2024-01-14 18:52:57 - 2024-01-14 18:53:57

  • Total downtime: 1 minute
  • 🌐 Website unavailable

2024-01-14 20:32:08 - 2024-01-14 20:33:23

  • Total downtime: 1 minute 15 seconds
  • 🌐 Website unavailable

2024-01-14 20:35:29 - 2024-01-14 20:36:29

  • Total downtime: 1 minute
  • 🌐 Website unavailable

2024-01-14 20:55:32 - 2024-01-14 20:56:02

  • Total downtime: 30 seconds
  • 🌐 Website unavailable

2024-01-15 02:21:49 - 2024-01-15 02:23:19

  • Total downtime: 1 minute 30 seconds
  • 🚩 API unavailable

2024-01-15 02:25:49 - 2024-01-15 02:27:24

  • Total downtime: 1 minute 35 seconds
  • 🚩 API unavailable

2024-01-15 04:50:55 - 2024-01-15 04:57:42

  • Total downtime: 6 minutes 47 seconds
  • 🚩 API unavailable

2024-01-19 22:53:45 - 2024-01-19 22:54:30

  • Total downtime: 45 seconds
  • 🚩 API unavailable
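The totals in the summary can be reproduced from the entries above. As a small sketch, the following parses the four API incidents, copied from this list, and adds up their durations; the helper name is made up for illustration, and the timestamps are treated as naive values in a single time zone.

```python
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def incident_duration(line: str) -> timedelta:
    """Parse 'START - END' as written in the details list and return END - START."""
    # The separator has spaces around the dash, so the dashes inside the
    # dates themselves are left alone.
    start_text, end_text = line.split(" - ")
    start = datetime.strptime(start_text.strip(), TIMESTAMP_FORMAT)
    end = datetime.strptime(end_text.strip(), TIMESTAMP_FORMAT)
    return end - start


# The four API incidents listed above.
api_incidents = [
    "2024-01-15 02:21:49 - 2024-01-15 02:23:19",
    "2024-01-15 02:25:49 - 2024-01-15 02:27:24",
    "2024-01-15 04:50:55 - 2024-01-15 04:57:42",
    "2024-01-19 22:53:45 - 2024-01-19 22:54:30",
]

total_api_downtime = sum(
    (incident_duration(line) for line in api_incidents), timedelta()
)

if __name__ == "__main__":
    print(total_api_downtime)  # 0:10:37
```

The same function can be applied to the website entries to check the 30 minutes and 6 seconds reported in the summary.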

Discussion

Comment from Andy Allan on 26 January 2024 at 15:10

> the checks are executed from a single server in the Hetzner datacenter in Germany

Then you are equally likely to be measuring the network availability of Hetzner.

Comment from NorthCrab on 26 January 2024 at 15:20

@Andy Allan, Hey! I already thought of that, and there are additional connectivity checks to my other server in Poland. I exclude any downtime that is also present on that server. :-)

> I also verify connectivity with non-OSM services (to prevent false positives)

By the way, do you by chance know anything about the official uptime OSM configuration? I have noticed that it reports more optimistic numbers, which may indicate a higher timeout limit (60 seconds, I assume?). I would love to see whether the timeout on the official checks could be reduced, as they do not seem fully indicative of the average user experience (applications won’t usually wait a minute for a response).

Comment from NorthCrab on 26 January 2024 at 15:22

I also forgot to mention:

> For the endpoint to be marked unavailable, two consecutive checks must fail.

So single connection drops are unlikely to be registered.

Comment from Andy Allan on 26 January 2024 at 16:02

> there are additional connectivity checks to my other server in Poland. I exclude any downtime that is also present on that server. :-)

Great!

> By the way, do you by chance know anything about the official uptime OSM configuration?

No, sorry, I don’t. I’m only involved in the software development, not in the production operations.

Comment from NorthCrab on 26 January 2024 at 16:08

Oh okay! Thank you anyway :-) And thank you for just being nice :-)

Comment from mmd on 26 January 2024 at 22:07

Yearly database re-indexing was running on the weekend of 01-14, with periods of fairly high load on the database server: https://prometheus.openstreetmap.org/d/Ea3IUVtMz/host-overview?orgId=1&var-instance=snap-01&from=1705130975266&to=1705276768520

This might have caused some queries to take longer than usual, or even to time out.

By the way, the CGImap link points to an outdated mirror. It should be https://github.com/zerebubuth/openstreetmap-cgimap instead.

Comment from NorthCrab on 27 January 2024 at 06:59

Fixed the link, thanks!
