OpenStreetMap

Firefishy's Diary Comments

Diary Comments added by Firefishy

Post When Comment
OpenStreetMap NextGen Development Diary #1

A note for readers: This diary entry is about a private project by NorthCrab. It is not endorsed by the OpenStreetMap Foundation or the OpenStreetMap Operations team. The details here are NorthCrab's alone and are not necessarily agreed by the groups involved with the running of OpenStreetMap.org.

Help purchase 1:50k Topographic series of Swaziland?

I am waiting for final confirmation from the private archive before I set up a funding website.

Once the British Library is up and running again I will renew my pass and look at their archive of the Directorate of Overseas Surveys (DOS) map sheets of Africa.

-

You are not currently banned. The temporary block placed on your account ended 5 days ago.

🌂 The Past, The Present, The Future

AWS dates back to at least 2022, whereas the free AWS credits began just ~6 months ago.

Our AWS usage dates back to late 2017, and the amount of data we store has grown over time. We received the AWS credits in August 2022. The credits expire at the end of August 2023. We have already applied for the next year’s worth of credits. If we don’t receive them we have budgeted a contingency, but will likely immediately terminate the US render server running on AWS. We have extra physical hardware being delivered to OSUOSL to increase the US render server capacity.

🌂 The Past, The Present, The Future

It also depends on the storage region

The wal bucket is in eu-west-2 due to latency (accessed primarily from Dublin & Amsterdam). The cloudtrail bucket is in eu-west-2 (unsure why). The replicated backup buckets for rails user assets (gpx-trace, gpx-images, avatars) are in eu-north-1 (greenest and cheapest in EU). All other buckets are in eu-west-1.

🌂 The Past, The Present, The Future

However, it’d be worthwhile to have a comprehensive comparison between various cloud services, especially considering other potential costs.

Do you want to produce it? I can give you all the raw export data. It would need to account for the risks and the effort required to switch solutions.

For example, expenses like “openstreetmap-wal - 28.7 TB - $400/month” still seem quite substantial.

Snippet change from our private terraform AWS repo (summary: halved the days stored and moved to a cheaper storage tier sooner). It should reduce the billing to less than $200/month.

PS: We’ll likely move from terraform to opentf once it is better established.

🌂 The Past, The Present, The Future

On why we use AWS: As mentioned previously, we use Rails Active Storage for the osm.org website. AWS S3 has the best compatibility with Active Storage. We started using AWS S3 to store avatar images back in July 2019. Prior to that we used NFS, which was tied to a single export host (single point of failure) and was not reliable when running across data centres (IO + net latency issues causing timeouts). I built & tested self-hosted Ceph storage clusters, but designing & running a multi-site replicated cluster for OSM would be an extreme burden for the Ops team; we have a lot more to get on with than worrying just about storage. Much of the discussion is here. There is likely also discussion in the Ops meeting minutes where we thrashed out options a few years back.

Backblaze B2 only started offering an S3 compatible object storage API in May 2020. Hetzner is expected to offer S3 compatible object storage API in 2024. I see someone has written a dedicated Backblaze B2 gem for Active Storage support, but it doesn’t look particularly well supported.

Professionally I have 10 years of real experience building and consulting on creating secure AWS, Azure & GCP solutions. I’ve also built and consulted on large on-premise hosting solutions for US/UK financial and security services where using cloud hosting was not permitted. At my previous employer I was AWS certified. I understand the real costs involved in building different solutions.

On backups…

As of today (excluding logs) we have 230.8 TB backed up. 200.8 TB is in the AWS S3 Glacier Deep Archive storage tier ($0.00099 per GB). 27.4 TB is in the AWS S3 Standard-Infrequent Access storage tier ($0.0125 per GB).

Total AWS cost: $554/month ($0/month after the free credits we’re using)

By comparison Backblaze B2 costs $0.005 per GB.

Total Backblaze cost: $1168/month.
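The two totals above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes 1024 GB per TB (the conversion that happens to match the quoted figures) and flat per-GB monthly pricing with no request or retrieval fees. The names `GB_PER_TB`, `AWS_TIERS` and `monthly_cost` are made up for illustration. Note that the two tiers listed sum to 228.2 TB, and that is the size that yields the $1168 Backblaze figure; how the remaining 2.6 TB of the 230.8 TB is stored is not stated here.

```python
# Assumption: binary terabytes (1024 GB per TB); this reproduces the quoted totals.
GB_PER_TB = 1024

# Tier name -> (size in TB, price in USD per GB per month), as quoted in the comment.
AWS_TIERS = {
    "Glacier Deep Archive": (200.8, 0.00099),
    "Standard-Infrequent Access": (27.4, 0.0125),
}
B2_PRICE_PER_GB = 0.005


def monthly_cost(size_tb: float, price_per_gb: float) -> float:
    """Storage-only monthly cost in USD for a flat per-GB price."""
    return size_tb * GB_PER_TB * price_per_gb


# AWS: each tier is billed at its own rate.
aws_total = sum(monthly_cost(tb, price) for tb, price in AWS_TIERS.values())

# Backblaze B2: the same data volume at a single rate.
tiered_tb = sum(tb for tb, _ in AWS_TIERS.values())
b2_total = monthly_cost(tiered_tb, B2_PRICE_PER_GB)

print(f"AWS:          ${aws_total:.0f}/month")
print(f"Backblaze B2: ${b2_total:.0f}/month")
```

Most of the difference comes from the Deep Archive tier: roughly 88% of the data sits at about a fifth of the B2 per-GB price.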

Yes, we could likely save money by deduplicating and moving from our TGZ backups to borgbackup / borgmatic. I had an informal discussion about this a few months back during our Ops fortnightly call. I’ve said I’d come up with a proper proposal, but it is extremely low priority and needs to be proven on a small scale before we can start to trust it. Using multiple TGZ files is simple and reliable.

In summary: AWS is cheap, reliable, we have previous experience with it and it is well matched to our limited needs allowing us to focus our time/effort on the many other parts of OSM infrastructure which require attention. The ops team run things we can realistically support. Our usage of cloud services is limited, pragmatic and we are not tied/locked-in to any cloud provider.

Why didn’t we publish all the decision detail before? We did, or at least tried to; it maybe isn’t ideally collated / minuted, wanna help? Why did I discuss AWS sponsorship on Slack? Because that is where the OSM US community live and the sponsorship is primarily used for the render server for USA. Why didn’t we publish all the metrics / cost breakdowns / other details for AWS usage? Because nobody prior to you ever asked about it, but we have now added backlog tickets to automate publishing some of it. We are an open project; we do not intentionally hide anything.

This has been extremely draining for me. I am human.

I am on IRC (OFTC network) in #osmf-operations or https://en.osm.town/@osm_tech or https://twitter.com/osm_tech if you have any specific follow-up questions.

🌂 The Past, The Present, The Future

@NorthCrab You and I are both Linux admins, we both map for fun, we both are in agreement that wasting money on cloud services, especially when there is physical hardware to hand, would be a dishonourable way to spend donated project money.

The OSM ops team is extremely small. We don’t have the resources (human capital, time, electrical power or even large enterprise disks) to run a large on-premise ceph / gluster / NFS / whatever cluster to store the shared data (across DCs) and backup data we store. I would not feel comfortable locating important backups in the same racks that we are backing up. Building a cluster that meets our requirement of surviving a data centre outage would be difficult. In the past, online and offline, we have spent a long time discussing options, for example: https://github.com/openstreetmap/operations/issues/169

Our use of AWS (and S3) is limited and is the pragmatic choice. The amount of money we spend on AWS is small and, I believe, justified. Our AWS costs currently being 100% covered by free credits helps with the value proposition and allows us, the ops team, to focus on running the many other aspects of the OSM infrastructure that require our attention.

It would be great if we could meet up and find common ground. A few times a year I visit family in Lithuania, and I see you map in Poland; maybe we could meet up for drinks in Warszawa? Or video call or whatever.

Hardweg 17

Wow, very nice work!

By the way, this whole account is a joke but...

It isn’t just a G with colour, it is a logo trademarked by Google. Please change it.

Thanks

I’ll bite… What happened that you wasted your time? OSM can be great fun; it gets me outdoors mapping and discovering my neighbourhood.

Community.osm.org - how's it going?

I keep a close eye on the translation feature of community.osm.org. It is an important feature for our community and I am happy to adjust any rate limits if they are causing any problems. I would prefer if the Tips remained where they are.

Community.osm.org - how's it going?

The 4 second page load time is a difficult one. I cannot replicate it. Google report Good “Core web vitals” for 99%+ of the site for both Mobile and Desktop. There might be some edge case, but I am not seeing it.

Community.osm.org - how's it going?

I completed a big tidy of some of the old forum imported categories today. I tagged the topics as relevant and then merged them into existing categories, e.g. Category: General, Tag: Garmin.

Peering into Yesteryear

I quite like https://every-door.app/ for updating POIs. You can also “green tick” places to verify that they still exist.

I hope the app becomes more popular with the community.

By the way, this whole account is a joke but...

Also change your image. You are using a copyrighted/trademarked logo without permission.

Every Door is a game changer

ā¤ļø 100% agree. It is such an awesome app for adding detail and missing features/POI.

There are many “pro” features like being able to switch a disused shop to another type by clicking on the title bar when the feature is open… Or the “hidden” in plain sight button for adding social links to a feature.

How to use Every Door

Every Door v2.0 isn’t yet available in the Google Play Store. Is Google worried about this awesome OpenStreetMap mapping app?!? ;-)

System requirements for the server

It depends on what feature you want to run. Nominatim? Tile rendering? API? Something else? They all use a different setup.

What the robots.txt file does

Thank you for the write-up.

The /diary disallow is a recent temporary measure to mitigate against some of the spam we’ve recently had and will be removed in a few days.
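For readers unsure what such a rule does in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt fragment, the `ExampleBot` user agent and the `is_allowed` helper are assumptions for illustration; the actual file served by osm.org may contain other rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt fragment; the real file on osm.org may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /diary
"""


def is_allowed(path: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if a well-behaved crawler may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, "https://www.openstreetmap.org" + path)


for path in ("/diary", "/diary/123", "/about"):
    print(path, is_allowed(path))
```

Disallow rules match by path prefix, so everything under /diary is excluded for compliant crawlers while other paths stay crawlable. The rule is advisory only; it does not block clients that ignore robots.txt.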