OpenStreetMap

🌂 The Past, The Present, The Future

Posted by NorthCrab on 15 August 2023 in English. Last updated on 17 August 2023.

Over the past few days, my perspective on OpenStreetMap (OSM) has undergone a seismic shift. For the longest time, I held OSM in the highest regard, viewing it as a beacon of transparency and people-centricity amidst a sea of profit-driven tech conglomerates.

Chapter 1: The Banner of Surprises

My view started to waver a couple of days ago when a new donation request banner popped up on the main OSM page. Though its sheer size and prominence were unsettling, the real issue lay in the near-invisible close button. When I voiced concerns about whether it was ready for production, I was met with opposition. Was this a shift in OSM’s priorities?

Chapter 2: The Amazon Alliance

Another revelation added fuel to the fire. Scrutinizing OSM’s 2022 and 2023 budgets, I stumbled upon a yearly expenditure of €24,000 directed to Amazon for S3 storage. This alliance with the tech giant seemed out of character for OSM. With an impressive arsenal of OSM-owned servers, the choice to rely on Amazon’s rented storage raised questions about OSM’s commitment to independence.

Chapter 3: Speaking My Mind

Emotions ran high as I took to the forums to articulate my frustrations. Admittedly, my initial approach was less than tactful, potentially causing offense. I have since rephrased my statements, but my core concerns remained:

  1. OSM’s puzzling choice to rely on Amazon’s AWS.
  2. The opaque nature of information regarding S3 use within OSM.
  3. The seemingly wasteful storage of files on the expensive S3.

These revelations painted a rather concerning picture of OSM’s current trajectory.

I intentionally choose not to delve deeply into the now-publicized details concerning S3 dependence here. For a comprehensive understanding, I’d recommend referring to the original thread. Rehashing all the specifics here would be redundant.

Chapter 4: Lost in Dialogue

As conversations unfolded, a pattern began to emerge. Some comments seemed baseless, and when challenged, many simply turned a blind eye [1] [2] [3] [4]. This behavior wasn’t exclusive to the Amazon issue. It seemed to be a growing trend [1], even among those holding significant roles within OSM. A constructive dialogue relies on evidence-based arguments, and I found this lack of engagement disheartening.

Chapter 5: The Silenced Voices

My frustrations were further amplified when certain posts were taken out of context, resulting in the closure of the entire discussion thread. While I acknowledged my initial lack of professionalism, I took steps to rectify it. The decision to close the thread felt like a suppression of open dialogue, an ethos I believed OSM staunchly upheld.

Chapter 6: Reflection and Realization

This series of events forced me to confront a bitter truth: OSM’s priorities seem to have shifted. While my concerns about transparency were met with indifference, minor transgressions in tone drew significant attention. My principles are unwavering: I can only support an OSM that champions transparency, values its community, and remains open to feedback.

Endnote: Parting Ways, but Not in Spirit

My journey with the global OSM, in its current state, must sadly come to a halt. Yet, this doesn’t signify the end of my commitment to the open-source community. I will continue to support local OSM projects, like the OpenAEDMap, and will ensure that my AI-related projects [1] [2] remain accessible to all. While my vision of revolutionizing OSM mapping might be on hold, my passion for open source and my dream of a transparent, user-centric digital world remain undeterred.

[Image: Sunset. Photo by Alvesgaspar.]

Discussion

Comment from Awoobis on 16 August 2023 at 13:24

The king is dead, long live the king.

Comment from SimonPoole on 16 August 2023 at 14:27

I can only comment on 2) and following.

Nobody had or has any problem with your question with respect to the usage of AWS.

However, you can’t reasonably expect everybody to drop what they are doing to get you an immediate detailed answer for what is, in the larger scheme of things, a small line item in the budget. Nor should you expect to find every detail documented in advance.

It was clearly pointed out to you that the person responsible on the technical side of things was on vacation and wouldn’t be available before today, but you didn’t pause your barrage of requests even just for two days.

Instead of blaming everybody else for your totally over-the-top reaction to the pushback you deservedly received, maybe you should calm down and reflect a bit.

Comment from spatialia on 16 August 2023 at 18:18

I encourage anyone who reads this post to go read the original thread as this post repeatedly misrepresents numerous facts: https://community.openstreetmap.org/t/why-does-osmf-budget-25-000-on-amazon/102475

A few items:

* Nobody was silenced. The thread was temporarily locked for everyone to cool things down. It helped. You’re free to go post further questions there now.
* As I said in the thread - the transparency you want can be achieved without accusing people of making bad decisions. Everything here could have been easily achieved by asking questions and listening to the answers. You haven’t appeared to do either one and continue to blame others in this post without truly accounting for yourself. Regardless, you received the information you were looking for - something you don’t mention here. Your request was not met with indifference, but instead was met with a full accounting of every bit of AWS usage by OSM. You weren’t entitled to this, but were provided with it because the people involved clearly agreed with you that transparency is good. Everyone seems to agree on that despite what you assert here. Instead of responding to receiving the information you asked for in the thread you started, you’re posting here stating that people didn’t care about your concerns. I’m genuinely confused by this.
* I’m also confused as to why you continue to assert that OSM spends 24k Euros on AWS when that has been repeatedly shown not to be the case. This is false information that you keep repeating. Graham explicitly stated numerous times that those costs are donated to OSM by Amazon and are not recoverable as cash. No money is being spent by OSM on these services. You received the transparency you’ve been asking for, but continue to ignore that the information has been provided to you. Why?

I agree with Simon. It sounds like taking a break to reflect on your own actions would be a good thing.

Comment from NorthCrab on 16 August 2023 at 20:58

I appreciate both Simon and spatialia taking the time to respond and share their thoughts.

1) I’m well aware of the entirety of the conversations that took place, which is why I would urge anyone commenting here to carefully reread the diary, as it already addresses many of your concerns.

2) While I acknowledge that the thread was locked to ‘cool things down’, it’s worth noting that this occurred at a time when activity had already subsided. It seemed a tad selective in allowing only one side to continue sharing their thoughts, which felt inconsistent and censorship-like.

3) Conversations become truly productive when both sides provide justified reasoning for their positions. I’ve been earnestly looking for such constructive exchanges.

Comment from Minh Nguyen on 16 August 2023 at 21:58

To your first point above: the close button on the banner was not about you. A number of us experienced the bug, yours truly took the time to calmly report the bug, you had some suggestions for fixing it, and it got fixed a different way. I’m sorry it didn’t get fixed in quite the way you suggested. Personally, I was pleasantly surprised at the turnaround time, and I don’t see any motive behind the bug that can be tied to the incident about AWS credits.

To your other points: I’m just a simpleton to whom clouds are welcome relief from the incessant sun in this part of the world. Simpletons like me don’t know what to do with all this melodrama.

Comment from Fizzie41 on 16 August 2023 at 22:18

I’m sorry to see you leave, NC, hope you’ll be back one day, & would like to thank you for your many contributions to OSM.

I also hope that your reverter https://revert.monicz.dev/ will stay operational?

Comment from RobJN on 16 August 2023 at 22:19

I can shed some light on Item 1. The code for the banner was simply re-used with a different image of the same dimensions as previous promotional campaigns. Those previous campaigns have typically been to raise awareness of the community run State of the Map conferences. I suspect when it was first done care was taken so that the close icon (cross in the top right corner) was more visible.

So in summary there was no bad intent to make it hard to find the close icon, just a volunteer working quickly to get something online (quickly because it is their personal time they are volunteering) and they didn’t realise this. The alternative is to pay a professional services company to handle these things but even they make mistakes all the time!

Comment from NorthCrab on 16 August 2023 at 22:24

@Fizzie41, I am not leaving OSM. I will continue to assist in my local OSMP community and maintain all projects, as they remain valuable to many people. My decision is to discontinue involvement in global OSM issues, as I intend to shift my focus towards other matters.

Comment from NorthCrab on 16 August 2023 at 22:28

@RobJN, and that’s a sign of OSM changing. I have always assumed that OSM does not want to release half-baked features that harm the user experience. The alternative you mentioned is not the only approach. As you can see now, the issue has been resolved; it only required a little more time to be refined.

Comment from MxxCon on 17 August 2023 at 05:14

There are multiple reasons why AWS is being used at all. In addition to the sponsorship that covers a significant portion of the cost, the overall services offered by AWS are the most complete and extensive of any cloud service provider. So while for now OSM might be using only a couple of AWS services, being already in this ecosystem makes it easier to integrate with other services if there’s ever a need for that.

Yes, it’s possible not to use cloud providers at all and to run your own hardware. But that hardware and hosting is not free either and requires significant upfront spending in addition to the ongoing costs of hosting and maintenance.

Yet hardware hosting removes the flexibility that a cloud service provider offers by using and paying only for the resources actually used, dynamically scaling with the demand.

With fixed hardware hosting you’d need to spend upfront to accommodate for the maximum possible spikes in demand you might encounter while not using all that extra hardware.
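To make the peak-versus-average point concrete, here is a minimal sketch. The hourly demand figures are made up purely for illustration and are not OSM traffic data, and the function names are hypothetical. It only counts capacity units; it says nothing about price per unit, which in practice often differs between rented and owned capacity.

```python
# Hypothetical hourly demand over one day, in arbitrary "server units".
# These numbers are invented for illustration only.
hourly_demand = [2, 2, 2, 2, 3, 4, 6, 8, 10, 9, 7, 6,
                 6, 7, 9, 12, 10, 8, 6, 5, 4, 3, 2, 2]


def fixed_capacity_units(demand):
    """Fixed hardware has to cover the peak in every hour, used or not."""
    return max(demand) * len(demand)


def on_demand_units(demand):
    """Pay-per-use capacity is counted only for what each hour needs."""
    return sum(demand)


def utilisation(demand):
    """Share of the peak-sized fixed capacity that is actually used."""
    return on_demand_units(demand) / fixed_capacity_units(demand)


if __name__ == "__main__":
    print("fixed capacity unit-hours:", fixed_capacity_units(hourly_demand))
    print("on-demand unit-hours:", on_demand_units(hourly_demand))
    print("utilisation of fixed capacity:", round(utilisation(hourly_demand), 2))
```

With this invented profile, less than half of the peak-sized capacity is in use on average. Whether that makes either option cheaper depends entirely on the unit prices, which this sketch deliberately leaves out.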

Ultimately you should look at it this way: What is this organization’s core competency? Is it wrangling server hardware or providing a mapping service? If it’s providing a mapping service then it’s a sensible decision to outsource hardware management to somebody who makes that THEIR core competency, i.e. AWS.

Comment from Firefishy on 17 August 2023 at 05:55

@NorthCrab You and I are both Linux admins, we both map for fun, we both are in agreement that wasting money on cloud services, especially when there is physical hardware to hand, would be a dishonourable way to spend donated project money.

The OSM ops team is extremely small. We don’t have the resources (human capital, time, electrical power or even large enterprise disks) to run a large on-premise Ceph / Gluster / NFS / whatever cluster to store the shared data (across DCs) and the backup data we store. I would not feel comfortable locating important backups in the same racks that we are backing up. Building a cluster that meets our requirement of surviving a data centre outage would be difficult. In the past, online and offline, we have spent a long time discussing options, for example: https://github.com/openstreetmap/operations/issues/169

Our use of AWS (and S3) is limited and is the pragmatic choice. The amount of money we spend on AWS is small and, I believe, justified. Our AWS costs currently being 100% covered by free credits helps with the value proposition and allows us, the ops team, to focus on running the many other aspects of the OSM infrastructure that require our attention.

It would be great if we could meet up and find common ground. A few times a year I visit family in Lithuania, and I see you map in Poland, so maybe we could meet up for drinks in Warszawa? Or a video call, or whatever.

Comment from NorthCrab on 17 August 2023 at 07:02

@Firefishy

First and foremost, thank you for your thoughtful response and willingness to open up a dialogue about OSMF’s infrastructure choices. I genuinely appreciate it.

I want to clarify something upfront: When I first brought up this concern, there was no public knowledge about OSMF’s free AWS credits. The information available at that time suggested that we had not only budgeted for, but were also spending, a significant amount on S3. Now, while I wholly understand OSMF’s monetary constraints, and I respect the pragmatic choices that are often required in such situations, there are core principles I stand by that deeply inform my perspective.

One of my core beliefs is the importance of creating and utilizing software that respects its users. I have a profound aversion to the practices of many big-tech companies. In my eyes, they commodify human beings and often operate in ways I find ethically objectionable. OSM has always been a sanctuary for me in this regard – an embodiment of what the tech world can be when driven by pure intent and community spirit. To discover the organization’s ties to Amazon deeply affected my perception of OSM’s independence.

I’m not suggesting that OSM should immediately invest in its own hardware, though it would be ideal. My primary concern revolves around our reliance on Amazon. I believe OSM should be cautious of anchoring itself too deeply to a company whose ethos might not resonate with ours. While AWS, with its current credits, might seem cost-effective now, there are numerous other solutions available that may better embody OSM’s guiding principles. -Did you hear about the free TVs? Don’t dance with the devil.

Regarding backup deduplication: While I acknowledge the complexities of setting it up properly, I don’t see it as prohibitively difficult. When we’re discussing potential savings of tens of thousands of Euros – especially considering how limited some OSM groups’ budgets are – it seems well worth the effort. Plus, having a deduplicated backup would significantly ease any future transitions, potentially even to self-owned hardware, given the reduced storage demands.
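For readers unfamiliar with the idea, here is a minimal sketch of what backup deduplication means in principle. It is not how OSM's backups work and not a proposal for them: the function names, the tiny fixed chunk size, and the sample data are all hypothetical, and real backup tools typically use content-defined chunking, compression, and encryption on top of this.

```python
import hashlib


def deduplicate(backups, chunk_size=4):
    """Split each backup into fixed-size chunks and store each distinct
    chunk once, keyed by its SHA-256 digest.

    Returns (store, manifests): store maps digest -> chunk bytes, and
    manifests maps backup name -> ordered list of digests.
    """
    store = {}
    manifests = {}
    for name, data in backups.items():
        digests = []
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # keep only the first copy
            digests.append(digest)
        manifests[name] = digests
    return store, manifests


def restore(store, manifest):
    """Rebuild a backup by concatenating its chunks in manifest order."""
    return b"".join(store[digest] for digest in manifest)


if __name__ == "__main__":
    # Two invented backups that share most of their content.
    backups = {"monday": b"AAAABBBBCCCC", "tuesday": b"AAAABBBBDDDD"}
    store, manifests = deduplicate(backups)
    raw = sum(len(data) for data in backups.values())
    stored = sum(len(chunk) for chunk in store.values())
    print("raw bytes:", raw, "stored bytes:", stored)
    assert restore(store, manifests["tuesday"]) == backups["tuesday"]
```

The saving comes only from chunks that repeat between backups, so how much it would help in any real setup depends on how similar successive backups are.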

Your dedication and professionalism are truly commendable. I would be grateful for an opportunity to converse further on this topic. Whether it’s a call or another method, I believe dialogue can pave the way to mutual understanding. After all, our shared objective is the betterment of OSM.

-K

Comment from Koreller on 17 August 2023 at 08:19

As an OSM contributor, I have my doubts about your sincerity,

“Admittedly, my initial approach was less than tactful, potentially causing offence.” → “A constructive dialogue relies on evidence-based arguments”

“OSM’s priorities seem to have changed” → They haven’t changed

26,000 euros for the technical infrastructure of a project like OSM on a global scale for AWS, that seems okay to me.

We’re concerned about our independence and the people who are spending money on AWS in the first place.

On the question of transparency, I think we can do better, in fact I think it’s more a question of communication.

Comment from o_andras on 17 August 2023 at 11:12

First of all, I’m sorry to hear a contributor as active as you has decided to leave. It is always a shame and a big loss. OSM has come this far because we’re many. There is a saying in Portuguese: “grão a grão enche a galinha o papo” (lit. “grain by grain fills the chicken its belly”).

As others have also already said, the picture of the AWS expenditure that emerges from the thread is quite different from what’s painted in this diary. I haven’t followed up on the other points because my time is limited and I’d rather spend it on getting shit done.

there are core principles I stand by that deeply inform my perspective.

Core principles may be informed by facts, observations, etc. But core principles cannot inform perspectives, only influence.

One of my core beliefs is the importance of creating and utilizing software that respects its users. I have a profound aversion to the practices of many big-tech companies. In my eyes, they commodify human beings and often operate in ways I find ethically objectionable.

I share with you this principle/belief. I thank you for striving and putting in effort for a better world in this front, such as by contributing to OSM and FLOSS, and I hope you continue to do so for many years to come!

However, at the heart of an open project (such as OSM), we must put emphasis on us, the humans, and our interactions. If not for codified norms (CoC, ethical, law, cultural/social norms, …), at least for the prosperity of the said open projects. Because humans are not a commodity, as you so said, and without the human contributors the said open projects would not exist. So try to give your fellow humans reasons to continue in the project. If, no matter why, you can’t give reasons to continue, try at least not to give them reasons to leave.

Unfortunately, from several of your posts I saw more reasons not to engage with you than reasons to engage with you, and I can only imagine others felt the same. I just hope none of the core pillars of this project (such as @Firefishy of the OWG) have felt any reason to leave.

Comment from pnorman on 17 August 2023 at 21:25

@RobJN, [banners that interfere with the close button are] a sign of OSM changing.

No, releasing a banner which conflicts with the close button is a sign of OSM staying the same. It’s happened with SOTM banners in the past, and probably will in the future.

Comment from MxxCon on 17 August 2023 at 23:23

I want to clarify something upfront: When I first brought up this concern, there was no public knowledge about OSMF’s free AWS credits. The information available at that time suggested that we had not only budgeted for, but were also spending, a significant amount on S3.

That is incorrect. You are conflating YOU not being aware of it with “no public knowledge”. That is absolutely not the case. The migration to AWS and Amazon’s sponsorship were public knowledge and were discussed in GitHub issues, Slack, Discord and possibly elsewhere. I was aware when it was happening and participated in some of the conversations about it. I’m just a regular user, not associated with OSMF or any other group. 🤷‍♂️ Just because YOU were not aware of it happening doesn’t mean it was done in secret.

In my eyes, they commodify human beings and often operate in ways I find ethically objectionable.

Yet Amazon has spent, and still spends, probably millions of dollars on contributing to OSM. https://wiki.openstreetmap.org/wiki/Organised_Editing/Activities/Amazon All these people don’t work for free. Even if it’s minimum wage, that’s still A LOT of people hired by Amazon to add, edit and enrich data in OSM. Yes, Amazon themselves don’t do it out of a charitable attitude. They profit from it too. But that’s the nature of OSM, its open-source status and its license.

-Did you hear about the free TVs? Don’t dance with the devil.

I don’t see how that’s relevant to anything that’s going on here. If you are implying that AWS somehow harvests personal data of users of their customers, you are mistaken and should read AWS’ ToS, AUP and Privacy Policy.

On the other hand you use GitHub which is owned by Microsoft. Are you fine with all the ethics of that company?

Regarding backup deduplication: …………..

There are a thousand different things that could be improved in OSM’s infrastructure. But there are only so many hours in the day and only so many people working on it. They are working on what they consider to be of the highest priority and having the most impact. Changing backup structure is probably not as critical as other things.

tens of thousands of Euros

You are still sticking to those numbers even though people have shown many times that it’s an incorrect premise.

Comment from NorthCrab on 18 August 2023 at 00:47

@MxxCon

Thank you for taking the time to share your perspective with me.

Amazon’s sponsorship

Can you point me to where the details about the ongoing Amazon’s sponsorship were made public? I based my findings on the OWG official website. Most people would consider that a reliable primary source. Saying it was discussed “possibly elsewhere” seems a bit vague. All the information I gathered suggests that the sponsorship is still in the planning stages.

Amazon’s contributions

I acknowledge and respect Amazon’s efforts to support OSM. Their commitment, in terms of resources, to this open-source project is evident. My intent wasn’t to overlook their contributions, but rather to express my concerns about OSM’s philosophical direction and values.

GitHub & Microsoft

I’m aware of GitHub’s ties to Microsoft and the ethical considerations that surround this. However, as it stands today, there’s no alternative platform with such a vibrant and active community as GitHub. This is a sentiment that can be mirrored with platforms like YouTube & Google. While I use these services, I consciously limit my exposure by adopting alternative frontends, and ensuring that any information I share is intended for public viewing. It’s a balancing act of leveraging the benefits of mainstream platforms while upholding my privacy values. On that note, I’d like to highlight that all my private projects reside on the Gitea platform.

Spending vs Budgeting

At the time of my initial comments, I was working off the information available to me. As discussions evolved, new details came to light. It’s essential to remember the context of initial inquiries and the state of knowledge at that moment. I have since acknowledged the newly released information, and any assumption to the contrary would be inaccurate.

Comment from MxxCon on 18 August 2023 at 02:51

Can you point me to where the details about the ongoing Amazon’s sponsorship were made public? I based my findings on the OWG official website. Most people would consider that a reliable primary source. Saying it was discussed “possibly elsewhere” seems a bit vague. All the information I gathered suggests that the sponsorship is still in the planning stages.

at least https://osmus.slack.com/archives/C029HV951/p1668206351595599, https://github.com/openstreetmap/operations/issues/682 and other issues linked from there.

My intent wasn’t to overlook their contributions, but rather, to express my concerns about OSM’s philosophical direction and values.

So far I haven’t seen anything about OSM operations that would affect OSM’s “philosophical direction” as the consequences of accepting that hosting sponsorship. It wasn’t done with conditions of allowing Amazon Logistics editors to submit trash data into OSM or somehow to promote Amazon’s services.

Ok, so if OSM would’ve refused AWS’s offer they’d have to spend a significant amount of money to get a new hosting solution, and it would actually cost Amazon less money! So that decision would help Amazon and hurt OSM…But you sure showed them with your values…🤷‍♂️

Comment from NorthCrab on 18 August 2023 at 03:07

@MxxCon

This will likely be my final comment on this matter directed to you. Firstly, the Slack link you’ve shared isn’t openly accessible without an account; it can’t be classified as a public resource. As for the GitHub issue, it doesn’t reference any ongoing AWS sponsorship. It seems there’s a disconnect in our conversation: my concerns center on OSM’s reliance on Amazon at all, not on refusing any sponsorship (or any other deals) from AWS.

Comment from MxxCon on 18 August 2023 at 03:22

The very first line of #682 is a link to issue #637 where the 7th comment shows the offer of sponsorship and following replies. Then in #682 the 3rd bullet point is

✔ Get credits from AWS

What do you think that issue talks about if OSM didn’t accept that offer? 🤷‍♂️

Firstly, the Slack link you’ve shared isn’t openly accessible without an account; it can’t be classified as a public resource.

Anybody is free to create an account there. There’s no requirement that OSM must make broadcast announcements in your local free newspaper for you to see about it. OSM is a huge project and it’s unrealistic to think that you can be aware of everything that’s going on in all the corners of it.

Comment from NorthCrab on 18 August 2023 at 03:33

I appreciate the references you’ve provided. Focusing on issue #682, which you’ve specifically mentioned: “Getting credits from AWS” doesn’t explicitly confirm a free sponsorship from AWS. The information I originally gathered already indicated a financial investment by OSMF in Amazon, so the point about “getting credits” brings no new information.

It’s important to differentiate between accessibility and public resource. Yes, anyone can create an account on Slack, but that doesn’t classify it as an open public resource. Just as signing up for a newsletter doesn’t make the contents of that newsletter universally accessible.

Transparency in a project like OSM is of the utmost importance. I’ve voiced concerns about the project’s direction and transparency, not to attack OSM but to emphasize its importance.

Lastly, I’d appreciate it if you could ensure your arguments are well-considered before presenting them. Continually addressing ambiguities is becoming quite time-consuming for me.

Comment from Friendly_Ghost on 18 August 2023 at 17:51

@NorthCrab

I’ve voiced concerns about the project’s direction and transparency, not to attack OSM but to emphasize its importance.

This left a bad taste in the mouths of several people because it sounded more like an accusation than a concern. A bit more nuance here could have prevented some misunderstandings.

Lastly, I’d appreciate it if you could ensure your arguments are well-considered before presenting them. Continually addressing ambiguities is becoming quite time-consuming for me.

This goes for the people who are replying to your arguments as well. Consider for example Firefishy who spent his holiday time writing a detailed budget report on the forum, only to hear you complaining about “a lack of grounded argumentation” later on without even a “thank you.” None of us wish to have arguments in this way.

Comment from o_andras on 18 August 2023 at 18:10

@MxxCon

Anybody is free to create an account there. There’s no requirement that OSM must make broadcast announcements in your local free newspaper for you to see about it.

Being free is different from being open. And in case you’re not aware, take a look at https://wiki.osmfoundation.org/wiki/Commitment_to_Open_Communication_Channels

Comment from MxxCon on 18 August 2023 at 18:40

Being free is different from being open. And in case you’re not aware, take a look at https://wiki.osmfoundation.org/wiki/Commitment_to_Open_Communication_Channels

Then petition OSMF to ban all Slack, Discord and Telegram channels 🤷🙄 The topic was posted on GitHub, so the issue is moot regardless. If anybody had concerns about those issues they should’ve raised them at that time.

Comment from o_andras on 18 August 2023 at 19:07

@MxxCon I think you’re missing a couple of points.

The (possible) problem is not the use of non-open communication channels. It is not having the relevant information (whatever it might be, e.g. financials in this whole drama) in open communication channels.

If said information is available on GitHub (as you said), which is open for viewing even without an account, then I’d say that’s fine (unfortunate, IMO, a better option would be e.g. the Wiki, but fine nonetheless!).

If said information is only available in a Slack/Discord/whatever group (which I believe is not open for viewing without an account), then that is a problem due to the “Commitment to Open Communication Channels”


BTW note that the “Commitment to Open Communication Channels” applies only to OSMF communication channels, but not to local communities that manage/administer their own communication means (not sure about “local chapters” though?).

Comment from apm-wa on 18 August 2023 at 19:53

Regarding your statement, “This series of events forced me to confront a bitter truth: OSM’s priorities seem to have shifted,” I suggest that you consult two sets of documents that articulate OSM’s priorities. They are relevant to your discussion in that they partially explain why reliance on cloud services was adopted by the OWG, and the degree to which this has been approved by the vast majority of the OSM community.

First, see the results of the 2021 OSM community survey, posted online at https://wiki.osmfoundation.org/wiki/2021_Survey_Results. An easily digested summary is in this PDF file: https://wiki.osmfoundation.org/w/images/2/27/2021_survey_slides.pdf but you are of course welcome to download and delve into the anonymized data as well.

Two highlights from the survey are relevant to your discussion of priorities:

“Just over 81 percent of respondents approved or strongly approved of the Board’s decision to begin raising funds via large donations. 4.3% disapproved or strongly disapproved. The raw mean differed from the weighted mean by 7/100ths of a point.”

Respondents were asked to vote on priorities (methodology is explained in the PDF). Stability of core infrastructure won handily, with 11,249 points. When we broke down responses by various categories, it remained by far the lead concern of the community: “Shifting to the community sentiment questions, the first asked for a sense of what priorities the Board should set for 2021. Stability of the core infrastructure was a clear winner across the three demographics we have checked so far, which are OSMF members, respondents with more than 15 years in the project, and mappers. No other issue comes close.”

The second item I recommend you read is the preamble to the Strategic Plan Outline published in 2021, found online at https://wiki.osmfoundation.org/wiki/Strategic_Plan_Outline#Preamble. In particular, note these statements contained in the preamble:

“The project and its community seek not growth per se, but rather data quality, consisting of accurate and ever broader, deeper, and more detailed geographic coverage of its database, meanwhile ensuring that this database will remain free of charge and free to use, to allow anyone, anywhere, to create a ‘map of the world that anyone can use’. As a result of this philosophy, however, growth has found OpenStreetMap, and demand for its data now increases by no less than 20 to 30 percent year on year. This growth is straining the project’s volunteer workforce, its hardware and software platform, and it threatens the long-term viability of the project. Unlike most private companies, which seek growth and develop strategic plans to achieve it, OpenStreetMap is in the position of needing a strategy for coping with a growth rate it did not and does not intentionally encourage…”

“A limit on growth of expenditures on administration of the project is also core to OSM’s philosophy: we often hear the refrain that OSM should not become another opaque and inaccessible bureaucracy-heavy NGO, with large paid staff. Adherence to this core philosophy will ensure that OSM remains a free project, independent from influential donors. By empowering volunteers rather than staff, it will also remain a vibrant project that attracts enthusiastic contributors, because it will remain fun as well as useful – and we volunteer contributors, at the end of the day, are why OSM is today the success story that it is.”

Comment from NorthCrab on 18 August 2023 at 23:31

@Friendly_Ghost

Firefishy … only to hear you complaining about “a lack of grounded argumentation” later on without even a “thank you”

Sorry, but what? This sounds like a false accusation, so please provide some evidence that I did indeed point out to Firefishy the lack of grounded argumentation in his detailed budget report. I believe that my only response was that I will not engage in further discussion for reasons mentioned.

Comment from NorthCrab on 18 August 2023 at 23:51

@apm-wa

Thank you for the comprehensive overview and directing me to those resources. It’s clear from the 2021 OSM community survey and the Strategic Plan Outline that there was a strong consensus within the community to prioritize the stability of core infrastructure. This sheds more light on the overall situation, and I appreciate the effort to provide this context.

The strategic decision to prioritize infrastructure stability by opting for cloud services is understood and commendable, especially considering the rapid growth OSM has experienced. I can see that OSM’s choices were driven by its commitment to the project’s core values and its community.

However, one aspect still nudges at me: the choice of Amazon as the cloud service provider. With the wide selection of cloud service providers available, each with its own pricing models and philosophical underpinnings, why was Amazon, a corporation known for its controversial business practices, the chosen one? There are other providers like Backblaze and Hetzner, which, in my research, offer competitive, if not more affordable pricing, and do not have a reputation for commodifying their users to the extent Amazon does.

While the community’s overarching goals are clear, I think it’s crucial for the OSMF to consider not just the technical and financial aspects but also the ethical dimensions when partnering with third-party entities. It’s worth mentioning that Amazon’s reputation, especially concerning privacy and human rights, has reached an almost “meme” status in certain circles. Aligning with such an entity could raise eyebrows, and it’s essential to be conscious of the broader implications of such associations.

Comment from Friendly_Ghost on 19 August 2023 at 00:14

@NorthCrab https://community.openstreetmap.org/t/why-does-osmf-budget-25-000-on-amazon/102475/101

OSM moderation has taken many of my texts entirely out of context, seemingly to cast me in a negative light. I’m not willing to continue the discussion in such an environment, where there’s an overemphasis on subjective views and a lack of grounded argumentation. Moreover, …

You did indeed say you were unwilling to continue further discussion (yet here you are but that is beside my point), and in this comment, your first comment after Firefishy’s comment, you did mention a lack of grounded argumentation as one of the reasons. I expected that Firefishy’s comment had given you plenty of grounded argumentation, so your reaction here surprised me a bit.

Comment from NorthCrab on 19 August 2023 at 00:22

@Friendly_Ghost

I see your viewpoint; allow me to clarify. Here’s a full sentence quote:

I’m not willing to continue the discussion in such an environment, where there’s an overemphasis on subjective views and a lack of grounded argumentation.

This sentence highlights that numerous individuals (not exclusively Firefishy) fail to present well-founded arguments alongside their comments. Continuing a conversation within such an environment proves challenging for me. I am open to engaging with individuals who offer substantiated facts rather than solely relying on their personal beliefs.

As for thanking Firefishy, I would appreciate the opportunity to do so personally. However, I am still awaiting a response from him.

Comment from Friendly_Ghost on 19 August 2023 at 00:51

Thank you for clarifying :)

As a side-note: the OSM community is full of pedantic people with strong opinions. In situations like these it helps to keep a calm mind and wait things out. Replying while emotions are still running high is a recipe for disaster.

Comment from apm-wa on 19 August 2023 at 02:40

@NorthCrab,

You wrote,

However, one aspect still nudges at me: the choice of Amazon as the cloud service provider. With the wide selection of cloud service providers available, each with its own pricing models and philosophical underpinnings, why was Amazon, a corporation known for its controversial business practices, the chosen one? There are other providers like Backblaze and Hetzner, which, in my research, offer competitive, if not more affordable pricing, and do not have a reputation for commodifying their users to the extent Amazon does.

I believe others have answered that already in the forum, but will reiterate here. Amazon offered it for free. That’s a hard price point to beat. User Iandees explained it thus:

AWS has the concept of “Credits”, which is a dollar value balance that they can apply to your AWS account through a coupon code. It has no cash value (you can’t take a $25,000 AWS coupon to the bank and get it as $25,000 USD). When you claim an AWS credit code, any cost that your AWS account incurs is deducted from that credit balance and you don’t have to pay. It’s like a gift card that you can only spend at AWS. These credits expire one year after they are issued.

Comment from NorthCrab on 19 August 2023 at 02:56

@apm-wa

I understand how such an offering would be hard to refuse given the cost-saving implications for the OSM project. However, my concern is rooted in the timeline of events. From the data I gathered, OSM’s reliance on AWS dates back to at least 2022, whereas the free AWS credits began just ~6 months ago.

Furthermore, I’d like to express concerns about placing substantial reliance on a corporation like Amazon. Even if services are currently free, Amazon has a track record of making sudden and significant changes to its policies. To my knowledge, there’s no assurance that Amazon’s sponsorship will be perpetual, and transitioning away later could come with considerable costs and complexities. It’s free until the day it isn’t.

Comment from MxxCon on 19 August 2023 at 03:38

I am open to engaging with individuals who offer substantiated facts rather than solely relying on their personal beliefs.

But this whole topic started because of your personal beliefs that Amazon is evil and you are pushing that belief on OSM by making multiple people justify their decisions.

Considering that the overwhelming majority of people countered your stance, wouldn’t it be sensible to consider that perhaps your point of view on this situation is incorrect or at least incomplete, and in that case shouldn’t the onus be on you to research everything rather than make people produce multi-paragraph reports purely on your whim?

Comment from spatialia on 19 August 2023 at 04:03

To my knowledge, there’s no assurance that Amazon’s sponsorship will be perpetual, and transitioning away later could come with considerable costs and complexities. It’s free until the day it isn’t.

Fair enough, but Grant’s post covered this, though without pointing it out directly. Most of the data is in “deep archive” - in AWS Glacier. That’s much cheaper than Backblaze B2. I say this as someone that very much prefers B2 to AWS, but for backups that are infrequently or unlikely to be retrieved, AWS Glacier is the cost to beat. Even without parsing the words in his post, we can see this - 112 TB for $120/month for that first item is significantly cheaper than the equivalent in B2 ($560/month). So, even if the sponsorship goes away, we don’t need to transition out because OSM is still getting the best cost for storage.

That doesn’t overrule your ethical concerns, but on every level I can see, they’re currently making the smart choice for OSM’s financial resources.

Comment from NorthCrab on 19 August 2023 at 04:07

@MxxCon

I’m sorry to say, but the idea of Amazon being evil is not a personal belief; it’s a fact. Please conduct your own research, as my discussing it now is not feasible and would demand significant effort from my end. All I ask is for individuals making key commentary to provide justifications for their statements.

Here are some videos you may want to consider watching:

Comment from NorthCrab on 19 August 2023 at 04:21

@spatialia

That’s indeed a good observation. However, I did my due diligence by checking the S3 Glacier pricing at https://aws.amazon.com/s3/glacier/pricing/, and based on my findings, the cheapest rate I identified was approximately $3.5 per TB. This doesn’t align with the numbers mentioned, so any additional insight that could clarify the discrepancy would be welcome.

Comment from spatialia on 19 August 2023 at 04:26

Two things that might impact that:

  1. He mentioned at least some (much?) of the data is in deep archive, so that pricing is cheaper than the standard AWS Glacier pricing.
  2. It also depends on the storage region. US-East (Ohio) is a relatively cheap region that gets to $137/month for 112 TB: https://calculator.aws/#/estimate?id=dff1371f1b7fcf2fd0052b866692ac5edf241fb2 - whereas in Northern California, that same storage would be $260/month. I’d assume they’re storing in a very cheap region.

Comment from MxxCon on 19 August 2023 at 04:33

112 TB is 112,000 GB.

112,000 GB * $0.001 per GB = $112.

Include the cost of API calls, data transfer and other small things and you get to the stated $120.

Comment from Firefishy on 19 August 2023 at 04:35

On why we use AWS: As mentioned previously we use Rails ActiveStorage for the osm.org website. AWS S3 has the best compatibility with Active Storage. We started using AWS S3 to store avatar images back in July 2019. Prior to that we used NFS, which was tied to a single export host (single point of failure) and not reliable when running across data centres (IO + net latency issues causing timeouts). I built & tested self-hosted Ceph storage clusters, but designing & running a multi-site replicated cluster for OSM would be an extreme burden for the Ops team; we have a lot more to get on with than worrying just about storage. Much of the discussion is here. There is likely also discussion in the Ops meeting minutes where we thrashed out options a few years back.

Backblaze B2 only started offering an S3 compatible object storage API in May 2020. Hetzner is expected to offer S3 compatible object storage API in 2024. I see someone has written a dedicated Backblaze B2 gem for Active Storage support, but it doesn’t look particularly well supported.

Professionally I have 10 years of real experience building and consulting on creating secure AWS, Azure & GCP solutions. I’ve also built and consulted on large on-premise hosting solutions for US/UK financial and security services where using cloud hosting was not permitted. At my previous employer I was AWS certified. I understand the real costs involved in building different solutions.

On backups…

As of today (excluding logs) we have 230.8 TB backed up:

200.8 TB is in AWS S3 in the Glacier Deep Archive storage tier ($0.00099 per GB).

27.4 TB is in AWS S3 in the Standard-Infrequent Access storage tier ($0.0125 per GB).

Total AWS cost: $554/month ($0/month after free credits we’re using)

By comparison Backblaze B2 costs $0.005 per GB.

Total Backblaze cost: $1168/month.
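The two totals above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes 1 TB is counted as 1024 GB (an assumption, but the one under which the quoted totals come out), and it uses the sum of the two listed tiers (228.2 TB) for the Backblaze figure. The function name `monthly_cost` is made up for illustration.

```python
# Rough reproduction of the monthly storage figures quoted above.
# Assumption: 1 TB is counted as 1024 GB.
GB_PER_TB = 1024


def monthly_cost(size_tb: float, usd_per_gb_month: float) -> float:
    """Monthly storage cost in USD for a given size and per-GB rate."""
    return size_tb * GB_PER_TB * usd_per_gb_month


# Per-GB rates as quoted in the comment.
deep_archive = monthly_cost(200.8, 0.00099)  # Glacier Deep Archive tier
standard_ia = monthly_cost(27.4, 0.0125)     # Standard-Infrequent Access tier
aws_total = deep_archive + standard_ia

# Same amount of data at the quoted Backblaze B2 rate.
b2_total = monthly_cost(200.8 + 27.4, 0.005)

print(f"AWS: ${aws_total:.0f}/month")
print(f"B2:  ${b2_total:.0f}/month")
```

Storage is only part of a bill; request, retrieval and transfer charges are not modelled here.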

Yes, we could likely save money by deduplicating and moving from our TGZ backups to borgbackup / borgmatic. I had an informal discussion about this a few months back during our Ops fortnightly call. I’ve said I’d come up with a proper proposal, but it is extremely low priority and needs to be proven on a small scale before we can start to trust it. Multiple TGZ files are simple and reliable.

In summary: AWS is cheap, reliable, we have previous experience with it and it is well matched to our limited needs allowing us to focus our time/effort on the many other parts of OSM infrastructure which require attention. The ops team run things we can realistically support. Our usage of cloud services is limited, pragmatic and we are not tied/locked-in to any cloud provider.

Why didn’t we publish all the decision detail before? We did or at least tried to, it maybe isn’t ideally collated / minuted, wanna help?

Why did I discuss AWS sponsorship on Slack? Because that is where the OSM US community live and the sponsorship is primarily used for the render server for USA.

Why didn’t we publish all the metrics / cost breakdowns / other details for AWS usage? Because nobody prior to you ever asked about it, but we have now added backlog tickets to automate publishing some of it.

We are an open project; we do not intentionally hide anything.

This has been extremely draining for me. I am human.

I am on IRC (OFTC network) in #osmf-operations, or at https://en.osm.town/@osm_tech or https://twitter.com/osm_tech, if you have any specific follow-up questions.

Comment from NorthCrab on 19 August 2023 at 04:35

@spatialia

I appreciate the clarification, and you’re absolutely right. Upon revisiting, I realized I confused the standard S3 Glacier rates with the S3 Deep Archive costs. The distinction, when properly examined, does align with the numbers mentioned for deep archive storage, which is roughly $1 per TB.

However, it’d be worthwhile to have a comprehensive comparison between various cloud services, especially considering other potential costs. For example, expenses like “openstreetmap-wal - 28.7 TB - $400/month” still seem quite substantial.

Thanks for pointing it out and helping me see the bigger picture.

Comment from Firefishy on 19 August 2023 at 04:49

However, it’d be worthwhile to have a comprehensive comparison between various cloud services, especially considering other potential costs.

Do you want to produce it? I can give you all the raw export data. It would need to calculate for risks and the effort required to switch solutions.

For example, expenses like “openstreetmap-wal - 28.7 TB - $400/month” still seem quite substantial.

Snippet change from our private terraform AWS repo (summary: halved the days stored and moved to a cheaper storage tier sooner). It should reduce the billing to less than $200/month.

PS: We’ll likely move to terraform -> opentf once it is better established.

Comment from Firefishy on 19 August 2023 at 05:00

It also depends on the storage region

The wal bucket is in eu-west-2 due to latency (accessed primarily from Dublin & Amsterdam). cloudtrail bucket is in eu-west-2 (unsure why), the replicated backup buckets for rails user assets (gpx-trace, gpx-images, avatars) are in eu-north-1 (greenest and cheapest in EU). All other buckets are in eu-west-1.

Comment from Firefishy on 19 August 2023 at 05:15

AWS dates back to at least 2022, whereas the free AWS credits began just ~6 months ago.

Our AWS usage dates back to late 2017; the amount of data we store has grown over time. We received the AWS credits in August 2022. The credits expire at the end of August 2023. We have already applied for the next year’s worth of credits. If we don’t receive them we have budgeted a contingency, but will likely immediately terminate the US render server running on AWS. We have extra physical hardware being delivered to OSUOSL to increase the US render server capacity.

Comment from NorthCrab on 19 August 2023 at 06:50

@Firefishy

First of all, thank you for your valuable feedback. That’s quite a lot of information to analyze. Let me address it and provide you with my general thoughts about the current situation.

Transparency

First of all, it’s truly awesome to see OWG taking steps towards making information transparent and publicly available. It’s important that we don’t forget the core values of OSM.

Cloud comparison

I am willing to create the comparison, with one caveat. The data that I prepare would require supervision prior to its final publication. Given that this task isn’t something I do regularly, there’s a significant chance of errors. However, I think it would be more advantageous to first address the topics I’ll be presenting shortly before diving into the comparison. In my view, preparing the comparison is of lower priority, especially considering the current free availability of AWS for us.

Now that we’ve addressed that matter, I can shift my attention to expressing my thoughts and making specific recommendations.

1. Backup deduplication

I want to strongly emphasize the importance of prioritizing data deduplication. The sooner we implement this, the more streamlined our storage maintenance will be, facilitating smoother transfers, whether to other vendors or onto self-owned hardware. This proactive approach should improve the overall efficiency and management of our data storage system.

2. openstreetmap-storage-backups - 112.4 TB - $120/month

Backups including some historical. Backups are not de-duped by design (heavy admin / risk burden). Some opportunity to manually clean up, but very low priority. No automatic cleanup.

I cannot provide specific commentary on this just yet. It would be helpful to have a more detailed breakdown first. For instance, what types of backups are being discussed, and what are their sizes? Perhaps you could direct me to a resource that would allow me to deduce this information on my own.

3. openstreetmap-planet - 71.1 TB - $100/month

Historical and current copies of published planet files. Deep-Archive, for future restore to AWS hosted planet service with full back catalog. No automatic cleanups.

I have analyzed https://planet.openstreetmap.org/planet/, but I couldn’t comprehend the need for 70TB. The most recent full-history dump weighs 200GB. Let’s assume we’ll store 5 of those, which would sum up to around 1TB. Historical dumps can be downloaded via torrent as they are currently, or reconstituted from the latest versions when unavailable. If we incorporate non-full-history dumps into the calculation, we can conservatively estimate the total storage requirement to be around 2TB.

4. openstreetmap-tile-aggregated-logs - 32.1 TB - $125/month

Archival of processed tile CDN usage logs. Historical reference for Ops to work out trends and usage patterns. More data here than provided by public logs: Index of /tile_logs. @pnorman can clarify.

Shouldn’t we employ a log sampling method? If the core idea is to analyze usage patterns and detect abuse, even sampling as much as 1% would significantly reduce our storage requirements.
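As an illustration of what a 1% sample could look like, here is a minimal sketch using deterministic hash-based sampling, so that the same line is always either kept or dropped. It is not how the OSM tile logs are actually processed; the log line format and the names `keep_line` and `sample_lines` are invented for the example.

```python
import hashlib


def keep_line(line: str, rate: float = 0.01) -> bool:
    """Deterministically keep roughly `rate` of all lines, based on a hash of the line."""
    digest = hashlib.sha256(line.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a float in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate


def sample_lines(lines, rate: float = 0.01):
    """Return the subset of lines selected by keep_line."""
    return [line for line in lines if keep_line(line, rate)]


# Hypothetical log lines, only for demonstration.
lines = [
    f"2023-08-19T00:00:00Z GET /tile/{i % 19}/{i}/{i * 7 % 1000}.png"
    for i in range(100_000)
]
sampled = sample_lines(lines)
print(len(sampled), "of", len(lines), "lines kept")
```

A sample like this keeps aggregate usage patterns visible, but low-volume abuse may no longer show up in it, so whether it fits the abuse-detection use case is an open question.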

5. openstreetmap-wal - 28.7 TB - $400/month

Live streaming “Write Ahead Log” copies of the OpenStreetMap core Postgres database. The WAL files are used for syncing follower instances of the core Postgres database server. Vital asset to our data recovery plans. Can be used for recovery between full weekly database backup or corruption. For clarity this database is private and not published via planet data (eg: messages, users etc). Automatic cleanup after 1 year.

If we are already conducting weekly database backups, retaining WAL files for a duration of 1 year seems wasteful in my opinion. Automatic cleanup should take place within 1 month at most, preferably within 2 weeks. Allocating 29TB of storage solely for WAL files appears wasteful to me.

6. openstreetmap-imagery-backups - 18.2 TB - $35/month

Backups of imagery provided to OpenStreetMap. Deep archival. Primarily backups of imagery hosted on kessie. No automatic cleanups.

I believe deduplication would be a perfect match in this scenario. Once imagery is added, it rarely changes, so any incremental backups would be of negligible size. Employing complete backups every time is not a suitable approach for addressing this kind of problem.

7. openstreetmap-fastly-logs - 5.3 TB - $125/month

Inbound Fastly CDN logs for processing. Key to us finding and managing abuse; source for published tile log analysis: Index of /tile_logs. Automatic cleanup after 31 days.

The same argument as for 4 applies here.

8. openstreetmap-gps-traces - 2.8 TB - $80 to $225/month

The GPS traces that are uploaded to OpenStreetMap.org, the storage backend for website: Public GPS Traces OpenStreetMap. Formerly provided by NFS service, moved to S3 to simplify admin burden and to seamlessly work across our hosting data centres. No automatic cleanup, but opportunity to improve costs with S3 “tier” lifecycle rules.

I can observe that there are approximately 10,000,000 traces uploaded at the moment, averaging 0.3 MB per trace. When I download a trace from the website, I notice that it is uncompressed, which is consistent with the 0.3 MB estimate. Given that traces are essentially text files, applying basic compression can reduce their size by a factor of 20.
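For scale, 10,000,000 traces at 0.3 MB each is about 3 TB, which is in line with the 2.8 TB bucket size. The effect of compression can be tried out with a small sketch like the one below. The GPX-like data is synthetic and very regular, so the ratio it prints says nothing definite about real traces; the factor of 20 above remains an estimate. The helper `synthetic_gpx` is made up for this example.

```python
import gzip


def synthetic_gpx(points: int) -> bytes:
    """Build a GPX-like XML document with `points` track points (synthetic data)."""
    rows = [
        f'<trkpt lat="{52.0 + i * 1e-5:.6f}" lon="{21.0 + i * 1e-5:.6f}">'
        f"<ele>100.0</ele>"
        f"<time>2023-08-19T00:{i // 60 % 60:02d}:{i % 60:02d}Z</time></trkpt>"
        for i in range(points)
    ]
    body = "\n".join(rows)
    return f"<gpx><trk><trkseg>\n{body}\n</trkseg></trk></gpx>".encode("utf-8")


raw = synthetic_gpx(5000)
packed = gzip.compress(raw)

# Compression is lossless: the original bytes come back unchanged.
restored = gzip.decompress(packed)

print(f"{len(raw)} -> {len(packed)} bytes (ratio {len(raw) / len(packed):.1f}x)")
```

Real traces carry noisier coordinates and extra tags, so measuring the ratio on a sample of actual uploads would be needed before relying on any particular figure.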

9. openstreetmap-fastly-processed-logs - 1.9 TB - $50/month

Archival of processed tile CDN view logs. Historical reference for Ops to work out trends and usage patterns. More data than provided by public logs: Index of /tile_logs. @pnorman can clarify.

The same argument as for 4 applies here.

10. openstreetmap-user-avatars - 113.1 GB - $5/month

The user “avatar” images as uploaded by users. No automatic cleanup, but opportunity to improve costs with S3 “tier” lifecycle rules. Formerly provided by NFS service, moved to S3 to simplify admin burden and to seamlessly work across our hosting data centres.

I don’t have any suggestions; everything appears to be fine.

11. openstreetmap-aws-cloudtrail - 76.0 GB - $2/month

Storage backend for AWS Cloudtrail API access logging service. Security monitoring. No automatic cleanup.

I don’t have any suggestions; everything appears to be fine.

12. openstreetmap-gps-images - 62.7 GB - $10/month

The processed display images used by OpenStreetMap.org on Public GPS Traces OpenStreetMap. Formerly provided by NFS service, moved to S3 to simplify admin burden and to seamlessly work across our hosting data centres.

This also appears to be acceptable, but the cost seems somewhat excessive. I’m not sure about the reason for this.

13. openstreetmap-backups - 21.1 GB - $0.03/month

Historical database backups from OSM in first few years. No automatic cleanup.

This is not worth discussing. I can only suggest deleting this data completely, as it appears to be entirely redundant and only complicates maintenance.

Summary

Taking into account the suggestions outlined in points 3, 4, 5, 7, 8, and 9, it becomes apparent that these proposed enhancements are relatively straightforward to integrate and would significantly enhance the efficiency of OSM operations. While I wholeheartedly endorse the concept of data deduplication, I will exclude it from my summary calculation for the sake of simplicity.

Based on my calculations, the current monthly expenditure of $100 + $125 + $400 + $125 + $225 + $50 equals $1025. However, by implementing the suggested changes, the projected monthly cost would reduce substantially to $3 + $1.25 + $16 + $1.25 + $10 + $0.5, amounting to $32. This would signify an impressive ~30x reduction in costs. I assume a linear scaling of costs here, which is not entirely accurate, but it’s the best I have been able to come up with given the limited information.
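The sums in the paragraph above can be checked with a short sketch. The figures are taken as quoted (using the upper $225 bound for GPS traces), and the dictionary keys are just shortened bucket names.

```python
# Monthly costs in USD, as quoted for points 3, 4, 5, 7, 8 and 9.
current = {
    "planet": 100,
    "tile-aggregated-logs": 125,
    "wal": 400,
    "fastly-logs": 125,
    "gps-traces": 225,  # upper bound of the quoted $80 to $225 range
    "fastly-processed-logs": 50,
}

# Projected monthly costs after the suggested changes, as quoted.
projected = {
    "planet": 3,
    "tile-aggregated-logs": 1.25,
    "wal": 16,
    "fastly-logs": 1.25,
    "gps-traces": 10,
    "fastly-processed-logs": 0.5,
}

current_total = sum(current.values())
projected_total = sum(projected.values())
reduction = current_total / projected_total

print(f"current:   ${current_total}/month")
print(f"projected: ${projected_total}/month")
print(f"reduction: ~{reduction:.0f}x")
```

The ratio works out to roughly 32x, so the "~30x" in the text is a fair rounding, subject to the linear-scaling assumption already noted.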

I have observed an emerging trend in published information where, rather than optimizing things, we are increasingly opting to pour more and more money into them (theoretically; it’s free now but I do not see it as an excuse). While I comprehend that volunteer work is often limited, I must highlight that the suggestions I have outlined are neither difficult nor time-consuming to implement.

I trust that you will view this feedback as constructive. Kindly take a moment to reflect upon it, and respond at your convenience.

-K

Comment from CactiStaccingCrane on 19 August 2023 at 18:06

Long live the king
