Cloud Computing Management Mistakes that Cost Millions
Tech Evangelist
Now more than ever, cloud costs have become a top priority for businesses across the globe. Unfortunately for DevOps and cloud engineers, this reprioritization comes with greater scrutiny of cloud bills and increased accountability for expenses incurred in the cloud.
It’s not easy for them to balance their core responsibilities and the many adjustments and predictions needed to keep cloud costs low. Furthermore, there are many mistakes that are easy to make which ramp up prices unnecessarily.
To make the lives of both business leaders and engineers easier, we’ve put together a list of the top cloud cost culprits that cumulatively can cost companies millions and tips for overcoming them.
Mistake #1: Using a Costly Region
Not all regions are created equal. In fact, there can be up to a 70% price difference between various regions on AWS. For example, Frankfurt is 15% more expensive than Ireland. This is because AWS benefits from economies of scale, meaning that the larger the region (and therefore, the more customers that use it), the less expensive it is. So if it’s relevant for your customers and your particular use case, consider using a larger region to take advantage of these savings.
Mistake #2: Not Considering the Price of Data Transfers Between Regions
It’s estimated that Apple spent a whopping $50 million on transfer fees in 2017, while many other large corporations spent close to $10 million. As high as these numbers are, it’s safe to assume they have climbed even higher since.
So when assessing which regions you’re using, it’s important to consider the transfer fees between regions in addition to the overall cost of each region. Choosing regions that are located far from one another could mean outrageous transfer fees, whereas some regions located in close proximity to one another, like Northern Virginia and Ohio, have discounted transfer fees between them.
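To see how quickly these fees add up, here is a rough back-of-the-envelope sketch. The per-GB rates and the monthly volume below are placeholder assumptions for illustration, not quoted AWS prices, so substitute the published rates for your own region pair.

```python
def monthly_transfer_cost(gb_per_month: float, rate_per_gb: float) -> float:
    """Return the monthly inter-region transfer fee in dollars."""
    if gb_per_month < 0 or rate_per_gb < 0:
        raise ValueError("volume and rate must be non-negative")
    return gb_per_month * rate_per_gb


# Hypothetical rates in dollars per GB (assumptions, not real price-list values).
ILLUSTRATIVE_RATES = {
    "nearby_pair": 0.01,
    "distant_pair": 0.02,
}

if __name__ == "__main__":
    volume_gb = 500_000  # assumed: roughly 500 TB moved between regions each month
    for pair, rate in ILLUSTRATIVE_RATES.items():
        yearly = 12 * monthly_transfer_cost(volume_gb, rate)
        print(f"{pair}: ${yearly:,.0f} per year")
```

Even with made-up numbers, the point holds: at a fixed volume, the annual bill scales directly with the per-GB rate of the region pair you choose.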
Mistake #3: Keeping Wasted, Unused Resources
It’s estimated that cloud waste cost companies $147 billion in 2022. Despite the expense, companies are finding it extremely difficult to get waste under control. In fact, 19% of FinOps Foundation respondents name eliminating cloud waste as one of their top FinOps challenges.
There are many different types of cloud waste, each of which presents its own challenges. Here are some of the most common:
Idle Elastic Load Balancers (ELB)
According to Zesty’s research on utilized resources, the average organization throws away nearly $11,000 annually on roughly 180 ELBs sitting unused in its environment.
If the sum of load balancer requests over the past week is less than 100, it’s safe to assume that your ELB is no longer needed. One way to find wasted ELBs is through the Idle Load Balancers check in AWS Trusted Advisor, which provides a report of unused ELBs. Once these idle ELBs are identified, you can delete them by following the AWS documentation. Alternatively, you can consolidate load balancers to maximize their efficiency.
Another option for ridding yourself of unused ELBs is to identify them using Zesty’s Resource Cleaner and remove them manually in the AWS Console.
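The rule of thumb above can be sketched in a few lines. This is a minimal illustration, assuming you have already collected daily request counts per load balancer (for example from the CloudWatch `RequestCount` metric). The function name and input shape are hypothetical.

```python
from typing import Dict, List

IDLE_REQUEST_THRESHOLD = 100  # weekly request total below which an ELB is treated as idle


def find_idle_load_balancers(
    daily_requests: Dict[str, List[int]],
    threshold: int = IDLE_REQUEST_THRESHOLD,
) -> List[str]:
    """Return names of load balancers whose last seven daily counts sum below the threshold."""
    idle = []
    for name, counts in daily_requests.items():
        last_week = counts[-7:]  # keep only the most recent seven days
        if sum(last_week) < threshold:
            idle.append(name)
    return sorted(idle)


if __name__ == "__main__":
    sample = {
        "legacy-elb": [0, 3, 0, 1, 0, 0, 2],           # hypothetical names and counts
        "checkout-elb": [900, 1200, 1100, 950, 1000, 400, 380],
    }
    print(find_idle_load_balancers(sample))
```

Treat the result as a review list rather than a delete list, since a low-traffic load balancer may still front a seasonal or standby service.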
Wasted Elastic IPs (EIP)
Zesty’s report also shows that the average company keeps about 440 unused EIPs, which wastes $1,500 per year. While these costs may not seem like a lot, when added to other wasteful practices, they become significant.
Having some form of automation in place to remove unused EIPs is low-hanging fruit that can enable businesses to put a little extra cash toward something more valuable. Zesty’s Resource Cleaner can be used to identify wasted EIPs so you can easily delete them from your environment.
Zesty’s Resource Cleaner can easily identify unused resources. You can also remove them manually using the AWS Console.
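As a sketch of what such automation might look like, the function below picks out Elastic IPs that are not attached to anything. It assumes the input has the shape returned by boto3’s EC2 `describe_addresses` call, where an address with no `AssociationId` key is unassociated. The helper name is hypothetical, and the sample data is invented.

```python
from typing import Any, Dict, List


def unassociated_allocation_ids(response: Dict[str, Any]) -> List[str]:
    """Return allocation IDs of Elastic IPs that have no association."""
    return [
        address["AllocationId"]
        for address in response.get("Addresses", [])
        if "AssociationId" not in address and "AllocationId" in address
    ]


if __name__ == "__main__":
    # Invented sample in the describe_addresses response shape.
    sample_response = {
        "Addresses": [
            {"PublicIp": "203.0.113.10", "AllocationId": "eipalloc-aaa", "AssociationId": "eipassoc-111"},
            {"PublicIp": "203.0.113.11", "AllocationId": "eipalloc-bbb"},
        ]
    }
    print(unassociated_allocation_ids(sample_response))
    # With boto3 installed and credentials configured, the response could instead come from
    # boto3.client("ec2").describe_addresses(), and each reviewed ID could be passed to
    # release_address(AllocationId=...).
```

Releasing an address is irreversible in the sense that you may not get the same IP back, so a review step before release is a sensible default.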
Mistake #4: Overprovisioning Cloud Resources
As opposed to wasted or unused resources, which lie completely idle in cloud environments, overprovisioned resources are the result of organizations allocating more instances, storage, or capacity than necessary to run their workloads.
These overprovisioned resources are often used as a buffer, but unfortunately, they can also ramp up costs dramatically. Here are some of the most common overprovisioned cloud resources and the costs they incur.
Overprovisioned Elastic Block Store (EBS)
Zesty’s study found the average user spends around 2-5x extra on EBS disk space, which accumulates to $500,000 wasted per year.
Much of this can be attributed to overprovisioning which is the inevitable result of engineers trying to ensure performance for their applications. If too few MB or GB are provisioned, the application can fail. So they compensate by provisioning enough resources to cover peaks in demand or unexpected fluctuations.
To solve this problem, Zesty Disk can keep stability and performance optimized while saving up to 70% on block storage by automatically adding filesystem storage when demand rises and removing it when demand drops off. This ensures that your application is always running optimally and cost-efficiently.
Overprovisioned EC2 Instances
According to Datadog research, 45% of containers use less than 30% of requested memory and 49% use less than 30% of requested CPU. Containers are just one of many situations in which you may find your teams provisioning significantly more nodes and pods than necessary, wasting a large share of the budget as a result.
To combat this issue, it is crucial to right-size instances, ensuring that your teams have not allocated more CPU or RAM than required. AWS Cost Explorer provides recommendations for downsizing, and Operations Conductor can be used to right-size instances or change instance types.
Additionally, Auto Scaling groups can shrink your EC2 fleet in line with demand. You can also use AWS Compute Optimizer to get rightsizing recommendations for instances, even without 30 consecutive days of utilization data.
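A first pass at right-sizing can be as simple as comparing what each workload uses with what it requested. The sketch below borrows the 30% figure from the Datadog statistic as a cutoff. The `Workload` type, its field names, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Workload:
    name: str
    requested_cpu: float      # cores requested
    used_cpu: float           # cores actually used (e.g. a peak or high percentile)
    requested_mem_gib: float
    used_mem_gib: float


def utilization(used: float, requested: float) -> float:
    """Fraction of the requested amount that is actually used."""
    if requested <= 0:
        raise ValueError("requested must be positive")
    return used / requested


def rightsizing_candidates(
    workloads: List[Workload], threshold: float = 0.30
) -> Dict[str, List[str]]:
    """Map each workload name to the resources it uses below the threshold."""
    candidates: Dict[str, List[str]] = {}
    for w in workloads:
        low = []
        if utilization(w.used_cpu, w.requested_cpu) < threshold:
            low.append("cpu")
        if utilization(w.used_mem_gib, w.requested_mem_gib) < threshold:
            low.append("memory")
        if low:
            candidates[w.name] = low
    return candidates


if __name__ == "__main__":
    fleet = [
        Workload("api", 4, 0.8, 8, 6),
        Workload("batch", 2, 1.5, 4, 3),
    ]
    print(rightsizing_candidates(fleet))
```

Which usage figure you feed in matters: averages hide peaks, so a high percentile over a representative period is usually a safer basis for shrinking a request.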
Mistake #5: Using Lift and Shift for Cloud Migration
IDC estimates that “lift and shift” cloud migrations cause organizations to keep many of their old configurations and setups, but at a 30% higher cost. Instead, they suggest companies use an application-based approach for cloud migration which evaluates whether specific apps should be phased out, kept as is, replatformed, refactored, or built from scratch. This way, you’re setting up your cloud to be financially efficient from the get-go rather than allowing it to become an afterthought in the future.
Mistake #6: Not Implementing FinOps from the Beginning
According to McKinsey, having a proper FinOps strategy in place from day one can create a 15-25% economic benefit for cloud users over time.
A FinOps strategy enables cloud users to better plan, create KPIs, and identify the right stakeholders to ensure the cloud is being approached in a manner that is financially efficient for specific business needs.
While this data may make implementing FinOps seem like a no-brainer, some FinOps practices are easier for companies to adopt than others. For example, according to FinOps Foundation research, managing cost anomalies is currently in the walk and run stages for 67% of organizations, while more than 65% of organizations are at the same stages for cloud governance (the practice of allocating responsibility for cloud costs). On the other hand, some areas that are still at the crawl stage include resource utilization and efficiency at almost 68%, managing discount commitments at 72%, and automation and workload management at over 77%.
Mistake #7: Tracking Costs Incorrectly
McKinsey reports that businesses that focus on CapEx (capital expenditures) rather than OpEx (operational expenditures) generally face a 20% discrepancy between their forecasted cloud spend and actual cloud expenses. Therefore, companies that want to more accurately forecast and track cloud costs should focus on expenses such as SaaS, IaaS, PaaS, and DaaS to keep their budgets in check.
McKinsey further explains that forecasting based on historical data alone is another big mistake as it provides an incomplete picture of potential cloud costs. Instead, companies should link forecasting to specific business goals (such as compute cost per customer) and establish unit economics for their major applications. In order to properly implement this strategy, businesses will need to create a shift in mindset toward a consumption model to help engineering teams understand the business implications of their cloud spend and how the costs they incur in the cloud impact unit economics.
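The unit-economics idea can be made concrete with the article’s own example metric, compute cost per customer. The sketch below derives that unit cost and then forecasts spend from a business projection rather than from last year’s bill. All figures are invented, and the function names are hypothetical.

```python
def cost_per_customer(monthly_compute_cost: float, active_customers: int) -> float:
    """Unit cost: compute dollars spent per active customer in a month."""
    if active_customers <= 0:
        raise ValueError("active_customers must be positive")
    return monthly_compute_cost / active_customers


def forecast_compute_cost(unit_cost: float, projected_customers: int) -> float:
    """Forecast monthly compute spend from a customer-growth target."""
    return unit_cost * projected_customers


if __name__ == "__main__":
    unit = cost_per_customer(120_000, 4_000)     # assumed current spend and customer count
    forecast = forecast_compute_cost(unit, 5_000)  # assumed growth target
    print(f"unit cost: ${unit:.2f}, forecast: ${forecast:,.0f}")
```

Tracking the unit cost over time also tells engineering teams whether efficiency is improving, which a raw total cannot show when the customer base is growing.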
Mistake #8: Failing to Monitor Kubernetes Costs
According to the CNCF report, 57% of organizations are spending between $50K and $1 million a month running their containers on Kubernetes, with 10% spending over $1 million a month. Over a year, that top tier amounts to more than $12 million spent on K8s workloads. In addition, 35% of organizations report that their costs have increased by 20% over the past year.
A lack of insight into what is driving Kubernetes costs is one of the main reasons they keep rising. Currently, almost 70% of organizations rely on estimates or do not monitor Kubernetes costs at all, so teams have no visibility into the specific actions or reasons behind rising expenses.
Monitoring tools such as Datadog, ELK, or others can help businesses get visibility into these costs so they can make better financial decisions regarding Kubernetes spend.
Mistake #9: Not Using Cloud Discount Offerings
While 66% of advanced FinOps practitioners are able to successfully leverage discount programs, those with less mature FinOps practices still struggle to take advantage of discount offerings without sacrificing flexibility.
AWS in particular offers a few different discount options. AWS’s Spot Instances offer up to a 90% discount, Reserved Instances (RIs) can save you up to 75%, and Savings Plans (SPs) can offer up to a 72% discount.
Yet, according to the FinOps Foundation report, between 0% and 10% of commitments (Reserved Instances and Savings Plans) go unused.
Why?
All of these discounts come with their drawbacks. Spot Instances are best suited to interruptible workloads such as test and development, because AWS can reclaim them with a two-minute warning, while RIs and SPs require you to commit to a level of usage for a one- or three-year term. Because of this upfront commitment, engineers are often reluctant to put a significant portion of their instances on RIs and SPs due to unpredictable workload requirements.
Automation tools that allocate Savings Plans, RIs, or CUDs (Google Cloud’s committed use discounts) can help achieve greater savings. However, according to the FinOps Foundation report, 83% of respondents have yet to implement such a solution, partially because many of these solutions are relatively new on the market.
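The reluctance is easier to reason about with a simple break-even calculation. The sketch below uses a deliberately simplified model: the commitment is paid in full whether or not it is used, and the alternative is paying the on-demand rate only for the capacity actually consumed. Under those assumptions, a commitment with discount d pays off once utilization exceeds 1 - d. The function names and dollar figures are hypothetical.

```python
def break_even_utilization(discount: float) -> float:
    """Minimum fraction of committed capacity that must be used for the commitment to pay off."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1.0 - discount


def commitment_savings(on_demand_cost: float, discount: float, utilization: float) -> float:
    """Savings (positive) or loss (negative) versus paying on-demand for actual usage only."""
    committed_cost = on_demand_cost * (1.0 - discount)  # paid regardless of usage
    pay_as_you_go = on_demand_cost * utilization        # paid only for what is used
    return pay_as_you_go - committed_cost


if __name__ == "__main__":
    discount = 0.72  # the maximum Savings Plans discount cited above
    print(f"break-even utilization: {break_even_utilization(discount):.0%}")
    for used in (0.2, 0.5, 0.9):
        delta = commitment_savings(1_000, discount, used)
        print(f"utilization {used:.0%}: {delta:+.0f} dollars per $1,000 of on-demand capacity")
```

The deeper the discount, the lower the break-even point, which is why even a partly used commitment can still come out ahead of on-demand pricing.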
Final Thoughts
In today’s economy, it is critical to reduce costs wherever possible, and cloud costs can be an easy win if you know where to look. By being strategic in combating these cloud cost culprits and designing your cloud for long-term financial success, you could be on your way to driving profitability for years to come.
Doing so certainly requires a cultural shift as well as efforts from both the business and R&D sides of the organization, but it is well worth the investment.
Make combatting these cloud management mistakes easier than ever by automating savings with Zesty. Talk to one of our cloud experts to learn more.