AWS Auto Scaling: A game-changer for EC2 management

Introduction

You know that frustrating feeling when you hit the “Insufficient Instance Capacity” error while working with AWS-hosted applications?

 

Other times, you see a large portion of the bill coming from the EC2 fleet you assigned to the application, despite the fact that many of these instances sit idle and unused.

 

Both cases are frustrating because the application wasn’t designed well enough for its EC2 instances to cope with increased load or to be used efficiently.

 

For businesses running mission-critical software, a capacity-related error is just not an option. It costs money, time, and if you are the person responsible, it may even cost you your job!

 

AWS Auto Scaling is Amazon’s answer to this problem, making it much easier for the infrastructure to scale so it can match the fluctuating demands of your application over time.

 

In this article, we introduce you to AWS Auto Scaling, the benefits it can bring, and the pitfalls you need to avoid when implementing it.

What is AWS Auto Scaling?

AWS Auto Scaling is a service that monitors the load on your EC2 instances to ensure that, at any time, there is an appropriate number of nodes available to match the performance demands of the application running on them. It automatically launches new instances when the load increases, and terminates existing instances when the load decreases and they are no longer needed.

 

To get started with AWS Auto Scaling, there are three things you need to configure: 

 

  • Auto Scaling Groups
  • Launch Templates
  • Auto Scaling Strategies

 

Auto Scaling groups

An Auto Scaling group is a collection of EC2 instances that are managed together as one logical unit. Each group has a minimum, desired, and maximum number of EC2 instances, as shown in the diagram below.

[Figure: AWS Auto Scaling group]
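
To make the relationship between the three sizes concrete, here is a minimal Python sketch that assembles the settings for a group and checks that minimum ≤ desired ≤ maximum. The parameter names mirror boto3's `create_auto_scaling_group` call, but the group name, template name, and subnet IDs are made-up placeholders, and the sketch itself makes no AWS call.

```python
def build_group_params(name, template_name, min_size, desired, max_size, subnets):
    """Return keyword arguments describing an Auto Scaling group."""
    # The desired capacity must sit between the minimum and the maximum.
    if not 0 <= min_size <= desired <= max_size:
        raise ValueError("sizes must satisfy 0 <= min <= desired <= max")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateName": template_name, "Version": "$Latest"},
        "MinSize": min_size,
        "DesiredCapacity": desired,
        "MaxSize": max_size,
        # Subnets are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnets),
    }


# Hypothetical names and IDs, for illustration only.
params = build_group_params(
    "web-asg", "web-template", 2, 3, 6, ["subnet-aaa111", "subnet-bbb222"]
)
print(params)
```

With boto3 installed and credentials configured, the resulting dictionary could be passed along as `boto3.client("autoscaling").create_auto_scaling_group(**params)`.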

Launch templates

A launch template defines the configuration the EC2 instances will have when deployed. It includes items such as:

 

  • The Amazon Machine Image (AMI) 
  • Instance type
  • Key pair
  • Network settings
  • Security groups
  • Storage
  • Resource tags
  • Network interfaces
  • User data

 

The images below show these configurations in the AWS console:

[Figure: AWS Auto Scaling launch template name, version description, and template tags options]
[Figure: AWS Auto Scaling launch template configurations]
[Figure: AWS Auto Scaling launch template configuration]
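
For readers who prefer code to console screenshots, the following Python sketch shows how several of the items listed above might be expressed as launch template data. The keys follow the `LaunchTemplateData` structure accepted by boto3's EC2 `create_launch_template` call; the AMI ID, key pair name, security group ID, tags, and user data script are all hypothetical values, and nothing is sent to AWS.

```python
import base64


def build_launch_template_data(ami_id, instance_type, key_name,
                               security_group_ids, user_data_script, tags):
    """Return a dict shaped like EC2 LaunchTemplateData."""
    # User data in a launch template must be base64-encoded.
    encoded = base64.b64encode(user_data_script.encode("utf-8")).decode("ascii")
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": key_name,
        "SecurityGroupIds": list(security_group_ids),
        "UserData": encoded,
        "TagSpecifications": [
            {
                "ResourceType": "instance",
                "Tags": [{"Key": k, "Value": v} for k, v in tags.items()],
            }
        ],
    }


# Every value below is a placeholder chosen for illustration.
template_data = build_launch_template_data(
    ami_id="ami-0123456789abcdef0",
    instance_type="t3.micro",
    key_name="example-key",
    security_group_ids=["sg-0123456789abcdef0"],
    user_data_script="#!/bin/bash\necho hello\n",
    tags={"app": "web"},
)
print(template_data["InstanceType"])
```

Storage and network interface settings are left out to keep the sketch short; they would be additional keys in the same dictionary.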

Auto Scaling strategies

 

This is the area where you should focus most of your attention when setting up AWS Auto Scaling. Having the most appropriate scaling option for your application is critical to achieving any of the benefits.

 

AWS offers several ways to scale your EC2 instances:

 

Maintain current instance levels 

 

In this case, AWS Auto Scaling will monitor your environment and if it finds an unhealthy instance, it will terminate that instance and launch a new one from the template.
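
The idea can be pictured with a small Python simulation. This is only a toy model of the behaviour described above, not how AWS implements it: the instance records, IDs, and the `launch_from_template` helper are invented for the example.

```python
from itertools import count

_new_ids = count(1)


def launch_from_template():
    """Stand-in for launching a fresh instance from the launch template."""
    return {"id": f"i-new{next(_new_ids)}", "healthy": True}


def replace_unhealthy(instances, desired):
    """Drop unhealthy instances, then launch replacements up to the desired count."""
    healthy = [inst for inst in instances if inst["healthy"]]
    while len(healthy) < desired:
        healthy.append(launch_from_template())
    return healthy


fleet = [
    {"id": "i-a", "healthy": True},
    {"id": "i-b", "healthy": False},  # this one failed its health check
    {"id": "i-c", "healthy": True},
]
fleet = replace_unhealthy(fleet, desired=3)
print([inst["id"] for inst in fleet])
```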

 

Scale based on a schedule

 

Scheduled scaling increases or decreases the number of instances automatically on specific dates or times. This is useful for applications with a predictable usage pattern, such as those used heavily Monday to Friday but not on weekends.
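
A weekday-heavy pattern like this could be sketched as two scheduled actions: one that scales out on Monday morning and one that scales in on Friday evening. The dictionaries below use the parameter names of boto3's `put_scheduled_update_group_action` call, with the schedule given as a cron expression in the `Recurrence` field (evaluated in UTC unless a time zone is set). The group name, action names, times, and sizes are assumptions made for the example.

```python
def scheduled_action(group, action_name, cron, min_size, desired, max_size):
    """Return keyword arguments describing one recurring scheduled action."""
    return {
        "AutoScalingGroupName": group,
        "ScheduledActionName": action_name,
        "Recurrence": cron,  # minute hour day-of-month month day-of-week
        "MinSize": min_size,
        "DesiredCapacity": desired,
        "MaxSize": max_size,
    }


# Scale out at 06:00 every Monday (day-of-week 1).
monday_scale_out = scheduled_action("web-asg", "monday-scale-out", "0 6 * * 1", 4, 4, 8)

# Scale in at 20:00 every Friday (day-of-week 5).
friday_scale_in = scheduled_action("web-asg", "friday-scale-in", "0 20 * * 5", 1, 1, 8)

for action in (monday_scale_out, friday_scale_in):
    print(action["ScheduledActionName"], action["Recurrence"])
```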

 

Scale according to demand

 

This is one of the more advanced Auto Scaling options. It involves defining certain metrics to monitor, such as CPU utilization or Disk Reads and Writes, and specifying their threshold values. AWS Auto Scaling will automatically launch or terminate instances based on the threshold you have defined. For this option to work best, you should have a good understanding of how your application runs and how it responds when it reaches the defined thresholds.
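
To illustrate the threshold idea, here is a simplified Python sketch of a scaling decision driven by average CPU utilization. It is a toy model rather than AWS's actual algorithm: the 70% and 30% thresholds, the one-instance step, and the sample readings are all assumptions chosen for the example.

```python
def next_desired_capacity(current, avg_cpu, min_size, max_size,
                          scale_out_at=70.0, scale_in_at=30.0):
    """Add one instance above the high threshold, remove one below the low one."""
    if avg_cpu > scale_out_at:
        target = current + 1
    elif avg_cpu < scale_in_at:
        target = current - 1
    else:
        target = current
    # Never go outside the group's minimum and maximum sizes.
    return max(min_size, min(max_size, target))


capacity = 2
history = []
for cpu in [45, 82, 91, 60, 22, 18]:  # hypothetical average CPU readings (%)
    capacity = next_desired_capacity(capacity, cpu, min_size=2, max_size=4)
    history.append(capacity)

print(history)
```

The gap between the two thresholds matters: if they sit too close together, the group can bounce between launching and terminating instances.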

 

Predictive scaling

 

This is another advanced Auto Scaling option. It applies machine learning to data from your actual EC2 usage, combined with AWS’s own observations, to predict expected traffic. It is most appropriate for applications that experience cyclic traffic patterns. For example, if your traffic reliably peaks at the same times each day or week, AWS Auto Scaling can add capacity ahead of the peak.

[Figure: Auto Scaling strategies]

The Benefits of AWS Auto Scaling

In the old days, coping with increased demand for your application meant purchasing new hardware, racking it in the datacenter, adding it to the network, installing the OS and application runtimes, and configuring it all. Cloud computing services make this a much easier process, and AWS Auto Scaling takes it to a whole new level.

 

Better fault tolerance

 

AWS Auto Scaling continually checks the health of your EC2 instances. If any instance is unhealthy or unavailable, Auto Scaling will automatically detect it and replace it. If configured correctly, this should be a seamless process your end users will barely notice.

 

Better availability

 

AWS Auto Scaling can help ensure your application has enough infrastructure capacity at all times to meet the traffic demand. For example, if your website sees a sudden burst of traffic and you have configured a maximum of 2 additional EC2 instances, AWS Auto Scaling will automatically deploy the additional nodes, helping you to avoid any downtime.

 

Cost reduction

 

EC2 instances are billed based on how long they run. By implementing AWS Auto Scaling, you can reduce your monthly bill, because it automatically removes extra instances from an Auto Scaling group when they are no longer required.
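
A quick back-of-the-envelope calculation shows where the saving comes from. The Python sketch below compares a fixed fleet with one that scales in outside peak hours. The hourly price, fleet sizes, and the eight-hour peak window are invented numbers, not real AWS pricing.

```python
HOURLY_PRICE_CENTS = 10  # hypothetical price: $0.10 per instance-hour
DAYS = 30


def monthly_cost_cents(instance_hours_per_day):
    """Cost of running the given number of instance-hours every day for a month."""
    return instance_hours_per_day * DAYS * HOURLY_PRICE_CENTS


# Fixed fleet: 6 instances running around the clock.
fixed_hours_per_day = 6 * 24

# Scaled fleet: 6 instances for an 8-hour peak, 2 instances for the other 16 hours.
scaled_hours_per_day = 6 * 8 + 2 * 16

fixed = monthly_cost_cents(fixed_hours_per_day)
scaled = monthly_cost_cents(scaled_hours_per_day)

print(f"fixed fleet:  ${fixed / 100:.2f}")
print(f"scaled fleet: ${scaled / 100:.2f}")
print(f"saving:       ${(fixed - scaled) / 100:.2f}")
```

Under these assumptions the scaled fleet uses 80 instance-hours a day instead of 144, so the bill drops from $432 to $240 a month.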

Pitfalls to avoid

It’s rare that any technology will work perfectly out of the box, and AWS Auto Scaling is no different. Here are a few common pitfalls to avoid.

 

Monitoring too infrequently

 

By default, EC2 tracks metrics every five minutes. Leaving this at the default value can trigger unnecessary scaling operations. For example, your application may spike to 90% CPU for thirty seconds, then return to normal. Because AWS Auto Scaling won’t see fresh data for another five minutes, it may spin up a new instance for a spike that has already passed.

 

To avoid this, we recommend you enable detailed monitoring, which tracks metrics every minute. Bear in mind that detailed monitoring incurs extra charges.

 

Overlooking the time required to spin up a new instance

 

For most AWS Auto Scaling groups, it takes around five minutes to spin up a new instance. If your application experiences a traffic spike for less than five minutes, it may not be a good idea to scale down too much when demand is reduced. 

 

For example, if your website sees spikes in traffic for two minutes and requires an extra instance to address that, it’s more logical to operate with the extra instance running than trying to spin up a new node at every spike as there won’t be enough time to deploy it. 
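
The arithmetic behind this can be sketched in a few lines of Python. The function below estimates how much of a spike a newly launched instance can actually serve, using the rough five-minute launch time mentioned above as its default. It is a deliberately crude model: it assumes the launch starts the moment the spike begins, which is optimistic.

```python
def useful_seconds(spike_seconds, launch_seconds=300):
    """Seconds of the spike that a freshly launched instance is able to serve."""
    return max(0, spike_seconds - launch_seconds)


two_minute_spike = useful_seconds(120)     # instance arrives after the spike ends
fifteen_minute_spike = useful_seconds(900)  # instance serves the last ten minutes

print(two_minute_spike, fifteen_minute_spike)
```

For the two-minute spike the result is zero, which is why keeping the extra instance running can be the more sensible choice.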

 

In a perfect world, an increase in demand is slow and can be predicted. But sometimes big jumps happen over a few seconds and AWS Auto Scaling might not always keep up.

 

Complex base images

 

Another pitfall to avoid is overcomplicating the launch of the new instance with complex configurations.

 

Configuration management tools such as Puppet or Ansible are often used to automatically configure applications on an instance once it comes up. These changes are separate from the configuration baked into the AMI and from the user data script that runs when the instance starts. The advantage is consistent configuration across all instances of your application. However, applying such changes takes time (updating an Nginx config file, for example). In an Auto Scaling event, a lengthy load time is not ideal. The more complex the configuration changes are, the longer the instance takes to become ready, and the more likely it is that the extra load on the EC2 fleet will go unaddressed.

 

Avoid this by keeping the configuration as simple as possible, and include as much as you can in the AMI.

Final words

AWS Auto Scaling not only gives your EC2 fleet the resources it needs when they are needed most, but is also free to use (although the extra EC2 instances will incur costs). Being able to launch new instances when demand rises, and remove them when they are no longer needed, gives you the cloud-scale elasticity required by today’s distributed applications.

 

However, remember that AWS Auto Scaling isn’t a “one size fits all” solution and requires an element of trial and error to reap the benefits.

 

Monitoring your environment, creating baselines, and testing new Auto Scaling configurations will help you achieve better fault tolerance and better availability.

 

There’s no doubt that Auto Scaling has made it easier to scale on EC2. But it is by no means the only way to optimize your EC2 management. 

 

Want to take your EC2 scaling to the next level? Contact one of our cloud optimization experts to learn how.
