5 Worst Practices to Avoid with EBS Storage

By Alexey Baikov
CTO and Co-founder

1. Over-provisioning volumes

Not surprisingly, the most common EBS bad practice is over-provisioning.

Over-provisioning happens when large-capacity, high-performance EBS volumes are attached to EC2 instances but hold little or no data, or deliver far more performance than the workload requires. It is usually easy to avoid, but it happens when volumes are created without a clear plan for capacity and performance requirements.

Another example of over-provisioning is attaching high-performance EBS volumes to EC2 instances with low network bandwidth. EBS volumes are accessed over a network, so the performance of an EBS volume depends heavily on the network bandwidth and throughput of the EC2 instance it’s attached to. Attaching a high-performance EBS volume to an EC2 instance with low network throughput will result in low volume utilization. Also, using high-performance EBS volumes with non-EBS-optimized EC2 instances will result in higher latency.
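To make the bottleneck concrete, here is a minimal sketch of the arithmetic. The function names and the throughput figures are illustrative assumptions, not values from AWS documentation; the only point is that the slower of the two limits determines what the workload actually gets.

```python
def effective_throughput(volume_mb_s: float, instance_ebs_mb_s: float) -> float:
    """Return the throughput the workload can actually reach, in MB/s.

    The volume cannot deliver more than the instance's EBS bandwidth allows,
    so the smaller of the two limits wins.
    """
    return min(volume_mb_s, instance_ebs_mb_s)


def volume_utilization(volume_mb_s: float, instance_ebs_mb_s: float) -> float:
    """Return the fraction (0.0 to 1.0) of the provisioned throughput that is usable."""
    return effective_throughput(volume_mb_s, instance_ebs_mb_s) / volume_mb_s


# Hypothetical numbers: a volume provisioned for 1000 MB/s on an instance
# whose EBS bandwidth tops out at 250 MB/s.
usable = effective_throughput(1000, 250)
share = volume_utilization(1000, 250)
print(f"usable: {usable} MB/s, utilization: {share:.0%}")
```

In this hypothetical case, three quarters of the provisioned throughput is paid for but can never be used.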

Over-provisioning results in unnecessary cost and can be diagnosed by checking EBS volume metrics. Refer to our EBS Master Playbook for more information.
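As a rough illustration of what checking the metrics can look like, the sketch below turns per-period operation counts (such as the `Sum` of the CloudWatch `VolumeReadOps` and `VolumeWriteOps` metrics) into IOPS and compares the peak with the provisioned figure. The function names and the 30% threshold are assumptions chosen for the example, not a recommended rule.

```python
def iops_per_period(read_ops, write_ops, period_seconds):
    """Convert per-period operation counts into average IOPS for each period."""
    return [(r + w) / period_seconds for r, w in zip(read_ops, write_ops)]


def looks_over_provisioned(provisioned_iops, observed_iops, threshold=0.3):
    """Flag a volume whose peak observed IOPS stays below a fraction of what is provisioned.

    `threshold` is an arbitrary illustrative cut-off; pick one that fits your workload.
    """
    if not observed_iops:
        return False  # no data, no verdict
    return max(observed_iops) < provisioned_iops * threshold


# Hypothetical 5-minute datapoints (period = 300 seconds).
observed = iops_per_period([30000, 60000], [30000, 0], 300)
print(observed, looks_over_provisioned(3000, observed))
```

Looking at the peak rather than the average is a deliberate choice here: a volume that is idle most of the day may still need its full performance during a nightly batch job.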

2. Keeping unused volumes and snapshots

This is another bad practice: EBS volumes are provisioned for possible future use, or are detached from EC2 instances at some point, and then never used. These unattached volumes sit idle in the customer’s account, costing money. Snapshots that nobody needs anymore add to the bill in the same way.

You can check the Zesty console or AWS Cost Explorer to find unattached EBS volumes, and safely remove them if they are not required.
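If you prefer to script the check, the following is a minimal sketch using boto3. It assumes boto3 is installed and AWS credentials are configured; the helper names are made up for this example. An unattached volume shows up in the `DescribeVolumes` response with the state `available` and an empty attachment list.

```python
def unattached_volumes(volumes):
    """Filter DescribeVolumes entries down to volumes not attached to any instance."""
    return [
        v for v in volumes
        if v.get("State") == "available" and not v.get("Attachments")
    ]


def fetch_all_volumes(region_name):
    """Fetch every EBS volume in a region (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the filter above works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    volumes = []
    for page in ec2.get_paginator("describe_volumes").paginate():
        volumes.extend(page["Volumes"])
    return volumes


# Example with hand-written sample data shaped like the API response.
sample = [
    {"VolumeId": "vol-a", "State": "available", "Attachments": [], "Size": 500},
    {"VolumeId": "vol-b", "State": "in-use", "Attachments": [{"InstanceId": "i-1"}], "Size": 100},
]
for vol in unattached_volumes(sample):
    print(vol["VolumeId"], vol["Size"], "GiB")
```

Review the list before deleting anything; taking a final snapshot of a volume you are unsure about is cheaper than keeping the volume itself.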

3. Using a single volume to store everything

This bad practice stems from how easy it is to spin up EC2 instances, attach EBS volumes, and not give enough thought to the workload itself.

In this model, the operating system, application, data, logs, and swap space all share the same volume. It’s often argued that using a single disk for everything was only an issue in the old days, when hard drives were directly attached to physical servers. It’s true that today a cloud-hosted volume’s storage is spread across multiple servers within an Availability Zone. However, this has more to do with capacity than performance.

As far as the operating system is concerned, the single volume is still a storage area with a finite limit. So when that volume runs out of space, the applications on the machine are still affected.

Let’s consider a single-volume EC2 instance running a critical database. This machine will have an operating system cache, log files, and temporary files, all generated on the same volume where the data files are stored. If the log files balloon in size (perhaps due to an application bug sending huge trace messages to the files), the storage may run out far quicker than anticipated. This condition would affect the database’s availability.

It’s therefore important to map out an EC2 instance’s volume structure during the solution architecture phase. The only time using a single volume or the root volume to store everything makes sense is when the machine is hosting a small or less critical application. A single volume is also easy to snapshot.
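Whatever layout you choose, it helps to watch how full each mount point is, so that a runaway log directory is noticed before it takes the database down. Here is a minimal sketch using only the Python standard library; the 85% limit and the example paths are assumptions for illustration.

```python
import shutil


def usage_percent(path):
    """Return how full the file system containing `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def nearly_full(paths, limit_percent=85.0):
    """Return the paths whose file system usage is at or above the limit."""
    return [p for p in paths if usage_percent(p) >= limit_percent]


# With separate volumes, these would be distinct mount points such as
# /var/lib/mysql and /var/log (hypothetical paths). "." keeps the example portable.
print(f"{usage_percent('.'):.1f}% used")
print(nearly_full(["."]))
```

When data, logs, and temporary files live on separate volumes, each one can fill up without starving the others.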

4. Using a suboptimal file system

A file system dictates how data in a storage volume is stored, accessed, searched, written, or tracked. There are different file systems for different operating systems like Linux, UNIX, or Windows.

One file system often performs better than another for the same workload type. Often an EC2 instance is running multiple types of workloads, each accessing a different volume for its data. It then becomes a question of using different file systems for different EBS volumes attached to the EC2 instance. However, more often than not, the same file system is used in all the volumes – resulting in suboptimal performance.

Once again, choosing the right file system for each workload should be part of the solution architecture. Our EBS Master Playbook has a great overview of some commonly used file systems.

5. Creating non-synchronized snapshots and lazily loading restored volumes

Snapshots are used to back up EBS volumes. Typically, the snapshot process is automated: a scheduled task runs a backup program or script against every EBS volume attached to every EC2 instance. Once completed successfully, the job usually sends a message to the operations team. It also sends a warning if it can’t snapshot one or more volumes. However, backups should be carefully designed to ensure data can be recovered in case of a failure or accidental data loss.

Many snapshot processes need precise synchronization. For example, let’s consider an application that runs on two EC2 servers. One EC2 machine hosts the app’s database files, the other one hosts its binaries, logs, and configuration files. Throughout the day, both configuration files and the database are updated, and both need to be in sync.

By default, each EBS volume is backed up in isolation; there’s no guarantee that isolated EBS volume snapshots will start or finish at the same time, even if the volumes are attached to the same EC2 instance.

Now imagine this. The automated snapshot process backs up the database volume at the start of its job and, a few hours later, snapshots the configuration files’ volume. The gap of a few hours between the two snapshots leaves the volume backups out of sync. Once restored, these volumes may not be consistent with each other, which can affect the application’s ability to come online.

That’s why, if your application has more than one volume that needs to be recovered after a failure, the backups of those volumes need to be in sync. To achieve this, use crash-consistent snapshots across multiple volumes.
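As a sketch of how this might look with boto3, the example below builds a request for the EC2 `CreateSnapshots` API, which snapshots the volumes attached to one instance as a crash-consistent set. Note that this API works per instance, so it covers volumes attached to a single EC2 instance. The helper names are made up for this example, and the call itself needs boto3 and AWS credentials. The small `snapshot_skew` helper shows how to measure the gap between snapshots that were taken independently.

```python
from datetime import datetime, timedelta, timezone


def snapshot_skew(start_times):
    """Return the gap between the earliest and latest snapshot start time."""
    return max(start_times) - min(start_times)


def multi_volume_snapshot_request(instance_id, description):
    """Build the keyword arguments for EC2 CreateSnapshots (multi-volume snapshot)."""
    return {
        "InstanceSpecification": {"InstanceId": instance_id, "ExcludeBootVolume": False},
        "Description": description,
        "CopyTagsFromSource": "volume",
    }


def create_multi_volume_snapshots(instance_id, description, region_name):
    """Take crash-consistent snapshots of an instance's volumes (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    response = ec2.create_snapshots(**multi_volume_snapshot_request(instance_id, description))
    return [s["SnapshotId"] for s in response["Snapshots"]]


# Two independently scheduled snapshots, three hours apart (made-up timestamps).
db = datetime(2024, 1, 1, 1, 0, tzinfo=timezone.utc)
cfg = datetime(2024, 1, 1, 4, 0, tzinfo=timezone.utc)
print(snapshot_skew([db, cfg]))
```

A skew measured in hours, as in the made-up timestamps above, is a sign that the backups may not restore to a consistent state.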

It’s also important to note that, by default, the data isn’t immediately available inside an EBS volume created from a snapshot. Each block is lazily loaded the first time it’s accessed.

For certain workloads like databases, this can mean the restored volume performs slower than desired for a long time until all the data has been accessed at least once. Fast Snapshot Restore addresses this issue. Although it comes with extra costs, it should be considered for latency-sensitive workloads.
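Where Fast Snapshot Restore is not an option, another approach is to read every block of the restored volume once, so that the slow first access happens before the workload starts. The following is a minimal sketch of that idea in Python; the function name and the device path in the comment are hypothetical, and dedicated tools such as `dd` or `fio` are generally better suited to large volumes.

```python
def read_all_blocks(device_path, chunk_size=1024 * 1024):
    """Read a device or file sequentially from start to end.

    Returns the number of bytes read. Reading each block once forces a
    restored volume to load the block from the snapshot.
    """
    total = 0
    # buffering=0 gives raw, unbuffered reads in chunk_size pieces.
    with open(device_path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(chunk_size)
            if not chunk:  # empty bytes means end of device
                break
            total += len(chunk)
    return total


# On an instance this would be run as root against the restored volume's
# device, for example read_all_blocks("/dev/xvdf") (hypothetical device name).
```

Reading a large volume this way takes time and consumes the volume's throughput, so it is best done before the application is put back into service.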

Bonus: Not encrypting volumes

Okay, we said we would talk about five EBS worst practices, but here’s one more as a bonus. Disk encryption ensures that data stored in a volume is encrypted with a symmetric key. If the AWS account is compromised, bad actors can make your unencrypted EBS snapshots public, copy them to their own account, and then change the snapshots’ permissions back to private. They can then create a volume from the copied snapshot, attach it to their own EC2 instance, and access the data.

Scary, right? Not so much if you encrypt your volumes. Without access to the key used to encrypt the EBS volume, the snapshot can’t be restored.

Many organizations encrypt their critical data volumes to comply with regulations. Those that choose not to are at risk of a data breach. Ideally, data should be encrypted both at rest and in transit. Encrypting EBS volumes with AWS KMS keys is a simple process, as this AWS documentation shows.
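As a starting point, the sketch below lists volumes that are not encrypted and shows how encryption by default can be switched on for a region with boto3. The helper names are made up for this example; the API call needs boto3, AWS credentials, and the appropriate permissions. Encryption by default applies to newly created volumes, so existing unencrypted volumes still need to be handled separately.

```python
def unencrypted_volume_ids(volumes):
    """Return the IDs of DescribeVolumes entries whose Encrypted flag is not set."""
    return [v["VolumeId"] for v in volumes if not v.get("Encrypted", False)]


def enable_default_encryption(region_name):
    """Turn on EBS encryption by default for a region (needs boto3 and credentials)."""
    import boto3  # imported lazily so the filter above works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    return ec2.enable_ebs_encryption_by_default()["EbsEncryptionByDefault"]


# Example with hand-written sample data shaped like the API response.
sample_volumes = [
    {"VolumeId": "vol-1", "Encrypted": True},
    {"VolumeId": "vol-2", "Encrypted": False},
]
print(unencrypted_volume_ids(sample_volumes))
```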

EBS is a useful, flexible, and easy-to-use managed service from AWS. Hopefully, this article has given you some ideas about the pitfalls you can avoid when using EBS volumes and the means to address them. Designing and using EBS volumes appropriately can help minimize associated costs, as well as improve data security, data recovery, and performance.

If you’re looking to improve application stability, reduce your EBS costs, or ease the provisioning of volumes, learn more about Zesty’s solution for managing EBS storage.
