How Zesty Disk optimizes storage efficiency for EMR workloads
Customer Brief
The company provides a threat detection and response solution for hybrid cloud environments. As a data-driven enterprise, it powers its Next-Gen SIEM and XDR with advanced machine learning algorithms that process massive amounts of data in real time. The solution uses the output of behavioral analytics technology to reduce event noise, prioritize high-fidelity alerts, and enable fast, precise responses to insider and cyber threats.
The company's large R&D team is based in the USA and India. Their workloads are built on thousands of clusters that provide a scalable, flexible, cloud-native architecture, which makes the solution easy to deploy. EMR is used to run HBase (Hadoop Database) and Spark to analyze and process the vast amounts of data collected by their processing engine for security event analysis.
Key Challenges
Very low disk utilization left the company paying for hundreds of terabytes of provisioned capacity that went unused most of the time.
Key Results
The company reduced its provisioned storage by 1.26 PB, to just 240 TB, which translates into savings of $100,000 every month.
The Challenge:
The high volume of data collected for analysis required the DevOps team to frequently allocate 4 TB of disk storage per EC2 instance. However, that capacity was only needed while data ingested from their analytics engine filled the disks; the rest of the time, the company was paying for block storage capacity that wasn't being used.
In total, they had 4,500 filesystems on EMR, which equated to approximately 1.5 PB of provisioned storage. Average disk utilization was 14%, leaving them paying for hundreds of terabytes of provisioned capacity that went unused most of the time.
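To make the scale of the idle capacity concrete, the figures above can be worked through as a short calculation. This is a rough sketch only: it assumes decimal units (1 PB = 1,000 TB) and treats the 14% average as applying uniformly to all provisioned storage. The variable names are illustrative and not taken from the case study.

```python
# Figures quoted in the case study.
PROVISIONED_TB = 1_500    # ~1.5 PB provisioned (assumption: 1 PB = 1,000 TB)
FILESYSTEMS = 4_500       # filesystems on EMR
AVG_UTILIZATION = 0.14    # 14% average disk utilization

# Capacity actually holding data versus capacity sitting idle.
used_tb = PROVISIONED_TB * AVG_UTILIZATION
idle_tb = PROVISIONED_TB - used_tb

# Average provisioned size per filesystem, in GB.
avg_fs_gb = PROVISIONED_TB * 1_000 / FILESYSTEMS

print(f"In use: {used_tb:.0f} TB")
print(f"Idle:   {idle_tb:.0f} TB")
print(f"Average provisioned per filesystem: {avg_fs_gb:.0f} GB")
```

Under these assumptions, roughly 210 TB held data while about 1,290 TB sat idle, which matches the "hundreds of terabytes" described above.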
Zesty’s Solution:
Zesty Disk was initially deployed on the company's EMR workloads to keep storage available without interruption, even during periods of heavy data usage. A custom Amazon Machine Image (AMI) was developed that uses the Zesty Disk filesystem (Amazon Linux 2 with Zesty Disk). The filesystem sits on a virtual disk made up of several small, standard EBS volumes.
Splitting the storage into small volumes is what makes the disk elastic: it can grow as data is ingested and shrink as data is removed. As a result, the attached disks run at a much higher utilization rate.
The Zesty Disk-optimized EBS volumes still work with all the native AWS tools and procedures, so the company's SLAs remain unchanged. The company remained the owner of its data and the only party with access to it.
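The grow-and-shrink behavior can be illustrated with a toy model. The sketch below is not Zesty Disk's actual algorithm: the `ElasticDisk` class, the 100 GB chunk size, and the 20% headroom are all hypothetical choices made for illustration. It only shows why building a disk from small volumes lets capacity track the data closely.

```python
import math
from dataclasses import dataclass


@dataclass
class ElasticDisk:
    """Toy model of a virtual disk backed by several small volumes."""

    chunk_gb: int = 100      # hypothetical size of each small volume
    headroom: float = 0.2    # hypothetical: keep 20% spare space above the data
    min_chunks: int = 1
    chunks: int = 1
    used_gb: float = 0.0

    @property
    def capacity_gb(self) -> int:
        return self.chunks * self.chunk_gb

    def _resize(self) -> None:
        # Smallest number of chunks that holds the data plus headroom.
        needed_gb = self.used_gb * (1 + self.headroom)
        self.chunks = max(self.min_chunks, math.ceil(needed_gb / self.chunk_gb))

    def write(self, gb: float) -> None:
        """Ingest data; the disk grows by attaching small volumes."""
        self.used_gb += gb
        self._resize()

    def delete(self, gb: float) -> None:
        """Remove data; the disk shrinks by detaching small volumes."""
        self.used_gb = max(0.0, self.used_gb - gb)
        self._resize()


disk = ElasticDisk()
disk.write(850)    # ingestion peak
print(disk.capacity_gb, f"{disk.used_gb / disk.capacity_gb:.0%}")
disk.delete(700)   # data removed after processing
print(disk.capacity_gb, f"{disk.used_gb / disk.capacity_gb:.0%}")
```

In this model, provisioned capacity follows the data in 100 GB steps, so utilization stays high after data is removed, whereas a single fixed 4 TB volume would stay fully provisioned throughout.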
Adopting Zesty Disk was straightforward for their short-lived EMR applications. It simply involved spinning up a new instance with a Zesty Disk-embedded filesystem.
The onboarding process also served to match the storage disk type to the workload's needs. By default, gp2 disks are packaged into EMR images, but they're often not ideal for the heavy data processing that EMR performs, leaving users paying for a more expensive drive that doesn't adequately meet their needs. Zesty Disk automatically launches EMR volumes as gp3 volumes, which provide more IOPS at a lower cost.
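The IOPS difference is easiest to see for small volumes. The sketch below compares baseline IOPS only, using AWS's documented general-purpose SSD rules: gp2 scales at 3 IOPS per GiB with a floor of 100 and a ceiling of 16,000, while gp3 starts at a flat 3,000 regardless of size. It ignores gp2 burst credits, throughput, and pricing, and the example volume sizes are arbitrary rather than taken from the case study.

```python
GP3_BASELINE_IOPS = 3_000  # gp3 baseline, independent of volume size


def gp2_baseline_iops(size_gib: int) -> int:
    """gp2 baseline: 3 IOPS per GiB, clamped to the range 100..16,000."""
    return min(16_000, max(100, 3 * size_gib))


# Arbitrary example sizes in GiB.
for size_gib in (100, 333, 1_000, 4_000):
    gp2 = gp2_baseline_iops(size_gib)
    print(f"{size_gib:>5} GiB  gp2 baseline: {gp2:>6}  gp3 baseline: {GP3_BASELINE_IOPS}")
```

For volumes under 1,000 GiB, the gp3 baseline exceeds what gp2 sustains without bursting, which is relevant when a disk is assembled from several small volumes.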
The Result:
Since deploying Zesty Disk, the company has reduced its provisioned storage by 1.26 PB, to just 240 TB. This translates to savings of $100,000 every month.
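As a sanity check, the headline numbers can be reconciled with the roughly 1.5 PB the company had provisioned before. The calculation below is a sketch under the assumption of decimal units (1 PB = 1,000 TB, 1 TB = 1,000 GB). The case study does not say how the savings figure was calculated, so the per-GB rate is only an implied average, not a quoted price.

```python
# Figures quoted in the case study.
BEFORE_TB = 1_500              # ~1.5 PB provisioned before (assumption: decimal units)
REDUCTION_TB = 1_260           # 1.26 PB removed
MONTHLY_SAVINGS_USD = 100_000

after_tb = BEFORE_TB - REDUCTION_TB
reduction_pct = REDUCTION_TB / BEFORE_TB * 100

# Implied average saving per GB of capacity removed, per month.
savings_per_gb = MONTHLY_SAVINGS_USD / (REDUCTION_TB * 1_000)

print(f"Provisioned after: {after_tb} TB")
print(f"Reduction: {reduction_pct:.0f}%")
print(f"Implied saving: ${savings_per_gb:.3f} per GB-month")
```

The remaining 240 TB is consistent with the stated reduction, and it amounts to an 84% cut in provisioned capacity.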
Zesty Disk offers a unique solution to maximize the value derived from EBS. The solution enables organizations to improve cost efficiency and achieve more with their existing storage resources.
Given this value, the company now runs all of its EMR clusters with EBS volumes fully managed by Zesty Disk. They are pleased with the seamless service availability, even during large data ingestion peaks. Operationally, Zesty Disk also avoids the hassle of reallocating storage across instances and eliminates on-call developer tasks related to maintaining EBS. Overall, Zesty Disk gives organizations a powerful tool to enhance their storage infrastructure, improve cost-effectiveness and operational efficiency, and increase the value of their EBS investments.
“Zesty Disk are more than a software, they’re a key part of our team.”
Sanjeev Kishore Yarnapati
Principal Cloud Architect
“We chose Zesty because of the significant savings that we could achieve with absolutely minimal effort involved.”
David Ting
Senior VP of Engineering
“Zesty has allowed us to reduce our overhead. We now have a zero point deviation from cost KPIs”
Manoj Srikantaiah
Lead DevOps Engineer