Wynk Slashes Compute Costs for Big Data by 60% with Amazon EMR and Spot Instances

Becoming the Destination of Choice for Entertainment in India

The music market in India ranks as the 15th largest globally and could enter the top 10 as soon as 2022. Due to fast-rising internet penetration and the proliferation of streaming services, many view music as a sunrise sector for the Indian economy.

Wynk Limited launched in late 2015 as an innovation unit of Bharti Airtel Limited, one of India’s leading telecommunications companies. Wynk Music belongs to Airtel Digital Limited, along with Xstream Digital, a video streaming service. With 72 million monthly active users and more than 14 million songs in its content catalogue, Wynk Music is the number-one music app on the App Store and Google Play Store in India, with over 100 million downloads to date. Wynk’s mission is to become the preferred destination for entertainment by delivering customized, seamless online experiences.

“Amazon EMR is developer-friendly and flexible, which makes it easier to run big data and analytics applications.”

Ridhima Kapoor
Head of Data Platform, Wynk Music

Data at the Core of Operations

Wynk was born on the Amazon Web Services (AWS) Cloud. As the company grew rapidly, so did its subscriber base. In a single day, the Wynk Music app generates about 4 TB of data, including insights into how long a user has tuned in and which genre of music they listen to most. A few years after launching, Wynk began investing in its employees and technology to transform into a data-driven operation and offer more personalized features to users.

The data team started down the analytics road by building a data lake. For unstructured data, Wynk used Apache Hadoop with Ansible software to perform distributed data processing. However, the company’s data infrastructure quickly became overly complex and compute costs soared. Wynk consulted with the AWS team to find a more manageable solution that could scale in a cost-controlled manner and switched to Amazon EMR as its big data platform. “Amazon EMR is developer-friendly and flexible, which makes it easier to run big data and analytics applications,” says Ridhima Kapoor, head of Data Platform at Wynk Music. The Wynk data team also implemented Amazon Redshift as a data warehouse and Amazon Simple Storage Service (Amazon S3) as a data lake.

As part of AWS Enterprise Support, the AWS Solutions Architect and Technical Account Manager dedicated to Wynk organized multiple training sessions. This empowered Wynk’s data team to experiment with and implement an efficient architecture design, as well as fail fast. “AWS has been with us from the start in identifying potential solutions to fit our business use cases,” says Hitesh Bhatia, head of DevOps at Wynk Music. The company has had many enablement sessions to help its data team adopt and fine-tune infrastructure components, including optimizing costs on the AWS Cloud.

Refining a “Spot Strategy” to Fit Business Use Cases

To mitigate the rising cost of data processing, Wynk runs its Amazon EMR clusters on Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances. Hitesh estimates the company is now saving 60 percent on compute costs by moving from On-Demand Instances to Spot Instances for its big data framework. Wynk provisions capacity across diversified Amazon EC2 Spot Instance pools with a capacity-optimized allocation strategy and uses Amazon EMR managed scaling to handle spikes and dips in traffic. This configuration ensures website and app stability during peak hours, such as weekday evenings. Wynk also uses Spot Instance Advisor to determine which instances have the least chance of interruption across various Spot Instance pools.
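
As a rough illustration of the setup described above, the sketch below assembles the keyword arguments one might pass to boto3's EMR `run_job_flow` call: an On-Demand master and core fleet, a Spot task fleet diversified across several instance types with the `capacity-optimized` allocation strategy, and a managed scaling policy. The cluster name, instance types, capacities, release label, and role names are all placeholders chosen for the example, not Wynk's actual configuration, and the function only builds the request without calling AWS.

```python
from typing import Any, Dict, List


def build_emr_spot_request(
    cluster_name: str,
    task_instance_types: List[str],
    min_units: int,
    max_units: int,
) -> Dict[str, Any]:
    """Build illustrative kwargs for boto3's EMR `run_job_flow`.

    Every concrete value here (instance types, capacities, release
    label, role names) is an assumption made for this sketch.
    """
    if len(task_instance_types) < 2:
        raise ValueError("diversify across at least two instance types")
    if not 0 < min_units < max_units:
        raise ValueError("need 0 < min_units < max_units")

    # Spot settings: pick pools with the most spare capacity, and fall
    # back to On-Demand if Spot capacity is not fulfilled in time.
    spot_launch_spec = {
        "SpotSpecification": {
            "TimeoutDurationMinutes": 20,
            "TimeoutAction": "SWITCH_TO_ON_DEMAND",
            "AllocationStrategy": "capacity-optimized",
        }
    }

    fleets = [
        {
            "Name": "master",
            "InstanceFleetType": "MASTER",
            "TargetOnDemandCapacity": 1,
            "InstanceTypeConfigs": [{"InstanceType": "m5.xlarge"}],
        },
        {
            "Name": "core",
            "InstanceFleetType": "CORE",
            "TargetOnDemandCapacity": min_units,
            "InstanceTypeConfigs": [
                {"InstanceType": "m5.xlarge", "WeightedCapacity": 1}
            ],
        },
        {
            "Name": "task-spot",
            "InstanceFleetType": "TASK",
            "TargetSpotCapacity": 1,
            # One entry per instance type so EMR can draw from several pools.
            "InstanceTypeConfigs": [
                {"InstanceType": t, "WeightedCapacity": 1}
                for t in task_instance_types
            ],
            "LaunchSpecifications": spot_launch_spec,
        },
    ]

    return {
        "Name": cluster_name,
        "ReleaseLabel": "emr-6.15.0",          # placeholder release
        "JobFlowRole": "EMR_EC2_DefaultRole",  # placeholder role names
        "ServiceRole": "EMR_DefaultRole",
        "Instances": {"InstanceFleets": fleets},
        # Managed scaling: capacity above min_units must come from
        # Spot task units because On-Demand and core are capped.
        "ManagedScalingPolicy": {
            "ComputeLimits": {
                "UnitType": "InstanceFleetUnits",
                "MinimumCapacityUnits": min_units,
                "MaximumCapacityUnits": max_units,
                "MaximumOnDemandCapacityUnits": min_units,
                "MaximumCoreCapacityUnits": min_units,
            }
        },
    }
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("emr").run_job_flow(**request)`. Capping On-Demand and core capacity at the minimum is one possible design choice: it keeps a stable baseline while letting traffic spikes be absorbed by Spot capacity.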

Wynk’s DevOps team uses customized anomaly detection dashboards to view and analyze daily spend. The management team uses similar dashboards to monitor overall infrastructure costs and holds monthly reviews with the AWS account team. By using AWS Cost Explorer and the Amazon QuickSight business intelligence service, Wynk can visualize expenses and understand the reasons for variations in spending. The company’s management receives notifications for spending anomalies and can act fast to curb costs if needed.
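
The case study does not say how Wynk's dashboards decide that a day's spend is anomalous. Purely as an illustration of the idea, the sketch below flags any day whose spend deviates from the mean of the preceding days by more than a chosen number of standard deviations. The function name, the 7-day window, and the threshold of 3 are assumptions made for this example.

```python
from statistics import mean, stdev
from typing import List, Tuple


def flag_spend_anomalies(
    daily_spend: List[float],
    window: int = 7,
    threshold: float = 3.0,
) -> List[Tuple[int, float]]:
    """Return (day_index, spend) pairs that deviate sharply from the
    trailing `window` days. A simple z-score rule, assumed for this
    sketch rather than taken from any AWS service.
    """
    if window < 2:
        raise ValueError("window must be at least 2")

    anomalies = []
    for i in range(window, len(daily_spend)):
        history = daily_spend[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: any change at all is unusual.
            is_anomaly = daily_spend[i] != mu
        else:
            is_anomaly = abs(daily_spend[i] - mu) / sigma > threshold
        if is_anomaly:
            anomalies.append((i, daily_spend[i]))
    return anomalies


# Hypothetical daily spend figures; day 7 is a clear spike.
spend = [100, 102, 98, 101, 99, 100, 103, 250, 101]
print(flag_spend_anomalies(spend))  # → [(7, 250)]
```

In practice the daily figures would come from a billing source such as AWS Cost Explorer, and a flagged day would trigger the kind of notification described above.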

Reducing Time-to-Market from 6 Days to 0.5 Day

Thanks to increased automation and managed services on AWS, Wynk has reduced its time-to-market for new features to half a day, down from six days. “The advantage of out-of-the-box services such as Amazon EMR is that we no longer have to worry about deployment, and we can just concentrate on rolling out features. As such, we’ve been able to standardize a lot of code in our data framework using templates and AWS Cloud–native tools,” Ridhima says.

Hitesh also emphasizes the benefits of integration among AWS services. “We have access to many building blocks on the AWS Cloud that we can use in multiple combinations. Just like playing Legos, we can put one block on another to customize our data framework, and in turn customize the Wynk experience,” he says.