Warner Music Migrates Its Cloud Foundry Deployment from OpenStack to AWS

One of the Cloud Foundry pioneers, WMG found troubling inconsistency in its hybrid cloud environment. So, it hit reset and migrated the entire OpenStack-based environment to all AWS, all the time.

WMG’s Adam Chesterton and Altoros’s Renat Khasanshyn discussed the move during a session at the Cloud Foundry Summit 2016 in Santa Clara, CA.

 

Issues faced with a hybrid cloud


Warner Music Group (WMG) started using Cloud Foundry in 2011, and they’re “still one of the largest organizations using the community version for production workloads,” according to Adam Chesterton. Apps were deployed over BOSH onto OpenStack, with WMG beginning its journey with multiple cloud providers, some public and some private.

“Then something happened,” Adam said of moving workloads to the OpenStack environment. “We managed to crack the way to rapidly build, test, and deploy code into Cloud Foundry, but we started to experience some issues.”

Renat outlined several of the issues that arose:


  • Environments were not as reliable as hoped.
  • The OpenStack API behaved inconsistently when called from BOSH. “We would request 10 virtual machines, for example, but sometimes get two or none at all.”
  • Random, inconsistent timeouts occurred. HTTP calls were made and acknowledged, but nothing was returned.
  • The team couldn’t get access to hardware logs (the managed OpenStack provider could not provide them), which led to prolonged troubleshooting and discovery. It was unclear where the problems lay, and the team needed full end-to-end visibility.
  • Renat also noted that Cloud Foundry allows users to set up identical environments at the application level. “But the environments were not the same,” he said. “As we moved code from development to production, we found that we weren’t testing the same thing.”
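The “request 10, get two or none” symptom can be illustrated with a minimal Python sketch that checks how many machines actually came up and re-requests only the shortfall. This is an illustration under stated assumptions, not how BOSH or the OpenStack API handles provisioning, and not something the speakers described building: `provision_exactly` and the `request_vms` callable are hypothetical names invented for this example.

```python
import time
from typing import Callable, List


def provision_exactly(
    request_vms: Callable[[int], List[str]],  # hypothetical provider call
    wanted: int,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Request VMs until `wanted` are confirmed or attempts run out."""
    confirmed: List[str] = []
    for attempt in range(max_attempts):
        missing = wanted - len(confirmed)
        # The provider may return fewer IDs than requested, or none at all.
        confirmed.extend(request_vms(missing)[:missing])
        if len(confirmed) == wanted:
            return confirmed
        if attempt < max_attempts - 1:
            # Back off exponentially before asking for the shortfall again.
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError(
        f"only {len(confirmed)} of {wanted} VMs confirmed "
        f"after {max_attempts} attempts"
    )
```

Failing loudly when the shortfall persists, rather than proceeding with a partial fleet, makes this kind of inconsistency visible early instead of surfacing later as an unexplained difference between environments.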

“The key problem that we had was that we could not get consistent results.”
—Renat Khasanshyn, CEO, Altoros

“So, we needed to go one way or another,” Adam added. “We did not have a compelling reason to stick with our model.” In contrast, referring to recent improvements in the AWS offering, Adam noted that security awareness was growing, pricing models had improved since WMG implemented OpenStack, and AWS reliability had also gotten better.

WMG public cloud growth

 

Hit the reset button

The team started with a comparison of public vs. private cloud options. The business issue for WMG became, “do you become over-cautious about deliverables, which goes against what Cloud Foundry can promise? Or do you take a risk (with the potential to create) a sticky situation with your business holders?”

Adam and the team decided to take the risk, confident that the Cloud Foundry platform could deliver for them on AWS. They took several next steps to make the migration work:

  • Developed a single, consistent strategy by working with both internal and AWS architects.
  • Focused on automation. “We built out a foundation, then used BOSH on top of it to deploy the apps and to do so across all of our environments.”
  • Got recommendations from security partners. “We found there were many options in this space. This was a really big factor.”
  • Migrated from an EC2 dev environment, building everything from scratch.
  • Developed an upgrade path for the eventual migration to Cloud Foundry Diego.
  • Built resiliency for a multi-zone model.

By then committing fully to AWS and being judicious in its use of reserved and spot instances to control costs, WMG was soon able to keep its promises to its line-of-business users and strengthen its SLAs, Adam said. “We did a pause and a reset. It was becoming hard to predict the unknown, and the hybrid environment was impacting us setting our deadlines for business.”

WMG cloud comparison

With the migration, “being able to promise lines of business when they can get their apps into the environment, and what kind of service levels they can expect, is a giant advantage when operating with multiple consumers,” Renat said. “As a result, the price we were willing to pay to get that expectation of availability was much higher than we were originally willing to pay.”

In the end, “We accomplished what we wanted to accomplish,” he said. “We removed random unknowns, so we can be more accurate in setting business expectations.”

 

Lessons learned

Nevertheless, the team found during the migration that not all of its assumptions held true. For instance, “there are actual physical capacity limits on compute resources that you can use without having to talk to people. With a private cloud, the elasticity you have paid for is guaranteed. However, time to provision and terms for new capacity can be significantly longer than in a public cloud.”

“Elasticity is not guaranteed in a public cloud.” —Renat Khasanshyn, CEO, Altoros

For their Cloud Foundry Summit session, Adam and Renat put together a data sheet covering the major issues that arose on the way to running Cloud Foundry on the AWS public cloud.

The sheet provides an overview of five successful processes (or patterns) and their five sibling anti-patterns. In addition, it contains a list of top 10 tips tied to specific problems the team faced and solved.

Visit this page to get the document.

 


Want details? Watch the video!

Table of contents
  1. Inconsistency issues faced with OpenStack and BOSH (2:54)
  2. OpenStack problems causing errors and timeouts (5:53)
  3. Rethinking the hybrid cloud strategy (7:45)
  4. WMG sees improved security and reliability with public cloud (9:19)
  5. Public cloud vs. private cloud comparison (10:34)
  6. Rebuilding from the ground up using public cloud (13:16)
  7. Lessons learned in the migration (15:30)
  8. Cutting costs with public cloud (17:45)
  9. Elasticity is not guaranteed in a public cloud (20:23)
  10. Questions and answers (24:41)

 


About the speakers

Adam Chesterton manages and leads the Application Development, Infrastructure, and Quality Assurance teams within WMG Technology, including both onshore and offshore engineering resources. He is responsible for development, support, infrastructure, and testing of multiple internal business apps: artist analytics, tour scheduling, mobile, licensing, etc., as well as 3rd-party services (such as YouTube API integration). The entire infrastructure and services are built to fully utilize Cloud Foundry as a PaaS, allowing rapid development and deployment using containers and Ansible playbooks to maintain consistency across both private and public cloud offerings.

 

Renat Khasanshyn is the founder and CEO of Altoros, and Venture Partner at Runa Capital. He helps define Altoros’s strategic vision and its role in the Cloud Foundry PaaS ecosystem. In the past, he has been selected as a finalist for the Emerging Executive of the Year award by the Massachusetts Technology Leadership Council and once won an IBM Business Mashup Challenge. Prior to founding Altoros, Renat was VP of Engineering for Tampa-based insurance company PriMed.

This post is written by Roger Strukhoff and Alex Khizhniak.