Jellyfish Pictures needed to enable secure remote access to immense computing power to render visual effects and animation sequences. The studio gained burst rendering on up to 90,000 processor cores in the cloud with Microsoft Azure Virtual Machine Scale Sets, HBv2 virtual machines, and Avere vFXT for Azure. It takes 80 percent off those rendering costs with Azure Spot Virtual Machines and uses an Azure ExpressRoute connection to minimize latency while more securely managing storage in one place, without replication.
“We used Azure Virtual Machine Scale Sets to replicate a single image across 500 nodes of Azure HBv2-series VMs spread over Ireland, the Netherlands, and the United States. There’s no way we would have delivered our latest project without that broad footprint.”
Jeremy Smith, Chief Technology Officer, Jellyfish Pictures
Animation and visual effects (VFX) studio Jellyfish Pictures has been recognized with an Emmy, a British Academy Film Award, and two Visual Effects Society awards. Well respected in the industry, the studio contributed VFX to three recent Star Wars films and HBO’s Watchmen series. Along with creating original children’s programming, it has also worked on projects like DreamWorks Animation’s How to Train Your Dragon: Homecoming.
Transient workforce, unpredictable IT needs, and inflexible deadlines
This type of work, especially VFX, is primarily project-based, with artists moving around among studios. Though the trend is changing in 2021, traditionally, most creatives worked on-premises. Others worked remotely using VPN connections and often had issues with connection speeds, depending on what broadband service they used and where they lived. At times, artists would send themselves asset files by email if they needed to work at home, leaving those files vulnerable to malware and other cyberthreats. Tracking and collating these massive, individually stored files could become a bottleneck in project schedules.
“When a studio scales up to work on large feature films, it will invest in hardware, software, and recruitment of hundreds of artists. Often when the project is complete, and the studio scales back down, it is still committed to expensive and specialized hardware. Studios can either strive to get maximum utilization out of that hardware investment or opt for the flexible solutions offered by cloud-based solutions,” says Matthew Bristowe, Managing Director at Jellyfish Pictures. “Studios are all competing for the same small pool of top artists, but if they’re restricted by fixed, on-premises capacity, they can be forced to turn down big projects because they’re not able to scale their infrastructure swiftly and efficiently.”
The need for vast amounts of compute resources
Investing in on-premises IT infrastructure has left many studios constrained to using resources in their physical locations and restricted to hiring talent who live nearby, or who can afford to relocate temporarily. Studios need large amounts of compute power for tasks like rendering, when artists are creating animation and visual effects. “Artists typically start by creating wireframes, where they craft shape, texture, motion, lighting, and so on,” says Jon Weisner, Azure High Performance Compute Lead (Americas) at Microsoft. “The next step, rendering data for 30 frames or more per second of a motion picture, requires a lot of compute and takes time. Compute requirements grow even further as we see the sophistication of images increase and we push content into the realm of 8K. Production demands aren’t consistent, though, and that calls for a flexible architecture that allows studios to operate on-premises until it’s fully utilized, then burst into cloud to get the additional compute they need to deliver on time.”
Rendering jobs can take two to eight hours (or even days) when a studio uses an on-premises render farm with fixed capacity. Artists often sit idle waiting for a render to complete before they can see the finished product and make adjustments. To reduce the number of artists waiting for jobs to return, studios need to add compute on demand so that more of the job queue is processed in parallel.
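The effect of adding compute to a fixed-capacity farm can be illustrated with a rough sketch. It assumes, for simplicity, that every job in the queue takes the same time and occupies one render slot; the function name and the sample numbers are hypothetical and do not come from Jellyfish.

```python
import math


def queue_drain_hours(num_jobs: int, hours_per_job: float, parallel_slots: int) -> float:
    """Estimate wall-clock hours to clear a queue of equal-length render jobs.

    Simplifying assumption: jobs run in "waves" of `parallel_slots` at a time,
    and each wave takes `hours_per_job`.
    """
    if parallel_slots <= 0:
        raise ValueError("parallel_slots must be positive")
    waves = math.ceil(num_jobs / parallel_slots)
    return waves * hours_per_job


# Hypothetical queue: 400 jobs of 4 hours each.
fixed_farm = queue_drain_hours(400, 4.0, 100)   # four waves on a fixed farm
with_burst = queue_drain_hours(400, 4.0, 400)   # a single wave with extra slots
```

With these made-up figures, quadrupling the available slots cuts the wait from four waves to one, which is the reason on-demand capacity shortens the time artists spend waiting.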
The need to safeguard this painstakingly created intellectual property is also a concern. To be eligible for certain projects, Jellyfish must be accredited by organizations like the Trusted Partner Network (TPN), which often follows Motion Picture Association directives. Once accredited, a studio can be audited at any time.
The road to virtualization
In a pioneering move for the industry, Jellyfish decided that virtualization was the answer. In 2015, it began working with Microsoft, Teradici, and AMD to implement capabilities like batch rendering, which led to the company opening its first virtual studio in 2016 and going completely virtual in December 2019.
In its on-premises datacenter, Jellyfish uses only AMD EPYC processor cores, so it was an easy decision to choose the same processors in the cloud, where it relies mainly on Microsoft Azure HBv2-series virtual machines (VMs) with 120 AMD EPYC™ 7742 processor cores. “When we were looking at our cloud consumption, we found that AMD EPYC were the best-performing and most cost-effective VM SKUs for our rendering workloads,” says Jeremy Smith, Chief Technology Officer at Jellyfish Pictures.
Jellyfish built large-scale services using Azure Virtual Machine Scale Sets to create and manage a group of heterogeneous load-balanced VMs. With Azure services, it runs many renders in parallel on a wide range of hardware, including GPUs. When the studio’s creatives exceed the capacity of the on-premises datacenter, the company handles burst rendering with Virtual Machine Scale Sets to increase or decrease the number of VMs automatically in response to demand or based on its self-defined schedule.
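The burst decision described above can be sketched as a small calculation. This is only an illustration of the logic, not the Azure autoscale API or the studio's actual configuration: the function name and the `max_vms` cap are hypothetical, while the figures of about 22,000 on-premises cores and 120 cores per HBv2 VM are taken from this story.

```python
def burst_vms_needed(core_demand: int, on_prem_cores: int,
                     cores_per_vm: int, max_vms: int) -> int:
    """Return how many cloud VMs are needed to cover demand beyond on-premises capacity.

    Demand that fits on-premises needs no cloud VMs; overflow is rounded up
    to whole VMs and capped at `max_vms` (a hypothetical quota).
    """
    overflow = max(0, core_demand - on_prem_cores)
    needed = -(-overflow // cores_per_vm)  # ceiling division on integers
    return min(needed, max_vms)


# Demand of 82,000 cores against a 22,000-core farm leaves 60,000 cores
# of overflow, which is 500 VMs at 120 cores each.
vms = burst_vms_needed(82_000, 22_000, 120, max_vms=750)
```

In practice a scale set applies rules like this automatically from metrics or a schedule; the sketch only shows the arithmetic behind scaling out and back in.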
To deliver virtualized workstations to creatives, the studio adopted Teradici Cloud Access Software (CAS). It also employs pixitmedia’s pixstor software-defined scalable infrastructure for on-premises storage, and Avere vFXT for Azure acts as an extension of its on-premises render farm, storing content during the render process to reduce network costs. By using an Azure ExpressRoute connection, the studio minimizes latency while more securely managing storage in one place, without replication. The studio’s solution meets TPN and Content Delivery and Security Association requirements, providing creatives with more secure access to the files and rendering performance they need.
A 70 percent boost over on-premises capacity, freedom to work from anywhere
With Azure compute services, Jellyfish runs its production services in the cloud or on-premises in whatever proportion it needs, shifting as needed. “With Azure service flexibility,” says Smith, “we can put the needs of the production first, hiring the best creatives for each task no matter where they live. And when the project is over, we shut it off. We aren’t stuck with on-premises hardware investments that consume air conditioning, square footage, and so on, being underutilized or waiting idle for the next production.”
During the COVID-19 lockdown in 2020 and 2021, Jellyfish hired around 200 people from more than 20 different countries. Smith says, “We used Azure Virtual Machine Scale Sets to replicate a single image across 500 nodes of Azure HBv2-series VMs spread over Ireland, the Netherlands, and the United States. There’s no way we would have delivered our latest project without that broad footprint. All our creatives around the world experienced the same low latency and high performance.”
No matter where creatives are based, they can sign on using CAS and get access to all of the infrastructure that they would have if they were at the studio. Bristowe says, “With burst rendering in Azure, we can expand our business according to the best talent and economics.”
Jellyfish can offer these options thanks to its new scaling capacity. The studio’s on-premises render farm has about 22,000 processor cores. In January and February 2021, while working on an animated film project for a major studio, Jellyfish used over 55 million core hours on Azure HBv2-series VMs. Smith says, “Our Azure Virtual Machine Scale Sets ran 60 to 70 thousand processor cores, sustained, for two months, and peak demand hit 90,000 total cores. With burst rendering in Azure, we gained the equivalent of an additional 70 percent capacity of our own datacenter. This enabled us to scale up massively at a moment’s notice without any long-term CAPEX commitments.”
Improved efficiency and agility
The studio makes sure to get the most from its cloud spend. “Previously, we couldn’t work with certain sequences on-premises due to memory limitations,” says Smith. “Now, with almost half a terabyte of RAM on each Azure HBv2-series VM, using AMD processor cores, we can load massive assets into the cloud.”
Jellyfish achieves cost efficiency with Azure Spot Virtual Machines. Says Smith, “We use Azure Spot Virtual Machines to take 80 percent off our costs almost instantly. It’s an option for buying unused compute capacity at a discount.”
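As a rough illustration of what an 80 percent discount means for a render budget, the sketch below applies the quoted discount to a core-hour total. The price per core hour is a made-up placeholder, not an Azure price, and real Spot discounts vary by region, VM size, and time; only the 80 percent figure and the 55 million core hours come from this story.

```python
def spot_cost(core_hours: float, price_per_core_hour: float,
              discount: float = 0.80) -> float:
    """Estimate spend after a Spot discount.

    `discount` is the fraction taken off the pay-as-you-go price
    (0.80 reflects the 80 percent quoted in the text).
    """
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return core_hours * price_per_core_hour * (1.0 - discount)


# Hypothetical price of 0.03 (any currency) per core hour.
full_price = spot_cost(55_000_000, 0.03, discount=0.0)
discounted = spot_cost(55_000_000, 0.03)
```

The trade-off is that Spot capacity can be reclaimed when Azure needs it back, so it suits interruptible work such as render jobs that can be requeued.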
Bristowe says the solution is reassuring for clients. “A lot of money goes into these projects, and clients want to feel certain that we can deliver. Our technical abilities, like scaling, help us tremendously,” he says. “If the main studio needs a reshoot for non-VFX-related reasons, we’re still agile enough to pivot and meet the deadlines, by essentially pressing a button instead of purchasing new hardware.”
By adding infrastructure as code with Terraform, the studio automates platform provisioning. “Azure infrastructure as code basically shows us how to rapidly deploy Azure services in the cloud,” says Smith. “It dramatically speeds and simplifies deployments. After you figure out your basic settings, you can customize them easily.”
A view into the future
The studio is moving toward more cloud-based workflows. “Azure is really backing efforts in the industry by organizations like MovieLabs that are working to establish software-defined, cloud-based workflows,” says Smith.
Cloud-based rendering is just the beginning. Smith explains, “After Avere vFXT for Azure supports Microsoft Azure Blob Storage, we’ll be able to mine our data with machine learning to generate additional value from it, maybe through reporting.” Jellyfish also plans to start archiving data to the cloud, then anonymizing and monetizing it.
Whatever the future brings, the studio will be ready. “Microsoft Azure offers more than just VMs and processor cores. Services like Virtual Machine Scale Sets and Avere vFXT for Azure work very well with our on-premises datacenter and hybrid cloud approach. Azure plugs seamlessly into the overall ecosystem of products in the media space,” says Smith. “Being able to take more projects, and not being tied to a geographical region, gives us a tremendous advantage. With Azure services, we can hire globally, let creatives work remotely, and scale our compute resources whenever we need.”
Find out more about Jellyfish Pictures on Twitter, LinkedIn, and YouTube.