When you think of Microsoft and gaming, you probably think first of Xbox. And usually when we’re talking about the Xbox business, our customer is the gamer.
But at Microsoft, there’s another customer that’s just as important to us – the game creator. Why? Because a strong and growing creator ecosystem is vital for our success.
By focusing on developer tools, partnerships and infrastructure, we ensure that Xbox can bring the very best games to our players.
Our partners are using Azure to help them realise their dreams, and one such partner is Hello Games. At Develop:Brighton I spoke with Iain Brown, server and multiplayer lead for Hello Games, about how they are using Azure tools and services to run the incredibly popular No Man’s Sky.
Harvey: You are instrumental as part of the team that created No Man’s Sky, so please introduce yourself and tell everyone what you do at Hello Games.
Iain: I’ve been making games for about 25 years now. For the last 15 years I’ve worked on online-specific things – both multiplayer and online services – and at Hello Games I’m mostly in charge of our servers, our interactions with those and the multiplayer back-end for No Man’s Sky.
H: For those who don’t know, let’s start with what No Man’s Sky is. How would you describe the core features of the game?
I: No Man’s Sky is a space exploration and survival game set in a procedurally-generated universe. There are around 18 quintillion different planets in the game that people can go explore, and while obviously no one can ever see all the content that’s out there, every planet is unique with individual “things” defined on each planet.
It’s those “things” on the planets that I’m involved in distributing to different people. The idea is that as you’re traversing the universe, even if you don’t find other people to play with, you may find the footsteps of the people that have travelled there before you. Maybe you stumble across a planet someone has visited before, where they’ve named the planet after their father or something like that. We want to share that between our different players.
H: Tell us a little bit more about how you use Azure to implement some of the cloud features inside the game.
I: We’re a very small team, and I’m generally the only person that works on the servers, so we didn’t want to be dealing with any kind of Ops for the game – we want our servers to manage our servers for us. So we’ve gone for a PaaS offering, which is Azure App Service. We don’t have to worry about the VMs, we don’t have to worry about the OS, we don’t have to worry about the frameworks; we just have to write the C# code, upload it to the cloud and it’ll scale it out for us. That was our emphasis – we just want to write the code and not have to worry about the Ops side of things.
We use a lot of Azure features, so we’re quite heavily embedded in the Azure ecosystem. For instance, we use Azure Key Vault as a place to store all of our secrets, certificates and things like that, so we don’t have to let people see things that they don’t need to see. Anything that needs to be protected lives in Azure Key Vault, and only the servers can see them.
We also use Azure Event Hubs because we wanted to make sure that if the database itself went down, our players didn’t lose anything. We don’t upload our players’ discoveries straight into a database – we upload them to Event Hubs, and we pull them out when the database is in a ready state.
This allows us to do maintenance of the database but it also splits up the load a little bit, so that you don’t have to instantaneously deal with the submissions as they come in.
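The buffering pattern Iain describes can be sketched in a few lines. This is a minimal in-memory illustration, not Hello Games' code: it is written in Python rather than the team's C#, the `deque` stands in for Azure Event Hubs, the list stands in for the database, and every name (`BufferedIngest`, `submit`, `drain`, `database_ready`) is hypothetical.

```python
from collections import deque


class BufferedIngest:
    """Hold submissions in a buffer until the database is ready."""

    def __init__(self):
        self.buffer = deque()        # stand-in for the event stream
        self.database = []           # stand-in for the document database
        self.database_ready = True   # flipped off during maintenance

    def submit(self, discovery):
        # Clients only ever write to the buffer, so a submission succeeds
        # even while the database is unavailable.
        self.buffer.append(discovery)

    def drain(self, batch_size=100):
        # Move up to batch_size buffered items into the database, oldest first.
        if not self.database_ready:
            return 0
        moved = 0
        while self.buffer and moved < batch_size:
            self.database.append(self.buffer.popleft())
            moved += 1
        return moved


ingest = BufferedIngest()
ingest.database_ready = False                      # database under maintenance
ingest.submit({"planet": "A", "name": "First"})
ingest.submit({"planet": "B", "name": "Second"})
held = ingest.drain()                              # nothing is written yet
ingest.database_ready = True                       # maintenance finished
written = ingest.drain()                           # buffered items are written in order
```

Because the writer pulls items in batches at its own pace, a spike in submissions grows the buffer instead of overloading the database, which is the load-splitting effect mentioned above.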
H: You’re obviously dealing with tons of data as well, so I assume you have data solutions for that?
I: Azure Cosmos DB is what we use for the actual database. It’s a NoSQL database that is automatically managed and can scale – we just say we want to run a number of requests on it, and Azure manages that and puts the data wherever it thinks is best.
We have about 3 terabytes of data in there now. It’s all in JSON documents, all text, so it’s a lot of data, but Cosmos DB deals with it fine and we can run queries across all of that data without any issues.
H: No Man’s Sky is over five years old now, which means you’ve been working with Azure for that period of time. I’m interested to know how your usage of Azure has evolved over the years.
I: The game has grown a lot in the five years since we launched. At the beginning basically everything on our servers was a discovery: the planets, the animals, the plants, the rocks. We’ve added quite a few more features that now go into the database, such as bases people can build and share, and settlements which we added fairly recently. We’ve also added missions to the game that people can work on.
So the game has grown, and the Cosmos DB database has grown along with it – we just spec it up when it’s needed. We’ve also migrated between hardware generations: since we launched there have been two iterations of new hardware within Azure. Each one has been cheaper, and we’ve simply redeployed our servers onto the new versions, where everything just works.
Also we launched on .NET Core 1; we’re now on .NET 5 and we’ve also moved to Linux. That was seamless – we redeployed to a Linux app service instead of a Windows one while using the same code base.
H: Could you tell us a bit about how you use Slack as well?
I: We hook up various things to Slack, though they’re internal rather than external-facing things. For example, I use Octopus Deploy for all of my servers. Whenever I deploy a server, it sends out a message to an Azure Function, which is a serverless piece of code. It’s a very simple webhook that writes to our Slack to say, hey, your server has started deploying, and whether it has passed or failed. It takes very few lines of code to write a Slack notification and host it within an Azure Function.
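To give a sense of how little code such a notification needs, here is a hedged sketch of the Slack half only. It assumes a Slack incoming webhook, which accepts a JSON body with a `text` field; the function names, the project name `discovery-server` and the status strings are hypothetical, the example is in Python rather than the team's C#, and the Azure Function hosting and the Octopus Deploy payload are left out.

```python
import json
import urllib.request


def build_slack_message(project, status):
    """Return the JSON body for a Slack incoming webhook."""
    return {"text": f"Deployment of {project}: {status}"}


def notify_slack(webhook_url, project, status):
    """POST the message to Slack and return the HTTP status code."""
    body = json.dumps(build_slack_message(project, status)).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,  # supplied by the caller; keep it out of source control
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Building the message needs no network access.
message = build_slack_message("discovery-server", "started")
```

In a real deployment the webhook URL is a secret, so it would sit alongside the other protected values rather than in the code.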
H: Going back, you famously had so many players at launch – an incredible success. That must have presented some technical challenges, so I’m interested to know how Azure helped you deal with that level of initial player interest that you had.
I: Before we launched, we’d done a lot of load testing and thought we were gonna be fine. We underestimated the player numbers – a huge number during the launch period – but also the nature of our load testing was such that while we were load testing both the CPU and the RAM, we weren’t load testing the number of connections being made to each machine. Unfortunately, we ran out of those.
What we had to do was quickly jump onto the Azure portal and scale up all our servers. So we went from less powerful machines to much more powerful ones that could support many more connections. We also broadened out and went across many more instances of the servers.
All of this was easily done in the portal, and we just had to click a few buttons to get it all up and running. I didn’t have to requisition a new hub or anything like that, it was just a case of going into the portal and changing some numbers until it worked.
The other thing that happened at launch was that Cosmos DB used to have a dev offering, which was unpartitioned and cheap, as well as a partitioned offering that was a lot more expensive. You’d use the partitioned one for your production database, while keeping your dev on the cheaper, unpartitioned offering.
For our testing of the PC servers, we’d set up a dev database as we didn’t want to pay for a production one in the months leading up to the game’s launch. Unfortunately we forgot to change it over before launch, a bit of a mistake, which meant we were writing records into a dev database that wasn’t big enough and wasn’t going to cope. We also couldn’t scale up, as at the time they were two distinct products rather than the single product it is now.
We got on the phone with the Cosmos DB team and they were great – we got in touch with the actual engineers who work on it as well as the product managers there, and they knew what to do. They wrote a script that would copy everything from the dev database into a new one they provisioned that was much bigger, while coping with all the numbers that were being thrown at it. At a certain point we were able to pull the plug on the dev database and have the new one take over, all without any server downtime.
H: One of the key objectives of any game is being able to keep your players happy and engaged. Are there examples from your work on No Man’s Sky that you can share of how the cloud helped drive player engagement?
I: We’ve recently launched a new mode within the game, which is a time-limited season mode. It’ll be out for a few weeks before leaving the game, and you’ll get unique rewards for playing it. We don’t want people hacking the game client to unlock this mode early or to see what’s coming up, so all of that data lives in the cloud. It lives in table storage within Azure, and at a given time it’ll be downloaded to the game clients as they need it.
This is all set up by the game designers. We’ve tried to remove the need for a coder to do this as much as possible, so the designer can set what time events start and finish, point at a file that describes the event, what the rewards are, that kind of thing. It’s very much about empowering our designers to do this, and to be honest we don’t have that many coders so it helps to reduce our dependence on them.
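A designer-editable event entry of this kind might look something like the sketch below. It assumes each event carries a start time, an end time and a pointer to a definition file, as described above; the field names, dates and file name are hypothetical, and Python is used for illustration rather than the team's C#.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TimedEvent:
    name: str
    starts_at: datetime
    ends_at: datetime
    definition_file: str  # file describing the event and its rewards

    def is_active(self, now):
        # Active from the start time up to, but not including, the end time.
        return self.starts_at <= now < self.ends_at


def visible_events(events, now):
    """Return only the events a client may download right now."""
    return [event for event in events if event.is_active(now)]


season = TimedEvent(
    name="example-season",
    starts_at=datetime(2022, 1, 10, tzinfo=timezone.utc),
    ends_at=datetime(2022, 2, 7, tzinfo=timezone.utc),
    definition_file="example-season.json",
)
before = visible_events([season], datetime(2022, 1, 9, tzinfo=timezone.utc))
during = visible_events([season], datetime(2022, 1, 20, tzinfo=timezone.utc))
```

Because the server performs the time check, a modified client asking early receives nothing, which is the point of keeping the data in the cloud.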
We’ve similarly run weekend missions, where we encourage players to come back on the weekend in order to receive unique rewards. They’ll play together, and they’ll do a task which has a target that everyone is collectively trying to complete. We also have a limit for how much each player can personally contribute to this goal, just so that one person can’t come in and do the entire goal by themselves.
All of this is done in Cosmos DB. We’ve got scripts that allow players to submit that they’ve completed their goal, which Cosmos DB collects. We can then run queries across this data to see how many of these things happened since the weekend mission launched. Again, this is all now driven by the designers. No coders are involved in running the weekend missions – it’s all set up so that it runs and automatically deploys itself.
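The collective target with a per-player limit can be illustrated as follows. This is a sketch of the rule only, held in memory: the class name, the numbers and the player IDs are hypothetical, it is in Python rather than the team's C#, and the real system stores submissions in Cosmos DB and queries across them.

```python
from collections import defaultdict


class CommunityMission:
    def __init__(self, target, per_player_cap):
        self.target = target
        self.per_player_cap = per_player_cap
        self.contributions = defaultdict(int)  # player_id -> counted amount

    def submit(self, player_id, amount):
        """Record a contribution and return how much of it was counted."""
        remaining = self.per_player_cap - self.contributions[player_id]
        counted = max(0, min(amount, remaining))
        self.contributions[player_id] += counted
        return counted

    def total(self):
        return sum(self.contributions.values())

    def is_complete(self):
        return self.total() >= self.target


mission = CommunityMission(target=100, per_player_cap=40)
first = mission.submit("player-1", 90)    # only 40 of the 90 is counted
second = mission.submit("player-1", 10)   # player-1 is already at the cap
mission.submit("player-2", 40)
mission.submit("player-3", 25)
```

With a cap of 40 and a target of 100, at least three players have to take part before the goal can be reached.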
H: You ended up using Azure PlayFab for matchmaking and Azure PlayFab Party for in-game communication. Why did you choose those solutions and can you explain in more detail how they were used in No Man’s Sky?
I: Cross-play was a big thing for us. We wanted to do cross-play for a while, but again we’re a very small team and there was no way we could write our own system. We just couldn’t write the servers, deploy the servers, manage the servers – it just wasn’t feasible for us.
Microsoft approached us and suggested taking a look at Azure PlayFab, and it had a lot of great features that we were keen to use. For example, all Xbox signed-in users on PlayFab are free, and that’s great as we always want to reduce our running costs. But there are also some brilliant features within PlayFab Party. It does really good speech-to-text, text-to-speech and real-time translation of foreign languages.
In fact, when we first got PlayFab running in our studio, people would gather around my desk to see this happening. Everyone was trying to trick PlayFab, seeing if they could break it in some way, and it was pretty good at actually detecting what people were saying and translating it into various different languages. I definitely recommend that people take a look at those features.
The matchmaking side of PlayFab was very easy to integrate because it’s basically the same as the Xbox Smart Match system. If you know how to use that, it’s very quick to get up and running on the PlayFab system, which then runs cross-platform. We were essentially able to take all the rules we had for Xbox from the Xbox portal and copy them into the PlayFab portal, then the matchmaking was done. All the rules just worked.
We have two levels of relationship when you’re playing a game in No Man’s Sky. One is when you are flying around the galaxy, seeing other people, who just drop in or drop out because of how physically near to you they are.
We also wanted a more long-term relationship for players, across a single session but also across multiple hours, for people you want to spend time with. We call that your Team, and the way we do that is by running two different PlayFab Party networks completely separately. We only run voice chat within a Team, as no one wants to hear 31 random players talking at them.
H: Deploying dual networks was ground-breaking, but I imagine it caused your networking costs to rise. I’m curious how that went and how Microsoft was able to respond.
I: Yeah, the costs did skyrocket a little bit! The way it worked was, if you’re finding people locally, you only create a PlayFab Party network when you matchmake with people. So most of the time, if you’re flying around the galaxy by yourself, you won’t be in a local network.
To make the Teams work, we needed a permanent network for every player in the game, so that their friends could join them through the game dashboard. The network had to be ready for people to join, even though, generally, there was no traffic on that network.
Unfortunately that did cost quite a lot, but we reached out to Microsoft as they had signed off on this design before we shipped it. They were very generous and gave us quite a nice discount, and very soon after that they announced a 90% decrease for all PlayFab Party network minutes which brought the costs down to a much more reasonable number.
H: With the decision to implement cross-play you were faced with the challenge of merging data from three separate databases into one common database. How did you go about achieving that?
I: When we launched, we made the decision to silo off the different game platforms into different databases for tracking player discoveries. There are various reasons behind this that made sense at the time, such as with displaying usernames, reporting content and that kind of thing, so it was easier to silo it all off.
When we launched cross-play, it made a lot more sense to actually have everyone sharing the same names for things. We had these three separate databases, each of which was about 2TB in size, and we needed to combine them into one single database.
We made the new database, and we did basically the same thing that the Cosmos DB team did for us during launch – I wrote a script that I ran in an Azure Function, and as Azure Functions are serverless, they can scale massively. We used the change feed of Cosmos DB to do this, as the change feed is basically a list of all the records in a Cosmos DB partition.
Because it’s per-partition, it can also scale massively. We have a lot of partitions in our database so we just put this code into Azure Functions and let it scale itself as much as it wanted. It went very, very wide and had lots of instances of this function running, pulling all the records out of each individual database and putting them into the new one.
There were obviously some conflicts. We had places where two or more people had named the same planet, so we had to pick one of those to be the winner. The way we resolved that was by giving the naming rights to the person who visited the planet first, and we had to check as we went whether conflicting records were older than the records being added to the new database.
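The "first visitor wins" rule can be sketched as a merge step that only overwrites a record when the incoming one is older. This is an illustration under assumptions, not the script Iain ran: it is in Python rather than C#, the field names (`planet_id`, `visited_at`, `name`), the integer timestamps and the source lists are hypothetical, and plain lists stand in for the per-platform databases and the change feed.

```python
def merge_record(merged, record):
    """Store record unless an older record for the same planet already exists."""
    key = record["planet_id"]
    existing = merged.get(key)
    if existing is None or record["visited_at"] < existing["visited_at"]:
        merged[key] = record
        return True
    return False


def merge_databases(*sources):
    """Merge every record from every source into one dictionary."""
    merged = {}
    for source in sources:
        for record in source:
            merge_record(merged, record)
    return merged


# Three hypothetical per-platform sources with a naming conflict on planet p1.
source_a = [{"planet_id": "p1", "name": "Late", "visited_at": 200}]
source_b = [{"planet_id": "p1", "name": "Early", "visited_at": 100}]
source_c = [{"planet_id": "p2", "name": "Other", "visited_at": 150}]

merged = merge_databases(source_a, source_b, source_c)
reordered = merge_databases(source_c, source_b, source_a)
```

Comparing timestamps on every write means the result does not depend on the order in which records arrive, provided the timestamps differ, which matters when many function instances are copying in parallel.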
It took several weeks because, as I say, there were several terabytes of data in each database, but there were no problems that we could find. We did a lot of testing on it before we launched it, and we had these Azure Functions running in the background even as players were still writing to the old databases. There was no downtime for this – we just set a point in time where we were going to move to the newly merged database and make the change.
It actually went surprisingly smoothly. It was a bit scary, but we’d done enough testing and we were happy. The scariest moment came a bit later on when we decided to delete the old databases. Before that we’d had them as a backup for the new database, but we were paying for these old databases and not using them, so they had to go. That was several months after the merge happened, but it was still a scary moment in time.
H: As we wrap up, what are the main take-aways that you want people to walk away with from this conversation?
I: I think the main point to take away from this is that you don’t need to invest massively into online features in order to have nice online features. It was generally just me writing these things – there was no Ops engineer, no database analyst – just me writing in mostly C#.
It takes some research to learn about all the different things Azure has to offer, but they’re all very simple to use and you can hook them up and create nice features without needing a massive team looking at it.
Azure for Gaming
Thanks again to Iain Brown for taking the time out of his busy schedule to share these great insights with us. Congratulations too on the huge success of No Man’s Sky, and I know I speak for all of us when I say we can’t wait to see what’s coming next from Hello Games.
The Azure Gaming team has a mission and that is to empower game creators to realise their dreams. One way we measure our success is in the revenue we generate for creators. We want to appeal to developers and publishers, of every size and capability, by helping you both creatively and financially.
For those that might not know, Microsoft Azure offers a broad and compelling set of cloud services for game creators. We can help you:
- Acquire, retain and monetise players.
- Build healthy, engaged and safer player communities that help you extend the life of your game for a better ROI.
- Use data and analytics to help you deeply understand your players’ behaviour and deliver content they’ll love.
- Include modern game features that gamers expect, such as notifications, leaderboards and more.
- Rely on an enterprise-grade infrastructure backbone to reach your players wherever they are, whatever genre of game you are making and whichever device or platform you publish on.
Azure is a modern agnostic development platform. When technology is at its best, it gets out of the way so you don’t have to worry about it. We want you to be free to focus on creating great games. That’s our approach.
- Find more game dev resources: https://aka.ms/cloudgaming/resources