Everything about MySpace boggles the mind: from its 130 million monthly active users to the 300,000 new users who sign up each day, and from the 8 billion friend relationships it manages to the 34 billion e-mail messages it stores while adding 41 million more each day. The site’s 1 petabyte of data is managed by 440 Microsoft® SQL Server® instances and resides on 3PAR® Utility Storage. When MySpace needed a message queuing and delivery solution to help ensure that data changes were correctly and atomically executed on all affected physical database instances, it created an internal solution, called Service Dispatcher, using the Service Broker feature of SQL Server 2005. Service Broker has helped MySpace ensure data integrity across its distributed infrastructure, resulting in a better user experience. Service Broker also helps MySpace developers roll out new services faster.
Since its launch in 2004, MySpace has become the world’s leading social portal for connecting people, content, and culture. MySpace empowers its global community to experience the Internet through a social lens by integrating personal profiles, photo sharing, professional and viral videos, blogs, mobile, instant messaging, and the world’s largest music community.
The numbers that frame MySpace’s success are astonishing. MySpace has more than 130 million monthly active users, with an additional 300,000 new users joining each day.
MySpace is such an attractive site that the company estimates that 40 percent of Internet users in the United States have a MySpace account, while in the United Kingdom it’s as common to have a MySpace account as it is to own a dog. What began in Santa Monica, California, has grown to include an international network of more than 30 local community sites throughout North America, Latin America, Europe, Asia, and Australia.
From an IT perspective, MySpace has been in a continual race to keep up with the demands of constant growth, as MySpace is one of the fastest growing Web sites of all time. The company has long depended upon the Microsoft® Application Platform, including Microsoft SQL Server® 2005 database software. MySpace uses the Microsoft Application Platform to support:
- 827 billion rows of data
- 8 billion friend relationships
- 27 billion comments
- 34.2 billion e-mails total
- 41 million new e-mails added per day
- 33 million video files
- 62,000 new videos uploaded per day
MySpace Data Services decided that the best way to handle the constant growth of its relational database stores, which now total more than 1 petabyte, was to scale horizontally and divide information across multiple instances of SQL Server. Functional separation and horizontal partitioning worked well for the company, but to help ensure data integrity while keeping up with peak loads of up to 4.4 million concurrent users, the company needed a solution that supported efficient asynchronous messaging between its 440 SQL Server instances and 1,000 databases.
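As a rough illustration of functional separation combined with horizontal partitioning, the sketch below maps a user's data to a database by functional area and by user ID range. The case study does not describe MySpace's actual partitioning scheme, so the range size, the area names, and the `locate` and `targets_for_page` helpers are all hypothetical.

```python
# Hypothetical sketch: the case study does not describe MySpace's real scheme.
from dataclasses import dataclass

USERS_PER_DATABASE = 1_000_000  # assumed range size, for illustration only


@dataclass(frozen=True)
class Target:
    functional_area: str  # e.g. "profiles" or "blogs" (functional separation)
    partition: int        # which horizontal slice of that area holds the user


def locate(functional_area: str, user_id: int) -> Target:
    """Map a user's data for one functional area to a database partition."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return Target(functional_area, user_id // USERS_PER_DATABASE)


def targets_for_page(user_id: int, friend_ids: list[int]) -> set[Target]:
    """Collect every partition a profile page would need to read from."""
    needed = {locate("profiles", user_id), locate("blogs", user_id)}
    needed.update(locate("profiles", friend) for friend in friend_ids)
    return needed
```

Even in this toy model, a single page view for a user with friends in other ID ranges touches several databases, which is why a change made in one of them has to reach the others reliably.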
“Growth at MySpace has been so spectacular for so long that we initially created our horizontal scaling solution without implementing an effective messaging infrastructure linking the databases,” says Christa Stelzmuller, Chief Data Architect at MySpace. “The result was a database administrator’s nightmare, and sometimes a user experience that didn’t meet our expectations because of data integrity issues that emerged as transactions spanning multiple databases were sometimes only partially completed.”
Data integrity across all of its databases is required because each home or profile page view on MySpace is created on demand. Because one user’s page is linked to those of friends, dynamic page creation requires pulling data together from multiple databases. Without an efficient messaging solution to ensure transactions were atomic across all relevant databases, updates and changes that a user entered in one database might not be propagated to the tables of other databases. The result could be error messages, incomplete pages, and unhappy users.
The company needed a better solution for passing messages between databases to help ensure data integrity.
The MySpace Data Services team created a solution to act as a coordination point for message delivery across its distributed deployment of databases. The MySpace solution, called Service Dispatcher, works on a broadcast model in which the Service Dispatcher ensures that a change originating in one database is delivered to the specified target “group” relevant to the transaction.
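The broadcast model described above can be pictured with a small in-memory sketch. This is not MySpace's Service Dispatcher code: the `Dispatcher` class, its method names, and the database and group names are invented for illustration, and real delivery would go through Service Broker queues rather than Python deques.

```python
from collections import defaultdict, deque


class Dispatcher:
    """Hypothetical stand-in for a coordination point; not MySpace's code."""

    def __init__(self):
        self.groups = defaultdict(set)     # group name -> database names
        self.inboxes = defaultdict(deque)  # database name -> pending messages

    def register(self, group, database):
        """Add a database to a named target group."""
        self.groups[group].add(database)

    def broadcast(self, origin, group, change):
        """Queue `change` for every database in `group` except the origin."""
        targets = sorted(self.groups[group] - {origin})
        for database in targets:
            self.inboxes[database].append((origin, change))
        return targets


dispatcher = Dispatcher()
for name in ("db01", "db02", "db03"):
    dispatcher.register("accounts", name)

# A change made on db01 is queued for the other members of the group.
notified = dispatcher.broadcast("db01", "accounts", {"user_id": 42, "active": False})
```

The originating database only names a group; it does not need to know which databases belong to it.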
At the core of Service Dispatcher is the Service Broker feature of SQL Server 2005, which provides a message-based communication platform that enables independent application components to perform as a functioning whole. Service Broker includes infrastructure for asynchronous programming that can be used for distributed applications across multiple databases.
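To show why queued, asynchronous messaging helps with partially completed changes, here is a toy model of reliable delivery: the sender returns as soon as a message is queued, and a message leaves the queue only after the target has applied it. This is a general sketch of the idea, not the Service Broker API; `ReliableQueue`, `flaky_target`, and the message strings are assumptions made for the example.

```python
from collections import deque


class ReliableQueue:
    """Toy model of queued, retried delivery; not the Service Broker API."""

    def __init__(self, apply_change):
        self.pending = deque()
        self.apply_change = apply_change  # callable that may raise on failure

    def send(self, message):
        # The sender returns immediately; delivery happens later.
        self.pending.append(message)

    def deliver(self):
        """Apply queued messages in order; stop at the first failure."""
        delivered = 0
        while self.pending:
            try:
                self.apply_change(self.pending[0])
            except ConnectionError:
                break  # message stays queued and is retried on the next call
            self.pending.popleft()
            delivered += 1
        return delivered


applied = []
failures = [True]  # the target is unreachable exactly once


def flaky_target(message):
    if failures:
        failures.pop()
        raise ConnectionError("target database unavailable")
    applied.append(message)


queue = ReliableQueue(flaky_target)
queue.send("save blog entry 1")
queue.send("link entry 1 to profile")
first_attempt = queue.deliver()   # target is down, nothing is lost
second_attempt = queue.deliver()  # both messages are applied, in order
```

Because a failed message stays at the head of the queue, a temporary outage delays the change instead of dropping half of it.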
MySpace was an early adopter of SQL Server 2005, and is in the process of upgrading its database infrastructure to SQL Server 2008.
The company uses a multi-tier architecture that includes:
- Web Server Tier. Customers access MySpace using a browser. The Web Server tier uses Windows Server® 2003 Internet Information Services (IIS) Web server technology. This tier is hosted as a Web farm with 3,000 Web servers and 800 cache servers, all running the Windows Server 2003 SP2 operating system.
- Application Tier. MySpace coordinates the flow of data to its users through an application tier created internally by MySpace developers using Microsoft Visual Studio® 2005, the Microsoft .NET Framework 2, and SQL CLR, which enables developers to take advantage of the common language runtime from within SQL Server. Applications include a transaction manager, service layer, pre-populator, and cache. Applications are hosted on servers running Windows Server 2003 SP2.
- Data Tier. MySpace information, which totals more than 1 petabyte, is stored on 1,100 disks of 3PAR Utility Storage and 440 instances of SQL Server database software holding 1,000 databases. The databases are being upgraded to SQL Server 2008 from SQL Server 2005 SP2. The databases are hosted on HP ProLiant DL585 computers running Windows Server 2003 Enterprise Edition SP2. Each computer has four dual-core AMD processors and 64 gigabytes (GB) of RAM. The data tier also includes MySpace’s internally developed Service Dispatcher application, which runs on 30 computers, each running SQL Server 2005 and using the Service Broker feature. All 440 database computers run an associated Dispatcher Client application to facilitate communication with the 30 Service Dispatcher servers. The 3PAR solution, which includes 3PAR InServ® T800, S800, and S400 Storage Servers, also uses its thin snapshots to host backup copies of the SQL Server data. 3PAR is a Microsoft Gold Certified Partner.
Service Broker has enabled MySpace to perform foreign key management across its 440 database servers, activating and deactivating accounts for its millions of users, with one-touch asynchronous efficiency. MySpace also uses Service Broker administratively to distribute new stored procedures and other updates across all 440 database servers through the Service Dispatcher infrastructure.
MySpace gained the enhanced data integrity and better user experience it had sought by using the Service Broker feature of SQL Server 2005 in creating its Service Dispatcher application to support its distributed database infrastructure. The company found Service Broker to have enterprise-class performance, and its internal developers are enjoying faster development by relying on Service Broker to handle asynchronous messaging between the database instances.
Enhanced Data Integrity
MySpace has achieved its goal of enhancing data integrity by using the Service Broker feature of SQL Server to ensure database changes are correctly propagated across its distributed environment of some 440 database computers. Service Broker provides queuing and reliable messaging across multiple instances of SQL Server, which is exactly what MySpace needed for its Service Dispatcher solution.
“Service Broker has enabled us to reduce data errors across our distributed databases by orders of magnitude,” says Stelzmuller. “This is significant because data errors used to be the greatest problem our group had to deal with.”
Solving data integrity issues has been liberating for MySpace database administrators, who previously spent much of their time tracking down and resolving them.
“We estimate that our database administrators have been able to reduce the time spent on tracking down and resolving data errors by at least 80 percent,” says Stelzmuller. “This means we can spend a lot less time on operational issues and a lot more time creating solutions to carry us into the future.”
Better User Experience
When dealing with billions of rows of data and integrating this information into the interwoven social networks and rich feature sets that are the hallmark of MySpace, data integrity is essential to support the user experience. Before MySpace deployed its Service Dispatcher solution powered by SQL Server Service Broker, data errors could mean missing data and frustrating error messages for users.
The problems came when a change of information on one database server wasn’t propagated to all of the other servers that had relational dependencies. A user might add information to a blog that failed to show up on their profile page. “With our old system if half of the blog data was saved to the blogs database and half the data was saved to the user database, and if only one of those saves succeeded, the next time the user went back to their page they might lose their blog entry entirely or only find half of it,” Stelzmuller says. “This caused a lot of agony for our users. They would say: ‘This is the diary of my life.’”
Similar problems could occur as MySpace Data Services battled the endless tide of spammers. When removing spammers from one database, data remnants could remain on other databases, causing users to see one big red X after another. In the case of spammers, the lost names weren’t real friends. But users with no idea of who had been deleted could get anxious about what was happening with their social network.
“With Service Broker we have the ability to ensure the integrity of our data, which greatly enhances the user experience,” Stelzmuller says.
The MySpace IT team was excited to hear about Service Broker when it was introduced as part of SQL Server 2005. The need for an asynchronous distributed messaging solution was so pressing that the team was already building one of its own.
The initial enthusiasm about using the messaging technology already built into the database was tempered by the huge demands the team knew its infrastructure would place on any messaging solution. Before deployment, the team needed to verify that Service Broker was indeed enterprise ready. Members of the team travelled to the Microsoft SQL Server Lab in Redmond, Washington to test Service Broker performance.
“We didn’t want to start down the road of using Service Broker unless we could demonstrate that it could handle the levels of messages that we needed to support our millions of users across 440 database servers,” says Stelzmuller. “When we went to the lab we brought our own workloads to ensure the quality of the testing. We needed to see if Service Broker could handle loads of 4,000 messages per second. Our testing found it could handle more than 18,000 messages a second. We were delighted that we could build our solution using Service Broker, rather than creating a custom solution on our own.”
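The headroom implied by those two figures is simple arithmetic, worked out below. The only inputs are the 4,000 and 18,000 messages-per-second numbers quoted above. The daily total assumes the measured rate could be sustained around the clock, which the case study does not claim.

```python
required_per_second = 4_000    # load MySpace needed to support
measured_per_second = 18_000   # lower bound reported from the lab test

# Ratio of measured capacity to required load.
headroom = measured_per_second / required_per_second

# Assumption for illustration: the measured rate held for a full day.
seconds_per_day = 24 * 60 * 60
messages_per_day = measured_per_second * seconds_per_day

print(f"headroom: at least {headroom:.1f}x the required load")
print(f"about {messages_per_day:,} messages per day at the measured rate")
```

Since the test reported "more than" 18,000 messages per second, the 4.5x figure is a floor rather than an exact measurement.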
As soon as MySpace had validated Service Broker’s enterprise-grade performance in the lab, the company’s IT group began integrating it into the Service Dispatcher solution it was already building. Service Broker helps developers compose applications from independent, self-contained components called services. Applications that require the functionality exposed in these services use messages to interact with them, and those messages can be exchanged between SQL Server instances. Messaging between instances was an essential element of MySpace’s distributed database infrastructure, and getting it from Service Broker significantly reduced application development time.
“The ability to use Service Broker rather than creating our own messaging solution cut our development time for Service Dispatcher by half,” says Stelzmuller. “It still took some months to create our Service Dispatcher solution, but now that we have our solution up and running we’ve slashed even more development time off by offering new services that use Service Dispatcher as a base component.”
Prior to the Service Dispatcher solution, it could take several days to build and test a new service offering. “Now we can bring up a new service based on Service Broker in about 30 minutes because our developers don’t have to map out database targets or worry about complex scaling problems like conversation management,” says Stelzmuller. “They just write to our interface.”
The combination of easy programming and the ability to ensure data integrity across the distributed solution is encouraging developers to create new features and service offerings that might not have been considered earlier. “Before Service Broker and our Service Dispatcher, there was sometimes a reticence about creating new data-intensive services,” says Stelzmuller. “Now we know we can do those things, and our developers know they have a simple, quick, and uncomplicated set of tools to work with when dealing with our distributed database infrastructure.”
In summary, MySpace found that the Service Broker feature of SQL Server 2005 provides the message queuing and asynchronous message delivery it needed to create its Service Dispatcher application to support its distributed database infrastructure of some 440 SQL Server instances holding more than 1 petabyte of information.
Microsoft Server Product Portfolio
For more information about the Microsoft server product portfolio, go to: www.microsoft.com/servers/default.mspx
Microsoft SQL Server 2005
Microsoft SQL Server 2005 is comprehensive, integrated data management and analysis software that enables organizations to reliably manage mission-critical information and confidently run today’s increasingly complex business applications. By providing high availability, security enhancements, and embedded reporting and data analysis tools, SQL Server 2005 helps companies gain greater insight from their business information and achieve faster results for a competitive advantage. And, because it’s part of the Microsoft server product portfolio, SQL Server 2005 is designed to integrate seamlessly with your other server infrastructure investments.
For more information about SQL Server 2005, go to: www.microsoft.com/sqlserver
For More Information
For more information about Microsoft products and services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Information Centre at (877) 568-2495. Customers who are deaf or hard-of-hearing can reach Microsoft text telephone (TTY/TDD) services at (800) 892-5234 in the United States or (905) 568-9641 in Canada. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information using the World Wide Web, go to: www.microsoft.com
For more information about 3PAR products and services, call (510) 413-5999 or visit the Web site at: www.3par.com
For more information about MySpace products and services, visit the Web site at: www.myspace.com
This case study is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.
Document published June 2009