Introducing Nick Wilson – In his own words I'm CTO at Break Media, an ad network and a holding company of a number of Web sites. Our flagship site is Break.com, but our total network comprises 50 sites.
Break Media's entire purpose is to own the 18-to-35 male demographic on the Web; we want them to come to Break. Principally, the success of Break is due to our ability to provide cool, entertaining videos and offer a place where people can join a community of like-minded individuals.
We get more than 18 million unique visitors a month on Break.com alone, and we stream about 10–15 million videos a day, depending on the day of the week. Hitwise.com reports that we are, by far, the biggest of the independent video sites as ranked by market share. Obviously, YouTube is at the top, along with some other popular sites. But we're the largest that's not owned by a large media conglomerate.
Tell us a little about yourself. How did you get into technology? I'm originally from the UK, and I got started by writing computer games. I wrote some very popular games back in the '80s. And as a note of trivia, I was one of the first people to implement the Microsoft mouse. I made the software driver for it, and I remember demoing it to Bill Gates.
I've been involved in technology for a long time, mostly on the entertainment and content sides. I spent many years in corporate IT, running IT shops and supporting production environments. When I transitioned into Web video in late 2005 and joined Break, I was one of the company's first employees.
I like to solve engineering problems, but over the years I've become increasingly interested in building teams and delivering projects rather than personally geeking out, although I still code from time to time.
You mentioned a business goal of owning the 18–35 male audience. Talk about the technology behind that. That particular demographic is very fickle. If they have to wait for anything—if the site is down or if anything takes more than a tenth of a second to happen—they get bored and go somewhere else.
The immediacy and fluidity of the site are critically important to us. When you play video, you have to deliver a rich experience with no glitches. You need a back end and a production environment that can really support that. To meet those needs, we've chosen the Windows platform and the Microsoft .NET Framework.
People have the impression that anything having to do with Web 2.0 is built on LAMP, but you built on Windows. Why? We had a few advantages when we set out to create Break because we were building it from scratch. There really weren't any video sites made from PHP that we could just get for free or even buy and modify—there still aren't.
So we looked for a number of things when choosing a foundation to build on. We looked for a framework that could support a very large content-distribution system, knowing that we would have to build a full content-management system. This is an enterprise system, even though the user sees a Web front end. We have statistics, reporting, ad serving, templating, publishing, transcoding, and a whole bunch of other stuff, but the user sees just one thing: the results.
If we had gone with a simple user-facing Web application, we probably could have gotten away with LAMP, but we needed a lot more than that, and Windows and the .NET Framework provided a huge amount of enterprise-class functionality for our developers. Microsoft provides a framework that we can use and trust in production.
From my experience, LAMP can be pretty reliable out of the box, but as soon as you start tinkering with it, which you invariably have to do, its reliability plummets very quickly.
It's interesting that you find that Microsoft and Windows provide a more complete, all-inclusive platform than Linux does. With the Windows platform, .NET Framework, IIS, and SQL Server, we have a real development and production environment that provides a huge amount of functionality out of the box, and you just don't find that with the LAMP stack.
We can put business logic in stored procedures in the database. There isn't really an open source database that supports anything like the functionality that SQL Server 2005 comes with. SQL Server has complex replication, stored procedures, log shipping, and all kinds of other advanced functionality that's all built in. A lot of our application is database driven, which has been huge for us.
Microsoft also provides C# for the middle tier. That's a real programming language with a tremendous library that comes as part of the .NET Framework. Microsoft also provides a road map, so we know where they are going with AJAX and other rich Web technology. Because of that road map, we also know what is coming, what we need to build ourselves, and how to build our technology to integrate with Microsoft's new technology. You don't get that kind of clarity with the LAMP stack.
Are there specific things in Windows that provide value to you? As far as Windows Server is concerned, we built an asynchronous processing framework that handles the back end of our site. To do that, we use MSMQ and the .NET Framework, both of which are built into Windows. Things like that do exist with LAMP, but they're nowhere near as well packaged and as easy to use as what comes with Windows.
We do asynchronous processing, queuing, and prioritization services, and we depend on XML for passing messages through different layers. When you install Windows Server and start IIS, you get a foundation that makes it productive to do this kind of development. Everything that is included on the Windows platform is built for scale, performance, and reliability.
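As a rough illustration of the pattern described here (prioritized queuing with XML messages passed between layers), the following Python sketch uses only the standard library. It is not Break's code and it does not use MSMQ; the priority levels, the element names, and the `make_message`, `enqueue`, and `worker` helpers are all hypothetical.

```python
import itertools
import queue
import xml.etree.ElementTree as ET

# Hypothetical priority levels; a lower number is served first.
HIGH, LOW = 0, 10
STOP = float("inf")  # priority used for the shutdown sentinel, served last

_sequence = itertools.count()  # tie-breaker keeps FIFO order within a priority


def make_message(job_type: str, video_id: str) -> str:
    """Serialize a job as a small XML message (element names are illustrative)."""
    root = ET.Element("job", {"type": job_type})
    ET.SubElement(root, "videoId").text = video_id
    return ET.tostring(root, encoding="unicode")


def enqueue(q: queue.PriorityQueue, priority: float, xml_message) -> None:
    """Add a message; the sequence number means payloads are never compared."""
    q.put((priority, next(_sequence), xml_message))


def worker(q: queue.PriorityQueue, results: list) -> None:
    """Consume messages in priority order until the None sentinel arrives."""
    while True:
        _, _, xml_message = q.get()
        if xml_message is None:
            break
        job = ET.fromstring(xml_message)
        results.append((job.get("type"), job.findtext("videoId")))
```

In a real deployment the worker would run in its own thread or process and the queue would be durable; here the producer simply enqueues jobs, appends `enqueue(q, STOP, None)`, and calls `worker(q, results)` to drain them in priority order.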
In the LAMP world, you're constantly installing one little piece and configuring it, and then installing some other little piece and configuring that one. LAMP isn't a comprehensive integrated system. For example, if you want an XML messaging system, you have to download it, install it, configure it, experiment with it, and make sure it's compatible with your versions of PHP and Apache. On Windows, Microsoft has done all the integration work for you. Everything that comes as part of Windows Server is guaranteed to work together, so you can just focus on building your application.
Is there anything in the new Windows Server 2008 that's caught your attention? With Windows, you can have a cache that spans multiple machines. If I have a load-balanced cluster of 100 machines and a request comes in, the caching figures out which machine has the correct response cached and routes the request to that box. That's a very nice feature for scalability. The open source Squid project is supposed to do that, but it doesn't actually work for most applications.
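One common way to get this kind of routing is to hash the request URL so that the same URL always maps to the same machine. The Python sketch below shows that general idea only; it is an assumption for illustration, not a description of how the Windows feature works internally, and `pick_cache_node` and the node names are hypothetical.

```python
import hashlib
from typing import List


def pick_cache_node(url: str, nodes: List[str]) -> str:
    """Map a URL to one node deterministically so repeat requests reach the same cache."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer and wrap it onto the node list.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]
```

A plain modulo scheme like this reshuffles most URLs whenever the number of machines changes; consistent hashing is the usual refinement when nodes are added or removed often.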
I know that the IT group and the people who do the application configuration at Break are very excited about some new Web gardening and process controls that you get in IIS 7. They're also looking forward to being able to prioritize threads and dynamically allocate RAM and CPU, which means that we'll be able to do more with fewer boxes. I predict that when we move everything to Windows Server 2008, we're actually going to free up some hardware or have more spare capacity.
But what about the notion that with LAMP, you can see the source code, and you can get in and make changes if you need to? My expectation for a Web server is that, if it's properly engineered, there should be no reason for me to get in and muck with the source code. The server should just run my applications, deal with my load balancing, and handle caching. What the Web server is doing internally is probably very complicated, but looking at that level of the stack is very boring for me; I don't care about how those details are implemented. More importantly, I shouldn't have to care. In this day and age, if I have to get in and tinker with the internal code of a Web server, I have to wonder how complete the product is.
Keep in mind, Break has a lot of Web sites, and some of them are built on LAMP, so I have a foot in both camps. On the LAMP side, we use an open source caching proxy called Squid, and we actually have meetings about getting in there and making changes to Squid for various needs. For Break.com, which is built on Windows, the meetings are entirely functionality focused, so we don't even talk about the middle tier or what we're running it on. It never comes up on Microsoft-based Web sites. For those sites, we discuss how we want them to look and work, and how they're architected. Those meetings are much more focused on business and design issues and not on, "How can we trick the system into doing what we want?"
You have both Microsoft and LAMP, and you're convinced that the Microsoft stack is more productive and lets you focus on higher-level functionality. Here's an example: I'm often at some gathering and I'll invariably come under attack from someone who's pro-LAMP and anti-Microsoft. I tell that person about the RSS feeds on Break.com, which we produce using the RSS feed components provided with the .NET Framework.
To produce an RSS feed, we only need two lines of code in the page. Each RSS feed is generated by a stored procedure in the database, and those two lines of code deal with things like cache persistence and a couple of other things. There's no logic there other than calling the stored procedure. The stored procedure is basically a SELECT statement, and the .NET Framework does all the formatting.
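The division of labor described here (the database returns rows, and a generic component formats them as RSS) can be sketched in Python. This is an illustration under stated assumptions, not the .NET components or Break's stored procedure: SQLite has no stored procedures, so the hypothetical `latest_videos` function stands in for one, and the `videos` table and its columns are invented for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def latest_videos(conn: sqlite3.Connection, limit: int = 10) -> list:
    """Stand-in for the stored procedure: essentially a single SELECT."""
    return conn.execute(
        "SELECT title, url FROM videos ORDER BY published DESC LIMIT ?",
        (limit,),
    ).fetchall()


def rows_to_rss(feed_title: str, feed_link: str, rows) -> str:
    """Format (title, url) rows as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_title
    for title, url in rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
    return ET.tostring(rss, encoding="unicode")
```

With the formatter written once, each new feed needs only a different query, which is the point being made about keeping the page code down to a call into the database.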
To produce an RSS feed with LAMP is hell. For all but the simplest feeds you have to code everything from scratch, and if you want to change it, you have to duplicate, cut, and paste.
The fact that SQL Server has real stored procedures and full T-SQL capability is hugely important. A lot of people coming to SQL from LAMP haven't worked with a database with anything near the capabilities of SQL Server, and they have no idea just how good it can be.
We brought a developer to Break whose PHP and MySQL skills are very, very strong. A couple of weeks ago, he asked me if he could transition to the .NET group because he'd seen what SQL Server could do, and he really wanted to write stored procedures and gain some experience with that technology. He thought the server could do really cool things—things he couldn't do in MySQL. You'll hear people say that MySQL now has stored procedures, but they're really not defining it the same way SQL Server does. It's more like running a job on demand.
Microsoft provides a platform that lets me put most of my logic in the database. If I want to change the rules for something, such as which videos show up on a certain page, and I don't have a CMS function for that, then I just change the stored procedure and document it in change control. By doing that, I make a systemwide change in one place, very easily and in a very maintainable way.
That's the thing that I don't think a lot of people on the LAMP stack realize. You can compare individual technologies (Windows versus Linux, Apache versus IIS, PHP versus ASP.NET, and MySQL versus SQL Server), and I think you'll find that Microsoft technologies compare very well individually. But that misses the big picture: Microsoft has created a platform that simply lets me architect and build applications in a way that I can't on LAMP. I use a different paradigm on the Microsoft stack, and it's a very productive and powerful paradigm.
I sometimes hear people claim that Windows "blue screens" all the time and that it's not reliable; yet, you said if your Web site isn't reliable, your audience goes elsewhere. Talk about your experience with the reliability of the platform. For us, reliability is not just important to our audience, but also to the advertisers we make promises to. We sell very high-value campaigns, and we guarantee impressions and uniques. We don't simply think, "Oh we'll upset a user." We have to think more like, "Oh, we won't deliver on a campaign and that advertiser won't advertise with us again." The core of our entire business literally depends on availability. The main way the Microsoft platform delivers that availability for us is through the very high stability of IIS.
With IIS, you figure out how to divide the load and set up some tuning, and then it just runs and runs. It's probably been 6–9 months since we've taken a major look at how we've tuned the system, and even with our traffic growth, there's nothing to indicate that we need to go back and make any adjustments. Once you install and configure it, it works.
And the way that Microsoft rolls out patches is very valuable to us. It's just so nice that I don't have to sift through patches, wonder which ones I need to install, and guess what they're going to break. We install Microsoft's patches on our QA servers and do a one-day QA regression test, and then we put the patches into production. I have to admit that I've gotten somewhat blasé about testing the Microsoft patches because they just don't break anything, and I have confidence in Microsoft's extensive patch testing.
I talk to people in the LAMP community who are quite proud that they release patches very quickly, sometimes the same day an issue is discovered. You're saying, for your business, Microsoft does a better job? In my career, I have not once had to roll back a server patch from Microsoft on a Web site. Never. Patches have always worked and I think that's pretty typical for most companies that use Windows. Years ago, Microsoft did not have a good reputation for extensively testing its server patches, but that was fixed as part of a quality initiative probably five years ago.
Since that quality initiative, I have the ultimate confidence in Microsoft's patches. We test patches on our QA servers first because I just can't conceive of putting something straight into production, but there's never been an issue.
On the LAMP side, we have hacked versions of most of the products. Let's take Squid, for example, which we have to use across the board in our LAMP environment because PHP can't support the load of generating pages dynamically. There was a bug in the latest stable version of Squid, so we had to hire the person who wrote it to fix the bug and add a piece of functionality. Now, whenever there's an update for Squid, we either have to pay for that person to roll our custom functionality forward, or we have to put someone with C skills on it, and we generally have very little need for C skills.
Squid is a port-80 proxy, so it's on the outer edge of our network. When there's a patch available, we have to do a lot of analysis to make sure applying it is worth the effort and risk. Sometimes we don't need a certain patch because the problem it addresses only happens on a different distribution of Linux, or otherwise doesn't affect us. This is so time consuming to stay on top of, and we just don't have these kinds of problems with Windows. On Windows, if there's a patch, you just install the patch. With Squid, we could never do that.
With LAMP, sure, you have access to the source, and that sounds like a great advantage. But if you change the source, you instantly have your own personal branch of that project. Getting your changes into the main branch is very difficult. The open source community is very reputation-based, and if you show up one day with a patch and no one knows who you are, it's unlikely they're just going to take your patch and roll it in. We were successful at getting a patch into MediaWiki, but, by and large, we're stuck maintaining a custom Squid and a custom Apache, and we're thinking about a custom memcached. We're not making our own custom forks for fun, but because we have to in order to get those products to meet our needs. It's definitely not the situation I'd prefer.
Microsoft has invested in making Windows Server 2008 a good platform for IIS, MySQL, and PHP. Would you ever consider running PHP workloads on Windows? It's interesting, because I know that Microsoft is working on a number of different licensing and functionality models for Windows Server, including one that's very low cost for Web farms. That initiative, plus the ability to have a stack that can run different middle-tier applications, is definitely of interest. If I could just start up PHP as a service on IIS, import my code and have it work, I'd definitely be interested.
There's an impression that the cost of Microsoft software makes it prohibitive for Web 2.0 businesses, and that "free" software has an advantage. The real costs in business, especially a Web business like Break, are people, bandwidth, and power; software licenses are not a significant factor. A good DBA in L.A. makes well in excess of $100,000, so I'll happily spend $18,000 on a copy of Enterprise SQL Server so that I can fully utilize my DBA's high-value skills.
Is it easy to recruit people with Windows or LAMP skills? Both the PHP developers I recruited were brought in from other states. .NET people are readily available here in L.A. There are some really big Microsoft shops here, like MySpace, Countrywide, and Spot Runner. So many companies are deploying Microsoft .NET that I have a great talent pool to choose from.
You have to look at the skill sets of people who are using .NET compared to those of people who are using things like PHP. A lot of people who come from the PHP side have basically done PHP as an adjunct to front end HTML: they use a very small amount of PHP just to capture a form and write something to a database. This isn't really coding, but more like scripting. We need real developer skills, so we have to sift through the pool of PHP talent to find people with solid programming skills.
In contrast, if someone's written a line of code in C#, chances are he or she has done serious development. A C# developer is more likely to understand object-oriented development and have more computer science knowledge.
Is it safe to say that, given a choice, you prefer Windows to Linux as a platform? I think the reason we like Windows is that we don't really have to think about it. I lose count of the number of times we have had to restart Apache and Squid because something goes wrong. On the Windows side, things are just integrated better.
Caching is a great example; on Windows, the caching and Web server are integrated and are able to behave more intelligently. If your applications fail on LAMP with Squid, then Squid will cache the failures and error pages, because Squid and Apache don't have the same level of integration. Even if you resolve the problem and get everything working, users will still see an error because the error page is still being served from the cache. Then you have to go in, restart, and flush all of Squid, which means the cache has to be rebuilt. While the cache is rebuilding, the site's performance is very, very slow.
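The failure mode described here, where an error page is stored and keeps being served after the application recovers, can be avoided in principle by caching only successful responses. The Python sketch below illustrates that idea in miniature; it is not how Squid or IIS is implemented, and `PageCache` and the `origin` callable are hypothetical names.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, str]  # (HTTP status code, body)


class PageCache:
    """Tiny in-memory cache in front of an origin that renders pages."""

    def __init__(self, origin: Callable[[str], Response]) -> None:
        self._origin = origin
        self._store: Dict[str, Response] = {}

    def get(self, path: str) -> Response:
        # Serve from the cache when a good copy is already stored.
        if path in self._store:
            return self._store[path]
        status, body = self._origin(path)
        # Store only 2xx responses, so a temporary failure is never replayed.
        if 200 <= status < 300:
            self._store[path] = (status, body)
        return status, body
```

Because errors are never stored, the first request after the application recovers fetches a fresh page, and nothing has to be flushed or rebuilt.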
On the LAMP side, I need to have a systems administrator who's able to do very low-level work in the operating system and other software that we use to host sites. People with that level of expertise are very expensive.
On the Windows side, I have a great IT group, and with Windows you just don't have to tinker with things deep down in the OS. You also get all the great documentation and troubleshooting information from Microsoft. Overall, it's just much easier to work with.
I've gotten the impression that the LAMP community is superior at providing help and information. You're saying that hasn't been your experience? Oh, I can Google any error message in LAMP and I will instantly find a thousand people trying to provide help. But that information is sometimes suspect, inaccurate, or for a different version of the project. The person offering that help might be running a different Linux distro, so the steps he or she suggests don't exactly work on your distro. The file paths, command lines, or other things are often different, so you have to do a lot of translation for your specific environment.
If I run into a problem with Microsoft software, there'll usually be a knowledge base or technical article on how to resolve it. It will be authoritative advice from Microsoft, which is a lot more useful than some forum post in which someone says, "I did this and it seems to work." Microsoft provides information that tells you the right way to fix a problem.