When Is a Browser Not a Browser?
By Janie Chang, Writer, Microsoft Research
Once upon a time, Web sites were the online equivalent of data sheets. Now users go to the Web to run business apps, do their banking, buy products, socialize, receive a daily news fix, or play interactive games. Nor are Web pages simple HTML anymore; a page can be composed of dynamic content from third-party ad sites, newsfeeds, or messaging sites.
In addition, the software industry has been moving steadily toward a software-as-a-service paradigm. As a result, the browser has taken on the role of application platform, and the increasing value of what is available through the Internet is driving development of Web applications that push the limits of browser capability. AJAX, postMessage, and other recent innovations to the browser platform empower Web developers to build richer Web applications. But Web applications have yet to achieve the richness and robustness of desktop applications: misbehaving content, such as an ad, can interfere with other sites being viewed by the user, and today’s Web applications have limited access to local system resources such as Webcams, speakers, and printers.
No wonder, then, that Helen J. Wang, senior researcher in the Systems and Networking group at Microsoft Research Redmond, is working on ways to evolve the browser into an operating system that supports an increasingly sophisticated Web environment.
In August 2009, the Systems and Networking group will be presenting a paper on this topic during the USENIX Security Symposium, the premier academic conference on system security. The Multi-Principal OS Construction of the Gazelle Web Browser describes the design and construction of a browser that is actually a multi-principal operating system. The paper is jointly authored by Wang; intern Chris Grier of the University of Illinois at Urbana-Champaign; intern Alexander Moshchuk from the University of Washington; Samuel T. King of the University of Illinois at Urbana-Champaign; Piali Choudhury, Microsoft Research Redmond senior development lead; and Herman Venter, principal software-development engineer at Microsoft Research Redmond.
In browser parlance, a principal generally equates to a Web site. Given that there is usually just one user at a time on a PC, the sharing of resources is actually across applications from different origins; in the case of Web pages, each page could consist of content from different principals, each staking out a share of computing resources. The browser is therefore the natural choice of application platform for managing principals and resource requests.
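To make the idea of a principal concrete, here is a minimal Python sketch. It assumes that a principal is identified by the same-origin triple of scheme, host, and port; the helper names `principal_of` and `group_by_principal` and the example URLs are illustrative and do not come from the Gazelle work.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Default ports, so that "https://a.example" and "https://a.example:443"
# are treated as the same principal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def principal_of(url):
    """Return the (scheme, host, port) triple assumed to label a principal."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme, 0)
    return (scheme, host, port)


def group_by_principal(urls):
    """Group the resources on one page by the principal that owns them."""
    groups = defaultdict(list)
    for url in urls:
        groups[principal_of(url)].append(url)
    return dict(groups)


if __name__ == "__main__":
    page = [
        "https://news.example.com/story.html",
        "https://news.example.com/style.css",
        "https://ads.example.net/banner.js",
    ]
    for principal, resources in group_by_principal(page).items():
        print(principal, len(resources))
```

Under this assumption, a single page that pulls in an ad script ends up with two principals sharing the same window, which is the situation the rest of the article is concerned with.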
When Coexistence Is Not a Good Thing
“Everyone accepts that applications need to run on operating systems,” Wang says. “However, this has not been the case for Web applications; they depend on browsers to render pages and handle computing resources. Yet browsers have never been constructed to be operating systems. Principals are allowed to coexist within the same process or protection domain, and resource management is largely non-existent.”
These attributes manifest themselves in a lack of cross-principal protection, a lack of consistent device access and control, and poor control over resource usage.
A Web page might offer content such as ads or newsfeeds from other Web-site principals. Yet to the browser, all these principals coexist in the same process or protection domain. An ad containing malicious or poorly written code could hog the network connection, degrade performance, freeze the entire page, or crash the browser. In a browser operating system, a “bad” principal would not be allowed to affect other principals, the browser, or the host machine.
Currently, browsers don’t offer resource management for devices; they do not manage access to devices or provide a consistent, systematic way of allowing access or managing sharing when Web-site principals are contending for the same resource. For example, browser plug-ins handle access to devices such as Webcams, printers, or game accessories. But how the device is accessed and controlled is up to the plug-in author. Furthermore, plug-ins can interact directly with the operating system and are not constrained by the browser’s security policies. Thus it is possible for plug-ins with conflicting policies to access the same device.
In the Gazelle model, the browser-based OS, typically called the browser kernel, protects principals from one another and from the host machine by exclusively managing access to computer resources, enforcing policies, handling interprincipal communications, and providing consistent, systematic access to computing devices.
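The mediation role described here can be illustrated with a toy model. The sketch below is not Gazelle's design: the `BrowserKernel` class, its method names, and the policy it enforces (explicit grants plus exclusive use of a device) are assumptions chosen only to show what it means for every device request to pass through a single arbiter.

```python
class BrowserKernel:
    """Toy arbiter: every device request goes through acquire()."""

    def __init__(self):
        self._grants = set()  # (principal, device) pairs that are permitted
        self._holder = {}     # device -> principal currently using it

    def grant(self, principal, device):
        """Record that a principal is allowed to use a device."""
        self._grants.add((principal, device))

    def acquire(self, principal, device):
        """Return True only if access is granted and the device is free."""
        if (principal, device) not in self._grants:
            return False
        current = self._holder.get(device)
        if current is not None and current != principal:
            return False  # another principal holds the device
        self._holder[device] = principal
        return True

    def release(self, principal, device):
        """Give the device back; only the current holder can release it."""
        if self._holder.get(device) == principal:
            del self._holder[device]


if __name__ == "__main__":
    kernel = BrowserKernel()
    kernel.grant("chat.example", "webcam")
    print(kernel.acquire("chat.example", "webcam"))  # granted and free
    print(kernel.acquire("ads.example", "webcam"))   # never granted
```

Because the policy lives in one place, two principals cannot apply conflicting rules to the same device, which is the problem the article attributes to plug-ins.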
Don’t Get in My Space
One of the most interesting challenges for the Gazelle team was examining where operating-system concepts could be applied to the world of Web applications. Operating systems have advanced over the years and are well-tested; Wang says the time has come to apply decades-old operating-system experience to the browser-design space. Gazelle essentially leverages the existing mechanisms of operating systems and tailors them to the needs of Web applications.
In Gazelle’s architecture, the browser kernel is a layer that sits between the underlying operating system and the principals, exclusively responsible for managing principals and system resources. Wang refers to principals as “mutually distrusting parties,” and this is exactly how Gazelle treats these entities: as potentially dangerous to each other, the browser, and the host system. Each principal is placed in a separate protection domain realized using an OS process.
This makes for a robust browser construction, with each principal contained within its own protection domain so that misbehaving code compromises only its own protection domain, leaving other principals, the browser kernel, and the host system intact. This protection extends to plug-in content.
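The containment property can be seen in miniature with ordinary operating-system processes. The following Python sketch is only an analogy: it assumes each principal's code is a small script run in its own child process, and the function names and example scripts are made up for illustration. It shows crash containment only; a real browser would also need to restrict what each process is permitted to do, which this sketch does not attempt.

```python
import subprocess
import sys


def run_principal(principal, script):
    """Run one principal's code in its own OS process and report the outcome."""
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return {
        "principal": principal,
        "exit_code": result.returncode,
        "output": result.stdout.strip(),
    }


def render_page(scripts):
    """Run every principal separately; one failure does not stop the others."""
    return [run_principal(name, code) for name, code in scripts.items()]


if __name__ == "__main__":
    reports = render_page({
        "ads.example": "import os; os._exit(7)",  # simulated crash
        "news.example": "print('headline')",
    })
    for report in reports:
        print(report)
```

The parent process, standing in for the browser kernel, keeps running and still collects the output of the well-behaved principal after the other one dies.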
One of the twists with Web applications is that when a Web site embeds a cross-origin frame, object, or image, these elements of different principals share a display, creating one of the major challenges in this project. This marks a departure from the desktop world, where there is no cross-principal display sharing. The research team had to ensure that the browser kernel could recognize ownership of both display areas and events, and confine each principal to drawing in its own display space. Gazelle’s architecture cleanly separates the act of rendering Web content from the policies governing how that content is displayed. This cross-principal display protection is in stark contrast to commodity browsers, which allow these two functions to intermingle, leading to security vulnerabilities.
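A small sketch can show what tracking display and event ownership might involve. It assumes rectangular regions stacked in the order they are assigned, with later regions on top; the `Rect` and `DisplayManager` names are hypothetical, and the sketch ignores the case where one principal's region is partly covered by another's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int

    def contains(self, px, py):
        """True if the point lies inside this rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)

    def intersect(self, other):
        """Return the overlapping rectangle, or None if there is none."""
        left = max(self.x, other.x)
        top = max(self.y, other.y)
        right = min(self.x + self.width, other.x + other.width)
        bottom = min(self.y + self.height, other.y + other.height)
        if right <= left or bottom <= top:
            return None
        return Rect(left, top, right - left, bottom - top)


class DisplayManager:
    """Tracks which principal owns which part of the shared display."""

    def __init__(self):
        self._regions = []  # (Rect, principal); later entries are on top

    def assign(self, rect, principal):
        self._regions.append((rect, principal))

    def owner_at(self, px, py):
        """Find the topmost principal at a point, e.g. to route a click."""
        for rect, principal in reversed(self._regions):
            if rect.contains(px, py):
                return principal
        return None

    def clip_draw(self, principal, rect):
        """Clip a draw request to the regions owned by the requester."""
        clipped = []
        for region, owner in self._regions:
            if owner == principal:
                overlap = region.intersect(rect)
                if overlap is not None:
                    clipped.append(overlap)
        return clipped
```

With a page region assigned to one principal and an embedded ad region assigned to another, `owner_at` decides who receives a mouse click, and `clip_draw` trims any attempt by the ad to paint outside its own rectangle.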
With regard to preserving backward compatibility with existing Web applications, Wang comments that the architecture itself can be made backward-compatible with existing Web content. Nevertheless, it poses the interesting question of whether it’s worth sacrificing some backward compatibility to achieve more security, a question that is undoubtedly on her list for future investigation.
From Abstraction to Construction
Wang is quick to dismiss various news articles that refer to “the Gazelle browser” as if it were a product prototype. Although the idea of a browser operating system undoubtedly caught the imagination of technical press and bloggers, she stresses that Gazelle is strictly research. In fact, the Gazelle project is just another milestone, admittedly a significant one, in an ongoing effort to prove a concept Wang and several of her colleagues have been pursuing for years.
An earlier, related project called MashupOS was presented in a 2007 paper titled Protection and Communication Abstractions for Web Browsers in MashupOS, by Wang, Xiaofeng Fan, Jon Howell of Microsoft Research Redmond, and Stanford University’s Collin Jackson. This research identified the browser as a multi-principal OS platform and uncovered the inadequacies in browser programming abstractions in such a platform. The consequence was that programmers are limited by existing programming abstractions to choosing between security and functionality. The MashupOS paper then proposed additional protection and communication abstractions that a multi-principal OS-based browser ought to offer to its applications.
The next step after the MashupOS project was for researchers to take a look at the implementation of the browser itself. They needed to examine how a browser could be constructed as a multi-principal operating system and whether principals could be successfully supported with reliable protection and resource management.
“There has been a lot of activity since MashupOS,” Wang says. “Some of those concepts have influenced parts of HTML 5, and the key ideas in MashupOS and Gazelle are related. The work in MashupOS was about identifying and designing the multi-principal OS abstractions that a browser should expose to programs, while Gazelle is all about constructing the browser as a multi-principal OS: How should a browser-based OS provide protection and resource management to its applications?”
Envisioning an OS for Web Apps
Gazelle represents the first time that a browser has been implemented as a multi-principal operating system. From the amount of interest and speculation the paper has generated, it appears that this is an idea whose time has come. For Wang and her colleagues, there are still many challenges in working through this complex problem, but they are convinced that their work advances the industry toward a new generation of operating system tailored for Web environments. Wang believes that if their research could lead to browsers evolving into multi-principal operating systems, then Web applications would take a giant step forward in functionality and quality.
“I would like to see Web applications achieve function and quality parity with desktop apps,” Wang says. “That’s the ultimate goal of this research.”