Microsoft TerraServer: An Imagery Scalability Story
Today, online mapping websites enable PC and smartphone users to view traditional maps, aerial or satellite imagery, or “street view” images of their neighborhood, place of work, or vacation destination practically anywhere in the world. These applications did not exist until the late 1990s. Researchers led by database legend Dr. James Gray at Microsoft Research’s Bay Area Research Center in San Francisco, California, pioneered the early work of building a very large database of aerial and satellite imagery that standard web browsers could access without special plug-ins or other applications.
The original motivation for the project was to test the scalability of a new version of the Microsoft SQL Server relational database management system (RDBMS). Working with the SQL Server team, the researchers set out to build a single database instance that was big (1 terabyte [TB] or larger); public (accessible on the Internet); interesting; accessible via standard web browsers (no plug-ins required); real (had a commercial purpose); fast; and easy to use, build, and deploy. Finding an interesting, real, and large dataset that wasn’t already widely available was a challenge. In researching potential datasets, the research team met with researchers at the University of California, Santa Barbara (UCSB), who were building an online digital corpus of geospatial information. Collaborating with UCSB, the team developed the Microsoft TerraServer database and website. At the time, circa 1996, the geographic information system (GIS) community was able to store and display street and world maps through web browsers. But the industry had not yet been able to build and deploy high-resolution imagery services such as those found today on sites like Bing Maps or Google Maps.
The website then known as Microsoft TerraServer, and now known as Microsoft Research Maps, initially stored 2.3 TB of U.S. Geological Survey (USGS) grayscale “digital orthophoto quadrangle” (DOQ) imagery and 1 TB of declassified Russian military satellite data provided by Sovinformsputnik’s U.S. partner, Aerial Images, Inc. The Microsoft TerraServer researchers’ novel approach was to take the very large images, which varied from 25 MB to 200 MB each, and cut them into very small, 200 x 200 pixel JPEG-compressed “tiles” that ranged from 6 KB to 36 KB, depending on the image content. The pixels were selected from the source imagery such that a “seamless mosaic” of large expanses of Earth would appear to the user as a single, large image. The tiles were laid out in HTML tables of 3 x 2, 4 x 3, or 5 x 4 (in other words, each page displayed 6, 12, or 20 tiles). Clickable arrows and zoom-in/out images placed around the table of images enabled users to pan and zoom around each logical collection of seamless tiles. In the conterminous United States, there were 10 logical images, enabling a user to pan continuously north to south from Albuquerque, New Mexico, to Montana, or east to west from San Francisco, California, to Reno, Nevada, without changing logical seamless images.
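The tiling idea can be illustrated with a minimal Python sketch. It is not TerraServer's actual code: the function names, the column/row addressing, and the `tile_{col}_{row}.jpg` URL pattern are hypothetical and chosen only for illustration. The sketch shows how many 200 x 200 tiles are needed to cover a source image, which pixel box each tile covers, and how a page might arrange a small window of tiles in an HTML table.

```python
from math import ceil

TILE_SIZE = 200  # pixels per tile edge, as described in the text


def tile_grid(width_px, height_px, tile=TILE_SIZE):
    """Return (columns, rows) of tiles needed to cover an image."""
    return ceil(width_px / tile), ceil(height_px / tile)


def tile_bounds(col, row, tile=TILE_SIZE):
    """Pixel box (left, top, right, bottom) of one tile in the source image.

    Tiles on the right or bottom edge may extend past the image and would
    need padding or clamping in a real pipeline.
    """
    left, top = col * tile, row * tile
    return left, top, left + tile, top + tile


def viewport_table(center_col, center_row, cols=3, rows=2,
                   url_pattern="tile_{col}_{row}.jpg"):
    """Build an HTML table of cols x rows tiles around a center tile.

    The URL pattern is a hypothetical naming scheme, not TerraServer's.
    """
    first_col = center_col - cols // 2
    first_row = center_row - rows // 2
    lines = ['<table cellspacing="0" cellpadding="0">']
    for r in range(first_row, first_row + rows):
        cells = "".join(
            '<td><img src="{}" width="{}" height="{}"></td>'.format(
                url_pattern.format(col=c, row=r), TILE_SIZE, TILE_SIZE)
            for c in range(first_col, first_col + cols)
        )
        lines.append("<tr>" + cells + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    # A hypothetical 5,000 x 6,000 pixel source image
    print(tile_grid(5000, 6000))
    # A 3 x 2 page centered on tile (10, 10)
    print(viewport_table(10, 10))
```

In this sketch, panning amounts to requesting the same table with the center shifted by one tile, so the browser only ever downloads a handful of small JPEG files rather than a whole source image.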
The Microsoft TerraServer tiling and mosaic scheme proved to be a breakthrough in the GIS industry. The Microsoft TerraServer tiling approach is used by all major high-resolution imagery sites, including Google Earth, Google Maps, MapQuest, and Yahoo Maps. Microsoft TerraServer / Microsoft Research Maps is available to end users and researchers. Today, the site exclusively stores original USGS grayscale DOQ imagery, scanned USGS topographic maps (digital raster graphic, or DRG), and color imagery of major United States cities photographed by the National Geospatial-Intelligence Agency (NGA) after 9/11. Microsoft continues to provide free and unencumbered access to this data via an HTML user interface, a SOAP/XML web service, and an OpenGIS Consortium-compliant web map server. The site gives consumer, commercial, and academic users both interactive and programmatic access to a total of 4.3 TB of imagery.
TerraServer was the first mapping service on the Internet with programmatic interfaces. At its release, it was the largest data collection accessible via web services. The TerraServer project generated a significant amount of feedback to the Microsoft SQL Server team on how to scale databases to large datasets. SQL Server was the foundation for the Virtual Earth technology.