Enterprise Design

Published: March 31, 2005

The storage services are implemented as part of an overall storage solution architecture that is designed to solve a set of specific customer problems.

On This Page
Business Need
Architecture Definition
Architecture Design
Logical Design
Architecture Dependencies
Availability
Security
Scalability
Manageability
Performance
Supportability
Consolidation
Interoperability

Business Need

Today’s organizations, regardless of industry, size, or geographic location, share a common challenge: to increase efficiency and reduce total cost of ownership (TCO) by managing ever increasing volumes of business data.

Driving the market for storage devices are business continuity requirements such as 24x7 availability, zero data loss, and rapid recovery times. Organizations are spending more on storage, with storage capacity doubling each year. Organization spending on storage is expected to increase by 50 percent per year over the next four years while storage spending budgets are expected to increase from four percent of total computing spending to 17 percent by 2003 (source: Forrester Research). It is estimated that the amount of external storage shipped to customers will increase from a few hundred petabytes shipped in 2001 to almost 4000 petabytes shipped in 2005 (source: Gartner Group). (A petabyte is equal to 1024 terabytes.)

The rapid growth in the demand for data storage has many organizations re-evaluating the design and management of their IT infrastructures. In existing direct-attached storage (DAS) environments, analysts estimate that storage resources are underutilized by as much as 40 percent (source: Gartner Group). In addition, the cost of administering and managing data can be as much as five to seven times the cost of the storage platform on which the data is located.

Storage design goals of lower TCO, higher availability, security, superior scalability, and simplified management can be achieved by a combination of DAS and the implementation of a network model for storage. However, solution designers need to consider how the goals affect each other and carefully weigh some of the trade-offs, which include:

High availability costs more to implement when the system must also be scalable.

When another component is added to increase scalability, availability of the environment usually suffers and requires purchase of additional equipment.

Increasing security can often affect performance and has a direct influence on manageability.

An evolved IT infrastructure usually includes a number of potentially discrete computer systems and facilities. Inter-site communications are often integrated, but storage tends to remain local and system-specific. The design of an enterprise storage architecture must consider numerous factors, including:

Data volume

Criticality of data availability to each location

Available budget

Enterprise network capabilities

Technical diversity of business systems

Legacy storage systems

Technical skills resources

It is unlikely that all business systems will be best served by a single storage solution. Also, many of the factors affecting storage design are likely to conflict with each other. Accordingly, an effective enterprise storage architecture design will be the result of careful analysis and thoughtful compromise.

Architecture Definition

The fundamental goal of storage architecture is to meet the business requirements of the organization. Storage architecture is fundamental to many business solutions, and it is vital that the architecture enables organizations to quickly create secure, integrated business applications within the shortest possible timeframe and with minimum disruption of the existing business requirements. In addition, storage architecture must minimize the potential for data loss; serious data loss can have a significant impact on future operations of the company.

Many organizations today protect their storage assets as a vital business asset. A number of storage technologies that can be used to help meet business needs are discussed in the next section, “Storage Technology Review.” The way these technologies are organized constitutes the foundation of enterprise storage architecture. Decisions taken at the architectural level have a far-reaching impact on the design of services and devices that are implemented as part of future business projects. For this reason, it is important that system architects understand the service and device technologies available to them as part of the storage architecture.

Storage Technology Review

Before discussing storage designs, it is useful to briefly review the storage technologies that are commonly available. The following storage technologies are referenced throughout the rest of this blueprint.

Direct-Attached Storage (DAS)

DAS is storage that is directly connected to a server by connectivity media such as fiber or copper. Some examples of DAS include the local disk drives that are often accessed through Integrated Device Electronics (IDE) or SCSI interfaces or through RAID (redundant array of independent disks) controllers. The main characteristic of DAS is that it provides fast data access to the directly attached server; however, storage is accessible only from that single server.

Network-Attached Storage (NAS)

NAS is a type of storage engineered to provide a flexible and scalable solution to the file-sharing needs of an organization. A NAS device is a server that runs an operating system specifically designed for handling file services. The main characteristic of network-attached storage is that the storage is accessible directly on the local area network (LAN) through LAN protocols such as TCP/IP. The downside of accessing storage using network protocols is that the speed of data access, and in turn the perceived end-user performance, depends on the responsiveness of the network infrastructure, whereas a direct-attached storage device operates at local bus speeds.

Windows Storage Server 2003

Windows Storage Server 2003 devices are dedicated high-performance file servers with built-in Microsoft Windows operating systems. Only services that are required for file serving, security, and management are installed on these appliances, which offer the availability, security, and scalability that are common features of Windows operating systems. These appliances come preconfigured with support for heterogeneous environments by providing NFS, File Transfer Protocol (FTP), AppleTalk, Hypertext Transfer Protocol (HTTP), Web Distributed Authoring and Versioning (WebDAV), and NWLink protocol support. Some of the key features of Windows Storage Server 2003 include:

Active Directory Integration: You can integrate Windows Storage Server 2003 appliances with the Microsoft Active Directory directory service to take advantage of features such as Encrypting File System (EFS) and Group Policy objects (GPOs).

Independent Software Vendor (ISV) Utility Support: You can install several ISV utilities on Windows Storage Server 2003 appliances including backup, antivirus, and replication utilities.

Simple Management: Remote management is supported through Terminal Services sessions as well as a Web interface. Storage administrators do not have to learn a new operating system to operate the NAS appliance, because it uses Windows 2000 or Windows Server 2003.

Enhanced Snapshot Support: Windows Storage Server 2003 appliances can support up to 250 snapshots for maximum data availability.

Note: This feature is available only if the OEM has implemented it.

Distributed File System (DFS) Integration: Windows Storage Server 2003 appliances can use DFS to provide fault tolerance and load balancing for data access.

Storage Area Networks (SANs)

A storage area network (SAN) is a specialized network that provides access to high-performance, highly available storage subsystems. The SAN is made up of specific devices, such as host bus adapters (HBAs) in the host servers, SAN switches that help route storage traffic (in methods that are similar to those used by LAN network switches), disk storage subsystems, and tape libraries. All these devices are interconnected by fiber or copper. The main characteristic of a SAN is that the storage subsystems are generally available to multiple hosts at the same time, which makes them scalable and flexible. The specialized nature of the SAN HBAs and switches provides a performance benefit over NAS. Although DAS data transfer rates are still faster, the performance gap between DAS and SAN technologies is consistently shrinking. The advantage of multiple servers being able to use the storage solution in a SAN generally outweighs any shortcoming in overall access speeds.

Windows Server 2003 Storage Technology Features

Windows Server 2003 introduces a number of significant new features in the area of storage technologies. Some of these features are:

Virtual Disk Service (VDS): This service enables multiple storage device solutions to interoperate in Windows. VDS exposes new application programming interfaces (APIs) to storage hardware and management programs, which allow administrators to discover and configure storage devices from different vendors using a unified interface.

Shadow Copy: This feature provides a mechanism to increase the availability of data and reduce the administrative burden of restoring files. Shadow Copy provides a point-in-time copy of single or multiple volumes that may be used to restore files, if needed.

SAN Support: SANs are significantly easier to use in Windows Server 2003, Enterprise and Datacenter editions. For example, these editions do not automatically mount visible logical unit numbers (LUNs); they provide improved Fibre Channel support, SAN HBA interoperability, and enhanced SAN-boot functionality.

For more information on the enhanced storage features provided in Windows Server 2003, refer to the Windows Server 2003 Deployment Kit.

For more information on the storage technologies reviewed here, refer to the Storage Devices Blueprint.

People

Microsoft Operations Framework (MOF) defines two roles that are applicable to storage. These roles are:

Storage Manager

Storage Administrator

Both roles support the Operations Role Cluster, which includes skilled specialists who focus on the performance of production systems and the tasks necessary to run them on a daily basis. In WSSRA, these two storage roles have been created and assigned to the people involved in the functions of enterprise data storage solutions.

The people who perform these roles are responsible for providing all the data storage functions to the organization. In addition, these roles need to interact with a number of other roles from different teams within the organization. For example, Backup/Recovery Owners need to work closely with the people who perform storage roles to be effective.

For more detail on the functions these roles perform, refer to the “Manageability” section later in this blueprint.

Process

Once data resides on storage devices, it is vital to put in place the necessary processes to support and protect it. Because the management tools that are available to the storage teams can destroy data if used incorrectly, no storage management should be undertaken without a properly defined, communicated, and understood set of processes being in place. These processes fall into the following basic groups:

Data Protection: These processes provide routines that are used to proactively protect the data on the storage solutions.

Process Backout: These processes ensure that all storage procedures have a process to return to good data in a known state.

Data Recovery: These processes provide routines for recovering data in the event of a storage failure or data corruption.

For more details on the data protection and recovery processes, refer to the Backup and Recovery Services Blueprint.

Tools

Storage architecture can comprise a number of storage solutions from different vendors, each of which provides services to the organization and must be built, deployed, and operated. The architecture design discussed later in this blueprint presents a framework to which the storage devices should conform. Part of this framework specifies management tools that the solutions should provide.

The ongoing TCO of a storage solution can be seriously affected if the tools provided with the solution cannot be integrated into the enterprise management solution. Most solutions include a Web or Microsoft Management Console (MMC)-based tool for simple administration tasks. In enterprise environments, however, additional tools may be used. For example, additional tools may be needed to report the status or performance of the storage solutions. If an organization has invested in a particular enterprise management or storage resource management solution, it is important that the storage solution and associated tools integrate with the enterprise management solution at the required level.

Architecture Design

Storage architecture design should follow a structured approach to ensure that the correct solution is adopted by the organization. The three basic types of storage architectures that are discussed in this blueprint are:

Distributed Storage

Hybrid Storage

Centralized Storage

Each of these storage architecture types defines a storage pattern that can be used as a starting point for providing guidance on how the storage should integrate with the business needs of an organization. If business needs are not communicated when enterprise architecture issues are being considered, it is easy for new projects to focus only on their own requirements and miss the wider picture that an architecture encompasses.

For a complete understanding of the various options available, the storage architect should refer to the Windows Server 2003 Deployment Kit, especially the “Planning for Storage” chapter.

A structured design process for a complete enterprise storage solution consists of:

Determining the storage requirements.

Choosing the storage technologies.

Defining fault tolerance technologies.

Defining backup and recovery technologies.

The following sections guide you through the storage design process for an enterprise-class organization.

Determining Storage Requirements

It is essential to understand different storage demands such as availability, capacity, and performance when designing a storage architecture. In order to understand these demands, accurate metrics and an awareness of the criteria for collecting the metrics are essential. Estimation is an important aspect of the design process, and the quality of the metrics directly affects the ability to make meaningful estimates.

In order to determine physical metrics, you first need to know how much capacity is required by the operating system; this information is fairly simple to collect. It is more difficult to evaluate the capacity requirements for various applications. Knowing which applications are going to be used within your environment will provide some baseline information, but it is not always safe to make generic assumptions.

The following figure shows physical metrics and logical metrics, indicating the actual volume of data and the external influences that can affect the storage requirements of this data.

Figure 1. Storage Requirement Drivers


Of course, the volume of data in different organizations can vary considerably. For example, the amount of data in messaging systems can vary because of elements like the different archiving and e-mail attachment policies that exist within the organizations as well as the more obvious e-mail volumes. Also, it is important to keep in mind that different applications have varying storage requirements. For example, Active Directory may have extensive storage needs while a DHCP server's needs may be quite limited.

Physical metrics are relatively straightforward, and there are numerous tools that can be used to interrogate servers to gather physical metrics.
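As a minimal sketch of how such a tool might gather one physical metric, the following Python snippet uses the standard library function shutil.disk_usage to report total, used, and free capacity for a list of paths. The function name collect_volume_metrics, the VolumeMetrics record layout, and the example path are assumptions made for illustration; they do not describe any particular tool referred to in this blueprint.

```python
import shutil
from dataclasses import dataclass

GIB = 1024 ** 3  # bytes per gibibyte


@dataclass
class VolumeMetrics:
    """Physical capacity metrics for one volume (hypothetical record layout)."""
    path: str
    total_gib: float
    used_gib: float
    free_gib: float

    @property
    def utilization(self) -> float:
        """Fraction of the volume in use, between 0.0 and 1.0."""
        return self.used_gib / self.total_gib if self.total_gib else 0.0


def collect_volume_metrics(paths):
    """Return a VolumeMetrics record for each path that can be queried."""
    results = []
    for path in paths:
        try:
            usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        except OSError:
            continue  # skip paths that do not exist or cannot be read
        results.append(
            VolumeMetrics(path, usage.total / GIB, usage.used / GIB, usage.free / GIB)
        )
    return results


if __name__ == "__main__":
    # "." is only an example; a real survey would list each server volume.
    for m in collect_volume_metrics(["."]):
        print(f"{m.path}: {m.total_gib:.1f} GiB total, {m.utilization:.0%} used")
```

Running a collector of this kind on each server yields the per-volume figures that the physical metrics worksheet asks for.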

Logical metrics deal with storage capacity requirements, so some overall design criteria may not apply. For example, security is not a function of capacity, although it most certainly needs to be considered when selecting storage technologies. The principal logical metrics cover availability, scalability, manageability, and performance. Each of these can directly affect the amount of storage capacity as well as throughput requirements.

For example, storage availability requirements and acceptable levels of downtime might mean that storage capacity needs to be doubled across the infrastructure. In environments where system non-availability can cost millions of dollars per day, it is important for the storage design to have the capacity to hold duplicate datasets.

Scalability of storage is also an important logical metric. Current storage requirements consist of calculated physical metrics, but future requirements are always estimates. The ability to provision enough online storage to scale requires a good understanding of where volatile data is housed and what a worst-case scenario might demand. (Keep in mind, however, that planning for an absolute worst-case scenario will be an expensive option.)

Manageability generally has a somewhat smaller impact on the total systems volume requirements; however, on systems that require a significant amount of monitoring this will not be the case.

Performance is more commonly associated with technologies than with storage per se. The number of disks, along with their configuration and interconnections, can have a significant impact on performance.

Worksheets for gathering physical and logical metrics can be found in Appendixes 3.2 and 3.3 at the end of this blueprint.
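The way logical metrics scale a measured physical metric can be sketched as simple arithmetic. The Python function below is an illustration only: the parameter names, the compound growth model, and the sample figures are assumptions, not values taken from this blueprint or its worksheets.

```python
def estimate_capacity_gb(current_gb, annual_growth, years, copies=1, overhead=0.0):
    """Estimate the storage capacity to provision.

    current_gb    -- measured data volume today (physical metric)
    annual_growth -- expected yearly growth as a fraction, e.g. 0.5 for 50 percent
    years         -- planning horizon in years
    copies        -- number of full datasets to hold (2 = duplicate for availability)
    overhead      -- extra fraction reserved for monitoring and management data
    """
    if current_gb < 0 or years < 0 or copies < 1:
        raise ValueError("current_gb and years must be >= 0 and copies >= 1")
    # Scalability: compound the current volume over the planning horizon.
    projected = current_gb * (1 + annual_growth) ** years
    # Availability and manageability: multiply by copies and add overhead.
    return projected * copies * (1 + overhead)


if __name__ == "__main__":
    # Hypothetical inputs: 400 GB today, doubling each year for 2 years,
    # duplicated for availability, with 10 percent management overhead.
    print(round(estimate_capacity_gb(400, 1.0, 2, copies=2, overhead=0.10)))
```

Because future growth is always an estimate, it can be useful to run the calculation for an expected case and a worst case and compare the cost of each.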

Application File Access Assessment

As mentioned in the previous section, applications can have a significant impact on storage requirements. What is sometimes overlooked, however, is the effect that applications can—or should—have on the selection and configuration of platform hardware.

The ways in which applications access data files may be an important consideration. For example, records in a database might be searched sequentially or through use of an index. Performance considerations might dictate whether block or file access is preferred (although this may be a function of the protocol used for file transfer, such as HTTP or CIFS).

Meaningful information about how applications access data files helps create a successful storage architecture design. A worksheet for gathering such information is provided in Appendix 3.4.

Distinguishing Between Different Types of Data

When determining storage requirements for planning and design purposes, a distinction can be made between the following types of data:

Server operating system data: The operating system files required to start and operate the server.

User data: Data created by:

Services such as the Active Directory service.

Applications such as Microsoft SQL Server 2000.

Clients that store business information.

The reason for distinguishing between these types of data is that they may need to be handled differently.

Design Options for Operating System Data Storage

There are two basic options that apply when designing a storage solution for operating system data. These two options and their advantages and disadvantages are outlined in the following sections.

Option 1—Local Storage

The local storage option is by far the most common configuration used in organizations. The data required by the operating system is stored on a disk drive locally attached to the server.

Advantages

Advantages of local storage design include:

Simple configuration: Installation and configuration of the operating system on a local disk is a simple process that can be achieved remotely with tools such as the System Preparation tool (SysPrep) and other disk imaging solutions.

Low startup costs: Local disks are relatively cheap, and setup and configuration costs are minimal.

Disadvantages

Disadvantages of local storage design include:

High management costs: The local storage design option distributes operating system data throughout the organization. Therefore, management of local data can be complex and may require additional visits to the server to provide ongoing maintenance of the storage.

Complex high availability: The local storage design option can be made highly available, but doing so is potentially more complex because each server needs a solution designed for it.

Complex backup solution: Managing the backup of operating system data that is distributed throughout the organization can pose significant challenges.

Option 2—Remote Storage

It is possible to configure servers to store all their operating system data in a remote storage location.

Advantages

Advantages of remote storage design include:

Lower management costs: The ongoing management costs of a remote storage design are lower than one based on a distributed model because the management functions are focused in a single area and do not need to be repeated for each server.

Centralizes the availability configuration: A remote storage design can achieve high availability by grouping together the operating system storage requirements for a number of servers.

Centralizes the backup solution: The remote storage design option allows a single backup operation to back up operating system data for all servers that use the remote storage solution.

Disadvantages

Disadvantages of remote storage design include:

Complex setup: Configuring a server to start from a remote storage solution can be complex.

Higher startup costs: The investment in a large central storage solution that is capable of managing operating system files can be significant. While the long-term cost benefits may be greater, the initial startup costs may make this a difficult option to select.

Possible performance issues: Moving data from a location where it is served by a fast local data bus to a centralized remote location can cause performance degradation. However, centralized solutions are getting faster all the time; this issue should be evaluated when the design decision is being made.

"Eggs in one basket": If a centralized storage solution fails, all servers utilizing the storage will also fail. The design of the central solution should be robust enough to mitigate this risk.
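The cost trade-off between the two options (low startup cost but higher ongoing management cost for local storage, the reverse for remote storage) can be framed as a break-even calculation. The sketch below assumes a simple linear cost model; the function names and all figures are hypothetical and serve only to show the shape of the comparison.

```python
def cumulative_cost(startup, yearly_management, years):
    """Total cost after a number of years: one-time startup plus recurring management."""
    return startup + yearly_management * years


def break_even_years(local_startup, local_yearly, remote_startup, remote_yearly):
    """Years after which the remote design becomes cheaper than the local one.

    Returns None when the remote design never catches up, that is, when its
    yearly management cost is not lower than that of the local design.
    """
    yearly_saving = local_yearly - remote_yearly
    if yearly_saving <= 0:
        return None
    extra_startup = remote_startup - local_startup
    return max(extra_startup / yearly_saving, 0.0)


if __name__ == "__main__":
    # Hypothetical figures in an arbitrary currency unit.
    years = break_even_years(
        local_startup=20_000, local_yearly=30_000,
        remote_startup=80_000, remote_yearly=15_000,
    )
    print(f"Remote storage breaks even after {years} years")
```

A model this simple ignores performance and risk, so it complements rather than replaces the qualitative advantages and disadvantages listed above.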

Design Options for User Data Storage

The majority of storage and management overhead relates to the service, application, and client data referred to collectively as user data. Requirements for user data storage can be considerable, and their complexity should not be underestimated. The remainder of this blueprint focuses on providing a storage architecture that can accommodate requirements of user data.

The following questions are designed to help establish the business need and determine the requirements for different kinds of user data.

Availability

Which critical business applications need to be safeguarded against unplanned outages?

Is it an e-mail system, document data, or a database application that contains your organization’s information?

Scalability

What amount of storage capacity is needed in the upcoming years? Ideally, the storage solution should not only be able to grow in terms of disk capacity but also be easy for IT administrators to manage and monitor.

Budget

How much money has been allocated for storage and storage management?

Performance

What are the peak performance needs of key applications?

Data Backup

What are your current backup requirements?

Is there a nightly backup window or a service level agreement (SLA) in place?

Do you need to remove backups from the LAN to make your applications run faster?

Disaster Recovery

Do you need to clone or snapshot this data?

Server and Storage Utilization

How much available but inaccessible storage could be used by your data-intensive applications if it were shared throughout the organization?

Do you have physical environment restrictions in your data center? If so, what are they? Which older, smaller-capacity storage devices could you consolidate into fewer, larger-capacity devices for ease of management and to reduce service contracts?

Once the business needs for storage have been determined, each application’s storage requirements have to be identified. Some questions to be answered by the application service designer are:

What is the type of data this application will store on this volume?

What is the final size of data required for the application, including any anticipated data growth?

What type of input/output (I/O) performance is required for this specific volume?

Generally, it is sufficient to provide a high, medium, or low indicator. For example:

High: If the application requires a lot of I/O, such as a large database.

Medium: If the application requires less I/O, such as a user application.

Low: If the application does not require a significant level of I/O, such as stored files with a low access rate.

What type of data transfer will this application generate?

The answer to this question is usually presented as a ratio of data reads to data writes. For example, a log disk will primarily see writes (0/100) whereas a database disk will see varying amounts of reads and writes (50/50). Most shared storage devices provide a way of tuning the cache to improve the performance.

Will this application access data sequentially or randomly?

For example, a transaction log disk is accessed sequentially, and a database disk is usually accessed randomly.

Does the data on the disk require high availability?

Does the data need to survive single or multiple disk failures? If possible, the application service designer should provide guidance for specific RAID disk types for optimal performance and availability.

Appendix 3.4 at the end of this blueprint is a worksheet that can be used as a job aid to gather this information.
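The answers to the questions above can be captured as one structured record per application volume. The following Python sketch shows one possible shape for such a record; the class names, field names, and the sample application are assumptions for illustration and do not reproduce the Appendix 3.4 worksheet.

```python
from dataclasses import dataclass
from enum import Enum


class IoLevel(Enum):
    HIGH = "high"      # e.g. a large database
    MEDIUM = "medium"  # e.g. a user application
    LOW = "low"        # e.g. stored files with a low access rate


class AccessPattern(Enum):
    SEQUENTIAL = "sequential"  # e.g. a transaction log disk
    RANDOM = "random"          # e.g. a database disk


@dataclass
class VolumeRequirement:
    """One worksheet row describing an application volume (hypothetical layout)."""
    application: str
    data_type: str
    final_size_gb: float
    io_level: IoLevel
    read_percent: int  # reads as a share of all I/O; writes are the remainder
    access_pattern: AccessPattern
    high_availability: bool
    disk_failures_to_survive: int = 1

    def __post_init__(self):
        if not 0 <= self.read_percent <= 100:
            raise ValueError("read_percent must be between 0 and 100")

    @property
    def read_write_ratio(self) -> str:
        """Read/write mix in reads/writes form, e.g. '0/100' for a log disk."""
        return f"{self.read_percent}/{100 - self.read_percent}"


def read_percent_from_counts(reads: int, writes: int) -> int:
    """Derive the read share from observed I/O operation counts."""
    total = reads + writes
    if total == 0:
        raise ValueError("no I/O operations observed")
    return round(100 * reads / total)


if __name__ == "__main__":
    # Hypothetical example: a write-only, sequentially accessed log volume.
    log_volume = VolumeRequirement(
        "Example database", "transaction log", 50,
        IoLevel.HIGH, 0, AccessPattern.SEQUENTIAL, True,
    )
    print(log_volume.read_write_ratio)
```

Keeping the requirements in a structured form makes it easier to compare volumes and to group those with similar I/O profiles on the same storage.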

Selecting the Storage Technologies

The appropriate technologies for storage requirements are determined by a number of factors. The following figure shows the physical and logical metrics that should be considered, as well as the results of the application assessment efforts.

The use of logical metrics in the technology selection process emphasizes technologies rather than products. However, it is important to ensure that all the technologies work together and that the operating system supports the hardware options. Although workarounds can be devised and compromises made, it is preferable to integrate complementary technologies to create an available, secure, scalable, and manageable environment.

Figure 2. Technology Selection Drivers


Availability requirements should have been assessed on a per service basis in the storage requirements gathering process. The information obtained through the process will direct you to the types of technologies that will meet your requirements. Common availability considerations include redundancy, fault tolerance, and data replication. Availability also influences the configuration of topologies, especially in complex SANs. The need for high levels of availability will mean that more complex dual-fabric core-to-edge configurations (detailed later in this blueprint) will be preferred to a more basic point-to-point implementation. Topology configurations can also have a significant impact on performance.

Designing a secure storage architecture brings both physical and software considerations into play. For example, physical security considerations affect whether servers are positioned in a centralized secure environment or DAS units are deployed within departments. Large SAN environments often provide additional layers of security within the solution, and it is important to ensure that these are compatible with the infrastructure security that has been established. Also, the security aspects of the SAN fabric need to be assessed for compatibility and interoperability.

Storage technologies that provide enhanced scalability are sometimes lost in discussions about disk capacity. Scalability, of course, goes far beyond raw gigabytes or terabytes. True scalability implies flexibility, and that the platform has been designed to accommodate expansion or even contraction. An example of this would be the use of Virtualized Storage, where capacity is treated as a single entity and pools of disks are created and allocated to systems. Virtualized Storage allows for dynamic disk upgrades and performance tuning through the use of data leveling algorithms.
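The pooling idea behind Virtualized Storage can be illustrated with a toy model in which disks contribute to a single capacity figure and systems draw allocations from it. The StoragePool class below is a hypothetical sketch, not a representation of any vendor product, and it does not attempt to model data leveling or performance tuning.

```python
class StoragePool:
    """Toy model of virtualized storage: disks pooled into one capacity figure."""

    def __init__(self):
        self.capacity_gb = 0
        self.allocations = {}  # system name -> allocated GB

    @property
    def free_gb(self):
        """Capacity not yet allocated to any system."""
        return self.capacity_gb - sum(self.allocations.values())

    def add_disk(self, size_gb):
        """Grow the pool without touching existing allocations."""
        if size_gb <= 0:
            raise ValueError("disk size must be positive")
        self.capacity_gb += size_gb

    def allocate(self, system, size_gb):
        """Give a system more capacity from the pool, if enough is free."""
        if size_gb <= 0:
            raise ValueError("allocation size must be positive")
        if size_gb > self.free_gb:
            raise ValueError("not enough free capacity in the pool")
        self.allocations[system] = self.allocations.get(system, 0) + size_gb

    def release(self, system, size_gb):
        """Return capacity from a system to the pool (contraction)."""
        current = self.allocations.get(system, 0)
        if size_gb <= 0 or size_gb > current:
            raise ValueError("cannot release more than is allocated")
        remaining = current - size_gb
        if remaining:
            self.allocations[system] = remaining
        else:
            del self.allocations[system]


if __name__ == "__main__":
    pool = StoragePool()
    pool.add_disk(500)
    pool.add_disk(500)
    pool.allocate("mail", 600)  # spans more than one physical disk
    pool.release("mail", 200)   # capacity returns to the pool
    print(pool.free_gb)
```

The point of the model is that systems see allocations rather than individual disks, which is what allows both expansion and contraction without redesigning each server's storage.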

Manageability of the storage solution may need to fit within existing practices, and the need to add new management tools or protocols might conflict with existing IT policies. It should be kept in mind that the deployment of an unsupported or unsupportable solution will be catastrophic, and that manageability (like scalability) implies flexibility. Nonetheless, compatibility with support practices should be given as much consideration as compatibility between hardware and software components.

Network topology will be critical to the performance of the storage solution, but so will the placement of storage across remote sites. Use of a SAN provides the option to move storage data traffic off the LAN, which will enhance performance. Generally, the modular design of larger storage options allows tuning at a component level, supplying additional spindles to applications that will benefit from them.

The ability to consolidate should also be appraised. A significant feature of successful consolidation is having the right technology and knowing how much that technology can accomplish. Worksheets for gathering the physical and logical data can be found in Appendixes 3.5 and 3.6.

The following sections discuss some of the advantages and disadvantages of different storage technologies; these sections are designed to help you decide which technology fulfills your data needs. For a detailed discussion of each of these storage technologies, see the Storage Devices Blueprint.

Storage Technology Design Option 1—DAS

As previously mentioned, DAS is storage that is directly connected to the host by connectivity media such as fiber or copper. Examples of DAS include the disk drives accessed through SCSI or RAID controllers. Large implementations commonly use RAID systems that are accessed through RAID controllers.

Advantages

The advantages of using DAS include:

Low entry cost: Because of its low entry cost, DAS is popular and easily justified in small to medium organizations, workgroups, and departments that do not want to spend extra money on shared storage, enhanced availability, or enhanced performance.

Simplicity: Data administrators are likely to be knowledgeable about most elements of DAS technology.

Initial management simpler: DAS is easier to manage when viewed within the scope of a single server.

Simpler security: Security is simpler to configure because access to the storage is only possible through a single server. Careful control of access to this server ensures that data is safeguarded from unauthorized access.

Performance: Storage performance is generally high because the server uses the total storage subsystem I/O bandwidth. On the other hand, when multiple servers are connected to an external storage subsystem (such as a SAN or NAS), total storage subsystem bandwidth is shared. This may be an important consideration for applications with high I/O requirements.

Disadvantages

The disadvantages of using DAS include:

Possibly more difficult to manage: When viewed from an enterprise perspective, managing a large number of DAS solutions may be more difficult or expensive. Typically, the servers connected to DAS storage devices are a heterogeneous mix from different vendors. This mismatching means that the servers have varied proprietary architectures, leading to connected yet isolated “data islands.” Available storage connected to a single server cannot be directly used by other servers.

Scalability: The amount of storage available is limited to the drives that can be physically attached to the server. Also, access to server storage is limited to the number of physical network connections that the server can support.

Backup performance versus complexity: Backups to DAS may either occur locally, with a library or standalone tape drive attached to each server, or by traversing a LAN. To combat the impact of backups on LAN performance, many organizations either run backups exclusively during non-business hours, or implement a separate backup LAN and install multiple network adapters on each server.
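To judge whether a LAN-based backup fits into non-business hours, it can help to estimate the backup window. The sketch below is illustrative only: the function name, the data sizes, and the assumed 70 percent link efficiency are hypothetical values chosen for the example.

```python
def backup_hours(data_gb: float, link_mbit_per_s: float,
                 efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move data_gb across a network link.

    efficiency is an assumed fraction of the nominal link speed that the
    backup actually achieves (protocol overhead, contention, and so on).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    usable_mb_per_s = link_mbit_per_s * efficiency / 8  # megabits to megabytes
    seconds = data_gb * 1000 / usable_mb_per_s          # decimal GB to MB
    return seconds / 3600


# Hypothetical example: 500 GB over Fast Ethernet and over Gigabit Ethernet.
print(round(backup_hours(500, 100), 1))
print(round(backup_hours(500, 1000), 1))
```

Under these assumptions the same data set needs roughly ten times longer on the slower link, which is the kind of result that motivates a separate backup LAN.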

Storage Technology Design Option 2—NAS

NAS appliances are engineered to provide a flexible and scalable answer to the file-sharing needs of an organization. They incorporate components and software that enable storage and retrieval of data, and often support multiple protocols such as Server Message Block (SMB) and NFS.

Advantages

The advantages of using NAS include:

Simplicity: NAS is easy to deploy and manage. There are fewer new high-level technologies to learn, and the structure of the environment is simple. The technology presents data as file servers do, making it easy for people to understand and allowing for security and access control. Providing only file services allows service delivery to be more efficient than when delivered by servers handling multiple services. Also, it is easy to add storage for file services where the service is needed. A new NAS device can be located close to the users requiring the service and attached to the existing network.

Low cost: NAS is less expensive to implement than a SAN because it can connect directly to an existing LAN infrastructure, from which it inherits file-level access control and security protocols. SANs require additional hardware devices and cables.

Built-in fault tolerance: Many NAS appliances offer redundant power supplies, network, and disk configurations.

Disadvantages

The disadvantages of using NAS include:

LAN-only access: Data access to NAS appliances is done over the LAN, making the LAN a potential bottleneck for data access.

Note: Implementing high-speed interfaces such as Gigabit Ethernet can help reduce or even eliminate such network bottlenecks.

Backup: NAS appliances are designed for backups to a local storage device or over a LAN (similar to DAS). When backing up over a LAN, you need to plan backup times to avoid affecting normal business operations. Unlike DAS, however, many NAS appliances allow you to perform server-free backups without affecting the appliance’s CPU. Another alternative with NAS is to back up the appliance through a SAN.

Application compatibility: Many database applications do not support storing database or log files on NAS appliances.

For more information on using SAN and NAS devices with Exchange Server, see the article at the following URL:

support.microsoft.com/default.aspx?scid=kb;en-us;328879

For more information on support for network database files, see the article at the following URL:

support.microsoft.com/default.aspx?scid=kb;en-us;304261

Storage Technology Design Option 3—SAN

A SAN is a specialized network that provides highly available storage subsystems to multiple hosts. SAN fabrics (networks that connect hosts to storage devices) can become increasingly complex as more devices are added. Similar to LAN subnets, which are interconnected by routers and switches, separate fabrics called “SAN islands” can be interconnected by fabric switches.

Advantages

The advantages of using a SAN include:

Backup and recovery: Data can be backed up and recovered over the SAN fabric at high speeds in two ways. First, LAN-free backups minimize the impact of backup and restore processes, as data is backed up in-band over the Fibre Channel network. However, this method requires some server processing cycles. Second, server-free backups can be performed with appropriate hardware devices without affecting servers connected to the SAN. However, this approach usually requires additional steps to maintain application data integrity.

Centralized management: Storage can be pooled together and allocated to the servers that require it. Storage is no longer distributed, and excess storage is not wasted.

Scalability: SANs can be expanded with the needs of your organization. New storage resources can be added to the SAN without any impact or downtime for existing resources. Scalability is higher with the ability to add more storage to the fabric or connect more hosts to the storage subsystems. Also, storage virtualization results in the ability to increase data volumes on an as-needed basis.

Distance: Using technologies such as iSCSI, SANs can be configured to interconnect storage devices in separate physical locations, allowing you to easily move data between a production site and a disaster recovery (DR) site. For details of iSCSI, see the Storage Devices Blueprint.

Disadvantages

The disadvantages of using SANs include:

Complexity: SANs are frequently implemented to address multi-server and multi-application storage environments. Therefore, the scope, planning, and required knowledge for deployment are more complex and cross more IT organizational boundaries than in other configurations. Specialized knowledge is required to design, operate, and maintain a SAN.

Expense: The costs involved in implementing a SAN range from tens to hundreds of thousands of dollars or more, depending on its size and the distance that data needs to traverse. A SAN is an expensive infrastructure solution when compared with DAS or NAS, at least in terms of its initial deployment costs. However, a SAN solution may deliver a better long-term return on capital invested in storage and servers through increased flexibility and reduction in administrative costs due to centralization.

Interoperability: Many of the first generation SAN devices and management applications provided low levels of interoperability. Typically, all components of the SAN had to be provided by a single supplier. With hardware vendors’ widespread adoption of American National Standards Institute (ANSI) Fibre Channel standards, this is becoming less of an issue.

Storage Technology Design Option 4—Combination Design

Design option 4 is a combination of different types of storage solutions. If designed correctly, such an architecture can mitigate the disadvantages of individual storage solutions by combining the strengths of different storage technologies.

Examples of such hybrid systems include newer NAS appliances that can accommodate Fibre Channel HBAs for accessing storage on a SAN. This approach provides NAS with greater storage scalability, limited only by the disk array components of the SAN. In addition, a SAN can enable clustering of appliances as a two-node failover cluster, providing true fault-tolerant access to critical data. NAS devices with Fibre Channel HBAs can also be backed up to storage devices on a SAN, providing a method to perform LAN-free NAS backups. The following figure depicts such an example:

Figure 3. NAS Using SAN for Disk Subsystem

Advantages

The advantages of using the combination design option include:

Performance: A best-of-breed high-performance storage solution can be constructed by combining different storage technologies.

Flexibility: This solution is the most flexible option because it combines the strengths of all technologies used.

Disadvantages

The disadvantages of using the combination design option include:

Expense: A multi-device solution inevitably leads to higher implementation costs. In addition, high ongoing manageability costs can make this hybrid approach too expensive for all but the most demanding storage environments.

Complexity: The combination of different devices into a single solution increases the possibility of interoperability and manageability issues.

Interoperability: As identified earlier, interoperability issues are more likely, especially when the solution comprises components from different vendors.

Defining Fault Tolerance Requirements

Most organizations today need to have mission-critical data available at any time, which requires building systems that provide fault-tolerant disk configurations for the operating system while also protecting the data. Common design options for data fault tolerance are explained in the following sections.

Fault Tolerance Design Option 1—Data Replication

Data replication is a process by which data is copied from one server to another; examples include Windows Active Directory replication, File Replication Service (FRS), and Windows Internet Naming Service (WINS) replication.

Advantages

The advantages of using data replication are:

Full redundancy: This process provides a fully redundant copy of the data being replicated.

Simplicity: The setup process of the services that provide this functionality "out of the box" is usually straightforward.

Disadvantages

The disadvantages of using data replication are:

Data duplication: Copying data multiplies the storage overhead of the solution by n, where n is the number of copies of the data.

May be bandwidth intensive: In many cases, replication takes place over the organization's network infrastructure, which may place unacceptable overhead on the network. Mitigations include replication throttling and the creation of "replication windows" during periods of low network utilization.
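Both disadvantages can be quantified before a replication design is chosen. The following sketch is a rough planning aid under stated assumptions: the function names, the throttle fraction, and all figures are hypothetical, and the model ignores compression and protocol overhead.

```python
def total_storage_gb(data_gb: float, copies: int) -> float:
    """Storage consumed when the data exists as `copies` full copies."""
    if copies < 1:
        raise ValueError("there must be at least one copy")
    return data_gb * copies


def fits_window(changed_gb: float, link_mbit_per_s: float,
                throttle: float, window_hours: float) -> bool:
    """Check whether daily changes replicate within a replication window.

    throttle is the assumed fraction of the link that replication may use.
    """
    usable_mb_per_s = link_mbit_per_s * throttle / 8  # megabits to megabytes
    hours_needed = changed_gb * 1000 / usable_mb_per_s / 3600
    return hours_needed <= window_hours


# Hypothetical example: 300 GB kept as three copies, and 20 GB of daily
# changes sent over a 10 Mbit/s WAN link during an 8-hour window.
print(total_storage_gb(300, 3))
print(fits_window(20, 10, throttle=0.5, window_hours=8))
print(fits_window(20, 10, throttle=0.6, window_hours=8))
```

A calculation like this shows how sensitive a replication window is to the throttle setting: a small change in the permitted share of the link can decide whether the window is met.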

Fault Tolerance Design Option 2—Using RAID

Defining a RAID solution for the back end data devices is generally the preferred option for providing a fault tolerant storage solution. The reason for this is that the wide range of solutions available under the RAID banner usually means a solution can be tailored to meet particular requirements of storage availability. Please refer to the Storage Devices Blueprint for a detailed discussion of RAID and its various levels.
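As a rough illustration of how the choice of RAID level affects usable capacity, the sketch below applies the commonly cited capacity rules for a few levels. It assumes equal-sized disks and ignores hot spares and controller overhead; the function name and disk counts are hypothetical, and the Storage Devices Blueprint remains the reference for the levels themselves.

```python
def usable_capacity_gb(level: str, disks: int, disk_gb: float) -> float:
    """Return usable capacity for a RAID set built from equal-sized disks."""
    # (minimum disks, function giving the number of disks' worth of data)
    rules = {
        "RAID 0": (2, lambda n: n),        # striping, no redundancy
        "RAID 1": (2, lambda n: 1),        # mirroring, one disk of capacity
        "RAID 5": (3, lambda n: n - 1),    # one disk's worth of parity
        "RAID 10": (4, lambda n: n // 2),  # striped mirrors
    }
    if level not in rules:
        raise ValueError(f"unsupported level in this sketch: {level}")
    minimum, data_disks = rules[level]
    if disks < minimum:
        raise ValueError(f"{level} needs at least {minimum} disks")
    if level == "RAID 10" and disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return data_disks(disks) * disk_gb


# Hypothetical example: six 72 GB disks.
for level in ("RAID 0", "RAID 5", "RAID 10"):
    print(level, usable_capacity_gb(level, 6, 72))
```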

Defining Backup and Recovery Technologies

Service and disk redundancy provides resilience by eliminating single points of failure for services and data. It is essential, however, to make adequate backups so that data and system configurations can be restored in the event of data loss, system failure, or data corruption. Even though every conceivable precaution may be taken, planning a strategy for backup and recovery is important because it is impossible to plan for every disaster or outage that could affect a data center. The quantity of data that is stored varies across environments, but you can often expect it to grow to many terabytes. In addition, the number of users supported will increase over time. These constantly changing environmental factors require a backup solution that can change with the environment while supporting mission-critical application data and minimizing management costs.

It is important to back up critical data and enable it to be quickly restored in the event of data loss, regardless of how small or large the loss is. Crises that result in potential data loss include:

Hard disk subsystem failures

Power failures that result in corrupted data

Systems software failures

Accidental or malicious deletion or modification of data

Destructive viruses

Natural disasters such as fire, flood, or earthquake

Theft or sabotage

An organization’s ability to recover quickly from any outage or disaster, whether it is a component failure or the complete destruction of a site, directly contributes to the organization’s ability to survive the disruption. A backup and recovery solution should be prepared for many types of failures. Such a solution should be based on well-defined requirements for system availability and should take the elements of each server into account.

Assessing the Situation

For each operating system and application introduced in the environment, consider the following questions:

What are possible failure scenarios?

What is the critical data and where is it located?

How often are backups required?

When should full, incremental, and differential backups be done?

Will the backups be server-based or server-free?

Will a SAN storage device be used for magnetic backups?

Will backups be sent to magnetic, magneto-optical, or tape media?

Will the backups be performed online or offline?

Will backups be started manually or use a schedule to start automatically?

How will backup data be validated?

Will backups be stored onsite or offsite?

A good backup and recovery plan should include a disaster avoidance plan, tools that assist recovery from a disaster or an outage, and detailed procedures and standards for performing a recovery. For each subject area, the architecture should clearly define the people, processes, and technologies that are required for success.
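The question of when to perform full, incremental, and differential backups is easier to answer once the restore implications are clear. The sketch below works out which backups are needed to restore to a given day. It assumes a schedule that uses either incrementals or differentials between full backups, not both, because the behavior of mixed schedules depends on the backup product; the `Backup` class and `restore_chain` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Backup:
    day: int   # day number on which the backup ran
    kind: str  # "full", "incremental", or "differential"


def restore_chain(backups: List[Backup], target_day: int) -> List[Backup]:
    """Return the backups to restore, in order, to reach target_day."""
    usable = sorted((b for b in backups if b.day <= target_day),
                    key=lambda b: b.day)
    full_positions = [i for i, b in enumerate(usable) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup exists on or before the target day")
    last_full = full_positions[-1]
    later = usable[last_full + 1:]
    incrementals = [b for b in later if b.kind == "incremental"]
    differentials = [b for b in later if b.kind == "differential"]
    if incrementals and differentials:
        raise ValueError("mixed schedules are not modelled in this sketch")
    if differentials:
        # A differential holds all changes since the last full backup,
        # so only the most recent one is needed.
        return [usable[last_full], differentials[-1]]
    # Every incremental since the last full backup must be applied in order.
    return [usable[last_full]] + incrementals
```

With a full backup on day 0 and incrementals on days 1 to 3, restoring to day 2 requires three backups; with differentials on the same days it requires only two, at the cost of larger nightly backups.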

For more information on the backup and recovery process, refer to the Backup and Recovery Services Blueprint.

Logical Design

A variety of logical designs for storage architecture can be adopted in an organization. In most cases, these designs grow organically on a project-by-project basis. Such an approach leads to serious cost issues, because the TCO of such a solution rises with each addition to the infrastructure.

A better approach is to provide an overall architecture for storage with which each project should comply. This section includes some examples of logical storage architecture designs that can be used in different projects.

Note: These designs are aimed at the server, application, and user data requirements as detailed earlier in this blueprint. The assumption is that the operating system data is provided through a simple DAS implementation.

Logical Design Option 1—Distributed Storage Model

In the distributed storage model, the architecture dictates that each server and service provides, configures, and manages its own storage requirements in the organization. The following figure depicts this solution in a typical enterprise scenario:

Figure 4. Distributed Storage Example

Advantages

The advantages of the distributed storage model include:

Flexibility: This option provides a flexible model for each application to work in and gives them complete control of the design they prefer.

Minimum impact on other service storage requirements: Separating the storage requirements minimizes the impact on other services within the organization.

Disadvantages

The disadvantages of the distributed storage model include:

Many storage decisions to make: If the project requires its own storage solution, it will need to have access to expertise in all aspects of the storage solution to make the correct decisions.

High TCO: The distributed storage model becomes expensive over time. Each separately provisioned store brings its own management overhead (such as monitoring and backup and recovery options), so the organization ends up running many small, duplicated solutions that cost more to run than a single centralized solution.

Logical Design Option 2—Hybrid Storage Model

In the hybrid storage model, the architecture defines a number of storage solutions that projects and services can use. The following figure depicts this solution in a typical enterprise scenario:

Figure 5. Hybrid Storage Example

Advantages

The primary advantage of the hybrid storage model is that it provides flexibility by creating a number of storage solutions for projects and services. With this solution, project and service teams are able to decide the storage solution that best matches their requirements.

Note: In practice, this solution allows a project to select the distributed storage model if its requirements cannot be met by the enterprise-defined hybrid storage. This is an acceptable compromise because the onus is on the project to justify why its requirements are not met, which leads to more flexibility and also allows the silos to adjust over time to meet these new requirements as necessary.

Disadvantages

The disadvantages of using the hybrid storage model option include:

Complexity: The hybrid storage solution can be the most complex to implement because it combines all storage technologies to meet the varying requirements of the organization. The process of defining the requirements of each service within an organization can be long and complex.

Expense: The ongoing TCO of this solution is likely to be high because it contains examples of all the storage technologies from a number of different vendors.

Logical Design Option 3—Centralized Storage Model

In the centralized storage model, the architecture defines a single storage solution that is powerful enough to meet the needs of the organization's services. The following figure depicts this solution in a typical enterprise scenario:

Figure 6. Centralized Storage Example

Advantages

The advantages of using the centralized storage model option include:

Low TCO: The centralized storage model is a powerful solution that has a low TCO because storage needs of all the services can be efficiently provided and managed out of a single solution.

High security focus: The centralized approach can be configured to provide a highly secure store for the organization’s data. A centralized storage solution is usually housed in access-controlled data centers; file servers are generally more difficult to control because they are distributed throughout the organization.

Note: It is possible to establish a central data store with bad security, so the security element is not implicit. However, the focus provided in this type of design generally promotes a secure solution.

Disadvantages

The disadvantages of using the centralized storage model option include:

Limited use in geographically distributed sites: The solution is somewhat inflexible and requires all servers’ storage to be located in a geographical location with high-speed access to the solution.

Expensive startup costs: A large-scale centralized store is expensive to establish.

Solution certification issues: Vendor certification and support issues may block the creation of a single solution.

Architecture Dependencies

Storage architecture has only one direct architecture dependency: the network architecture. This dependency is obvious if NAS is used as part of the storage architecture. This dependency also exists if DAS is used, because DAS invariably requires the network to provide backup solutions to a number of DAS elements.

The management architecture is indirectly dependent on the storage architecture. A significant element of the storage solution is its manageability throughout the organization, which means that a significant amount of dependency is placed on the management architecture to allow the storage team to effectively manage their solution.

A number of services are required to implement a complete storage architecture for an enterprise. These services are listed in the following table.

Dependency Name | Specific Requirements
Directory Service | A directory service is required to manage user authentication and authorization and manage other resource objects in the environment.
Network Devices | NAS/SAN client access occurs over the Ethernet network.
Network Services | For all hosts and storage management devices to communicate properly, TCP/IP name resolution services are required. Additional NAS/SAN management occurs usually through HTTP, TELNET, and/or Simple Network Management Protocol (SNMP) interfaces.
Server Deployment | All SAN-connected servers must be installed with correct driver/firmware versions.
Backup and Recovery Services | Backup and recovery services are required for data availability and storage management.
Infrastructure Management Services | Infrastructure management services are required to provide the resources for remote administration and support the storage systems in the enterprise environment.

Table 1. Storage Architecture Service Dependencies

Availability

Each storage architecture model is capable of providing a highly available storage solution, although the manner in which availability is provided differs. The primary thing to keep in mind is the relative importance of availability to the organization. For example, a centralized storage model is likely to be expensive to establish. Consider the case of an organization that has a requirement for a large reference library that is neither mission critical nor regularly used. Instead of placing the reference library on a large centralized SAN, the use of the hybrid storage model is better because it allows the creation of cheaper NAS or DAS solutions for certain data needs.

Security

Security is a primary concern whenever a company's data is involved. This section provides a summary of some of the security issues that may arise for each architecture. However, it is important to understand that each design can provide a secure data environment if correctly configured and managed.

In the distributed and hybrid storage models, the data to be secured is spread across many systems and locations. In the hybrid storage model, the number of locations is consolidated based on the needs of the users and applications using the data. This physical separation of data has an advantage in that it provides an automatic boundary to the security risk for the data in that location. The physical separation, however, has two disadvantages: it is difficult to physically secure each data store, and it is easy to miss security "holes" that might put the company data at risk.

The centralized storage model provides a single storage environment that is easy to physically secure, and its focused management environment is easier to secure from a logical perspective. However, the lack of physical separation requires management to ensure that there are no security "leaks" between the various volumes in the data store.

Security Lockdowns

The processes involved in locking down the data in each storage model are basically the same. Through the use of secure policies and access control lists (ACLs), the servers, devices, and data are secured to ensure that only authorized users and applications are able to access the data. These processes should always be configured on a default "No Access" basis so that allowing access is a required administrative task.
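The default "No Access" principle can be sketched as a simple access check in which anything not explicitly granted is refused. This is a conceptual illustration only, not the ACL model of any particular operating system or storage device; the group names and actions are hypothetical.

```python
from typing import Dict, Set


def is_allowed(acl: Dict[str, Set[str]], principal: str, action: str) -> bool:
    """Default deny: allow only actions explicitly granted to the principal."""
    return action in acl.get(principal, set())


# Hypothetical ACL for a single volume. Granting access is an explicit
# administrative step; a principal with no entry gets no access.
volume_acl = {
    "storage-admins": {"read", "write", "configure"},
    "backup-operators": {"read"},
}

print(is_allowed(volume_acl, "backup-operators", "read"))
print(is_allowed(volume_acl, "backup-operators", "write"))
print(is_allowed(volume_acl, "contractors", "read"))
```

The design choice worth noting is that the function has no "deny" entries at all: the absence of a grant is the denial, so a forgotten entry fails safe.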

For more information on providing security policies and processes, refer to the Security Architecture Blueprint and the Storage Devices Blueprint.

Scalability

All three storage architectures provide for scalability in some fashion. In the distributed storage model, each server can add drives to the storage enclosure (whether internal or external). Alternatively, smaller drives could be swapped for higher capacity devices. However, this is a time-consuming process that can involve copying data between drives and possible system downtime.

When a SAN solution is involved, all you need to do is add a new drive to the SAN and configure it to be added to the overall storage pool. The exact process varies depending on the manufacturer's device, but all SAN solutions are designed to provide scalability of the storage pool with minimal storage system downtime.

Manageability

Manageability of technologies used in storage architecture is a major consideration when designing the solution. The centralized storage architecture model provides for the simplest management of these technologies because it is focused and highly controlled. The decentralized models of the hybrid and distributed storage architectures present some challenges in manageability, which are dealt with using the remote management capabilities available in most storage technologies.

Both the storage technology and the device manufacturer affect the manageability of an architecture. For more details on the manageability of major storage technologies, refer to the Storage Devices Blueprint. For details on the overall management architecture for the enterprise, refer to the Management Architecture Blueprint.

Role-based Administration

Enterprise operations roles include dedicated specialties such as messaging system administration, telecommunications, networking, storage, and database administration. The operations roles manage daily operations and the system administration activities that run and maintain the IT services and applications throughout an organization. In doing so, they perform the scheduled and repeatable processes such as data backup, archiving and storage, output management, system monitoring, and event log management.

The storage manager should be part of the security role cluster, working specifically with the IT security manager to ensure that data remains secured in all aspects of the storage architecture. Some goals for this cluster are:

Data confidentiality: No unauthorized users should be able to access the data. No one should be able to view data if not authorized.

Data integrity: Authorized users should feel confident that the data presented to them is accurate and properly updated.

Data availability: Authorized users should be able to access the data they need to perform their job functions.

Some common tasks for the storage administrator are:

Performing backup and recovery of critical data.

Monitoring disk and other storage media for availability and capacity.

Administering RAID storage, CD-ROM towers, and optical storage jukeboxes.
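The capacity monitoring task in the preceding list can be partly automated. The sketch below uses the Python standard library function `shutil.disk_usage` to flag volumes above a usage threshold; the helper names and the 80 percent threshold are assumptions for illustration, and a production environment would normally rely on its management tooling instead.

```python
import shutil
from typing import Iterable, List


def percent_used(total_bytes: int, used_bytes: int) -> float:
    """Return used capacity as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total capacity must be positive")
    return used_bytes / total_bytes * 100


def volumes_needing_attention(paths: Iterable[str],
                              warn_pct: float = 80.0) -> List[str]:
    """Return the paths whose volume is at or above the warning threshold."""
    flagged = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        if percent_used(usage.total, usage.used) >= warn_pct:
            flagged.append(path)
    return flagged
```

For example, `volumes_needing_attention(["."])` checks the volume that holds the current directory.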

Some common tasks for a storage manager include:

Maintaining end-to-end responsibility for the storage management process.

Determining strategies for backup, restore, and recovery of data.

Establishing and monitoring adequate backup, restore, and recovery procedures.

Ensuring that backup documentation exists and remains current.

Ensuring that the storage operators have the right skills and tools to perform backups, restores, and recoveries.

Ensuring accurate representation of storage resources in the CMDB.

Configuring the specific backup and recovery events to be monitored.

Configuring storage tools and utilities according to service level requirements.

Ensuring that storage resources are in good working order.

Coordinating the installation and maintenance of system hardware and software.

Providing technical support for storage management systems.

Managing client/server storage management configurations.

Executing routine tasks to ensure smooth running of storage devices and peripherals.

Executing documented administration procedures for service quality assurance (QA).

Executing appropriate storage management security procedures.

Confirming appropriate authorizations for system and data access.

Configuring storage resources for the test environment.

Adjusting system storage capacities (for example, hardware, file system, and software parameters) based on service plans.

Executing change management work orders related to the storage resources.

Isolating and resolving faults associated with storage resources.

Participating in production testing of storage resources.

Installing patches related to storage management.

Participating in pilot testing of storage resources.

Ensuring sufficient storage space exists for business and utility applications.

Ensuring storage management and system administration procedures are in place prior to service activation.

Detecting storage management events and raising alerts across multiple shifts.

Ensuring that backup results match expectations.

Executing end user backup and restoration.

MOF Role Cluster | Role Name | Role Responsibilities | Active Directory Permissions | Other Permissions/Abilities
OPERATIONS, RELEASE, INFRASTRUCTURE, SUPPORT, SECURITY | Storage Administrator | See preceding list | Local Admin on server, SAN Management Appliance/SAN and NAS devices | N/A
OPERATIONS, RELEASE, INFRASTRUCTURE, SUPPORT, SECURITY | Storage Manager | See preceding list | Local Admin on server, SAN Management Appliance/SAN and NAS devices | N/A

Table 2. Storage Role-based Administration

For further information on role-based administration, refer to the "MOF Team Model for Operations" article at:

http://www.microsoft.com/technet/solutionaccelerators/cits/mo/mof/moftml.mspx

System Administration

Administration of a centralized architecture is relatively simple because all tasks are in one place. Also, most storage devices allow remote system administration. The presentation and communication mechanisms for remote administration fall into three basic types:

MMC

Web-based

Proprietary

The MMC and proprietary tools provide established remote administration functionality, whereas the Web-based tools are still maturing. Note that the security design of the network architecture may make remote management impossible across certain security boundaries. For example, if a storage solution is situated on a perimeter network, the port filtering on the firewall may block the remote system administration tools.

Windows Server 2003 and Windows Storage Server 2003 devices have additional systems administration options, including Terminal Services and command line tools, to help automate common tasks.

Performance

The performance of a storage architecture is dependent on the storage technologies used within the design. For details on the performance of these technologies, please refer to the Storage Devices Implementation Guide.

Supportability

Supportability of various hardware vendor devices is usually documented by the respective hardware vendors in a compatibility and supportability matrix. Contact the appropriate hardware vendor to ensure that all planned systems are configured in a supported fashion. For details on the supportability of these devices, please refer to the Storage Devices Implementation Guide.

Consolidation

If you are planning to implement either the hybrid storage or centralized storage architectures, consolidation will form a significant part of your project. Success of a consolidation project is dependent on following a prescriptive methodology and adhering to project priorities that are determined prior to the deployment. Microsoft has published a server consolidation methodology that outlines common tasks for a consolidation project. The details of this server consolidation methodology are available at the following URL:

http://www.microsoft.com/servers/consolidation/method.asp

Assessing Current Infrastructure

You need to ask several questions when assessing an environment for consolidation. For example, where is the storage allocated in an environment? Is it being used efficiently? Often storage administrators find that a moderate percentage of storage is left unused despite being formatted and available. Traditionally, administrators buy more storage than required because RAID definition limitations create artificial size boundaries. In addition, unknown growth requirements, long lead times for product orders, and inflexible storage functionality force customers to buy storage in increments of the disk sizes currently on the market (such as 18 GB, 36 GB, and 72 GB).
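The efficiency question above can be answered with a simple utilization summary across servers. The following sketch is illustrative: the server names and capacities are hypothetical, and in practice the figures would come from the physical and logical data worksheets.

```python
from typing import Dict, Tuple


def utilization_summary(servers: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Summarize storage use; servers maps name -> (allocated_gb, used_gb)."""
    allocated = sum(alloc for alloc, _ in servers.values())
    used = sum(use for _, use in servers.values())
    if allocated <= 0:
        raise ValueError("no allocated storage to assess")
    unused = allocated - used
    return {
        "allocated_gb": allocated,
        "used_gb": used,
        "unused_gb": unused,
        "unused_pct": unused / allocated * 100,
    }


# Hypothetical DAS environment with three servers.
summary = utilization_summary({
    "file01": (72, 40),
    "mail01": (144, 100),
    "db01": (72, 20),
})
print(summary["unused_gb"], round(summary["unused_pct"], 1))
```

In this made-up environment, well over a third of the allocated storage sits unused on individual servers, which is capacity that a pooled design could reallocate.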

Identifying Goals (What is the Outcome of Storage Consolidation?)

Some of the questions you should ask when identifying goals of the storage consolidation are:

Does storage consolidation provide more efficient, centralized management of resources?

Does it provide for a higher capacity usage of existing storage?

Does it reduce the administration costs, thereby reducing the TCO?

Does it meet service requirements for high performance, availability, and security?

Designing a New Environment

The final target storage architecture may be a hybrid that consists of DAS, NAS, and SAN technologies. Ensure that the design protects against any loss of access to information by implementing appropriate fault-tolerant technologies and avoiding single points of failure as much as possible while staying within the budgets allocated for consolidation.

Planning for the Migration

Assess the business impact of each of the consolidation alternatives. Identify the organizational roles and responsibilities during and after the consolidation. In addition, assess the plan, its risks, the budget, and desired results prior to implementation.

Building, Testing, and Implementing the New Pilot Environment

To help ensure the pilot is successful, provide the storage administrators and managers with adequate training and resources to support the chosen hardware.

Developing a Plan for Migrating Users and Data

Company data and intellectual property are valuable assets for an organization. Care must be taken to back up the data prior to migrating to a new storage architecture. Validate that the plan provides for the same level of access and functionality and, if not, make sure a rollback contingency option is built into the plan.

Implementing the New Production Environment

Once the new storage architecture hardware is in place, deploy the applications, utilities, and tools in the new, consolidated production environment. In addition, develop and document post-consolidation maintenance and management procedures.

Migrating Users and Data to the Consolidated Environment

Ideally, the planned migration should be tested in a pilot environment with a replica of the production data to ensure a smooth process. A migration can take place in a staged manner, in which selected sections of the data are migrated to the new environment and checked before the next section is moved. Alternatively, the migration can take place as a single operation. This approach potentially carries more risk, but it is generally also the less disruptive one.
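The staged approach described above, in which each section is checked before the next one is moved, can be sketched for file-based data as follows. This is a minimal illustration, not a migration tool: the function names, the batch size, and the use of SHA-256 checksums as the check are all assumptions for the example. A production migration would also need to handle permissions, open files, and rollback.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_in_stages(source: Path, target: Path, batch_size: int = 100) -> List[Path]:
    """Copy files from source to target in batches (hypothetical helper).

    Each batch is verified by checksum before the next batch starts.
    A mismatch stops the migration and leaves the source data untouched.
    """
    files = sorted(p for p in source.rglob("*") if p.is_file())
    migrated: List[Path] = []
    for start in range(0, len(files), batch_size):
        batch = files[start:start + batch_size]
        # Stage 1: copy this section of the data.
        for src in batch:
            dst = target / src.relative_to(source)
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
        # Stage 2: check the section before moving on.
        for src in batch:
            dst = target / src.relative_to(source)
            if sha256_of(src) != sha256_of(dst):
                raise RuntimeError(f"Verification failed for {src}")
        migrated.extend(batch)
    return migrated
```

Because the sketch copies rather than moves the data, the original files remain available if a batch fails verification.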

Evaluating and Reviewing

Consolidation is an iterative process and hence requires re-evaluation on a regular basis. Evaluate the results of your consolidation project, including costs and maintenance procedures, and use the findings to optimize your environment further.

Interoperability

If an organization has specific interoperability requirements, it is important that they be documented as requirements of the storage solution. For example, if the organization has a heterogeneous client base of servers or workstations, the storage technology must support the communication protocols and file systems required by those servers and workstations.

Interoperability can be a serious issue when different storage vendors provide the SAN elements. Contact the appropriate hardware vendor to ensure that all planned systems are configured in an interoperable manner. Referencing the HCL may be a good starting point to ensure that each device meets the required interoperability standards.

