Identity Aggregation and Synchronization

Chapter 2: Approaches to Identity Aggregation and Synchronization

Published: May 11, 2004 | Updated: June 26, 2006

A typical large organization may have dozens of data stores for identity information. Even medium and small organizations usually have several identity stores. The challenge is how to aggregate the correct data from all of the identity stores in an organization and then synchronize that data to the identity stores that hold incorrect or out-of-date data.

For example, an employee’s job title and address are usually stored in more than one identity store. When an employee moves or is promoted, the same information must be updated in several different identity stores. To further complicate matters, identity stores are often managed by independent departments. Keeping track of these changes and propagating them to all identity stores within an organization is the process of identity aggregation and synchronization.


Common Identity Data Sources

There are three main types of identity data sources:

Directories

Databases

Flat files

This section also discusses a special type of database: the Human Resources (HR) department database.

Directories as Identity Data Sources

To manage data objects, organizations often use a specialized data store called a directory. A directory provides a well-defined set of object classes with associated attributes and a hierarchical view for organizing objects. A directory service exposes the operations necessary to locate and manage the content of a directory.
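As a minimal sketch of these ideas, the following Python fragment models directory objects as entries that have a distinguished name, an object class, and attributes, and then searches one branch of the hierarchy. The `Entry` class, the `search` function, and the example.com names are hypothetical and exist only for illustration; a real directory service exposes these operations through its own protocol and tools.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One directory object: a distinguished name, an object class, and attributes."""
    dn: str
    object_class: str
    attributes: dict = field(default_factory=dict)


# Hypothetical entries; the names and the example.com suffix are illustrative only.
DIRECTORY = [
    Entry("ou=Sales,dc=example,dc=com", "organizationalUnit"),
    Entry("cn=Ann Lee,ou=Sales,dc=example,dc=com", "person",
          {"mail": "ann.lee@example.com", "title": "Account Manager"}),
    Entry("cn=Raj Patel,ou=Engineering,dc=example,dc=com", "person",
          {"mail": "raj.patel@example.com", "title": "Developer"}),
]


def search(entries, base_dn, object_class=None):
    """Return entries at or below base_dn, optionally filtered by object class."""
    base = base_dn.lower()
    results = []
    for entry in entries:
        dn = entry.dn.lower()
        in_subtree = dn == base or dn.endswith("," + base)
        if in_subtree and object_class in (None, entry.object_class):
            results.append(entry)
    return results


# Find every person object under the Sales organizational unit.
sales_people = search(DIRECTORY, "ou=Sales,dc=example,dc=com", "person")
print([e.attributes["mail"] for e in sales_people])
```

Because each name encodes the position of the object in the hierarchy, a search can be limited to one branch without any knowledge of the other branches.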

Typically, directories are used for:

E-mail address books or white pages that contain name and e-mail address information.

E-commerce directories that contain information about users and profiles.

Server operating system directories that contain information about users, computers, devices, and applications.

Historically, directories were custom applications that were designed to fulfill a specific role within an organization's network environment. In many cases, separate directories were implemented to contain relevant information to satisfy specific target functions.

Databases as Identity Data Sources

There are many identity stores in an organization that are not directory-based. Identity stores for individual applications are often implemented as databases for the following reasons:

Developers often have a better understanding of database technologies and interfaces compared to directory services.

Directory service administrators may not want developers altering a directory schema.

For these cases, databases can easily adapt to storing identity information, but there are several drawbacks. Databases are inherently non-hierarchical, but when storing information about people it is usually more convenient to mimic typical organizational hierarchies like companies, departments, and teams. These hierarchical structures help to easily locate objects and provide intuitive searching capabilities. In addition, databases generally do not follow a common schema that defines the data and its characteristics.

Databases also do not come with a suite of security services for authentication, authorization, trusts, and security auditing. All required functionality must be programmed separately for each database, which duplicates effort.

Flat Files as Identity Data Sources

Flat files (text-based files such as comma-delimited and XML files) can also serve to store identity information, especially with older applications. Flat file identity stores suffer from all of the same issues as databases, but typically provide significantly worse performance and manageability.

Flat files are often used for importing and exporting information between data sources and platforms if direct integration is otherwise infeasible.
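The following Python sketch illustrates this kind of exchange with a comma-delimited file. The field names and the sample record are assumptions made for the example, and an in-memory buffer stands in for a file on disk.

```python
import csv
import io

# Hypothetical set of fields that the two systems have agreed to exchange.
FIELDS = ["employee_id", "first_name", "last_name", "title"]


def export_identities(records, stream):
    """Write identity records as comma-delimited text with a header row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)


def import_identities(stream):
    """Read records back, keyed by employee ID; all values arrive as strings."""
    return {row["employee_id"]: row for row in csv.DictReader(stream)}


records = [
    {"employee_id": "1001", "first_name": "Ann", "last_name": "Lee",
     "title": "Manager, Sales", "salary": "not exported"},
]

buffer = io.StringIO(newline="")  # stands in for a file opened with newline=""
export_identities(records, buffer)
buffer.seek(0)
imported = import_identities(buffer)
print(imported["1001"]["title"])
```

The sketch also shows two limitations of flat files: every value is read back as text, and any field that is not in the agreed list is silently dropped from the export.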

Human Resources Databases as Identity Data Sources

The HR database (or equivalent) is a special case because of the functions of the HR department and its role in the management of an organization's users. The HR database is usually an authoritative source of information about the existence of user identities and many of the key attributes of a person, such as employee ID, first name, last name, home address, and so on.

The HR department is typically the first to know that an employee has been either hired or fired, so the HR database provides the authoritative indication that a user identity should be added to, or removed from, the environment. The HR database also manages many user attributes, which makes it an important source of identity data that must be synchronized to other identity stores.

For security and privacy reasons it is usually difficult to have an HR database participate directly with identity integration services. However, HR departments will typically permit a reduced database view with read-only access, or a flat file containing specific fields from the HR database to be used.
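A reduced view of this kind might look like the following sketch, which uses an in-memory SQLite database from Python. The table name, the column names, and the `identity_export` view are hypothetical. SQLite rejects writes to a plain view, whereas other database products would typically enforce read-only access through permissions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical HR table that mixes identity attributes with sensitive data.
conn.execute("""
    CREATE TABLE employees (
        employee_id TEXT PRIMARY KEY,
        first_name TEXT, last_name TEXT, title TEXT,
        home_address TEXT, salary INTEGER, active INTEGER
    )
""")
conn.execute(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("1001", "Ann", "Lee", "Account Manager", "1 Main St", 90000, 1),
)

# The reduced view exposes only the fields that identity integration needs.
conn.execute("""
    CREATE VIEW identity_export AS
    SELECT employee_id, first_name, last_name, title, active
    FROM employees
""")

cursor = conn.execute("SELECT * FROM identity_export")
exported_columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(exported_columns)
```

A consumer that reads only from the view never sees the salary or home address columns.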

Synchronizing Identity Information

Regardless of how identity data is stored, there are several common scenarios that affect the management of this information. The following table describes some of these scenarios.

Table 2.1. Identity Scenarios and Requirements

Scenario | Requirements
Implementing single sign on | Manage user name, password, and access rights information across many different platforms and applications.
Managing a global address book | Synchronize mailbox information among the e-mail directories that are used within a company.
Managing e-commerce applications | Synchronize information for suppliers and extranet users, such as digital certificates, with e-commerce directories that reside in perimeter networks.
Hiring/firing employees | Quickly propagate information about newly hired employees to all systems that require identity information, and quickly perform the same processes in reverse when employees leave.

The following sections describe a number of approaches that are commonly used to accomplish these tasks, including:

Manual administration.

Implementing automation through custom scripts.

Implementing automation through product-specific integration services.

Using a metadirectory product.

Using an identity integration product.

Manual Administration

Manual administration is the default mechanism for managing the attributes of users in identity stores. Some identity stores provide graphical tools for this purpose; for example, the Microsoft® Active Directory® directory service provides the Active Directory Users and Computers snap-in for the Microsoft Management Console (MMC). This tool provides a convenient graphical interface for quickly viewing and changing user attributes.

Although manual tools are intuitive and easy to use for a trained IT administrator, they are cumbersome to use across multiple identity stores and often result in errors and inconsistencies.

Custom Scripts

After manual administration becomes cumbersome, the typical next step is for the IT administrator to create scripts that manage identity attributes in various stores. Through powerful scripting languages such as Perl or Visual Basic® and interfaces such as Active Directory Service Interfaces (ADSI), it is fairly easy to create scripts that can manipulate identity data in an organization.

While easy to create and cheap to implement, most script-based identity synchronization solutions have one or more of the following issues:

Lack of centralized control. Because scripts are easy to create, they can spread quickly throughout an organization. Unfortunately, they are also often poorly maintained and many organizations have little understanding of all the scripts that are currently in use. This lack of maintenance and awareness can lead to significant problems with identity stores that can result in security issues and loss of data.

Limited error and exception handling capabilities. Incorrect error handling can cause a script to abort prematurely, which can lead to unsynchronized data between identity stores or a loss of integrity in the identity store database itself. Limited error reporting can hide the fact that problems even exist. Poor exception handling can have an even greater impact; a poorly written script might delete all identity information in the target identity store. This result could range from annoying to catastrophic, but will almost certainly cause lost productivity for administrators, users, or both.

Dependence on the people who develop them. Investments in custom scripts are often made behind the scenes, and typically there are a limited number of experts for any script. When those people are away or leave the organization and a problem occurs, the organization suffers as a result.

No preview mode. A script that updates objects in one identity store with object data from another identity store can have a significant impact. A preview mode would show the result of running the script before the script is run. Unfortunately, most scripts are not this sophisticated, so every run carries some risk: the outcome is unknown until the script finishes, and administrators simply have to trust the script.

Undesirable security characteristics. A script with poor security characteristics may have the user name and password of an administrator-equivalent account hard-coded into the script. Unfortunately, this is all too common an occurrence. Another common characteristic of script-based mechanisms is that an administrator, or group of administrators, needs to retain credentials that allow highly privileged access to each identity store so that the scripts they run in their user context can read and write the identity information in that store.

Limited scalability and redundancy. Most scripts do not scale well to support dozens of identity stores, do not include redundancy for hardware failures and other exceptions, and are generally unable to meet the needs of larger organizations.
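To make several of these issues concrete, the following Python sketch shows one way a script could offer a preview mode, continue past individual failures, refuse to perform mass deletions, and avoid hard-coded credentials. It is an illustration under stated assumptions, not a recommended implementation: the identity stores are plain dictionaries keyed by employee ID, and the function names, the 20 percent deletion threshold, and the `SYNC_SERVICE_PASSWORD` variable are all hypothetical.

```python
import os


def plan_changes(source, target):
    """Compare two stores (dicts keyed by employee ID) and list the needed changes."""
    changes = []
    for emp_id, attrs in source.items():
        if emp_id not in target:
            changes.append(("add", emp_id, attrs))
        elif target[emp_id] != attrs:
            changes.append(("update", emp_id, attrs))
    for emp_id in target:
        if emp_id not in source:
            changes.append(("delete", emp_id, None))
    return changes


def sync(source, target, apply=False, max_delete_ratio=0.2):
    """Return (changes, errors). With apply=False nothing is written (preview mode)."""
    changes = plan_changes(source, target)
    deletes = [c for c in changes if c[0] == "delete"]
    # Guard against a bad or empty source wiping out the target store.
    if target and len(deletes) / len(target) > max_delete_ratio:
        raise RuntimeError(f"refusing to delete {len(deletes)} of {len(target)} entries")
    errors = []
    if apply:
        for action, emp_id, attrs in changes:
            try:
                if action == "delete":
                    del target[emp_id]
                else:
                    target[emp_id] = dict(attrs)
            except Exception as exc:  # record the failure and continue with the rest
                errors.append((emp_id, str(exc)))
    return changes, errors


# Credentials come from the environment rather than being hard-coded.
# SYNC_SERVICE_PASSWORD is a hypothetical variable name; it is unused in this sketch.
password = os.environ.get("SYNC_SERVICE_PASSWORD")

hr = {"1001": {"title": "Director"}, "1002": {"title": "Analyst"}}
mail = {"1001": {"title": "Manager"}}

preview, _ = sync(hr, mail)                    # preview only; mail is unchanged
applied, errors = sync(hr, mail, apply=True)   # now the changes are written
```

Even with these safeguards, the sketch does nothing about centralized control, dependence on its author, or scalability, which are the issues that scripts are least able to address.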

Integration Services

Integration services provide another approach to automating maintenance of identity information, although they usually only integrate with a single type of identity store without the flexibility of a full identity integration product. Examples of these integration services are:

Windows Server 2003 R2. Windows Server 2003 R2 includes built-in interoperability components that help you integrate UNIX and Windows environments. This interoperability includes the Subsystem for UNIX-based Applications, directory services integration, and File and Print services. For more information about UNIX interoperability, see Windows Server 2003 R2 UNIX Interoperability Components.

Services for UNIX. Windows Services for UNIX version 3.5 provides the programs and services to support identity integration between Windows and UNIX or Linux computers. For more information about Services for UNIX, see the Windows Services for UNIX 3.5 downloads page.

Services for NetWare. Microsoft Directory Synchronization Services (MSDSS), which is included with Services for NetWare 5, includes support for propagation of identity information from Active Directory to Novell eDirectory 8.7. For more information about MSDSS, see the Microsoft Windows Services for NetWare 5.03 page.

Host Integration Server (HIS). A comprehensive mainframe integration platform, HIS enables seamless access to host-based systems through a user’s Active Directory account and provides automatic authentication of an identity in both Active Directory and the host system. HIS maintains account information between Windows and the host system to enable single sign on and password management. Bidirectional password synchronization is also available for mainframe security systems (RACF, ACF/2, and Top Secret) with the addition of third-party tools. For more information, see the Host Integration Server Web page.

Active Directory Connector (ADC). ADC provides directory synchronization and import/export tools. It lets administrators replicate a hierarchy of directory objects between Microsoft Exchange servers and Active Directory. For more information, see the Exchange Server 2003 Active Directory Connector Solutions Center.

Data Transformation Services (DTS). A set of Microsoft SQL Server™ 2000 components that allows database administrators to import, export, and transform both relational and non-relational sources of data, providing a powerful toolset for transferring data between systems. DTS can be an appropriate choice for translating identity data between multiple database sources and flat files. For more information about SQL Server DTS, see the Data Extraction, Transformation and Loading Techniques page.

The Role of Metadirectories

A metadirectory is a store containing information from multiple directories. It provides a centralized view of relational data from disparate identity stores throughout the enterprise. Even though separate directories may not share information, metadirectories make this relational view of data from all directories possible.

Although metadirectory products may attempt to provide a single view of identities, they do not always aggregate and synchronize identity information with each of the connected data sources. Customers want this crucial capability to ensure the applications that use each identity store relay accurate and up-to-date information to their users.
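The difference between a single view and true synchronization can be sketched in Python as follows. The store names, the attribute precedence rules, and both functions are hypothetical. The first function builds the combined view by joining records on employee ID, and the second lists the values in the connected stores that disagree with that view and would remain stale unless they were synchronized.

```python
# Which store is authoritative for each attribute (hypothetical precedence rules).
PRECEDENCE = {
    "title": ["hr", "directory"],
    "mail": ["mail_system", "directory"],
    "phone": ["directory", "hr"],
}


def aggregate(stores):
    """Join records from all stores on employee ID into one combined view."""
    all_ids = set()
    for records in stores.values():
        all_ids.update(records)
    combined = {}
    for emp_id in sorted(all_ids):
        person = {}
        for attribute, sources in PRECEDENCE.items():
            # Take the first non-empty value in precedence order.
            for source in sources:
                value = stores.get(source, {}).get(emp_id, {}).get(attribute)
                if value:
                    person[attribute] = value
                    break
        combined[emp_id] = person
    return combined


def stale_values(stores, view):
    """List (store, employee ID, attribute) triples that disagree with the view."""
    stale = []
    for store_name, records in stores.items():
        for emp_id, attrs in records.items():
            for attribute, value in attrs.items():
                expected = view.get(emp_id, {}).get(attribute)
                if expected is not None and value != expected:
                    stale.append((store_name, emp_id, attribute))
    return stale


stores = {
    "hr": {"1001": {"title": "Director", "phone": ""}},
    "directory": {"1001": {"title": "Manager", "phone": "555-0100"}},
    "mail_system": {"1001": {"mail": "ann.lee@example.com"}},
}
view = aggregate(stores)
out_of_date = stale_values(stores, view)
print(view["1001"])
print(out_of_date)
```

In this example the combined view is correct, but the directory still holds the old job title and the HR store still lacks the phone number until those values are written back.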

Microsoft Metadirectory Services (MMS) 2.2, the precursor to Microsoft Identity Integration Server 2003, Enterprise Edition, is an example of a metadirectory product.

Identity Integration Products

An identity integration product is designed to provide all of the functionality of scripts and integration services, but also address the drawbacks listed in the previous sections. Identity integration products also provide additional functionality that may be very hard or impossible to implement with scripts.

Identity integration products typically provide the following set of features:

Aggregation and synchronization of identity information across multiple identity stores.

Password management services, including password propagation of changes and resets.

Group management of security and distribution groups, including synchronization of groups across different identity stores.

Automated provisioning and centralized management of identity information.

E-mail contact synchronization between heterogeneous systems such as Microsoft Exchange Server and Lotus Notes.

Additional vendor-specific features, such as global address list (GAL) synchronization for Microsoft Exchange 2000 Server and Exchange Server 2003 across multiple forests.
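As a rough illustration of automated provisioning, the following Python sketch compares the set of active employees reported by an authoritative source with the accounts that exist in each connected store, and then lists the accounts to create and the accounts to remove. The function name, the store names, and the sample IDs are hypothetical; an identity integration product would apply configurable rules rather than a simple set difference.

```python
def plan_provisioning(hr_active_ids, store_accounts):
    """For each store, list accounts to create and accounts to remove."""
    plan = {}
    for store_name, account_ids in store_accounts.items():
        plan[store_name] = {
            # Active in HR but missing from the store: create the account.
            "provision": sorted(hr_active_ids - account_ids),
            # Present in the store but no longer active in HR: remove the account.
            "deprovision": sorted(account_ids - hr_active_ids),
        }
    return plan


hr_active = {"1001", "1002", "1004"}
accounts = {
    "directory": {"1001", "1002", "1003"},
    "mail_system": {"1001"},
}
plan = plan_provisioning(hr_active, accounts)
print(plan)
```

Driving every store from the same authoritative list is what allows new employees to receive accounts everywhere at once and departed employees to lose them everywhere at once.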

Microsoft offers two identity integration products:

Microsoft Identity Integration Server 2003, Enterprise Edition with Service Pack 1 (MIIS 2003 with SP1).

Identity Integration Feature Pack 1a for Windows Server™ Active Directory.

Both products have similar software requirements: Windows Server 2003, Enterprise Edition and Microsoft SQL Server 2000, Enterprise Edition or SQL Server 2000 Developer Edition (for testing purposes only). However, each product offers a different level of support for integration with external systems.

Note: SQL Server 2000 Developer Edition is licensed per developer and must be used for designing, developing, and testing purposes only. It should not be confused with Microsoft SQL Server Desktop Engine (MSDE). For more information, see Microsoft SQL Server: How to Buy.

Microsoft Identity Integration Server 2003, Enterprise Edition with Service Pack 1

MIIS 2003 with SP1 is an enterprise identity integration product from Microsoft; it replaces the previous metadirectory product, Microsoft Metadirectory Services (MMS) 2.2. MIIS 2003 with SP1 provides all the identity integration product features listed in the previous section.

For more information about MIIS 2003 with SP1, including the MIIS 2003 Technical Reference, see the MIIS 2003 page on Microsoft.com at www.microsoft.com/miis and the Microsoft Identity Integration Server 2003 Frequently Asked Questions page.

MIIS 2003 with SP1 uses Microsoft SQL Server 2000, Enterprise Edition or Standard Edition as its identity store for the metaverse as well as for individual views of each connected directory, application, or data source. MIIS connects to each identity store through a component called a management agent. The following table lists the categories of connected identity stores for which management agents are available in MIIS 2003 with SP1.

Table 2.2. MIIS 2003 with SP1 Management Agent Categories

Connected identity store | Examples
Network operating systems and directory services | Microsoft Windows NT®; Active Directory (Windows 2000 Server and later); Active Directory Application Mode; Novell eDirectory 8.6.2, 8.7, and 8.7.3; Sun ONE Directory Server 5.0, 5.1, or 5.2 (formerly iPlanet Directory Server); IBM Directory Server 4.1, 5.1 or 5.2; Resource Access Control Facility (RACF); X.500 Systems
E-mail systems | Microsoft Exchange 5.5; Microsoft Exchange 2000 and later (GAL synchronization); Lotus Notes and Domino 4.6 and later
Application systems | PeopleSoft; SAP; ERP1; Telephone switches; XML- and DSML-based systems
Databases | IBM DB2 Universal Database 7 and 8.1 on Windows, 8.1 on Linux and 5.1.5 on OS/400; Microsoft SQL Server 7.0 and 2000; Oracle 8i and 9i
File-based agents (for generic connections) | DSML v2 (Directory Services Markup Language); LDIF (LDAP Data Interchange Format); CSV (comma-separated value) and other delimited formats; Fixed width; Attribute-value pairs

For an up-to-date list of supported systems and other enhancements in MIIS 2003 with SP1, see MIIS 2003 Product Overview.

Identity Integration Feature Pack 1a for Active Directory

The Identity Integration Feature Pack (IIFP) 1a for Windows Server Active Directory is a reduced feature set version of MIIS 2003 with SP1 with a limited number of management agents. The Identity Integration Feature Pack provides connections only to the following directories and e-mail applications:

Active Directory for Windows 2000 Server and later.

Active Directory Application Mode (ADAM).

GAL synchronization for Microsoft Exchange 2000 Server and Exchange Server 2003.

The IIFP is appropriate for environments that operate Microsoft directory products. For example, it is useful for synchronizing identity information between multiple forests and ADAM instances.

The software requirements for IIFP are similar to MIIS 2003 with SP1: Windows Server 2003, Enterprise Edition and Microsoft SQL Server 2000, Enterprise Edition, Standard Edition or Developer Edition (for testing purposes only).

Download the Identity Integration Feature Pack 1a for Windows Server Active Directory.

