Windows Server 2012 R2 Inside Out: Active Directory Architecture
- By William Stanek
- Active Directory physical architecture
- Active Directory logical architecture
Active Directory is an extensible directory service that enables you to manage network resources efficiently. A directory service does this by storing detailed information about each network resource, which makes it easier to provide basic lookup and authentication. Being able to store large amounts of information is a key objective of a directory service, but the information must also be organized so that it’s easily searched and retrieved.
Active Directory provides for authenticated search and retrieval of information by dividing the physical and logical structures of the directory into separate layers. Understanding the physical structure of Active Directory is important for understanding how a directory service works. Understanding the logical structure of Active Directory is important for implementing and managing a directory service.
Active Directory physical architecture
The physical layer of Active Directory controls the following features:
- How directory information is accessed
- How directory information is stored on the hard disk of a server
Active Directory physical architecture: A top-level view
From a physical or machine perspective, Active Directory is part of the security subsystem. (See Figure 10-1.) The security subsystem runs in user mode. User-mode applications do not have direct access to the operating system or hardware. This means that requests from user-mode applications have to pass through the executive services layer and must be validated before being executed.
Figure 10-1 Top-level overview of the Active Directory architecture.
Each resource in Active Directory is represented as an object. Anyone who tries to gain access to an object must be granted permission. Lists of permissions that describe who or what can access an object are referred to as access control lists (ACLs). Each object in the directory has an associated ACL.
You can restrict permissions across a broader scope by using Group Policy. The security infrastructure of Active Directory uses policy to enforce security models on several objects that are grouped logically. You can also set up trust relationships between groups of objects to allow for an even broader scope for security controls between trusted groups of objects that need to interact. From a top-level perspective, that’s how Active Directory works, but to really understand Active Directory, you need to delve into the security subsystem.
Active Directory within the Local Security Authority
Within the security subsystem, Active Directory is a subcomponent of the Local Security Authority (LSA). As shown in Figure 10-2, the LSA consists of many components that provide the security features of Windows Server and ensure that access control and authentication function as they should. Not only does the LSA manage local security policy but it also performs the following functions:
- Generates security identifiers (SIDs)
- Provides the interactive process for logon
Figure 10-2 Windows Server security subsystem using Active Directory.
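When you work through the security subsystem as it is used with Active Directory, you’ll find three key areas (authentication mechanisms, logon and access-control mechanisms, and the directory service component), which are made up of the following modules: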
- NTLM (Msv1_0.dll). Used for Windows NT LAN Manager (NTLM) authentication
- Kerberos (Kerberos.dll) and Key Distribution Center (Kdcsvc.dll). Used for Kerberos V5 authentication
- SSL (Schannel.dll). Used for Secure Sockets Layer (SSL) authentication
- Authentication provider (Secur32.dll). Used to manage authentication
- NET LOGON (Netlogon.dll). Used for interactive logon through NTLM. For NTLM authentication, NET LOGON passes logon credentials to the directory service module and returns the SIDs for objects to clients making requests.
- LSA Server (Lsasrv.dll). Used to enforce security policies for Kerberos and SSL. For Kerberos and SSL authentication, LSA Server passes logon credentials to the directory service module and returns the SIDs for objects to clients making requests.
- Security Accounts Manager (Samsrv.dll). Used to enforce security policies for NTLM.
- Directory service component: Directory service (Ntdsa.dll). Used to provide directory services for Windows Server. This is the actual module that allows you to perform authenticated searches and retrieval of information.
As you can see, users are authenticated before they can work with the directory service component. Authentication is handled by passing a user’s security credentials to a domain controller. After the user is authenticated on the network, the user can work with resources and perform actions according to the permissions and rights the user has been granted in the directory. At least, this is how the Windows Server security subsystem works with Active Directory.
When you are on a network that doesn’t use Active Directory, or when you log on locally to a machine other than a domain controller, the security subsystem works as shown in Figure 10-3. In this case, the directory service is not used. Instead, authentication and access control are handled through the Security Accounts Manager (SAM), and information about resources is stored in the SAM, which itself is stored in the registry.
Figure 10-3 Windows Server security subsystem without Active Directory.
Directory service architecture
As you’ve seen, incoming requests are passed through the security subsystem to the directory service component. The directory service component is designed to accept requests from many kinds of clients. As shown in Figure 10-4, these clients use specific protocols to interact with Active Directory.
Figure 10-4 The directory service architecture.
Protocols and client interfaces
The primary protocol for Active Directory access is Lightweight Directory Access Protocol (LDAP). LDAP is an industry standard protocol for directory access that runs over Transmission Control Protocol/Internet Protocol (TCP/IP). Active Directory supports LDAP versions 2 and 3. Clients can use LDAP to query and manage directory information—depending on the level of permissions they have been granted—by establishing a TCP connection to a domain controller. The default TCP port used by LDAP clients is 389 for standard communications and 636 for SSL.
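As a small illustration of the default ports just mentioned, the following Python sketch resolves an LDAP URL to a host and TCP port. The helper name `ldap_endpoint` and the host `dc01.example.com` are hypothetical; only the port numbers 389 and 636 come from the text.

```python
from urllib.parse import urlsplit

# Default TCP ports named in the text above.
LDAP_PORT = 389    # standard LDAP communications
LDAPS_PORT = 636   # LDAP over SSL


def ldap_endpoint(url: str) -> tuple[str, int]:
    """Return (host, port) for an ldap:// or ldaps:// URL.

    An explicit port in the URL wins; otherwise the default for the
    scheme is used.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("ldap", "ldaps"):
        raise ValueError(f"not an LDAP URL: {url!r}")
    if parts.hostname is None:
        raise ValueError(f"no host in URL: {url!r}")
    default = LDAPS_PORT if parts.scheme == "ldaps" else LDAP_PORT
    port = parts.port if parts.port is not None else default
    return parts.hostname, port


print(ldap_endpoint("ldap://dc01.example.com"))
print(ldap_endpoint("ldaps://dc01.example.com"))
```

The sketch only computes where a client would connect; it does not open a connection or bind to a domain controller.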
Active Directory supports intersite and intrasite replication through the REPL interface, which uses either remote procedure calls (RPCs) or Simple Mail Transfer Protocol over Internet Protocol (SMTP over IP), depending on how replication is configured. Each domain controller is responsible for replicating changes to the directory to other domain controllers, using a multimaster approach. The multimaster approach used in Active Directory allows updates to be made to the directory by any domain controller and then replicated to other domain controllers.
For older messaging clients, Active Directory supports the Messaging Application Programming Interface (MAPI). MAPI allows messaging clients to access Active Directory (which Microsoft Exchange uses for storing information), primarily for address book lookups. Messaging clients use RPCs to establish a connection with the directory service. The RPC Endpoint Mapper uses UDP port 135 and TCP port 135. Current messaging clients use LDAP instead of RPC.
For legacy clients, Active Directory supports the SAM interface, which also uses RPCs. This allows legacy clients to access the Active Directory data store the same way they would access the SAM database. The SAM interface is also used during certain replication activities.
Directory System Agent and database layer
Clients and other servers use the LDAP, REPL, MAPI, and SAM interfaces to communicate with the directory service component (Ntdsa.dll) on a domain controller. From an abstract perspective, the directory service component consists of the following:
- Directory System Agent (DSA), which provides the interfaces through which clients and other servers connect
- Database layer, which provides an application programming interface (API) for working with the Active Directory data store
From a physical perspective, the DSA is really the directory service component and the database layer resides within it. The reason for separating the two is that the database layer performs a vital abstraction. Without this abstraction, the physical database on the disk would not be protected from the applications the DSA interacts with. Furthermore, the object-based hierarchy used by Active Directory would not be possible. Why? Because the data store is in a single data file using a flat (record-based) structure, whereas the database layer is used to represent the flat file records as objects within a hierarchy of containers. Like a folder that can contain files and other folders, a container is simply a type of object that can contain other objects and other containers.
Each object in the data store has a name relative to the container in which it’s stored. This name is aptly called the object’s relative distinguished name (RDN). An object’s full name, also referred to as an object’s distinguished name (DN), describes the series of containers, from the highest to the lowest, of which the object is a part.
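To make the relationship between flat records, RDNs, and DNs concrete, here is a minimal Python sketch. It assumes a simplified record layout (an ID, a parent ID, and an RDN) that is purely illustrative and is not the actual layout of the data store; the object names are hypothetical. Note that the LDAP string form of a DN lists the object's own RDN first and the top-level container last.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Record:
    """A flat record: it knows only its own RDN and its parent's ID."""
    record_id: int
    parent_id: Optional[int]
    rdn: str


# Hypothetical flat records, as they might sit in a record-based store.
RECORDS: Dict[int, Record] = {
    1: Record(1, None, "DC=com"),
    2: Record(2, 1, "DC=example"),
    3: Record(3, 2, "OU=Sales"),
    4: Record(4, 3, "CN=Ana Lopez"),
}


def distinguished_name(records: Dict[int, Record], record_id: int) -> str:
    """Walk parent pointers upward and join the RDNs into a DN."""
    parts = []
    current: Optional[int] = record_id
    while current is not None:
        record = records[current]
        parts.append(record.rdn)
        current = record.parent_id
    return ",".join(parts)


print(distinguished_name(RECORDS, 4))
```

Moving record 4 under a different parent would change its DN without touching its RDN, which is why neither name can serve as a permanent identifier.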
To make sure every object stored in Active Directory is truly unique, each object also has a globally unique identifier (GUID), which the DSA assigns when the object is created. Unlike an object’s RDN or DN, which can be changed by renaming an object or moving it to another container, the GUID never changes.
The DSA is responsible for ensuring that the type of information associated with an object adheres to a specific set of rules, referred to as the schema. The schema is stored in the directory, contains the definitions of all object classes, and describes their attributes. These rules determine the kind of data that can be stored in the database, the type of information that can be associated with a particular object, the naming conventions for objects, and so on.
The DSA is also responsible for enforcing security limitations. It does this by reading the SIDs on a client’s access token and comparing them to the SIDs for an object. If a client has appropriate access permissions, it is granted access to an object. If a client doesn’t have appropriate access permissions, it’s denied access.
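The comparison of token SIDs against an object's permissions can be sketched as follows. This is a deliberately simplified model, not the DSA's implementation: the `AccessEntry` type, the right names, and the SID values are all hypothetical, and the sketch assumes entries are evaluated in order, with a matching deny entry stopping evaluation for any right not yet granted.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class AccessEntry:
    """One simplified ACL entry: a SID, allow/deny, and a set of rights."""
    sid: str
    allow: bool
    rights: FrozenSet[str]


def is_access_allowed(token_sids: Set[str],
                      acl: Iterable[AccessEntry],
                      requested: Set[str]) -> bool:
    """Evaluate entries in order until every requested right is granted
    or one still-needed right is denied."""
    remaining = set(requested)
    for entry in acl:
        if entry.sid not in token_sids:
            continue  # entry does not apply to this client
        if entry.allow:
            remaining -= entry.rights
            if not remaining:
                return True
        elif entry.rights & remaining:
            return False
    return not remaining


# Hypothetical SIDs for a user and a group the user belongs to.
USER = "S-1-5-21-1111111111-2222222222-3333333333-1104"
GROUP = "S-1-5-21-1111111111-2222222222-3333333333-1201"

ACL = [
    AccessEntry(GROUP, allow=False, rights=frozenset({"write"})),
    AccessEntry(USER, allow=True, rights=frozenset({"read", "write"})),
]

print(is_access_allowed({USER, GROUP}, ACL, {"read"}))
print(is_access_allowed({USER, GROUP}, ACL, {"write"}))
```

A client whose token carries none of the SIDs listed in the ACL is granted nothing, which matches the deny-by-default behavior described above.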
Finally, the DSA is used to initiate replication. Replication is the essential functionality that ensures that the information stored on domain controllers is accurate and consistent with changes that have been made. Without proper replication, the data on servers would become stale and outdated.
Extensible Storage Engine
Active Directory uses the Extensible Storage Engine (ESE) to retrieve information from, and write information to, the data store. The ESE uses indexed and sequential storage with transactional processing, as follows:
- Indexed storage. Indexing the data store allows the ESE to access data quickly without having to search the entire database. In this way, the ESE can rapidly retrieve, write, and update data.
- Sequential storage. Sequentially storing data means that the ESE writes data as a stream of bits and bytes. This allows data to be read from and written to specific locations.
- Transactional processing. Transactional processing ensures that changes to the database are applied as discrete operations that can be rolled back if necessary.
Any data that is modified in a transaction is copied to a temporary database file. This gives two views of the data that’s being changed: one view for the process changing the data and one view of the original data that’s available to other processes until the transaction is finalized. A transaction remains open as long as changes are being processed. If an error occurs during processing, the transaction can be rolled back to return the object being modified to its original state. If Active Directory finishes processing changes without errors occurring, the transaction can be committed.
As with most databases that use transactional processing, Active Directory maintains a transaction log. A record of the transaction is written first to an in-memory copy of an object, then to the transaction log, and finally to the database. The in-memory copy of an object is stored in the version store. The version store is an area of physical memory (RAM) used for processing changes. Typically, the version store is 25 percent of the physical RAM.
The transaction log serves as a record of all changes that have yet to be committed to the database file. The transaction is written first to the transaction log to ensure that even if the database shuts down immediately afterward, the change is not lost and can take effect. To ensure this, Active Directory uses a checkpoint file to track the point up to which transactions in the log file have been committed to the database file. After a transaction is committed to the database file, it can be cleared out of the transaction log.
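The interplay of transaction log, checkpoint, and database file can be illustrated with a toy write-ahead log in Python. This is a sketch under strong simplifying assumptions: the `TinyStore` class is hypothetical, plain Python containers stand in for the files, and the in-memory version store is left out, so changes are applied straight from the log.

```python
class TinyStore:
    """Toy model: changes are logged first, applied to the database later."""

    def __init__(self):
        self.database = {}   # stands in for the database file
        self.log = []        # stands in for the transaction log
        self.checkpoint = 0  # how many log records are already in the database

    def commit(self, key, value):
        # The change is recorded in the log before the database is touched.
        self.log.append((key, value))

    def flush(self, limit=None):
        """Apply pending log records to the database, advancing the checkpoint."""
        pending = self.log[self.checkpoint:]
        if limit is not None:
            pending = pending[:limit]
        for key, value in pending:
            self.database[key] = value
            self.checkpoint += 1

    def recover(self):
        """After an unexpected shutdown, replay everything past the checkpoint."""
        self.flush()


store = TinyStore()
store.commit("cn", "Ana Lopez")
store.commit("title", "Analyst")
store.flush(limit=1)   # simulate a shutdown after only one record was applied
store.recover()        # the second change is not lost
print(store.database)
```

Because the checkpoint records how far the log has been applied, recovery replays only the records that follow it rather than the whole log.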
The actual update of the database is written from the in-memory copy of the object in the version store and not from the transaction log. This reduces the number of disk I/O operations and helps ensure that updates can keep pace with changes. When many updates are made, however, the version store can be overwhelmed, which happens when it reaches 90 percent of its maximum size. At that point, the ESE temporarily stops processing the cleanup operations that return space after an object is modified or deleted from the database.
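As a quick worked example of the two percentages given for the version store, the Python sketch below computes the typical version store size and the point at which cleanup is paused. The function name and the 16 GB figure are illustrative assumptions; the 25 percent and 90 percent values come from the text.

```python
def version_store_limits(ram_bytes: int,
                         share_percent: int = 25,
                         pressure_percent: int = 90) -> tuple[int, int]:
    """Return (version store size, size at which cleanup is paused), in bytes."""
    size = ram_bytes * share_percent // 100
    threshold = size * pressure_percent // 100
    return size, threshold


GIB = 1024 ** 3
size, threshold = version_store_limits(16 * GIB)
print(f"version store: {size / GIB:.1f} GiB, pressure point: {threshold / GIB:.1f} GiB")
```

On a hypothetical server with 16 GB of RAM, this gives a version store of about 4 GB and a pressure point of about 3.6 GB.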
Although in earlier releases of Windows Server index creation could affect domain controller performance, Windows Server 2012 and Windows Server 2012 R2 allow you to defer index creation to a time when it’s more convenient. By deferring index creation to a designated point in time, rather than creating indexes as needed, you can ensure that domain controllers can perform related tasks during off-peak hours, thereby reducing the impact of index creation. Any attribute that is in a deferred index state will be logged in the event log every 24 hours. Look for event IDs 2944 and 2945. When indexes are created, event ID 1137 is logged.
In large Active Directory environments, deferring index creation is useful to prevent domain controllers from becoming unavailable due to building indexes after schema updates. Before you can use deferred index creation, you must enable the feature in the forest root domain. You do this using the dSHeuristics attribute of the CN=Directory Service object in the Configuration naming context. The attribute is a string in which each character position acts as a flag, so set the eighteenth character to 1. Because the tenth character typically also is set to 1 (if the attribute is set to a value), the attribute normally is set to the following: 000000000100000001. You can modify the dSHeuristics attribute using ADSI Edit or Ldp.exe.
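When the attribute already holds a value, the new string has to preserve the existing characters. The following Python sketch shows one way to compute it; the helper name `set_heuristic_char` is hypothetical, and the example existing value `001` is made up for illustration. It follows the rule stated above that the tenth character is 1 whenever the string is at least ten characters long.

```python
def set_heuristic_char(current: str, position: int, value: str) -> str:
    """Return a dSHeuristics string with the 1-based `position` set to `value`.

    Missing positions are padded with "0". If the result is ten or more
    characters long, the tenth character is set to "1".
    """
    if len(value) != 1:
        raise ValueError("value must be a single character")
    if position < 1:
        raise ValueError("position is 1-based")
    if position == 10 and value != "1":
        raise ValueError("the tenth character must be 1")
    chars = list(current.ljust(position, "0"))
    chars[position - 1] = value
    if len(chars) >= 10:
        chars[9] = "1"
    return "".join(chars)


print(set_heuristic_char("", 18, "1"))     # attribute not set yet
print(set_heuristic_char("001", 18, "1"))  # keeps an existing setting
```

Starting from an unset attribute, the sketch produces the same 18-character string given in the text.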
ADSI Edit is a snap-in you can add to any Microsoft Management Console (MMC). Open a new MMC by entering MMC at a prompt and then use the Add/Remove Snap-in option on the File menu to add the ADSI Edit snap-in to the MMC. You can then use ADSI Edit to modify the DSHeuristics attribute by completing the following steps:
- Press and hold or right-click the root node and then select Connect To. In the Connection Settings dialog box, choose the Select A Well Known Naming Context option. On the related selection list, select Configuration (because you want to connect to the Configuration naming context for the domain) and then tap or click OK.
- In ADSI Edit, work your way down to the CN=Directory Service container by expanding the Configuration naming context, the CN=Configuration container, the CN=Services container, and the CN=Windows NT container.
- Next, press and hold or right-click CN=Directory Service and then select Properties. In the Properties dialog box, select the dsHeuristics property and then tap or click Edit.
- In the String Attribute Editor dialog box, type the desired value, such as 000000000100000001, and then tap or click OK twice.
Ldp is a graphical utility. Open Ldp by typing ldp in the Apps Search box or at a prompt. You can then use Ldp to modify the DSHeuristics attribute by completing the following steps:
- Choose Connect on the Connection menu and then connect to a domain controller in the forest root domain. After you connect to a domain controller, choose Bind on the Connection menu to bind to the forest root domain using an account with enterprise administrator privileges.
- Next, choose Tree on the View menu to open the Tree View dialog box. In the Tree View dialog box, choose CN=Configuration container as the base distinguished name to work with.
- In the CN=Configuration container, expand the CN=Services container, expand the CN=Windows NT container, and then select the CN=Directory Service container. Next, press and hold or right-click CN=Directory Service and then select Modify.
- In the Modify dialog box, type the attribute name as dsHeuristics and the value as 000000000100000001.
- If the attribute already exists, set the Operation as Replace. Otherwise, set the Operation as Add.
- Tap or click Enter to create an LDAP transaction for this update, and then tap or click Run to apply the change.
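The modification that these steps build can also be written down as an LDIF change record. The Python sketch below only generates that text; it does not contact a domain controller. The function name and the forest root DN `DC=example,DC=com` are hypothetical, while the container path and the Add versus Replace choice follow the steps above.

```python
def dsheuristics_ldif(forest_root_dn: str, value: str, attribute_exists: bool) -> str:
    """Build an LDIF modify record for the dSHeuristics attribute."""
    # Replace an existing value; otherwise add the attribute.
    operation = "replace" if attribute_exists else "add"
    dn = ("CN=Directory Service,CN=Windows NT,CN=Services,"
          "CN=Configuration," + forest_root_dn)
    lines = [
        f"dn: {dn}",
        "changetype: modify",
        f"{operation}: dSHeuristics",
        f"dSHeuristics: {value}",
        "-",
    ]
    return "\n".join(lines) + "\n"


print(dsheuristics_ldif("DC=example,DC=com", "000000000100000001",
                        attribute_exists=False))
```

Keeping the change as text makes it easy to review before it is applied with whatever LDAP tool you prefer.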
Once the change is replicated to all domain controllers in the forest, they will defer index creation automatically. You must then trigger index creation manually by either restarting domain controllers, which rebuilds the schema cache and deferred indexes, or by triggering a schema update for the RootDSE. In ADSI Edit, you can initiate an update by connecting to the RootDSE. To do this, press and hold or right-click the root node and then select Connect To. In the Connection Settings dialog box, choose the Select A Well Known Naming Context option. On the related selection list, select RootDSE and then tap or click OK. In ADSI Edit, press and hold or right-click the RootDSE node and then select Update Schema Now.
To allow for object recovery and for the replication of object deletions, an object that is deleted from the database is logically removed rather than physically deleted. The way deletion works depends on whether Active Directory Recycle Bin is enabled or disabled.
Deletion without Recycle Bin
When Active Directory Recycle Bin is disabled, as with standard deployments prior to Windows Server 2008 R2, most of the object’s attributes are removed and the object’s Deleted attribute is set to TRUE to indicate that it has been deleted. The object is then moved to a hidden Deleted Objects container where its deletion can be replicated to other domain controllers. (See Figure 10-5.) In this state, the object is said to be tombstoned. To allow the tombstoned state to be replicated to all domain controllers, and thus removed from all copies of the database, an attribute called tombstoneLifetime is also set on the object. The tombstoneLifetime attribute specifies how long the tombstoned object should remain in the Deleted Objects container. The default lifetime is 180 days.
Figure 10-5 Active Directory object life cycle without Recycle Bin.
The ESE uses a garbage-collection process to clear out tombstoned objects after the tombstone lifetime has expired, and it performs automatic online defragmentation of the database after garbage collection. The interval at which garbage collection occurs is a factor of the value set for the garbageCollPeriod attribute and the tombstone lifetime. By default, garbage collection occurs every 12 hours. When there are more than 5,000 tombstoned objects to be garbage-collected, the ESE removes the first 5,000 tombstoned objects and then uses the CPU availability to determine if garbage collection can continue. If no other process is waiting for the CPU, garbage collection continues for up to the next 5,000 tombstoned objects whose tombstone lifetime has expired, and the CPU availability is again checked to determine if garbage collection can continue. This process continues until all the tombstoned objects whose tombstone lifetime has expired are deleted or another process needs access to the CPU.
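The batching behavior described above can be sketched in a few lines of Python. This is a model of the control flow only: the function name is hypothetical, and the `cpu_is_free` callback is an assumed stand-in for however the ESE actually checks CPU availability.

```python
from typing import Callable


def collect_garbage(expired_count: int,
                    cpu_is_free: Callable[[], bool],
                    batch_size: int = 5000) -> int:
    """Return how many expired tombstones one garbage-collection run removes."""
    # The first batch is always processed.
    removed = min(batch_size, expired_count)
    # Further batches run only while the CPU is not needed elsewhere.
    while removed < expired_count and cpu_is_free():
        removed += min(batch_size, expired_count - removed)
    return removed


print(collect_garbage(12000, lambda: True))   # idle CPU: everything is removed
print(collect_garbage(12000, lambda: False))  # busy CPU: only the first batch
```

Tombstones left over from a run that stopped early remain in the Deleted Objects container until a later garbage-collection pass.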
Deletion with Recycle Bin
When Active Directory Recycle Bin is enabled as an option with Windows Server 2008 R2 and later, objects aren’t tombstoned when they are initially deleted and their attributes aren’t removed. Instead, the deletion process occurs in stages.
In the first stage of the deletion, the object is said to be logically deleted. Here, the object’s Deleted attribute is set to TRUE to indicate that it has been deleted. The object is then moved, with its attributes and name preserved, to a hidden Deleted Objects container where its deletion can be replicated to other domain controllers. (See Figure 10-6.) To allow the logically deleted state to be replicated to all domain controllers, and thus removed from all copies of the database, an attribute called ms-DeletedObjectLifetime is also set on the object. The ms-DeletedObjectLifetime attribute specifies how long the logically deleted object should remain in the Deleted Objects container. The default deleted object lifetime is 180 days.
Figure 10-6 Active Directory object life cycle with Recycle Bin.
When the deleted object lifetime expires, Active Directory removes most of the object’s attributes, changes the distinguished name so that the object name can’t be recognized, and sets the object’s tombstoneLifetime attribute. This effectively tombstones the object (and the process is the same as the legacy tombstone process).
After this step, the object is said to be in the recycled state. The recycled object remains in the Deleted Objects container until the recycled object lifetime, which is governed by the tombstone lifetime, expires. The default tombstone lifetime is 180 days.
As with deletion without the Recycle Bin, the ESE uses a garbage-collection process to clear out tombstoned objects after the tombstone lifetime has expired. This garbage-collection process is the same as discussed previously.
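The staged life cycle with Recycle Bin enabled can be summarized as a small state function. The Python sketch below uses the default lifetimes of 180 days each from the text; the function name, the state labels, and the exact handling of the boundary days are assumptions made for illustration.

```python
from datetime import date, timedelta

DELETED_OBJECT_LIFETIME = timedelta(days=180)  # default from the text
TOMBSTONE_LIFETIME = timedelta(days=180)       # default from the text


def state_on(deleted_on: date, today: date) -> str:
    """Return the life-cycle state of an object deleted on `deleted_on`."""
    if today < deleted_on:
        return "live"
    age = today - deleted_on
    if age < DELETED_OBJECT_LIFETIME:
        return "deleted"    # logically deleted, attributes preserved
    if age < DELETED_OBJECT_LIFETIME + TOMBSTONE_LIFETIME:
        return "recycled"   # most attributes removed
    return "eligible for garbage collection"


deleted = date(2024, 1, 1)
for days in (0, 179, 180, 359, 360):
    print(days, state_on(deleted, deleted + timedelta(days=days)))
```

With both defaults in place, an object is physically removed no sooner than 360 days after it was deleted.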
Data store architecture
After you examine the operating system components that support Active Directory, the next step is to see how directory data is stored on a domain controller’s hard disks. As Figure 10-7 shows, the data store has a primary data file and several other types of related files, including working files and transaction logs.
Figure 10-7 The Active Directory data store.
These files are used as follows:
- Primary data file (Ntds.dit). Physical database file that holds the contents of the Active Directory data store
- Checkpoint file (Edb.chk). Checkpoint file that tracks the point up to which the transactions in the log file have been committed to the database file
- Temporary data (Tmp.edb). Temporary workspace for processing transactions
- Primary log file (Edb.log). Primary log file that contains a record of all changes that have yet to be committed to the database file
- Secondary log files (Edb00001.log, Edb00002.log, ...). Additional log files that are used as needed
- Reserve log files (EdbRes00001.jrs, EdbRes00002.jrs, ...). Files that are used to reserve space for additional log files if the primary log file becomes full
The primary data file contains three indexed tables:
- Active Directory data table. The data table contains a record for each object in the data store, which can include object containers, the objects themselves, and any other type of data that is stored in Active Directory.
- Active Directory link table. The link table is used to represent linked attributes. A linked attribute is an attribute that refers to other objects in Active Directory. For example, if an object contains other objects (that is, it is a container), attribute links are used to point to the objects in the container.
- Active Directory security descriptor table. The security descriptor table contains the inherited security descriptors for each object in the data store. Windows Server uses this table so that inherited security descriptors no longer have to be duplicated on each object. Instead, inherited security descriptors are stored in this table and linked to the appropriate objects. This makes Active Directory authentication and control mechanisms very efficient.
Think of the data table as having rows and columns; the intersection of a row and a column is a field. The table’s rows correspond to individual instances of an object. The table’s columns correspond to attributes defined in the schema. The table’s fields are populated only if an attribute contains a value. Fields can be a fixed or a variable length. If you create an object and define only 10 attributes, only these 10 attributes will contain values. Although some of those values might be fixed length, others might be variable length.
Records in the data table are stored in data pages that have a fixed size of 8 kilobytes (KBs, or 8,192 bytes). Each data page has a page header, data rows, and free space that can contain row offsets. The page header uses the first 96 bytes of each page, leaving 8,096 bytes for data and row offsets.
Row offsets indicate the logical order of rows on a page, which means that offset 0 refers to the first row in the index, offset 1 refers to the second row, and so on. If a row contains long, variable-length data, the data might not be stored with the rest of the data for that row. Instead, Active Directory can store an 8-byte pointer to the actual data, which is stored in a collection of 8 KB pages that aren’t necessarily written contiguously. In this way, an object and all its attribute values can be much larger than 8 KBs.
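The page arithmetic above is easy to check in code. In the Python sketch below, the page and header sizes come from the text; the `pages_needed` helper is hypothetical and assumes, for the sake of a rough estimate, that every page holding long data offers the same 8,096 usable bytes.

```python
PAGE_SIZE = 8 * 1024    # 8 KB data page = 8,192 bytes
PAGE_HEADER = 96        # bytes used by the page header
USABLE_BYTES = PAGE_SIZE - PAGE_HEADER  # left for data rows and row offsets


def pages_needed(value_bytes: int) -> int:
    """Estimate how many pages a long value of `value_bytes` would span."""
    if value_bytes <= 0:
        return 0
    return -(-value_bytes // USABLE_BYTES)  # integer ceiling division


print(USABLE_BYTES)
print(pages_needed(100_000))
```

Under that assumption, an attribute value of 100,000 bytes would span 13 pages, which shows how a single object can grow well beyond the size of one page.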
The primary log file has a fixed size of 10 megabytes (MBs). When this log fills up, Active Directory creates additional (secondary) log files as necessary. The secondary log files are also limited to a fixed size of 10 MBs. Active Directory uses the reserve log files to reserve space on disk for log files that might need to be created. Because several reserve files are already created, this speeds up the transactional logging process when additional logs are needed.
By default, the primary data file, the working files, and the transaction logs are all stored in the same location. On a domain controller’s system volume, you’ll find these files in the %SystemRoot%\NTDS folder. Although these are the only files used for the data store, Active Directory uses other files. For example, policy files and other files, such as startup and shutdown scripts used by the DSA, are stored in the %SystemRoot%\Sysvol folder.