Beneath the hype surrounding Active Directory is the fact that any directory service is based on database technology. Understand this concept, and you're closer to understanding Windows 2000.
What Active Directory Brings to NT
Beneath the hype surrounding Active Directory is the fact that any directory service is based on database technology. Understand this concept, and you’re closer to understanding NT 5.0.
- By Michael Chacon
- December 01, 1998
Even the most ardent Windows NT fan will admit that the
product has always been weak in the area of directory
services. Microsoft has countered this perception by calling
its current registry-based domain administrative function
a directory service (DS). With Novell and its Novell Directory
Service (NDS, renamed from NetWare Directory Service) nipping
at Microsoft's heels, especially lately, the DS issue has remained
in the forefront.
Novell has had a directory service since NetWare 4.x
shipped in 1993, but the company has gradually squandered
its first-to-market opportunity. Novell's DS has
been a useful administrative tool in terms of organizing
users, applications, and other resources. However, developers
haven't taken the next step and pushed the DS down
into the network infrastructure. Thus, Novell hasn't
coaxed the developer community to rally around its new
flag.
The fact that Cisco chose to work first with Microsoft's
future DS, under the Directory Enabled Networks
(DEN) initiative, rather than with NDS, illustrates Novell's
problem. The company's next chance to build momentum
is with its new NetWare 5.0, which shipped this summer.
With the eventual release of Windows NT 5.0, however,
the directory services race-to-market could well be over.
Although directory services will become increasingly important,
who was first to market will be a moot point.
What is a Directory Service?
Most enterprises already have many different directories
in place, including those for network operating systems,
e-mail systems, and groupware products. Directory services
attempt to offer an organized approach to managing all
of the bits of data maintained in these existing directories.
In one whitepaper on the topic, Microsoft likens directory
services to the white pages of a phone book. You use input,
such as a person's name, to get output: that
person's address and phone number. Likewise, DS offers
the flexibility of the yellow pages. If the user enters
general input ("where are the printers?"), a
browsable list of available printer resources is shown.
Whether you call it Active Directory, as Microsoft does,
or NDS, as Novell does, a directory service is a distributed
database, pure and simple. The supposedly high-level issues
surrounding directory services boil down to what and how
information is stored, how you can get to that information,
and what you can do with it. The advantages are three-fold:
1) A directory service offers a single entry point to
the network for users; no more remembering multiple passwords;
2) users will be able to find and use network resources
across the enterprise; 3) this design approach simplifies
enterprise network administration. (Obviously, these advantages
will be enjoyed in larger NT network environments; if
you maintain a small network, the single domain will continue
to be sufficient for your needs.)
Acronyms
ACL: Access Control List
BDC: Backup Domain Controller
CN: Common Name
DC: Domain Controller
DN: Distinguished Name
DNS: Domain Name System
GUID: Globally Unique Identifier
NDS: Novell Directory Service
PDC: Primary Domain Controller
PVN: Property Version Number
RDN: Relative Distinguished Name
RFC: Request for Comments
RPC: Remote Procedure Call
SMB: Server Message Block (protocol)
SRV RR: Service Resource Record
UDV: Up-to-date Vector
USN: Update Sequence Number
AD provides the ability to organize users, applications,
and other resources in a hierarchical manner in the same
fashion as NetWare. The hierarchy allows the administrator
to scale the logical views of the network organization
and application of permissions to enterprise-sized designs.
It's been sorely needed for some time now, and Active
Directory will remove the negative DS comparisons with
NetWare. While the high-level functionality of Active
Directory (AD) is interesting and consequential, it's
important first to understand the fundamental architecture.
You'll then have a firm foundation when you're
ready to implement Active Directory with NT 5.0.
A Brief Description
AD extends the current NT 4.0 domain concept by allowing
us to join resources together into a domain tree.
For administrative purposes domains can be subdivided
into Organizational Units (OUs). You can use OUs to get
greater granularity for management of objects than domains.
When you need to build a more complex organizational structure, or
when a very large number of objects (beyond a million) must
be stored, you can join domains together to form a
tree. A domain tree can contain many millions of objects.
AD objects are divided into two groups: leaf objects
and container objects. Whereas a container object can
contain other objects, a leaf object can't. Standard
Container Objects consist of Namespaces (more on this
shortly), Country, Locality, Organization, and the like.
Standard Leaf Objects consist of User, Group, Alias, etc.
AD will also offer extensibility by allowing users to
"publish" objects. For instance, a financial
services vendor could publish a "bank" object
specific to its business, thereby adding it to its AD.
Like Novell's NDS, AD is based on the X.500 architecture.
If you've worked with Exchange, you'll recognize
many common features since AD builds on Exchange's
directory.
A Design Note
Microsoft recommends as
few domains as possible in the AD and
is currently suggesting that each domain
can hold a maximum of 1,000,000 objects.
You can consider the use of OUs (Organizational
Units) as a replacement for existing resource
domains. (Microsoft recommends not exceeding
10 levels, however.) Since the domain
is the physical partition of the AD database,
you could structure domains either by business
function (HR, manufacturing, or accounting)
or by location. For enterprises that span
many locations, you need to think of the
WAN implications of having to replicate
changes on every object in the domain
to each DC. (Remember, a change
could be as minor as your last network
logon date and time.) Although Microsoft
doesn't recommend anything specific
here, it's worth noting that Novell
tells NDS designers to structure by location
in order to reduce directory traffic on
the WAN.
Building on Exchange's Database Services
AD is built on ESE97, the Extensible Storage Engine database
format that Exchange 5.5 introduced. It promises three
features that are key in any database: speed, reliability,
and availability.
Microsoft's internal development testing reports
promising results in the area of speed, including lookups
within seconds in a database of more than a million objects.
Furthermore, the response rate for some common searches
has been accelerated, thanks to the addition of a new,
separate index file called the Global Catalog. This catalog
is replicated to all DS servers in the enterprise. While
the Global Catalog is created automatically with Microsoft-defined
properties, it can be extended with administrator-defined
property index fields so that it can also be used to locate
custom directory objects.
The Global Catalog will prove very useful, but it can
cause problems if you choose either too many index fields
or inappropriate ones. If a query isn't satisfied
through the Global Catalog, the entire Active Directory
partition must be searched to fulfill the request, which
dampens performance. As a good example, let's take
a relatively fixed unique property like an employee number.
Such a property usually makes a useful catalog entry if
that's how users will be likely to search the directory
service.
Properties that aren't fixed and unique, however,
should be left out of the Global Catalog. Although the
Global Catalog can include all objects in a forest, if
a search uses a property that's not in the Global Catalog,
only the user's domain tree is searched.
Regardless, be prepared to consume significant disk space
with a large DS and its associated Global Catalog.
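The trade-off between indexed and non-indexed properties can be sketched with a toy model. This is only an illustration of the idea, not how Active Directory stores or searches data; the sample objects, the CATALOG_ATTRIBUTES set, and the function names are all assumptions made up for this example.

```python
# Toy model: a directory partition plus a small "global catalog" index.
# All names here (PARTITION, CATALOG_ATTRIBUTES, find) are hypothetical.

PARTITION = [
    {"cn": "Ann Lee", "employee_number": "1001", "office": "Irvine"},
    {"cn": "Bob Ruiz", "employee_number": "1002", "office": "New York"},
    {"cn": "Cy Tran", "employee_number": "1003", "office": "Irvine"},
]

# Only fixed, unique properties are worth indexing.
CATALOG_ATTRIBUTES = {"employee_number"}


def build_catalog(objects, attributes):
    """Map (attribute, value) -> list of matching objects."""
    catalog = {}
    for obj in objects:
        for attr in attributes:
            catalog.setdefault((attr, obj[attr]), []).append(obj)
    return catalog


def find(catalog, objects, attr, value):
    """Return (matches, how): use the catalog when the attribute is
    indexed, otherwise fall back to scanning the whole partition."""
    if attr in CATALOG_ATTRIBUTES:
        return catalog.get((attr, value), []), "catalog"
    return [o for o in objects if o.get(attr) == value], "full scan"


catalog = build_catalog(PARTITION, CATALOG_ATTRIBUTES)
print(find(catalog, PARTITION, "employee_number", "1002"))
print(find(catalog, PARTITION, "office", "Irvine"))
```

The employee-number query is answered from the index, while the office query has to walk every object, which is the cost the text warns about when a searched property is missing from the catalog.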
The second important feature of ESE97, reliability, is
supplied through basic database technology. That includes
transaction logs, which Microsoft suggests you store on
separate disks from the actual database. With directory
services in place, every entry to the database will be
considered a separate checkpointed transaction. In the
event of a hardware or other type of failure, the entry
can be reproduced from the log. Placing the logs on different
disk drives helps performance and provides a backup of
the database.
The third key database feature is availability. ESE97
achieves high availability through its distributed nature,
in which replication is controlled by intervals or specific
times. With the Exchange kinship fully implemented in
NT 5.0, replication among administrative sites can be
accomplished using standard RPC or through a messaging
transport such as SMTP or X.400 pipelines. This lets administrators
create backup replication paths that can be controlled
by assigning a higher cost to the backup connection.
This cost assignment allows certain paths to kick in only
when needed.
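The cost-based choice between a primary and a backup replication path can be sketched as follows. This is a simplified illustration under assumed data: the connection names, the cost values, and the pick_connection helper are hypothetical and do not come from any Microsoft tool.

```python
# Hypothetical connection records: (name, transport, cost, is_up).
CONNECTIONS = [
    ("primary-rpc", "RPC", 1, True),
    ("backup-smtp", "SMTP", 10, True),
]


def pick_connection(connections):
    """Return the available connection with the lowest cost, or None."""
    available = [c for c in connections if c[3]]
    return min(available, key=lambda c: c[2]) if available else None


print(pick_connection(CONNECTIONS)[0])

# Simulate the primary link failing: the higher-cost backup takes over.
degraded = [("primary-rpc", "RPC", 1, False), CONNECTIONS[1]]
print(pick_connection(degraded)[0])
```

Because the backup carries the higher cost, it is selected only when the cheaper path is unavailable.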
Figure 1. The tree structure
in Windows NT 5.0. Each domain in a domain tree has
a copy of the directory service holding all objects
for that domain, as well as metadata about the domain
tree such as the schema, list of all domains in the
tree, and location of global catalog servers.
The Physical Layout
Windows NT 4.0 and earlier versions use a single-master
Security Accounts Manager (SAM) database, also referred
to as a master-slave model. Regardless of what it's
called, a single-master database is one where only one
copy, the master, can be written to. All updates are then
sent to the backup database copies. The backup databases
can be read to obtain information, but can be updated
only by the master database. For example, if a user in
New York changes his password, and the PDC, or the master
database, is in Los Angeles, the update will happen in
Los Angeles over the WAN connection. By contrast, the
Active Directory database in NT 5.0 uses a multiple-master
model. In that model, any copy of the Active Directory
can be updated by applications, devices, and users.
Multiple-master database functionality raises several
issues that the single-master model avoids, particularly
involving database replication. Whereas NT 4.0 followed
a primary domain controller (PDC) and backup domain controller
(BDC) master-slave database model, all servers running
Windows NT 5.0 Active Directory are simply considered
domain controllers (DC).
With AD, all DC database replication within a site is
automatic; it's controlled by time intervals using
Remote Procedure Calls (RPCs). Because of this automatic
replication, you'll want to make sure that your DS
partitions follow your physical network. Essentially,
sites follow IP subnetwork designs, the assumption being
that each Active Directory site, as well as a subnet,
will exist completely on a high-speed network. A site
can span multiple subnetworks as long as you ensure that
the interface connections are high-speed. However, keep
in mind that a single subnet can support only one Active
Directory site. Just as with Exchange, if you allow a
site to span a slow link, you'll experience problems
with bandwidth consumption and interface congestion.
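The relationship between subnets and sites (a site may span several subnets, but a subnet belongs to only one site) maps naturally onto a lookup table. The sketch below uses Python's standard ipaddress module; the subnet ranges and site names are invented for illustration.

```python
import ipaddress

# Hypothetical site table: each subnet maps to exactly one site,
# while a site may own several subnets.
SUBNET_TO_SITE = {
    ipaddress.ip_network("10.1.0.0/16"): "NewYork",
    ipaddress.ip_network("10.2.0.0/16"): "LosAngeles",
    ipaddress.ip_network("10.3.0.0/16"): "LosAngeles",
}


def site_for(address):
    """Return the site whose subnet contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for subnet, site in SUBNET_TO_SITE.items():
        if ip in subnet:
            return site
    return None


print(site_for("10.3.4.5"))
```

Using the subnet as the dictionary key enforces the one-site-per-subnet rule by construction, since a key cannot appear twice.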
Consider the Namespace
Another aspect of Active Directory that you'll need
to consider before implementation is the namespace. A
namespace is any administratively contained territory
in which a logical name can be resolved. The primary use
of a namespace is to organize the descriptions of resources
in a manner that lets users locate those resources by
their various characteristics or properties. The directory
allows users to find the location of an object without
knowing its name; if they know the name, it lets them
find unknown but useful information about a known object.
One overriding point in any directory is that the design
of the namespace ultimately determines how useful the
directory is to users as it grows. The best sorting and
search algorithms in the world won't help if you
have a poor logical directory design.
Since every object on the network must be uniquely identified,
what is the DS going to internally call each of your objects?
In AD, this is accomplished by associating a Globally Unique
Identifier (GUID) with each object. This 128-bit number
is guaranteed to be unique and is never changed by the
AD, even if the object's logical name is changed.
The GUID is generated when a user or an application first
creates the Distinguished Name (DN) in the directory.
The DN is based upon the most successful
namespace yet implemented, the Domain Name System (DNS).
For anyone who doesn't know, DNS is the namespace
that uniquely identifies all the registered networks and
their respective objects on the Internet.
Building on this already established namespace, Active
Directory uses DNS as the location service for finding
the physical location of network objects. Therefore, to
implement Active Directory you must understand and have
the proper hooks into a DNS. The Active Directory holds
all of the properties necessary to identify an object.
The DNS will then use the DN to return the IP address
of the device where the object actually resides.
The integration between Active Directory and DNS is achieved
by each Active Directory server publishing its own address
in Service Resource Records (SRV RR) in a DNS (this is
described in RFC 2052). An SRV RR is a record that maps
a service name to an address where that service can be
physically found. SRV is similar to the MX record that's
currently used to find mail servers. Although Microsoft
is distributing a new DNS with NT 5.0, most DNS servers
are on Unix platforms and will remain so in many enterprise
information systems. This means that the Windows NT administrator
will have to work with the Unix DNS administrator to make
sure that the SRV RR entries are made, are accurate, and
support RFC 2052.
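To make the SRV idea concrete, here is a minimal sketch of how a client might choose a directory server from SRV data. The record fields (priority, weight, port, target) are the standard SRV fields; everything else, including the service name format, the host names, and the locate helper, is an assumption for illustration. A real client would also use the weight field to balance load among records of equal priority, which this sketch skips.

```python
from collections import namedtuple

# Fields follow the SRV record layout: priority, weight, port, target.
SrvRecord = namedtuple("SrvRecord", "priority weight port target")

# Hypothetical zone data keyed by service name; the name format and
# the hosts are made up for this example.
ZONE = {
    "ldap.tcp.sales.ny.example.com": [
        SrvRecord(10, 0, 389, "dc2.sales.ny.example.com"),
        SrvRecord(0, 0, 389, "dc1.sales.ny.example.com"),
    ],
}


def locate(service_name):
    """Return (target, port) pairs, most preferred first.

    A lower priority value means the server should be tried earlier.
    """
    records = sorted(ZONE.get(service_name, []), key=lambda r: r.priority)
    return [(r.target, r.port) for r in records]


print(locate("ldap.tcp.sales.ny.example.com"))
```

If the SRV entries are missing or wrong in the enterprise DNS, the lookup returns nothing useful, which is why coordination with the DNS administrator matters.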
More about Database Replication
Multiple-master database
replication also affects when to synchronize
changes, which information is most current,
and when to stop replicating data to avoid
loops. To determine what information needs
to be updated, Active Directory uses 64-bit
Update Sequence Numbers (USN). These are
created and associated with all properties.
Every time an object property is modified,
its USN is incremented and stored with
the property. In addition, each Active
Directory server maintains a table of
the latest USNs from all replication partners
within the site. This table is composed
of the highest USN for each property.
When the replication interval is reached,
each server requests only the changes
with a USN greater than what's listed
in its own table.
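The USN mechanism can be sketched as a high-water-mark pull. This is a toy model under stated assumptions: the Server class, its fields, and the per-partner table are hypothetical simplifications, not Active Directory's actual data structures.

```python
class Server:
    """Toy directory server; every name here is hypothetical."""

    def __init__(self, name):
        self.name = name
        self.usn = 0          # local update sequence counter
        self.props = {}       # property -> (value, USN assigned locally)
        self.high_water = {}  # partner name -> highest USN already pulled

    def write(self, prop, value):
        """Store a value and stamp it with the next local USN."""
        self.usn += 1
        self.props[prop] = (value, self.usn)

    def changes_since(self, usn):
        """Return properties whose USN is greater than the given one."""
        return {p: v for p, v in self.props.items() if v[1] > usn}

    def pull_from(self, partner):
        """Request only changes newer than the stored high-water mark."""
        known = self.high_water.get(partner.name, 0)
        changes = partner.changes_since(known)
        for prop, (value, usn) in changes.items():
            self.write(prop, value)
            known = max(known, usn)
        self.high_water[partner.name] = known
        return sorted(changes)


ny, la = Server("NY"), Server("LA")
la.write("phone", "555-0100")
la.write("title", "Manager")
print(ny.pull_from(la))  # first pull transfers both properties
la.write("phone", "555-0199")
print(ny.pull_from(la))  # second pull transfers only the changed one
```

Note that applying a pulled change bumps the receiver's own USN, so a naive partner would pull it straight back; avoiding that echo is a separate problem from the one shown here.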
Occasionally, changes may be made to
two different Active Directory servers
for the same property before all changes
are replicated. This causes a replication
collision. One of the changes must be
declared more accurate and used as the
source for all of the other replication
partners. To reconcile this potential
problem, Active Directory uses a sitewide
Property Version Number (PVN) value.
A PVN is incremented when an originating
write takes place. An originating write
is one that occurs directly at a particular
Active Directory server. That's
different from a write that's applied
during replication. When two or more
property values with the same PVN have
been changed in different locations,
the Active Directory server receiving
the change will check the timestamps
on each change and use the most recent
one for the update. This means that
you should maintain a central clock
for your network (which Windows NT provides)
and keep it accurate. Be aware that,
while timestamps certainly can be used
to make a decision, all automatic decisions
are, by nature, still arbitrary.
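The collision rule can be sketched in a few lines. The text only describes the equal-PVN case, so the rule that a higher PVN wins outright is an assumption added here to make the function complete; the Version tuple and its field names are also hypothetical.

```python
from collections import namedtuple

# One replicated property value; field names are hypothetical.
Version = namedtuple("Version", "value pvn timestamp")


def resolve(local, incoming):
    """Pick the surviving value when two writes collide.

    Assumption: a higher PVN wins outright; the timestamp is only a
    tie-breaker when both sides carry the same PVN.
    """
    if incoming.pvn != local.pvn:
        return incoming if incoming.pvn > local.pvn else local
    return incoming if incoming.timestamp > local.timestamp else local


ny = Version("555-0100", pvn=2, timestamp=1000)
la = Version("555-0199", pvn=2, timestamp=1005)
print(resolve(ny, la).value)
```

With equal PVNs the later timestamp decides the outcome, so a server whose clock runs fast would win every tie. That is the practical reason for keeping a central, accurate clock.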
Another replication issue is looping.
Active Directory lets administrators
configure multiple replication paths
for redundancy purposes. To prevent
changes from endlessly updating, Active
Directory creates lists of USN pairs
on each server. These lists, called
up-to-date vectors (UDVs), maintain
the highest USN of each originating
write. Each UDV maintains a list of all
the other servers within its own
membership site. When replication occurs,
the requesting server sends its own
UDV to the sending server. The highest
USN for each originating write is used
to determine if the change still needs
to be replicated. If the USN number
is the same or higher, then no change
is needed because the requesting server
is already shown to be up to date.
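A minimal sketch of the up-to-date vector check follows. It assumes each change carries the name of the server where it originated and that server's USN at the time; the tuple layout and function names are hypothetical.

```python
# Each change records where it originated and the originating USN.
# All structures here are hypothetical.

def changes_to_send(sender_log, requester_udv):
    """Filter the sender's log against the requester's UDV.

    sender_log: list of (originating_server, originating_usn, prop, value)
    requester_udv: {originating_server: highest originating USN held}
    """
    return [
        change for change in sender_log
        if change[1] > requester_udv.get(change[0], 0)
    ]


def apply_changes(udv, changes):
    """Record received changes in the requester's UDV."""
    for origin, usn, _prop, _value in changes:
        udv[origin] = max(udv.get(origin, 0), usn)


# Server C already received A's change 7 directly from A.
udv_c = {"A": 7}
# Server B holds A's change 7 plus its own change 3.
log_b = [("A", 7, "phone", "555-0100"), ("B", 3, "title", "Manager")]

pending = changes_to_send(log_b, udv_c)
apply_changes(udv_c, pending)
print(pending, udv_c)
```

Because C's vector already shows USN 7 for server A, B does not resend that change over the redundant path, and the update stops circulating.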
Michael Chacon
Time for an IP Network
DNS is an IP-based service, which means you can't
use protocols such as IPX or NetBEUI with Active Directory.
If you use those protocols, you'll have to continue
using the standard NetBIOS services (such as the Browser
service) for publishing, locating, and establishing sessions
within a Windows NT network, which is not a good idea. If you
don't have an IP network today, now's a good
time to design one in preparation for NT 5.0 and AD. Another
important point to make here is that if you use the Dynamic
Host Configuration Protocol (DHCP) for IP address allocation,
you'll want to implement Dynamic DNS as a replacement
for WINS. Dynamic DNS will allow changes in IP address-to-host
mappings to be updated in the DNS. Thus, you'll avoid
the need to make the changes manually. In most DNS systems,
the network resources have static IP addresses. This means
that every time a device obtains a new address, someone
must open a text editor and update the DNS records manually.
Depending on the size of your network, this manual updating
can range from a mild pain in the derrière to completely
impossible.
Most sites will use static IP addresses for network resources
only, such as servers. Client machines won't participate
in the DNS. This is another reason to ban peer-to-peer
networking from your network. If you allow users to share
resources and publish them in the AD, the DNS must have
entries for those clients; otherwise, it won't be
able to let other clients locate and establish sessions
with them.
Dynamic DNS servers will use the protocol described in
RFC 2136 to allow IP addresses to be updated on a periodic
basis in the DNS. However, this RFC has yet to be widely
implemented. If you want this functionality immediately,
you'll need to use the Microsoft DNS. A more likely
scenario is a compromise in which you use Microsoft Dynamic
DNS for local network resolution and have it report up
to the static Unix-based enterprise DNS using accepted
zone transfers. This area of network administration will,
for a while anyway, be more dynamic than any new version
of DNS.
Integrating DNS and AD
To illustrate DNS and Active Directory integration, let's
look at a typical user request. The client looks for an
object, such as a file or other resource, in the AD. For
example, the user might want to find a server containing
sales reports. The request contains the address of the
nearest Active Directory server, which was gleaned from
the logon sequence. Active Directory first looks in the
Global Catalog for the property selected. If the property
wasn't indexed in the Global Catalog, Active Directory
searches through the actual directory partition to locate
the object. One of the properties of the object would
be the server's fully qualified domain name, such
as server1.sales.ny.abccompany.com. Following standard
DNS resolution (see this month's Windows
Insider), the client then requests the server's
IP address. Once a session is established with the server,
higher-level network operating system protocols such as
SMB are used to communicate with the server through a
share name (another property associated with the DS object).
As you can see, without an up-to-date DNS, users can't
connect to the objects listed in the AD.
Logical Considerations
Now that we've looked at some of the physical aspects
of the AD, let's examine some of the logical considerations.
At the logical level, as with DNS, the Active Directory
is simply another namespace. There are two main types
of information stored in a directory. One is the logical
location of a desired object, usually something such
as an application, file, or printer. The other type is
a list of attributes about the object. These attributes
provide search characteristics such as phone number, address,
eye color, or shoe size. The attributes can be used simply
to inform or to provide search criteria when looking for
general items of interest. Using attributes for searching
becomes even more important when the Active Directory
schema is extended. When you add objects, classes of objects,
and attributes for those objects, their structure will
determine how useful they'll be to the directory
users.
As in most useful directories, Active Directory objects
are structured like trees or containers. A container is
an object unto itself that exists only in the Active Directory.
While it has attributes that can be used for description
and location, it doesn't represent an actual object
on the network. Rather, it's the holding place for
objects and other containers. A series of containers holding
objects connected in a hierarchy for organizational purposes
is called a tree. The branches of the tree are containers
that point to the endpoints, or leaves, which are the
objects the users need. The branch containers can also
hold objects.
Each container and object in the tree is uniquely named.
The namespace then becomes the complete path of all the
containers and objects, or branches and leaves, in the
tree. Where you place an object in the tree determines
the distinguished name that I mentioned earlier. The distinguished
name identifies the complete path down through the tree
hierarchy, such as:
/O=Internet/DC=com/DC=Ascolta/DC=Irvine/CN=Users/CN=Managers/CN=Michael Chacon
Because the distinguished name is useful for organizing,
but not so useful for remembering, Active Directory also
borrows the idea of Relative Distinguished Names (RDN).
An RDN is an attribute of the object that can still be
uniquely identified within the tree, such as:
CN=Michael Chacon
In this example, the RDN could also be used as the logon
ID for the network. You could also add other attributes,
such as a logon alias, that are easier to remember.
As I mentioned earlier, the foundation for the namespace
used for most networks will be based upon the current
DNS namespace. This DNS relationship will determine the
shape of the Active Directory tree and the relationship
of the objects to each other. In my previous example,
the Domain Component (DC) entries are the domains listed in
the DNS, while the Common Name (CN) entries are the specific
paths for the user objects in the directory.
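The split between the DC entries and the CN entries of a distinguished name can be shown with a short parser. This is a sketch for the slash-separated form used in the example above; the function names are hypothetical, and lowercasing the resulting DNS name is an assumption made for readability.

```python
def parse_dn(dn):
    """Split a slash-separated DN into (type, value) pairs, root first."""
    parts = [p for p in dn.split("/") if p]
    return [tuple(p.split("=", 1)) for p in parts]


def rdn(dn):
    """Return the last component, the relative distinguished name."""
    attr, value = parse_dn(dn)[-1]
    return f"{attr}={value}"


def dns_domain(dn):
    """Join the DC components into a DNS name.

    The DN lists them from the root down, so they are reversed to get
    the usual DNS order.
    """
    labels = [value for kind, value in parse_dn(dn) if kind == "DC"]
    return ".".join(reversed(labels)).lower()


DN = ("/O=Internet/DC=com/DC=Ascolta/DC=Irvine"
      "/CN=Users/CN=Managers/CN=Michael Chacon")
print(rdn(DN))
print(dns_domain(DN))
```

For the sample DN, the DC parts yield the DNS domain irvine.ascolta.com, and the CN parts describe the path to the user object inside that domain.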
Physical IP subnets are created to manage local ARP broadcast
traffic and otherwise control and maintain network efficiency.
The DNS servers create administrative domains and organize
the physical IP addresses into a hierarchy users can more
easily understand. Windows NT domains are created to build
a security boundary around network resources. This is
accomplished by controlling user authentication and the
assignment of permissions and privileges. Creating a contiguous
DNS namespace that dovetails with a contiguous Active
Directory namespace will take a great deal of planning
and compromise on the part of the network administrator.
Multiple domains with a contiguous namespace under Active
Directory are referred to as domain trees. For example,
contiguous means that the top or root domain could be
called company.com, while the child domains underneath
it could be called something like ca.company.com and ny.company.com.
Unlike the SAM-based NT 4.0 directory model, each domain
will have automatically generated trust relationships.
In addition, the trust relationships will be transitive
rather than point-to-point, as they were previously. This
means that, if domain A trusts domain B
and domain B trusts domain C,
then domain A and C will also
trust each other for authentication. This is accomplished
using the Kerberos authentication protocol, which is the
default in Windows NT 5.0.
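The difference between transitive and point-to-point trusts amounts to reachability in a graph. The sketch below is only a model of that idea, with made-up domain names; it says nothing about how Kerberos actually carries out the authentication.

```python
from collections import deque


def trusted_domains(direct_trusts, start):
    """Return every domain reachable from start by following trusts.

    direct_trusts: {domain: set of domains it directly trusts}
    """
    seen, queue = set(), deque([start])
    while queue:
        domain = queue.popleft()
        for other in direct_trusts.get(domain, ()):
            if other not in seen and other != start:
                seen.add(other)
                queue.append(other)
    return seen


# Two-way trusts between A-B and B-C; no direct trust between A and C.
TRUSTS = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}

print(sorted(trusted_domains(TRUSTS, "A")))  # transitive view
print(sorted(TRUSTS["A"]))                   # point-to-point view
```

Under the transitive model, A reaches C through B without an explicit A-to-C trust, whereas the point-to-point view stops at B.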
You can use an Active Directory without a contiguous
namespace, but it's not preferred. Usually, this
need will arise either when two companies merge operations
or simply because of poor planning. For example, a non-contiguous
namespace could be two domains called company1.com and
company2.com. When you're dealing with a non-contiguous
namespace and you want the various trees to share the
same schema, configuration, and Global Catalog, you can
create a forest. In a forest, Active Directory
cross-references the root of each tree to build the hierarchies
in the directory. The Kerberos trust relationships then
use the directory for authentication.
Because these design choices are made during installation,
you'll need to carefully pre-plan how your new Active
Directory server will relate to the rest of the tree.
During the installation process, you'll be given
several choices: create a new tree in a new forest (which
you'll do with the very first Active Directory machine),
create a new tree in an existing forest, create a replica
of a current domain, or create a child domain. One nice
thing about these choices is that you can uninstall Active
Directory without having to reinstall the entire NT 5.0
server. To make a member Windows NT server a DC, all you
need to do is add the Active Directory service. Inversely,
to make a DC a member server, all
you need to do is remove the Active Directory service. You don't
have to reinstall the operating system to make this change,
as was the case with domain controllers in Windows NT
4.0.
Figure 2. Trusts in Windows NT
4.0 are non-transitive; trusts go from point to point.
Trusts in NT 5.0 are transitive; if domain A
trusts domain B, which trusts domain C,
then A will also trust C.
Changes with Groups
Another aspect of the logical planning process for Active
Directory is the concept of groups. The functionality
and terminology of groups change dramatically from Windows
NT 4.0. This is a good thing. Global groups are still
with us, but now they can contain other global groups.
Yes, we finally have nested groups! Global groups are
still used to collect users, making it easier to drop
them into other groups anywhere else in the forest. However,
global groups can only contain users and other global
groups from the same domain that the Global Group exists
in. Domain-local groups can contain users and global groups
from any domain in the Active Directory forest. However,
they can only be applied to Access Control Lists, or ACLs,
on objects within the same domain. As with NT 4.0, these
domain-local groups will be used to tie permissions to
local objects. New is the universal group, which can contain
all other groups and users from any tree in the forest
and can be used with any ACL within the forest.
We can have Security Groups, comparable to
those in NT 4.0, or Distribution Groups, which
will also be Exchange Distribution Lists when the next
version of Exchange surfaces, but a group can only
be one or the other, not both. The distribution list functionality
is granted because you can have non-security objects in
the Active Directory that are used only for identifying
the recipients of electronic messages, even those outside
the administrative authority of the tree.
The three types of groups (global, domain-local,
and universal) can be combined to control access to
network resources. The basic use of global groups will
be for organizing users into administrative containers
that represent their respective domains. Universal groups
will be used to contain the global groups from the various
domains to further manage the domain hierarchy when granting
permissions. Microsoft advises adding global groups to
universal groups, then assigning permissions to domain-local
groups where the resource physically exists. This architecture
lets administrators add and remove users from each domain's
global groups to control access to resources throughout
the enterprise without having to make changes in multiple
locations.
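The recommended nesting (users in global groups, global groups in universal groups, universal groups in domain-local groups that hold the permissions) can be sketched as a recursive expansion. The group names, users, and the effective_users helper are all invented for illustration, and the sketch does not guard against circular nesting.

```python
# Hypothetical groups: name -> (scope, members). Members are user names
# or the names of other groups.
GROUPS = {
    "NY-Sales": ("global", ["alice", "bob"]),
    "CA-Sales": ("global", ["carol"]),
    "All-Sales": ("universal", ["NY-Sales", "CA-Sales"]),
    "Reports-Readers": ("domain-local", ["All-Sales"]),
}


def effective_users(group, groups=GROUPS):
    """Expand nested groups into the set of users they contain."""
    users = set()
    for member in groups[group][1]:
        if member in groups:
            users |= effective_users(member, groups)
        else:
            users.add(member)
    return users


print(sorted(effective_users("Reports-Readers")))

# Adding a user to one global group is the only change needed.
GROUPS["CA-Sales"][1].append("dave")
print(sorted(effective_users("Reports-Readers")))
```

The permission stays attached to the domain-local group the whole time; only the membership of a single global group changes, which is the administrative saving described above.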
These group relationships can also minimize the global
catalog replication traffic. Domain-local groups
don't appear in the global catalog; they are, by
definition, local. Global groups appear in the catalog,
but their members don't, and group membership isn't
replicated outside of the defined domain. Universal
group members are published in the catalog, but they should
contain primarily global groups. Obviously, when planning
your directory tree, you'll need to consider the
relationships among the various types of groups.
AD Leads NT 5.0
It's clear that Active Directory will help move
Windows NT 5.0 into a whole new ballgame. Despite the
continuity of the low-level architecture and the security
model from previous versions of NT, there's a wide
array of new services in NT 5.0 that will need to interoperate
with the Active Directory. The best advice I can give
is plan, learn DNS, plan some more, really learn DNS,
and then plan again. AD, like the enterprise it serves,
is almost organic, something that will always be changing, and
so will the directory design over time. I'll continue
to cover NT 5.0 issues throughout the year in NT
Insider.
Make sure you keep the distinctions of your logical design
and physical design appropriately clear. Examine the physical
layout of your network in terms of switches, hubs, and
routers. Design a test installation with the current beta
versions of NT 5.0 (as they are released) that will produce
replication problems. This is a new product. There will
be problems. The key is to understand them before you
deploy instead of discovering them on your production
network.
Your goal is to roll out the real thing without uncovering
many surprises. You can, after all, reproduce a complex
network with one router. If you plan and test with the
fundamental principle in mind that a directory service
is a database, you'll go a long way toward building
a solid Active Directory implementation.
Thanks to Greg Neilson for providing insights about
Active Directory to the editorial crew.