        
        Get Active Directory Replication Right!
        
        
			- By Andrew Lindley
- October 01, 2002
        There’s a method to the madness of Active Directory replication, but 
        many of the concepts can be tough to decipher. The following four tales 
        demonstrate the range of problems you can encounter with this little-understood 
        aspect of AD.
      
Scene 1: Deny the Defaults
      
      
A company with a hub and spoke network topology was unable to figure 
        out why they had so many AD replication connections across their WAN links. 
        They were tapping into one of AD’s most important features—the ability 
        to use Sites to control replication and authentication traffic. Their 
        AD design called for a single domain spanning all their WAN links, so 
        they really needed to make sure replication was as tuned as possible. 
        I’d seen the same problem they were having on numerous occasions and was 
        confident of the solution.
      The first thing I told them was they hadn’t done anything wrong with 
        their configuration. Instead, they were seeing the results of a particular 
        default AD setting. In order to minimize the amount of replication latency 
        in AD, all Site Links are bridged by default. This means a domain controller 
        from any site will try to create replication connections to DCs in any 
        other site that has a DC from the same domain. In addition to that connection, 
        there will also be a replication connection between every Site in order 
        to replicate AD Schema and Configuration information.
      Figure 1 shows that there were connection objects between all of the 
        DCs in all the Sites. This is because all four DCs are from the same domain 
        and, by default, all the Site Links are bridged. Changing the default 
        settings involves the following steps:
      
-  Open AD Sites and Services.
-  Expand Sites, then Inter-Site Transports.
-  Right-click IP and select Properties.
-  On the General tab, uncheck the box labeled “Bridge all site links.”
      After you uncheck this box, the number of replication connections 
        drops once the Knowledge Consistency Checker (KCC) has run on every 
        DC in the topology. The KCC runs every 15 minutes by default, but you 
        can trigger it manually in AD Sites and Services by highlighting the 
        NTDS Settings object under each DC, clicking Action and selecting “Check 
        Replication Topology.” Figure 2 shows how the replication connections 
        changed after turning off the “Bridge all site links” setting.
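
      As a rough illustration of why the setting matters, the sketch below 
        counts which pairs of sites are eligible for a direct connection. It is 
        a simplified model, not the KCC's actual algorithm, and the site names 
        and the function eligible_site_pairs are hypothetical.

```python
from itertools import combinations


def eligible_site_pairs(site_links, bridge_all):
    """Return the site pairs that could be joined by a direct connection.

    site_links: list of tuples of site names, one tuple per site link.
    bridge_all: True models the default "Bridge all site links" setting.
    """
    # Pairs that share a site link are always eligible.
    direct = set()
    neighbors = {}
    for link in site_links:
        for a, b in combinations(sorted(set(link)), 2):
            direct.add((a, b))
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)
    if not bridge_all:
        return direct

    # With bridging on, site links are treated as transitive: any two sites
    # in the same connected group of links become eligible.
    eligible = set()
    seen = set()
    for start in neighbors:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            site = stack.pop()
            if site in group:
                continue
            group.add(site)
            stack.extend(neighbors[site] - group)
        seen |= group
        eligible.update(combinations(sorted(group), 2))
    return eligible


# Hypothetical hub-and-spoke layout: every spoke has a site link to the hub.
links = [("Hub", "SpokeA"), ("Hub", "SpokeB"), ("Hub", "SpokeC")]
print(len(eligible_site_pairs(links, bridge_all=True)))
print(len(eligible_site_pairs(links, bridge_all=False)))
```

      In this toy layout, bridging makes every spoke-to-spoke pair eligible, 
        while turning it off leaves only the spoke-to-hub pairs.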
      
         
      Figure 1. Before: Leaving the default Site Link settings results in 
        a plethora of connection objects and lots of replication traffic.
      
       
      
         
      Figure 2. After: The same domain after making changes to the Site 
        Bridging properties.
      
      The company’s last requirement was to reduce replication latency between 
        their two manufacturing sites. To facilitate this, we simply created a 
        site link bridge and added the two manufacturing site links to it.
      
      
Scene 2: Satellite Slowdown
        A company with several satellite WAN links was having problems 
        getting AD replication to complete successfully. The WAN link bandwidth 
        should have been enough to allow smooth replication, and they were confused 
        about why it wasn’t happening. This was an issue I’d dealt with myself 
        a few years ago, so I was familiar with the problems they were having. 
        Figure 3 shows an example of their setup.
      The first problem was that satellite links are notorious for higher 
        latency than other connections such as frame relay. AD uses Remote 
        Procedure Calls (RPC) as its default replication protocol, and RPC is 
        extremely susceptible to network latency. My first suggestion was to 
        upgrade their WAN links from the satellite connections they were using. 
        They said no, since they were committed to making replication work 
        over their current connections.
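
      One rough way to see why latency hurts even when bandwidth looks adequate 
        is to add up the time a chatty exchange spends waiting on round trips. 
        The sketch below is back-of-the-envelope arithmetic only; the round-trip 
        counts, delays, payload size and link speed are all assumed values, not 
        measurements from this company's network.

```python
def transfer_seconds(round_trips, rtt_seconds, payload_bytes, bandwidth_bps):
    """Estimate the time for a request/response exchange.

    The total is the time spent waiting on round trips plus the time needed
    to push the payload through the link.
    """
    waiting = round_trips * rtt_seconds
    sending = payload_bytes * 8 / bandwidth_bps
    return waiting + sending


# Assumed values: 200 round trips, 1 MB of data, 512 kbps on both links.
# The only difference is the round-trip time of each link type.
frame_relay = transfer_seconds(200, 0.05, 1_000_000, 512_000)
satellite = transfer_seconds(200, 0.6, 1_000_000, 512_000)
print(f"frame relay: {frame_relay:.1f} s, satellite: {satellite:.1f} s")
```

      With identical bandwidth, the satellite case in this sketch takes several 
        times longer purely because of the round-trip delay.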
      
         
      Figure 3. Before: The satellite links of this company’s WAN weren’t 
        replicating properly.
      
       
      
         
      Figure 4. After: The reworked network topology included two new 
        domains and addition of the SMTP protocol.
      
      Active Directory replication has just two available protocols: RPC and 
        Simple Mail Transfer Protocol (SMTP). Since their links weren’t able 
        to support RPC replication, their only other option was to switch to SMTP 
        replication across the satellite connections.
      First, though, we had to address some major Windows 2000-related SMTP 
        replication restrictions. One is that SMTP replication is only available 
        between sites, while RPC is the only protocol that you can use within 
        a site. This makes sense, since you should have plenty of bandwidth within 
        a site for RPC replication to work without any problems.
      The most important restriction is that DCs from the same AD domain can’t 
        use SMTP to replicate with each other. So if this company wanted to use 
        SMTP replication, they’d have to create a separate AD domain for every 
        remote site that had a satellite connection.
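
      The two restrictions above can be summarized as a small decision 
        function. This is a sketch of the rules as described here, with a 
        hypothetical function name; it is not an API that Windows exposes.

```python
def allowed_transports(same_site, same_domain):
    """Return the replication transports two DCs may use with each other.

    same_site: True if both DCs are in the same AD site.
    same_domain: True if both DCs belong to the same AD domain.
    """
    if same_site:
        # Within a site, RPC is the only option.
        return ["RPC"]
    if same_domain:
        # Between sites, DCs of the same domain still cannot use SMTP.
        return ["RPC"]
    # Between sites and between domains, either transport is available.
    return ["RPC", "SMTP"]


print(allowed_transports(same_site=False, same_domain=True))
print(allowed_transports(same_site=False, same_domain=False))
```

      Only the last case offers SMTP, which is why each satellite site needed 
        its own domain.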
      They weren’t particularly excited about doing this, but in order to get 
        their replication working and keep their satellite WAN links, they decided 
        it was the only solution that made sense. Global Catalog, schema, and 
        configuration data can all be replicated over SMTP, so they were still 
        able to provide a local Global Catalog server for these remote sites. 
        Figure 4 shows what the SMTP replication topology looked like.
      Configuring SMTP replication was a fairly straightforward process. For 
        a step-by-step guide to setting it up, see “Additional 
        Information.”
      Scene 3: Beware Consultants Who Know Nothing
        A company had been working with another consultant on its AD design but 
        was questioning his recommendations. The consultant told them they should 
        have DNS installed on every DC in their environment because AD replication 
        wouldn’t work if they didn’t. Fortunately, I was able to help them go through 
        a redesign before they implemented a solution that would have been difficult 
        to maintain and support.
      The advice they’d received was absolutely incorrect. AD was designed 
        to use DNS to locate services running on DCs. It shouldn’t change the 
        way you’d normally configure a DNS infrastructure; rather, it should just 
        build off what’s already in place. Many companies choose to use BIND for 
        DNS, which wouldn’t be running on the DCs since BIND is typically installed 
        on either a Unix or Linux platform. Before talking too much about their 
        DNS infrastructure, we revisited their AD domain design to ensure that 
        they knew exactly what they wanted. This is always a good idea since every 
        AD domain requires a DNS domain with the same name. Figure 5 shows an 
        example of what their AD domain structure and DNS infrastructure looked 
        like after following the advice of the consultant.
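
      To make "using DNS to locate services" concrete, the sketch below builds 
        the SRV record names a client looks up to find a DC. Any DNS server that 
        can answer these lookups will do, whether or not it runs on a DC. The 
        domain corp.example.com, the site Branch1 and the function name are 
        hypothetical.

```python
def dc_locator_names(dns_domain, site=None):
    """Return the SRV record names used to find a DC, most specific first.

    dns_domain: DNS name of the AD domain.
    site: optional AD site name; if given, the site-specific name comes first.
    """
    names = [f"_ldap._tcp.dc._msdcs.{dns_domain}"]
    if site:
        names.insert(0, f"_ldap._tcp.{site}._sites.dc._msdcs.{dns_domain}")
    return names


for name in dc_locator_names("corp.example.com", site="Branch1"):
    print(name)
```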
      Deciding to create multiple forests is a big decision and one I never 
        take lightly. Talking with this company’s IT department convinced me that 
        they had good reason to have the division within their environment. The 
        reason for having two forests is that they had a section of the network 
        that was not as trusted as the rest, so they wanted those minimally trusted 
        domains to have limited access to the rest of the network resources. They 
        also wanted to ensure that the only DNS records accessible from the external 
        network were for resources that should be seen.
      
         
      Figure 5. Before: This company’s proposed network would have had DNS 
        installed on every domain controller, which is not a good idea.
      
       
      
         
      Figure 6. After: The redesigned DNS structure, using “Shadow zones” 
        for the external forest. S.P. represents a Standard Primary zone, S.S. 
        a Standard Secondary zone.
      
      They were aware that with Win2K DNS, security can be set only on AD-integrated 
        zones, but they were still having trouble figuring out whether their proposed 
        solution would work. The catch is that, because the DNS records are stored 
        in the domain partition in AD, only DCs in the same domain can hold an 
        AD-integrated copy of a DNS zone. So, for example, if a DC from the public1.net 
        domain hosts an AD-integrated copy of the public1.net DNS domain, only 
        other public1.net DCs can hold AD-integrated copies of that zone.
      Another interesting caveat is that a DNS server that’s also a DC can 
        host any DNS zone as an AD-integrated zone, including a zone that will 
        be hosting records for a separate AD domain.
      The company was also curious about what a change to their proposed DNS 
        infrastructure would do to their AD replication topology. I explained 
        that since the external network had a separate forest, there wouldn’t 
        be any AD replication between the external and internal networks. I also 
        showed them another option that would satisfy all their requirements.
      Since they were going to stick with Win2K DNS, there was really only 
        one feasible option to allow them to control what records were seen by 
        the external network: Shadow zones. When using this method, the DNS servers 
        in the external network actually have a primary copy of zone files used 
        in the internal network. The internal domain admins ensure that any records 
        for machines that should be seen by the external network are manually 
        added to the external zone file. In this situation, the number of records 
        was small, so it didn’t add much of an administrative burden. None of 
        the AD service location records was needed in the external zone files 
        because there wouldn’t be any replication between the two forests. Figure 
        6 shows the redesigned DNS infrastructure with the public1.net and public2.net 
        name servers hosting shadow copies of the internal zones. This allows 
        the internal administrators to control exactly what records they want 
        visible to the public network.
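
      The selection step behind a shadow zone can be pictured as a filter over 
        the internal records. In this case the admins did it by hand; the sketch 
        below only models the idea, and the record names, addresses and function 
        name are hypothetical.

```python
def build_shadow_zone(internal_records, published_names):
    """Return only the records an admin has chosen to expose externally.

    internal_records: dict mapping record name -> data in the internal zone.
    published_names: set of names approved for the external (shadow) zone.
    """
    shadow = {}
    for name, data in internal_records.items():
        # AD service location records start with "_" and stay internal.
        if name.startswith("_"):
            continue
        if name in published_names:
            shadow[name] = data
    return shadow


# Hypothetical internal zone contents (documentation-range addresses).
internal = {
    "www": "192.0.2.10",
    "mail": "192.0.2.20",
    "dc1": "192.0.2.30",
    "_ldap._tcp.dc._msdcs": "dc1",
}
print(build_shadow_zone(internal, {"www", "mail"}))
```

      Anything not explicitly listed, including the DC and its service 
        location record, never appears in the external copy.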
      
      
Scene 4: Hidden Costs
        A company with multiple redundant WAN links was having trouble getting 
        their replication connections to work the way they wanted. The company 
        had connections between two of their branch offices for redundancy, but 
        figured that since there wasn’t much traffic going over the link it could 
        be used to reduce replication latency. They’d changed the costs on their 
        AD site links but still weren’t getting the desired result. Their main 
        problem was a misunderstanding of how site costs work.
      Although the AD connection objects showed the connections between the 
        two branch offices, the replication traffic was still going over the two 
        T1 links. To truly see what was going on, we diagrammed their router and 
        site link costs in their environment (see Figure 7).
      
         
      Figure 7. The excessive router cost between the two branches was 
        forcing this company's traffic through the more saturated T1 links.
      
      Notice that the actual network routing cost between the two branch offices 
        is more than the combined cost between the branch offices and the corporate 
        hub. This is because the network was designed to route traffic through 
        the corporate site, with the 256K link serving as a backup connection. 
        The AD site costs show that the cost between the branch offices is less 
        than the combined cost between the branches and the corporate office; 
        however, the traffic was actually going through the corporate office.
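
      The mismatch is easier to see with numbers. The sketch below applies the 
        same comparison to two independent cost tables; the cost values are made 
        up for illustration and are not the ones from Figure 7.

```python
def cheapest_path(costs, src, dst, via):
    """Compare the direct cost with the cost of going through a third site."""
    direct = costs[(src, dst)]
    through = costs[(src, via)] + costs[(via, dst)]
    return "direct" if direct < through else f"via {via}"


# Hypothetical AD site link costs: the direct branch link is cheaper.
site_link_costs = {
    ("BranchA", "BranchB"): 150,
    ("BranchA", "Hub"): 100,
    ("Hub", "BranchB"): 100,
}
# Hypothetical router costs: the direct 256K link is only a backup path.
router_costs = {
    ("BranchA", "BranchB"): 500,
    ("BranchA", "Hub"): 100,
    ("Hub", "BranchB"): 100,
}

print("connection objects:", cheapest_path(site_link_costs, "BranchA", "BranchB", "Hub"))
print("actual packets:", cheapest_path(router_costs, "BranchA", "BranchB", "Hub"))
```

      The two tables are evaluated separately, so AD can build a direct 
        connection object while the routers still carry its packets through the hub.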
      I’ve always felt that the costs of AD site links were one of the most 
        difficult concepts to understand. The costs placed on Site Links affect 
        only where the connection objects will connect within the replication 
        topology. So, for example, even though the Site Link cost ensures that 
        the connection objects run directly between the DCs in the two branch 
        office locations, the network costs force the actual traffic through the 
        routers at the corporate office.
      One way to get the actual traffic to go directly over the 256K link 
        between the two branch offices would be to change the network routing 
        costs so that the cost between the two branch offices was less than the 
        combined cost through the corporate office. This wasn’t optimal in this 
        scenario, however, because it would force all traffic between the branch 
        offices to follow that same path. The better way to get just the AD 
        replication traffic to follow that path was to add routes directly on 
        the DCs, using the command-line “route add” command on the Win2K DCs in 
        the branch offices. Normally, DCs communicate with each other through 
        their default gateways. The command, “route add <destination IP> mask 
        255.255.255.255 <remote office router IP>”, caused the DCs to communicate 
        across the 256K connection. Note: We used the destination DC’s IP address 
        rather than its subnet because we wanted only traffic between the DCs 
        to go across that connection.
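
      As a small sketch of that host route, the helper below assembles the 
        command string from two addresses and validates them first. The 
        addresses come from documentation ranges and the function name is 
        hypothetical; gateway_ip stands for the router address used in the 
        command above.

```python
import ipaddress


def host_route_command(destination_ip, gateway_ip, persistent=False):
    """Build a Windows "route add" command for a single-host route.

    A mask of 255.255.255.255 matches only the destination address, so other
    traffic to that subnet keeps using the default gateway.
    """
    # IPv4Address raises ValueError if either string is not a valid address.
    dest = ipaddress.IPv4Address(destination_ip)
    gateway = ipaddress.IPv4Address(gateway_ip)
    parts = ["route"]
    if persistent:
        parts.append("-p")  # keep the route across reboots
    parts += ["add", str(dest), "mask", "255.255.255.255", str(gateway)]
    return " ".join(parts)


# Hypothetical addresses: the other branch's DC and the router to send through.
print(host_route_command("198.51.100.10", "192.0.2.1"))
```

      Each branch DC would need its own route pointing at the other branch's 
        DC, and without the -p switch the route is lost when the server restarts.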
      
      
         
      Additional Information
        Read TechNet's "Active Directory Branch Office Planning Guide Series" 
        to learn more about AD replication components and examples for implementing 
        a branch office replication topology. It's available here: TechNet home 
        | Products & Technologies | Active Directory | Windows 2000 Server | Deploy 
        | Active Directory Branch Office Guide Series.
        You'll find useful information on AD architecture in the Windows 2000 
        Resource Kit here: TechNet home | Products & Technologies | Windows 2000 
        Server | Resource Kits | Windows 2000 Server Distributed Systems Guide.
        To learn more about configuring SMTP replication, visit TechNet home 
        | Products & Technologies | Windows 2000 Server | How-To Resources | 
        Step-by-Step Guide to Setting up ISM-SMTP Replication.
      
      Replication Gratification
        I’ve faced many challenges in the last couple of years working with AD. 
        Every company I’ve worked with has had a unique environment, and I’m never 
        surprised to see something I haven’t seen before. I hope that these tales 
        will help you along your path to a smoothly replicating AD environment.