We’re in the middle of replacing all 550+ DC’s in our Active Directory environment with new hardware. Because some developers and applications are hardcoded to use certain DC’s, and because the DC’s are also our DNS servers, we did not want their IP addresses or names to change. If we changed their IP’s, for example, we’d have to change the DNS server entries in every server’s TCP/IP NIC configuration, as well as in the DHCP scopes.
This isn’t too bad, because we worked out a step-by-step process to demote, rename, and re-IP the old systems before we tear them down completely. Then we can bring up the new DC’s with the original name and IP. It’s a lot cleaner than doing DC renames later and much less fraught with difficulties.
Except… I’ve run into fun replication issues with stubborn metadata and KCC’s.
THE SETUP
· Normally, if we make a change, convergence for our entire AD infrastructure takes about 1 hour. A recent AD health check by Microsoft confirmed this.
· We have two DC’s at every site for redundancy.
· All DC’s are GC’s except 2 per domain: the infrastructure master role holder, and a special DC at a central site we use for backups (the ntds.dit is smaller and easier to back up if it is not a GC).
· The infrastructure master is always located at a domain hub site.
· The second DC at the domain hub site is a GC, PDC-emulator role holder, and RID master role holder.
· We have a hub-and-spoke replication topology for each domain, centered around a site with excellent WAN connectivity. That domain hub then replicates with our site, which acts as the national hub.
· Typically, there are anywhere from 5 to 10 sites within a domain. Some have more, though none have fewer than 5.
· All DC’s are DNS servers, carrying their domain’s AD-integrated DNS zone as well as some other legacy zones and the standard root zone.
THE SCENARIO
I demoted and took out the old DC/GC at a domain hub site.
When I tried to promote the new hardware, I got the message: “Can’t join the domain, user already exists.” (Of course, the “user” is the computer, in this case.)
I’ve had those errors before and it is invariably one of three things:
· Debris left over in Sites & Services. If you look at the site, you might see the old DC you demoted still there as an object, but it won’t have any connector objects.
o You can just delete the DC object *IF* you expand it and there is *no* NTDS Settings and no connectors listed in Sites & Services.
o If the NTDS Settings/connectors still exist under the DC object in Sites & Services, you’ll need to perform a forcible removal via NTDSUTIL, which I’ll discuss a little later in this blog.
· The old DC may still be listed as a name server in DNS on the domain DNS zone’s Name Servers tab.
o Open DNS and select the domain’s DNS zone. Right-click on the zone and pick Properties to look at the Name Servers tab.
o If the old DC’s name is still listed as a name server, remove it.
· The old DC may have left an old computer account in Active Directory Users and Computers (ADUC) and you need to delete the old account. (That’s why we usually rename the computer after the demotion, but before we take it down hard for the last time. If you rename it, there should be no old account left in ADUC with the same name.)
WHAT I DID
But this time, I checked all the above things, and it looked clean.
So I opened the DCPROMO log, %windir%\debug\dcpromoui.log, and went to the bottom. There I discovered which DC it was talking to as the sponsor for its addition to the domain. (Note: to find the sponsor, search for: Enter MyNetJoinDomain)
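Scrolling to the bottom of a long log by hand gets old after a few attempts, so here is a minimal Python sketch of the same search. It only looks for the marker text mentioned above and prints the last hit plus a few following lines; it does not try to parse the sponsor's name, because the exact line layout of dcpromoui.log is not something this post documents. The function name, the `context` parameter, and the assumption that the log can be read with the system default text encoding are mine.

```python
from pathlib import Path

# Marker text to search for, as noted in the post.
MARKER = "Enter MyNetJoinDomain"


def find_sponsor_lines(log_path, context=3):
    """Return (line_number, text) pairs for the LAST marker hit
    and up to `context` lines after it. Returns [] if no hit."""
    # Assumption: the log is readable in the default encoding;
    # undecodable bytes are replaced rather than raising an error.
    lines = Path(log_path).read_text(errors="replace").splitlines()
    hits = [i for i, line in enumerate(lines) if MARKER in line]
    if not hits:
        return []
    start = hits[-1]  # the most recent promotion attempt is nearest the bottom
    end = min(start + context + 1, len(lines))
    return [(n + 1, lines[n]) for n in range(start, end)]
```

To use it, call `find_sponsor_lines(r"C:\Windows\debug\dcpromoui.log")` (adjust the path to your own %windir%) and read the returned lines to see which DC was contacted.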
I checked the sponsoring DC and found that it still listed the old DC in Sites & Services *and* it preferred that old DC as its replication partner, even though it no longer existed. And there was nothing I could do in replmon, repadmin or Sites & Services to force the KCC to give up replication with the dead DC and establish a connection with the remaining DC in the site.
So I did a forcible removal of the old DC’s metadata by using NTDSUTIL, with the focus set on the stubborn sponsoring DC. (I’ll list that process further down.)
I tried to DCPROMO the new DC again: no go. I checked the log again and found it had selected a different sponsoring DC from another site. This new sponsoring DC also refused to give up its replication connector to the old, removed DC. So I had to do NTDSUTIL again to remove metadata on that system.
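Finding the stubborn DC’s one failed DCPROMO at a time is slow. As a rough sketch of a shortcut (not something I used at the time), the CSV output of `repadmin /showrepl * /csv` can be saved to a file and scanned for DC’s that still list the removed DC as a replication source. The column names "Source DSA" and "Destination DSA" are what I expect in that CSV header, and the sketch assumes the dead DC still shows up by name rather than as a deleted-object GUID; treat both as assumptions and check them against your own output. The function name is hypothetical.

```python
import csv
import io


def partners_still_using(showrepl_csv, dead_dc):
    """Return a sorted list of destination DC names whose inbound
    replication still names `dead_dc` as the source."""
    reader = csv.DictReader(io.StringIO(showrepl_csv))
    dead = dead_dc.strip().lower()
    stale = set()
    for row in reader:
        # Assumed column names; missing columns are treated as empty.
        source = (row.get("Source DSA") or "").strip().lower()
        if source == dead:
            stale.add((row.get("Destination DSA") or "").strip())
    return sorted(stale)
```

Feed it the text of the saved CSV file and the short name of the demoted DC. Each name it returns is a candidate for the NTDSUTIL cleanup described below.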
HOW I DID IT
Here, in a nutshell, is how to remove a demoted DC’s metadata so that the KCC will stop trying to create connectors to DC’s that no longer exist, and so that you can reuse a domain controller name (if you wish to).
Oh, you have to be at least a domain admin.
And you have to do this on a DC in your domain.
1. Open Sites & Services / Expand the target site / Expand the target DC you want to remove
2. Check for the NTDS object beneath the DC server object and connections within that
3. If the NTDS object does *not* exist, just delete the DC server object and you’re done. Skip to the end.
4. If the NTDS object exists, *continue*
5. At the command prompt, enter: ntdsutil
a. At the ntdsutil prompt, enter: metadata cleanup
6. Enter: connections
7. Enter: connect to server servername
where servername is the FQDN (myserver.subdom.dom) of a DC in the domain you’re working with
8. Enter: quit
9. Enter: select operation target
10. Enter: List sites
a. Scroll through the sites to find the site containing the stubborn DC server object
b. Enter: select site sitenumber
where sitenumber is the number of the site containing the stubborn DC server object
11. Enter: list domains
a. Scroll through the domains to find the domain containing the stubborn DC server object
b. Enter: select domain domainnumber
where domainnumber is the number of the domain containing the stubborn DC server object
12. Enter: list servers in site
13. Enter: select server servernumber
where servernumber is the number of the stubborn DC you want to remove
14. Enter: list current selections
VERIFY that you have selected the DC you want to remove
15. Enter: Quit
16. Enter: remove selected server
17. Read the popup window and VERIFY what you are going to delete
18. Click on [Yes]
19. Enter: quit
Keep entering quit until you exit ntdsutil
20. Go back to Sites & Services
21. Make sure the stubborn DC now has no NTDS object under it.
22. If the stubborn DC is now clean, delete the stubborn DC in Sites & Services
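Since I ended up repeating these steps on one DC after another, a small helper that prints the command sequence for a given target can save some typing mistakes. The following is a minimal Python sketch, not an automated cleanup: the site, domain and server numbers only become known once you have run the list commands interactively, so the output is a checklist to follow at the ntdsutil prompt. It starts with `metadata cleanup`, the ntdsutil menu in which `select operation target` and `remove selected server` live. The function names and parameters are hypothetical.

```python
def metadata_cleanup_commands(server_fqdn, site_number, domain_number, server_number):
    """Build the ordered list of commands to type at the ntdsutil prompt."""
    return [
        "metadata cleanup",
        "connections",
        f"connect to server {server_fqdn}",  # a live DC in the domain
        "quit",
        "select operation target",
        "list sites",
        f"select site {site_number}",
        "list domains",
        f"select domain {domain_number}",
        "list servers in site",
        f"select server {server_number}",    # the stubborn DC
        "list current selections",           # verify before removing
        "quit",
        "remove selected server",
        "quit",
        "quit",
    ]


def format_checklist(commands):
    """Number the commands so they can be ticked off one by one."""
    return "\n".join(f"{i:2d}. {cmd}" for i, cmd in enumerate(commands, 1))
```

For example, `print(format_checklist(metadata_cleanup_commands("myserver.subdom.dom", 3, 0, 1)))` prints the sequence for site 3, domain 0, server 1. Still stop at "list current selections" and VERIFY the target before going further.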
FINAL EXPLANATION
Because I was working with a DC at the main domain hub site, and that DC was the only GC at that site, the KCC on all the other domain DC’s preferred that DC/GC. When the DC/GC was demoted, there was “nothing” at the other end of that connector. The sites preferred to keep that connector (even though I deleted connectors and forced the KCC to rerun).
Because the other DC’s in other sites were only trying to communicate with the demoted DC/GC, they did not replicate the metadata that indicated that the old DC/GC was gone. So the KCC would always regenerate connectors to the dead DC, which in turn meant DC’s in other sites never got the news about the demotion.
I finally had to use the ntdsutil method on every promotion-sponsoring DC, which was basically one DC at each site in the domain (the DC elected as the replication bridgehead) before the dcpromo would agree that the DC’s name was “free” to use to join the new DC to the domain and promote it.
Whew. What a pain in the neck.
Sincerely,
Amy G. Padgett