Tag: Active Directory

Debugging An Active Directory Custom Password Filter

A few years ago, I implemented a custom password filter in Active Directory. At some point, it began accepting passwords that should be rejected. The updated code is available at https://github.com/ljr55555/OpenPasswordFilter and the following is the approach I used to isolate the cause of the failure.

 

Technique #1 — Netcap on the loopback There are utilities that allow you to capture network traffic across the loopback interface. This is helpful in isolating problems in the service binary or inter-process communication. I used RawCap because it’s free for commercial use. There are other approaches too – or consult the search engine of your choice.

The capture file can be opened in Wireshark. The communication is done in clear text (which is why I bound the service to localhost), so you’ll see the password:

And response

To ensure process integrity, the full communication is for the client to send “test\n” then “PasswordToTest\n”, after which the server sends back either true or false.

Technique #2 — Debuggers Attaching a debugger to lsass.exe is not fun. Use a remote debugger — until you tell the debugger to proceed, the OS is pretty much useless. And if the OS is waiting on you to click something running locally, you are quite out of luck. A remote debugger allows you to use a functional operating system to tell the debugger to proceed, at which time the system being debugged returns to service.

Install the SDK debugging utilities on your domain controller and another box. Which SDK debugging tool? That’s going to depend on your OS. For Windows 10 and Windows Server 2012 R2, the Windows 10 SDK (Debugging Tools For Windows 10) work. https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/debugger-download-tools or Google it.

On the domain controller, find the PID of LSASS and write it down (472 in my example). Check the IP address of the domain controller (10.104.164.110 in my example).

From the domain controller, run:

dbgsrv.exe -t tcp:port=11235,password=s0m3passw0rd

Where port=11235 can be any un-used port and password=s0m3passw0rd can be whatever string you want … you’ve just got to use the same values when you connect from the client. Hit enter and you’ve got a debugging server. It won’t look like it did anything, but you’ll see the port bound on netstat

And the binary running in taskman

From the other box, run the following command (substituting the correct server IP, port, password, and process ID):

windbg.exe -y “srv:c:\symbols_pub*http://msdl.microsoft.com/downloads/symbols” -premote tcp:server=10.104.164.110,port=11235,password=s0m3passw0rd -p 472

This attaches your WinDBG to the debugging server & includes an internet-hosted symbol path. Don’t worry when it says “Debugee not connected” at the bottom – that just means the connection has not completed. If it didn’t connect at all (firewall, bad port number, bad password), you’d get a pop-up error indicating that the initial connection failed.

Wait for it … this may take a long time to load up, during which time your DC is vegged. But eventually, you’ll be connected. Don’t try to use the DC yet – it will just seem hung, and trying to get things working just make it worse. Once the debugger is connected, send ‘g’ to the debugger to commence – and now the DC is working again.

Down at the bottom of the command window, there’s a status (0:035> below) followed by a field where you enter commands. Type the letter g in there & hit enter.

The status will then say “Debuggee is running …” and you’re server is again responsive to user requests.

When you reach a failing test, pause the debugger with a break command (Debug=>Break, or Ctrl-Break) which will veg out the DC again. You can view the call stack, memory, etc.

To search the address space for an ASCII string use:

!for_each_module s -[1]a ${@#Base} L?${@#Size}  "bobbob"

Where “bobbob” is the password I had tested.

Alternately, run the “psychodebug” build where LARGEADDRESSAWARE is set to NO and you can search just the low 2-gig memory space (32-bit process memory space):

s -a 0 L?80000000 "bobbob"

* The true/false server response is an ASCII string, not a Boolean. *

Once you have found what you are looking for, “go” the debugger (F5, Debug=>Go, or  ‘g’) to restore the server to an operational state. Break again when you want to look at something.

To disconnect, break and send “qd” to the debugger (quit and detach). If you do not detach with qd, the process being debugged terminates. Having lsass.exe terminate really freaks out the server, and it will go into an auto-recovery “I’m going to reboot in one minute” mode. It’ll come back, but detaching without terminating the process is a lot nicer.

Technique #3 – Compile a verbose version. I added a number of event log writes within the DLL (obviously, it’s not a good idea in production to log out candidate passwords in clear text!). While using the debugger will get you there eventually, half an hour worth of searching for each event (the timing is tricky so the failed event is still in memory when you break the debugger) … having each iteration write what it was doing to the event log was FAAAAAR simpler.

And since I’m running this on a dev DC where the passwords coming across are all generated from a load sim script … not exactly super-secret stuff hitting the event log.

Right now, I’ve got an incredibly verbose DLL on APP556 under d:\tempcsg\ljr\2\debugbuild\psychodebug\ … all of the commented out event log writes from https://github.com/ljr55555/OpenPasswordFilter aren’t commented out.

Stop the OpenPasswordFilter service, put the verbose DLL and executables in place, and reboot. Change some passwords, then look in the event viewer.

ERROR events are actual problems that would show up either way. INFORMATION events are extras. I haven’t bothered to learn how to properly register event sources in Windows yet 🙂 You can find the error content at the bottom of the “this isn’t registered” complaint:

You will see events for the following steps:

DLL starting CreateSocket

About to test password 123paetec123-Dictionary-1-2

Finished sendall function to test password123paetec123-Dictionary-1-2

Got t on test of paetec123-Dictionary-1-2

The final line will either say “Got t” for true or “Got f” for false.

Technique #4 – Running the code through the debugger. Whilst there’s no good way to get the “Notification Package” hook to run the DLL through the debugger, you can install Visual Studio on a dev domain controller and execute the service binary through the debugger. This allows you to set breakpoints and watch variable values as the program executes – which makes it a whole lot easier than using WinDBG to debug the production code.

Grab a copy of the source code – we’re going to be making some changes that should not be promoted to production, so I work on a temporary copy of the project and delete the copy once testing has completed.

Open the project in Visual Studio. Right-click OPFService in the “Solution Explorer” and select “Properties”

Change the build configuration to “Debug”

Un-check “Optimize code” – code optimization is good for production run, but it will wipe out variable values when you want to see them.

Set a breakpoint on execution – on the OPFDictionary.cs file, the loop checking to see if the proposed word is contained in the banned word list is a good breakpoint. The return statements are another good breakpoint as it pauses program execution right before a password test iteration has completed.

Build the solution (Build=>Build Solution). Stop the Windows OpenPasswordFilter service.

Launch the service binary through the debugger (Debug=>Start Debugging).

Because the program is being run interactively instead of through a service, you’ll get a command window that says “Press any key to stop the program”. Minimize this.

From a new command prompt, telnet to localhost on port 5995 (the telnet client is not installed by default, so you may need to use “Turn Windows features on or off” and enable the telnet client first).

Once the connection is established, use CTRL and ] to get into the telnet command prompt. Type set localecho … now you’ll be able to see what you are typing.

Hit enter again and you’ll return to the blank window that is your telnet client. Type test and hit enter. Then type a candidate password and hit enter.

Program execution will pause at the breakpoint you’ve set. Return to Visual Studio. Select Debug =>Window=>Locals to open a view of the variable values

View the locals at the breakpoint, then hit F5 if you want to continue.

If you’re set breakpoints on either of the return statements, program execution will also pause before the return … which gives you an opportunity to see which return is being used & compare the variable values again.

In this case, I submitted a password that was in the banned word list, so the program rightly evaluated line 56 to true and returns true.

LAPS For Local Computer Administrator Passwords

Overview

LAPS is Microsoft’s solution to a long-existing problem within a corporation using Windows computers: when you image computers, all of the local administrator passwords are the same. Now some organizations implemented a process to routinely change that password, but someone who is able to compromise the local administrator password on one box basically owns all of the other imaged workstations until the next password change.

Because your computer’s local administrator password is the same as everyone else’s, IT support cannot just give you a local password to access your box when it is malfunctioning. This means remote employees with incorrect system settings end up driving into an office just to allow an IT person to log into the box.

With LAPS, there is no longer one ring to rule them all – LAPS allows us to maintain unique local administrator passwords on domain member computers. A user can be provided their local administrator password without allowing access to all of the other domain-member PCs (or a compromised password one one box lets the attacker own only that box). A compromised box is still a problem, but access to other boxes within the domain would only be possible by retrieving other credentials stored on the device.

Considerations

Security: The end user is prevented from accessing the password or interacting with the process. The computer account manages the password, not the user (per section 4 LAPS_TechnicalSpecification.docx from https://www.microsoft.com/en-us/download/details.aspx?id=46899).

Within the directory, read access is insufficient (per https://blogs.msdn.microsoft.com/laps/2015/06/01/laps-and-password-storage-in-clear-text-in-ad/) to view the attribute values. In my proposed deployment, users (even those who will be retrieving the password legitimately) will use a web interface, so a single service acct will have read access to the confidential ms-Mcs-AdmPwd attribute and write access to ms-Mcs-AdminPwdExpirationTime. There are already powershell scripts published to search an improperly secured directory and dump a list of computer names & local administrator passwords. You should run Find-AdmPwdExtendedrights -identity :<OU FQDN> to determine who has the ability to read the password values to avoid this really embarrassing oversight.

Should anyone have access to read the ms-Mcs-AdmPwd value beyond the service account? If the web interface goes down for some reason, is obtaining the local administrator password sufficiently important that, for example, help desk management should be able to see the password through the MS provided client? Depends on the use cases, but I’m guessing yes (if for no other reason than the top level AD admins will have access and will probably get rung up to find the password if the site goes down).

In the AD permissions, watch who has write permission to ms-Mcs-AdminPwdExpirationTime as write access allows someone to bump out the expiry date for the local admin password. Are we paranoid enough to run a filter for expiry > GPO interval? Or does setting “not not allow password expiration time longer than required by policy” to Enabled sufficently mitigate the issue? To me, it does … but the answer really depends on how confidential the data on these computers happens to be.

With read access to ms-Mcs-AdmPwdExpirationTime, you can ascertain which computers are using LAPS to manage the local administrator password (a future value is set in the attribute) and which are not (a null or past value). Is that a significant enough security risk to worry about mitigating? An attacker may try to limit their attacks to computers that do not use LAPS to manage the local admin password. They can also ascertain how long the current password will be valid.

How do you gain access to the box if the local admin password stored in AD does not work (for whatever reason)? I don’t think you’re worse off than you would be today – someone might give you the local desktop password, someone might make you drive into the office … but bears considering if we’ve created a scenario where someone might have a bigger problem than under the current setup.

Does this interact at all with workplace join computers? My guess is no, but haven’t found anything specific about how workplace joined computers interact with corporate GPOs.

Server Side

Potential AD load – depends on expiry interval. Not huge, but non-zero.

Schema extension needs to be loaded. Remove extended rights from attribute for everyone who has it. Add computer self rights. Add control access for web service acct – some individuals too as backup in case web server is down??

Does a report on almost expired passwords and notify someone have value?

Client Side

Someone else figures this out, not my deal-e-o. Set GPO for test machines, make sure value populates, test logon to machine with password from AD. Provide mechanism to force update of local admin password on specific machine (i.e. if I ring in and get the local admin password today, it should get changed to a new password in some short delta time).

Admin Interface

Web interface, provide computer name & get password. Log who made request & what computer name. If more than X requests made per user in a (delta time), send e-mail alert to admin user just in case it is suspicious activity. If more than Y requests made per user in a (longer delta time), send e-mail alert to admin user manager.

Additionally we need a function to clear the password expiry (force the machine to set a new password) to be used after local password is given to an end user.

User Interface

Can we map user to computer name and give the user a process to recover password without calling HD? Or have the manager log in & be able to pull local administrator for their directs? Or some other way to go about actually reducing call volume.

Future Considerations

Excluding ms-Mcs-AdmPwd  from repl to RODC – really no point to it being there.

Do we get this hooked up for acquired company domains too, or do they wait until they get in the WIN domain?

Does this facilitate new machine deployment to remote users? If you get a newly imaged machine & know its name, get the local admin password, log in, VPN in … can you do a run-as to get your creds cached? Or do a change user and still have the VPN session running so you can change to a domain user account?

LAPS For Servers: Should this be done on servers too? Web site could restrict who could view desktops v/s who could view servers … but it would save time/effort when someone leaves the group/company there too. Could even have non-TSG folks who would be able to get access to specific boxes – no idea if that’s something Michael would want, but same idea as the desktop side where now I wouldn’t give someone the password ‘cause it’s the password for thousands of other computers … may be people they wouldn’t want having local admin on any WIN box they maintain … but having local admin on the four boxes that run their app … maybe that’s a bonus. If it is deployed to servers, make sure they don’t put it on DCs (unless you want to use LAPS to manage the domain administrator password … which is an interesting consideration but has so many potential problems I don’t want to think about it right now especially since you’d have to find which DC updated the password most recently).

LAPS For VDI: Should this be done on VDI workstations? Even though it’s a easier to set the password on VDI the base VDI images than each individual workstation, it’s still manual effort & provides an attack vector for all of the *other* VDI sessions. Persistent sessions are OK without any thought because functionally no different than workstations. Non-persistent with new name each time are OK too – although I suspect you end up with a BUNCH of machine objects in AD that need to be cleaned up as new machine names come online. Maybe VDI sorts this … but the LAPS ‘stuff’ is functionally no different than bringing a whole bunch of new workstations online all the time.

Non-persistent sessions with same computer name … since the password update interval probably won’t have elapsed, the in-image password will be used. Can implement an on-boot script that clears AdmPwdExpirationTime to force change. Or a script to clear value on system shutdown (but that would need to handle non-clean shutdowns). That would require some testing.

 

Testing Process

We can have a full proof of concept type test by loading schema into test active directory (verify no adverse impact is seen) and having a workstation joined to the test domain. We could provide a quick web site where you input a computer name & get back a password (basically lacking the security-related controls where # of requests generate some action). This would allow testing of the password on the local machine. Would also allow testing of force-updating the local admin password.

Once we determine that this is worth the effort, web site would need to be flushed out (DB created for audit tracking). Schema and rights would need to be set up in AD. Then it’s pretty much on the desktop / GPO side. I’d recommend setting the GPO for a small number of test workstations first … but that’s what they do for pretty much any GPO change so not exactly ground breaking.

Viewing Active Directory Object Metadata

Objects in active directory have a modification timestamp attribute, whenChanged, that reflects the time of the last change to the object. This is useful if you want to confirm a change had not been made after a specific time (e.g. the user began having problems at 2PM yesterday, but their object was last changed November of last year … an account change is not likely to be the cause).

There is additional stored metadata which provides a modification timestamp (and source domain controller for the modification event) for each individual attribute on an object. This can be a lot more useful (e.g. a user’s home directory is incorrect, but the object modification timestamp reflects the fact they changed their password yesterday). To view the metadata, use repadmin /showobjmeta DC-Hostname “objectFQDN”

I redirect the output to a file; it’s a lot easier to search a text file for the attribute name than scroll through all of the attributes in a DOS window.

repadmin /showobjmeta dc.domain.gTLD "cn=user account,ou=pathToObject,dc=domain,dc=gTLD" > myaccount.txt

57 entries.
Loc.USN Originating DSA                       Org.USN   Org.Time/Date       Ver   Attribute
======= ===============                       ========= =============       ===   =========
20822   92d3c1e5-d4ed-41c7-989f-62a1712b1084  20822     2014-06-08 22:20:57 1     cn
...
4659114 Default-First-Site-Name\DC            4659114   2016-12-29 20:56:21 10    unicodePwd
3299408 Default-First-Site-Name\DC            3299408   2016-01-16 17:03:05 13    lockoutTime
4978129 Default-First-Site-Name\DC            4978129   2017-02-18 21:50:13 90    lastLogonTimestamp
4988421 Default-First-Site-Name\DC            4988421   2017-02-22 10:31:06 54333 msDS-LastSuccessfulInteractiveLogonTime
4977488 Default-First-Site-Name\DC            4977488   2017-02-18 16:21:12 223   msDS-LastFailedInteractiveLogonTime
4977488 Default-First-Site-Name\DC            4977488   2017-02-18 16:21:12 223   msDS-FailedInteractiveLogonCount
4977489 Default-First-Site-Name\DC            4977489   2017-02-18 16:21:18 165   msDS-FailedInteractiveLogonCountAtLastSuccessfulLogon

The originating DSA may be an odd GUID value (the domain controller on which this change was initiated has since been decommissioned) or it may be an AD site and domain controller name.

The originating timestamp indicates when the attribute’s value was last changed. The version indicates the number of revisions on the attribute value – which itself can provide interesting information like the number of times an account has been locked out or the number of times a user has changed their password.

This information can be useful when an account change does correspond with a user experiencing problems. You can identify the specific attributes that were updated and research those specific values.

It’s also useful to track down who changed a specific attribute value. The combination of originating domain controller and attribute modification time can make searching for the event log record corresponding to a specific change a lot easier — you know which server to search and can filter the log down to records spanning a few seconds.

 

Custom Password Filter Update (unable to log on after changing password with custom filter in place)

I had written and tested a custom Active Directory password filter – my test included verifying the password actually worked. The automated testing was to select a UID from a pool, select a test category (good password, re-used password, password from dictionary, password that doesn’t meet character requirements, password containing surname, password containing givenName), set the password on the user id. Record the result from the password set, then attempt to use that password and record the result from the bind attempt. Each test category has an expected result, and any operation where the password set or bind didn’t match the expected results were highlighted. I also included a high precision timer to record the time to complete the password set operation (wanted to verify we weren’t adversely impacting the user experience). Published results, documented the installation and configuration of my password filter, and was done.

Until the chap who was installing it in production rang me to say he couldn’t actually log in using the password he set on the account. Which was odd – I set one and then did an LDAP bind and verified the password. But he couldn’t use the same password to log into a workstation in the test domain. Huh?? I actually knew people who wanted *some* users to be able to log in anywhere and others to be restricted to LDAP-only logons (i.e. web portal stuff) and ended up using the userWorkstations attribute to allow logon to DCs only.

We opened a case with Microsoft and it turns out that their Password Filter Programming Considerations didn’t actually mean “Erase all memory used to store passwords by calling the SecureZeroMemory function before freeing memory.” What they meant was “If you have created copies of the password anywhere within your code, make sure you erase memory used to store those copies by calling SecureZeroMemory …”

Which makes SO much more sense … as the comments in the code I used as our base says, why wouldn’t MS handle wiping the memory? Does it not get cleaned well if you don’t have a custom password filter?? Remarked out the call to SecureZeroMemory and you could use the password on NTLM authentications as well as kerberos!

// MS documentation suggests doing this. I honestly don’t know why LSA
// doesn’t just do this for you after we return. But, I’ll do what the
// docs say…
// LJR – 2016-12-15 Per MS, they actually mean to wipe any COPIES you make
// SecureZeroMemory(Password->Buffer, Password->Length);

 

I’ve updated my version of the filter and opened an issue on the source GitHub project … but if anyone else is working a custom password filter, following MS’s published programming considerations, and finds themselves unable to use the password they set … see if you are zapping your copies of the password or the PUNICODE_STRING that comes in.

Active Directory: Custom Password Filtering

At work, we’ve never used the “normal” way of changing Windows passwords. Historically, this is because computers were not members of the domain … so you couldn’t use Ctrl-Alt-Del to change your domain password. Now that computers are members of the domain, changing Active Directory passwords using an external method creates a lot of account lockouts. The Windows workstation is logged in using the old credentials, the password gets changed without it knowing (although you can use ctrl-alt-del, lock the workstation unlock with the new password and update the local workstation creds), and the workstation continues using the old credentials and locks the account.

This is incredibly disruptive to business, and quite a burden on the help desk … so we are going to hook the AD-initiated password changes and feed them into the Identity Management platform. Except … the password policies don’t match. But AD doesn’t know the policy on the other end … so the AD password gets changed and then the new password fails to be committed into the IDM system. And then the user gets locked out of something else because they keep trying to use their new password (and it isn’t like a user knows which directory is the back-end authentication source for a web app to use password n in AD and n-1 in DSEE).

long time ago, back when I knew some military IT folks who were migrating to Windows 2000 and needed to implement Rainbow series compliant passwords in AD – which was possible using a custom password filter. This meant a custom coded DLL that accepted or rejected the proposed password based on custom-coded rules. Never got into the code behind it – I just knew they would grab the DLL & how to register it on the domain controller.

This functionality was exactly what we needed — and Microsoft still has a provision to use a custom password filter. Now all we needed was, well, a custom password filter. The password rules prohibit the use of your user ID, your name, and a small set of words that are globally applied to all users. Microsoft’s passfilt.dll takes care of the first two — although with subtle differences from the IDM system’s rules. So my requirement became a custom password filter that prohibits passwords containing case insensitive substrings from a list of words.

I based my project on OpenPasswordFilter on GitHub — the source code prohibits exact string matches. Close, but not quite 🙂 I modified the program to check the proposed password for case insensitive substrings. I also changed the application binding to localhost from all IP address since there’s no need for the program to be accessed from outside the box. For troubleshooting purposes, I removed the requirement that the binary be run as a service and instead allowed it to be run from a command prompt or as a service.  I’m still adding some more robust error handling, but we’re ready to test! I’ve asked them to baseline changing passwords without the custom filter, using a custom filter that has the banned word list hard coded into the binary, and using a custom filter that sources its banned words list from a text file. Hopefully we’ll find there isn’t a significant increase in the time it takes a user to change their password.

My updated code is available at http://lisa.rushworth.us/OpenPasswordFilter-Edited.zip

 

USN Rollback

I had to recover my domain controller from the Hyper-V image backup. There’s some protection build into AD which prevents just randomly reverting a server. When you’ve got a larger domain, the built-in protection after unsupported restoration procedures serves a purpose. Pausing netlogon avoids having users log on against bad data. Disabling replication avoids propagating bad information out to the remainder of the network. The solution is simple – demote the DC, promote it again, and the DC returns to service. But when you have a single domain controller in a single domain in a single forest … well, there’s no other data around. What the recovered DC has is as good as it’s going to get (i.e. a change from 2AM is lost when I revert to my 10PM backup). And taking the entire domain down and building it overkill. You can, instead, basically tell AD to go with it. From the MS documentation:

To restore a previous version of a virtual domain controller VHD without system state data backup

  1. Using the previous VHD, start the virtual domain controller in DSRM, as described in the previous section. Do not allow the domain controller to start in normal mode. If you miss the Windows Boot Manager screen and the domain controller begins to start in normal mode, turn off the virtual machine to prevent it from completing startup. See the previous section for detailed instructions for entering DSRM.
  2. Open Registry Editor. To open Registry Editor, click Start, click Run, type regedit, and then click OK. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes. In Registry Editor, expand the following path: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Parameters. Look for a value named DSA Previous Restore Count. If the value is there, make a note of the setting. If the value is not there, the setting is equal to the default, which is zero. Do not add a value if you do not see one there.
  3. Right-click the Parameters key, click New, and then click DWORD (32-bit) Value.
  4. Type the new name Database restored from backup, and then press ENTER.
  5. Double-click the value that you just created to open the Edit DWORD (32-bit) Value dialog box, and then type 1 in the Value data box. The Database restored from backup entry option is available on domain controllers that are running Windows 2000 Server with Service Pack 4 (SP4), Windows Server 2003 with the updates that are included in article 875495 (http://go.microsoft.com/fwlink/?LinkId=137182) in the Microsoft Knowledge Base installed, and Windows Server 2008.
  6. Restart the domain controller in normal mode.
  7. When the domain controller restarts, open Event Viewer. To open Event Viewer, click Start, click Control Panel, double-click Administrative Tools, and then double-click Event Viewer.
  8. Expand Application and Services Logs, and then click the Directory Services log. Ensure that events appear in the details pane.
  9. Right-click the Directory Services log, and then click Find. In Find what, type 1109, and then click Find Next.
  10. You should see at least an Event ID 1109 entry. If you do not see this entry, proceed to the next step. Otherwise, double-click the entry, and then review the text confirming that the update was made to the InvocationID:

 

  • Active Directory has been restored from backup media, or has been configured to host an application partition. 
    The invocationID attribute for this directory server has been changed. 
    The highest update sequence number at the time the backup was created is <time>
    
    InvocationID attribute (old value):<Previous InvocationID value>
    InvocationID attribute (new value):<New InvocationID value>
    Update sequence number:<USN>
    
    The InvocationID is changed when a directory server is restored from backup media or is configured to host a writeable application directory partition.
    
  • Close Event Viewer.
  • Use Registry Editor to verify that the value in DSA Previous Restore Count is equal to the previous value plus one. If this is not the correct value and you cannot find an entry for Event ID 1109 in Event Viewer, verify that the domain controller’s service packs are current. You cannot try this procedure again on the same VHD. You can try again on a copy of the VHD or a different VHD that has not been started in normal mode by starting over at step 1.
  • Close Registry Editor.

 

After following the instructions from Microsoft, I still had a problem — my DC has replication turned off & netlogon comes up paused. In regedit, locate HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Parameters and delete the “Dsa Not Writable” key (value: dword:00000004). In a command prompt, run the following:

 

repadmin /options dchostname.example.com -DISABLE_OUTBOUND_REPL
repadmin /options dchostname.example.com -DISABLE_INBOUND_REPL

Reboot the DC. When it starts, netlogon should be running and replication.

Microsoft Directories – NT and Windows 2000/2003/2008/2012

Windows NT provided a limited repository for user id’s and passwords.  NT domain credentials had the advantage of providing single-sign-on access to other Microsoft resources such as file shares and Exchange.  Exchange itself housed a secondary directory, used for the “global address list” type details for Exchange accounts.  Address, phone number, manager, email addresses … basically anything other than the user’s ID and password were stored within the Exchange directory.  The Exchange directory then linked each account into an NT4 domain user account for logon credentials.

With Windows 2000, Microsoft integrated the two directories into Active Directory.  This allowed a more robust set of user details to be provided – and moved the LDAP compliant directory off the Exchange server onto the domain controllers.  Major changes were introduced in Active Directory – an increased maximum object count (from 40,000 to ten million in a single domain with a billions of objects in an AD forest), multi-master architecture, and attribute level replication being some of the key changes.

Data Store

Active Directory data is stored in ntds.dit.  ESE (extensible storage engine) is used to access the data within the database.  In addition to ntds.dit, there are several peripheral database files – edb.log is the current in-use transaction log file.  EDB#####.log may be present if the edb.log file has been filled.  EDB.CHK is the checkpoint file – this keeps track of which transactions have been committed to ntds.dit and a crash of the system will cause the transaction logs to be replayed from the pointer referenced in the chk file.  Res1.log and res2.log, ten meg in total, are placeholder files just in case should the server run out of disk space the files are removed to allow continued operation.

Within NTDS.DIT there are two main tables:

  • The link table – metadata for calculating linked values
  • The data table – actual domain data

There are four other tables about which no additional information will be provided here-in

  • System Table – metadata for the DSA-defined tables and indices
  • HiddenTable – DSA metadata
  • SDPropTable – Transiently stores Security Descriptor propagation, records are removed from table as propagation completes
  • MSysDefrag1 – ESE database table, not specific to AD

For linked attributes, the backlinked attribute is not modified directly but rather determined when it is queried.  As an example – Active Directory generates a reporting structure.  An object has a manager, but the “reports” listing is calculated based on object’s managers.  The linkID of a forward link attribute is always even and it’s associated backlinked attribute is always the forward linkID plus one (consequently also always odd).  A full list of forward/back link pairs can be generated by looking at the linkID values.

The data table contains three different naming contexts – the schema, the configuration, and the domain data.  These correspond to the three partitions shown in REPLMON – “cn=schema,cn=configuration,dc=windstream,dc=com”, “cn=configuration,dc=windstream,dc=com”, and “dc=windstream,dc=com”.  The term partition in Active Directory is used to indicate a naming context – in no way related to Novell’s use of the term to indicate a replication boundry.

The schema and configuration partitions are replicated to all domain controllers in a forest – since we only have one tree in the forest rendering the point moot since all the domain controllers in the domain are also all the domain controllers in the forest.  The domain partition is replicated to all domain controllers in the domain.

 

Active Directory – Schema

Microsoft’s documentation on unmodified schema classes and attributes can be found at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/adschema/adschema/active_directory_schema.asp   The modifications Exchange makes to the AD schema can be found at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/wss/wss/wss_ldf_AD_Schema_intro.asp

The schema management MMC is not automatically available on a Windows machine.  To enable the snap-in, run regsvr32 c:\winnt\system32\schmmgmt.dll – then “Active Directory Schema” will be an option when adding snap-ins to MMC

Active Directory’s schema is normally in a read-only mode and no user has rights to modify the schema.  Prior to enacting a schema change, then, you must enable schema writes and add your account to the “Schema Admins” group.  To enable schema writes, right click on the “Active Directory Schema” item in the MMC and select “Operation Master”.  Then check the box next to “The Schema may be modified on this domain controller”

When creating new schema classes or attributes, ensure you use the correct OID for our organisation.  Preferably, too, create auxiliary classes and associate the aux class with a structural class.  This prevents any vendor changes to the structural class from impacting your schema attributes.

In AD, schema changes cannot be deleted (well, it can but the process is unsupported).  An attribute can be deactivated, but it remains in the schema definition.

Active Directory – Configuration

The AD Configuration partition holds, as the name implies, configuration for the domain and some services within the domain.

  • Display Specifiers: Under the DisplaySpecifiers CN you will see multiple three digit hex number combinations.  These are codes for different languages – 409 being English.  http://www.microsoft.com/globaldev/reference/win2k/setup/lcid.mspx lists the codes used within the Windows internationalisation features.  Under each regional container you will find the actual display specifier for structural schema objects.  On, for instance, the user-Display object, is defined what appears when you right-click a user object in Active Directory Users and Computers.  Another attribute defines the pages which appear when you create a user and the order in which those pages appear.  The createDialog attribute is of particular interest – we modify this to automatically create the display name as lastname, firstname MI if you manually create a user within AD.  This is done by defining the createDialog value as “%<sn>, %<givenName> %<initials>”
  • Extended Rights:  On the controlRightsAccess object, appliesTo defines structural schema objects to which the controlRightsAccess object applies.  The controlRightsAccess objects themselves have several functions.
    • When validAccesses is set to 8, this is to validate writes – or check the attribute value beyond the schema definition.  Implementation is not widespread.
    • When validAccesses is 256, then the object defines an actual extended right – something not part of the normal ACL’s.  Recieve-as and Send-As, for instance, are a special operations for Exchange which can be found in the ExtendedRights container.
    • Other validAccesses codes define ACL groups which can be assigned through the “Delegate Control” function.  and validAccesses indicates what rights the ACL group permits – 16 for read, 32 for write, and the sum of 48 for read/write access.  The membership object in ExtendedRights, with appliesTo bf967aba-0de6-11d0-a285-00aa003049e2 and validAccess of 48 means this access group allows whomever is granted it to both read and write to user objects (bf967aba-0de6-11d0-a285-00aa003049e2 is the guid of the user schema object).  On the schema object “member”, then, the rightsGUID is entered as the attributeSecurityGUID.

An example of the rights grouping is the “Personal-Information” object, rightsGUID 77B5B886-944A-11d1-AEBD-0000F80367C1.  You will find the corresponding octet string, 0x86 0xb8 0xb5 0x77 0x4a 0x94 0xd1 0x11 0xae 0xbd 0x00 0x00 0xf8 0x03 0x67 0xc1, applied to several schema attributes – telephoneNumber, facsimileTelephoneNumber, streetAddress, telexNumber, and so on:

Thus using the “Delegation Of Control” wizard, it is possible to select “Read and write Personal Information” as a permission set rather than specifying each individual attribute you want editable. Note, too, in the ACL editor the listing of “Personal Information” is retained

ForestUpdates

Under ForestUpdates you will see an “Operations” CN.  Operations holds a listing of updates made to the forest (e.g. exchange /forestprep).  This allows the system to check that the requisite forest updates are in place prior to installation without requiring the changes to be re-run.

LostandFoundConfiguration

This is basically the same thing “LostAndFound” in the domain naming context is, but within the configuration partition.  All things being equal, it should be empty.  Should an object be created within the Configuration partition at the same time it’s parent is deleted, the object is moved to “LostAndFoundConfig” for holding.

Partitions

Contains crossref objects to all partitions within the forest – again not as interesting here as it could be with just one tree and domain.

Physical Locations

This is intended for use with Directory Enabled Networking.  The DEN concept is maintained by DMTF (http://www.dmtf.org/standards/wbem/den/) and is not at present implemented at Windstream

Services

Forest-wide application settings – objects within this container correspond directly to the “Services” listed within the “Active Directory Sites And Services” snapin.  One of the services listed is Microsoft Exchange.  Should a server fail, running setup /disasterrecovery will recover most of the Exchange settings for the server from within this container.

Sites

The “Sites” of “Active Directory Sites and Services”.  IP subnets and their associated sites are defined in this container, as well as the replication partnerships between domain controllers.

WellKnown Security Principals

What I call the “virtual credentials” – system security credentials like Everyone and Self are defined herein.

Active Directory – Domain Data

Objects specific to just one domain within the forest – the obvious users, computers, printers, file shares, groups, and contacts.  Less obvious items too are stored within the domain data.  If Windows DNS zones are configured as “Active Directory Integrated”, the DNS entries will appear under “cn=MicrosoftDNS,cn=System,dc=…”.  File replication service (FRS) shares (including the domain SYSVOL), some information on Group Polices, Oracle database connections … any of the structural schema objects … can also be found within this partition.

An object named Infrastructure is in the root of the domain naming context, this object holds the NTDS settings for the domain infrastructure operations master.

Flexible Single Master Operations (FSMO) Roles

FSMO roles are assigned for functions which cannot practically be performed by any domain controller – functionality which cannot subscribe to the multi-master principal.

There are two forest-wide FSMO roles, the Domain Naming Master and the Schema Master.

  • The Schema Master is the server on which writes can be made to the schema.  All domain controllers will have a read-only copy of the schema, but only the schema master can write changes.
  • The Domain Naming Master is used when a new domain is created within a forest – it verifies the new domain has a unique name.

Three additional FSMO roles exist in each domain within a forest.  The Infrastructure Master, RID Master, and PDC Emulator.

  • The Infrastructure Master, in a multi-domain environment, handles cleanup of phantom objects created as members are added to groups via a trust.  The cleanup process is detailed by Microsoft at http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q248047    As we have a single domain, this is somewhat immaterial.  Should we begin implementing other domains, the Infrastructure Master will need to be moved to a non-global catalogue (GC) server.  The GC functionality precludes the phantom objects from being created (and hence from being purged).
  • The RID master allocates blocks of relative ID’s, RID’s, to the domain controllers within the domain to ensure unique GUID’s.  Should the RID master be offline for a short interval, new objects can still be created until the already-allocated RID block has been exhausted.
  • The PDC emulator is multi-function.  Were the domain to be in mixed-mode and therefore support NT4 BDC’s, the PDC emulator is required by the NT4 domain controllers for backwards compatibility.  Our domain is in native-mode and cannot have NT4 BDC’s.  This does not preclude NT4 member servers, just domain controllers.  The PDC Emulator server is authoritative for the user’s password.  Any failed logons are re-checked against the PDC emulator.  In the NT4 environment this was because a BDC was a read-only directory copy to which password changes could not be made.  If you changed your password and attempted to authenticate prior to the domain replicating the change completely, you could receive an invalid password error using your correct new password.  To prevent this issue, a password failure on the BDC was re-checked with the PDC before the logon attempt was failed.  This is how we can allow CSO password changes into AD without requiring the user to wait for domain synchronisation.  The DirXML AD driver is installed to the PDC emulator server to allow immediate use of the user’s new CSO password.  Group Policy Objects are created and edited on the PDC Emulator’s SYSVOL share.  The PDC emulator is also the time source for the domain.  Our PDC emulator is configured to use time.windstream.com as its time source with a time sync period of eight hours.

Normally you can move the FSMO roles between domain controllers using MMC’s.  For the three per-domain roles the change is made in Active Directory Users and Computers.  The Domain Naming Operation Master is changed from Active Directory Domains and Trusts”; the Schema master is changed within Active Directory Schema Manager.

Within Active Directory Users and Computers, right click the domain and select “Connect To Domain Controller” – select the domain controller which will receive the new role.  Then right click the domain and select “Operation Masters”.  You just click the “Change” button to move the role.

In the event of a catastrophic server failure complete with no system state backups you can forcibly transfer the FSMO roles from a non-operational source.  We have done this once in production, the ICM domain, but there is additional peripheral cleanup required to remove the failed domain controller from operation.  http://support.microsoft.com/kb/255504/ contains instructions for seizing FSMO roles.  Microsoft mostly documents the domain cleanup process at http://support.microsoft.com/kb/216498/   If you want to try it for the experience, build two servers, create a fake domain with the two of them, turn one off and seize all the roles onto the remaining machine.  This is effectively what happened in the ICM domain, they had three domain controllers but the first which held all roles was destroyed.  Be careful in production as the post-seizure cleanup is not fully documented.  DNS entries will still exist in BIND.  It is possible for your domain controller machine password to be out of sync with the domain.  I’m sure there are other situations which could arise as well which we didn’t happen across.

Domain Registration – WINS

The WINS entries of your domain should only be used by ‘legacy’ clients, NT4 workstations and servers.  If you configure your domain controllers TCP/IP properties to use your WINS servers, the registration for the domain will be created automatically.  Alternately you can create an LMHOST file for import into a foreign WINS server.  The only reason we do this is to establish a trust with an NT4 domain.  There are two records needed – and for our domain the text is included here-in.

10.33.8.25       SCARLITNT631         #DOM:ALLTEL          #PRE
10.33.8.25       "ALLTEL         \0x1b"  #PRE

If you are attempting to create the LMHOST file for an alternate domain, you can just change the values except the “ALLTEL         \0x1b” entry.  There is a quotation mark, sixteen characters only (insert holy hand grenade like joke here) followed by the \0x1b then the closing quotation mark.  If your domain name is BOB you cannot replace ALLTEL with BOB, you need to replace it with BOB and three trailing space characters.

Domain Registration – DNS

There are four “underscore zones” – new DNS zones used to store the SRV records relevant to the domain. Active Directory works fine with BIND DNS servers – you need to allow dynamic updates from the domain controller IP addresses. Since I do not allow dynamic updates on the root zone, I manually add the domain controller A records.

  • _sites.domain.tld.   Service records advertise servers providing global catalogue, Kerberos, and LDAP services within each site.  The sites are differentiated within the record name – _service._tcp.SITENAME._sites.domain.tld.    The following lines are the _sites records for the TWNUserAuth site
$ORIGIN _tcp.TWNUserAuth._sites.alltel.com.
_gc                               SRV     0 100   3268    neohtwnnt630.alltel.com.
_kerberos                         SRV     0 100   88      neohtwnnt630.alltel.com.
_ldap                             SRV     0 100   389     neohtwnnt630.alltel.com.
_gc                               SRV     0 100   3268    neohtwnnt631.alltel.com.
_kerberos                         SRV     0 100   88      neohtwnnt631.alltel.com.
_ldap                            SRV      0 100   389     neohtwnnt631.alltel.com.
  • _tcp.domain.tld. Service records advertise all domain controllers within the domain providing global catalogue, Kerberos, LDAP, and kpasswd services.  The following lines are the _tcp records for the NEOHTWNNT630 server
$ORIGIN _tcp.alltel.com.
_gc                           SRV     0 100   3268    neohtwnnt630.alltel.com.
_kerberos                     SRV     0 100   88      neohtwnnt630.alltel.com.
_kpasswd                      SRV     0 100   464     neohtwnnt630.alltel.com.
_ldap                         SRV     0 100   389     neohtwnnt630.alltel.com.
  • _udp.domain.tld. Used for UDP kerberos connections to get tickets and change passwords.  Service records in this zone advertise the UDP Kerberos and kpasswd services for the domain.  The following lines are the _udp records for the NEOHTWNNT630 server
$ORIGIN _udp.alltel.com.
_kerberos                   SRV     0 100   88       neohtwnnt630.alltel.com.
_kpasswd                    SRV     0 100   464      neohtwnnt630.alltel.com.
  • _msdcs.domain.tld.  Kerberos, ldap, and global catalogue records by site and not.  In addition each domain controller’s GUID used for replication is registered here.  Again the example provides the service records for NEOHTWNNT630
$ORIGIN _msdcs.alltel.com.
47c1965e-87e8-4445-8552-fd20892c08c2    CNAME          neohtwnnt630.alltel.com.
_ldap._tcp.e0f0a709-9edf-483b-96e6-55c0dd55c1a6.domains           SRV 0 100 389             neohtwnnt630.alltel.com.
gc                      A       10.10.90.217
$ORIGIN _tcp.dc._msdcs.alltel.com.
_kerberos                                 SRV     0 100   88        neohtwnnt630.alltel.com.
_ldap                                     SRV     0 100   389       neohtwnnt630.alltel.com.
$ORIGIN _tcp.TWNUserAuth._sites.dc._msdcs.alltel.com.
_kerberos                                 SRV     0 100   88        neohtwnnt630.alltel.com.
_ldap                                     SRV     0 100   389       neohtwnnt630.alltel.com.
$ORIGIN gc._msdcs.alltel.com.
_ldap._tcp                                SRV     0 100   3268    neohtwnnt630.alltel.com.
$ORIGIN _sites.gc._msdcs.alltel.com.
_ldap._tcp.TWNUserAuth                    SRV     0 100   3268    neohtwnnt630.alltel.com.

The PDC emulator is also advertised here

$ORIGIN _msdcs.alltel.com.
_ldap._tcp.pdc                         SRV     0 100   389      scarlitnt631.alltel.com.

 

Client Authentication

A client which has already authenticated to the domain will have a registry entry which retains the client’s site.

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Netlogon\Parameters]
"DynamicSiteName"="LITUserAuth" 

When a client attempts to authenticate to Active Directory, the service records for the kerberos service are used to determine an appropriate authentication source.  In the case of the PC above, this would be a query for _kerberos._tcp.LITUserAuth._sites.dc._msdcs.alltel.com. service records is made.  An LDAP connection is initiated over udp/389 to every domain controller returned by the DNS query.  Each connection is initiated in 1/10th intervals second.  The receiving servers compare the client’s IP address to the subnet configuration to verify the client is reaching the correct site for it’s current subnet.  The first LDAP response received is then used as the kerberosauthentication server.  If the client’s site is incorrect a referral is returned for the correct site – which then prompt the client to re-query DNS for the correct new site.