Tag: Oracle

OUD 11g Failure

Ran out of disk space on the OUD partition (don’t let people turn on debug logging!!). Easily fixed — stop service, remove big log file, start again. Unfortunately that “start again” wasn’t as easy as it could have been. Server startup fails. It says the cryptomanager failed to publish the instance-key-pair in ADS …. which, Oracle says means you get to recreate the single instance. Honestly, this isn’t a horrible process since I’ve got a backup of the directory. But I’d rather not spend the next hour messing around with the server.

The underlying problem is that the admin partition didn’t dump out to its storage file (full disk), so admin-backend.ldif is a 0-byte file. Now, had the directory been running when I noticed this … I could have dumped the cn=admin data partition to another volume and copied the file in once I cleared up disk space. But I have a file-level backup … and it was easy enough to pull the file from last night back. Voila, one server online in a couple of minutes. And a good reminder for the future that a shutdown with a full disk … no good. I probably want to find something to free up a couple meg of space prior to shutting down the server. Extra-cautious option would be the MS Exchange approach of creating a few 5-meg files to ensure there’s space that can be freed up.

 

Added 06 Feb 2020: One of my colleagues encountered this problem again today, and it turns out you can copy the admin-backend.ldif from another server too. It may not be 100% — he cannot log into the server through the admin console. But he’s got a directory being served & no users complaining about an outage. It can get sorted properly later.

DSEE 6.3 To OUD 11g Transition

There’s no direct path to replicate data from DSEE6.3 to OUD11g. Not unreasonable since DSEE is the Sun product based on the Netscape Directory Server and OUD is the Oracle product based on OpenLDAP – they weren’t exactly designed to allow easy coexistence that would permit customers to switch from one to the other. Problem is, with Oracle’s acquisition of Sun & axing the DSEE product line … customers *need* to interoperate or do a flash cut.
Since our Identity Management (IDM) platform was not able to prep development work and implement their changes along with the directory replacement, a flash cut was right out. I’ve done flash cuts before — essentially ran two completely different directories in parallel with data fed from the Identity Management platform, tested against the new directory using quick modification to the OS hosts file, then reconfiguring the virtual IP on the load balancer to direct the existing VIP to the new service hosts. Quick/easy fail-back is to set the VIP to the old config and sort out whatever is wrong on the new hosts. A lot lower risk than a traditional ‘flash cut’ approach as long as you trust the IDM system to keep data in sync. But lacking an IDM system, flash cut is typically a non-starter anyway.
There is a migration path. Oracle put some development effort into the DSEE product line prior to discontinuing it. DSEE7 was the Sun distributed “next version”. It was not widely deployed prior to the Oracle acquisition. Oracle took over DSEE7 development but called it DSEE11 (to match the OUD version numbering, I guess?). Regardless of the rational, you’ll see the “next version” DSEE product referred to as both DSEE7 and DSEE11.
There’s not a direct replication between Oracle DSEE11 and Oracle OUD11. Oracle created a “replication gateway” that handles, among other things, schema name mapping (only Netscape would use attribute names like nsAccountLockout, and that nomenclature carried through to the Sun product). Oracle did a decent job of testing DSEE11<=>OUD11 Replication Gateway interoperability. I don’t know if they just assumed DSEE6 would work because DSEE11 did or if they assumed the installation base for DSEE6 was negligible (i.e. didn’t bother to test older revisions) but we found massive bugs in the replication gateway working with DSEE6. “You cannot import the data to initialize the OUD11 directory” type of bugs which I was willing to work around by manually editing the export file, but subsequent “updates do not get from point ‘A’ to point ‘B’ bugs too. The answer from Oracle was essentially “upgrade to DSEE11” … which, if i could flash-cut upgrade DSEE6 to DSEE11 (see: IDM platform couldn’t do that), I could just cut it to OUD11 and be done. Any non-trivial change was a non-starter, but Oracle wasn’t going to dump a bunch of development time into fixing replication for a dead product to their shiny new thing.
I worked out a path that used tested and working components — DSEE6 replicated just fine with DSEE11. DSEE11 replicated just fine with the OUD11g replication gateway, and the OUD11g replication gateway replicated fine with OUD11g. Instead of introducing additional expense and time setting up dedicated replication translation servers, I installed multiple components on the new servers. There is a DSEE11 directory on one of the new OUD servers, the replication gateway on another one of the new OUD servers, and (of course) the OUD11g directory that we actually intended to run on the new servers is on those new OUD servers.
This creates additional monitoring overhead – watching replication between three different directories and ensuring all of the services are running – but allows the IDM platform to continue writing changes to the DSEE6.3 directory until they are able to develop and test changes that allow them to use OUD11g directly.