The appliance is no longer collecting any new data. We can't select the most recent hour as the Time Period to look at. Also, we now can't see all of our user sessions--we used to be able to see them all and we haven't changed anything.
We checked the log zip file and found an error pushing data to the database: "LDAP API Error : Insufficient access on host...". More complete output from the ecrit log file:
Jan 30 06:50:18 fxm ecdbpushd: (3) Error in module LdapPushModule : Error pushing: sit http 1 1/3900385 34 04bf588ead095853dc79b9e629b4520a76 1170140700 / 0 (caused by: LDAP API Error : Insufficient access on host=xx.xx.xx.xx port= 9011)
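To check how often this error appears in an extracted log file, a small shell helper along these lines may be useful. It is only a sketch: the function name count_ldap_push_errors is made up for illustration, and the match pattern is taken from the sample log line above rather than from any documented log format.

```shell
# Hypothetical helper (not part of FxM): count log lines where the
# LdapPushModule reports "LDAP API Error : Insufficient access".
# $1 is the path to an extracted log file.
count_ldap_push_errors() {
    awk '/LdapPushModule.*LDAP API Error : Insufficient access/ { n++ }
         END { print n + 0 }' "$1"
}
```

Run it as count_ldap_push_errors followed by the path to the log file pulled from the log zip. A steadily growing count suggests the push failures are ongoing rather than a one-time event.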
"module push SLAP Main. LDAP Foglight push module."
CAUSE 1: The LDAP database has gotten into a corrupt state.
CAUSE 2: A corrupt backup was restored onto the appliance.
CAUSE 3: On Fedora Core 4, the swap partition label is sometimes corrupted when the vendor installs the software on the appliance hardware. This causes the swap partition to have a size of 0 and can cause problems with the Foglight Experience Monitor (FxM) memory monitor. This problem was tracked as CR0208903.
NOTE: You should consider upgrading to version 5.3.x. LDAP database corruption should be rare in version 5.3.x and later because all of the metrics are stored in a MySQL database, not an LDAP database (though configuration settings are still saved in an LDAP database).
WORKAROUND (for Cause 1):
From the command line Console Program, run 'Advanced Configuration' | 'Reset LDAP Referrals', and then choose 'Verify / Fix Database'.
If the above procedure does not work, it may be necessary to reset the current real-time load bucket. The "current bucket" is the database that FxM is trying to insert new metrics into. There are actually many databases (about 70), and FxM "rotates" from one to the next over time.
Instructions for resetting the current real-time load bucket:
Log in to the shell (6,z,shell from the console menu) and run the following commands:
2). dbtool -a init -b rs -d 21
This will wipe out the data in the current real-time load bucket.
WORKAROUND (for Cause 2):
From the command line Console Program, run 'Advanced Configuration' | 'Reset LDAP Referrals', and then choose 'Verify / Fix Database'. If that does not work, remove the restored (corrupted) database completely by resetting the new machine to defaults.
WORKAROUND (for Cause 3):
1. Log into the Console Setup Program and then escape back to the Linux shell. You do this (from the main menu) by entering:
6 (you do not have to press anything else such as the enter key)
z (you do not have to press anything else such as the enter key)
shell (then press the 'enter' key)
2. Go into the /etc/blkid.tab file and find the "swap" entry. There will be a device id associated with it. It will look something like /dev/sdb3. Write that device id down.
3. Go into the /etc/fstab file and find the swap entry. Replace the entire "LABEL=<whatever>" string at the beginning of the line (not just the value after the equals sign) with the device id found in step 2 in the /etc/blkid.tab file (e.g. /dev/sdb3).
4. Reboot the machine.
5. Run the "free" command. The swap device should now show space available.
Now that the swap partition issue is fixed, the FxM memory management will handle the out-of-memory state correctly and your database will no longer get corrupted.
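Steps 2, 3 and 5 above can also be sketched as shell functions. This is an illustration only, not a supported tool: the function names are hypothetical, the blkid.tab entry is assumed to have the form <device ... TYPE="swap">/dev/sdb3</device>, and the fstab swap line is assumed to begin with LABEL= and have "swap" as its third field. Nothing here writes to /etc; the edited fstab is printed to standard output so it can be reviewed first.

```shell
# Hypothetical helpers (not part of FxM). They only read their inputs.

# Step 2: print the device path of the first swap entry in a
# blkid.tab-style file ($1), e.g. /dev/sdb3.
swap_device() {
    sed -n 's|.*TYPE="swap"[^>]*>\([^<]*\)</device>.*|\1|p' "$1" | head -n 1
}

# Step 3: print the fstab file ($1) with the whole LABEL=... field of
# the swap line replaced by the device path ($2). Other lines are
# printed unchanged.
fix_fstab_swap() {
    awk -v dev="$2" '
        $1 ~ /^LABEL=/ && $3 == "swap" { sub(/^LABEL=[^ \t]+/, dev) }
        { print }
    ' "$1"
}

# Step 5: read "free" output on standard input and print the total
# swap size from the "Swap:" row (0 means swap is still missing).
swap_total_kb() {
    awk '$1 == "Swap:" { print $2 }'
}
```

For example, fix_fstab_swap /etc/fstab "$(swap_device /etc/blkid.tab)" prints the proposed new fstab. After comparing it with the original and rebooting, free | swap_total_kb should print a non-zero value.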
STATUS (for Cause 3):
Issue fixed in version 5.2. The latest version of Foglight Experience Monitor can be downloaded at http://support.quest.com/support_download/Downloads.asp