Information System Troubleshooting Guide

Which BDII version?

This page is about BDII v5 used on EMI.

Troubleshooting Steps

Before attempting to troubleshoot problems with the information system, it is important to have a general overview of the information system and, in particular, a working knowledge of the BDII. Information flows from the resource level BDII to the top level BDII via the site level BDII. For this reason, troubleshooting follows a top-down approach.

Identify the service where the problem is seen.

  1. Is the information in the top level BDII correct? If yes, then there is usually no problem.
  2. Is the information correct in the site level BDII but not correct in the top level BDII? If yes, then the problem is probably with the top level BDII.
  3. Is the information correct in the resource level BDII but not correct in the site level BDII? If yes, then the problem is probably with the site level BDII.
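
The three questions above can be answered by sending the same anonymous query to each level and comparing the results. The following is a minimal sketch, not an official tool. The host names and the site name are hypothetical placeholders, and the search bases in the example calls are the conventional ones for each level, which is an assumption to be checked against the local deployment.

```shell
#!/bin/sh
# Sketch only: host names and the site name below are hypothetical placeholders.

# Print the LDAP URL of a BDII; 2170 is the port used throughout this guide.
bdii_url() {
    printf 'ldap://%s:2170\n' "$1"
}

# Anonymous search against one BDII.
#   $1 = host, $2 = search base, $3 = optional LDAP filter
bdii_search() {
    ldapsearch -x -LLL -H "$(bdii_url "$1")" -b "$2" "${3:-(objectClass=*)}"
}

# Example calls (assumed search bases; uncomment and adapt):
# bdii_search top-bdii.example.org  "o=grid"
# bdii_search site-bdii.example.org "mds-vo-name=MY-SITE,o=grid"
# bdii_search ce.example.org        "mds-vo-name=resource,o=grid"
```

Passing a filter as the third argument narrows the output to the entry under investigation, which makes the comparison between levels easier to read.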

Identify the component with the problem

  1. Is the information correct when the bdii-update script is executed as the same user that the BDII runs as? If yes, then the problem is with the BDII.
  2. If not, the problem must be in one of the data sources. Investigate each source in the ldif, provider and plugin directories to identify which one has the problem.
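
To narrow down a faulty data source, each provider or plugin can be run on its own. The sketch below assumes that providers and plugins are executable files in a directory and that they print LDIF on standard output. The directory locations are deployment specific (they are usually set in the BDII configuration file, which is an assumption here). Static files in the ldif directory are not executable and can simply be read.

```shell
#!/bin/sh
# Run every executable file in a directory and report its exit status and
# the number of output lines it produced. Run this as the BDII user.
#   $1 = directory holding providers or plugins (path is deployment specific)
check_sources() {
    for src in "$1"/*; do
        [ -f "$src" ] && [ -x "$src" ] || continue
        out=$("$src" 2>/dev/null)
        status=$?
        lines=$(printf '%s' "$out" | grep -c '')
        printf '%s: exit=%d lines=%d\n' "${src##*/}" "$status" "$lines"
    done
}

# Example (hypothetical path):
# check_sources /path/to/provider
```

A source that exits with a non-zero status or prints no lines is the first candidate for a closer look.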

Investigating the BDII

  1. Check the BDII log file for error messages.
  2. Change the BDII_LOG_LEVEL to DEBUG and check the log file.
  3. Check the files in the BDII_VAR_DIR directory to help locate the problem.
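
The checks above can be scripted. The sketch below reads a setting from a KEY=value style configuration file and lists log lines that mention errors or warnings. BDII_LOG_LEVEL and BDII_VAR_DIR are taken from this page; the BDII_LOG_FILE key used in the example call is an assumption about the local configuration.

```shell
#!/bin/sh
# Print the value of KEY from a KEY=value style configuration file.
#   $1 = configuration file, $2 = key
conf_value() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Print log lines that mention an error or a warning.
#   $1 = log file
log_problems() {
    grep -E 'ERROR|WARNING' "$1"
}

# Example (BDII_LOG_FILE is an assumed key name):
# conf_value /etc/bdii/bdii.conf BDII_LOG_LEVEL
# log_problems "$(conf_value /etc/bdii/bdii.conf BDII_LOG_FILE)"
```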

Common Problems

BDII fails to start

For gLite 3.2 on SL5-compatible installations this can happen due to SELinux settings. One option is to switch SELinux off. Alternatively, the following workaround can be used:

chcon --changes --reference=/var/lib/ldap/ -R /var/bdii/
semanage port -a -t ldap_port_t -p tcp 2170

Further details here:

If the BDII fails to start, this could also be an underlying problem with the LDAP database. Try to start the slapd server with the default slapd.conf file.

/usr/sbin/slapd -f /etc/openldap/slapd.conf -d 255

If this fails, there is a problem with the LDAP installation. Note that this has been experienced when using virtual machines. Online forums related to LDAP and to the OS distribution can be useful for solving this problem.

If the LDAP installation has been verified, the slapd.conf file used by the BDII should be tested.

/usr/sbin/slapd -f /etc/bdii/bdii(-top)-slapd.conf -d 255

If this fails, there could be a problem with the BDII slapd.conf file.

Unable to initialize mutex Error

There were reports about the following error:
...
bdb_db_init: Initializing BDB database
bdb(o=grid): unable to initialize mutex: Function not implemented
bdb(o=grid): /opt/bdii/var/2171/__db.001: unable to initialize environment lock: Function not implemented
...
This issue may be fixed by following the Berkeley DB FAQ provided by Oracle: http://www.oracle.com/technology/products/berkeley-db/faq/db_faq.html#12

Entries missing in the BDII

If invalid LDIF is produced, the entry will be rejected when it is inserted into the LDAP database. Rejected entries are recorded in the BDII log file when logging is set to WARNING or higher.
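
Besides reading the log file, the output of a data source can be given a quick sanity check before it reaches the BDII. The sketch below covers only one kind of invalid LDIF, namely an entry (a block separated by blank lines) that does not start with a dn: line. It is not a full LDIF validator and does not check the schema.

```shell
#!/bin/sh
# Read LDIF on stdin and print the first line of every entry (blank-line
# separated block) that does not begin with "dn:". Comment lines are
# skipped, and a leading "version:" block is accepted.
entries_without_dn() {
    grep -v '^#' |
        awk 'BEGIN { RS = ""; FS = "\n" }
             $1 !~ /^([Dd][Nn]|version):/ { print $1 }'
}

# Example (hypothetical provider path):
# /path/to/provider/some-provider | entries_without_dn
```

An empty result means that every entry starts with a DN; any line printed marks the beginning of an entry that would be rejected.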

Default values shown instead of dynamic values

The dynamic plugin might have a problem, or there is a mismatch between the DNs. Check that the DNs produced by the dynamic plugin are the same as those in the static ldif file. The dynamic plugin should be executed as the same user that the BDII runs as, in order to spot permission problems. Run the following command to spot any errors:
su -s /bin/sh ldap -c "/usr/sbin/bdii-update -c /etc/bdii/bdii.conf" > /dev/null
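
The DN comparison can be automated. The sketch below lists DNs that occur in the plugin output but not in the static LDIF. It assumes that both are available as files, that DNs are not base64 encoded, and that lower-casing the whole DN is an acceptable approximation of LDAP's case-insensitive matching. Differences in spacing inside a DN are still reported as mismatches.

```shell
#!/bin/sh
# Join folded LDIF lines (a continuation line starts with a single space).
unfold_ldif() {
    awk 'NR > 1 && /^ / { buf = buf substr($0, 2); next }
         { if (NR > 1) print buf; buf = $0 }
         END { if (NR > 0) print buf }'
}

# Print the lower-cased, sorted, unique DNs found in the LDIF on stdin.
extract_dns() {
    unfold_ldif | sed -n 's/^[Dd][Nn]:[ ]*//p' |
        tr '[:upper:]' '[:lower:]' | LC_ALL=C sort -u
}

# Print DNs that appear in the plugin output but not in the static LDIF.
#   $1 = file with plugin output, $2 = static LDIF file
dns_missing_from_static() {
    plugin_dns=$(mktemp) || return 1
    static_dns=$(mktemp) || return 1
    extract_dns < "$1" > "$plugin_dns"
    extract_dns < "$2" > "$static_dns"
    LC_ALL=C comm -23 "$plugin_dns" "$static_dns"
    rm -f "$plugin_dns" "$static_dns"
}
```

Any DN printed by dns_missing_from_static has no static entry to update, which would explain why the default value is still shown.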

BDII started but no response from port 2170

Run netstat -l to check whether the slapd server is listening on port 2170. A listening server shows up as a line such as:

tcp        0      0 localhost.localdomain:2170  *:*                         LISTEN
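
For use in a script, the same check can be made on numeric netstat output. The sketch below assumes the usual column layout of netstat -ltn (local address in the fourth column, state in the last column).

```shell
#!/bin/sh
# Read `netstat -ltn` output on stdin and succeed if a TCP socket is
# listening on the given port.
#   $1 = port number
is_listening() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Example:
# netstat -ltn | is_listening 2170 && echo "something is listening on 2170"
```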

The BDII is overloaded with queries

Due to the critical nature of the information system with respect to the operation of the grid, the BDII should be installed as a stand-alone service to ensure that problems with other services do not affect it. Under no circumstances should the BDII be co-hosted with a service that has the potential to generate a high load. If a BDII receives too many queries and the load is too high, multiple instances of the BDII can be deployed as a load-balanced service behind a "round robin" DNS alias.

Detailed logging for slapd is available by configuring syslog for slapd as follows.

Change the loglevel in the slapd.conf to 256.

Add the following line to /etc/syslog.conf:

local4.*                /var/log/slapd.log

Restart the syslog daemon:

service syslog restart

Restart the BDII.

The log file can be parsed by this script which will generate a summary
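
Where that script is not at hand, a rough per-client summary can also be produced with standard tools. The sketch below is not the script mentioned above. It assumes that slapd writes its usual connection lines of the form "conn=N fd=N ACCEPT from IP=address:port (IP=...)" and counts accepted connections per client address.

```shell
#!/bin/sh
# Read a slapd log on stdin and print "count client-address" pairs,
# busiest client first.
connections_per_client() {
    awk '/ ACCEPT from IP=/ {
             for (i = 1; i <= NF; i++) {
                 if ($i ~ /^IP=/) {
                     ip = substr($i, 4)          # drop the "IP=" prefix
                     sub(/:[0-9]+$/, "", ip)     # drop the client port
                     count[ip]++
                     break
                 }
             }
         }
         END { for (ip in count) print count[ip], ip }' | sort -rn
}

# Example:
# connections_per_client < /var/log/slapd.log | head
```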

BDB backend dies on memory allocation error

This issue has been seen on a virtual machine with limited memory.

slapd -f /opt/bdii/var/2171/bdii-slapd.conf -d 25

bdb_db_open: dbenv_open(/opt/bdii/var/2171)
bdb_db_open: dbenv_open(/opt/bdii/var/2171/infosys)
bdb(o=infosys): mmap: Cannot allocate memory
bdb(o=infosys): PANIC: Cannot allocate memory
bdb_db_open: dbenv_open failed: DB_RUNRECOVERY: Fatal error, run database recovery (-30978)
backend_startup: bi_db_open(1) failed! (-30978)
slapd shutdown: initiated
====> bdb_cache_release_all
====> bdb_cache_release_all
bdb(o=infosys): DB_ENV->lock_id_free interface requires an environment configured for the locking subsystem
slapd shutdown: freeing system resources.
bdb(o=infosys): txn_checkpoint interface requires an environment configured for the transaction subsystem
bdb_db_destroy: txn_checkpoint failed: Invalid argument (22)

The solution is to reduce the cache memory allocation specified in /opt/bdii/etc/DB_CONFIG:

set_cachesize N_GBytes N_Bytes N_segments
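
The three numbers of set_cachesize are the gigabyte part of the cache size, the remaining bytes, and the number of cache segments. As an illustration, the sketch below prints the line for a cache size given in megabytes, using a single segment. The sizes in the example are arbitrary and not a recommendation.

```shell
#!/bin/sh
# Print a DB_CONFIG set_cachesize line for a cache of the given size,
# using one cache segment.
#   $1 = cache size in megabytes (whole number)
cachesize_line() {
    gbytes=$(( $1 / 1024 ))
    bytes=$(( ($1 % 1024) * 1048576 ))
    printf 'set_cachesize %d %d 1\n' "$gbytes" "$bytes"
}

# Example:
# cachesize_line 256     # prints: set_cachesize 0 268435456 1
```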
Topic revision: r7 - 2013-05-29 - MariaALANDESPRADILLO
 