LCG File Catalog (LFC) administrators' guide
Authors:
Jean-Philippe Baud, James Casey, Sophie Lemaitre, Caitriana Nicholson
Main Developers:
- Software: Jean-Philippe Baud, David Smith, Sophie Lemaitre
- Performance tests: Caitriana Nicholson
Abstract:
The LCG File Catalog (LFC) is a high performance catalog provided by LCG. This document describes the LFC architecture and implementation. It also explains how to install the LFC client as well as the LFC server for both the Mysql and Oracle backends.
Last version released in production: 1.6.3
Support: helpdesk@ggusNOSPAMPLEASE.org (Remove the NOSPAM !)
Important
LFC version >= 1.6.3 requires a database schema change.
Please check How to upgrade to find out what needs to be done to upgrade to LFC version >= 1.6.0.
Note : YAIM handles all the steps needed for the upgrade/first installation.
Introduction
Description
The LCG File Catalog is provided by the CERN IT Grid Deployment (IT-GD) group. It is a high performance file catalog based on lessons learnt during the Data Challenges. It fixes the performance and scalability problems seen with the EDG catalogs. For instance, it provides:
- Cursors for large queries,
- Timeouts and retries from the client.
The LFC provides more features than the RLS:
- User exposed transaction API,
- Hierarchical namespace and namespace operations,
- Integrated GSI Authentication and Authorization,
- Access Control Lists (Unix Permissions and POSIX ACLs),
- Sessions,
- Checksums,
- Virtual ids and VOMS support.
The LFC supports Oracle and Mysql as database backends, and the integration with GFAL and lcg_util has been done by the Grid Deployment group.
The LFC client and server modules can be installed either by hand, via RPMs, or using the YAIM tool.
LFC status as of August 2007: see the attached poster (LFC_poster2.JPG).
LFC Deployment Plan
The LFC should progressively replace the EDG Replica Location Server (RLS).
An instance of the LFC can be used for several VOs. At CERN, an Oracle version of the LFC will be deployed for the four LHC experiment VOs, to replace the existing RLS service.
The existing entries stored in the RLS should be migrated into the LFC. A migration script is provided to do so.
LFC Architecture
The LFC has a completely different architecture from the RLS framework. Like the EDG catalog, it contains a GUID (Globally Unique Identifier) as an identifier for a logical file, but unlike the EDG catalog it stores both logical and physical mappings for the file in the same database. This speeds up operations which span both sets of mappings. It also treats all entities as files in a UNIX-like filesystem. The API is similar to that of the UNIX filesystem API, with calls such as creat, mkdir and chown.
There is a global hierarchical namespace of Logical File Names (LFNs) which are mapped to the GUIDs. GUIDs are mapped to the physical locations of file replicas in storage (Storage File Names or SFNs).
System attributes of the files (such as creation time, last access time, file size and checksum) are stored as attributes on the LFN, but user-defined metadata is restricted to one field, as the authors believe that user metadata should be stored in a separate metadata catalog. Multiple LFNs per GUID are allowed as symbolic links to the primary LFN.
Bulk operations are supported, with transactions, and cursors for handling large query results. As there is only one catalog, transactions are possible across both LFN and SFN operations, which was impossible with the EDG RLS. In case of momentary loss of connection to the catalog, timeouts and retries are supported.
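As an illustration of this filesystem-like model, the command-line tools described later in this guide mirror the usual UNIX calls. A minimal sketch (the LFC host and the /grid/dteam/myexperiment path are placeholders, not part of any standard layout):
export LFC_HOST=mylfc.my.domain          # point the client at a hypothetical LFC server
lfc-mkdir /grid/dteam/myexperiment       # create a directory in the LFN namespace
lfc-chmod 775 /grid/dteam/myexperiment   # UNIX-style permissions on the catalog entry
lfc-ls -l /grid/dteam                    # list the directory, as with ls -l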
In the secure version of the LFC, authentication is by Kerberos 5 or the Grid Security Infrastructure (GSI), which allows single sign-on to the catalog with the user's Grid certificate. The client's Distinguished Name (DN) is mapped internally to a uid/gid pair which is then used for authorization. It is also planned to integrate VOMS (the Virtual Organisation Membership Service, developed in EDG) as another authentication method, mapping the VOMS roles to multiple group IDs in the LFC. The uid/gid pairs are used for authorization by means of the file ownership information, which is stored in the catalog as system metadata on the LFN. Standard UNIX permissions and POSIX-compliant Access Control Lists (ACLs) on each catalog entry are supported.
N.B. : Only the secure version of the LFC is available. No insecure version is provided.
LFC Implementation
The LFC is implemented entirely in C and is based on the CASTOR Name Server code. It runs as a multi-threaded daemon, with a relational database backend. Currently, both Oracle and Mysql are supported
as database components. The client may also be multi-threaded.
Bulk operations are implemented inside transactions. The transaction API is also exposed to the user, both to allow multiple operations inside a single transaction and to allow user-controlled as well as automatic rollback.
Clients currently exist for GFAL, POOL and lcg_utils. GFAL is the Grid File Access Library, a library developed by LCG to give a uniform POSIX interface to local and mass storage on the Grid; POOL is the Pool of persistent Objects for LCG, a framework for the LHC experiments to navigate distributed data without knowing details of the storage technology; and lcg_utils is the command-line interface and API for user access to LCG.
LFC CLI and API
A simple and detailed description of the LFC Command Line Interface (CLI) and Application Programming Interface (API) can be found here : LFC CLI and API
LFC Client Timeouts
From version 1.5.8 onwards, timeouts and retries are implemented at the LFC client level.
3 environment variables can be used:
- $LFC_CONNTIMEOUT -> sets the connect timeout in seconds
- $LFC_CONRETRY -> sets the number of retries
- $LFC_CONRETRYINT -> sets the retry interval in seconds
The default is to retry for one week.
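For example, to make clients fail after a bounded time instead of retrying for a week, the three variables can be exported before running any LFC command. The values below are only an illustration:
# connect timeout of 15 seconds, 2 retries, 60 seconds between retries
export LFC_CONNTIMEOUT=15
export LFC_CONRETRY=2
export LFC_CONRETRYINT=60
lfc-ls /grid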
VOMS and ACLs
For details about
VOMS and ACLs in LCG Data Management, check
VOMS and ACLs in Data Management.
Secondary groups
From LFC version >= 1.6.4, secondary groups are supported.
What are secondary groups ?
With a grid-proxy-init, you'll belong to only one LFC virtual gid (the gid corresponding to your VO).
[me@UI ~]$ grid-proxy-init
Your identity: /C=CH/O=CERN/OU=GRID/CN=Sophie Lemaitre 5847
Enter GRID pass phrase for this identity:
Creating proxy ................................. Done
Your proxy is valid until: Fri Mar 16 22:34:47 2007
[me@UI ~]$ export LFC_HOST=mylfc.my.domain
[me@UI ~]$ lfc-ls /grid
If you check the LFC server log, you'll see that you are mapped to only one virtual gid (here 104):
03/16 10:40:52 8421,0 Cns_srv_lstat: NS092 - lstat request by /C=CH/O=CERN/OU=GRID/CN=Sophie Lemaitre 5847 (101,104) from lxb2057.cern.ch
With a voms-proxy-init, you might belong to several LFC virtual gids.
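For example (the VO and role below are only an illustration), a VOMS proxy can be obtained with:
[me@UI ~]$ voms-proxy-init --voms dteam:/dteam/Role=lcgadmin
[me@UI ~]$ lfc-ls /grid
Each VOMS group or role carried by the proxy is then mapped to its own LFC virtual gid, so the same user can appear with several gids in the server log.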
Requirements
Deployment Scenarios
If your site acts as a central catalog for several VOs, you can either have :
- One LFC server, with one DB account containing the entries of all the supported VOs. You should then create one directory per VO.
- Several LFC servers, each with a DB account containing the entries for a given VO.
The choice is up to you. Both scenarios have consequences on the way database backups will be handled. In this guide, we give details on the first scenario.
What kind of machine ?
Apart from the special case of a purely test setup, we recommend installing the LFC server on hardware that provides (at least):
- a 2 GHz processor with 1 GB of memory (not a hard requirement)
- Dual power supply
- Mirrored system disk
Database backups
The LFC server can be installed either on the same machine as the database or on a separate one. Keeping them separate is especially recommended if the database service is provided externally.
Also, backups have to be in place, in order to be able to recover the LFC data in case of hardware problems. This is considered the responsibility of the local site and is outside the scope of this document.
Which ports need to be open ?
For the LFC server daemon, port 5010 has to be open locally at your site.
For the LFC Data Location Interface (DLI), port 8085 also has to be open.
For the information system, port 2170 has to be open locally at your site. It has to be visible to the site BDII.
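To quickly verify from another machine that these ports are reachable once the services are running, something like the following can be used (a sketch, assuming netcat is installed and mylfc.my.domain is your LFC host):
nc -zv mylfc.my.domain 5010   # LFC daemon
nc -zv mylfc.my.domain 8085   # DLI
nc -zv mylfc.my.domain 2170   # information system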
LFC Installation via YAIM
If you are not familiar with the YAIM tool, please refer to these pages:
Since glite-yaim-3.0.0-22, YAIM supports the LFC for both Mysql and Oracle backends.
The site-info.def file has to contain the following variables :
- $MY_DOMAIN -> your domain, ex : "cern.ch"
- $MYSQL_PASSWORD -> the root Mysql password
- $LFC_HOST -> the LFC server hostname, ex : "lfc01.$MY_DOMAIN"
- $LFC_DB_HOST -> the Mysql server hostname, or the Oracle database SID
- $LFC_DB -> the Mysql database name, ex: cns_db or my_lfc_db ; or the Oracle database user account, ex: lcg_lfc_cms
- $LFC_DB_PASSWORD -> the password for the Mysql or Oracle database user
- $VOS -> supported VOs
- $LFC_CENTRAL -> list of VOs for which the LFC should be configured as a central catalogue, ex : "swetest hec"
- $LFC_LOCAL -> list of VOs for which the LFC should be configured as a local catalogue, ex : "atlas". If not defined, it is set to $VOS minus $LFC_CENTRAL.
Note: the only difference between a local and a central catalog, as far as configuration is concerned, is the information published by the Information System.
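For illustration, a minimal site-info.def fragment for a Mysql-backed central catalogue might look like the following (all hostnames, passwords and VO names are placeholders):
MY_DOMAIN=my.domain
MYSQL_PASSWORD=my_root_password
LFC_HOST=lfc01.$MY_DOMAIN
LFC_DB_HOST=$LFC_HOST
LFC_DB=cns_db
LFC_DB_PASSWORD=my_lfc_db_password
VOS="dteam atlas"
LFC_CENTRAL="dteam atlas"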
Then run:
cd /opt/lcg/yaim/scripts
./install_node site-info.def glite-LFC_mysql
./configure_node site-info.def glite-LFC_mysql
or
./install_node site-info.def glite-LFC_oracle
./configure_node site-info.def glite-LFC_oracle
These commands will :
- install the whole security stack (globus RPMs, CA RPMs, pool accounts, etc.),
- install Mysql and create the LFC database,
- install, configure and start the LFC server,
- install the LFC client,
- start the LFC Data Location Interface (DLI), if the LFC is a central catalog.
For distribution reasons, the Oracle client (for instance, the Oracle Instant Client) is not automatically installed by YAIM. It is your responsibility to install it.
Some YAIM node types (WN, UI and RB) include the LFC-client.
LFC Installation via RPMs
The CERN Grid Deployment group provides RPMs for the LCG File Catalog.
The RPMs are available for Scientific Linux 3.
By default, the LFC RPMs are installed in the /opt/lcg directory. So, INSTALL_DIR refers to /opt/lcg by default.
It is recommended to install the LFC client on the server machine.
The LFC RPMs for SLC3 are:
- lcg-dm-common-VERSION-1_sl3.i386.rpm
- LFC-client-VERSION-1sec_sl3.i386.rpm
- LFC-server-mysql-VERSION-1sec_sl3.i386.rpm
- LFC-server-oracle-VERSION-1sec_sl3.i386.rpm
- LFC-interfaces-VERSION-1_sl3.i386.rpm
The LFC-interfaces RPM contains:
- a Perl interface,
- a Python interface.
Security
The LFC is GSI security enabled. This implies several things.
Host certificate / key
On the LFC server, there has to be a valid host certificate and key installed in /etc/grid-security/lfcmgr.
Copy the host certificate and key to the right directory. Pay attention to the permissions !
$ ll /etc/grid-security/lfcmgr | grep lfc
-rw-r--r-- 1 lfcmgr lfcmgr 5423 May 30 13:58 lfccert.pem
-r-------- 1 lfcmgr lfcmgr 1675 May 30 13:58 lfckey.pem
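For example, the copy can be done as follows (a sketch; it assumes the host certificate and key are in the usual /etc/grid-security location and that the lfcmgr user already exists):
mkdir -p /etc/grid-security/lfcmgr
cp /etc/grid-security/hostcert.pem /etc/grid-security/lfcmgr/lfccert.pem
cp /etc/grid-security/hostkey.pem /etc/grid-security/lfcmgr/lfckey.pem
chown lfcmgr:lfcmgr /etc/grid-security/lfcmgr/lfccert.pem /etc/grid-security/lfcmgr/lfckey.pem
chmod 644 /etc/grid-security/lfcmgr/lfccert.pem
chmod 400 /etc/grid-security/lfcmgr/lfckey.pem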
This step is handled automatically by YAIM.
IMPORTANT : the host cert and key still have to be kept in their original place !!!
$ ll /etc/grid-security/ | grep host
-rw-r--r-- 1 root root 5423 May 27 12:35 hostcert.pem
-r-------- 1 root root 1675 May 27 12:35 hostkey.pem
On the client
The user needs to have a valid proxy (and to exist in the LFC server grid-mapfile):
$ grid-proxy-init
Your identity: /C=CH/O=CERN/OU=GRID/CN=Sophie Lemaitre 2268
Enter GRID pass phrase for this identity:
Creating proxy ................................. Done
Your proxy is valid until: Fri Apr 1 22:38:07 2005
Otherwise, this error will appear :
lfc-ls /
send2nsd: NS002 - send error : No valid credential found
/: Could not establish context
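To check whether a proxy already exists and how long it remains valid, the grid-proxy-info command (shipped with the same Globus tools as grid-proxy-init) can be used:
$ grid-proxy-info            # show the proxy subject, path and remaining lifetime
$ grid-proxy-info -timeleft  # print only the remaining lifetime in seconds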
Trusted Hosts
By default, only root can issue privileged operations (the same ones as on any UNIX system). For instance, the /grid directory creation can only be performed by root on the LFC server machine.
However, it is possible to add trusted hosts from where these operations can be issued as well (as root). But the trusted hosts should be carefully chosen, and we recommend that only service machines are added if needed.
To add node1.domain.name and node2.domain.name as trusted, create the /etc/shift.conf file as follows:
more /etc/shift.conf
LFC TRUST node1.domain.name node2.domain.name
Note: There has to be only one blank or tab between each word in /etc/shift.conf.
Dependencies
Common Dependencies
The LFC-client, LFC-server-mysql and LFC-server-oracle RPMs all depend on lcg-dm-common.
lcg-dm-common itself depends on the usual Globus security RPMs, that might already be installed on your machine :
- gpt-VDT1.2.2rh9-1.i386.rpm
- vdt_globus_essentials-VDT1.2.2rh9-1.i386.rpm
- vdt_globus_sdk-VDT1.2.2rh9-1.i386.rpm
These RPMs can be found at:
http://grid-deployment.web.cern.ch/grid-deployment/RpmDir_i386-sl3/vdt/rh9/globus/
Install lcg-dm-common as root :
rpm -Uvh lcg-dm-common-VERSION-1.i386.rpm
CGSI_gSOAP
The LFC Mysql and Oracle servers depend on CGSI_gSOAP version 2.6 >= 1.1.15-6. On the LFC server machine, install it as root:
rpm -Uvh CGSI_gSOAP_2.6-1.1.15-6.slc3.i386.rpm
The LFC RPMs depend on :
glite-security-voms-api-c-1.6.16-0
glite-security-voms-api-1.6.16-0
Mysql LFC Server
The LFC server for Mysql depends on the Mysql-client RPM version 4.0.20 or higher. Install it as root:
rpm -Uvh Mysql-client-4.0.21-0.i386.rpm
From the gLite 3.0 release onwards, the LFC RPMs are built against Mysql 4.1, but they stay compatible with Mysql 4.0.
Oracle LFC Server
The LFC server for Oracle depends on the oracle-instantclient-basic-lcg RPM.
This RPM can also be found in the Grid Deployment AFS directory.
Install it as root:
rpm -Uvh oracle-instantclient-basic-lcg-10.1.0.3-1.i386.rpm
If you install the LFC on the same machine as the Oracle database, do not install the Instant Client and install the LFC-server-oracle RPM with the --nodeps option.
LFC Client Installation
To install the LFC client, just install the RPM as root:
rpm -Uvh LFC-client-VERSION-1sec_sl3.i386.rpm
LFC Server Installation
The LFC server RPM is available for Mysql and Oracle. Install the one corresponding to your database backend as root:
rpm -Uvh LFC-server-mysql-VERSION-1sec.i386.rpm
Create the LFC User (LFCMGR)
Create the dedicated LFC user, as root:
useradd -c "LFC manager" -d /home/lfcmgr lfcmgr
LFC Database Setup
In this section, we assume that you have a Mysql or Oracle database instance running on a given host.
In this database, you need to create a dedicated account to store the LFC file and replica names.
Mysql
Create the LFC database and tables:
mysql -u root -p < INSTALL_DIR/share/LFC/create_lfc_tables_mysql.sql
Create the appropriate LFC user, called for instance lfc, with the correct privileges on cns_db :
mysql -u root -p
use mysql
GRANT ALL PRIVILEGES ON cns_db.* TO 'lfc'@localhost IDENTIFIED BY 'lfc_password' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON cns_db.* TO 'lfc'@LFC_HOST IDENTIFIED BY 'lfc_password' WITH GRANT OPTION;
Note : The database is called cns_db by default. It is possible to replace it with any other name. In this case, /opt/lcg/etc/NSCONFIG should look like :
lfc_user/lfc_password@host/database_name
Oracle
Using the scripts provided in INSTALL_DIR/share/LFC/db-deployment, create a dedicated LFC user.
These commands will create the LFC_USER Oracle account, tablespaces and schema :
./create-tablespaces-lfc --name user
./create-user-lfc --name user --password XXXXX
./create-schema-lfc --name user --password XXXXX --schema create_lfc_oracle_tables.sql
Oracle Environment Variables
Depending on whether the LFC is running on the same machine as the Oracle database, different environment variables have to be set. Also, if they run on the same machine, the /etc/sysconfig/lfcdaemon file has to be modified.
Oracle and LFC on the same machine
If the Oracle database SID is different from LFC, or if the $TNS_ADMIN directory is located somewhere else, you have to modify the /etc/sysconfig/lfcdaemon file as follows before running the LFC daemon:
# - Oracle Home :
#ORACLE_HOME=/usr/lib/oracle/10.1.0.3/client
# - Directory where tnsnames.ora resides :
TNS_ADMIN=/another/tns_admin/directory
# If the LFC server is installed on the same box as the Oracle instance,
# use the $ORACLE_SID variable instead of the $TWO_TASK one :
ORACLE_SID=ANOTHER_SID
Oracle and LFC on different machines
Note: This setup is not tested for every release, and is not recommended if a proper central database service exists in your institute.
If the Oracle database SID is different from LFC, or if the $TNS_ADMIN directory is located somewhere else, you have to modify the /etc/sysconfig/lfcdaemon file as follows before running the LFC daemon. You should probably also modify the $ORACLE_HOME variable to be the proper one.
# - Oracle Home :
ORACLE_HOME=/another/oracle_home/
# - Directory where tnsnames.ora resides :
TNS_ADMIN=/another/tns_admin/directory
# - Database name :
TWO_TASK=ANOTHER_SID
Oracle minimal number of connections
The Oracle database backend should allow for enough connections. The LFC_USER should accept at least as many connections as there are LFC threads. If you run the LFC with 20 threads, you should allow for at least 20 connections.
LFC Configuration File
The default configuration file is the following (check the permissions !):
ls -al /opt/lcg/etc/NSCONFIG
-rw-rw---- 1 lfcmgr lfcmgr 44 May 26 12:07 /opt/lcg/etc/NSCONFIG
- For Mysql, use the database host:
echo "lfc/password@lfchost.domain.name" > /opt/lcg/etc/NSCONFIG
Note : If the LFC database is different from the default one (cns_db), you should use instead :
echo "lfc/password@lfchost.domain.name/database_name" > /opt/lcg/etc/NSCONFIG
- For Oracle, use the database connect string defined in tnsnames.ora:
echo "LFC_USER/password@LFC_SID" > /opt/lcg/etc/NSCONFIG
It is possible to use a different configuration file. You then have to specify it in the /etc/sysconfig/lfcdaemon file:
# - LFC configuration file
NSCONFIGFILE="/var/etc/NSCONFIG"
LFC Log File
The log file by default is /var/log/lfc/log.
To use a different log file, you have to create it, and the directory where it resides, with the right permissions :
ls -al /another/log/directory
-rw-rw---- 1 lfcmgr lfcmgr 44 May 26 12:08 /another/log/directory
You then have to specify it in the /etc/sysconfig/lfcdaemon file :
# - LFC log file :
LFCDAEMONLOGFILE="/another/log/directory"
To avoid the LFC log file filling up and stopping the LFC server from working, the log file is automatically rotated every day. The logrotate file is /etc/logrotate.d/lfcdaemon.
Run the LFC Server
To start the LFC daemon:
service lfcdaemon start
Number of threads
By default, the LFC daemon is started with 20 threads. To specify more or fewer threads, modify /etc/sysconfig/lfcdaemon before running the daemon :
# - Number of LFC threads :
NB_THREADS=50
Important : increasing the number of threads does not necessarily improve performance. Most of the LFC operations are fast and don't occupy a thread for long. And on a dual CPU machine, only 2 active threads will be served at the same time. So, increasing the number of threads is useful for instance if there are usually many threads waiting for the database to respond (and only if the database is on a different machine...)
READ-ONLY LFC
By default, the LFC is read-write. But you might want to run a read-only LFC instance to lower the load on another read-write instance.
Then, modify /etc/sysconfig/lfcdaemon before running the LFC daemon :
# should the LFC be read-only ?
# any string but "yes" will be equivalent to "no"
#
RUN_READONLY="yes"
Write operations will then not be possible :
$ lfc-mkdir /grid/dteam/hello
cannot create /grid/dteam/hello: Read-only file system
Test the LFC Server
On a machine where the client is installed, test that the server is running :
export LFC_HOST=LFC_server_full_hostname
lfc-ls /
This command shouldn't return anything as the LCG File Catalog is empty for the moment, but it shouldn't return any error either.
Create one Directory per VO
As root, create the /grid directory:
lfc-mkdir /grid
As root, create one subdirectory per VO. For instance, if your site supports the LFC as a central catalog for the dteam and atlas VOs, only the subdirectories for the dteam and atlas VOs should be created:
lfc-mkdir /grid/dteam
lfc-entergrpmap --group dteam
lfc-chown root:dteam /grid/dteam
lfc-chmod 775 /grid/dteam
lfc-setacl -m d:u::7,d:g::7,d:o:5 /grid/dteam
lfc-mkdir /grid/atlas
lfc-entergrpmap --group atlas
lfc-chown root:atlas /grid/atlas
lfc-chmod 775 /grid/atlas
lfc-setacl -m d:u::7,d:g::7,d:o:5 /grid/atlas
Check that the directories have been properly created :
lfc-ls -l /grid
drwxrwxr-x 5 root 102 0 Nov 23 15:55 atlas
drwxrwxr-x 5 root 101 0 Nov 23 15:56 dteam
Note: in this case, 101 and 102 are the respective LFC virtual gids of dteam and atlas.
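To double-check the default ACLs that were just set, the lfc-getacl command can be used (the exact output format may vary between LFC versions):
lfc-getacl /grid/dteam
lfc-getacl /grid/atlas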
LFC Data Location Interface (DLI)
Description
The Data Location Interface (DLI) is the common catalog interface used by the Workload Management System for match making. The DLI is a web service interface.
Note: Since LCG-2.7.0, the LFC DLI is run automatically by YAIM for central and local LFC catalogs.
For the DLI, port 8085 should be open.
Configuration
The LFC DLI comes with the LFC server RPM.
After having installed the LFC server, create the /etc/sysconfig/lfc-dli file:
cp -p /etc/sysconfig/lfc-dli.templ /etc/sysconfig/lfc-dli
Edit this file, and set the $LFC_HOST variable:
# - LFC server host : !!!! please change !!!!
export LFC_HOST=LFC_hostname
And start the LFC DLI :
service lfc-dli start
Important
By default, YAIM will run the Data Location Interface for all LFCs (local and central). The DLI gives an insecure read-only access to the LFC data. Please check with your VO whether this is acceptable.
To turn the DLI off, here is the recipe :
$ /sbin/service lfc-dli stop
$ /sbin/chkconfig lfc-dli off
And in /etc/sysconfig/lfc-dli, set :
RUN_DLI="no"
Virtual Ids / VOMS
The LFC supports virtual Ids and VOMS.
For more details, refer to this page : Virtual Ids / VOMS
LFC Admin Commands
Start/Stop the LFC
To start/stop the LFC server, on the server machine itself:
service lfcdaemon start|stop|restart
To know if the LFC server is running:
service lfcdaemon status
Start/Stop the Data Location Interface (DLI)
To start/stop the LFC DLI, on the server machine itself:
service lfc-dli start|stop
To know if the LFC DLI is running :
service lfc-dli status
Note: remember that the DLI should be run for central LFC catalogs only, not for local site catalogs.
Integration of the LFC
Publish the LFC in the Information System
To use the LFC in the LCG-2 environment, you have to publish it in the Information System. Depending on whether the LFC is a central catalog or a local catalog, the information published is different.
Here is a page to help you do this : How to publish the LFC/DPM in the Information System. However, we do not guarantee that this page is up-to-date. In case of problems, refer to the appropriate documentation.
Note: the hostnames retrieved are the full hostnames.
Central Catalog
Check the output of an ldapsearch for a central catalog.
The important part is: GlueServiceType: lcg-file-catalog
Local Catalog
Check the output of an ldapsearch for a local catalog.
The important part is: GlueServiceType: lcg-local-file-catalog
Test the Integration
Test if the LFC server appears in your site BDII, by using the lcg-infosites command on a UI:
lcg-infosites --vo dteam lfc --is BDII_hostname
lcg-infosites --vo dteam lfcLocal --is BDII_hostname
The --is option specifies the BDII to query, in case it is not defined by the $LCG_GFAL_INFOSYS environment variable. So, the previous command is equivalent to:
export LCG_GFAL_INFOSYS=BDII_hostname
lcg-infosites --vo dteam lfc
or
lcg-infosites --vo dteam lfcLocal
On a UI (where lcg_util, lcg-gfal and LFC-client are installed), test that the LFC is correctly integrated.
The lcg_util commands can work with either the EDG RLS or the LFC as a catalog. The $LCG_CATALOG_TYPE environment variable should be set accordingly (edg or lfc). Pay attention: it is $LCG_CATALOG_TYPE and not $LFC_CATALOG_TYPE...
Use the LFC as backend:
export LCG_CATALOG_TYPE=lfc
Set the $LFC_HOST variable:
export LFC_HOST=LFC_HOSTNAME
It is also convenient to use the lcg-infosites command to set the $LFC_HOST variable :
export LFC_HOST=`lcg-infosites --vo dteam lfc`
And try to copy and register an existing file (for instance, /tmp/hello.txt) :
lcg-cr -v -d SE_hostname -l /grid/dteam/hello.txt --vo dteam file:/tmp/hello.txt
The SE hostname should be among the Storage Elements available for your VO :
lcg-infosites --vo dteam se --is BDII_hostname
The file should then appear in the LCG File Catalog :
lfc-ls /grid/dteam/
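To verify that the replica itself was registered, the lcg_util replica listing command can be used (a sketch; the LFN is the one used above):
lcg-lr --vo dteam lfn:/grid/dteam/hello.txt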
LFC Interfaces
The LFC-interfaces RPM contains :
- a Perl interface,
- a Python interface.
They are automatically generated using swig.
The corresponding man pages are :
- man lfc_perl
- man lfc_python
LFC Performance
A series of performance tests has been conducted on the LFC.
A detailed paper presenting these tests and comparing the results with the Globus and the EDG-RLS catalogs is available at : "Performance Analysis of a File Catalog for the LHC Computing Grid" by Jean-Philippe Baud, James Casey, Sophie Lemaitre and Caitriana Nicholson.
See https://edms.cern.ch/cedar/plsql/doc.info?cookie=3327541&document_id=545811&version=2 .
The main results are outlined here, for the insecure version of the catalog.
- There was no significant difference in operation times when there was a large number of entries in the catalog. It was tested with up to 40 million entries, and both mean insert time and query rate were independent of number of entries up to that point. With the EDG RLS, the time for an individual query started to increase rapidly beyond 100,000 entries, and the time for an individual insert started to increase significantly beyond 200,000 entries.
- Insert, delete and query rates increase as more client threads are added, up to about 6 threads. Beyond this, the rates are roughly constant, up to the limit of threads on the server. Insert rate is more than twice that achieved with the Globus RLS under similar conditions. Query rate is lower, but the two forms of query are not comparable: the Globus RLS query is a simple LFN-SURL lookup, whereas the LFC checks permissions and returns useful metadata. Query rate is also 50% higher than the EDG RLS.
- Tests were performed with up to 10,000 replicas of a single file, up to 10,000 symlinks in a directory and up to 10,000 files in a directory, showing time to list or read all replicas/symlinks/files to vary linearly with the total number present, as one would expect, up to the limit of server threads.
- Using transactions slows the performance by at least a factor of 2, depending on the number of operations performed per transaction. Changing into the working directory when performing a large number of operations (e.g. inserts, deletes) in the same directory leads to significant performance improvement and is strongly recommended. Without changing directory, operation time increases linearly as a function of number of subdirectories in the path.
- Tests of the operation rates were repeated with up to 10 simultaneous clients, each running with 10 threads. These showed the operation rates to be independent of the number of clients, the LFC thus being shown scalable up to 100 client threads.
Troubleshooting / FAQ
See the Troubleshooting section.
In case you have questions or you need help, do not hesitate to contact helpdesk@ggusNOSPAMPLEASE.org (remove the NOSPAM !).
Developers' Documentation
The Developers' documentation is under construction at : Developers' documentation.
--
SophieLemaitre - 03 Sep 2007