FTS Adminstration Toolpack for FTS 2.0
The FTS toolpack consists of a variety of scripts and database procedures to do useful things upon the FTS - either the daemons or the backend database.
This is where quick scripts will go until we've had a chance to integrated them nicely into the release. Much of it is not very pretty and much of it will do terrible damage to your database schema if you run it wrongly (requiring a complete re-install). You may wish to ask your DBA to run some of these tools for you.
A proper release?
The toolpack is currently not in any gLite release.
It comes in the messy
glite-data-transfer-scripts
RPM.
It has been noted that the versioning of this RPM (of the DB scripts in particular) is rubbish. This will be fixed soon and stable components will be moved out of this 'special' RPM as they become available.
Current version
Until then, it will continue to be distributed is a nasty manner.
Current version is 0.2.0-1
Database administration tools
There are a few database tools packaged with the toolpack.
FTS admin package
This is a PL/SQL package that can be loaded into your database. It consists of two files:
/opt/glite/share/glite-data-transfer-scripts/plsql/fts_admin_pack.sql
/opt/glite/share/glite-data-transfer-scripts/plsql/fts_admin_pack_body.sql
Load the package into your database on the schema owner account (the account on which your ran the schema install) using the tool of your choice (see
FtsAdminToolsPackageLoading20 for how to do this with SQL/Plus).
Changing the channel source or destination
To update the channel destination (for example if the GOCDB sitename has changed or if you want to make it upper-case, as recommended), run the
fts_admin_pack.CHANGE_CHANNEL_SOURCE
or
fts_admin_pack.CHANGE_CHANNEL_DEST
methods, passing the channel name and the new source or destination.
You do not need to stop the agents to make this change.
SQL> exec fts_admin_pack.CHANGE_CHANNEL_DEST('CERN-RANDOM', 'RANDOMT1_NEW_NAME');
PL/SQL procedure successfully completed.
Changing a channel name
The channel name may be changed using the
fts_admin_pack.RENAME_CHANNEL
passing the old and new channel names.
This is not a lightweight operation with the current schema.
YOU MUST STOP ALL VO AND TRANSFER AGENTS BEFORE YOU DO THIS
If you do not stop all agents, it will lock-up your schema (in which case, buy your DBA a coffee...)
SQL> exec fts_admin_pack.RENAME_CHANNEL('CERN-RANDOM', 'CERN-NEWRANDOM');
PL/SQL procedure successfully completed.
The channel name is used as a key in the job table and updating this may take a long time (like
several hours
if you have many jobs in the table).
FTS history package
* Check carefully the history package with schema 3.1.0! *
The purpose of the history package is to move old jobs (jobs in terminal state, X days old) to a history table.
This will prevent any part of the FTS from indexing over those jobs while maintaining the data for audit requirements. This is important at high transfer rates since making the database trawl through hundreds of thousands of jobs just to find the few ones which are currently running leads to peformance problems. N.B. AFTER A JOB HAS BEEN MOVED, YOU CANNOT QUERY ITS STATUS BY ANYTHING EXCEPT DIRECT SQL COMMANDS (THE FTS WEB-SERVICE WILL NOT SEE IT).
N.B. This is (still) a work-around. A better solution is being investigated making use of Oracle partioning. The work is in progress for this - in particular, the 3.1.0 schema now has a partitioning key (timestamp) which is put onto each entry by a trigger when the associated job enters a terminal state.
Oracle block fragmentation
Fragmentation has been noted on the DB blocks when using this tool. See
FtsRelease20TableFragmentation for more details.
Loading the schema
Load the history schema into the database. The schema is essentially the same as the
t_job
,
t_file
and
t_transfer
tables. This should be loaded onto the schema owner account.
From SQL/Plus (as described in
FtsAdminToolsPackageLoading20):
SQL> @/opt/glite/share/glite-data-transfer-scripts/plsql/create_fts_history_tables.sql
Loading the package
The load the package files (as described in
FtsAdminToolsPackageLoading15):
SQL> @/opt/glite/share/glite-data-transfer-scripts/plsql/fts_history_pack.sql
SQL> @/opt/glite/share/glite-data-transfer-scripts/plsql/fts_history_pack_body.sql
Start the DBMS job
You need to have DB permissions on the FTS account to run DBMS jobs (essentially a database cron task).
It will move jobs in a terminal state that have a submit_time over 7 days old. The job runs, by default every 10 minutes, moving 100 jobs and their contents at a time.
REMEMBER - AFTER A JOB HAS BEEN MOVED, YOU CANNOT QUERY ITS STATUS BY ANYTHING EXCEPT DIRECT SQL COMMANDS (THE FTS WEB-SERVICE WILL NOT SEE IT).
SQL> exec fts_history.submit_job;
PL/SQL procedure successfully completed.
Check the job is there using your user ID:
SQL> select job, next_date, next_sec from user_jobs where user='LCG_FTS_STRESSTEST2';
JOB NEXT_DATE NEXT_SEC
---------- --------- --------
82 04-JUL-06 16:00:24
Start the job for the first time (this is often needed) and check that there is no error code reported in the logging table. Use the job id you got back from hte user_jobs table:
SQL> exec dbms_job.run(82);
PL/SQL procedure successfully completed.
SQL> select ERRCODE, JOBS, FILES from T_HISTORY_LOG;
ERRCODE JOBS FILES
---------- ---------- ----------
100 217
The numbers refer to the number of jobs and files that were moved to the history table on this job run. There will be an entry each time the job runs. Any error codes are reported here.
To stop the DBMS job, run:
SQL> exec fts_history.STOP_JOB;
PL/SQL procedure successfully completed.
SQL> select * from user_jobs where job = 82;
no rows selected
Check the status
Every time the job runs, the number of entries in
t_job
should reduce by approx. 100, provided you have that many entries that are more than 7 days old in a final job state. The number of entries in
t_job_history
should increase by the corresponding number.
The same can be said for
t_file
/
t_file_history
and
t_transfer
/
t_transfer_history
, although the number of entries moved will depend on the content and performance of the jobs.
SQL> select count(*) from t_job;
COUNT(*)
----------
12223
SQL> select count(*) from t_job_history;
COUNT(*)
----------
100
... wait a while (> 10 minutes) ...
SQL> select count(*) from t_job;
COUNT(*)
----------
12123
SQL> select count(*) from t_job_history;
COUNT(*)
----------
200