A new page https://lhcb-shifters.web.cern.ch/ has been created with additional information for shifters; please also follow the instructions there. This TWiki page is no longer maintained.

Grid Shifter Guide

See also: UpdatedProductionShifterGuide (under development, Sept 2010)

Contents:

Placeholder for the tWiki version of the shifter guide.

Introduction

This document describes the frequently-used tools and procedures available to Grid Shifters when managing production activities. Current Grid Shifters are expected to update this document by incorporating, or linking to, the stable procedures available on the LHCbProduction TWiki pages [1] where appropriate.

The Grid Sites section gives some brief information about the various Grid sites and their backend storage systems. The Jobs section details the job types a Grid Shifter is expected to encounter and provides some debugging methods. The methods available to manage and monitor productions are described in the Productions section. The Web Production Monitor section describes the main features of the Production Monitor webpage.

A chronological guide through a production shift, from beginning to end, is presented in the Shifts section. The ELOG section outlines the situations in which the submission of an ELOG entry is appropriate. Finally, the Procedures section details the well-established procedures for Grid Shifters.

A number of quick-reference sections are also available: DIRAC 3 Scripts and Common Acronyms list the available DIRAC 3 scripts and commonly-used acronyms respectively.

Grid Sites

Jobs submitted to the Grid will be scheduled to run at one of a number of Grid sites. The exact site at which a job is executed depends on the job requirements and the current status of all relevant Grid sites. Grid sites are grouped into two tiers, Tier-1 and Tier-2. CERN is an exception: because it is also responsible for processing and archiving the RAW experimental data, it is also referred to as a Tier-0 site.

Tier-1 Sites

Tier-1 sites are used for Analysis, Monte Carlo production, file transfer and file storage in the LHCb Computing Model.

  • LCG.CERN.ch (also acting as the Tier-0)
  • LCG.CNAF.it
  • LCG.IN2P3.fr
  • LCG.NIKHEF.nl
  • LCG.SARA.nl
  • LCG.PIC.es
  • LCG.RAL.uk
  • LCG.GRIDKA.de

Tier-2 Sites

There are numerous Tier-2 sites and new sites are added frequently, so a full list is not presented in this document. In the LHCb Computing Model, Tier-2 sites are used for MC production.

Backend Storage Systems

Three backend storage technologies are employed at the Tier-1 sites: Castor, dCache and StoRM. The Tier-1 sites using each technology are summarised in the table below:

Backend Storage   Tier-1 Sites
Castor            CERN, RAL
dCache            IN2P3, NIKHEF, GridKa, PIC
StoRM             CNAF

Jobs

The number of jobs created for a production varies depending on the exact requirements of the production. Grid Shifters are generally not required to create jobs for a production.

JobIDs

A particular job is tagged with the following information:

  • Production Identifier (ProdID), e.g. 00001234 - the 1234th production.
  • Job Identifier (JobID), e.g. 9876 - the 9876th job in the DIRAC system.
  • JobName, e.g. 00001234_00000019 - the 19th job in production 00001234.
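
As a quick illustration, the status and parameters of a single job can be queried from the CLI using the JobID; this sketch re-uses the illustrative JobID from the list above:

dirac-wms-job-status 9876        # current major and minor status of the job
dirac-wms-job-parameters 9876    # parameters recorded for the job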

Job Status

The job status of a successful job proceeds in the following order:

  1. Received,
  2. Checking,
  3. Staging,
  4. Waiting,
  5. Matched,
  6. Running,
  7. Completed,
  8. Done.

Jobs which return no heartbeat have the status "Stalled", and jobs in which any workflow module returns an error status are classed as "Failed".

The basic flowchart describing the evolution of a job's status can be found in Figure 1. Jobs are only "Grid-active" once they have reached the "Matched" phase.

Figure 1: Job status flowchart. Note that the "Checking" and "Staging" statuses are omitted.
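
The status history of an individual job can also be followed from the CLI. This is a minimal sketch (the JobID is a placeholder) using scripts from the DIRAC WMS list:

dirac-wms-job-status <JobID>          # current status of the job
dirac-wms-job-logging-info <JobID>    # history of the job's status transitions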

Job Output

The standard output and standard error of a job can be accessed through the API, the CLI and the webpage via a global job "peek".

Job Output via the CLI

The std.out and std.err for a given job can be retrieved using the CLI command:

dirac-wms-job-get-output <JobID> [<JobID> ...]
This creates a directory containing the std.out and std.err for each JobID entered. Standard tools can then be used to search the output for specific strings, e.g. "FATAL".

To simply view the last few lines of a job's std.out (a "peek"), use:

dirac-wms-job-peek <JobID> [<JobID> ...]
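
Putting these commands together, a minimal sketch of checking a single job for fatal errors (the JobID is a placeholder, and it is assumed here that the retrieved files land in a directory named after the JobID):

dirac-wms-job-get-output <JobID>
grep -i FATAL <JobID>/std.out    # search the retrieved standard output for fatal errors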

Job Output via the Job Monitoring Webpage

There are two methods to view the output of a job via the Job Monitoring Webpage. The first returns the last 20 lines of the std.out and the second allows the Grid Shifter to view all the output files.

Figure 2: Peek the std.out of a job via the Job Monitoring Webpage.

To "peek" the std.out of a job:

  1. Navigate to the Job Monitoring Webpage.
  2. Select the relevant filters from the left panel.
  3. Click on a job.
  4. Select "StandardOutput" (Fig. 2).

Figure 3: View all the output files of a job via the Job Monitoring Webpage.

Similarly, to view all output files for a job:

  1. Navigate to the Job Monitoring Webpage.
  2. Select the relevant filters from the left panel.
  3. Click on a job.
  4. Select "Get Logfile" (Fig. 3).

This method can be particularly quick if the Grid Shifter only wants to check the output of a selection of jobs.

Job Pilot Output

The output of the Job Pilot can also be retrieved via the API, the CLI or the Webpage.

Job Pilot Output via the CLI

To obtain the Job Pilot output using the CLI, use:

dirac-admin-get-pilot-output <Grid pilot reference> [<Grid pilot reference>]
This creates a directory for each JobID containing the Job Pilot output.
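
If only the DIRAC JobID is known, the pilot information can also be reached through the job itself. The following is a sketch using scripts from the DIRAC Admin list (the JobID is a placeholder):

dirac-admin-get-job-pilots <JobID>           # look up the Grid pilot reference(s) associated with the job
dirac-admin-get-job-pilot-output <JobID>     # retrieve the pilot output for the job directly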

Job Pilot Output via the Job Monitoring Webpage

To view the std.out and std.err of a Job Pilot via the Job Monitoring Webpage:

  1. Navigate to the Job Monitoring Webpage.
  2. Select the relevant filters from the left panel.
  3. Click on a job.
  4. Select "Pilot", then "Get StdOut" or "Get StdErr" (Fig. 4).

Figure 4: View the pilot output of a job via the Job Monitoring Webpage.

Operations on Jobs

The full list of scripts which can be used to perform operations on a job is given in DIRAC 3 Scripts. The name of each script should be a clear indication of its purpose. Running a script without arguments will print basic usage notes.

Productions

As a Grid Shifter you will be required to monitor the official LHCb productions, which consist of Monte Carlo (MC) generation, data stripping and CCRC productions. Each production is assigned a unique Production ID (ProdID). Production creation is generally performed by the Production Operations Manager and is not a duty of the Grid Shifter.

The current list of all active productions can be obtained with the command:

dirac-production-list-active
The command also gives the current submission status of the active productions.

Starting a Production

The submission of a production can be started once it has been formulated and all the required jobs created. Grid Shifters should ensure they have the permission of the Production Operations Manager (or equivalent) before starting a production. Production jobs can be submitted manually or automatically.

The state of a production can also be set using:

dirac-production-change-status <Command> <Production ID> [<Production ID> ...]
where the available commands are:
'start', 'stop', 'manual', 'automatic'

Remember to validate a production before setting it to automatic! Run a few jobs to ensure they complete successfully before launching the whole production and setting it to automatic submission.
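
A minimal validation sequence, using commands described in the following subsections, might look like this (the Production ID is a placeholder):

dirac-production-start <Production ID>
dirac-production-site-submit <Production ID> 2 LCG.CERN.ch    # submit a couple of test jobs
dirac-production-job-summary <Production ID>                  # check that the test jobs complete successfully
dirac-production-set-automatic <Production ID>                # only once the test jobs look healthy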

Starting and Stopping a Production

The commands:

dirac-production-start <Production ID> [<Production ID> ...]
and
dirac-production-stop <Production ID> [<Production ID> ...]
are used to start and stop a production. Grid Shifters may stop a current production if a significant number of jobs are failing.

Manual Submission

A production is set to manual submission by default. To reset the submission status of one or more productions, use the command:

dirac-production-set-manual <Production ID> [<Production ID> ...]

A small number of test jobs should be manually submitted for each new production. In the case of stripping or CCRC productions, a small number of test jobs should be sent to all the Tier-1 sites and closely monitored.

To manually submit jobs to a selected site, use the following command:

dirac-production-site-submit <ProdID> <Num Jobs> <Site>
Note that the full site name string must be entered, e.g. to submit a job to CERN you must type:
dirac-production-site-submit <ProdID> 1 LCG.CERN.ch
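
For stripping or CCRC productions, where test jobs should go to every Tier-1 site, a simple shell loop saves retyping the command. This is only a sketch; the ProdID below is the illustrative value used earlier in this guide:

PRODID=00001234    # illustrative ProdID
for SITE in LCG.CERN.ch LCG.CNAF.it LCG.IN2P3.fr LCG.NIKHEF.nl LCG.SARA.nl LCG.PIC.es LCG.RAL.uk LCG.GRIDKA.de; do
  dirac-production-site-submit ${PRODID} 1 ${SITE}
done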

Any observed problems or job failures should be investigated and an ELOG entry submitted. Assuming there are no problems with any of the test jobs, the production may be set to automatic submission.

Automatic Submission

When started, a production set to automatic submission will submit all jobs in the production in quick succession.

A production can be set to automatic submission once you are satisfied that there are no specific problems with the production jobs. To set a production to automatic submission use:

dirac-production-set-automatic <Production ID> [<Production ID> ...]

Monitoring a Production

Jobs in each production should be periodically checked for failures (see Failed Jobs) and to ensure that jobs are progressing (see Non-Progressing Jobs).

When monitoring a production, a Grid Shifter should be aware of a number of issues which can cause jobs to fail:

  • Staging.
  • Stalled Jobs.
  • Segmentation faults.
  • DB access.
  • Software problems.
  • Data access.
  • Shared area access.
  • Site downtime.
  • Problematic files.
  • Excessive runtime.

Failed Jobs

A Grid Shifter should monitor a production for failed jobs and jobs which are not progressing. Due to the various configurations of all the sites it is occasionally not possible for an email to be sent to the lhcb-datacrash mailing list for each failed job. It is therefore not enough to simply rely on the number of lhcb-datacrash emails to indicate if there are any problems with a production. In addition to any lhcb-datacrash notifications, the Grid Shifter should also check the number of failed jobs in a production via the CLI or the Production Monitoring Webpage.

Using the CLI, the command:

dirac-production-progress [<Production ID>]
entered without any arguments will return a breakdown of the jobs of all current productions. Entering one or more ProdIDs returns only the breakdown of those productions.

A more detailed breakdown is provided by:

dirac-production-job-summary <Production ID> [<DIRAC Status>]
which also includes the minor status of each job category and provides an example JobID for each category. The example JobIDs can then be used to investigate the failures further.
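
A typical drill-down therefore looks something like the following sketch (the IDs are placeholders):

dirac-production-progress                         # overview of the job breakdown for all active productions
dirac-production-job-summary <Production ID>      # detailed breakdown with example JobIDs per category
dirac-wms-job-peek <JobID>                        # inspect the tail of an example failed job's std.out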

Beware of failed jobs which have been killed: when a production is complete, the remaining jobs may be automatically killed by DIRAC. Killed jobs of this kind are expected and can be ignored.

Non-Progressing Jobs

In addition to failed jobs, jobs which do not progress should also be monitored. Particular attention should be paid to jobs in the states "Waiting" and "Staging". Problematic jobs at this stage are easily overlooked since the associated problems are not easily identifiable.

Non-Starting Jobs

There are several reasons why jobs arrive at a site but then fail to start. One of the most common is that the site is due to enter scheduled downtime and is no longer submitting jobs to its batch queues. Such jobs will stay at the site in the "Waiting" state and report that no CEs are available. Multiple jobs in this state should be reported.
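
To check whether a site is in (or about to enter) downtime, and whether it is still receiving jobs, the admin scripts can be used. The sketch below uses a hypothetical site name:

dirac-admin-site-info LCG.EXAMPLE.xx     # configuration information for the (hypothetical) site
dirac-admin-get-site-mask                # sites currently allowed to receive jobs
dirac-admin-get-banned-sites             # sites currently banned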

Merging Productions

Each MC Production should have an associated Merging Production which merges the output files together into more manageable file sizes. Ensure that the number of files available to the Merging Production increases in proportion to the number of successful jobs of the MC Production. If the number of files does not increase, this can point to a problem in the Bookkeeping which should be reported.

Ending a Production

Ending a completed production is handled by the Production Operations Manager (or equivalent). No action is required on the part of the Grid Shifter.

Operations on Productions

All CLI scripts which can be used to manage productions are listed in DIRAC 3 Scripts. Running a script without arguments will return basic usage notes. In some cases further help is available by running a script with the option "--help".

Web Production Monitor

Production monitoring via the web is possible through the Production Monitoring Webpage. A valid grid certificate loaded into your browser is required to use the webpage.

Features

The Production Monitoring Webpage has the following features:

Buglist and Feature Request

The procedure to submit a bug report or a feature request is outlined in Procedures.

Shifts

Grid Shifters are required to monitor all the current LHCb productions and must have a valid Grid Certificate and be a member of the LHCb VO.

Before a Shift Period

The new shifter should:

  • Ensure their Grid certificate is valid for all expected duties
  • Create accounts on all relevant web-resources
  • Subscribe to the relevant mailing lists

Grid Certificates

A Grid certificate is mandatory for Grid Shifters. If you don't have a certificate you should register for one through CERN LCG and apply to join the LHCb Virtual Organisation (VO).

To access the production monitoring webpages you will also need to load your certificate into your browser. Detailed instructions on how to do this can be found on the CERN LCG pages.
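
Before a shift it is also worth checking that a valid proxy can be created from the certificate. The sketch below is only an assumption in its details; the group option and group name may differ for your role:

dirac-proxy-init -g lhcb_prod    # create a proxy; the -g option and group name are assumptions
dirac-proxy-info                 # check the remaining lifetime of the current proxy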

Web Resources

Primary web-based resources for DIRAC 3 production shifts:

Mailing Lists

The new Grid Shifter should subscribe to the following mailing lists:

  • lhcb-datacrash.
  • lhcb-dirac-developers.
  • lhcb-dirac.
  • lhcb-production.

Note that both the lhcb-datacrash and lhcb-production mailing lists receive a substantial amount of mail daily. It is suggested that suitable message filters and folders are created in your mail client of choice.

Production Operations Key

The new shifter should obtain the Production Operations key (TCE5) from the LHCb secretariat or the previous Grid Shifter.

During a Shift

During a shift, Grid Shifters are expected to monitor all current productions and be aware of the current status of the Tier-1 sites. Knowledge of the purpose of each production is also useful, as it aids in determining the probable cause of any failed jobs.

Daily Actions

Grid Shifters are expected to carry out the following daily actions for sites used in the current productions:

  • Trigger submission of pending productions.
  • Monitor active productions.
  • Check transfer status.
  • Verify that the staging at each site is functional.
  • Check that there is a minimum of one successful (and complete) job.
  • Confirm that data access is working at least intermittently.
  • Report problems to the operations team.
  • Submit a summary of the job status at all the grid sites to the ELOG.
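
Several of these checks can be started from the CLI; a minimal sketch using scripts from the DIRAC 3 Scripts section:

dirac-production-list-active     # active productions and their submission status
dirac-production-progress        # job breakdown for the active productions
dirac-admin-pilot-summary        # pilot statistics per site, useful for spotting aborting pilots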

Performance Monitoring

Grid Shifters should view the plots accessible via the DIRACSystemMonitoring page at least three times a day and investigate any unusual features present.

Production Operations Meeting

A Production Operations Meeting takes place at the end of the morning shift and allows the morning Grid Shifter to highlight any recent or outstanding issues. Both the morning and afternoon Grid Shifter should attend. The morning Grid Shifter should give a report summarising the morning activities.

The Grid Shifter's report should contain:

  • Current production progress, jobs submitted, waiting etc.
  • Status of all Tier-1 sites.
  • Recently observed failures, paying particular attention to previously-unknown problems.

Ending a Shift

At the end of each shift, morning Grid Shifters should:

  • Pass on the key (TCE5) for the Production Operations room to the next Grid Shifter.
  • Prepare a list of outstanding issues to be handed over to the next Grid Shifter and discussed in the Production Operations meeting.
  • Submit an ELOG report summarising the shift and any ongoing or unresolved issues.

Similarly, evening Grid Shifters should:

  • Place the key (TCE5) for the Production Operations room in the secretariat key box.
  • Submit an ELOG report summarising the shift and any ongoing or unresolved issues.

End of Shift Period

At the end of a shift period the Grid Shifter may wish to unsubscribe from the various mailing lists (see Mailing Lists below) in addition to returning the Production Operations room key, TCE5 (see Miscellaneous below).

Mailing Lists

Unsubscribe from the following mailing lists:

  • lhcb-datacrash.
  • lhcb-dirac-developers.
  • lhcb-dirac.
  • lhcb-production.

Miscellaneous

Return the key for the Production Operations Room (TCE5) to the secretariat or the next Grid Shifter.

Base Plots

The following plots should always be included in the report:

  • Total number of Jobs by Final Major Status
  • Daily number of Jobs by Final Major Status
  • Done - Completed Jobs by User Group
  • Done - Completed Production Jobs by Job Type
  • Failed Jobs by User Group
  • Failed Production Jobs by Minor Status
  • Failed User Jobs by Minor Status
  • Done - Completed Production Jobs by Site
  • Done - Completed User Jobs by Site

From these plots the Grid Shifter should then create a number of further plots to analyse the causes and execution locations of failed jobs.

Specific Plots

On analysis of the failed jobs, the Grid Shifter should produce plots of the breakdown by site of all failed jobs for the three or four main job "MinorStatus" results.

Machine Monitoring Plots

Monitoring of the LHCb VO boxes is vital to maintaining the efficient running of all Grid operations. Particular attention should be paid to the used and free space on the various disks, and to network and CPU usage. The machines can be monitored using Lemon.

Analysis and Summary

A summary of each group of plots should be written to aid the next Grid Shifter’s appraisal of the current situation and to enable the Grid Expert on duty to investigate problems further.

ELOG

All Grid Shifter actions of note should be recorded in the ELOG. This has the benefit of allowing new Grid Shifters to familiarise themselves with recent problems affecting current productions. ELOG entries should contain as much relevant information as possible.

Typical ELOG Format

Each ELOG entry which reports a new problem should include as much relevant information as possible. This allows the production operations team to quickly determine the problem and apply a solution.

ELOG Entry for a New Problem

A typical ELOG entry for a new problem contains:

  • The relevant ProdID or ProdIDs.
  • An example JobID.
  • A copy of the relevant error message and output.
  • The number of affected jobs.
  • The Grid sites affected.
  • The time of the first and last occurrence of the problem.

Subsequent ELOG Entries

Once a problem has been logged it is useful to report the continuing status of the affected productions at the end of each shift.

If a Grid Shifter is unsure whether a problem has been previously logged, they should submit a fresh ELOG entry following the format outlined in ELOG Entry for a New Problem.

When to Submit an ELOG

A non-exhaustive list of cases in which an ELOG entry should be submitted includes:

  • Jobs finalise with exceptions.
  • The applications run in the job crash with exceptions.
  • A production is stuck/does not proceed/is failing all the jobs/...
  • Site related problems:
    • A large number/percentage of pilots are aborting
    • Shared area slowness (e.g. : jobs failed with Application status = "SetupProject.sh execution failed")
    • The site is killing a suspiciously high number of jobs.
    • ...

Exceptions

Jobs which finalise with an exception should be noted in the ELOG. The ELOG entry should contain:

  • The production ID.
  • An example job ID.
  • A copy of the relevant error messages.
  • The number of jobs in the production which have the same status.

Crashed Application

The Grid Shifter should submit an example error log for the crashed application.

Datacrash Emails

The Grid Shifter should filter the datacrash emails and determine if the crash reported is actually due to one of the applications. If so, the Grid Shifter should submit an ELOG describing the problem and including an example error message. The Grid Shifter should ensure the "Applications" radio button is selected when submitting the ELOG report, since this means that the relevant experts will be alerted to the problem.

ELOG Problems

If ELOG is down, send a notification email to lhcb-production@cern.ch.

Procedures

If a problem is discovered it is very important to escalate it to the operations team. Assessing the scale of the problem is also important, and Grid Shifters should attempt to answer the questions in the Standard Checklist below as soon as possible.

On the Discovery of a Problem

Once a problem has been discovered it is important to assess its severity. The Standard Checklist below provides the questions which the Grid Shifter should go through after discovering a problem. Additionally, there are a number of Grid-specific issues to consider (see Grid-Specific Issues).

Standard Checklist

On the discovery of a new problem, attempt to provide answers to the following questions as quickly as possible:

  • How many jobs does the problem affect?
  • Are the central DIRAC services running normally?
  • Are all jobs affected?
  • When did the problem start?
  • When did the last successful job run in similar conditions?
  • Is it a DIRAC problem?
    • Can extra redundancy be introduced to the system?
    • Is there enough information available to determine the error?

Grid-Specific Issues

  • Was there an announcement of downtime for the site?
  • Is the problem specific to a single site?
    • Are all the CEs at the site affected?
  • Is the problem systematic across sites with different backend storage technologies?
  • Is the problem specific to an SE?
    • Are there any stalled jobs at the site clustered in time?
    • Are other jobs successfully reading data from the SE?

Feature Requests

Before submitting a feature request, the user should:

  • Identify conditions under which the feature is to be used.
  • Record all relevant information.
  • Identify a use-case for the new feature.

Figure 5: Browse current support issues.

Once the user has prepared all the relevant information, they should browse the current support issues (Fig. 5) to check whether the feature has already been requested.

Figure 6: Savannah support submit.

Figure 7: Savannah support submit feature request.

Assuming the feature request has not been previously submitted, the user should then:

  • Navigate to the "Support" tab at the top of the page (Fig. 6) and click on "submit".
  • Ensure that the submission webform contains all relevant information (Fig. 7).
  • Set the severity option to "wish".
  • Set the privacy option to "private".
  • Submit the feature request.

Bug Reporting

Before submitting a bug report, the user should:

  • Identify conditions under which the bug occurs.
  • Record all relevant information.
  • Try to ensure that the bug is reproducible.

Once the user is convinced that the behaviour they are experiencing is a bug, they should then prepare to submit a bug report. Users should first browse the current bug reports (Fig. 8) to check whether the bug has already been reported.

Figure 8: Browse current bugs.

Assuming the bug is new, the procedure to submit a bug report is as follows:

  • Navigate to the "Support" tab at the top of the page (Fig. 6) and click on "submit".
  • Ensure that the submission webform contains all relevant information (Fig. 9).
  • Set the appropriate severity of the problem.
  • Write a short and clear summary.
  • Set the privacy option to "private".
  • Submit the bug report.

Figure 9: Example bug report.

Software Unavailability

Symptom: Jobs fail to find at least one software package.

Software installation occurs during Service Availability Monitoring (SAM) tests. Sites which fail to find software packages should have failed at least part of their most recent SAM test.

Grid Shifter actions:

  • Submit an ELOG report listing the affected productions and sites.
  • Ban the relevant sites until they pass their SAM tests.
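
Banning and later re-allowing a site is done with the admin scripts listed in DIRAC 3 Scripts. The sketch below uses a hypothetical site name, and the comment argument is an assumption; run the scripts without arguments to confirm their exact usage:

dirac-admin-ban-site LCG.EXAMPLE.xx "Failing software installation SAM test"
dirac-admin-allow-site LCG.EXAMPLE.xx "SAM tests passing again"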

DIRAC 3 Scripts

DIRAC Admin Scripts

  • dirac-admin-accounting-cli
  • dirac-admin-add-user
  • dirac-admin-allow-site
  • dirac-admin-ban-site
  • dirac-admin-delete-user
  • dirac-admin-get-banned-sites
  • dirac-admin-get-job-pilot-output
  • dirac-admin-get-job-pilots
  • dirac-admin-get-pilot-output
  • dirac-admin-get-proxy
  • dirac-admin-get-site-mask
  • dirac-admin-list-hosts
  • dirac-admin-list-users
  • dirac-admin-modify-user
  • dirac-admin-pilot-summary
  • dirac-admin-reset-job
  • dirac-admin-service-ports
  • dirac-admin-site-info
  • dirac-admin-sync-users-from-file
  • dirac-admin-upload-proxy
  • dirac-admin-users-with-proxy

DIRAC Bookkeeping Scripts

  • dirac-bookkeeping-eventMgt
  • dirac-bookkeeping-eventtype-mgt
  • dirac-bookkeeping-ls
  • dirac-bookkeeping-production-jobs
  • dirac-bookkeeping-production-informations

DIRAC Clean

  • dirac-clean

DIRAC Configuration

  • dirac-configuration-cli

DIRAC Distribution

  • dirac-distribution

DIRAC DMS

  • dirac-dms-add-file
  • dirac-dms-get-file
  • dirac-dms-lfn-accessURL
  • dirac-dms-lfn-logging-info
  • dirac-dms-lfn-metadata
  • dirac-dms-lfn-replicas
  • dirac-dms-pfn-metadata
  • dirac-dms-pfn-accessURL
  • dirac-dms-remove-pfn
  • dirac-dms-remove-lfn
  • dirac-dms-replicate-lfn

DIRAC Embedded

  • dirac-embedded-external

DIRAC External

  • dirac-external

DIRAC Fix

  • dirac-fix-ld-library-path

DIRAC Framework

  • dirac-framework-ping-service

DIRAC Functions

  • dirac-functions.sh

DIRAC Group

  • dirac-group-init

DIRAC Jobexec

  • dirac-jobexec

DIRAC LHCb

  • dirac-lhcb-job-replica
  • dirac-lhcb-manage-software
  • dirac-lhcb-production-job-check
  • dirac-lhcb-sam-submit-all
  • dirac-lhcb-sam-submit-ce

DIRAC Myproxy

  • dirac-myproxy-upload

DIRAC Production

  • dirac-production-application-summary
  • dirac-production-change-status
  • dirac-production-job-summary
  • dirac-production-list-active
  • dirac-production-list-all
  • dirac-production-list-id
  • dirac-production-logging-info
  • dirac-production-mcextend
  • dirac-production-manager-cli
  • dirac-production-progress
  • dirac-production-set-automatic
  • dirac-production-set-manual
  • dirac-production-site-summary
  • dirac-production-start
  • dirac-production-stop
  • dirac-production-submit
  • dirac-production-summary

DIRAC Proxy

  • dirac-proxy-info
  • dirac-proxy-init
  • dirac-proxy-upload

DIRAC Update

  • dirac-update

DIRAC WMS

  • dirac-wms-job-delete
  • dirac-wms-job-get-output
  • dirac-wms-job-get-input
  • dirac-wms-job-kill
  • dirac-wms-job-logging-info
  • dirac-wms-job-parameters
  • dirac-wms-job-peek
  • dirac-wms-job-status
  • dirac-wms-job-submit
  • dirac-wms-job-reschedule

Common Acronyms

ACL
Access Control Lists
API
Application Programming Interface
ARC
Advanced Resource Connector
ARDA
A Realisation of Distributed Analysis
BDII
Berkeley Database Information Index
BOSS
Batch Object Submission System
CA
Certification Authority
CAF
CDF Central Analysis Farm
CCRC
Common Computing Readiness Challenge
CDF
Collider Detector at Fermilab
CE
Computing Element
CERN
Organisation Européenne pour la Recherche Nucléaire: Switzerland/France
CNAF
Centro Nazionale per la Ricerca e Sviluppo nelle Tecnologie Informatiche e Telematiche: Italy
ConDB
Conditions Database
CPU
Central Processing Unit
CRL
Certificate Revocation List
CS
Configuration Service
DAG
Directed Acyclic Graph
DC04
Data Challenge 2004
DC06
Data Challenge 2006
DCAP
Data Link Switching Client Access Protocol
DIAL
Distributed Interactive Analysis of Large datasets
DIRAC
Distributed Infrastructure with Remote Agent Control
DISET
DIRAC Secure Transport
DLI
Data Location Interface
DLLs
Dynamically Linked Libraries
DN
Distinguished Name
DNS
Domain Name System
DRS
Data Replication Service
DST
Data Summary Tape
ECAL
Electromagnetic CALorimeter
EGA
Enterprise Grid Alliance
EGEE
Enabling Grids for E-sciencE
ELOG
Electronic Log
ETC
Event Tag Collection
FIFO
First In First Out
FTS
File Transfer Service
GASS
Global Access to Secondary Storage
GFAL
Grid File Access Library
GGF
Global Grid Forum
GIIS
Grid Index Information Service
GLUE
Grid Laboratory Uniform Environment
GRAM
Grid Resource Allocation Manager
GridFTP
Grid File Transfer Protocol
GridKa
Grid Computing Centre Karlsruhe
GriPhyN
Grid Physics Network
GRIS
Grid Resource Information Server
GSI
Grid Security Infrastructure
GT
Globus Toolkit
GUI
Graphical User Interface
GUID
Globally Unique IDentifier
HCAL
Hadron CALorimeter
HEP
High Energy Physics
HLT
High Level Trigger
HTML
Hyper-Text Markup Language
HTTP
Hyper-Text Transfer Protocol
I/O
Input/Output
IN2P3
Institut National de Physique Nucléaire et de Physique des Particules: France
iVDGL
International Virtual Data Grid Laboratory
JDL
Job Description Language
JobDB
Job Database
JobID
Job Identifier
L0
Level 0
LAN
Local Area Network
LCG
LHC Computing Grid
LCG IS
LCG Information System
LCG UI
LCG User Interface
LCG WMS
LCG Workload Management System
LDAP
Lightweight Directory Access Protocol
LFC
LCG File Catalogue
LFN
Logical File Name
LHC
Large Hadron Collider
LHCb
Large Hadron Collider beauty
LSF
Load Share Facility
MC
Monte Carlo
MDS
Monitoring and Discovery Service
MSS
Mass Storage System
NIKHEF
National Institute for Subatomic Physics: Netherlands
OGSA
Open Grid Services Architecture
OGSI
Open Grid Services Infrastructure
OSG
Open Science Grid
P2P
Peer-to-peer
Panda
Production ANd Distributed Analysis
PC
Personal Computer
PDC1
Physics Data Challenge
PFN
Physical File Name
PIC
Port d’Informació Científica: Spain
PKI
Public Key Infrastructure
POOL
Pool Of persistent Objects for LHC
POSIX
Portable Operating System Interface
PPDG
Particle Physics Data Grid
ProdID
Production Identifier
PS
Preshower Detector
R-GMA
Relational Grid Monitoring Architecture
RAL
Rutherford-Appleton Laboratory: UK
RB
Resource Broker
rDST
reduced Data Summary Tape
RFIO
Remote File Input/Output
RICH
Ring Imaging CHerenkov
RM
Replica Manager
RPC
Remote Procedure Call
RTTC
Real Time Trigger Challenge
SAM
Service Availability Monitoring
SE
Storage Element
SOA
Service Oriented Architecture
SOAP
Simple Object Access Protocol
SPD
Scintillator Pad Detector
SRM
Storage Resource Manager
SSL
Secure Socket Layer
SURL
Storage URL
TCP/IP
Transmission Control Protocol / Internet Protocol
TDS
Transient Detector Store
TES
Transient Event Store
THS
Transient Histogram Store
TT
Trigger Tracker
TURL
Transport URL
URL
Uniform Resource Locator
VDT
Virtual Data Toolkit
VELO
VErtex LOcator
VO
Virtual Organisation
VOMS
Virtual Organisation Membership Service
WAN
Wide Area Network
WMS
Workload Management System
WN
Worker Node
WSDL
Web Services Description Language
WSRF
Web Services Resource Framework
WWW
World Wide Web
XML
eXtensible Markup Language
XML-RPC
XML Remote Procedure Call

-- PaulSzczypka - 14 Aug 2009
