-- Main.atsareg - 15 Jan 2007
DIRAC 3 Discussion Pages
During the last week of 2006 we held a workshop to outline the next-generation DIRAC 3 project:
its main trends, features and organization. This page summarizes the discussions. It is
meant to be dynamic and will be updated to reflect the outcome of the ongoing discussions.
The summary is organized in several topics, which correspond to the subsections below. Each subsection has
a convener whose responsibility is to collect and objectively present the outcome of the relevant discussions. Conveners
formulate the decisions taken by consensus and present the discussion points where opinions differ. The objectivity
of the conveners is important for these discussions.
Project Organization
Convener: Andrei Tsaregorodtsev
Code Packaging
Code packaging has several issues:
- structure of the CVS repository reflecting the functional decomposition of the DIRAC code;
- definition of the package dependencies;
- dealing with compiled (C++) code;
- defining package responsibilities.
The proposed structure is shown below. It is subject to discussion, and other proposals have already been made. The main principle is that the same
directory structure should be maintained in the CVS repository, in the DIRAC distribution (complete or partial) and in the development environment,
at least as far as the Python code is concerned. In the development environment direct check-in to the CVS repository must be possible. The import
statements should not depend on whether the code runs in the development or the production environment.
CVSROOT/<CMT>
       /doc
       /scripts
       /etc
       /python/contrib
              /WorkflowLibrary/Modules
                              /Steps
                              /Workflows
              /DIRAC/Core/Utilities
                         /DISET
                         /Workflow
                         /Logger
                    /DataAccess/Storage
                               /ReplicaManager
                               /FileCatalog
                    /WMS/Service
                        /Agent
                        /DB
                        /Client
                    /DMS/Service
                        /Agent
                        /DB
                        /Client
                    /ProdMgmt/Service
                             /Agent
                             /DB
                             /Client
                    /VObox/Service
                          /Agent
                          /DB
                          /Client
                    /Information/Service
                                /Agent
                                /DB
                                /Client
                    /Resources/Storage
                              /Computing
                    /Interface/API
                              /DIRAC-shell
                              /Web
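The requirement that import statements be independent of the environment can be illustrated with a minimal sketch. It builds two copies of a reduced package tree, standing in for a CVS checkout and an installed release, and imports the same module from each one after putting the corresponding python directory on sys.path. The module name Hello and the directory names cvs_checkout and installed_release are hypothetical and exist only for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_skeleton(root):
    """Create a reduced python/DIRAC/Core/Utilities package tree under root."""
    python_dir = root / "python"
    package_dir = python_dir / "DIRAC" / "Core" / "Utilities"
    package_dir.mkdir(parents=True)
    # Every package level needs an __init__.py to be importable.
    for level in (python_dir / "DIRAC", python_dir / "DIRAC" / "Core", package_dir):
        (level / "__init__.py").write_text("")
    # Hypothetical module, used only for this illustration.
    (package_dir / "Hello.py").write_text("def where():\n    return __file__\n")
    return python_dir


def import_from(python_dir):
    """Import DIRAC.Core.Utilities.Hello with python_dir first on sys.path."""
    # Drop cached DIRAC modules so that each environment is imported afresh.
    for name in [n for n in sys.modules if n == "DIRAC" or n.startswith("DIRAC.")]:
        del sys.modules[name]
    sys.path.insert(0, str(python_dir))
    try:
        importlib.invalidate_caches()
        # The import statement is identical in both environments.
        return importlib.import_module("DIRAC.Core.Utilities.Hello")
    finally:
        sys.path.remove(str(python_dir))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for env in ("cvs_checkout", "installed_release"):
            module = import_from(make_skeleton(Path(tmp) / env))
            print(env, "->", module.where())
```

Only the entry added to sys.path differs between the two environments; the code that does the import is unchanged.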
Discussion Points
- The high level structure of the CVS repository
- The relation between the CVS repository structure and the release procedure
- Python code packaging
Release Procedure
The proposed DIRAC release procedure consists of several steps described below. We distinguish development and production releases.
The procedure below concerns the production release. The release procedure should be as simple as possible with most of the emphasis
on testing and documenting.
- The upcoming release is announced by the release manager/project coordinator to the project package developers.
- The developers of the DIRAC packages commit to CVS the code that should be included in the release. By default, the code for the release is taken from the CVS HEAD revision. Package developers can mark the code to be released with a CVS tag if the HEAD revision is not the right choice; the tag is then communicated to the release manager. The package release notes are provided to the release manager.
- The code for all packages is collected on the machine of the release manager and tagged with a release tag of the form vXXXrXXX.
- The binary code is compiled using the CMT build system for the architectures in use by the LHCb/DIRAC project.
- The release notes are compiled by the release manager and added to the doc/release.notes file. The Epydoc code documentation is compiled and made available on the DIRAC Project Web pages.
- The distribution tar file is built.
- The new release is installed on the Test system from the distribution tar file. The distribution tar file is placed in the distribution area. The Test system is configured to pick up the new release for the Worker Node installation.
- The tests are carried out with real jobs affecting all the subsystems.
- After the tests confirm the validity of the released code, the release is announced to the DIRAC developers and other relevant mailing lists.
- The release is installed in the CERN release area to be used by the DIRAC clients.
- Production and User WMS systems are upgraded to the new release if necessary.
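The step in which the distribution tar file is built could look roughly like the following sketch. The file name pattern DIRAC-<tag>.tar.gz and the top-level DIRAC/ directory inside the archive are assumptions made for illustration; the page does not define either.

```python
import tarfile
import tempfile
from pathlib import Path


def build_distribution(source_dir, release_tag, output_dir):
    """Pack source_dir into output_dir/DIRAC-<release_tag>.tar.gz and return the path."""
    output_dir.mkdir(parents=True, exist_ok=True)
    # Assumed naming convention: the release tag is part of the file name.
    tar_path = output_dir / ("DIRAC-%s.tar.gz" % release_tag)
    with tarfile.open(tar_path, "w:gz") as tar:
        # arcname places everything under one top-level directory in the archive.
        tar.add(source_dir, arcname="DIRAC")
    return tar_path


def list_distribution(tar_path):
    """Return the sorted member names of a distribution tar file."""
    with tarfile.open(tar_path, "r:gz") as tar:
        return sorted(tar.getnames())


if __name__ == "__main__":
    work = Path(tempfile.mkdtemp())
    source = work / "release"
    (source / "doc").mkdir(parents=True)
    (source / "doc" / "release.notes").write_text("release notes go here\n")
    tar_file = build_distribution(source, "v1r0", work / "dist")
    print(tar_file.name, list_distribution(tar_file))
```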
A development release follows practically the same lines with a few simplifications. The release manager collects the code on his machine by adding one or more updated packages to the previous release. The release
is then tagged with a development tag of the form vXXXrXXXpXXX, the tar file is produced and made available in the distribution area, the release notes are updated and the release is announced to the DIRAC developers
mailing list.
Special attention should be paid to the choice of version numbers. Each modification that affects the service interfaces, and thus the service/client communication, results in the new release being tagged with a new major version. Otherwise, only the minor version is incremented. This makes clear which service and client installations are functionally compatible.
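The versioning convention can be expressed as a small sketch. It assumes that the number after "v" is the major version, the number after "r" is the minor version, and the optional number after "p" identifies a development release; it also assumes that the minor version restarts at 0 when the major version is incremented. Neither assumption is stated explicitly on this page.

```python
import re
from typing import NamedTuple, Optional


class ReleaseTag(NamedTuple):
    major: int
    minor: int
    patch: Optional[int]  # None for a production release


_TAG_RE = re.compile(r"^v(\d+)r(\d+)(?:p(\d+))?$")


def parse_tag(tag):
    """Parse a tag of the form vXXXrXXX or vXXXrXXXpXXX."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError("not a release tag: %r" % tag)
    major, minor, patch = match.groups()
    return ReleaseTag(int(major), int(minor), int(patch) if patch is not None else None)


def compatible(service_tag, client_tag):
    """Service and client are functionally compatible if the major versions match."""
    return parse_tag(service_tag).major == parse_tag(client_tag).major


def next_tag(current, interface_changed):
    """Tag for the next production release following the convention above."""
    tag = parse_tag(current)
    if interface_changed:
        # Assumption: the minor version restarts at 0 on a major increment.
        return "v%dr0" % (tag.major + 1)
    return "v%dr%d" % (tag.major, tag.minor + 1)
```

With these helpers, an installation could refuse to connect a client to a service when compatible() returns False, instead of failing later with an interface error.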
Discussion points
The main open question is whether to make a single distribution of all the DIRAC software or several distributions for different installation environments with different versions. The different environments are: WMS servers, WN installations, VO-boxes, client installations, and possibly others.
We agreed that even if several distributions are desirable, they should all carry the same common release tag; there will be no separate subsystem releases with independent tags/versions.
Single distribution
The advantages of a single distribution are the simplicity of building it (just one) and of installing it (one installation procedure). The disadvantage is that more software than necessary may be installed in a given environment.
In this approach it is possible to have different distribution versions installed in different environments. This can create confusion about the compatibility of different versions of services and clients. This confusion
can be avoided with the versioning convention described above.
Subsystem distributions
Several distributions avoid installing unnecessary software in a given environment. It was also mentioned that this can help to spot unnecessary dependencies between packages.
This, however, can also be done without building subsystem distributions.
The size of the distribution is not an issue (about 12 MB currently).
The discussions indicated the need to include the Python interpreter in the DIRAC distribution in order not to depend on variations in local Python installations. The
version of the Python interpreter to be included should be the same as the current Python version in the Application Area.
An open point is whether to include binaries compiled for different platforms in a single distribution or to have platform-dependent distributions. Having all the
platform-dependent binaries in one distribution will increase the size but will simplify the installation procedure.
Tools for building distributions
Installation
Installation is done in several steps:
- deciding on the version of DIRAC to be installed;
- downloading the DIRAC distribution according to the hosting platform;
- installing the software;
- configuring the DIRAC installation.
The procedure should be transparent and simple, as it is now with a single installation command (script). This script will have to be updated to reflect possible changes in the way the DIRAC distribution is built.
A fresh installation performs a simple setup with reasonable default values. Further configuration is done by editing the configuration files.
An existing DIRAC installation can be updated without additional setup.
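The behaviour described here (defaults on a fresh installation, no extra setup on update) can be sketched as follows. The file name etc/install.ini, the INI format, the section name and the option names are all hypothetical and serve only to show the idea; they do not describe the actual DIRAC configuration files.

```python
import configparser
import tempfile
from pathlib import Path

# Hypothetical defaults; the real option names are not defined on this page.
DEFAULTS = {"Setup": "Test", "LogLevel": "INFO"}


def configure(install_root):
    """Write defaults on a fresh installation; keep edited values on an update."""
    cfg_path = install_root / "etc" / "install.ini"
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser["DIRAC"] = dict(DEFAULTS)
    if cfg_path.exists():
        # Update: values already edited by the administrator override the defaults.
        parser.read(cfg_path)
    cfg_path.parent.mkdir(parents=True, exist_ok=True)
    with cfg_path.open("w") as handle:
        parser.write(handle)
    return cfg_path


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    print(configure(root).read_text())
```

Running configure() a second time on the same installation leaves edited values untouched and only adds defaults for options that are missing.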
Software frameworks and rules
Convener: Ricardo Graciani
Python version
Python-related topics to discuss:
- Which version of Python will DIRAC3 be "certified" to run on? Python 2.4 is more widespread than 2.5, but the latter is the newest one and has a longer lifetime.
- Will Python be shipped together with DIRAC3?
DISET Service Framework
Basic Utilities
See the proposal of the multilayer configuration structure.
See the proposal of the logging system.
See the proposal of the services framework.
See the proposed changes for current errors reported to the logger.
See the proposed changes for current fatal errors reported to the logger.
Coding Rules
See the proposal of the DIRAC Coding rules document.
See the proposal of the DIRAC typical package organization.
Testing
Workload Management
Convener: Stuart Paterson
WMS
See the Discussion Summary in the attached document.
See the DB discussion proposal:
https://ruben.ecm.ub.es/wiki/index.php/WMS_DataBase
Discussion Points
- How to proceed with the implementation of a job prioritization mechanism.
Data Management
Convener: Andrew C. Smith
See the December Discussion Slides/Minutes in the attached document.
A Data Management Proposal document, with a fuller discussion of the points raised and of the subsequent discussions, is available.
Discussion Points
Tasks
- Transfer Database
  - test database set up
  - Transfer DB Service/Client (extension of requestDB)
  - multi-threaded FTS agent (reuse of code)
  - monitoring?
  - Transfer integrity checks
- Dataset prototype
  - script to create links in LFC
  - tools to dereference datasets
  - database to store dataset metadata
  - web display of created datasets
- Transfer Weather Service
  - simple client development
  - server deployment (still need to define what is wanted)
- Data Integrity Checks (Marianne)
  - SE -> LFC
  - Integration of integrity checks in the Data Management Console
- Data Management Console
  - Integrate Auto Data Transfer Database CLI
- LCG externals
  - Into CVS
  - Perhaps use of GFAL directly in SRM class.
Monitoring and Accounting
How to use the monitoring system
from DIRAC import gMonitor
gMonitor.registerActivity(name, description, category, unit, operation)
gMonitor.addMark(name, x)
The 'name' argument to registerActivity is an internal identifier for the activity; the same name must be used when adding marks. The 'description' is used as the plot title on the monitoring page. The 'category' groups activities of a certain type together when producing plots.
The 'unit' is the unit in which the marks are expressed. The 'operation' argument given to registerActivity determines how the marks are treated. The possibilities are:
OP_MEAN = "mean"
OP_ACUM = "acum"
OP_SUM = "sum"
OP_RATE = "rate"
For example, to monitor the number of banned sites over time you need the following code:
gMonitor.registerActivity("BannedSites", "Banned Sites", "SAMAgent", "Banned Sites", gMonitor.OP_MEAN)
gMonitor.addMark("BannedSites", 10)
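The page does not say how each operation treats the marks. The following stand-in class is a sketch of one plausible reading and is not the gMonitor implementation: it assumes that marks are collected per reporting interval, that "mean" averages them, "sum" adds them up, "rate" divides the sum by the interval length, and "acum" keeps a running total across intervals. The class name ToyMonitor, the flush method and the bucket_length parameter are hypothetical.

```python
from collections import defaultdict


class ToyMonitor:
    """Stand-in for gMonitor, for illustration only; semantics are assumed."""

    OP_MEAN = "mean"
    OP_ACUM = "acum"
    OP_SUM = "sum"
    OP_RATE = "rate"

    def __init__(self, bucket_length=60):
        self.bucket_length = bucket_length  # seconds per reporting interval (assumed)
        self.activities = {}
        self.marks = defaultdict(list)
        self.totals = defaultdict(float)

    def registerActivity(self, name, description, category, unit, operation):
        if operation not in (self.OP_MEAN, self.OP_ACUM, self.OP_SUM, self.OP_RATE):
            raise ValueError("unknown operation: %r" % operation)
        self.activities[name] = {
            "description": description,
            "category": category,
            "unit": unit,
            "operation": operation,
        }

    def addMark(self, name, value):
        if name not in self.activities:
            raise KeyError("activity %r is not registered" % name)
        self.marks[name].append(value)

    def flush(self, name):
        """Reduce the marks of one interval to a single value and clear them."""
        operation = self.activities[name]["operation"]
        marks = self.marks.pop(name, [])
        total = float(sum(marks))
        if operation == self.OP_MEAN:
            return total / len(marks) if marks else 0.0
        if operation == self.OP_SUM:
            return total
        if operation == self.OP_RATE:
            return total / self.bucket_length
        # OP_ACUM: running total across intervals
        self.totals[name] += total
        return self.totals[name]
```

Under these assumptions a quantity such as the number of banned sites fits "mean", because adding several readings within one interval should not inflate the plotted value.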
Convener: Ricardo Graciani
See the Discussion Summary in the attached document.
Production Management
Convener: Joel Closier
Console
Tasks
- remote repository for workflow and production
- possibility to modify some parameters at the level of a step and not only at the level of workflow
- show the association between a production template and the workflow used to create this template
- web interface to display the configuration of a production (configuration, event type, program version, ..)
- put source in CVS
- migration to python
Status
- prototype in preparation with MySQL as backend.
- job part is dropped.
- not needed for the time being: instantiation of jobs with input data.
- submitprod / expandprod / getprod / publishprod / listprod / getpath
- path like: simworkflow (5 steps, for example) / dc06-v1 (modify the version of the program) / 10001001 (change the number of events / event type)
Production
Tasks
- publication of information: what should be displayed in the web interface?
- form to make a request for physics production.
- possibility to find the workflow used for the template
DIRAC tools:
- logging of actions (ban/allow site, modify filter, ...): implement a logging facility for the dirac-admin command that records who performed an action, what it was, when and why.
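A minimal sketch of such a who/what/why/when record, using only the Python standard library, is shown here. The function names, the record layout and the site name Site.Example are hypothetical; how dirac-admin would call this, and where the records would be stored, is left open.

```python
import getpass
import json
import logging
from datetime import datetime, timezone


def audit_record(action, target, reason, user=None, when=None):
    """Build one who/what/why/when record for an administrative action."""
    return {
        "who": user or getpass.getuser(),
        "what": "%s %s" % (action, target),
        "why": reason,
        "when": (when or datetime.now(timezone.utc)).isoformat(),
    }


def log_action(logger, action, target, reason, user=None, when=None):
    """Write the record as one JSON line and return it."""
    record = audit_record(action, target, reason, user, when)
    # One JSON object per line keeps the log easy to search and to parse.
    logger.info(json.dumps(record, sort_keys=True))
    return record


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    log_action(logging.getLogger("admin-audit"), "ban", "Site.Example", "storage unavailable")
```

Requiring the reason as a mandatory argument is a design choice of this sketch: it makes it impossible to record an action without saying why it was taken.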
Processing DB
Tools:
- way to modify a filter
- how to create a new transformation without using the files already processed ->
- how to resubmit a job?
- how to submit jobs to a list of sites?
- registration of all files with a lifetime -> registration by default
- how to flush a transformation
Browsing:
- view: how many jobs were created / what is the status of each file?
- association of files and transformations.
Operation
- two instances of processingDB for test and production
- Any file registration by default
Production Repository Service
Proposal for the Production Repository Service architecture and interface.
DIRAC Interfaces
Convener: Andrei Tsaregorodtsev
Other Tasks