-- Main.atsareg - 15 Jan 2007

DIRAC 3 Discussion Pages

During the last week of 2006 we held a workshop to outline the next-generation DIRAC 3 project: its main trends, features and organization. This page summarizes the discussions. The page is meant to be dynamic, reflecting the outcome of the ongoing discussions.

The summary is organized in several topics, which correspond to the subsections below. Each subsection has a convener whose responsibility is to collect and objectively present the outcome of the relevant discussions. Conveners formulate the decisions taken by consensus and present the discussion points where opinions differ. The objectivity of the conveners is important for these discussions.

Project Organization

Convener: Andrei Tsaregorodtsev

Code Packaging

Code packaging has several issues:

  • structure of the CVS repository reflecting the functional decomposition of the DIRAC code;
  • definition of the package dependencies;
  • dealing with compiled (C++) code;
  • defining package responsibilities.

The proposed structure is shown below. It is still subject to discussion, and other proposals have already been made. The main principle is that the same directory structure should be maintained in the CVS repository, in the DIRAC distribution (complete or partial) and in the development environment, at least as far as the Python code is concerned. In the development environment, direct check-in to the CVS repository must be possible. The import statements should not depend on whether the code runs in the development or the production environment.

CVSROOT/<CMT>
       /doc
       /scripts
       /etc
       /python/contrib
              /WorkflowLibrary/Modules
                              /Steps
                              /Workflows
              /DIRAC/Core/Utilities
                         /DISET
                         /Workflow
                         /Logger
                         /DataAccess/Storage
                                    /ReplicaManager
                                    /FileCatalog
                     /WMS/Service
                         /Agent
                         /DB
                         /Client
                     /DMS/Service
                         /Agent
                         /DB
                         /Client
                     /ProdMgmt/Service
                              /Agent
                              /DB
                              /Client  
                     /VObox/Service
                           /Agent
                           /DB
                           /Client   
                     /Information/Service
                                 /Agent
                                 /DB
                                 /Client     
                     /Resources/Storage
                               /Computing
                     /Interface/API
                               /DIRAC-shell
                               /Web                                   
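The principle that import statements should not depend on the environment can be illustrated with a small sketch. It recreates a subset of the proposed tree in a temporary directory (standing in for a CVS checkout or an unpacked distribution) and imports a package through the same dotted name that would be used anywhere else. The helper names `create_layout` and `import_package` are hypothetical, and the empty `__init__.py` files are an assumption about how the packages would be declared.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Subset of the proposed tree, written as dotted package names.
PACKAGES = ["DIRAC.Core.Utilities", "DIRAC.WMS.Client", "DIRAC.DMS.Client"]


def create_layout(root):
    """Create <root>/python/<package dirs>, each with an empty __init__.py."""
    python_dir = Path(root) / "python"
    for dotted in PACKAGES:
        current = python_dir
        for part in dotted.split("."):
            current = current / part
            current.mkdir(parents=True, exist_ok=True)
            (current / "__init__.py").touch()
    return python_dir


def import_package(python_dir, dotted):
    """Import a package after putting the given python directory on sys.path."""
    if str(python_dir) not in sys.path:
        sys.path.insert(0, str(python_dir))
    importlib.invalidate_caches()
    return importlib.import_module(dotted)


# The temporary directory plays the role of a checkout or an installation.
checkout = tempfile.mkdtemp(prefix="dirac_checkout_")
python_dir = create_layout(checkout)
module = import_package(python_dir, "DIRAC.WMS.Client")
print(module.__name__, "loaded from", module.__file__)
```

Only the directory placed on `sys.path` differs between environments; the dotted name in the import statement stays the same.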
       

Discussion Points

  • The high level structure of the CVS repository
  • The relation between the CVS repository structure and the release procedure
  • Python code packaging

Release Procedure

The proposed DIRAC release procedure consists of several steps described below. We distinguish development and production releases. The procedure below concerns the production release. The release procedure should be as simple as possible with most of the emphasis on testing and documenting.

  • The upcoming release is announced by the release manager/project coordinator to the project package developers.
  • The developers of the DIRAC packages commit to CVS the code that should be included in the release. By default, the code for the release is taken from the CVS HEAD revision. Package developers can mark the code to be released with a CVS tag if the HEAD revision is not the right choice; the tag is communicated to the release manager. The package release notes are provided to the release manager.
  • The code for all packages is collected on the machine of the release manager and tagged with a release tag of the form vXXXrXXX.
  • The binary code is compiled using the CMT build system for the architectures in use by the LHCb/DIRAC project.
  • The release notes are compiled by the release manager and added to the doc/release.notes file. The Epydoc code documentation is compiled and made available on the DIRAC Project Web pages.
  • The distribution tar file is built.
  • The new release is installed on the Test system from the distribution tar file. The distribution tar file is placed in the distribution area. The Test system is configured to pick up the new release for the Worker Node installation.
  • The tests are carried out with real jobs affecting all the subsystems.
  • After the tests confirm the validity of the released code, the release is announced to the DIRAC developers and other relevant mailing lists.
  • The release is installed in the CERN release area to be used by the DIRAC clients.
  • Production and User WMS systems are upgraded to the new release if necessary.
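The "distribution tar file is built" step above could look roughly like the sketch below, which packs a release tree into a gzip-compressed tar file named after the release tag. The function name `build_distribution`, the file name pattern `DIRAC-<tag>.tar.gz` and the placeholder release notes are assumptions made for illustration; the page does not define the actual build tooling.

```python
import tarfile
import tempfile
from pathlib import Path


def build_distribution(source_root, release_tag, output_dir):
    """Pack every top-level entry of source_root into DIRAC-<tag>.tar.gz."""
    source_root = Path(source_root)
    tar_path = Path(output_dir) / f"DIRAC-{release_tag}.tar.gz"
    with tarfile.open(tar_path, "w:gz") as tar:
        for entry in sorted(source_root.iterdir()):
            # Prefix archive members with the release name so that
            # unpacking creates a single top-level directory.
            tar.add(entry, arcname=f"DIRAC-{release_tag}/{entry.name}")
    return tar_path


# Demo on a throwaway tree with placeholder content.
source = Path(tempfile.mkdtemp(prefix="dirac_release_"))
(source / "doc").mkdir()
(source / "doc" / "release.notes").write_text("placeholder release notes\n")
(source / "python").mkdir()
output = tempfile.mkdtemp(prefix="dirac_dist_")

tar_path = build_distribution(source, "v3r0", output)
with tarfile.open(tar_path) as tar:
    names = sorted(tar.getnames())
print(tar_path.name, names)
```

The output directory is kept outside the source tree so that the archive never includes itself.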

A development release follows practically the same lines, with a few simplifications. The release manager collects the code on his machine by adding one or more updated packages to the previous release. The release is then tagged with a development tag of the form vXXXrXXXpXXX, the tar file is produced and made available in the distribution area, the release notes are updated and the release is announced to the DIRAC developers mailing list.

Special attention should be paid to the choice of the versions. Each modification that affects the service interfaces, and thus the service/client communications, will result in tagging the new release with a new major version. Otherwise, only the minor version is incremented. This will make clear which service and client installations are functionally compatible.
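The versioning convention lends itself to a mechanical compatibility check. The sketch below parses tags of the forms vXXXrXXX and vXXXrXXXpXXX and treats two installations as compatible when their major versions match. It assumes that the number after `v` is the major version, the number after `r` the minor version and the number after `p` the development patch, and that the minor version restarts at 0 after a major increment; none of this is fixed by the text, and all function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# vXXXrXXX for production releases, vXXXrXXXpXXX for development releases.
TAG_PATTERN = re.compile(r"^v(\d+)r(\d+)(?:p(\d+))?$")


class ReleaseTag(NamedTuple):
    major: int
    minor: int
    patch: Optional[int]  # None for production releases


def parse_tag(tag):
    """Split a release tag into its numeric components."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a valid release tag: {tag!r}")
    major, minor, patch = match.groups()
    return ReleaseTag(int(major), int(minor), None if patch is None else int(patch))


def are_compatible(service_tag, client_tag):
    """Service and client interoperate if they share the major version."""
    return parse_tag(service_tag).major == parse_tag(client_tag).major


def next_tag(tag, interface_changed):
    """Next production tag: bump major on interface changes, else minor."""
    current = parse_tag(tag)
    if interface_changed:
        # Assumption: the minor version restarts at 0.
        return f"v{current.major + 1}r0"
    return f"v{current.major}r{current.minor + 1}"


print(are_compatible("v3r1", "v3r4p1"))
print(next_tag("v3r1", interface_changed=True))
```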

Discussion points

The main open question is whether to make a single distribution of all the DIRAC software or several distributions, possibly with different versions, for different installation environments. The different environments are: WMS servers, WN installations, VO-boxes, client installations and possibly others.

We agreed that even if several distributions are desirable, they should all carry the same common release tag; there will be no separate subsystem releases with independent tags/versions.

Single distribution

The advantage of a single distribution is the simplicity of building it (just one build) and installing it (one installation procedure). The disadvantage is that more software than necessary can be installed in a given environment. In this approach it is possible to have different distribution versions installed in different environments, which can create confusion about the compatibility of different versions of services and clients. This confusion can be avoided with the versioning convention described above.

Subsystem distributions

Several distributions avoid installing unnecessary software in a given environment. It was also mentioned that this can help to spot unnecessary dependencies between packages. However, this can also be done without building subsystem distributions.

The size of the distribution is not an issue ( about 12 MB currently ).

The discussions indicated the need to include the Python interpreter in the DIRAC distribution in order not to depend on variations between local Python installations. The version of the Python interpreter to be included should be the same as the current Python version in the Application Area.

An open point is whether to include binaries compiled for different platforms in a single distribution or to have platform-dependent distributions. Having all the platform-dependent binaries in one distribution will increase the size but will simplify the installation procedure.

Tools for building distributions

Installation

Installation is done in several steps:

  • deciding on the version of DIRAC to be installed;
  • downloading the DIRAC distribution according to the hosting platform;
  • installing the software;
  • configuring the DIRAC installation.

The procedure should be as transparent and simple as it is now, with installation by just one command (script). This script will have to be updated to reflect possible changes in the way the DIRAC distribution is built. For a fresh installation, a simple setup providing reasonable default values is done. Further configuration is done by editing configuration files.

The update of the DIRAC installation is possible without the need for additional setup.
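The step of choosing a distribution according to the hosting platform can be sketched as follows. The platform label format and the file name pattern are purely illustrative assumptions, as is the `single_distribution` switch, which mirrors the open point on whether all platform binaries go into one tar file.

```python
import platform


def platform_label():
    """Label of the hosting platform, e.g. 'linux-x86_64' (assumed format)."""
    return f"{platform.system()}-{platform.machine()}".lower()


def distribution_name(version, label=None, single_distribution=False):
    """File name of the distribution tar file to download.

    With single_distribution=True the same file is used on every platform;
    otherwise the platform label is appended to the name.
    """
    if single_distribution:
        return f"DIRAC-{version}.tar.gz"
    return f"DIRAC-{version}-{label or platform_label()}.tar.gz"


# An explicit label keeps the example reproducible on any machine.
print(distribution_name("v3r0", label="linux-x86_64"))
print(distribution_name("v3r0", single_distribution=True))
```

With a single distribution the installation script needs no platform detection at all, which is the simplification mentioned in the open point above.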

Software frameworks and rules

Convener: Ricardo Graciani

Python version

Python related topics to discuss:

  • Which version of Python will DIRAC3 be "certified" to run on? Python 2.4 is more widespread than 2.5, but the latter is the newest version and has a longer lifetime.
  • Will Python be shipped together with DIRAC3?

DISET Service Framework

Basic Utilities

See the proposal of the multilayer configuration structure.

See the proposal of the logging system.

See the proposal of the services framework.

See the proposed changes for current errors reported to the logger.

See the proposed changes for current fatal errors reported to the logger.

Coding Rules

See the proposal of the DIRAC Coding rules document.

See the proposal of the DIRAC typical package organization.

Testing

Workload Management

Convener: Stuart Paterson

WMS

See the Discussion Summary in the attached document

See DB discussion proposal https://ruben.ecm.ub.es/wiki/index.php/WMS_DataBase

Discussion Points

  • How to proceed with the implementation of a job prioritization mechanism.

Data Management

Convener: Andrew C. Smith

See the December Discussion Slides/Minutes in the attached document

A document containing the Data Management Proposal, with a fuller discussion of the points raised and of the subsequent discussions, is available.

Discussion Points

Tasks

  • Transfer Database
    • test database set up
    • Transfer DB Service/Client (extension of requestDB)
    • multi-threaded FTS agent (reuse of code)
    • monitoring?
  • Transfer integrity checks
    • minimal development
  • Dataset prototype
    • script to create links in LFC
    • tools to dereference datasets
    • database to store dataset metadata
    • web display of created datasets
  • Transfer Weather Service
    • simple client development
    • server deployment (still need to define what is wanted)
  • Data Integrity Checks (Marianne)
    • SE-> LFC
    • Integration of integrity checks in the Data Management Console
  • Data Management Console
    • Integrate Auto Data Transfer Database CLI
  • LCG externals
    • Into CVS
    • Perhaps use of GFAL directly in SRM class.

Monitoring and Accounting

How to use the monitoring system

from DIRAC import gMonitor
gMonitor.registerActivity(name, description, category, unit, operation)
gMonitor.addMark(name, x)

The 'name' argument to registerActivity is the internal identifier of the activity; the same name must be passed to addMark when adding marks. The 'description' is used as the plot title on the monitoring page. The 'category' groups activities of a certain type together when producing plots. The 'unit' is the unit of the recorded values. The 'operation' argument determines how the marks are treated. The possible values are:

OP_MEAN = "mean"

OP_ACUM = "acum"

OP_SUM = "sum"

OP_RATE = "rate"

For example, to monitor the number of banned sites over time, you need the following code:

   
gMonitor.registerActivity("BannedSites", "Banned Sites", "SAMAgent", "Banned Sites", gMonitor.OP_MEAN)
gMonitor.addMark("BannedSites", 10)
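The page does not spell out how each operation treats the marks. The sketch below is one plausible reading, not the gMonitor implementation: marks are grouped into fixed time bins, "mean" averages the marks in a bin, "sum" adds them up, "acum" keeps a running total across bins and "rate" divides the bin sum by the bin length. The function `aggregate`, the `bin_length` parameter and these semantics are all assumptions.

```python
from collections import defaultdict

OP_MEAN, OP_ACUM, OP_SUM, OP_RATE = "mean", "acum", "sum", "rate"


def aggregate(marks, operation, bin_length=60):
    """Combine (timestamp_seconds, value) marks into one value per time bin."""
    bins = defaultdict(list)
    for timestamp, value in marks:
        bin_start = int(timestamp // bin_length) * bin_length
        bins[bin_start].append(value)

    result = {}
    running_total = 0
    for bin_start in sorted(bins):
        values = bins[bin_start]
        if operation == OP_MEAN:
            result[bin_start] = sum(values) / len(values)
        elif operation == OP_SUM:
            result[bin_start] = sum(values)
        elif operation == OP_ACUM:
            running_total += sum(values)
            result[bin_start] = running_total
        elif operation == OP_RATE:
            result[bin_start] = sum(values) / bin_length
        else:
            raise ValueError(f"unknown operation: {operation!r}")
    return result


# Two marks fall in the first minute, one in the second.
marks = [(0, 10), (30, 20), (60, 6)]
for operation in (OP_MEAN, OP_SUM, OP_ACUM, OP_RATE):
    print(operation, aggregate(marks, operation))
```

Under this reading, OP_MEAN suits a quantity such as the number of banned sites, where several samples in one bin should be averaged rather than added.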

Convener: Ricardo Graciani

See the Discussion Summary in the attached document

Production Management

Convener: Joel Closier

Console

Tasks

  • remote repository for workflow and production
  • possibility to modify some parameters at the level of a step and not only at the level of workflow
  • show the association between a production template and the workflow used to create this template
  • web interface to display the configuration of a production (configuration, event type, program version, ..)
  • put source in CVS
  • migration to python

Status

  • prototype in preparation with mysql as backend.
  • job part is dropped.
  • not needed for the time being: instantiation of jobs with input data.
  • submitprod / expandprod / getprod / publishprod / listprod / getpath
  • path like : simworkflow (5 steps for example) / dc06-v1 (modify the version of pgm)/ 10001001(change the number of event /evttype)

Production

Tasks

  • publication of information: what should the web interface display?
  • form to make the request for physics production.
  • possibility to find the workflow used for the template

DIRAC tools:

  • logging of actions (ban/allow site, modify filter, ...): implement a logging facility for the dirac-admin command that records who performed an action, what the action was, when it was performed and why.
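A minimal sketch of such an action log is given below, assuming one JSON record per action with the four fields who, what, why and when. The function names, the record format and the example values (user name, site name, reason) are hypothetical; how dirac-admin would obtain the user identity is left open.

```python
import json
from datetime import datetime, timezone


def make_action_record(who, what, why, when=None):
    """Build one record answering who/what/why/when for an admin action."""
    when = when or datetime.now(timezone.utc)
    return {"who": who, "what": what, "why": why, "when": when.isoformat()}


def append_record(log_lines, record):
    """Append the record as one JSON line; log_lines stands in for a log file."""
    log_lines.append(json.dumps(record, sort_keys=True))


log_lines = []
record = make_action_record(
    who="some_admin",                  # placeholder user name
    what="ban site LCG.Example.org",   # placeholder site name
    why="placeholder reason",
    when=datetime(2007, 1, 15, 12, 0, tzinfo=timezone.utc),
)
append_record(log_lines, record)
print(log_lines[0])
```

One JSON object per line keeps the log easy to filter by user or by action when reviewing who banned or allowed a site.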

Processing DB

Tools:

  • way to modify a filter
  • how to create a new transformation without using the file already processed ->
  • how to resubmit a job?
  • how to submit jobs to a list of sites?
  • registration of all files with a lifetime -> registration by default
  • how to flush a transformation

Browsing:

  • view: how many jobs were created? What is the status of each file?
  • association between files and transformations.

Operation

  • two instances of processingDB for test and production
  • Any file registration by default

Production Repository Service

Proposal for the Production Repository Service architecture and interface.

DIRAC Interfaces

Convener: Andrei Tsaregorodtsev

Other Tasks

Topic attachments
Attachment | History | Size | Date | Who | Comment
CS2.pdf | r1 | 76.5 K | 2007-01-26 15:49 | UnknownUser |
DIRAC3-DataManagementProposal.doc | r1 | 588.0 K | 2007-01-30 11:16 | AndrewCSmith | DIRAC3 Data Management Proposal
DIRAC_CS_schema.pdf | r1 | 89.3 K | 2007-04-05 19:35 | UnknownUser | DIRAC3 CS schema proposal
DIRAC_Package_Structure.jpg | r1 | 50.5 K | 2007-01-23 11:33 | UnknownUser | Proposed DIRAC Tree
DIRAC_Package_organization-3.pdf | r1 | 8.2 K | 2007-01-22 19:02 | UnknownUser | DIRAC package organization proposal
Dirac_Coding_Conventions.pdf | r1 | 11.7 K | 2007-01-22 19:01 | UnknownUser | DIRAC Coding rules proposal
LoggerProposal.pdf | r1 | 84.6 K | 2007-01-30 17:38 | UnknownUser |
Production_Repository_26012007.ppt | r2 r1 | 337.5 K | 2007-01-26 13:52 | UnknownUser | Production Repository Service description
ServicesFrameworkProposal.pdf | r1 | 43.1 K | 2007-01-30 18:46 | UnknownUser |
dirac3-wms.pdf | r1 | 206.1 K | 2007-01-15 10:53 | UnknownUser | Summary of the DIRAC3 Workload Management discussion
Topic revision: r22 - 2008-09-30 - AndrewCSmith
 