Difference: DiracProject (1 vs. 29)

Revision 29 2011-06-22 - AndresAeschlimann

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"
-- Main.paterson - 08 Dec 2006
Line: 40 to 40
 

Docs

Changed:
<
<
>
>
 

Monitoring

Revision 28 2009-11-06 - AdriaCasajus

Line: 24 to 24
 
Added:
>
>
 

WMS

Revision 27 2009-08-04 - FlorianFeldhaus

Line: 18 to 18
 
Added:
>
>
 

Documentation

Revision 26 2009-06-08 - JoelClosier

Line: 22 to 22
 

Documentation

Added:
>
>
 

WMS

Revision 25 2009-02-16 - AdriaCasajus

Line: 25 to 25
 

WMS

Changed:
<
<
>
>
 

HOWTOs

Revision 24 2009-02-16 - RicardoGraciani

Line: 26 to 26
 

WMS

Added:
>
>
 

HOWTOs

Revision 23 2009-02-12 - AdriaCasajus

Line: 19 to 19
 
Added:
>
>

Documentation

WMS

 

HOWTOs

Revision 22 2009-02-12 - AdriaCasajus

Line: 19 to 19
 
Added:
>
>

HOWTOs

 

Links to related pages

Docs

Revision 21 2009-01-21 - AndreiTsaregorodtsev

Line: 30 to 30
 
META FILEATTACHMENT attr="" autoattached="1" comment="DIRAC3 Task List (presented at PASTE Meeting 05/12/06)" date="1165568969" name="DIRAC3TaskList-draft1.doc" path="DIRAC3TaskList-draft1.doc" size="53248" user="Main.paterson" version="1"
META FILEATTACHMENT attr="" autoattached="1" comment="Comments to Andrei's proposal" date="1144742121" name="DIRAC3_States.pdf" path="DIRAC3_States.pdf" size="39874" user="rgracian" version="1.1"
META FILEATTACHMENT attr="" autoattached="1" comment="DIRAC Job states proposal" date="1143498291" name="DIRAC_Job_States.pdf" path="DIRAC_Job_States.pdf" size="471379" user="atsareg" version="1.1"
Added:
>
>
META FILEATTACHMENT attachment="DIRAC_Pilots_Note.pdf" attr="" comment="DIRAC Pilot Framework description" date="1232550526" name="DIRAC_Pilots_Note.pdf" path="DIRAC_Pilots_Note.pdf" size="442090" stream="DIRAC_Pilots_Note.pdf" user="Main.AndreiTsaregorodtsev" version="1"

Revision 20 2008-11-18 - AndreiTsaregorodtsev

Line: 17 to 17
 
Added:
>
>
 

Links to related pages

Docs

Revision 19 2008-11-07 - JoelClosier

Line: 14 to 14
 

DIRAC 3

Added:
>
>
 

Revision 18 2008-11-06 - AndreiTsaregorodtsev

Line: 8 to 8
  This is the DIRAC project page, which contains the current list of DIRAC tasks and other materials needed for the project activity.
Changed:
<
<

Links to related pages

Docs

Monitoring

>
>
 

Running DIRAC on a Site without WAN connectivity

DIRAC 3

Line: 23 to 17
 
Changed:
<
<

DIRAC Tasks

General project organization

Task Description Responsible Status Savannah
Create a new CVS repository and define the high level directory structure J.C., A.T. In progress Link
Develop the DIRAC3 release building tools J.C. In progress Link

Framework and coding rules

Task Description Responsible Status Savannah
Configuration Service with arbitrary hierarchical structure of the parameter space R.G., A.C. In progress Link
Logger service to collect critical messages from the DIRAC distributed system R.G., A.C. In progress Link
Definition of the DIRAC3 services framework R.G., A.C. In progress Link
Definition of the DIRAC3 agents framework A.T. In progress Link
DIRAC coding rules R.G.,A.T. In progress Link

Workload Management System

Task Description Responsible Status Savannah

Data Management System

Task Description Responsible Status Savannah

Production Management System

Task Description Responsible Status Savannah

Configuration, Accounting, Monitoring, Bookkeeping

Task Description Responsible Status Savannah

Interfaces

>
>

Links to related pages

Docs

Monitoring

 
Deleted:
<
<
Task Description Responsible Status Savannah
 
META FILEATTACHMENT attr="" autoattached="1" comment="DIRAC3 Task List (presented at PASTE Meeting 05/12/06)" date="1165568969" name="DIRAC3TaskList-draft1.doc" path="DIRAC3TaskList-draft1.doc" size="53248" user="Main.paterson" version="1"
META FILEATTACHMENT attr="" autoattached="1" comment="Comments to Andrei's proposal" date="1144742121" name="DIRAC3_States.pdf" path="DIRAC3_States.pdf" size="39874" user="rgracian" version="1.1"
META FILEATTACHMENT attr="" autoattached="1" comment="DIRAC Job states proposal" date="1143498291" name="DIRAC_Job_States.pdf" path="DIRAC_Job_States.pdf" size="471379" user="atsareg" version="1.1"

Revision 17 2008-06-21 - RicardoGraciani

Line: 11 to 11
 

Links to related pages

Docs

Changed:
<
<
>
>
 

Monitoring

Running DIRAC on a Site without WAN connectivity

Line: 58 to 59
 

Interfaces

Task Description Responsible Status Savannah
Deleted:
<
<
META FILEATTACHMENT attr="" autoattached="1" comment="DIRAC Job states proposal" date="1143498291" name="DIRAC_Job_States.pdf" path="DIRAC_Job_States.pdf" size="471379" user="atsareg" version="1.1"
META FILEATTACHMENT attr="" autoattached="1" comment="Comments to Andrei's proposal" date="1144742121" name="DIRAC3_States.pdf" path="DIRAC3_States.pdf" size="39874" user="rgracian" version="1.1"
 
META FILEATTACHMENT attr="" autoattached="1" comment="DIRAC3 Task List (presented at PASTE Meeting 05/12/06)" date="1165568969" name="DIRAC3TaskList-draft1.doc" path="DIRAC3TaskList-draft1.doc" size="53248" user="Main.paterson" version="1"
Added:
>
>
META FILEATTACHMENT attr="" autoattached="1" comment="Comments to Andrei's proposal" date="1144742121" name="DIRAC3_States.pdf" path="DIRAC3_States.pdf" size="39874" user="rgracian" version="1.1"
META FILEATTACHMENT attr="" autoattached="1" comment="DIRAC Job states proposal" date="1143498291" name="DIRAC_Job_States.pdf" path="DIRAC_Job_States.pdf" size="471379" user="atsareg" version="1.1"

Revision 16 2008-02-20 - AdriaCasajus

Line: 20 to 20
 
Added:
>
>
 

DIRAC Tasks

Revision 15 2008-02-12 - AdriaCasajus

Line: 19 to 19
 

DIRAC 3

Added:
>
>
 

DIRAC Tasks

Revision 14 2007-02-12 - AndreiTsaregorodtsev

Line: 25 to 25
 

General project organization

Task Description Responsible Status Savannah
Changed:
<
<
Create a new CVS repository and define the high level directory structure J.C., A.T. In progress Link
>
>
Create a new CVS repository and define the high level directory structure J.C., A.T. In progress Link
Develop the DIRAC3 release building tools J.C. In progress Link
 

Framework and coding rules

Changed:
<
<
Task Description Status
>
>
Task Description Responsible Status Savannah
Configuration Service with arbitrary hierarchical structure of the parameter space R.G., A.C. In progress Link
Logger service to collect critical messages from the DIRAC distributed system R.G., A.C. In progress Link
Definition of the DIRAC3 services framework R.G., A.C. In progress Link
Definition of the DIRAC3 agents framework A.T. In progress Link
DIRAC coding rules R.G.,A.T. In progress Link
 

Workload Management System

Changed:
<
<
Task Description Status
>
>
Task Description Responsible Status Savannah
 

Data Management System

Changed:
<
<
Task Description Status
>
>
Task Description Responsible Status Savannah
 

Production Management System

Changed:
<
<
Task Description Status
>
>
Task Description Responsible Status Savannah
 

Configuration, Accounting, Monitoring, Bookkeeping

Changed:
<
<
Task Description Status
>
>
Task Description Responsible Status Savannah
 

Interfaces

Changed:
<
<
Task Description Status
>
>
Task Description Responsible Status Savannah
 
META FILEATTACHMENT attr="" autoattached="1" comment="DIRAC Job states proposal" date="1143498291" name="DIRAC_Job_States.pdf" path="DIRAC_Job_States.pdf" size="471379" user="atsareg" version="1.1"
META FILEATTACHMENT attr="" autoattached="1" comment="Comments to Andrei's proposal" date="1144742121" name="DIRAC3_States.pdf" path="DIRAC3_States.pdf" size="39874" user="rgracian" version="1.1"

Revision 13 2007-02-06 - RicardoGraciani

Line: 15 to 15
 

Monitoring

Running DIRAC on a Site without WAN connectivity

Changed:
<
<
>
>
 

DIRAC 3

Revision 12 2007-02-06 - AndreiTsaregorodtsev

Line: 24 to 24
 

General project organization

Changed:
<
<
Task Description Status
>
>
Task Description Responsible Status Savannah
Create a new CVS repository and define the high level directory structure J.C., A.T. In progress Link
 

Framework and coding rules

Revision 11 2007-02-06 - RicardoGraciani

Line: 11 to 11
 

Links to related pages

Docs

Changed:
<
<
>
>
 

Monitoring

Added:
>
>

Running DIRAC on a Site without WAN connectivity

 

DIRAC 3

Revision 10 2007-02-06 - AndreiTsaregorodtsev

Line: 19 to 19
 
Changed:
<
<

Completed Tasks

>
>

DIRAC Tasks

General project organization

Task Description Status

Framework and coding rules

 
Task Description Status
Deleted:
<
<
State precisely the goals and scope of the DIRAC project. Ensure that the LHCb collaboration supports them. Completed
Define the scope of DIRAC in relation with the other Grid activities (LCG, EGEE, etc.) and other LHCb/ATLAS projects like Ganga. Completed
A solution or policy for dealing with expiring proxies has to be found. Completed
A quota system or limitation policy should be implemented for the sandboxes. Completed
Use LFC replicas at Tier-1s and use new LFC methods to improve performance. Completed
Develop a mechanism by which the CS can be regularly checked and populated with reference to the BDII and VOMS. Completed
Implement accounting for user jobs. Completed
Implement a provisional mechanism for user unique identification until supported by VOMS (using DIRAC CS). Username should be the AFS username for users who have one. Completed
Develop specialised XML-RPC methods for large data transfers. Completed
Change the nomenclature for “Production”, “Production request” in the Production Console. Completed
Design a simple reliable framework for prioritising jobs in the WMS queue. Completed
Released code should never be overwritten. Use a patch release instead. Completed
Use Savannah effectively as the bug reporting system. Completed
VOMS groups and roles should be defined. Users should get at least one group/role attributed. Utilise VOMS authentication as soon as deployed by DIRAC services and clients. Completed
Complete the migration to the new CS interface and make CS parameters on command line consistent with .ini files. Completed
Better define Data Management terminology in accordance with Grid terminology, and improve code self-documentation (meaningful naming). Completed
The grid storage approach for sandbox data should be explored. Completed
Provide access and documentation to workflows (including proper versioning when needed). Completed

High Priority Tasks

Savannah Task Description Responsible Status
Task Not Ready More formal project management should be introduced. This should include a work breakdown into sub-projects and identification of those responsible. Unassigned Incomplete
Task Not Ready Bulk submission functionality should be implemented together with the job splitting facility. Unassigned Incomplete
Task Not Ready Review the set of job states in order to clearly separate DIRAC states from Application state and implement them in all the relevant components. Unassigned Incomplete
Task Not Ready Define clearly understandable and unambiguous job state strings. Unassigned Incomplete
Task Not Ready Monitor the causes of stalled jobs (Pilot monitoring and error categorization). Unassigned Incomplete
Task Not Ready Provide the possibility of instrumenting applications, in coordination with the Core Software group, in order to give more details on their progress. Unassigned Incomplete
Task Not Ready For non-critical information, resume the effort on MonaLisa monitoring for global and service monitoring. Unassigned Incomplete
Task Not Ready Define a flexible policy for deletion of jobs from the Job DB as afterwards no traceback is possible (policy for reconstruction jobs required). Unassigned Incomplete
Task Not Ready Pursue experimentation with the agent optimisation; too much coupling should be avoided between the DIRAC WMS and the state of resources on LCG. Unassigned Incomplete
Task Not Ready An internal LHCb policy (e.g. using long queues when no production is running) will certainly help. Unassigned Incomplete
Task Not Ready Keep WMS optimization approaches compatible with current LCG policies (demonstrate these approaches do not introduce security holes in the Grid). Unassigned Incomplete
Task Not Ready Enforce a policy for user files registration (logical and physical namespace). Unassigned Incomplete
Task Not Ready Re-structure the CVS repository into consistent packages (sub-projects), allowing easier code maintenance and distribution. Unassigned Incomplete
Task Not Ready Urgently set up a well structured, complete and maintained set of documentation web pages for DIRAC. Unassigned Incomplete
Task Not Ready Follow python and compiler versions supported by LCG. Consider providing a standard version of python as part of the DIRAC distribution. Unassigned Incomplete
Task Not Ready Revise the task queue optimisation (classes of jobs); there are concerns about the number of queues becoming too large. Unassigned Incomplete
Task Not Ready Use a tool for handling dependencies, use external packages common with the Core SW. Unassigned Incomplete
Task Not Ready Pursue usage of a messaging service included for monitoring (VO-box failover mechanism, API for communication). Unassigned Incomplete

Medium Priority Tasks

Savannah Task Description Responsible Status
Task Not Ready The new DISET security model will need to be certified by the Grid specialists before the system is adapted to the new model (push for certification). Unassigned Incomplete
Task Not Ready Demonstrate the job-wrapper able to run applications not based on Gaudi (e.g. ROOT, minimisation programs). Unassigned Incomplete
Task Not Ready Simplify the Storage Element architecture and investigate using GFAL as base library (SRM client library) Unassigned Incomplete
Task Not Ready Follow lcg-utils developments in case additional functionality is provided and make sure compatibility is ensured with LFC implementation. Unassigned Incomplete
Task Not Ready Implement accounting for storage and provide tools to cross-check with the Grid accounting. Unassigned Incomplete
Task Not Ready Pursue (with low priority compared to stability of the system) research on direct CE submission and resource reservation (e.g. CREAM). Unassigned Incomplete
Task Not Ready Investigate various technologies for job prioritization (gPBox, glexec, MAUI…). Unassigned Incomplete
Task Not Ready Improve the current DISET security schema down to object manipulation. Unassigned Incomplete
Task Not Ready Define the set of roles for mapping LHCb users and define users’ group/role in VOMS. Unassigned Incomplete
Task Not Ready Define an error reporting framework to be used throughout DIRAC (e.g. global sys logger for all catchable errors). Unassigned Incomplete
Task Not Ready Improve the control and monitoring of services (e.g. MonaLisa / DIRAC Monitoring) including visual aspects of the monitoring pages. Unassigned Incomplete

Low Priority Tasks

Savannah Task Description Responsible Status
Task Not Ready Data management user and system documentation must be improved urgently. Unassigned Incomplete
Task Not Ready Include information on the data transfer to the job status, in case transfer is asynchronous. Unassigned Incomplete
Task Not Ready Provide a transfer monitoring page independent of the job monitoring and use the MonaLisa monitoring in order to visualise data traffic. Unassigned Incomplete
Task Not Ready Sort out the treatment of failed jobs for accounting (e.g. missing steps). Unassigned Incomplete
Task Not Ready Decouple Production Console workflow definitions from DIRAC releases. Unassigned Incomplete
Task Not Ready Provide better control possibilities on productions. Unassigned Incomplete
Task Not Ready Implement a remote Job repository (leverage from Ganga/AMGA experience). Unassigned Incomplete
Task Not Ready Decouple the Production Template repository from the Job repository. Unassigned Incomplete
Task Not Ready Improve control on job submission from the Production Console (on-off). Unassigned Incomplete
Task Not Ready Use abstract interface classes for service tools when multiple implementations are possible (e.g. database access). Unassigned Incomplete
Task Not Ready Develop a test suite, for bug fixes as well as global functionality and performance checks. Unassigned Incomplete
Task Not Ready Implement release compatibility checks. Unassigned Incomplete

Other

  • Allow installed Applications software to be removed / uninstalled through the job wrapper (either because obsolete or because badly built/installed)
  • Specify the procedure for submitting test jobs to the production WMS in order to test a new workflow or software release.
  • Ensure individual users can use and define workflows for private jobs (i.e. not using production submission tools)
  • Provide the DIRAC API for most commonly used user platforms, at least those supported by Ganga
  • Resubmitted jobs should be run with a different jobID
  • Provide better navigation capabilities between editors in the Production Console
  • Provide a Windows version of the Production console
  • Quickly provide a remote repository and improve the performance on X11 for the Production Console

Request For Comments

  • Production service employing cycle-scavenging paradigm (potentially Windows and Linux systems)
  • Interactive web pages for job management operations (job reschedule, delete, etc.)
  • Develop a reliable message passing system based on XML-RPC as an alternative to Jabber
  • Perform tests with the new Jabber (scalability, security, etc.). Alternatively develop an adequate message passing mechanism.
  • Exploring RSS feeds as a job monitoring tool
 
Added:
>
>

Workload Management System

Task Description Status

Data Management System

Task Description Status

Production Management System

Task Description Status

Configuration, Accounting, Monitoring, Bookkeeping

Task Description Status

Interfaces

Task Description Status
 
META FILEATTACHMENT attr="" autoattached="1" comment="DIRAC Job states proposal" date="1143498291" name="DIRAC_Job_States.pdf" path="DIRAC_Job_States.pdf" size="471379" user="atsareg" version="1.1"
META FILEATTACHMENT attr="" autoattached="1" comment="Comments to Andrei's proposal" date="1144742121" name="DIRAC3_States.pdf" path="DIRAC3_States.pdf" size="39874" user="rgracian" version="1.1"

Revision 9 2007-01-15 - AndreiTsaregorodtsev

Line: 15 to 15
 

Monitoring

Added:
>
>

DIRAC 3

 
Changed:
<
<

DIRAC 3 Task List

Below is the list of tasks, based on the DIRAC Review recommendations, that are necessary for DIRAC3 development.

>
>
 

Completed Tasks

Task Description Status

Revision 8 2006-12-08 - unknown

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"
Changed:
<
<
-- Main.atsareg - 27 Mar 2006
>
>
-- Main.paterson - 08 Dec 2006
 

DIRAC Project Work page

Line: 16 to 16
 
Added:
>
>

DIRAC 3 Task List

Below is the list of tasks, based on the DIRAC Review recommendations, that are necessary for DIRAC3 development.

Completed Tasks

Task Description Status
State precisely the goals and scope of the DIRAC project. Ensure that the LHCb collaboration supports them. Completed
Define the scope of DIRAC in relation with the other Grid activities (LCG, EGEE, etc.) and other LHCb/ATLAS projects like Ganga. Completed
A solution or policy for dealing with expiring proxies has to be found. Completed
A quota system or limitation policy should be implemented for the sandboxes. Completed
Use LFC replicas at Tier-1s and use new LFC methods to improve performance. Completed
Develop a mechanism by which the CS can be regularly checked and populated with reference to the BDII and VOMS. Completed
Implement accounting for user jobs. Completed
Implement a provisional mechanism for user unique identification until supported by VOMS (using DIRAC CS). Username should be the AFS username for users who have one. Completed
Develop specialised XML-RPC methods for large data transfers. Completed
Change the nomenclature for “Production”, “Production request” in the Production Console. Completed
Design a simple reliable framework for prioritising jobs in the WMS queue. Completed
Released code should never be overwritten. Use a patch release instead. Completed
Use Savannah effectively as the bug reporting system. Completed
VOMS groups and roles should be defined. Users should get at least one group/role attributed. Utilise VOMS authentication as soon as deployed by DIRAC services and clients. Completed
Complete the migration to the new CS interface and make CS parameters on command line consistent with .ini files. Completed
Better define Data Management terminology in accordance with Grid terminology, and improve code self-documentation (meaningful naming). Completed
The grid storage approach for sandbox data should be explored. Completed
Provide access and documentation to workflows (including proper versioning when needed). Completed

High Priority Tasks

Savannah Task Description Responsible Status
Task Not Ready More formal project management should be introduced. This should include a work breakdown into sub-projects and identification of those responsible. Unassigned Incomplete
Task Not Ready Bulk submission functionality should be implemented together with the job splitting facility. Unassigned Incomplete
Task Not Ready Review the set of job states in order to clearly separate DIRAC states from Application state and implement them in all the relevant components. Unassigned Incomplete
Task Not Ready Define clearly understandable and unambiguous job state strings. Unassigned Incomplete
Task Not Ready Monitor the causes of stalled jobs (Pilot monitoring and error categorization). Unassigned Incomplete
Task Not Ready Provide the possibility of instrumenting applications, in coordination with the Core Software group, in order to give more details on their progress. Unassigned Incomplete
Task Not Ready For non-critical information, resume the effort on MonaLisa monitoring for global and service monitoring. Unassigned Incomplete
Task Not Ready Define a flexible policy for deletion of jobs from the Job DB as afterwards no traceback is possible (policy for reconstruction jobs required). Unassigned Incomplete
Task Not Ready Pursue experimentation with the agent optimisation; too much coupling should be avoided between the DIRAC WMS and the state of resources on LCG. Unassigned Incomplete
Task Not Ready An internal LHCb policy (e.g. using long queues when no production is running) will certainly help. Unassigned Incomplete
Task Not Ready Keep WMS optimization approaches compatible with current LCG policies (demonstrate these approaches do not introduce security holes in the Grid). Unassigned Incomplete
Task Not Ready Enforce a policy for user files registration (logical and physical namespace). Unassigned Incomplete
Task Not Ready Re-structure the CVS repository into consistent packages (sub-projects), allowing easier code maintenance and distribution. Unassigned Incomplete
Task Not Ready Urgently set up a well structured, complete and maintained set of documentation web pages for DIRAC. Unassigned Incomplete
Task Not Ready Follow python and compiler versions supported by LCG. Consider providing a standard version of python as part of the DIRAC distribution. Unassigned Incomplete
Task Not Ready Revise the task queue optimisation (classes of jobs); there are concerns about the number of queues becoming too large. Unassigned Incomplete
Task Not Ready Use a tool for handling dependencies, use external packages common with the Core SW. Unassigned Incomplete
Task Not Ready Pursue usage of a messaging service included for monitoring (VO-box failover mechanism, API for communication). Unassigned Incomplete

Medium Priority Tasks

Savannah Task Description Responsible Status
Task Not Ready The new DISET security model will need to be certified by the Grid specialists before the system is adapted to the new model (push for certification). Unassigned Incomplete
Task Not Ready Demonstrate the job-wrapper able to run applications not based on Gaudi (e.g. ROOT, minimisation programs). Unassigned Incomplete
Task Not Ready Simplify the Storage Element architecture and investigate using GFAL as base library (SRM client library) Unassigned Incomplete
Task Not Ready Follow lcg-utils developments in case additional functionality is provided and make sure compatibility is ensured with LFC implementation. Unassigned Incomplete
Task Not Ready Implement accounting for storage and provide tools to cross-check with the Grid accounting. Unassigned Incomplete
Task Not Ready Pursue (with low priority compared to stability of the system) research on direct CE submission and resource reservation (e.g. CREAM). Unassigned Incomplete
Task Not Ready Investigate various technologies for job prioritization (gPBox, glexec, MAUI…). Unassigned Incomplete
Task Not Ready Improve the current DISET security schema down to object manipulation. Unassigned Incomplete
Task Not Ready Define the set of roles for mapping LHCb users and define users’ group/role in VOMS. Unassigned Incomplete
Task Not Ready Define an error reporting framework to be used throughout DIRAC (e.g. global sys logger for all catchable errors). Unassigned Incomplete
Task Not Ready Improve the control and monitoring of services (e.g. MonaLisa / DIRAC Monitoring) including visual aspects of the monitoring pages. Unassigned Incomplete

Low Priority Tasks

Savannah Task Description Responsible Status
Task Not Ready Data management user and system documentation must be improved urgently. Unassigned Incomplete
Task Not Ready Include information on the data transfer to the job status, in case transfer is asynchronous. Unassigned Incomplete
Task Not Ready Provide a transfer monitoring page independent of the job monitoring and use the MonaLisa monitoring in order to visualise data traffic. Unassigned Incomplete
Task Not Ready Sort out the treatment of failed jobs for accounting (e.g. missing steps). Unassigned Incomplete
Task Not Ready Decouple Production Console workflow definitions from DIRAC releases. Unassigned Incomplete
Task Not Ready Provide better control possibilities on productions. Unassigned Incomplete
Task Not Ready Implement a remote Job repository (leverage from Ganga/AMGA experience). Unassigned Incomplete
Task Not Ready Decouple the Production Template repository from the Job repository. Unassigned Incomplete
Task Not Ready Improve control on job submission from the Production Console (on-off). Unassigned Incomplete
Task Not Ready Use abstract interface classes for service tools when multiple implementations are possible (e.g. database access). Unassigned Incomplete
Task Not Ready Develop a test suite, for bug fixes as well as global functionality and performance checks. Unassigned Incomplete
Task Not Ready Implement release compatibility checks. Unassigned Incomplete

Other

  • Allow installed Applications software to be removed / uninstalled through the job wrapper (either because obsolete or because badly built/installed)
  • Specify the procedure for submitting test jobs to the production WMS in order to test a new workflow or software release.
  • Ensure individual users can use and define workflows for private jobs (i.e. not using production submission tools)
  • Provide the DIRAC API for most commonly used user platforms, at least those supported by Ganga
  • Resubmitted jobs should be run with a different jobID
  • Provide better navigation capabilities between editors in the Production Console
  • Provide a Windows version of the Production console
  • Quickly provide a remote repository and improve the performance on X11 for the Production Console

Request For Comments

  • Production service employing cycle-scavenging paradigm (potentially Windows and Linux systems)
  • Interactive web pages for job management operations (job reschedule, delete, etc.)
  • Develop a reliable message passing system based on XML-RPC as an alternative to Jabber
  • Perform tests with the new Jabber (scalability, security, etc.). Alternatively develop an adequate message passing mechanism.
  • Exploring RSS feeds as a job monitoring tool
 
META FILEATTACHMENT attr="" autoattached="1" comment="DIRAC Job states proposal" date="1143498291" name="DIRAC_Job_States.pdf" path="DIRAC_Job_States.pdf" size="471379" user="atsareg" version="1.1"
META FILEATTACHMENT attr="" autoattached="1" comment="Comments to Andrei's proposal" date="1144742121" name="DIRAC3_States.pdf" path="DIRAC3_States.pdf" size="39874" user="rgracian" version="1.1"
Added:
>
>
META FILEATTACHMENT attr="" autoattached="1" comment="DIRAC3 Task List (presented at PASTE Meeting 05/12/06)" date="1165568969" name="DIRAC3TaskList-draft1.doc" path="DIRAC3TaskList-draft1.doc" size="53248" user="Main.paterson" version="1"

Revision 7 2006-12-06 - AndreiTsaregorodtsev

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"
-- Main.atsareg - 27 Mar 2006
Line: 15 to 15
 

Monitoring

Deleted:
<
<

High priority tasks

Functionality tasks

 
Changed:
<
<
Task Status Developers
1 Specification of destination sites or groups of sites. DIRAC to LCG sites matching should be defined in the CS. The group of sites, e.g. T1 sites, should be automatically selected for the LCG job submission Ongoing Andrei, Stuart
2 Job State Machine: new definitions Draft proposal is attached to the page Andrei,Stuart,Ricardo,Adria
3 Enhanced logic of data files choice for automatic job generation: site exclusion, e.g. CERN for data reconstruction. Highly correlated with task 1 Ongoing Andrei
4 Data Manager tools to define: bulk transfer request; bulk removal request; various data integrity checking and repair operations; tools to deal with files reported as problematic. Ongoing Marianne,Andrew
5 Replacing of the Bookkeeping Replica Tables by the LFC catalog. Update of the Bookkeeping Web pages and command line tools to use the LFC catalog instead of the Bookkeeping Replica Tables. Ongoing Marianne,Carmine
6 Implementation of the services common framework and migration of all the service to this new framework Ongoing Ricardo,Adria,Andrei
7 Securing services where necessary Ongoing Ricardo,Adria,Andrei
8 Various Processing Database optimizations: multithreaded Processing DB service, optimization of queries done by Transformation Agents, better TA steering (creation, starting, stopping, removal) Mostly done Andrei,Gennady

Deployment tasks

Task Status Developers
1 Conversion of the SFN protocol replicas into SRM protocol replicas in the LFC Ongoing Marianne
2 DIRAC repackaging, release procedure (several packages with one tag for the time being) Ongoing Andrei,Joel
3 Deployment of the same DIRAC software version on the USER and PRODUCTION systems. Using on-demand pilot agents for all kinds of jobs. Ongoing Gennady,Raja,Stuart,Ricardo
4 Transfer Agent to be ready for deployment on the VO-box. Ongoing Andrew
5 Tagging and releasing the stable version of the Production Tools. Deployment of the production instance of the Processing DB service and Job Repository Ongoing Joel, Gennady
6 Inclusion of the ProcDBCatalog into the production list of catalogs. Almost done, more tests needed Andrei
7 Definition and test of all the DC06 workflows. Done Joel
8 Definitions of the deployment procedure for the Request DB service and agents. Monitoring, logging, security for these components. Ongoing Andrei,Raja,Ricardo
9 Definition of the association of the T2 to T1 centers and VO-box parameters in the Configuration Service. Implementation of the site-dependent operations in a job needing communication with the VO-box Ongoing Andrew,Roberto
10 Migration to Linux of the Bookkeeping Service frontend. Putting it in the same framework as other DIRAC services Ongoing Carmine
11 Monitoring of LCG failures, success rates, etc; automatic tools to extract and report this information Ongoing Gianluca,Roberto,Ricardo

Medium priority tasks

Task Status Developers
1 LFC Catalog Client talking to multiple LFC instances. Additional logic to treat the Master LFC instance as a final source in case of read-only mirrors failure Ongoing Marianne, Juan
2 Job priority rules should be defined and enabled in the job matching mechanism Ongoing Gianluca
3 Definition and deployment of the USER job Accounting Service Ongoing Ricardo
4 Implementation of a Message Agent on the basis of the Bookkeeping agent. Ongoing Andrei

Low priority tasks

Task Status Developers
1 Job throttling for certain types. A mechanism should be put in place to disallow running more than a defined number of jobs of certain types on certain CEs. An example is stripping jobs, of which we cannot run more than the staging pools can accommodate. The numbers and sites should be specified in a special configuration section. Ongoing Andrei
2 Introduction of the methods dealing with ACL manipulation into the LFC client. The help of the LFC developers is needed. Ongoing Juan
3 Definition and implementation of the Accounting Service interface to query the information necessary for the policy decisions. Ongoing Ricardo

Pre-production service tasks

Workload Management

  1. Stress tests with “hello world” jobs through DIRAC to an LHCb-specific PPS RB (WMS). Once established we would use this RB for all subsequent tests.
  2. Running MC production jobs - storing output on local production SE & registering in production LFC (~100 simultaneous jobs; 1 output file per job; ~2 hrs per job) Running MC production jobs - storing output on local production SE & registering in production LFC (~100 simultaneous jobs; 1 output file per job; ~24 hrs per job)
  3. Running stripping jobs - using files on local production SE & storing output on local production SE & registering in production LFC (~10 simultaneous jobs; 80 i/p files per job; 1 o/p file per job;~24 hrs per job). Would also like the possibility to submit to the production CE from the PPS RB.
  4. Larger scale MC production tests - storing output on a Tier-1 production SE & registering in production LFC; need access to production batch queues (~1000 simultaneous jobs; 1 output file per job;~24 hrs per job.) Would also like the possibility to submit to the production CE from the PPS RB.
  5. Larger scale stripping jobs - using files on production SE & storing output on production SE & registering in production LFC; need access to production batch queues (~100 simultaneous jobs; 80 i/p files per job; 1 o/p file per job;~24 hrs per job)
  6. Analysis jobs - using files on production SE; need access to production batch queues (chaotic usage)

Data Management

  1. Copy & register of (~1000 ??) files into PPS LFC & SE at CERN - copy data from current CERN production SE
  2. Use of FTS to distribute these files from CERN to our Tier-1 centres participating in the PPS. Using PPS FTS, PPS SE, PPS LFC

META FILEATTACHMENT attr="" comment="DIRAC Job states proposal" date="1143498291" name="DIRAC_Job_States.pdf" path="DIRAC_Job_States.pdf" size="471379" user="atsareg" version="1.1"
META FILEATTACHMENT attr="" comment="Comments to Andrei's proposal" date="1144742120" name="DIRAC3_States.pdf" path="DIRAC3_States.pdf" size="39874" user="rgracian" version="1.1"
>
>
META FILEATTACHMENT attr="" autoattached="1" comment="DIRAC Job states proposal" date="1143498291" name="DIRAC_Job_States.pdf" path="DIRAC_Job_States.pdf" size="471379" user="atsareg" version="1.1"
META FILEATTACHMENT attr="" autoattached="1" comment="Comments to Andrei's proposal" date="1144742121" name="DIRAC3_States.pdf" path="DIRAC3_States.pdf" size="39874" user="rgracian" version="1.1"

Revision 6 2006-04-11 - RicardoGraciani

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"
-- Main.atsareg - 27 Mar 2006
Line: 72 to 72
 
  1. Use of FTS to distribute these files from CERN to our Tier-1 centres participating in the PPS. Using PPS FTS, PPS SE, PPS LFC

META FILEATTACHMENT attr="" comment="DIRAC Job states proposal" date="1143498291" name="DIRAC_Job_States.pdf" path="DIRAC_Job_States.pdf" size="471379" user="atsareg" version="1.1"
Added:
>
>
META FILEATTACHMENT attr="" comment="Comments to Andrei's proposal" date="1144742120" name="DIRAC3_States.pdf" path="DIRAC3_States.pdf" size="39874" user="rgracian" version="1.1"

Revision 5 2006-04-11 - AndreiTsaregorodtsev

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"
-- Main.atsareg - 27 Mar 2006
Line: 8 to 8
  This is the DIRAC project page which contains the current list of the DIRAC tasks and other materials necessary for the project activity.
Added:
>
>

Links to related pages

Docs

Monitoring

 

High priority tasks

Functionality tasks

Revision 4 2006-04-04 - JoelClosier

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"
-- Main.atsareg - 27 Mar 2006
Line: 25 to 25
 
Task Status Developers
1 Conversion of the SFN protocol replicas into SRM protocol replicas in the LFC Ongoing Marianne
Changed:
<
<
2 DIRAC repackaging, release procedure To be defined Andrei,Joel
>
>
2 DIRAC repackaging, release procedure (several packages with one tag for the time being) Ongoing Andrei,Joel
 
3 Deployment of the same DIRAC software version on the USER and PRODUCTION systems. Using on-demand pilot agents for all kinds of jobs. Ongoing Gennady,Raja,Stuart,Ricardo
4 Transfer Agent to be ready for deployment on the VO-box. Ongoing Andrew
5 Tagging and releasing the stable version of the Production Tools. Deployment of the production instance of the Processing DB service and Job Repository Ongoing Joel, Gennady
6 Inclusion of the ProcDBCatalog into the production list of catalogs. Almost done, more tests needed Andrei
Changed:
<
<
7 Definition and test of all the DC06 workflows. Ongoing Joel
>
>
7 Definition and test of all the DC06 workflows. Done Joel
 
8 Definitions of the deployment procedure for the Request DB service and agents. Monitoring, logging, security for these components. Ongoing Andrei,Raja,Ricardo
9 Definition of the association of the T2 to T1 centers and VO-box parameters in the Configuration Service. Implementation of the site-dependent operations in a job needing communication with the VO-box Ongoing Andrew,Roberto
10 Migration to Linux of the Bookkeeping Service frontend. Putting it in the same framework as other DIRAC services Ongoing Carmine

Revision 3 2006-03-30 - unknown

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"
-- Main.atsareg - 27 Mar 2006

Revision 2 2006-03-29 - NickBrook

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"
-- Main.atsareg - 27 Mar 2006
Line: 51 to 51
 
2 Introduction of the methods dealing with ACL manipulation into the LFC client. The help of the LFC developers is needed. Ongoing Juan
3 Definition and implementation of the Accounting Service interface to query the information necessary for the policy decisions. Ongoing Ricardo
Added:
>
>

Pre-production service tasks

Workload Management

  1. Stress tests with “hello world” jobs through DIRAC to an LHCb-specific PPS RB (WMS). Once established we would use this RB for all subsequent tests.
  2. Running MC production jobs - storing output on local production SE & registering in production LFC (~100 simultaneous jobs; 1 output file per job; ~2 hrs per job) Running MC production jobs - storing output on local production SE & registering in production LFC (~100 simultaneous jobs; 1 output file per job; ~24 hrs per job)
  3. Running stripping jobs - using files on local production SE & storing output on local production SE & registering in production LFC (~10 simultaneous jobs; 80 i/p files per job; 1 o/p file per job;~24 hrs per job). Would also like the possibility to submit to the production CE from the PPS RB.
  4. Larger scale MC production tests - storing output on a Tier-1 production SE & registering in production LFC; need access to production batch queues (~1000 simultaneous jobs; 1 output file per job;~24 hrs per job.) Would also like the possibility to submit to the production CE from the PPS RB.
  5. Larger scale stripping jobs - using files on production SE & storing output on production SE & registering in production LFC; need access to production batch queues (~100 simultaneous jobs; 80 i/p files per job; 1 o/p file per job;~24 hrs per job)
  6. Analysis jobs - using files on production SE; need access to production batch queues (chaotic usage)

Data Management

  1. Copy & register of (~1000 ??) files into PPS LFC & SE at CERN - copy data from current CERN production SE
  2. Use of FTS to distribute these files from CERN to our Tier-1 centres participating in the PPS. Using PPS FTS, PPS SE, PPS LFC
 
META FILEATTACHMENT attr="" comment="DIRAC Job states proposal" date="1143498291" name="DIRAC_Job_States.pdf" path="DIRAC_Job_States.pdf" size="471379" user="atsareg" version="1.1"

Revision 1 2006-03-28 - AndreiTsaregorodtsev

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="LHCbComputing"
-- Main.atsareg - 27 Mar 2006

DIRAC Project Work page

This is the DIRAC project page which contains the current list of the DIRAC tasks and other materials necessary for the project activity.

High priority tasks

Functionality tasks

Task Status Developers
1 Specification of destination sites or group of sites. DIRAC to LCG sites matching should be defined in the CS. The group of sites, e.g. T1 sites should be automatically selected for the LCG job submission Ongoing Andrei, Stuart
2 Job State Machine: new definitions Draft proposal is attached to the page Andrei,Stuart,Ricardo,Adria
3 Enhanced logic of data files choice for automatic job generation: site exclusion, e.g. CERN for data reconstruction. Highly correlated with task 1 Ongoing Andrei
4 Data Manager tools to define: bulk transfer request; bulk removal request; various data integrity checking and repair operations; tools to deal with files reported as problematic. Ongoing Marianne,Andrew
5 Replacing of the Bookkeeping Replica Tables by the LFC catalog. Update of the Bookkeeping Web pages and command line tools to use the LFC catalog instead of the Bookkeeping Replica Tables. Ongoing Marianne,Carmine
6 Implementation of the services common framework and migration of all the services to this new framework Ongoing Ricardo,Adria,Andrei
7 Securing services where necessary Ongoing Ricardo,Adria,Andrei
8 Various Processing Database optimizations: multithreaded Processing DB service, optimization of queries done by Transformation Agents, better TA steering ( creation, starting, stopping, removal ) Mostly done Andrei,Gennady

Deployment tasks

Task Status Developers
1 Conversion of the SFN protocol replicas into SRM protocol replicas in the LFC Ongoing Marianne
2 DIRAC repackaging, release procedure To be defined Andrei,Joel
3 Deployment of the same DIRAC software version on the USER and PRODUCTION systems. Using on-demand pilot agents for all kinds of jobs. Ongoing Gennady,Raja,Stuart,Ricardo
4 Transfer Agent to be ready for deployment on the VO-box. Ongoing Andrew
5 Tagging and releasing the stable version of the Production Tools. Deployment of the production instance of the Processing DB service and Job Repository Ongoing Joel, Gennady
6 Inclusion of the ProcDBCatalog into the production list of catalogs. Almost done, more tests needed Andrei
7 Definition and test of all the DC06 workflows. Ongoing Joel
8 Definitions of the deployment procedure for the Request DB service and agents. Monitoring, logging, security for these components. Ongoing Andrei,Raja,Ricardo
9 Definition of the association of the T2 to T1 centers and VO-box parameters in the Configuration Service. Implementation of the site-dependent operations in a job needing communication with the VO-box Ongoing Andrew,Roberto
10 Migration to Linux of the Bookkeeping Service frontend. Putting it in the same framework as other DIRAC services Ongoing Carmine
11 Monitoring of LCG failures, success rates, etc; automatic tools to extract and report this information Ongoing Gianluca,Roberto,Ricardo

Medium priority tasks

Task Status Developers
1 LFC Catalog Client talking to multiple LFC instances. Additional logic to treat the Master LFC instance as a final source in case of read-only mirrors failure Ongoing Marianne, Juan
2 Job priority rules should be defined and enabled in the job matching mechanism Ongoing Gianluca
3 Definition and deployment of the USER job Accounting Service Ongoing Ricardo
4 Implementation of a Message Agent on the basis of the Bookkeeping agent. Ongoing Andrei

Low priority tasks

Task Status Developers
1 Job throttling for certain types. A mechanism should be put in place to disallow running more than a defined number of jobs of certain types on certain CEs. An example is stripping jobs, of which we cannot run more than the staging pools can accommodate. The numbers and sites should be specified in a special configuration section. Ongoing Andrei
2 Introduction of the methods dealing with ACL manipulation into the LFC client. The help of the LFC developers is needed. Ongoing Juan
3 Definition and implementation of the Accounting Service interface to query the information necessary for the policy decisions. Ongoing Ricardo

META FILEATTACHMENT attr="" comment="DIRAC Job states proposal" date="1143498291" name="DIRAC_Job_States.pdf" path="DIRAC_Job_States.pdf" size="471379" user="atsareg" version="1.1"
 