Please refer to the updated version at: https://twiki.cern.ch/twiki/bin/view/EGEE/EGEEgLiteWorkPlans

LB and JP Workplans

  • #101

- The LB server already understands VOMS groups and is able to provide fine-grained (per-job) access control

- using VOMS information for logging data into LB is not foreseen -- a job is always "owned" by a specific user, and only this user's credentials (and those of services, e.g. RB) are accepted

- no immediate LB-side action required

  • #301

- the specification scenario should allow a many-to-many LB:RB relationship; depending on specific needs, a VO may configure a single LB server to work with multiple RBs, or vice versa.

- it's a configuration issue; LB is ready for this setup

  • #303

- LB is designed to be machine-crash-proof: data are synced to disk in a transaction-like manner, i.e. a write operation is not confirmed to the caller until the data are physically on disk.

- The on-disk data that must be considered are the plain logger spool files and the MySQL database.

- Provided the LB services are restarted with the same disk footprint and, in the case of LB server, with the same IP & hostname, the operation should resume immediately.

- estimated effort: 1 FTE month to test thoroughly whether out-of-the-box HA solutions are usable. If they are not, a higher effort (difficult to estimate) will be required for application-level HA.

- So far we do not consider this issue a true priority. Essential service failures should not be so frequent, and the Grid environment as a whole should cope with service failures.
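
The "not confirmed until physically on disk" behaviour described above can be illustrated with a minimal sketch. This is not the LB implementation; the function name durable_append and the file name used in the usage note are hypothetical, and the sketch only shows the general append-then-fsync pattern for a spool file.

```python
import os


def durable_append(path: str, record: bytes) -> None:
    """Append one record; return only after the OS reports it flushed to disk.

    Hypothetical helper for illustration, not part of LB.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(record + b"\n")
        # os.write may write fewer bytes than requested, so loop until done.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Force the data out of OS buffers before confirming to the caller.
        os.fsync(fd)
    finally:
        os.close(fd)
```

A caller would treat the normal return of durable_append("spool.log", b"event") as the confirmation; any exception means the write must not be acknowledged.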

  • #304

- Work is already in progress. A stress-test and measurement environment is set up (a standalone LB feeder, emulating the real data sources).

- Preliminary results are quite optimistic: at least 70% of the LB code should handle the 1 Mjob/day load without problems, even with a single instance of each LB component.

- Two major bottlenecks have been identified: logging events via logd (although the most productive LB event sources already use the LB proxy), and registering large collections.

- estimated effort: 2-3 FTE months to both complete the tests and work around the bottlenecks
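
To put the 1 Mjob/day target in perspective, the sustained rate can be worked out as below. The number of events per job is not given here; the value 20 is purely an illustrative assumption, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def sustained_rate(jobs_per_day: int, events_per_job: int = 1) -> float:
    """Average number of items per second for a given daily job count."""
    return jobs_per_day * events_per_job / SECONDS_PER_DAY


# About 11.6 jobs per second on average for 1 Mjob/day.
jobs_per_second = sustained_rate(1_000_000)

# ASSUMPTION: 20 events per job, chosen only to illustrate the scaling.
events_per_second = sustained_rate(1_000_000, events_per_job=20)
```

These are averages over a whole day; peak load during bursts of submission would be higher.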

  • #324

- LB publishes a restricted record into R-GMA on every job state change

- LB also provides a native interface for reliably subscribing to and receiving notifications on job state. The current implementation is restricted to known job IDs only, but it can be extended to serve subscriptions such as "all jobs in a VO" if the functionality is required.

- estimated effort: 1 FTE month

  • #562, #563

- LB has supported the concept of DAGs for a long time; state information is provided both for the DAG as a whole (including counts of subjob states) and for individual subjobs.
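
The shape of such DAG state information can be sketched as a small data structure. This is an illustration only, not the LB API: the class name DagStatus, its fields, the subjob identifiers and the state names in the usage note are all assumptions.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DagStatus:
    """Hypothetical view of a DAG: its overall state plus per-subjob states."""

    dag_state: str
    # Maps a subjob identifier to that subjob's current state.
    subjob_states: Dict[str, str] = field(default_factory=dict)

    def subjob_state_counts(self) -> Counter:
        """Count how many subjobs are in each state."""
        return Counter(self.subjob_states.values())
```

For example, DagStatus("Running", {"sub1": "Done", "sub2": "Running", "sub3": "Done"}).subjob_state_counts() gives two subjobs in "Done" and one in "Running", while the individual states remain available through subjob_states.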

  • Not listed but required: methods for building statistics from LB

- LB, even in the old EDG versions, is able to dump data into plain files. Enabling the feature is a matter of proper configuration.

- Starting from gLite 3.0 we provide a parser of the dump files that produces job statistics records -- XML files following a schema agreed with JRA2.

- In gLite 3.1 the functionality is integrated, i.e. the LB server machine can be easily configured to generate these files and make them available for download.

- estimated effort: 1-2 FTE months, split among us, JRA2, and integration, to deploy and test

-- Main.grandic - 27 Jun 2006

Topic revision: r2 - 2006-08-03 - ClaudioGrandi
 