LB and JP Workplans
- LB server already understands VOMS groups, and is able to provide fine-grained
(per job) access control
- using VOMS information when logging data into LB is not foreseen -- a job
is always "owned" by a specific user, and only this user's credentials (and
those of services, e.g. RB) are accepted
- no immediate LB-side action required
- the specification scenario should allow a many-to-many LB:RB relationship;
depending on its specific needs, a VO may configure a single LB server to
work with multiple RBs or vice versa.
- it's a configuration issue, LB is ready for the setup
- LB is designed to survive machine crashes: data are synced to disk
in a transaction-like manner, i.e. a write operation is not confirmed
to the caller until the data are physically on disk.
- Disk data that should be considered are the plain spool logger files
and the MySQL database.
- Provided the LB services are restarted with the same disk footprint
and, in the case of LB server, with the same IP & hostname,
the operation should resume immediately.
- estimated effort: 1 FTE month to test thoroughly whether out-of-the-box
HA solutions are usable.
If they are not, a higher effort (difficult to estimate) is required for
application-level HA.
- So far we do not consider this issue a true priority. Failures of essential
services should not be frequent, and the Grid environment as a whole
should cope with service failures.
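The transaction-like write described above (a write is confirmed to the caller only once the data are on disk) can be illustrated with a minimal sketch. This is not LB code: the function name `durable_append` and the spool file layout are hypothetical, and a real implementation would likely also sync the containing directory when the file is first created.

```python
import os


def durable_append(path: str, record: bytes) -> int:
    """Append a record and return only after it has been synced to disk.

    Hypothetical helper for illustration; it is not part of LB.
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < len(record):
            written += os.write(fd, record[written:])
        # Force the data out of OS buffers before confirming to the caller.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

The caller treats the return of `durable_append` as the confirmation; if the machine crashes before the function returns, the caller never received a confirmation and is expected to retry.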
- Work is already in progress. A stress-test and measurement environment is
set up (a standalone LB feeder, emulating the real data sources).
- Preliminary results are quite optimistic: at least 70% of LB code should
handle the 1Mjob/day load without problems, even using a single instance of
each LB component.
- Two major bottlenecks have been identified: logging events via logd (although
the most productive LB event sources already use LB proxy), and registering
large collections.
- estimated effort: 2-3 FTE months to both complete the tests and work around
the bottlenecks
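To put the 1Mjob/day target in perspective, the short calculation below converts it into an average sustained rate. It is only a back-of-the-envelope sketch: it assumes a uniform load over the day, and the number of LB events per job is not stated here, so only the job rate is computed.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def average_rate(jobs_per_day: int) -> float:
    """Average jobs per second, assuming the load is spread evenly over a day."""
    return jobs_per_day / SECONDS_PER_DAY


rate = average_rate(1_000_000)
print(f"{rate:.1f} jobs/s")  # prints "11.6 jobs/s"
```

Real load is bursty, so peak rates would be higher than this average.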
- LB publishes a restricted record on every job state change into R-GMA
- LB also provides a native interface for reliably subscribing to and receiving
notifications on job state. The current implementation is restricted
to known job IDs only, but can be extended to serve subscriptions such as
"all jobs in VO" if the functionality is required.
- estimated effort: 1 FTE month
- LB has supported the concept of DAGs for a long time; state information is
provided for both the DAG as a whole (including counts of subjob states), and
individual subjobs.
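The DAG-level view described above (a state for the DAG as a whole plus counts of subjob states) can be sketched as follows. The classes `Job` and `Dag`, the method `subjob_state_counts`, the job IDs, and the state names used in the example are illustrative assumptions, not the LB data model or API.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Job:
    job_id: str
    state: str  # illustrative state name, e.g. "Running" or "Done"


@dataclass
class Dag:
    job_id: str
    state: str
    subjobs: List[Job] = field(default_factory=list)

    def subjob_state_counts(self) -> Dict[str, int]:
        """Count how many subjobs are currently in each state."""
        return dict(Counter(job.state for job in self.subjobs))


dag = Dag("dag-1", "Running", [
    Job("sub-1", "Running"),
    Job("sub-2", "Done"),
    Job("sub-3", "Running"),
])
print(dag.subjob_state_counts())  # {'Running': 2, 'Done': 1}
```

A client can thus read either the summary for the whole DAG or the state of any individual subjob from the same structure.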
- Not listed but required: methods for building statistics from LB
- LB, even in the old EDG versions, is able to dump data into plain files.
Enabling the feature is a matter of proper configuration.
- Starting from gLite 3.0 we provide a parser of the dump files that produces
job statistics records -- XML files according to the schema agreed with JRA2.
- In gLite 3.1 the functionality is integrated, i.e. the LB server machine can
be easily configured to generate these files and make them available for
download.
- estimated effort: 1-2 FTE months, split among us, JRA2, and integration to
deploy and test
-- Main.grandic - 27 Jun 2006