GLUE 2.0 Roadmap

Roadmap

In order to define a concrete roadmap, the Information System TF is currently collecting feedback from the following stakeholders:

  • OSG plans
    • Plans to stop running the BDII
    • Plans to provide WLCG with the needed information through other means, preferably in GLUE 2.0
  • EGI plans
    • Plans to enable GLUE 2.0 submission in WMS
    • Plans to decommission GLUE 1.3 information
  • Site plans: all EGI sites now publish GLUE 2.0 except:
  • LHC VO plans to consume GLUE 2.0
    • ALICE
    • ATLAS
    • CMS
    • LHCb

GLUE 1 - GLUE 2 translator

The following table maps the GLUE 1 attributes published in the REBUS installed capacities to their GLUE 2 equivalents:

| GLUE 1 Attribute | GLUE 1 Definition | GLUE 2 Attribute | GLUE 2 Definition |
| GlueHostProcessorOtherDescription: Benchmark | | GLUE2BenchmarkValue | Benchmark value |
| GlueSubClusterLogicalCPUs | The effective number of CPUs in the subcluster, including the effect of hyperthreading and the effects of virtualisation due to the queueing system | GLUE2ExecutionEnvironmentLogicalCPUs | The number of logical CPUs in one Execution Environment instance, i.e. typically the number of cores per Worker Node |
| | | GLUE2ExecutionEnvironmentTotalInstances | The total number of Execution Environment instances. This should reflect the total installed capacity, i.e. including resources which are temporarily unavailable |
| GlueSESizeTotal | | GLUE2StorageServiceCapacityTotalSize | The total amount of storage of the defined type. It is the sum of free, used and reserved |
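
As an illustration only, the mapping above can be expressed as a simple attribute-name dictionary. The sketch below is hypothetical and not an existing translator; note in particular that GlueSubClusterLogicalCPUs (a subcluster total) and GLUE2ExecutionEnvironmentLogicalCPUs (a per-instance count) differ in semantics, so a real translation needs more than a key rename.

# Hypothetical sketch of the GLUE 1 -> GLUE 2 name mapping in the table above.
GLUE1_TO_GLUE2 = {
    "GlueHostProcessorOtherDescription": "GLUE2BenchmarkValue",  # Benchmark field
    "GlueSubClusterLogicalCPUs": "GLUE2ExecutionEnvironmentLogicalCPUs",
    "GlueSESizeTotal": "GLUE2StorageServiceCapacityTotalSize",
}

def translate_glue1_record(record: dict) -> dict:
    """Rename GLUE 1 keys to their GLUE 2 counterparts, keeping the values."""
    return {GLUE1_TO_GLUE2.get(key, key): value for key, value in record.items()}

print(translate_glue1_record({"GlueSubClusterLogicalCPUs": 16}))
# -> {'GLUE2ExecutionEnvironmentLogicalCPUs': 16}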

Information Providers

Information providers are described in detail under the IS providers twiki. More detailed information on the attributes related to installed capacities can be found in the table below:

| Service | Installed Capacities | Related GLUE attributes |
| dCache | Includes all pools (storage nodes) known to the system, whether or not they are enabled | TBD |
| DPM | Includes the pools/disks configured by the site admin. If the admin disables a pool or disk, the installed capacity is updated via the information provider | TBD |
| StoRM | Reports only the available space information from the underlying filesystem; there is no concept of resources in downtime visible to StoRM | TBD |
| CASTOR/EOS (CERN) | Information can be retrieved either including or excluding broken disks, depending on the query performed in CASTOR/EOS | TBD |
| CREAM | ? | TBD |
| HTCondorCE | ? | TBD |

Site-specific recipes

| Sites | Service | Scripts | Notes |
| CERN-PROD | HTCondorCE | htcondorce-cern | Only four values are not generated dynamically by calling out to the HTCondor Pool Collector and the Compute Element Scheduler: HTCONDORCE_VONames = atlas, cms, lhcb, dteam, alice, ilc (shortened for brevity), HTCONDORCE_SiteName = CERN-PROD, HTCONDORCE_HEPSPEC_INFO = 8.97-HEP-SPEC06, HTCONDORCE_CORES = 16. All our HTCondor worker nodes expose a hepspec fact; the averaged hepspec value on the CEs above is obtained by querying all the facts and averaging them (see the sketch after this table). |
| | CREAM CE | UpdateStaticInfo | Parses the LSF configuration file to extract capacities |
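
The CERN-PROD note above describes averaging a per-worker-node hepspec fact obtained from the pool. The Python sketch below is a hypothetical illustration of that step only, assuming the fact is published as a machine ClassAd attribute named HEPSPEC (the actual attribute name and the internals of the htcondorce-cern script are not documented here); it calls condor_status and averages the reported values.

import statistics
import subprocess

# Assumed attribute name; the real CERN configuration may use a different one.
BENCHMARK_ATTRIBUTE = "HEPSPEC"

def average_pool_benchmark(attr: str = BENCHMARK_ATTRIBUTE) -> float:
    # "condor_status -af <attr>" prints the attribute value for every slot ad
    # known to the Pool Collector ("undefined" where the attribute is missing).
    out = subprocess.run(
        ["condor_status", "-af", attr],
        check=True, capture_output=True, text=True,
    ).stdout
    values = [float(v) for v in out.split() if v != "undefined"]
    if not values:
        raise RuntimeError(f"no slots publish {attr}")
    return statistics.mean(values)

if __name__ == "__main__":
    print(f"averaged hepspec: {average_pool_benchmark():.2f}-HEP-SPEC06")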

Definitions

The following definitions have been proposed by the Information System Task Force. Feedback from site admins is being collected:

Proposal sent to LCG-ROLLOUT:

  • GLUE2ExecutionEnvironmentLogicalCPUs: the number of processors in one Execution Environment instance which may be allocated to jobs. Typically the number of processors seen by the operating system on one Worker Node (that is the number of "processor :" lines in /proc/cpuinfo on Linux), but potentially set to more or less than this for performance reasons.

  • GLUE2BenchmarkValue: the average HS06 benchmark when a benchmark instance is run for each processor which may be allocated to jobs. Typically the number of processors which may be allocated corresponds to the number seen by the operating system on the worker node (that is the number of "processor :" lines in /proc/cpuinfo on Linux), but potentially set to more or less than this for performance reasons.
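
For the typical Linux case mentioned in both definitions, the processor count can be obtained by counting the "processor :" lines in /proc/cpuinfo. A minimal Python sketch, illustrative only and not part of any official information provider:

def logical_cpus_from_cpuinfo(path: str = "/proc/cpuinfo") -> int:
    # Each online processor appears as a "processor : N" line on Linux.
    with open(path) as cpuinfo:
        return sum(1 for line in cpuinfo if line.startswith("processor"))

if __name__ == "__main__":
    print("GLUE2ExecutionEnvironmentLogicalCPUs =", logical_cpus_from_cpuinfo())

Sites that allocate more or fewer processors than the operating system reports for performance reasons would publish their chosen value instead of this count.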

Another proposal by Andrew after some discussion:

  • GLUE2ExecutionEnvironmentLogicalCPUs: the number of single-process benchmark instances run when benchmarking the Execution Environment, corresponding to the number of processors which may be allocated to jobs. Typically this is the number of processors seen by the operating system on one Worker Node (that is the number of "processor :" lines in /proc/cpuinfo on Linux), but potentially set to more or less than this for performance reasons. This value corresponds to the total number of processors which may be reported to APEL by jobs running in parallel in this Execution Environment, found by adding the values of the "Processor" keys in all of their accounting records.
  • GLUE2BenchmarkValue: the average benchmark when a single-process benchmark instance is run for each processor which may be allocated to jobs. Typically the number of processors which may be allocated corresponds to the number seen by the operating system on the worker node (that is the number of "processor :" lines in /proc/cpuinfo on Linux), but potentially set to more or less than this for performance reasons. This should be equal to the benchmark ServiceLevel in the APEL accounting record of a single-processor job, where the APEL "Processors" key will have the value 1.
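
As a worked illustration of this proposal, with made-up HS06 numbers, the relationship between the two attributes and the APEL quantities can be written in a few lines of Python:

# Illustrative per-instance HS06 results: one single-process benchmark
# instance per processor that may be allocated to jobs (values are made up).
per_instance_hs06 = [9.1, 8.8, 9.0, 9.2, 8.9, 9.1, 8.9, 9.0]

# GLUE2ExecutionEnvironmentLogicalCPUs = number of benchmark instances run.
logical_cpus = len(per_instance_hs06)

# GLUE2BenchmarkValue = average result of those instances.
benchmark_value = sum(per_instance_hs06) / logical_cpus

print("GLUE2ExecutionEnvironmentLogicalCPUs =", logical_cpus)
print("GLUE2BenchmarkValue = %.2f HS06" % benchmark_value)
# An APEL record for a single-processor job (Processors = 1) should then carry
# ServiceLevel approximately equal to benchmark_value, while jobs filling the
# whole Execution Environment report Processors = logical_cpus in total.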

Another proposal by Brian:

  • GLUE2ExecutionEnvironment: the hardware environment allocated by a single resource request.
  • GLUE2ExecutionEnvironmentLogicalCPUs: the number of single-process benchmark instances run when benchmarking the Execution Environment.
  • GLUE2BenchmarkValue: the average benchmark result when $(GLUE2ExecutionEnvironmentLogicalCPUs) single-threaded benchmark instances are run in the execution environment in parallel.
  • GLUE2ExecutionEnvironmentTotalInstances: the aggregate benchmark result of the computing resource divided by $(GLUE2BenchmarkValue).

-- MariaALANDESPRADILLO - 2015-11-24
