Information System User Requirements
Introduction
The following sections present the current status of the existing Information System use cases and the current issues identified for each of them.
Use Cases
Service Discovery
List available grid services: CE, SE, FTS...
Existing implementations
| Experiment | Description | Involved GLUE 1.3 Attributes | Involved GLUE 2.0 Attributes |
| ALICE | Manual query to find CEs | | |
| ATLAS | CE and SE manual query to feed AGIS (1) | GlueService: GlueServiceType GlueServiceEndpoint GlueServiceVersion GlueServiceStatus GlueForeignKey GlueCE GlueCEInfoJobManager GlueCEPolicyMaxCPUTime GlueCEPolicyMaxWallClockTime GlueCEName | ServiceType ServiceEndpointURL ServiceEndpointInterfaceVersion ServiceEndpointServingState ServiceAdminDomainForeignKey ManagerProductName ComputingShareMaxCPUTime ComputingShareMaxWallTime ComputingShareMappingQueue |
| CMS | ATP VO feed generation | GlueCE: GlueCEAccessControlBaseRule GlueCEInfoHostName GlueCEImplementationName GlueForeignKey GlueCluster: GlueClusterUniqueID GlueForeignKey GlueCESEBindGroup: GlueCESEBindGroupCEUniqueID GlueCESEBindGroupSEUniqueID | |
| CMS | CE/SRM visibility (Site Status Board) | GlueCE: GlueCEUniqueID GlueCEAccessControlBaseRule GlueSite: GlueSiteUniqueID | |
| LHCb | Manual query to find CEs | | |
| FTS | SRM and gsiftp automatic queries | GlueService: GlueServiceUniqueID GlueServiceType GlueForeignKey | ServiceID ServiceType ServiceAdminDomainForeignKey |
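As an illustration of the kind of query behind the FTS row above, the following minimal sketch builds an ldapsearch command line that lists GLUE 1.3 services of a given type. The BDII endpoint and the "o=grid" base DN are assumptions made for the example and are not taken from this page; the function name service_query is hypothetical.

```python
import shlex

# Assumptions for illustration only: this page does not name a BDII endpoint
# or a base DN. Replace both with the values used at your site.
BDII_URI = "ldap://lcg-bdii.cern.ch:2170"
GLUE1_BASE = "o=grid"


def service_query(service_type, uri=BDII_URI, base=GLUE1_BASE):
    """Build an ldapsearch command listing GLUE 1.3 services of one type."""
    ldap_filter = f"(&(objectClass=GlueService)(GlueServiceType={service_type}))"
    # Attributes taken from the FTS row of the table above.
    attributes = ["GlueServiceUniqueID", "GlueServiceType", "GlueForeignKey"]
    # -x: simple (anonymous) bind; -LLL: plain LDIF output without comments.
    return ["ldapsearch", "-x", "-LLL", "-H", uri, "-b", base, ldap_filter, *attributes]


# Print the command in a form that can be pasted into a shell.
print(shlex.join(service_query("SRM")))
```

Returning the command as an argument list keeps the filter free of shell-quoting problems when it is passed to subprocess.run.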
Open issues
- ATLAS feedback (under discussion): a single query is needed to get CE and SE attributes (1) for OSG, ARC and EGI resources. At present, information about these resources is not available in a uniform way: it is spread across different tools that must each be queried differently. Either the information should be centralised in a single tool, or all available tools should be queryable in a uniform way. Possible areas to be investigated:
  - Discuss with the GOCDB developers the possibility of adding more resource information (full list (1)). GOCDB is planning to provide property bags and multiple endpoints per service.
  - Understand how dynamic information would be updated automatically in GOCDB for the extra resource information.
  - Investigate "ginfo as a service" as a single point of query to get information from GOCDB and OSG.
- Service discovery via the Information System breaks whenever a resource BDII or a site BDII goes down, whereas ideally the list of services should not change or disappear during downtimes.
Software version and platform
Discover which software is installed in a subcluster.
Existing implementations
| Experiment | Description | Involved GLUE 1.3 Attributes | Involved GLUE 2.0 Attributes |
| ATLAS | Athena | GlueSubCluster: GlueHostApplicationSoftwareRunTimeEnvironment | ApplicationEnvironmentAppName |
| CMS | CMSSW | GlueCE: GlueCEAccessControlBaseRule GlueCEPolicyMaxWallClockTime GlueSubCluster: GlueHostApplicationSoftwareRunTimeEnvironment GlueHostOperatingSystemName GlueHostOperatingSystemRelease GlueHostOperatingSystemVersion GlueHostMainMemoryRAMSize | ApplicationEnvironmentAppName ExecutionEnvironmentOSName ExecutionEnvironmentOSFamily ExecutionEnvironmentOSVersion ExecutionEnvironmentPlatform ExecutionEnvironmentMainMemorySize ExecutionEnvironmentCPUClockSpeed TO BE COMPLETED |
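To make the use of GlueHostApplicationSoftwareRunTimeEnvironment more concrete, here is a minimal sketch that extracts the software tags of one VO from LDIF text. The sample LDIF, the host names and the tag values are made up for illustration, and the "VO-<vo>-<tag>" naming pattern is assumed rather than stated on this page. The parser also assumes that long LDIF lines are not folded.

```python
from collections import defaultdict

ATTRIBUTE = "GlueHostApplicationSoftwareRunTimeEnvironment"

# Made-up LDIF fragment for illustration; real records come from a BDII query.
SAMPLE_LDIF = """\
dn: GlueSubClusterUniqueID=sc1.example.org,GlueClusterUniqueID=c1.example.org,mds-vo-name=local,o=grid
GlueHostApplicationSoftwareRunTimeEnvironment: VO-cms-CMSSW_7_4_0
GlueHostApplicationSoftwareRunTimeEnvironment: VO-atlas-example-tag
GlueSubClusterUniqueID: sc1.example.org

dn: GlueSubClusterUniqueID=sc2.example.org,GlueClusterUniqueID=c2.example.org,mds-vo-name=local,o=grid
GlueSubClusterUniqueID: sc2.example.org
GlueHostApplicationSoftwareRunTimeEnvironment: VO-cms-CMSSW_7_3_1
"""


def parse_entries(ldif_text):
    """Split LDIF text into entries, each a mapping of attribute name to values."""
    entries = []
    for block in ldif_text.strip().split("\n\n"):
        entry = defaultdict(list)
        for line in block.splitlines():
            name, sep, value = line.partition(": ")
            if sep:  # skip lines that are not "name: value" pairs
                entry[name].append(value)
        entries.append(entry)
    return entries


def software_tags(ldif_text, vo):
    """Map each subcluster ID to the sorted software tags published for one VO."""
    prefix = f"VO-{vo}-"  # assumed tag naming pattern
    result = {}
    for entry in parse_entries(ldif_text):
        ids = entry.get("GlueSubClusterUniqueID")
        if not ids:
            continue  # not a subcluster entry
        tags = [v[len(prefix):] for v in entry.get(ATTRIBUTE, []) if v.startswith(prefix)]
        result[ids[0]] = sorted(tags)
    return result


print(software_tags(SAMPLE_LDIF, "cms"))
```

Entries are parsed as a whole before the tags are read, so the result does not depend on the order in which attributes appear within an entry.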
Open issues
None
SE Capacity
Discover dark data, that is, data present in storage but not registered in the catalogue.
Existing implementations
| Experiment | Description | Involved GLUE 1.3 Attributes | Involved GLUE 2.0 Attributes |
| CMS(*) | Storage information (Site Status Board) | GlueSA: GlueSATotalOnlineSize GlueSAFreeOnlineSize GlueSAUsedOnlineSize GlueSATotalNearlineSize GlueSAFreeNearlineSize GlueSAUsedNearlineSize InstalledOnlineCapacity InstalledNearlineCapacity | |
(*) Not used in production as the published information is too unreliable. More details in this twiki.
Open issues
- ATLAS feedback: Information about InstalledCapacity vs TotalSize is needed in SRM.
- CMS feedback: CMS cannot use SRM as it is not using space tokens. It needs more fine-grained sizes (per directory).
- The information is often wrong.
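One way to detect obviously wrong storage information is a consistency check on the published GlueSA online sizes. The sketch below is only an illustration: the rule that used plus free space should roughly equal the total, the 5% tolerance and the function name check_storage_area are assumptions, not something this page specifies.

```python
def check_storage_area(record, tolerance=0.05):
    """Return a list of problems found in the online sizes of one GlueSA record."""
    sizes = {
        "total": record.get("GlueSATotalOnlineSize"),
        "free": record.get("GlueSAFreeOnlineSize"),
        "used": record.get("GlueSAUsedOnlineSize"),
    }
    problems = []
    for name, value in sizes.items():
        if value is None:
            problems.append(f"{name} size missing")
        elif value < 0:
            problems.append(f"{name} size negative")
    if problems:
        return problems  # the sum check below needs three valid numbers
    if sizes["total"] == 0:
        problems.append("total size is zero")
    elif abs(sizes["used"] + sizes["free"] - sizes["total"]) > tolerance * sizes["total"]:
        # Assumed rule: used + free should be close to the published total.
        problems.append("used + free differs from total")
    return problems


print(check_storage_area({
    "GlueSATotalOnlineSize": 100,
    "GlueSAFreeOnlineSize": 90,
    "GlueSAUsedOnlineSize": 60,
}))
```

The same check could be repeated for the nearline attributes by swapping the attribute names.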
CE Status and queues
Ranking CEs for job submission.
Existing implementations
| Experiment | Description | Involved GLUE 1.3 Attributes | Involved GLUE 2.0 Attributes |
| ALICE | CREAM CE resource BDII query | GlueCE: GlueForeignKey GlueCEStateStatus GlueSubCluster: GlueHostMainMemoryRAMSize GlueHostMainMemoryVirtualSize GlueVOView: GlueCEStateRunningJobs GlueCEStateWaitingJobs | |
| ATLAS | | | |
| CMS | CRAB 2 WMS matchmaking Pilot submission | GlueCE: GlueCEUniqueID GlueCEStateStatus GlueCEStateTotalJobs GlueCEStateEstimatedResponseTime GlueCEPolicyMaxTotalJobs GlueCEPolicyMaxCPUTime GlueCEPolicyMaxWallClockTime GlueCEImplementationName GlueSubCluster: GlueHostApplicationSoftwareRunTimeEnvironment GlueHostNetworkAdapterOutboundIP GlueCESEBindGroup: GlueCESEBindGroupSEUniqueID | |
| LHCb | Manual query to feed DIRAC | GlueCE: GlueCEPolicyMaxCPUTime GlueCECapability (CPUScalingReferenceSI00) | ComputingShareMaxCPUTime BenchmarkType BenchmarkValue |
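The attributes in the table above can feed a simple ranking of CEs. The following sketch is a hypothetical policy, not the algorithm used by any of the experiments or by the WMS: it keeps CEs whose status is "Production" and whose CPU time limit is large enough, then orders them by number of waiting jobs. The class and function names are made up for the example, and the status value and the unit of the CPU time limit (assumed here to be minutes) should be checked against the GLUE 1.3 schema.

```python
from dataclasses import dataclass


@dataclass
class ComputingElement:
    unique_id: str     # GlueCEUniqueID
    status: str        # GlueCEStateStatus
    waiting_jobs: int  # GlueCEStateWaitingJobs
    max_cpu_time: int  # GlueCEPolicyMaxCPUTime (assumed to be in minutes)


def rank_ces(ces, required_cpu_minutes):
    """Return eligible CEs, least loaded first (hypothetical ranking policy)."""
    eligible = [
        ce for ce in ces
        if ce.status == "Production" and ce.max_cpu_time >= required_cpu_minutes
    ]
    # Fewest waiting jobs first; the ID breaks ties so the order is stable.
    return sorted(eligible, key=lambda ce: (ce.waiting_jobs, ce.unique_id))


# Made-up records for illustration.
candidates = [
    ComputingElement("ce1.example.org", "Production", 10, 2880),
    ComputingElement("ce2.example.org", "Production", 2, 2880),
    ComputingElement("ce3.example.org", "Draining", 0, 2880),
    ComputingElement("ce4.example.org", "Production", 0, 60),
]
for ce in rank_ces(candidates, required_cpu_minutes=1440):
    print(ce.unique_id, ce.waiting_jobs)
```

The sketch deliberately ignores GlueCEStateEstimatedResponseTime and relies on the waiting job count instead.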
Open issues
- LHCb request: Automatic way to monitor the validity of published information for these attributes and contact sites to fix values if needed. Tracked in IS ticket #3.
- GlueCEStateEstimatedResponseTime is often unreliable or meaningless.
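The automatic validity monitoring requested by LHCb could start from per-CE checks such as the sketch below. It is an assumption-laden illustration: the placeholder value 999999999 for an unset CPU time limit, the "CPUScalingReferenceSI00=<number>" form of the capability string and the function name validate_ce are all assumptions to be checked against what sites actually publish.

```python
# Assumed marker for an unset GlueCEPolicyMaxCPUTime; adjust as needed.
PLACEHOLDER_MAX_CPU_TIME = 999999999


def validate_ce(max_cpu_time, capabilities):
    """Return problems found in GlueCEPolicyMaxCPUTime and CPUScalingReferenceSI00."""
    problems = []
    if max_cpu_time <= 0 or max_cpu_time == PLACEHOLDER_MAX_CPU_TIME:
        problems.append("GlueCEPolicyMaxCPUTime is unset or not positive")

    # Look for "CPUScalingReferenceSI00=<value>" among the GlueCECapability values.
    scaling = None
    for capability in capabilities:
        key, sep, value = capability.partition("=")
        if sep and key == "CPUScalingReferenceSI00":
            scaling = value

    if scaling is None:
        problems.append("CPUScalingReferenceSI00 is not published")
    else:
        try:
            if float(scaling) <= 0:
                problems.append("CPUScalingReferenceSI00 is not positive")
        except ValueError:
            problems.append("CPUScalingReferenceSI00 is not a number")
    return problems


print(validate_ce(PLACEHOLDER_MAX_CPU_TIME, ["glexec"]))
```

A monitoring job could run such checks on every published CE and use the returned messages when contacting the site.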
Information System Clients
Existing implementations
| Experiment | Description | Involved GLUE 1.3 Attributes | Involved GLUE 2.0 Attributes |
| ALICE | lcg-infosites | see #8 | |
| ATLAS | lcg-info | see #7 | |
| CMS | lcg-info, LDAP modules | see #5 | |
| LHCb | lcg-info, lcg-infosites, ldapsearch | see #6 | |
Open issues
- Request from all experiments: improve ginfo to implement the most common queries currently performed with lcg-info, lcg-infosites and ldapsearch.
  - Experiments will provide the list of queries. Tracked in IS tickets #5, #6, #7 and #8.
  - The ginfo development team will provide a plan to update ginfo and provide the requested functionality. Tracked in IS ticket #9.
WLCG Information System Evolution
In June 2015, OSG announced plans to stop using the BDII to publish their resources (see the slides presented at the WLCG Operations Coordination Meeting on 18 June). This announcement triggered a review of the current WLCG Information System, and it was decided to create a task force (TF) to evaluate how WLCG is going to cover the existing use cases in the near future. The TF is described in detail in the following twiki.
WLCG Global Service Registry
The WLCG Global Information Registry was an attempt to bring together information published by different grid infrastructures such as EGI and OSG. It shows information on both pledged resources and the resources actually available. The WLCG Global Information Registry aims to help LHC experiments configure their own experiment databases for job submission and storage management.
Meetings
Presentations
Articles
Resources