EMI comments on the EGI Quality Criteria documents (Version 1)

Review was carried out by EMI technical management and QA.

Reviewed material

UMD Quality Criteria (EGI Document 240-v8) https://documents.egi.eu/public/ShowDocument?docid=240

  • Generic Quality Criteria
  • Compute Capabilities Quality Criteria
  • Data Quality Criteria
  • Storage Capabilities Quality Criteria
  • Security Capabilities Quality Criteria
  • Information Capabilities Quality Criteria
  • Operations Capabilities Quality Criteria

General remarks

- The EMI project has already established its own set of quality requirements, formulated as the EMI SA2 policies [1]. These policies form the basis of EMI component quality checks, certification and validation. The EMI Quality Assurance Policy documents cover the entire software development life cycle.

The EGI QC documents have a very similar scope, and there is a large overlap between the EMI Policies and the EGI QC documents.

- EMI aims to adjust its QA Policy documents to satisfy its customers, and therefore plans to synchronize them with the EGI QC. However, the synchronization should be based on mutual understanding. Before EMI can commit to the EGI QC requirements, the EMI QA team and the technical management need to carry out a detailed, thorough analysis of each of the UMD criteria and communicate back acceptance or possible problems for each criterion, one by one. Meanwhile, it is recommended that the EGI QC team also review the EMI QA Policies. The more thorough analysis of the EGI QC and the update of the EMI Policy documents can take place only after the EMI-1 release (May/June).

- Software produced by EMI is checked against the requirements laid down in the EMI Policy documents. In particular, the (already ongoing) certification and validation of the EMI-1 release components is based on the EMI policies. This implies that the EGI QC requirements can be taken into account at the earliest during the EMI-2 (or EMI-1.1) preparation, after the EMI QA Policies have been updated and synchronized with the EGI QC.

- Since there is already a large overlap between the EMI Policies and the EGI QC, the EMI-1 release components will already satisfy the majority of the EGI QC as well.

- As a general problem, EMI finds that many of the criteria are not precisely formulated, leaving considerable room for misunderstanding.

- Another big recurring issue is the responsibility for providing test suites for some of the criteria. There should be a clear agreement between EGI and the technology providers regarding the responsibility for test suite provisioning and maintenance.

- The criteria template hides the most important content: we suggest moving the "Description" field right below the title/ID. Changing the numeric ID to something more meaningful may also be considered. The purpose of the "History" field is not clear. Another minor readability improvement would be a "Mandatory: YES/NO" field instead of the current inconsistent layout.

- We found many undefined jargon abbreviations and numerous copy/paste errors throughout the "final" documents.

- Finally, we note that the name "UMD Quality Criteria" is somewhat misleading. The criteria collected in the seven documents are of very different natures: some are indeed quality criteria, others are merely functionality tests, and many are lower-level technical requirements (e.g. what a startup script should do).

Specific Comments on Generic Quality Criteria

- Template: Change "Input from TP" to "Input from Technology Provider"

- Functional Description: The criterion description is rather vague. For example, what would be a functional description document for a developer API component, or for a user tool?

- Release Notes: already included in EMI Documentation policy

- User Documentation: a very generic criterion.

- Online help (man pages): Currently it is not mandatory in EMI.

- API documentation: Currently it is not mandatory in EMI.

- Administration documentation: EMI requires an Installation and Admin guide.

- Service Reference Card: also required in EMI.

- Software licence: required and tracked in the EMI Component Release Tracker [2].

- Source code availability: part of the description is too vague and refers to non-quantifiable values. Otherwise the criterion is covered by EMI policy.

- Build procedure documentation: yes, part of EMI policy.

- Automatic builds: yes, EMI components are continuously tested via automatic nightly builds.

- Binary distributions: EMI has the same requirement.

- Release changing testing: a similar requirement exists in EMI. The big question here is the granularity of testing: there is no way to test every minor code change or trivial bug fix.

- Service control and status: this is more a technical requirement than a quality criterion. It is currently not regulated to this extent in EMI and not part of EMI policy.

- Log files: this is more a technical requirement than a quality criterion. It is currently not regulated to this extent in EMI and not part of EMI policy.

- Service Reliability: the description is a little vague (what is "good performance"?). No such requirement is explicitly part of the EMI policies. Nevertheless, Product Teams are required to perform scalability and performance tests as part of their product certification.

- World-writable files: not part of EMI requirements. Who would provide the test suite? This QC could be part of the security requirements rather than a generic one.
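For reference, the check this criterion implies is straightforward; a minimal sketch (our own illustration, not part of the EGI document or any existing test suite) could look like:

```python
import os
import stat

def find_world_writable(root):
    """Return paths under `root` whose permission bits allow writing
    by any user (the o+w bit), skipping symbolic links."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames + dirnames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # entry vanished or metadata unreadable
            if stat.S_ISLNK(mode):
                continue  # a symlink's own mode is not meaningful here
            if mode & stat.S_IWOTH:
                hits.append(path)
    return hits
```

A test suite owner would run this against the installation prefix of the deployed component and fail the check on any non-empty result.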

Specific Comments on Compute Capabilities Quality Criteria

- While general attention seems to have been dedicated to testing existing proprietary interfaces (e.g. CREAM, ARC), this has not happened for UNICORE, where the author seems to assume that UNICORE = BES, which is not the case (we would expect some JOBEXEC_UNICORE_* tests as well).

- It is acceptable that sometimes the tests need to be very specific, especially if they are to support the NorduGrid, WLCG and UNICORE grids as we know them while reserving support for further grids (see below). However, some of what is asked simply cannot be done; see e.g. several of the JOBEXEC and JOBSCHED criteria.

- Concerning DRMAA, EMI really cannot see where it fits in the big picture.

- Some important test cases are missing. One such test could be: submission of a 1-hour job with a half-hour proxy; or: checking that subsequent jobs by the same user with different roles and capabilities handle file permissions properly and are not mixed up. Many similar tests could be selected. EGI should select the required tests more carefully, preferably in consultation with the technology providers (or simply by studying the existing test plans in EMI [3]). In general, most of the required tests are rather trivial.
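To make the first missing test case concrete, the condition it exercises can be sketched as a submission-time check; the function name and the `renewable` parameter are our own illustrative assumptions, not part of any EMI or EGI interface:

```python
def proxy_covers_job(proxy_seconds_left, job_walltime_seconds, renewable=False):
    """Sanity check of the kind the missing test case would exercise:
    a 1-hour job submitted with a half-hour proxy should either be
    refused up front or rely on proxy renewal, never fail silently
    mid-run when the credential expires."""
    if renewable:
        # A renewal mechanism (e.g. via a credential store) can extend
        # the proxy during the job's lifetime.
        return True
    return proxy_seconds_left >= job_walltime_seconds
```

The test would then submit such a job and verify that the middleware either rejects it with a clear error or renews the credential, rather than letting the job die halfway through.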

- Requiring BES tests from EMI is questionable: BES is not an official EMI compute interface. It is also unclear why BES API testing should be done using the UNICORE UCC JSON language. As said, BES API testing does not imply UNICORE API testing, and vice versa.

- 1.6 Availability/Scalability: the current pass condition ("Pass if the throughput is enough to handle at least 5000 simultaneous jobs.") needs discussion. 5000 simultaneous jobs seems a bit unrealistic; jobs/day would be a more familiar metric, say 50k submitted jobs/day. The QC should also specify a time lag within which job status changes must be visible to the client: a service delivering 100k jobs a minute is of little use if the user only learns days later that they are done.
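The relationship between the two metrics follows from Little's law (mean concurrency = arrival rate × mean time in system); the one-hour average job duration below is purely an illustrative assumption:

```python
def simultaneous_from_daily(jobs_per_day, avg_job_duration_s):
    """Little's law: mean number of jobs in the system equals the
    arrival rate multiplied by the mean time each job spends in it."""
    rate_per_s = jobs_per_day / 86400.0  # seconds per day
    return rate_per_s * avg_job_duration_s

# Illustrative: 50k jobs/day with an average 1-hour job keeps only
# about 2083 jobs in the system at once, well under 5000 simultaneous.
```

This is why a jobs/day target plus an assumed job-duration profile pins down the concurrency a service must sustain, whereas a bare "5000 simultaneous" figure does not say how it was derived.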

- JOBSCH_WMS_API_2: Why is this test required? This was an EGEE-2 requirement and no one has ever really considered it. As of now, JSDL has been integrated into BES, so JOBSCH_BES_1 should be more than enough.

- JOBSCH_EXEC_1: This is an important one; it will require some non-trivial work.

- Service availability, monitoring and error handling: is this required for the WMS only?

-- #698 (WMS stability and performance): this is quite upsetting, as the WMS is currently used by CMS, to take the most important example, comfortably meeting their production quality criteria of about 50k jobs/day. The service requires at most one restart per month.

-- 1000 simultaneous jobs does not make much sense either. Take into account that the match-making operation alone, which is heavily optimized, takes 1-2 seconds given the ~18k queues available on average. The QC also does not specify the job type to be stress tested (single jobs, collections, DAGs, parametric, MPI, etc.). Again, jobs/day would be a better metric.

Specific Comments on Data and Storage Quality Criteria

- The documents describe testing of components, interfaces and protocols within EMI Data. The components are Hydra, AMGA and FTS; the interface is POSIX file system access; and the protocols are HTTP(S), WebDAV, GridFTP and SRM.

- In general the proposed tests all make sense, and a large fraction of them are already performed by the responsible Product Teams (it was not possible to check the details). In our opinion, however, in some cases the proposed tests are not sufficient, while in others they are not useful.

- For gsiFTP the document only requires read and write. gsiFTP, however, has two different versions and, within those versions, different modes, most of which are used on our infrastructure. Testing should cover all combinations.
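The combinatorial point can be illustrated with a tiny matrix generator; the version, mode and operation labels below are placeholders for illustration only — the real ones would come from the protocol specification and the deployed server configuration:

```python
from itertools import product

# Placeholder labels, not the actual protocol identifiers.
versions = ["v1", "v2"]
modes = ["stream", "extended-block"]
operations = ["read", "write"]

# One test case ID per (version, mode, operation) combination.
test_matrix = [
    f"GSIFTP_{ver}_{mode}_{op}".upper().replace("-", "_")
    for ver, mode, op in product(versions, modes, operations)
]
# 2 versions x 2 modes x 2 operations = 8 cases instead of the 2 proposed.
```

Even with only two versions and two modes, the plain read/write pair proposed in the document covers a quarter of the combinations actually exercised on the infrastructure.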

- For WebDAV only testing of read and write is proposed. WebDAV has a much richer set of functions which might be worth testing.

- An opposite example is SRM. The document requires testing all "SRM operations described in SRM v2.2". EMI thinks that only those tests are useful which cover functionality used by the current infrastructure; there is functionality in SRM v2.2 which is not, and will not be, implemented by any of the SEs.

- For POSIX file access, the document suggests testing all possible functions. We might consider limiting this as well.

- The real question: who is going to provide and maintain those tests?

Specific Comments on Security Capabilities

- AUTHN_IFACE_2: "may use" ... for ARC/gLite this is not feasible in EMI-1.

- AUTHN_CA_*: not really a "security area" problem; CA distributions are packaged by others. In fact, our middleware does not really need to know anything about a CA distribution!

- ATTAUTH_MGMT_5: What are the ACLs being changed here?

- ATTAUTH_WEB_1: Is this necessary? A VOMS-Admin function?

- ATTAUTH_WEB_3: Does this exist?

- AUTHZ_PCYDEF_4: Is this SCAS? Why SCAS? EMI is phasing out SCAS!

- CREDMGMT_IFACE_1: Is this test for MyProxy?

- CREDMGMT_IFACE_3: Proxy Renewal is not a service. A test suite for this is tricky, as it requires a WMS etc. installation.

- CREDMGMT_LINK_1: STS and future work of the AAI strategy group. Otherwise "mostly harmless".

Specific Comments on Information Capabilities

- Our first comment is that this document is a little too simplistic and 'high-level'. While in general it captures the main factors that are probably important, there is not enough information to really understand the quality criteria or their context.

- What are we testing: a system, a component, a service? Secondly, the test descriptions are a little abstract, for example "test that the Service information conforms to the GLUE Schema v2". What does this really mean? How do you test this? Who provides the test? Who runs the test? Etc.

- Basically, there is not enough information in the document to understand what criteria need to be met and how.

References

[1] EMI Quality Assurance Policy documents https://twiki.cern.ch/twiki/bin/view/EMI/SA2#EMI_Policy_Documents

[2] EMI Release tracker https://savannah.cern.ch/task/?group=emi-releases

[3] Collection of EMI test plans https://twiki.cern.ch/twiki/bin/view/EMI/QCTestPlan

Topic revision: r1 - 2011-06-30 - AlbertoAimar