SCAS (and gLExec) Preliminary Test Plan

This page provides guidelines for testing the first SCAS patches. A more detailed test plan and test scripts will be produced soon. This information can also be used to test gLExec, since SCAS and gLExec are closely related. In a gLite infrastructure several SCAS deployment scenarios are possible; these scenarios are described below together with the related tests.

Deployment tests

  • Installation and configuration of SCAS (fresh install). Check the SCAS daemon is started:
netstat -platu | grep scas | grep 8443 | grep LISTEN
  • Reconfiguration (check that the SCAS daemon is running afterwards).
  • Reboot of the machine (check that the SCAS daemon is running after the reboot).
  • Check that log rotation is working.
  • Check that the grid-map file is correct.
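
The daemon check has to be repeated after installation, reconfiguration and reboot, so it may help to wrap it in a small function. The following is only a sketch: scas_is_listening is a hypothetical helper name, and it assumes that netstat is run with the additional -n option so that the port appears as a number rather than as a service name.

```shell
#!/bin/sh
# Sketch only: scas_is_listening is a hypothetical helper, not part of SCAS.
# It reads netstat output on stdin and succeeds when a process whose name
# contains "scas" holds a LISTEN socket on the given port.
scas_is_listening() {
    port="$1"
    # ":<port> " matches the end of the local address column.
    grep "scas" | grep ":${port} " | grep -q "LISTEN"
}

# Intended use on the SCAS host (8443 is the port used in the check above;
# -n is an assumption added here to force numeric ports):
#   netstat -platun | scas_is_listening 8443 && echo "SCAS daemon is listening"
```

Because the function only filters text, it can be run unchanged after each of the deployment steps listed above.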

Deployment scenarios

Because documentation about the SCAS service is lacking, some scenarios have been derived from the paper in which gLExec is presented, "gLExec: gluing grid computing to the Unix world", and from slides about SCAS presented during EGEE conferences.
Figure (SCAS2.png): SCAS Client-Server architecture
The SCAS service is the central authorization service; internally it uses the usual LCAS/LCMAPS modules to make the authorization and mapping decisions. Through LCAS, a request coming from a Grid user, identified by its DN, can be authorized or banned. If authorized, the user can be mapped to a local Unix account; this is done through LCMAPS. When SCAS is used, the authorization and mapping decisions are not made locally; instead, the requests are forwarded to the SCAS service.

The SCAS service is contacted using the SCAS client module, which can be installed on a CREAM CE, on a WN, or on an SE (currently only dCache is expected to interact with SCAS, through gPlazma). On the CREAM CE and the WN the SCAS client is not used directly but through LCMAPS, which redirects a request to the SCAS service. The user identity switch (Grid user to local Unix account) is done through glexec, an executable that can spawn a process and make it run under a specified local account. In this case it is glexec that contacts the SCAS service (still through LCMAPS and the SCAS client). If the Grid user is authorized, the SCAS server returns a Unix uid/gid that glexec uses to execute the spawned process.

The glexec binary can be deployed on a Cream CE or on a WN where it is used to implement the pilot job scenario. In order to run glexec two environment variables must be set:

  • GLEXEC_CLIENT_CERT: pointing to a proxy certificate that will be used for the authorization decision and the identity mapping. This certificate will be passed from glexec to the SCAS service.
  • GLEXEC_SOURCE_PROXY: pointing to a proxy that the real user job can use. gLExec will copy this proxy to the home directory of the mapped user and name it $HOME/proxy. gLExec will also set the variable X509_USER_PROXY to point to this copied certificate. The certificate pointed to by X509_USER_PROXY is the one that will be used to authenticate the connection between glexec and the SCAS service.

In the LHCb pilot job model two proxies are used, a 'pilot proxy' and a 'user proxy'. The 'pilot proxy' is not used by glexec, but only by the pilot agent to communicate with the DIRAC WMS services. Therefore, in this scenario, both GLEXEC_CLIENT_CERT and GLEXEC_SOURCE_PROXY point to the same proxy, the 'user proxy'.
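
As an illustration of the LHCb case, where both variables point to the same user proxy, the environment could be prepared as sketched below. setup_glexec_env is a hypothetical helper name and the proxy path in the usage comment is a placeholder; the only check performed is that the file is readable.

```shell
#!/bin/sh
# Sketch only: setup_glexec_env is a hypothetical helper name.
# It exports the two variables required by glexec, both pointing
# at the same proxy file (the 'user proxy' in the LHCb model).
setup_glexec_env() {
    proxy="$1"
    if [ ! -r "$proxy" ]; then
        echo "proxy file not readable: $proxy" >&2
        return 1
    fi
    GLEXEC_CLIENT_CERT="$proxy"
    GLEXEC_SOURCE_PROXY="$proxy"
    export GLEXEC_CLIENT_CERT GLEXEC_SOURCE_PROXY
}

# Intended use (the path is a placeholder, not a real location):
#   setup_glexec_env /path/to/user_proxy && glexec /usr/bin/id
```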

gLExec-on-WN scenario

This scenario refers to the late-binding approach, where a pilot job is submitted to a WN with the goal of managing future jobs on behalf of a group of users. In this case the SCAS server is installed on a machine, and gLExec is installed on a WN, together with the SCAS client and the LCAS/LCMAPS plugins.

Basic test

A user proxy must be copied to the pilot user's home directory; the two variables GLEXEC_CLIENT_CERT and GLEXEC_SOURCE_PROXY must point to this proxy. This test must be repeated using different VOMS roles in the user proxy (no role, lcgadmin role, production role). Execute:
glexec /usr/bin/id
and verify that the user ID returned is the one mapped in the SCAS service for the user DN contained in the user proxy. In this test gLExec runs in setuid mode: it retrieves the mapping information from the SCAS server and uses the retrieved Unix credentials to spawn a process that executes the command given as argument.
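
The comparison between the returned account and the expected mapping can be automated along these lines. This is a sketch under assumptions: mapped_account and check_mapping are hypothetical helper names, the output of /usr/bin/id is assumed to start with uid=N(name), and the expected account is given as a shell pattern so that a pool of accounts can be accepted.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and work on text,
# so they never call glexec themselves.

# Print the account name found in "uid=N(name)" on stdin.
mapped_account() {
    sed -n 's/^uid=[0-9]*(\([^)]*\)).*/\1/p'
}

# Succeed when the account on stdin matches the shell pattern in $1.
check_mapping() {
    expected="$1"
    actual=$(mapped_account)
    [ -n "$actual" ] || return 1
    case "$actual" in
        $expected) return 0 ;;   # unquoted on purpose: pattern match
        *)         return 1 ;;
    esac
}

# Intended use (the pattern 'dteam*' is only an example):
#   glexec /usr/bin/id | check_mapping 'dteam*' && echo "mapping OK"
```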

The same test must be repeated using multiple users at the same time. Some of these users must be banned, and their requests must not be executed. Since glexec can work with or without SCAS, and with or without identity change (setuid vs. log-only mode), all these combinations must work.
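
A possible way to issue the requests of several users at the same time is sketched below. run_as_all is a hypothetical helper name, and the GLEXEC variable is introduced here only so that the loop can be tried out with a stand-in command; one proxy file per user is assumed.

```shell
#!/bin/sh
# Sketch only: run_as_all and the GLEXEC override are assumptions
# made for this example, not part of gLExec.
GLEXEC="${GLEXEC:-glexec}"

# Arguments: one proxy file per user. Each request runs in the
# background so that all of them are in flight at the same time.
run_as_all() {
    for proxy in "$@"; do
        (
            GLEXEC_CLIENT_CERT="$proxy" GLEXEC_SOURCE_PROXY="$proxy" \
                $GLEXEC /usr/bin/id >/dev/null 2>&1
            echo "$proxy exit=$?"
        ) &
    done
    wait
}

# Intended use (proxy paths are placeholders):
#   run_as_all /path/to/proxy_user1 /path/to/proxy_banned_user
```

Each line of output pairs a proxy with the exit code of its request, so the requests of banned users can be checked for a non-zero code.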

Failure scenarios

When glexec is configured to work with SCAS and the SCAS service is not available, the SCAS client, and therefore glexec, has to return an error code quickly (within a few seconds), so that the pilot framework can carry on with its task without using glexec. At the moment there is no fail-over solution.
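
The "quick failure" requirement can be checked by timing the call, as in the following sketch. fails_fast is a hypothetical helper name, and the limit of 5 seconds in the usage comment is only an assumed reading of "a few seconds".

```shell
#!/bin/sh
# Sketch only: fails_fast is a hypothetical helper.
# Usage: fails_fast LIMIT_SECONDS command [args...]
# Succeeds when the command exits with a non-zero code and took
# at most LIMIT_SECONDS (measured with one-second resolution).
fails_fast() {
    limit="$1"; shift
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    rc=$?
    elapsed=$(( $(date +%s) - start ))
    echo "exit=$rc elapsed=${elapsed}s"
    [ "$rc" -ne 0 ] && [ "$elapsed" -le "$limit" ]
}

# Intended use, with the SCAS service stopped:
#   fails_fast 5 glexec /usr/bin/id || echo "glexec did not fail quickly"
```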

The error codes that glexec must return are:

  • 201 - client error, which includes:
    1. no proxy is provided
    2. wrong proxy permissions
    3. target location is not accessible
    4. the binary to execute does not exist
    5. the mapped user has no rights to execute the binary
    6. GLEXEC_CLIENT_CERT is not set

  • 202 - system error
    1. glexec.conf is not present or malformed
    2. lcas or lcmaps initialization failure (can be triggered by moving the lcas/lcmaps db files)

  • 203 - authorization error
    1. user is not whitelisted
    2. local lcas authorization failure
    3. user banned by the SCAS server
    4. lcmaps failure on the scas server
    5. SCAS server not running
    6. network cable unplugged on the SCAS server host.
  • 204 - the exit code of the called application overlaps with the previous ones
    1. the application called by glexec exits with code 201, 202, 203 or 204
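
A test script can translate the exit code into the categories listed above, for example as follows. glexec_exit_category is a hypothetical helper name; any code outside 201-204 is treated as the exit code of the called application.

```shell
#!/bin/sh
# Sketch only: glexec_exit_category is a hypothetical helper that
# maps a glexec exit code to the categories of this test plan.
glexec_exit_category() {
    case "$1" in
        201) echo "client error" ;;
        202) echo "system error" ;;
        203) echo "authorization error" ;;
        204) echo "application exit code overlaps with 201-204" ;;
        *)   echo "exit code of the called application" ;;
    esac
}

# Intended use:
#   glexec /usr/bin/id; glexec_exit_category $?
```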

Service stability test

The service must run under high load for one week. High load means that the SCAS service must be able to satisfy requests at an average rate of 6 Hz. On a VM of the CTB we achieved a rate of approximately 0.66 Hz (SCAS requests per second), therefore 10 WNs will be used to reach 6 Hz. When launching the test, one worker node will be added every 2 hours, so that the ramp-up time will be 20 hours. After these 20 hours, the 10 WNs will keep calling glexec continuously for a week.
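
The sizing above can be reproduced with a short calculation. The sketch below reads the 0.66 Hz figure as the rate that a single worker node can generate, which is an interpretation of the text; wns_needed is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: wns_needed is a hypothetical helper.
# $1 = target request rate in Hz, $2 = rate generated by one WN in Hz.
# Prints the smallest number of WNs whose combined rate reaches the target.
wns_needed() {
    awk -v target="$1" -v per_wn="$2" 'BEGIN {
        n = int(target / per_wn)
        if (n * per_wn < target) n++
        print n
    }'
}

# Number of worker nodes for 6 Hz at 0.66 Hz per node.
n=$(wns_needed 6 0.66)
# Ramp-up time when one node is added every 2 hours.
echo "worker nodes: $n, ramp-up: $(( n * 2 )) hours"
```

With the figures given above this yields 10 worker nodes and a ramp-up of 20 hours, matching the plan.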

The scripts used for this test are explained here. Installation instructions for the worker nodes are here.

The collected results are available at: SCAS stress tests results

Site-CE

In this deployment scenario gLExec is used to obtain Unix credentials by a CE that runs in a web service container (CREAM CE) and therefore does not have sufficient privileges to change the Unix identity of incoming requests. In order to test this scenario, a job must be submitted through the CREAM CE. Currently CREAM works without SCAS, so this scenario can be tested only once CREAM has been patched to support and use SCAS.

SE

dCache+gPlazma is currently the only SE-type client for SCAS. A specific test plan for this deployment must be prepared.

gLExec in Logging only mode

In this scenario gLExec is configured not to change the uid; it only logs information about the request and the user identity, and the process to be executed runs with the original uid (that of the pilot job or of the CREAM CE service container). In this case gLExec does not require setuid powers. All the previous scenarios could be repeated with gLExec in logging only mode, with and without SCAS. Logging only mode with SCAS would allow sites to implement a central authorization service without using setuid features.

In gLExec two logging methods can be used and therefore must be tested: file logging and syslog. The logging option is specified in the YAIM configuration file (site-info.def file).

-- GianniPucciani - 10 Nov 2008

Topic attachments
  • README (r2, 5.5 K, 2008-12-15 18:36, GianniPucciani)
  • SCAS1.png (r1, 12.5 K, 2008-12-02 13:47, GianniPucciani)
  • SCAS2.png (r1, 12.4 K, 2008-12-02 13:52, GianniPucciani)
  • wn_installation.txt (r1, 1.1 K, 2008-12-12 17:30, GianniPucciani)
Topic revision: r22 - 2009-03-16 - GianniPucciani
 