Argus certification report

Patch 3076 (Initial Release)

Installation

The patch was installed on a 64-bit SL5 virtual machine, vtb-generic-70. All three services (PAP, PDP and PEP) run on that host.
# yum update
# ./yaimgen.sh function yg_getHostCredentials
# ./yaimgen.sh function yg_getTestUserCredentials

# wget http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/lcg-CA.repo
# yum install lcg-CA

[root@vtb-generic-70 yum.repos.d]# cat patch3076.repo 
[patch3076]
name=patch3076
baseurl=http://grid-deployment.web.cern.ch/grid-deployment/glite/cert/3.2/patches/3076/sl5/$basearch
enabled=1

[root@vtb-generic-70 yum.repos.d]# wget http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/glite-WN.repo 

[root@vtb-generic-70 yum.repos.d]# yum install cert-glite-ARGUS

Yaimgen installation

The installation has been repeated using yaimgen scripts (head, rev 85).
[root@vtb-generic-55 trunk]# cat yaimgen.in 
YG_REPO="prod"           # Can be cert,pps,prod
YG_TARGETS="glite-ARGUS patch3076"
YG_CERT_INST="yes"

[root@vtb-generic-55 trunk]# grep YG_DEFAULT yaimgen.conf 
export YG_DEFAULTS_REPOS="lcg-CA dag glite-WN"
export YG_DEFAULTS_PACKAGES="lcg-CA"

[root@vtb-generic-55 trunk]# ./yaimgen.sh preconfigure yaimgen.in

Configuration

The configuration was done as usual, using YAIM.

# ./yaimgen.sh function yg_getYaimFiles
The YAIM twiki page does not have an Argus section yet. Waiting for it to be updated.

A trial with the following site-info.def

HOST=vtb-generic-70.cern.ch
PDP_ENTITY_ID="${HOST}/pdp"
PAPS_ENDPOINTS="https://${HOST}:8150/pap/services/ProvisioningService"
#PAP_ADMIN_DN="/DC=ch/DC=switch/DC=slcs/O=Switch - Teleinformatikdienste
PAP_ADMIN_DN="/C=CH/O=CERN/OU=GD/CN=Test user 1"
PEP_ENTITY_ID="http://${HOST}/pepd"
PDPS_ENDPOINTS="http://${HOST}:8152/authz"

USERS_CONF=/opt/glite/yaim/examples/users.conf
GROUPS_CONF=/opt/glite/yaim/examples/groups.conf
VOS="dteam"
showed that yaim terminated successfully despite several error messages:
Error: 'java' not available in command path

After installing OpenJDK manually, the YAIM configuration paused to ask for a password. This is caused by a "pap-admin add-ace..." command run by YAIM that uses a password-protected key. After the password is provided the command fails; the error can be reproduced by calling it manually:

[root@vtb-generic-70 yum.repos.d]# export PAP_HOME=/opt/authz/pap; /opt/authz/pap/bin/pap-admin add-ace "/DC=ch/DC=cern/OU=computers/CN=vtb-generic-70.cern.ch" "POLICY_READ_LOCAL|POLICY_READ_REMOTE" --cert /root/.globus/usercert.pem --key /root/.globus/userkey.pem
Password:
Error: (500)No certificate found in request! 
This problem was due to a missing CA file and the use of a version 1 certificate. After installing the correct RPMs and using a version 3 certificate, the installation (with password, see https://savannah.cern.ch/bugs/?53482) completed successfully.
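
Whether a credential is a version 1 or a version 3 certificate can be checked beforehand from the Version field that openssl prints. The following is a minimal sketch, not part of the certified procedure: cert_version is a hypothetical helper name, and it assumes the usual "Version: N (0x..)" line in the output of "openssl x509 -noout -text".

```shell
# cert_version: read "openssl x509 -noout -text" output on stdin and print
# the X.509 version number found in its "Version:" line (e.g. 1 or 3).
# Hypothetical helper, for illustration only.
cert_version() {
    sed -n 's/^ *Version: *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage (the certificate path is the one passed to pap-admin above):
#   openssl x509 -in /root/.globus/usercert.pem -noout -text | cert_version
```

Reading from stdin keeps the helper independent of where the certificate is stored.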

An error due to a missing vomsdir was also present; it can be removed by running the YAIM function config_vomsdir (https://savannah.cern.ch/bugs/?53235).

Functional tests

For the first round of tests, the test plan version 0.4 has been followed.

Configuration tests

Test-PAP-FUNC-1: Failed
  • KO: The server starts even with an empty security section in the configuration file. Is this a documentation bug, since the fields are defined as required?
  • OK: Without the required 'poll_interval' variable the server restart fails (though this is not clear from the output of the restart command):
[root@vtb-generic-54 rel-0_1]# export PAP_HOME=/opt/authz/pap; /etc/rc.d/init.d/pap-standalone restart
Restarting pap-standalone: Starting pap-standalone: Ok.

[root@vtb-generic-54 rel-0_1]# export PAP_HOME=/opt/authz/pap; /etc/rc.d/init.d/pap-standalone status
PAP not running...removing stale pid file
PAP not running!
and the log file shows:
java.util.NoSuchElementException: 'paps:properties.poll_interval' doesn't map to an existing object
  • KO: syntax errors in the configuration file do not prevent the server from starting, unless they affect a required variable.

Scripted tests output:

[root@vtb-generic-54 tests]# ./test-PAP-FUNC-1.sh 
Thu Jul 23 10:32:26 CEST 2009
---Test-PAP-FUNC-1---
1) testing required security section
FAILED
2) testing required poll_interval 
OK
3) testing syntax error: missing ']'
FAILED
4) testing syntax error: missing '='
FAILED
5) testing authz syntax error: missing ']'
OK
6) testing authz syntax error: missing ':'
OK
7) testing authz syntax error: missing 'permission'
OK
---Test-PAP-FUNC-1: TEST FAILED---
Thu Jul 23 10:35:23 CEST 2009

Test-PAP-FUNC-2: OK
  • OK: when the configuration file is not found the server does not start and prints the right error messages.
Scripted tests output:
[root@vtb-generic-54 tests]# ./test-PAP-FUNC-2.sh 
Thu Jul 23 10:25:13 CEST 2009
---Test-PAP-FUNC-2---
1) testing missing configuration file
OK
2) testing missing authz file
OK
---Test-PAP-FUNC-2: TEST PASSED---
Thu Jul 23 10:27:16 CEST 2009

Test-PAP-FUNC-3: OK
  • OK: when the authorization configuration file is not found the server does not start and prints the right error messages.

Scripted test output: see Test-PAP-FUNC-2 above.

PAP CLI

Test-PAP-FUNC-4: OK
  • OK: pap-admin provides meaningful help output

Test-PAP-FUNC-5: OK
  • - : pap-admin commands provide meaningful output (to be verified during certification)

Test-PAP-FUNC-6: OK
  • OK: when the server is down, pap-admin commands return meaningful error messages and the exit code is 5

[root@vtb-generic-54 tests]# /opt/authz/pap/bin/pap-admin ping
Contacting PAP at "https://localhost:8150/pap/services/"... Error: ; nested exception is: 
        java.net.ConnectException: Connection refused

[root@vtb-generic-54 tests]# /opt/authz/pap/bin/pap-admin list-paps
Error: ; nested exception is: 
        java.net.ConnectException: Connection refused

Test-PAP-FUNC-7: Ban user: OK

  • Misuse of the CLI shows the correct error message and exits with code 4
  • Without privileges (no proxy, or a proxy with no privileges) the CLI shows the correct error message and exits with code 5:
[root@vtb-generic-54 tests]# /opt/authz/pap/bin/pap-admin ban subject "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=pucciani/CN=621330/CN=Gianni Pucciani"
Error: org.glite.authz.pap.authz.exceptions.PAPAuthzException: Insufficient privileges to perform operation 'BanOperation'.
[root@vtb-generic-54 tests]# echo $?
5
  • Using the right credentials the command succeeds and exits with 0.
[root@vtb-generic-54 tests]# /opt/authz/pap/bin/pap-admin ban subject "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=pucciani/CN=621330/CN=Gianni Pucciani"
[root@vtb-generic-54 tests]# echo $?
0
[root@vtb-generic-54 tests]# /opt/authz/pap/bin/pap-admin lp   

default (local):

resource ".*" {

    action ".*" {
        rule deny { subject="/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=pucciani/CN=621330/CN=Gianni Pucciani" }
    }
}

Test-PAP-FUNC-8: Ban fqan: OK
  • The fqan is banned and the CLI returns 0

Test-PAP-FUNC-9: Unban user with no ban policy: OK
  • The CLI shows the proper error message and returns 2

Test-PAP-FUNC-10: Unban user: OK
  • The user is unbanned and the CLI returns 0

Test-PAP-FUNC-11: Unban fqan with no ban policy: OK
  • The CLI shows an error message and exits with 2

Test-PAP-FUNC-12: Unban fqan: OK
  • The fqan is unbanned and the CLI exits with 0

Scripted output for user/fqan ban/unban:

[root@vtb-generic-54 tests]# ./test-ban-unban-fqan.sh 
Thu Jul 23 16:52:39 CEST 2009
---Test-BAN/UNBAN-FQAN---
1) testing fqan ban
OK
2) testing fqan unban
OK
3) testing unbanning non existing fqan
ban policy not found.
OK
---Test-BAN/UNBAND-FQAN: TEST PASSED---
Thu Jul 23 16:52:50 CEST 2009
[root@vtb-generic-54 tests]# ./test-ban-unban.sh      
Thu Jul 23 16:53:09 CEST 2009
---Test-BAN/UNBAN---
1) testing user ban
OK
2) testing user unban
OK
3) testing unbanning non existing subject
ban policy not found.
OK
---Test-BAN/UNBAND: TEST PASSED---
Thu Jul 23 16:53:17 CEST 2009
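
The pattern behind these scripted checks (run a CLI command, compare its exit code with the documented one, print OK or FAILED) can be sketched as follows. This is an illustration only, not the code of the test scripts themselves; expect_rc is a hypothetical helper name.

```shell
# expect_rc EXPECTED CMD...: run CMD, compare its exit code with EXPECTED,
# and print OK or FAILED in the style of the scripted tests.
# Hypothetical helper, not taken from the actual test suite.
expect_rc() {
    expected=$1
    shift
    actual=0
    "$@" >/dev/null 2>&1 || actual=$?
    if [ "$actual" -eq "$expected" ]; then
        echo "OK"
    else
        echo "FAILED (exit code $actual, expected $expected)"
        return 1
    fi
}

# Example with the ban command shown earlier (the DN is a placeholder):
#   expect_rc 0 /opt/authz/pap/bin/pap-admin ban subject "<user DN>"
```

The command output is discarded here; a real test would probably also keep it, so that messages such as "ban policy not found." stay visible.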

Test-PAP-FUNC-13: Add policy from file: OK
  • Policies are added and the CLI exits with 0

Test-PAP-FUNC-14: Add policy from file with error: OK
  • Policies are not added and the CLI shows a proper error message and exits with 0

Scripted tests output:

[root@vtb-generic-54 tests]# ./test-policy-from-file.sh 
Thu Jul 23 17:41:47 CEST 2009
---Test-APF---
1) testing add policy from file
OK
2) testing add policy from file with error
Syntax error no policies has been added from file:/afs/cern.ch/user/p/pucciani/public/glitetests/src/org.glite.testsuites.ctb/Argus/tests/policyfile.txt
Reason:
org.glite.authz.pap.encoder.parser.ParseException: Encountered "deni" at line 4, column 14.
Was expecting one of:
    "permit" ...
    "deny" ...
    
OK
---Test-APF: TEST PASSED---
Thu Jul 23 17:42:07 CEST 2009

Test-PAP-FUNC-15 and 16: Update policies from file: OK
  • Given a non-existing or badly formatted file, the CLI returns a proper error message and exits with 2
  • Given a non-existing id, the CLI returns a proper error message and exits with 2
  • With correct arguments the policy is updated and the CLI returns 0
Scripted output:
[root@vtb-generic-54 tests]# ./test-upp-from-file.sh   
Mon Jul 27 11:05:08 CEST 2009
---Test-Update-Policy-From-File---
1) testing up with non existing file
Error: file "/afs/cern.ch/user/p/pucciani/public/glitetests/src/org.glite.testsuites.ctb/Argus/tests/dummy.txt" does not exists.
OK
2) testing up with non existing resource id
Error: resource id "dummy-id-999" does not exists.
OK
3) testing up with correct resource id 
ID=3a070047-04df-4674-94a0-113ffe2ba620
OK
4) testing up with changing only an action 
ID=public-bb9552fa-1d0d-4858-b334-de5953101f7d
OK
---Test-Update-Policy-From-File: TEST PASSED---
Mon Jul 27 11:06:10 CEST 2009

Test-PAP-FUNC-17-18-19-20-21: Remove policy: OK
  • providing a non-existing id, the CLI shows a proper error message and exits with code 5
  • removing from an empty repository shows a proper error message and exits with code 5
  • removing by resource id succeeds and the CLI returns 0
  • removing by action id succeeds and the CLI returns 0
  • removing by rule id succeeds and the CLI returns 0
  • removing by multiple rule ids succeeds and the CLI returns 0
  • removing by multiple rule ids, one of them wrong, succeeds and the CLI returns 0
Scripted output:
Mon Jul 27 12:03:27 CEST 2009
---Test-Remove-Policies---
1) testing removal with non existing id
error.
Error: org.glite.authz.pap.repository.exceptions.NotFoundException: Id not found: dummy_id
OK
2) testing removal with resource id
ID=9be0ab7d-b71f-4815-a88e-60108bb05a1f
OK
3) testing removal with action id
ID=public-36165e07-f6a4-4a7d-a120-7bf90239a31e
OK
4) testing removal with rule id
ID=58f55a3d-3ddd-4533-886f-c3287b4fdb14
OK
5) testing removal with multiple rules
ID=4f7a20a9-8b7e-43c0-af06-be1dd0d000e0
a878c920-86c9-485e-9960-b734f3bee29e
b192810e-15b4-4b07-ac30-8335bab5b352
OK
6) testing removal with multiple rules and one wrong
ID=327e4a4d-075a-4d13-8e5b-8cc966113273
9e650e88-d7e1-4f75-9c4a-5100783579f9
e1be2f65-ad21-412c-ae68-42be598dedc4
error.
Error: org.glite.authz.pap.repository.exceptions.NotFoundException: Id not found: another-non-existing-id
OK
7) testing removal with empty repository
error.
Error: org.glite.authz.pap.repository.exceptions.NotFoundException: Id not found: 327e4a4d-075a-4d13-8e5b-8cc966113273
OK
---Test-Remove-Polices: TEST PASSED---

Test-PAP-FUNC-22: Remove all policy: OK
  • removing all policies works correctly and returns 0
Scripted output:
[root@vtb-generic-54 tests]# ./test-remove-all-policies.sh 
Mon Jul 27 12:13:48 CEST 2009
---Test-Remove-All-Policies---
1) testing remove all policies 
OK
---Test-Remove-All-Polices: TEST PASSED---
Mon Jul 27 12:14:28 CEST 2009

Test-PAP-FUNC-23-24: List policy: OK
  • with empty repository the CLI returns 0
  • with a non-empty repository the CLI lists the policies and returns 0
  • with wrong pap-alias the CLI shows proper error message and returns 5

Scripted output

Mon Jul 27 12:26:34 CEST 2009
---Test-List-Policies---
1) testing list policies on an empty repository

default (local):
No policies has been found.
OK
2) testing list policies
OK
2) testing list policies with wrong pap-alias
Error: org.glite.authz.pap.repository.exceptions.NotFoundException: Not found: papAlias=dummy_pap
OK
---Test-List-Polices: TEST PASSED---
Mon Jul 27 12:27:18 CEST 2009

PAP Management

Test-PAP-FUNC-26-27-28-29-30-31: Add/remove PAP: FAILED
  • adding a local pap works correctly and returns 0
  • adding a local pap with an existing alias returns a proper error message and exit code !=0
  • adding a pap whose endpoint is down shows the correct error message but returns !=0 (it should return 0): FAILED
  • adding a remote pap specifying the endpoint does not work: FAILED: BUG:53678
  • trying to remove the default pap shows the right error message and returns !=0
  • trying to remove a non-existing pap shows the right error message and returns !=0
[root@vtb-generic-54 PAP-management]# ./add-remove-localpap.sh 
Mon Jul 27 16:24:55 CEST 2009
---Add/Remove-local-PAP---
1) testing apap with existing alias
pap already exists.
OK
2) testing apap with wrong endpoint
Error: ; nested exception is: 
        java.net.ConnectException: Connection refused
Failed
3) testing apap local
OK
3) test removing local pap
OK
4) test removing local default pap
Error: org.glite.authz.pap.papmanagement.PapManagerException: Deleting the default pap is not allowed
OK
5) test removing non-existing pap
PAP not found: Dummy
OK
---Test-Add/Remove-local-PAP: TEST FAILED---
Mon Jul 27 16:25:57 CEST 2009

Test-PAP-FUNC-32-33: Update PAP: Failed
  • Updating a remote pap specifying the endpoint fails due to BUG:53678
[root@vtb-generic-54 PAP-management]# /opt/authz/pap/bin/pap-admin upap PAP-14 "https://vtb-generic-14.cern.ch:8160/pap/services/" "/DC=ch/DC=cern/OU=computers/CN=vtb-generic-14.cern.ch"
Error: pap information successfully updated but cannot retrieve policies.
Error: org.glite.authz.pap.common.exceptions.PAPException: Error contacting Provisioning Policy management service: For input string: ":8160"
  • Updating a remote pap specifying only the hostname (or just the alias) works correctly

Test-PAP-FUNC-34-35: Enable/Disable paps: OK
  • dpap with a non-existing pap shows the correct error message and returns !=0
  • testing dpap with an already disabled pap does not show an error message and returns 0
  • testing epap with a wrong alias shows the correct error message and returns !=0
  • testing epap with good alias enables the pap and returns 0
  • testing dpap with good alias disables the pap and returns 0
  • testing dpap default pap disables the pap and returns 0
  • testing epap default pap enables the pap and returns 0
Scripted output
[root@vtb-generic-54 PAP-management]# ./en-disable-pap.sh 
Tue Jul 28 09:48:23 CEST 2009
---Test-Enable/Disable-PAP---
1) testing dpap with non existing pap
PAP not found: mypap
OK
2) testing dpap with already disabled pap
OK
3) testing epap with wrong alias
PAP not found: Dummy
OK
4) testing epap with good alias
OK
4) testing dpap with good alias
OK
5) testing dpap default pap
OK
6) testing epap default pap
OK
---Test-Enable/Disable-PAP: TEST PASSED---
Tue Jul 28 09:49:32 CEST 2009

Test-PAP-FUNC-36-37: refresh cache: Failed
  • using rc with a non-existing alias returns 0 without an error message: Failed BUG:53695
  • using rc with a local pap shows a proper error message and returns !=0
  • using rc with a remote pap, the cache is correctly refreshed and the CLI returns 0
  • using rc on an unavailable remote pap shows a proper error message and returns !=0

Scripted output:

[root@vtb-generic-54 PAP-management]# ./refresh-cache.sh 
Tue Jul 28 10:20:40 CEST 2009
---Test-Refesh-Cache---
1) testing rc with non existing alias
Refreshing cache for pap "Do-Not-Exist"... ok.
Failed
2) testing rc with a local pap
Refreshing cache for pap "default"...error: org.glite.authz.pap.common.exceptions.PAPException: "default" is local, nothing to refresh
OK
---Test-Refesh-Cache: TEST FAILED---

Test-PAP-FUNC-38-39-40-41-42: set/get pap orders: OK
  • with no ordering the default pap is shown first, 0 is returned.
  • tested with 3 paps, the paps are listed in the right order and 0 is returned (unlisted paps are appended at the end).
  • with a non-existing alias a correct error message is displayed and the CLI returns !=0
  • tests with PDP will be done later

Scripted output:

[root@vtb-generic-54 PAP-management]# ./set-get-pap-orders.sh 
Tue Jul 28 10:44:03 CEST 2009
---Test-Set/Get paps order---
1) testing gpo with no order
default
OK
2) testing spo with 3 paps
OK
2) Inverting the order
OK
3) using a non existing alias
Error: org.glite.authz.pap.distribution.DistributionConfigurationException: Error in remote paps order: unknown alias "local-pp3"
OK
---Test-Set/Get paps order: TEST PASSED---
Tue Jul 28 10:44:44 CEST 2009

Test-PAP-FUNC-43-44: Set/Get Polling time: OK
  • with a remote pap defined the polling time is correctly shown and the CLI returns 0
  • using spi the time is correctly changed and the CLI returns 0
Scripted output:
[root@vtb-generic-54 PAP-management]# ./set-get-poll-interval.sh 
Tue Jul 28 11:26:48 CEST 2009
---Test-Set/Get-Poll-Interval---
1) Setting polling time
OK
2) Retrieving polling time
OK
---Test-Set/Get-Poll-Interval: TEST PASSED---
Tue Jul 28 11:27:19 CEST 2009

Test-PAP-FUNC-45: Authz commands: OK
  • list-policies is not allowed with CONFIGURATION_READ privileges
  • list-policies is allowed with ANYONE:ALL
Scripted output:
[root@vtb-generic-54 PAP-management]# ./test-authz.sh 
Tue Jul 28 12:06:44 CEST 2009
---Test-authz---
1) testing lp with no authorization

default (local):
Error: org.glite.authz.pap.authz.exceptions.PAPAuthzException: Insufficient privileges to perform operation 'ListLocalPolicySetOperation'.
OK
1) testing lp with anyone full power
OK
---Test-authz: TEST PASSED---
Tue Jul 28 12:08:48 CEST 2009

Test-PAP-FUNC-47-48: FQAN based authorization : OK
  • This scenario has been tested with glexec and the pepcli using the following policy:
resource "wn" {

    action "execute-now" {
        rule deny { subject="CN=Test user 303,OU=GD,O=CERN,C=CH" }
        rule permit { fqan="/dteam/Role=.*/Capability=.*" }
    }
}

PDP daemon

Test-PDP-FUNC-1-2 daemon start/stop/status: OK
  • Without the configuration file the pdpctl start script reports an error, though the script exits with 0
  • The start/stop/status commands work as expected with a valid configuration file
  • The log file shows the retrieved policies when the server starts
  • Policies are retrieved after the retentionInterval specified in the configuration file

Test-PDP-FUNC-3 authz request from PEPd: OK
  • This has been tested several times with the end to end tests.

PDP performance

Test-PDP-PERF-1 allowed queuing requests

Test-PDP-PERF-2 overflowing requests

PEP daemon

Test-PEPD-FUNC-1-2 invalid conf file: OK
  • When the configuration file is not found an error is reported (though the init script returns 0).

Test-PEPD-FUNC-3-4-5:

Test-PEPD-FUNC-6: Environment Time PIP: OK
  • When the PIP is added to the pep ini file, the authorization request is correctly enhanced with date/time attributes.

Test-PEPD-FUNC-7-8: X509 PIP: OK
  • When the PIP is added to the pep ini file, the authorization request is correctly enhanced with attributes about the certificates.

Test-PEPD-PERF-1-2: Performance tests:
  • pepd with and without requests caching has been used during load tests, see below.

X509 PIP

Test-X509PIP-FUNC-1: authorization with V1 user certificate ?

Test-X509PIP-FUNC-2: authorization with V3 user certificates: OK
  • A correct decision and mapping obligation are returned:
[root@vtb-generic-15 ~]# pepcli --pepd http://vtb-generic-54.cern.ch:8154/authz -c test_user_300_cert.pem --resourceid "myCE" --actionid "run job" -t 60 -x
Decision: Permit
Username=dteam014
UID=18191
GID=500

Test-X509PIP-FUNC-3: authorization with VOMS proxy: OK
  • A correct decision and mapping obligation are returned:
[root@vtb-generic-15 ~]# pepcli --pepd http://vtb-generic-54.cern.ch:8154/authz -c proxy_300 --resourceid "myCE" --actionid "run job" -t 60 -x
Decision: Permit
Username=dteam019
UID=18263
GID=500

Test-X509PIP-FUNC-4: Unknown CA: OK
  • Using a user certificate from an unknown CA produces an error and is correctly tracked in the log files
[root@vtb-generic-15 ~]# pepcli --pepd http://vtb-generic-54.cern.ch:8154/authz -c proxy_300 --resourceid "myCE" --actionid "run job" -t 60 -x
Decision: Deny
Status: urn:oasis:names:tc:xacml:1.0:status:processing-error
Status message: Certificate with subject DN CN=Test user 300,OU=GD,O=CERN,C=CH failed PKIX validation

15:48:52.387 - ERROR [org.glite.voms.PKIUtils:609] - keyUsage extension present, but CertSign bit not active.
15:48:52.390 - ERROR [org.glite.voms.PKIVerifier:599] - Certificate verification: no trust anchor found.
15:48:52.390 - ERROR [org.glite.authz.common.pip.provider.X509PIP:204] - Certificate with subject DN CN=Test user 300,OU=GD,O=CERN,C=CH failed PKIX validation
15:48:52.406 - ERROR [org.glite.authz.pep.server.PEPDaemonRequestHandler:222] - Error preocessing policy information points
org.glite.authz.common.pip.PIPProcessingException: Certificate with subject DN CN=Test user 300,OU=GD,O=CERN,C=CH failed PKIX valid

Test-X509PIP-FUNC-4: Unknown VOMS server: OK
  • The PEP reacts as expected to the missing VOMS certificates:
[root@vtb-generic-15 ~]# pepcli --pepd http://vtb-generic-54.cern.ch:8154/authz -c proxy_300 --resourceid "myCE" --actionid "run job" -t 60 -x
Decision: Not Applicable

15:57:28.028 - ERROR [org.glite.voms.PKIVerifier:378] - Cannot find usable certificates to validate the AC. Check that the voms server host certificate is in your vomsdir directory.

PEP-C

Test-PEPC-FUNC1-2-3-4-5: OK
  • The pepcli has been used to test all the combinations foreseen in the test plan: OK

Test-X509PIP-FUNC-6-7: Invalid certificate presented: OK
  • The PEP reacts as expected when a pepcli command is submitted with an expired user certificate:

[root@vtb-generic-15 ~]# pepcli --pepd http://vtb-generic-54.cern.ch:8154/authz -c proxy_300 --resourceid "myCE" --actionid "run job" -t 60 -x
Decision: Deny
Status: urn:oasis:names:tc:xacml:1.0:status:processing-error
Status message: Certificate with subject DN CN=Test user 300,OU=GD,O=CERN,C=CH failed PKIX validation
  • The PEP returns 'Permit' when the certificate is valid:
[root@vtb-generic-15 ~]# pepcli --pepd http://vtb-generic-54.cern.ch:8154/authz -c proxy_300 --resourceid "myCE" --actionid "run job" -t 60 -x
Decision: Permit
Username=dteam019
UID=18263
GID=500

Test-X509PIP-FUNC-7-8-9-10-11: deny/permit by DN and VOMS FQAN: OK
  • The PEP reacts as expected when a pepcli command is submitted with an expired user certificate.
  • With a valid proxy certificate the correct rule is applied.
  • Deny/Permit by DN works as expected. NOTE: the user DN has to be in RFC2253 format, which can be obtained using:
[root@vtb-generic-15 ~]# openssl x509 -nameopt RFC2253 -in test_user_300_cert.pem -subject -noout
subject= CN=Test user 300,OU=GD,O=CERN,C=CH

  • Deny/Permit by VOMS FQAN works as expected
  • The first policy to render a decision is the one chosen when more than one applies.
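
When only the slash-separated form of a DN is at hand (as used with pap-admin earlier in this report), the RFC2253 string can be approximated by reversing the components and joining them with commas. This is a rough sketch under stated assumptions: dn_to_rfc2253 is a hypothetical helper, and it does not handle attribute values that contain "/" or characters that RFC2253 requires to be escaped. The openssl command above remains the reliable way to obtain the string.

```shell
# dn_to_rfc2253 DN: convert a slash-separated DN such as
# "/C=CH/O=CERN/OU=GD/CN=Test user 300" to the comma-separated,
# reversed-order form. Hypothetical helper: no escaping is performed.
dn_to_rfc2253() {
    printf '%s\n' "$1" | awk -F/ '{
        out = ""
        # Field 1 is empty (the DN starts with "/"), so stop at field 2.
        for (i = NF; i >= 2; i--) {
            out = out (out == "" ? "" : ",") $i
        }
        print out
    }'
}

dn_to_rfc2253 "/C=CH/O=CERN/OU=GD/CN=Test user 300"
# prints: CN=Test user 300,OU=GD,O=CERN,C=CH
```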

X509 and Grid Map OH

Test-X509OBLIG-Func1-2-3-4: OK

  • When the grid-mapfile is not found a proper error is reported:
org.glite.authz.common.config.ConfigurationException: Unable to configure Obligation Handler MAPPING_OH. The following error was reported: Unable to read map file /etc/grid-security/grid-mapfile

  • When the gridmapdir does not exist a proper error message is reported:
org.glite.authz.common.config.ConfigurationException: Unable to configure Obligation Handler MAPPING_OH. The following error was reported: Grid map directory /etc/grid-security/gridmapdir does not exist

  • With an invalid UID the proper error message is reported and the PEP starts
12:18:37.087 - WARN [org.glite.authz.common.obligation.provider.dfpmap.impl.EtcPasswdIDMappingStrategy:77] - The GID 4294967294 is not a valid, the /etc/group entry on line 38 is being ignored

Test-X509OBLIG-Func5
  • With an empty gridmapdir:
[root@vtb-generic-15 argus]# pepcli --pepd http://vtb-generic-98.cern.ch:8154/authz -c usercerts/proxy_309 --resourceid "resource_1" --actionid "submit-job" -t 60 -x
Decision: Deny
Status: urn:oasis:names:tc:xacml:1.0:status:processing-error
Status message: Unable to map subject to a POSIX account

Test-X509OBLIG-Func6-7-8-9: OK
  • the grid map handler works as expected. When an error in the configuration is found, or the user cannot be mapped, a Deny decision is taken.

End-To-End System testing

Test-SYS-Simple-Local-Permit: allow only pilot role in a certain VO to submit jobs: OK
  • The scenario works as expected. When modifying the grid-mapfile or groupmapfile, the pepd has to be restarted.

Test-SYS-Complex Ban-1: allow all users of a VO except a few banned ones: OK
  • The scenario works as expected.

Test-SYS-Complex Ban-2: as above but Permit when banned user accesses with a certain FQAN: OK
  • This scenario has been tested allowing only users with the pilot role or with another specific role.

Test-SYS-Local-Simple Ban-1: banning a CA:
  • Successfully tested after the bug fix

Test-SYS-Local-Simple Ban-2: banning a DN: OK
  • Note that the DN has to be in RFC2253 format, e.g.:
[root@vtb-generic-15 argus]# openssl x509 -nameopt RFC2253 -in /afs/cern.ch/user/p/pucciani/.globus/usercert.pem -subject -noout
subject= CN=Gianni Pucciani,CN=621330,CN=pucciani,OU=Users,OU=Organic Units,DC=cern,DC=ch

Test-SYS-Local-Simple Ban-3: banning a VO: OK
  • This scenario has been tested banning the VO org.glite.voms-test and allowing dteam

Test-SYS-Local-Simple Ban-4: banning a serial number: TODO

Failure tests

PEPd daemon stopped: OK
  • glexec immediately returns an error message:
# ./glexec-test.sh usercerts/proxy_300
vtb-generic-15.cern.ch
===
Wed Sep 30 10:27:48 CEST 2009
===
[gLExec]:   LCMAPS failed, see '/var/log/glexec/lcas_lcmaps.log' for more info.
Glexec returned 203
with the log file showing:
LCMAPS 1: 2009-09-30.10:27:48-06184 : Error: pep_authorize(request,response) failed: 11: [11]: authorize: processing error: PEPd[http://vtb-generic-20.cern.ch:8154/authz]: sending XACML request failed: curl[7]: couldn't connect to server.
LCMAPS 1: 2009-09-30.10:27:48-06184 : Error: Failed to engage contact with PEP Daemon and get an positive or negative authorization decision

  • pepcli immediately returns an error message:
[root@vtb-generic-15 argus]# pepcli --pepd http://vtb-generic-20.cern.ch:8154/authz -c usercerts/proxy_300 --resourceid "wn" --actionid "execute-now" -t 60 -x
pepcli:ERROR: failed to authorize XACML request: [11]: authorize: processing error: PEPd[http://vtb-generic-20.cern.ch:8154/authz]: sending XACML request failed: curl[7]: couldn't connect to server.

PDP daemon stopped, killed/restarted: OK (PEP cache disabled)
  • pepcli immediately returns an error message:
[root@vtb-generic-15 argus]# pepcli --pepd http://vtb-generic-20.cern.ch:8154/authz -c usercerts/proxy_300 --resourceid "wn" --actionid "execute-now" -t 60 -x
Decision: Deny
Status: urn:oasis:names:tc:xacml:1.0:status:processing-error
with pepd process log showing the cause of the error.
  • when the pdp is restarted, a few seconds (<5) are needed before the pepcli gets a correct decision.
  • when the pdp process is killed and restarted the pepcli goes through the following steps
    • pdp down: Decision: Deny, Status: urn:oasis:names:tc:xacml:1.0:status:processing-error
    • pdp starting: as above plus, Decision: Indeterminate Status: urn:oasis:names:tc:xacml:1.0:status:processing-error
    • pdp started: Permit with correct mapping
  • with the pepcli constantly issuing requests, and the pdp killed/restarted, we obtained the same results as above, with some indeterminate decisions during the pdp restart.
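
The restart behaviour described above (Deny or Indeterminate while the pdp is down or starting, then Permit) can be observed by polling until the first Permit. A minimal sketch, assuming only that the client prints a "Decision: Permit" line as in the transcripts; wait_for_permit is a hypothetical helper, and the retry count and delay are arbitrary choices.

```shell
# wait_for_permit TRIES DELAY CMD...: rerun CMD until its output contains a
# "Decision: Permit" line. Prints the number of attempts needed and returns 0,
# or returns 1 if no Permit was seen within TRIES attempts.
# Hypothetical helper, for illustration only.
wait_for_permit() {
    tries=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$tries" ]; do
        if "$@" 2>&1 | grep -q '^Decision: Permit'; then
            echo "$n"
            return 0
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 1
}

# Usage with the pepcli call shown above (up to 10 attempts, 1 second apart):
#   wait_for_permit 10 1 pepcli --pepd http://vtb-generic-20.cern.ch:8154/authz \
#       -c usercerts/proxy_300 --resourceid "wn" --actionid "execute-now" -t 60 -x
```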

PDP daemon stopped, killed/restarted: OK (PEP cache enabled)
  • The tests issued 1000 requests with the pepcli. The pepd cache was enabled with 500 cached requests. While the pepcli requests were running, the pdp daemon was killed and restarted. All 1000 requests obtained the right decision and user mapping.

Network interruption between PEP client and PEPd: OK
  • The test was done using the pepcli and taking down the network interface of the virtual machine where the pepd is installed. Without any timeout specified, the pepcli returns after ~30s with:
# pepcli --pepd http://vtb-generic-20.cern.ch:8154/authz -c usercerts/proxy_300 --resourceid "resource1" --actionid "action1" -x
pepcli:ERROR: failed to authorize XACML request: [11]: authorize: processing error: PEPd[http://vtb-generic-20.cern.ch:8154/authz]: sending XACML request failed: curl[28]: a timeout was reached.
Subsequent requests return faster, within a few seconds, with:
# pepcli --pepd http://vtb-generic-20.cern.ch:8154/authz -c usercerts/proxy_300 --resourceid "resource1" --actionid "action1" -x
pepcli:ERROR: failed to authorize XACML request: [11]: authorize: processing error: PEPd[http://vtb-generic-20.cern.ch:8154/authz]: sending XACML request failed: curl[7]: couldn't connect to server.
When the network interruption is shorter than the timeout, the pepcli obtains the expected response. The same behavior has also been tested using glexec as the client.

Load tests

28 October: new pep library on physical host

This experiment was done using Argus deployed on a physical machine, with the new version of the library, pdp-pep-common-1.0.2.jar. Its purpose was to verify that the errors previously observed, due to 'Unable to resolve ID information for mapped account', were gone. The new library did fix the issue. The only errors observed in this experiment were due to timeouts (0.0066% error rate).

| Test   | Description        | Successful Tests | Errors | Mean Time | Mean Time Standard Deviation | TPS | Peak TPS |
| Test 1 | pepcli multi tests | 10413610         | 693    | 68.8      | 41.5                         | 154 | 479      |

19 Oct 2009: SSL enabled, 10 clients 4 days.

This experiment was done to check the service behavior with SSL enabled on the pepd ('enableSSL = true' added to pepd.ini). The error rate is lower than the one seen with SSL disabled:
  • Total requests: 14M
  • Total errors: 1641 (0.012%) (all Deny but expected Permit)
  • Throughput: 33.6 TPS

The response time is higher with SSL than without it (mean test time is 283.18 ms, with a large standard deviation of 548.86 ms).

The errors were all of the same type:

10/14/09 3:12:24 PM (thread 0 run 19557 test 1): Received Decision: Deny but expected: Permit
10/14/09 3:12:24 PM (thread 0 run 19557 test 1): Status: urn:oasis:names:tc:xacml:1.0:status:processing-error
10/14/09 3:12:24 PM (thread 0 run 19557 test 1): Status message: Unable to resolve ID information for mapped account

The memory consumption of the 3 services is as follows (only 40 hours recorded due to problems in the experiment setup):

  • [Figure: memory consumption with SSL]

This measurement will, however, be repeated using the status call of the pep and pdp daemons.

12 Oct 2009: Grinder pepcli-multiauthz, long test with 1 client

This test, run with a single client (1 process, 1 thread), confirmed that with a request rate of around 20 TPS the service is very reliable and unlikely to return any error.
  • Total requests: 6.4M
  • Total errors: 0
  • Throughput: 20 TPS

8 Oct 2009: Grinder pepcli-multiauthz at 150 TPS for 3 days

  • Total requests: 27.3M
  • Total errors: 9677 (0.035%), all Deny when expecting Permit
  • Throughput: 152 TPS
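
The percentages quoted in these load tests are consistent with errors divided by total requests. A minimal sketch of that arithmetic, assuming this is how the figures were derived; error_rate is a hypothetical helper, and since the totals in this report are rounded (e.g. 27.3M) the result is approximate.

```shell
# error_rate ERRORS TOTAL: print ERRORS as a percentage of TOTAL,
# with four decimal places. Hypothetical helper, for illustration only.
error_rate() {
    awk -v e="$1" -v t="$2" 'BEGIN {
        if (t == 0) exit 1          # avoid dividing by zero
        printf "%.4f\n", e * 100 / t
    }'
}

error_rate 9677 27300000    # prints 0.0354
```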

6 Oct 2009-2: Grinder pepcli-multiauthz varying the number of clients

This experiment was done using the same setup as the previous tests, but varying the number of clients. Each client has a single thread. The results show that when the number of tests per second is under a certain threshold (~50 tests per second), errors are unlikely. With higher request frequencies, up to 120 tests per second, the errors are within 0.1% of the requests. All these errors were Deny decisions received when a Permit was expected.

6 Oct 2009: Grinder pepcli-multiauthz, 1 day, 10 users (2 banned) 1 process.

This test was done to verify that, with a lower number of requests per second, the number of errors would be much lower. In fact:
  • Total requests: 1.45M
  • Total errors: 0
  • Throughput: 17.6 tests per second
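
As a consistency check, the request count and the throughput above imply a run length close to the 1 day stated in the title. A minimal sketch (the function name is hypothetical):

```python
def run_hours(total_requests, tests_per_second):
    """Estimate the run duration in hours from the request count and rate."""
    return total_requests / tests_per_second / 3600.0


if __name__ == "__main__":
    # 1.45M requests at 17.6 tests per second, as reported above.
    print(f"{run_hours(1_450_000, 17.6):.1f} hours")
```

With the rounded figures from the report this gives a little under 23 hours.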

5 Oct 2009: Grinder pepcli-multiauthz, 3 days, 10 users (2 banned) 3 processes.

The test was done using the grinder tool with the pepcli-multiauthz tests with 10 user certificates (2 banned). 3 worker processes with 2 threads each were used.
  • Total requests: ~25M
  • Total errors: ~12K
  • Throughput: 76 tests per second

All the errors were Deny decisions received when a Permit was expected. The 'banned user authorized' error did not show up.

30 Sept 2009: 19 hours, 10 users, 2 banned (single process).

The 'banned user authorized' error did not show up (throughput was ~26 reqs/sec). The test script output was:
# ./argus_usecase1.sh -d 200909300830 -p usercerts
Started on: Tue Sep 29 13:24:49 CEST 2009
The test will end on 200909300830
1963746 iterations done
Ended on: Wed Sep 30 08:30:00 CEST 2009
Checking results...
User 300 should be mapped to dteam012: Error. good_mappings=194372, iterations=196326
User 301 should be mapped to dteam049: Error. good_mappings=194283, iterations=196265
User 302 should be mapped to dteam010: OK
User 303 should be always banned: OK
User 304 should be mapped to dteam007: Error. good_mappings=193984, iterations=196006
User 305 should be mapped to dteam036: Error. good_mappings=195459, iterations=197417
User 306 should be always banned: OK
User 307 should be mapped to dteam021: Error. good_mappings=194788, iterations=196792
User 308 should be mapped to dteam001: Error. good_mappings=194529, iterations=196488
User 309 should be mapped to dteam008: OK
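
The per-user failure rate can be derived from the good_mappings and iterations values printed by the script. The sketch below assumes the output keeps the layout shown above; `ERROR_LINE` and `mapping_failures` are hypothetical names, not part of argus_usecase1.sh.

```python
import re

# Matches the "Error" result lines printed by the test script above.
ERROR_LINE = re.compile(
    r"User (?P<user>\d+) should be mapped to (?P<account>\w+): Error\. "
    r"good_mappings=(?P<good>\d+), iterations=(?P<total>\d+)"
)


def mapping_failures(lines):
    """Return {user: (failed_iterations, failure_percent)} for Error lines."""
    result = {}
    for line in lines:
        m = ERROR_LINE.search(line)
        if m:
            good, total = int(m.group("good")), int(m.group("total"))
            failed = total - good
            result[int(m.group("user"))] = (failed, 100.0 * failed / total)
    return result


SAMPLE = [
    "User 300 should be mapped to dteam012: Error. "
    "good_mappings=194372, iterations=196326",
    "User 302 should be mapped to dteam010: OK",
    "User 303 should be always banned: OK",
]

if __name__ == "__main__":
    for user, (failed, percent) in mapping_failures(SAMPLE).items():
        print(f"User {user}: {failed} failed iterations ({percent:.2f}%)")
```

For user 300, for instance, 196326 - 194372 = 1954 iterations did not return the expected mapping, which is about 1% of that user's requests.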

29 Sept 2009: 17 hours, 10 users, 2 banned.

The 'banned user authorized' error happened twice. The other errors, probably due to dropped requests, were seen in similar numbers to the previous test. These errors do not produce any output, so they result in neither a wrong decision nor an incorrect mapping. They seem, however, to be related to the errors seen in the log files: 'ERROR [org.mortbay.log:87] - handle failed' and 'ERROR [org.mortbay.log:87] - EXCEPTION'.
  • Total requests: ~6.5M
  • Total errors: ~5K
  • Throughput: ~100 reqs/sec

25 Sept 2009: 3 hours, 10 users, 1 banned

This test targeted the errors seen in previous tests. The test machine was a virtual machine with 1GB memory. The sleep time used in the previous tests was removed, so each process called pepcli continuously. 10 WNs were used, with a single test process on each WN.

  • Total requests: ~1M
  • Total errors: ~1K (in a single case the error resulted in a Permit decision for the banned user).
  • Throughput: ~92 reqs/sec

The errors in the pepd process log (INFO level) were of two types:

13:59:53.664 - ERROR [org.mortbay.log:87] - handle failed
13:59:57.930 - ERROR [org.mortbay.log:87] - EXCEPTION
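
To see how often each of the two error types occurs, the pepd log can be summarized per logger and message. This is a sketch only: the line layout is inferred from the two sample lines above, and `LOG_LINE` and `count_errors` are hypothetical names.

```python
import re
from collections import Counter

# Layout inferred from the two sample lines:
# time - LEVEL [logger:line] - message
LOG_LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) - (?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]:]+):(?P<line>\d+)\] - (?P<message>.*)$"
)


def count_errors(lines):
    """Count ERROR entries per (logger, message) pair."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and m.group("level") == "ERROR":
            counts[(m.group("logger"), m.group("message"))] += 1
    return counts


SAMPLE = [
    "13:59:53.664 - ERROR [org.mortbay.log:87] - handle failed",
    "13:59:57.930 - ERROR [org.mortbay.log:87] - EXCEPTION",
]

if __name__ == "__main__":
    for (logger, message), n in count_errors(SAMPLE).items():
        print(f"{logger}: {message}: {n}")
```

Comparing these counts with the number of failed client requests would show whether the two are related.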

This test will be repeated with the pepd log level raised to DEBUG and with the log rotation modified so that the log file does not reach the maximum file size allowed by the OS. The logging has also been modified to suppress the line "ERROR [org.glite.voms.PKIUtils:609] - keyUsage extension present, but CertSign bit not active."

4 Sept 2009: 1 day, 10 users, 2 banned

The test machine was a virtual machine with 256MB memory. The test was done with a single process that was executing the following request:
pepcli --pepd http://vtb-generic-54.cern.ch:8154/authz -c $proxy --resourceid "resource1" --actionid "action1" -t 60 -x
A random sleep time in [0,5] seconds was inserted between two iterations. The proxy was chosen randomly among 10 available user proxies. The set of policies loaded in the PAP was:
[root@vtb-generic-54 argus]# cat permit_dteam_ban_2_users.txt 
resource "resource1" {
    action "action1" {
        rule deny { subject="CN=Test user 303,OU=GD,O=CERN,C=CH" }
        rule deny { subject="CN=Test user 306,OU=GD,O=CERN,C=CH" }
        rule permit { fqan="/dteam/Role=.*/Capability=.*" }
    }
}
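
The expected outcome of this policy (users 303 and 306 always banned, every other dteam member permitted) can be illustrated with a small model. This is not the Argus implementation: it assumes the rules are tried top to bottom with the first match winning, that subjects are compared literally, and that the fqan value is treated as a regular expression. The DN for user 300 and the FQAN strings in the usage note are hypothetical, following the pattern of the banned subjects.

```python
import re

# Rules from permit_dteam_ban_2_users.txt, in file order.
RULES = [
    ("deny", "subject", "CN=Test user 303,OU=GD,O=CERN,C=CH"),
    ("deny", "subject", "CN=Test user 306,OU=GD,O=CERN,C=CH"),
    ("permit", "fqan", r"/dteam/Role=.*/Capability=.*"),
]


def decide(subject, fqans):
    """Return the effect of the first matching rule, else "not-applicable".

    Assumption: first-match-wins ordering, literal subject comparison,
    and the fqan value used as a full-match regular expression.
    """
    for effect, attribute, value in RULES:
        if attribute == "subject" and subject == value:
            return effect
        if attribute == "fqan" and any(re.fullmatch(value, f) for f in fqans):
            return effect
    return "not-applicable"


if __name__ == "__main__":
    dteam = ["/dteam/Role=NULL/Capability=NULL"]
    print(decide("CN=Test user 303,OU=GD,O=CERN,C=CH", dteam))
    print(decide("CN=Test user 300,OU=GD,O=CERN,C=CH", dteam))
```

Under these assumptions a banned user is denied even though the dteam FQAN would otherwise match the permit rule, because the deny rules come first.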
The result was:
[root@vtb-generic-15 argus]# ./argus_usecase1.sh -d 200909051700 -p usercerts
Started on: Fri Sep  4 17:35:01 CEST 2009
The test will end on 200909051700
39940 iterations done
Ended on: Sat Sep  5 17:00:02 CEST 2009
Checking results...
User 300 should be mapped to dteam019: OK
User 301 should be mapped to dteam045: Error. good_mappings=4053, iterations=4054
User 302 should be mapped to dteam041: OK
User 3 should be always banned: OK
User 304 should be mapped to dteam014: OK
User 305 should be mapped to dteam047: OK
User 6 should be always banned: OK
User 307 should be mapped to dteam040: OK
User 308 should be mapped to dteam043: Error. good_mappings=4031, iterations=4032
User 309 should be mapped to dteam001: OK
This time the periodic error (recurring every few hours) noticed in the previous test was not seen.

98.96% of the response times were in the interval [0,0.5) sec:

[root@vtb-generic-15 argus]# cat datafilt_out_data 
zone 1 [0,0.5): 39524
zone 2 [0.5,2): 333
zone 3 [2,+inf): 82
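
The percentage above follows directly from the three zone counts. A minimal sketch of the bucketing and of the share computation, using the zone boundaries listed in the output (function names are hypothetical):

```python
def zone_of(seconds):
    """Classify one response time using the zone boundaries listed above."""
    if seconds < 0.5:
        return "zone 1"
    if seconds < 2.0:
        return "zone 2"
    return "zone 3"


def zone_shares(counts):
    """Return each zone's share of all measured responses, in percent."""
    total = sum(counts.values())
    return {zone: 100.0 * n / total for zone, n in counts.items()}


# Counts taken from datafilt_out_data above.
COUNTS = {"zone 1": 39524, "zone 2": 333, "zone 3": 82}

if __name__ == "__main__":
    for zone, share in zone_shares(COUNTS).items():
        print(f"{zone}: {share:.2f}%")
```

The remaining responses split into roughly 0.83% between 0.5 and 2 seconds and 0.21% at 2 seconds or more.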
The response time plot is in the attachment 090904rtime.ps.

2 Sept 2009: 15 hours, 1 user

The test machine was a virtual machine with 256MB memory. The test was done by submitting requests from a single process (using 1 user proxy), with a random sleep time in [0,5] seconds between two iterations. The request was:
pepcli --pepd http://vtb-generic-54.cern.ch:8154/authz -c proxy_300 --resourceid "resource1" --actionid "action1" -t 60 -x
The policy loaded in the PAP was allowing such a request, making the PDP return a Permit decision. Results were:
Elapsed time: ~15 hours
Started on: Wed Sep  2 17:43:01 CEST 2009
25330 iterations done
Ended on: Thu Sep  3 08:30:03 CEST 2009
Errors: 5 (Deny decision)
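
The request loop used in this test (one process, a random pause of up to 5 seconds, one pepcli call per iteration) can be sketched as follows. The actual test script is not shown in this report, so this is only an illustration: the function names are hypothetical, and only the pepcli flags that appear in the command above are used.

```python
import random

PEPD_URL = "http://vtb-generic-54.cern.ch:8154/authz"


def build_pepcli_command(proxy, pepd_url=PEPD_URL):
    """Build the pepcli argument list with the flags shown in the request above."""
    return [
        "pepcli", "--pepd", pepd_url, "-c", proxy,
        "--resourceid", "resource1", "--actionid", "action1",
        "-t", "60", "-x",
    ]


def plan_iterations(proxies, count, rng=None):
    """Yield (sleep_seconds, command) pairs for `count` iterations.

    The sleep is drawn uniformly from [0, 5] seconds and the proxy is
    picked at random; with a single proxy the choice is trivial.
    """
    rng = rng or random.Random()
    for _ in range(count):
        yield rng.uniform(0.0, 5.0), build_pepcli_command(rng.choice(proxies))


if __name__ == "__main__":
    # Print the plan only; a real run would sleep and then execute each
    # command, for example with time.sleep(delay) and subprocess.run(command).
    for delay, command in plan_iterations(["proxy_300"], 3):
        print(f"sleep {delay:.2f}s, then: {' '.join(command)}")
```

Passing a list of 10 proxies instead of one gives the random proxy selection described for the 4 Sept test.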
It was found that errors happened regularly, every 3h. The errors were traced in the log as:
20:40:27.294 - ERROR [org.glite.authz.pep.server.PEPDaemonRequestHandler:287] - Error sending request to PDP endpoint http://127.0.0.1:8152/authz
org.opensaml.ws.soap.client.SOAPClientException: Unable to send request to http://127.0.0.1:8152/authz
        at org.opensaml.ws.soap.client.http.HttpSOAPClient.send(HttpSOAPClient.java:114) [openws-1.3.0.jar:na]
        at org.glite.authz.pep.server.PEPDaemonRequestHandler.sendRequestToPDP(PEPDaemonRequestHandler.java:275) [pepd-1.0.0.jar:1.0.0]
        at org.glite.authz.pep.server.PEPDaemonRequestHandler.handle(PEPDaemonRequestHandler.java:188) [pepd-1.0.0.jar:1.0.0]
        at org.glite.authz.pep.server.PEPDaemonServlet.doPost(PEPDaemonServlet.java:84) [pepd-1.0.0.jar:1.0.0]
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:727) [servlet-api-2.5-20081211.jar:na]
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) [servlet-api-2.5-20081211.jar:na]
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502) [jetty-6.1.18.jar:6.1.18]
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157) [jetty-6.1.18.jar:6.1.18]
        at org.glite.authz.common.logging.AccessLoggingFilter.doFilter(AccessLoggingFilter.java:41) [pdp-pep-common-1.0.0.jar:na]
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1148) [jetty-6.1.18.jar:6.1.18]
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:387) [jetty-6.1.18.jar:6.1.18]
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765) [jetty-6.1.18.jar:6.1.18]
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) [jetty-6.1.18.jar:6.1.18]
        at org.mortbay.jetty.Server.handle(Server.java:326) [jetty-6.1.18.jar:6.1.18]
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534) [jetty-6.1.18.jar:6.1.18]
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:879) [jetty-6.1.18.jar:6.1.18]
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:747) [jetty-6.1.18.jar:6.1.18]
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) [jetty-6.1.18.jar:6.1.18]
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) [jetty-6.1.18.jar:6.1.18]
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409) [jetty-6.1.18.jar:6.1.18]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) [na:1.6.0]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) [na:1.6.0]
        at java.lang.Thread.run(Thread.java:636) [na:1.6.0]
Caused by: org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 failed to respond
        at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1976) [commons-httpclient-3.1.jar:na]
        at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735) [commons-httpclient-3.1.jar:na]
        at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098) [commons-httpclient-3.1.jar:na]
        at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398) [commons-httpclient-3.1.jar:na]
        at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171) [commons-httpclient-3.1.jar:na]
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) [commons-httpclient-3.1.jar:na]
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) [commons-httpclient-3.1.jar:na]
        at org.opensaml.ws.soap.client.http.HttpSOAPClient.send(HttpSOAPClient.java:102) [openws-1.3.0.jar:na]
        ... 22 common frames omitted
20:40:28.967 - ERROR [org.glite.authz.pep.server.PEPDaemonRequestHandler:300] - No PDP endpoint was able to answer the authorization request

The response time plot is in the attachment 090902rtime.ps.

The memory consumption plot is in the attachment 090902mem.ps.

TODO

Open Issues

[root@vtb-generic-54 argus]# /opt/authz/pap/bin/pap-admin rap
/opt/authz/pap/bin/pap-admin: line 19:  2951 Killed                  $PAP_CLIENT_CMD "$@"
[root@vtb-generic-54 argus]# /opt/authz/pap/bin/pap-admin lp -srai
/opt/authz/pap/bin/pap-admin: line 19:  2975 Killed                  $PAP_CLIENT_CMD "$@"

-- GianniPucciani - 10 Jul 2009

Topic attachments
Attachment       History  Size     Date              Who
090902mem.ps     r1       41.6 K   2009-09-07 10:30  GianniPucciani
090902rtime.ps   r1       133.0 K  2009-09-07 10:29  GianniPucciani
090904rtime.ps   r1       161.6 K  2009-09-07 10:23  GianniPucciani
errors.ps        r1       18.9 K   2009-10-06 13:37  GianniPucciani
mean.ps          r1       19.3 K   2009-10-06 13:38  GianniPucciani
memcons_ssl.ps   r1       61.5 K   2009-10-19 09:54  GianniPucciani
Topic revision: r47 - 2009-11-18 - GianniPucciani
 