Shared certification managed by product team
In the new model prepared for EGI, the testbed will be fully shared: SA3 will no longer maintain a set of services on its own as it does today. This central testbed will be used as a reference by everybody to certify their own products. You can find the future testbed architecture, references and information at
EGI Testbed.
Notes on the security management system and SELinux for SL5
We are in a testing and certification environment where the middleware is not stable enough to be in production; not stable also means not secure.
This page contains a lot of information that applies to any site installation:
http://osct.web.cern.ch/osct/dissemination.html
SELinux is now activated by default in SL5. If you do not give any information about the SELinux configuration in your kickstart file, the definition is:
# more /etc/sysconfig/selinux
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - SELinux is fully disabled.
SELINUX=enforcing
# SELINUXTYPE= type of policy in use. Possible values are:
# targeted - Only targeted network daemons are protected.
# strict - Full SELinux protection.
SELINUXTYPE=targeted
# SETLOCALDEFS= Check local definition changes
SETLOCALDEFS=0
With the targeted policy, only the targeted network daemons are protected.
Our investigation of the consequences of SELinux for the middleware has not yet shown any problems.
The problems found are in a few operating system packages:
ntpd, syslogd and dnsmasq (used at least by ARGUS clients).
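If you suspect SELinux is interfering with one of these daemons, a quick check can be done along these lines (a sketch only; setenforce/getenforce are the standard SELinux tools, ausearch comes from the audit package, and the daemon names are just examples):
getenforce                                         # show the current SELinux mode
setenforce 0                                       # switch to permissive until the next reboot (does not edit /etc/sysconfig/selinux)
ausearch -m avc -c ntpd                            # look for AVC denials logged for ntpd
grep avc /var/log/audit/audit.log | grep dnsmasq   # the same with plain grep, here for dnsmasq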
Operating system
For CERN usage, a few kickstart files are available:
- SL3 for HDA and SDA (obsolete)
- SL4 32-bit and 64-bit SDA
- SL5 64-bit and 32-bit, plus a XEN profile
The GD testbed for certification
The goal of this TB is to ensure that the generic installation proposed by SA3 works properly:
- Installation and upgrade with the native operating system package manager
- Configuration with YAIM (see the sketch after this list)
- Testing a set of batch and storage systems
- Maintenance of this TB as a grid with multiple sites with different settings
- Testing the basic functionality of the middleware
- Stress testing the TB
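As an illustration of the install-and-configure cycle exercised here, a minimal sketch for a worker node with the native package manager plus YAIM could look like this (glite-WN and lcg-CA are the usual gLite metapackages, and the site-info.def path is an assumption):
# install the metapackage and the CA bundle, then configure the node with YAIM
yum install glite-WN lcg-CA
/opt/glite/yaim/bin/yaim -c -s /root/site-info.def -n glite-WN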
Inserting a node in the CERN testbed (intra-CERN)
For anybody who wants to be connected to the certification testbed and run the certification scripts:
YOU HAVE TO INSTALL our 3 internal packages :
- ca_BitFace
- ctb-vomscerts
- lcg-sam-client-sensors-ctb
SET THE site name:
cert-tb-cern
You will find the rpms
here.
We also have an internal
StayUpToDate rpm to install on testbed machines that need automatic updates.
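Assuming the certification repository described further down this page is already configured in yum, the three internal packages can be installed in one go:
yum install ca_BitFace ctb-vomscerts lcg-sam-client-sensors-ctb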
Monitoring of the TB maintained by NCG via BDII
All the documentation for the implementation is on the twiki page of the EGEE monitoring group.
Monitoring team
Log management of the middleware :
List of log files per type of service (this list was created in December; if there are changes to add, ping me ASAP).
Logs to syslog :
# lcg-logger configuration file
#syslog_flag = glite-
# The following file will be monitored
# YOU **MUST** RESTART LCG-logger FOR CHANGES TO BE APPLIED
# By default nothing is monitored, but here are a few suggestions:
log_file = /var/log/edg-fetch-crl-cron.log
# LSF
log_file = /var/log/lsfjobs.log
# CE
#log_file = /etc/grid-security/grid-mapfile
#log_file = /var/log/globus-gridftp.log
#log_file = /var/log/globus-gatekeeper.log
#log_file = /var/log/lcg-expiregridmapdir.log
# RB
log_file = /var/edgwl/logmonitor/log/events.log
log_file = /var/edgwl/networkserver/log/events.log
log_file = /var/log/edg-wl-in.ftpd.log
log_file = /root/RB-sandbox-cleanup.log
log_file = /var/edgwl/logging/status.log
log_file = /var/edgwl/jobcontrol/log/events.log
log_file = /var/edgwl/workload_manager/log/events.log
log_file = /mnt/raid/rb-state/opt/condor/var/condor/log/SchedLog
#log_file = /var/edgwl/logmonitor/CondorG.log/CondorG.*
log_file = /etc/grid-security/grid-mapfile
log_file = /var/log/globus-gridftp.log
log_file = /opt/bdii/var/bdii-fwd.log
log_file = /opt/bdii/var/tmp/stderr.log
log_file = /opt/bdii/var/bdii.log
log_file = /opt/globus/setup/globus/config.log
log_file = /var/log/fetch-crl-cron.log
log_file = /var/log/ccm-fetch.log
log_file = /var/log/ccm-purge.log
log_file = /var/log/lcg-expiregridmapdir.log
log_file = /var/log/globus-gatekeeper.log
log_file = /var/log/maui.log
log_file = /var/log/edg-mkgridmap.log
log_file = /var/log/cleanup-job-records.log
log_file = /var/log/edg-fmon-agent.log
log_file = /var/log/apel.log
log_file = /var/log/prelink.log
log_file = /var/log/cleanup-grid-accounts.log
log_file = /opt/bdiilog_file
log_file = /var/log/yum.log
Secure (and default) way to install a machine at CERN
Admin side
- Firewall settings and root access for users must be submitted to Romain Wartel or Louis Poncet.
- Use of our install.sh script with the kickstart created by the security team.
- Anything new to add to the kickstart (new root users for machines, new node types) has to be submitted to Romain Wartel or Louis Poncet.
- If you do not submit your request and the central firewall/root access is not set up, you will not be able to access your machine*
- Storing configuration files in a place accessible only to the user that runs the application, or in a secure shared place, is a good way to avoid replicating them in various, possibly less secure, places.
- Lemon support for FIO monitoring
- Make sure to receive the mail for the root account of the new node
- AFS OK
User side
- Complexity of my password and my passphrase for my certificates
- Encryption of my ssh private key
- Locking of my personal workstation, configured correctly
- Change of the password if it goes through an insecure protocol
- NEVER deactivate the firewall for a long time
- Short lifetime of the proxy (see the example after this list)
- Connections to the testbed hosts have to be made through the security gateway (gd01.cern.ch) or your workstation, but NOT through lxplus*
- The HOWTO is here: https://twiki.cern.ch/twiki/bin/view/LCG/SSHGateway
- All those rules have to be applied on YOUR workstation.
- In theory, ssh keys and certificates should be stored on a secure medium, BUT AFS is not compatible with this statement.
- Users can do what they want on their own machines, but they have to respect the basic security rules.
- ccdbuser and ccdbinfo can be really useful for the root of the machine to be certified (virtual or not)
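Two of the points above can be illustrated with standard commands (a sketch only; the key path and the VO name are examples, not testbed requirements):
ssh-keygen -p -f ~/.ssh/id_rsa               # add or change the passphrase of an existing ssh private key
voms-proxy-init --voms dteam --valid 12:00   # request a proxy valid for 12 hours only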
Testbed management (CERN)
Today we have firewall and user management through Romain's lcg-fw software. Access is granted and the firewall is managed for the testbed by Gergely, Di and Louis.
Currently the settings are properly made for:
- UI
- BDIIs
- lcg-CE
- Torque server
- SE classic
- DPM/LFC
- WMS/LB
- FTS
- PX
- VOMS
- WN
The admins are (patch certifier + "experts"):
- Oliver
- Di
- Maria
- Laurence
- Louis
- Andreas
How to install nodes for the testbed
- First of all, get your personal certificate and a VO membership.
- Configure your apt or yum with the proper config files (from the release notes).
- Install j2sdk.
- Update your package database (apt-get update or yum update).
- Install YAIM: apt-get install glite-yaim (yum install glite-yaim), and the full list of CAs with apt-get install lcg-CA (or yum install ...).
At this step the common process for all machines is finished; after that there are specific settings per node type.
To install a testbed we need one host certificate (with a non-encrypted private key) per machine, except for WNs and UIs; a conventional layout is shown below.
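On the nodes that need one, the host certificate and its non-encrypted key go in the conventional grid-security location with the usual permissions (a generic sketch, not a testbed-specific requirement):
cp hostcert.pem /etc/grid-security/hostcert.pem
cp hostkey.pem /etc/grid-security/hostkey.pem
chmod 644 /etc/grid-security/hostcert.pem
chmod 400 /etc/grid-security/hostkey.pem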
The certification testbed
The certification grid, which is available from the BDII CERT-TOP-BDII.cern.ch,
is a base for everybody who needs to certify a patch or a configuration. By contacting
LouisPoncet or
DiQing we can add your "grid part" to our testbed, which lets you install only the services that you need to change for your tests. This architecture, together with vNode, makes it possible to instantiate virtual nodes for certification.
The testbed is partially at CERN; other partners and specialized external sites also participate in this effort.
How to join the certification testbed effort
Multi-site management of the certification testbed lets us share our knowledge and increase the reliability of the certification. The main goal is to certify as many different configurations as possible (batch systems, storage, firewall settings, etc.).
The mailing list
support-lcg-certification@cernNOSPAMPLEASE.ch is the main communication channel. I would also like a Gmail/Gtalk account or any IM that is common to the whole team.
https://groups.cern.ch/group/support-lcg-certification/default.aspx
Technically, how to join the testbed
To start, you need to read the generic installation guide:
https://twiki.cern.ch/twiki/bin/view/LCG/GenericInstallGuide310
or
https://twiki.cern.ch/twiki/bin/view/LCG/GenericInstallGuide320
Then apply our specifications (a few rpms to install for certification and the right repository).
Set up a simple testbed and give me the LDAP URL to access its site BDII. I will then add this URL to our main
BDII (address below).
ldap://Site_bdii:2170/site=Site_name,o=grid
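Before sending me the URL you can check that your site BDII answers, for example with a query like this (the hostname and site name are the placeholders from the URL above):
ldapsearch -x -H ldap://Site_bdii:2170 -b "site=Site_name,o=grid"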
I always prefer to use the native tools of the OS to manage the set of packages that we need; for SL3 it was apt, now on SL4/SL5 it is yum.
You can find the basic configuration files for your yum repositories in this web repository:
https://grid-deployment.web.cern.ch/grid-deployment/certification/repos/
You can download the .repo files for your nodes.
Always use these jpackage and lcg-CA configuration files.
YOU MUST DOWNLOAD the file certification.repo from this repository to install our internal tools.
You have to install these 3 packages on your systems to be able to use our VOMS server and run the SAM tests, from the yum repository:
https://grid-deployment.web.cern.ch/grid-deployment/certification/repos/certification.repo
yum install ca_BitFace ctb-vomscerts
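A minimal sketch of fetching the repo file into the standard yum location before running the install command above:
cd /etc/yum.repos.d/
wget https://grid-deployment.web.cern.ch/grid-deployment/certification/repos/certification.repo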
You can also configure your site using the following values (see the site-info.def sketch after this list):
- site name : cert-tb-cern
- WMS : lxbra2303.cern.ch (DN : /C=CH/O=CERN/OU=GRID/CN=host/lxb2032.cern.ch)
- BDII : lxbra2305.cern.ch (the usage of this BDII is required)
- VOMS : lxbra23009.cern.ch
- PX : lxbra2304.cern.ch
- MON and REG : lxb7604.cern.ch
- LFC : lxb7607.cern.ch
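As a sketch, the corresponding entries in a YAIM site-info.def could look like this (the variable names are the usual YAIM ones and should be checked against your YAIM version; only the host names and the site name are taken from this page):
SITE_NAME=cert-tb-cern
WMS_HOST=lxbra2303.cern.ch
BDII_HOST=lxbra2305.cern.ch
PX_HOST=lxbra2304.cern.ch
MON_HOST=lxb7604.cern.ch
REG_HOST=lxb7604.cern.ch
LFC_HOST=lxb7607.cern.ch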
In this case you need to add your resources to the CERN site
BDII.
You can use:
https://bdii.web.cern.ch/bdii/
and edit the cert-tb-cern
BDII config file, but keep me informed.
Site and other config files (for CERN) are here :
/afs/cern.ch/project/gd/yaim-server/cert-TB-config/site-TB/
(Some mistakes may be found due to the unstable way patches are indicated in the various config files for the package managers.)
The LDAP URL of the certification BDII (to use in your configuration) is:
ldap://lxbra2035.cern.ch:2170/mds-vo-name=cert-tb-cern,o=grid
Our Testbed :
http://tb-map.cern.ch
Nagios monitoring :
http://tb-nagios.cern.ch
SAM monitoring :
http://tb-sam.cern.ch
Certification Process
https://twiki.cern.ch/twiki/bin/view/EGEE/HowToCertifyAPatch
Architecture of the testbed:
Other information :
- TB.pdf: Last presentation about the certification testbed
Sites
To add, remove or modify a site that is part of the testbed, the BDII editor tool lets authorized admins change our Top
BDII configuration.
https://bdii.web.cern.ch/bdii/
This tool can also help you to manage your siteBDII or local
TopBDII.
Access to this tool is configured through your DN.
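If you do not know your DN, you can read it from your user certificate, for example (grid-cert-info comes with the standard UI; plain openssl works as well):
grid-cert-info -subject
openssl x509 -in ~/.globus/usercert.pem -noout -subject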
CERN
Resources/Services provided by CERN :
TB map :
https://lxbra2302.cern.ch/nodestatus/
SL4:
- 1 WMS
- 1 FTS
- 1 Top level BDII (64 bits)
- 1 site BDII (64 bits)
- 1 lcg_CE + Torque server
- 1 Cream CE
- 2 WN (32/64bits)
- 1 LFC (32/64 bits)
- 1 DPM Mysql (32/64 bits)
- 1 DPM Pool
- 1 MON
- 1 WMS
- 1 SE classic
(Crib note for restarting any of the virtual machines if needed: SSH as root onto the dom0 host, run xm list to find the right domain ID, then xm reboot it; failing that, if you have to do an xm destroy, you can do an xm create with the config file under auto/ (see /etc/xen).)
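The same crib note as a command sketch (the domain ID and config file name are placeholders):
xm list                      # find the right domain ID
xm reboot <domid>
xm destroy <domid>           # only if the reboot does not work
xm create auto/<config-file> # then recreate the domain; the config files live under /etc/xen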
CESGA
Batch system : Sun Grid Engine
Resources/Services provided by CESGA:
- 3 machines: 2 quad-core processors per server (Intel Xeon E5310)
- lcg-CE/site-BDII = sa3-ce.egee.cesga.es
- glite-SE_dpm_mysql = sa3-se.egee.cesga.es
- glite-MON = sa3-mon.egee.cesga.es
- glite-WN = sa3-wn001.egee.cesga.es
PIC
Batch system : Condor
Site BDII: ldap://site-bdii-sa3.pic.es:2170/mds-vo-name=PIC-SA3,o=grid
Resources/Services provided by PIC:
- 1 machine: 2x Dual Core Opteron 270 (3 GHz) with 8 GB RAM and a 580 GB hard disk.
- Virtual machines with XEN
- lcg-CE: vce01.pic.es
- CREAM CE: vce02.pic.es
- site-bdii: site-bdii-sa3.pic.es
- WMS: vwms01.pic.es
- condor master: condor.pic.es
- condor gLite Workernodes: vwn01.pic.es, vwn02.pic.es
- SUBNET: 193.146.196.0/22
GRNET
Batch system: Torque; other: modular site configuration
Site Name: EGEE-SEE-CERT
Resources/Services provided by EGEE-SEE-CERT:
| BDII_site | ctb01.gridctb.uoa.gr |
| MON | ctb02.gridctb.uoa.gr |
| cream CE | ctb04.gridctb.uoa.gr |
| DPM head node | ctb06.gridctb.uoa.gr |
| TORQUE_server | ctb07.gridctb.uoa.gr |
| DPM disk node | ctb08.gridcbt.uoa.gr |
| DPM disk node | ctb09.gridctb.uoa.gr |
| 5 worker nodes | ctb{5,10-13}.gridctb.uoa.gr |
For more information on the configuration of these services see CertTestBedAtGRNET.
Desy
Storage system: dCache. For available storage resources, see below.
UCY
Administrator: Asterios Katsifodimos
Site Name: CY-02-CYGRID-CERT
Resources/Services provided by UCY:
- Dual Xeon(32bit) 2.4 GHz HT 40 GB harddisk 3GB RAM using VMWare
- Dual Opteron(64bit) 2.6 GHz 70 GB harddisk 2GB RAM using VMWare
- Dual Quad Core Xeon(64bit) 2.83 GHz 1TB harddisk 24GB RAM using VMWare
| Hostname | Node Type | OS | Architecture | Status | Comments |
| bdii201.grid.ucy.ac.cy | BDII_top | SLC4 | x86 | UP | |
| ce201.grid.ucy.ac.cy | lcg-CE+BDII_site | SLC4 | x86 | UP | *Batch System:* Torque |
| wmslb201.grid.ucy.ac.cy | WMS+LB | SLC4 | x86 | UP | |
| se201.grid.ucy.ac.cy | SE_dpm_mysql | SLC4 | x86 | UP | |
| wn201.grid.ucy.ac.cy | WN | SLC4 | x86 | UP | |
| wn202.grid.ucy.ac.cy | WN | SLC4 | x86_64 | Down. Up on demand | |
| mon201.grid.ucy.ac.cy | MON | SLC4 | x86 | UP | |
| lfc201.grid.ucy.ac.cy | LFC_mysql | SLC4 | x86 | Down. Up on demand | |
| amga201.grid.ucy.ac.cy | AMGA_postgres | SLC4 | x86 | UP | |
| amga202.grid.ucy.ac.cy | AMGA_oracle | SLC4 | x86_64 | Down, up on demand | x86_64 with x86 compatibility libraries |
| ui201.grid.ucy.ac.cy | UI | SLC4 | x86 | Down. Up on demand | |
| thales.grid.ucy.ac.cy | UI | CentOS4 | x86 | UP | |
| andromeda.in.cs.ucy.ac.cy | UI_TAR | Ubuntu 8.10 | x86_64 | UP | x86_64 with x86 compatibility libraries |
INFN
All the INFN machines below have been upgraded to SLC4
Batch system : LSF
- Site BDII URL: ldap://wmstest-ce05.cr.cnaf.infn.it:2170/site=CTB-SA3INFN,o=grid
Resources/Services provided by INFN:
- 1 lsf server: hostname = wmstest-ce02.cr.cnaf.infn.it
- 1 Lcg CE lsf : hostname = wmstest-ce06.cr.cnaf.infn.it
- 2 lsf gLite WN: hostname = wmstest-ce03.cr.cnaf.infn.it, wmstest-ce04.cr.cnaf.infn.it
- 1 gLite site BDII: hostname = wmstest-ce05.cr.cnaf.infn.it
LAL
Storage and catalog : DPM , LFC
The Team
Louis Poncet
Email : Louis.Poncet@cernNOSPAMPLEASE.ch
Gtalk : Cyferdotnet@gmailNOSPAMPLEASE.com
Administration of the certification testbed at CERN. I manage the main BDII and the certification testbed of CERN. I am the coordinator of this activity.
Tomasz Wolak
Email : Tomasz.Wolak@cernNOSPAMPLEASE.ch
Gtalk : tomas.wolak@gmailNOSPAMPLEASE.com
Administration of the certification Testbed at CERN. I manage the main BDII and the certification testbed of CERN.
Asterios Katsifodimos
Email : asteriosk@csNOSPAMPLEASE.ucy.ac.cy
Administrator of the certification testbed and SA3 activity representative at UCY.
Ioannis Liabotis
Email : iliaboti@grnetNOSPAMPLEASE.gr
Coordination of the CertTestBedAtGRNET.
Nikos Voutsinas
Email : nvoutsin@nocNOSPAMPLEASE.edunet.gr
Coordination of the CertTestBedAtGRNET.
Kai Neuffer
Email : neuffer@picNOSPAMPLEASE.es
Coordinator of the certification testbed at PIC.
Marc Rodriguez
Email : marcr@picNOSPAMPLEASE.es
Administrator of the certification testbed at PIC.
Carlos Borrego
Email : cborrego@picNOSPAMPLEASE.es
Administrator of the certification testbed at PIC.
Esteban Freire
Email : esfreire@cesgaNOSPAMPLEASE.es
Administrator of the certification testbed at CESGA.
Alvaro Simon
Email : asimon@cesgaNOSPAMPLEASE.es
Administrator of the certification testbed at CESGA.
Owen Synge
* Site Name: certtestbed.desy.de
Email : owen.synge@desyNOSPAMPLEASE.de
Desy DCache maintainer.
One Xen Host and 6 xen clients provided by DESY:
- SE Test installation service, being automated.
- All running 32 bit SL3, 32 bit SL4 or 64bit SL4 depending on need.
| *Machine* | *Owner* | *Usage* |
| waterford.desy.de | Owen | Cert-TB BDII |
| dublin.desy.de | Owen | Sl5 64bit A |
| cork.desy.de | Owen | Sl5 64bit B |
| swords.desy.de | Owen | SL4 32bit A |
| sligo.desy.de | Owen | SL4 32bit B |
| galway.desy.de | Owen | SL4 64bit A |
| ennis.desy.de | Owen | SL4 64bit B |
| lucan.desy.de | Tigran | Build System, test suite runner |
| limerick.desy.de | Owen | UI test server |
Network
Our hosts are in the Desy DMZ subnet 131.169.5.0/255.255.255.0
Desy hosts for development and certification http://trac.dcache.org/trac.cgi/wiki/XenDomains
Laura Perini
Email : laura.perini@miNOSPAMPLEASE.infn.it Should provide resources to test LSF batch system.
Gilbert Grosdidier
Email : grodid@mailNOSPAMPLEASE.cern.ch Uses a few machines at CERN and at LAL connected to our certification testbed.
-- AsteriosKatsofodimos - 14 Jan 2009