GLite301 (2006-11-28, LaurenceField)
---+Notes on the gLite 3.0 RC2 release to pre-production

The gLite 3.0 PPS release is now available. It is based on LCG-2_7_0 with the addition of
<verbatim>
gLite WMS/LB
gLite CE
Combined gLite/LCG WN
Combined gLite/LCG UI
FTS server
FTA
</verbatim>

There is an apt-get repository for PPS:
<verbatim>
rpm http://lxb2042.cern.ch/gLite/APT/R3.0-pps rhel30 externals Release3.0 updates
</verbatim>

The CAs have been decoupled from the release. Further information on how to install them can be found here: http://grid-deployment.web.cern.ch/grid-deployment/lcg2CAlist.html

---++ Result of Certification

gLite 3.0 RC2 has been evaluated on the Certification Testbed.

The gLite WMS failed stress testing because the network server failed due to bug #15761. The cron job that restarts the network server also failed (see note). The gLite bulk submission was not tested because of this failure.

The FTS also failed as the configuration is still incomplete.

Note: The new cron does not run cron jobs in cron.d if the file has executable permission. The following cron jobs will fail if this is set. After an install, please make sure they will run by executing "chmod a-x /etc/cron.d/*"
<verbatim>
UI
-rwxr-xr-x 1 root root 267 Mar 24 15:09 glite-fetch-crl.cron

WMS
-rwxr-xr-x 1 root root 268 Apr 6 15:03 glite-fetch-crl.cron
-rwxr-xr-x 1 root root 160 Apr 6 15:04 glite-wms-check-daemons.cron
-rwxr-xr-x 1 root root 158 Apr 6 15:02 glite-wms-ns-proxy.cron
-rwxr-xr-x 1 root root 680 Apr 6 15:02 glite-wms-purger.cron
-rwxr-xr-x 1 root root 241 Apr 6 15:02 glite-wms-wmproxy-purge-proxycache.cron

MON
-rwxr-xr-x 1 root root 211 Mar 27 19:45 glite-iperf-check
-rwxr-xr-x 1 root root 207 Mar 27 19:21 glite-udpmon-check

CE glite
-rwxr-xr-x 1 root root 267 Apr 6 12:26 glite-fetch-crl.cron
</verbatim>

---++List of targets

Please use yaim's install_node script for fresh installs. For upgrades from RC1, use apt-get dist-upgrade. The repository and yaim now support yum.
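
For the cron.d permission note in the certification results, the following is a minimal sketch of a check that can be run after an install. The function name =fix_cron_perms= is hypothetical and not part of the release; it only applies the "chmod a-x" advice file by file and reports what it changed. It takes the directory as an optional argument so it can be tried on a copy first.

```shell
#!/bin/sh
# Sketch only: fix_cron_perms is a hypothetical helper, not shipped with gLite.
# Clears the execute bits on regular files in a cron.d style directory,
# printing one line per file that was changed.
fix_cron_perms() {
    dir="${1:-/etc/cron.d}"
    for f in "$dir"/*; do
        # Skip directories and the unexpanded pattern of an empty directory.
        [ -f "$f" ] || continue
        if [ -x "$f" ]; then
            echo "clearing execute bit: $f"
            chmod a-x "$f"
        fi
    done
}
```

Running =fix_cron_perms= with no argument acts on /etc/cron.d and needs root; files that are already non-executable are left untouched.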
If you use yaim for installation, set REPOSITORY_TYPE="yum" in site-info.def before running install_node. This will configure yum for you.

Many meta-rpm names have been changed to rationalise the naming (lcg-* -> glite-*). To upgrade a node whose name has changed, do the following (for example):
<verbatim>
rpm -e lcg-WN
apt-get install glite-WN
apt-get dist-upgrade
</verbatim>

The metapackages available are:
<verbatim>
glite-UI (a combined LCG/gLite UI)
glite-WN (a combined LCG/gLite WN)
glite-FTS (FTS server plus related services)
glite-CE (the gLite CE)
glite-WMSLB (WMS and LB, recommended deployment of the WMS)
glite-BDII
glite-LFC_mysql
glite-LFC_oracle
glite-MON
glite-PX
glite-SE_classic
glite-SE_dpm_mysql
glite-SE_dpm_oracle
glite-SE_dpm_disk
glite-SE_dcache
glite-SE_dcache_gdbm
glite-VOBOX
glite-VOMS_mysql
glite-VOMS_oracle
lcg-RB
lcg-CE
lcg-CE_torque
glite-FTA
</verbatim>

Many of these node types are described in the LCG Manual Install Guide: http://grid-deployment.web.cern.ch/grid-deployment/documentation/LCG2-Manual-Install/

---++Configuration

Configuration for all above components is now supported via yaim (FTS still requires a manual step). Note that the configuration targets have not yet been fully synchronised with the installation targets and some names are different.

Yaim has been renamed glite-yaim and has been relocated to /opt/glite/yaim. Please
   * Ensure any customised files are moved from /opt/lcg/yaim
   * Ensure your site-info.def references the new location for FUNCTIONS_DIR and perhaps others (eg USERS_CONF)
   * Put configuration files in /opt/glite/yaim/etc

Configuration for all 'gLite' components is also supported via the native (XML) system. Where yaim is configuring a gLite node type, it populates the XML files and runs the gLite config scripts. Please note that any modifications you make to the XML files, to parameters not managed by yaim, should be *preserved*.
Parameters managed by yaim will be clearly marked in the XML after it has been run. The intention is that yaim offers a simple interface if preferred, but the ability to use the more powerful native mechanism is retained.

Please use yaim to configure pool accounts. Yaim allows non-contiguous ranges of uids, which some sites require, and is therefore the default user configuration mechanism.

Yaim is in the apt-get repository.

---+++New Yaim parameters
<verbatim>
WMS_HOST - gLite WMS + LB
FTS_HOST - for building an FTS server
REPOSITORY_TYPE - defaults to apt, but yum can be used.
BATCH_BIN_DIR - The path of the lrms commands, eg /usr/pbs/bin
BATCH_VERSION - The version of the Local Resource Management System, eg OpenPBS_2.3
LFC_DB_HOST - Set this to use a separate db server for LFC
LFC_DB - Set this to define the name of LFC's db
</verbatim>

Some parameters have changed for the DPM
<verbatim>
DPM_FILESYSTEMS - The filesystems/partitions that are part of the pool
DPM_DB_USER - The database user (was DPMMGR)
DPM_DB_PASSWORD - The database user password (was DPMUSER_PWD)
</verbatim>
so the following are no longer used
<verbatim>
DPMMGR
DPMUSER_PWD
DPMPOOL_NODES
</verbatim>

There is more information in the example site-info.def file.

---++Notes on particular node types

---+++lcg-RB

Condor is upgraded to 6.7.10. There is a new condor-lcg package which provides LCG modifications to the gahp_server and grid_monitor. Configuration of these is handled by yaim.

---+++glite WMS + LB

To install the gLite WMS + gLite LB (recommended deployment scenario)
<verbatim>
install_node site-info.def glite-WMSLB
configure_node site-info.def WMSLB
</verbatim>

---+++Combined UI

The gLite 3.0 UI is a 'combined' UI, incorporating LCG and gLite components.

On the combined node, please watch out for glite commands which are symlinked to edg commands and may appear earlier in the PATH than their edg counterparts.
The extent to which the glite symlinks can provide the functionality of the edg commands they replace is untested. These symlinks will be removed in future releases.

The RPM-based userland installation finished without conflicts, but there are many warnings and errors from install scripts which require root privilege.
<verbatim>
install_node site-info.def glite-UI
configure_node site-info.def UI_combined
</verbatim>

---+++WN

The gLite WN has combined gLite and LCG components
<verbatim>
install_node site-info.def glite-WN
configure_node site-info.def WN_combined
</verbatim>

glite-WN + Torque client
<verbatim>
install_node site-info.def glite-WN glite-torque-client-config
configure_node site-info.def WN_combined_torque
</verbatim>

---+++FTS

In the case of the FTS, yaim will configure all related services such as crl downloads, info provider etc, but the FTS server itself must be configured using the usual gLite system. A yaim component will follow.
<verbatim>
install_node site-info.def glite-FTS
configure_node site-info.def FTS
</verbatim>

---+++gLite CE

The gLite CE is configured to support only VOMS proxies.
<verbatim>
install_node site-info.def glite-CE
configure_node site-info.def gliteCE
</verbatim>

If you want your gliteCE to run the site EGEE.BDII:
<verbatim>
configure_node site-info.def gliteCE BDII_site
</verbatim>

The glite-CE configuration also configures the software and scheduler GIP plugins. Due to a bug in the /opt/lcg/libexec/lcg-info-dynamic-scheduler file, the following command must be run to get correct functionality:
<verbatim>
# sed -i '{s/jobmanager/blah/}' /opt/lcg/libexec/lcg-info-dynamic-scheduler
</verbatim>

*Batch systems and the gLite CE*

If you are installing your batch system server on the same node as the CE, and you want to use yaim or gLite to configure it, please choose one or the other and stick to it. If you use yaim and then make modifications via the gLite system, any rerun of yaim will reset the configuration.
The same advice applies to management of WNs. If yaim fulfils your needs, this is the recommended route.

glite-CE + Torque server
<verbatim>
install_node site-info.def glite-CE glite-torque-server-config
configure_node site-info.def gliteCE TORQUE_server
</verbatim>

Note that the log-parser daemon must be started on whichever node is running the batch system. If your CE node is also the batch system head node, you have to run the log-parser on that node.

If you are running two CEs (typically LCG and gLite versions), please take care to ensure no collisions of pool account mapping. This is typically achieved either by allocating separate pool account ranges to each CE or by allowing them to share a gridmapdir.

---+++DPM

A VOMS-enabled DPM (1.5.5) is now available. Upgrade from LCG-2_7_0 is supported.
<verbatim>
install_node site-info.def glite-SE_dpm_mysql
configure_node site-info.def [SE_dpm_mysql|SE_dpm_disk]
</verbatim>

---+++dCache

The yaim script for configuring dCache has received many updates from !GridPP. It offers extended functionality but is backward compatible.

Note that dCache may show errors if you have more than around 56 CAs. If this is the case, currently the only fix is to identify CAs you do not need to support and remove them.

Yaim does not yet support dCache with a postgresql-based pnfs. To accommodate sites who have already upgraded to this version of pnfs, we now have two types of dCache SE.
<verbatim>
glite-SE_dcache
</verbatim>
This has no dependency on pnfs at all, so upgrades of either type (postgresql or gdbm) should work at the rpm level.
<verbatim>
glite-SE_dcache_gdbm
</verbatim>
This has a dependency on pnfs (ie the gdbm version) and is necessary for a new install. Please note however that pnfs_postgresql is the preferred implementation and migration is non-trivial.

---+++FTA

New yaim configuration for FTA. Please take the fta-info.def file from yaim's examples directory and append it to your site-info file before configuring.
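
A minimal sketch of the append step, written so that it can be rerun without adding the FTA settings twice. The function name =append_fta_info=, the marker comment and the =.bak= backup are assumptions made for illustration; the exact location of yaim's examples directory is not given here, so both paths are passed in as arguments.

```shell
#!/bin/sh
# Sketch only: append_fta_info is a hypothetical helper, not part of yaim.
# Appends an FTA definitions file to a site-info file, at most once.
append_fta_info() {
    fta_file="$1"    # fta-info.def taken from yaim's examples directory
    site_file="$2"   # your site-info.def
    marker="# --- appended from fta-info.def ---"   # hypothetical marker line

    [ -r "$fta_file" ] || { echo "cannot read $fta_file" >&2; return 1; }
    [ -w "$site_file" ] || { echo "cannot write $site_file" >&2; return 1; }

    # Skip if an earlier run already appended the block.
    if grep -qF -- "$marker" "$site_file"; then
        echo "already appended"
        return 0
    fi

    # Keep a copy of the original, then append the marker and the FTA settings.
    cp -p "$site_file" "$site_file.bak"
    { echo "$marker"; cat "$fta_file"; } >> "$site_file"
    echo "appended"
}
```

Call it with the two paths for your site, then run the install and configure commands for the FTA target.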
<verbatim>
install_node site-info.def glite-file-transfer-agents-config
configure_node site-info.def FTA
</verbatim>

---++Fixes with respect to RC1

The following most recent critical bug fixes are contained in the new release candidate 2:

Bug 15330: glite-wms-ui-cli-python masks commands from LCG UI
https://savannah.cern.ch/bugs/?func=detailitem&item_id=15330

Bug 15642: When mapping all the VOs to one queue on a glite CE with LSF the ...
https://savannah.cern.ch/bugs/?func=detailitem&item_id=15642
TO BE CONFIRMED BY DEVELOPER - INCONSISTENT STATE IN SAVANNAH

Bug 15674: Blah submission from a glite 3.0 CE (glite flavour) to an LSF queue does not work
https://savannah.cern.ch/bugs/?func=detailitem&item_id=15674

Bug 15710: gLite 3.0 job wrapper has bad kill usage
https://savannah.cern.ch/bugs/?func=detailitem&item_id=15710

Bug 15769: large job collection submission and cancel through WMproxy didn't work
https://savannah.cern.ch/bugs/?func=detailitem&item_id=15769

Bug 15806: matchmaking slow for bulk submission
https://savannah.cern.ch/bugs/?func=detailitem&item_id=15806

Bug 15874: FTS - Can't configure the http timeout in the ChannelAgent
https://savannah.cern.ch/bugs/?func=detailitem&item_id=15874

Bug 15934: Blah submission from a glite 3.0...
https://savannah.cern.ch/bugs/?func=detailitem&item_id=15934

In addition, the following bug fixes in yaim have been included:

Bug 15101: LFC : central LFC configured for all the VOs supported by a site
https://savannah.cern.ch/bugs/?func=detailitem&item_id=15101

Bug 15131: Wrong permissions in LFC catalog when VO name != local group name
https://savannah.cern.ch/bugs/?func=detailitem&item_id=15131

Bug 15484: DPM and LFC config does not allow for alternative database name and server
https://savannah.cern.ch/bugs/?func=detailitem&item_id=15484

Bug 15622: Request for optional LFC_DB_HOST variable in yaim.
https://savannah.cern.ch/bugs/?func=detailitem&item_id=15622

Bug 15764: GLITE_TMP is set but directory is not created
https://savannah.cern.ch/bugs/?func=detailitem&item_id=15764

---+++Middleware components

The gLite 3.0 issue tracking page has information on what has been fixed in RC2: https://uimon.cern.ch/twiki/bin/view/LCG/Glite30IssueTracking

---+++Yaim and configuration

   * Yaim support for new gLite services (combined UI, combined WN, TORQUE_server)
   * Support for VOs without VOMS (for gLite services)
   * A missing WMS_HOST switches off the configuration of the gLite UI part of the combined UI
   * Return value of gLite configuration scripts is checked (bug #15543)
   * GIP configuration fixed on gLite CE (bug #15434)
   * ACL publication fixed on gLite CE (bug #15424)
   * Rationalisation of DPM configuration
   * LFC now supports a remote DB
   * FTA now yaim configurable
   * EGEE.BDII - allow site EGEE.BDII on gliteCE
   * Condor config for lcg-RB
   * No longer mandate home dir under /home for edginfo and edguser
   * Bogus 'requires' removed from config_gip
   * ERT plugin and software plugin for gliteCE (still requires manual step as plugin expects 'jobmanager')
   * config_mkgridmap - support new VOMS capability syntax
   * RGMA - set dir perms on /etc/tomcat5 and new CATALINA_OPTS
   * dcache - new native info provider

---++Outstanding bugs

During the integration and testing process a list of outstanding issues was maintained. Here is a summary of the issues which have not yet been addressed and were considered important:

savannah issue 15050 - this has NOT been fixed. The impact is of the order of a few jobs (<5) per thousand.

savannah issue 15189 - status not updated for nodes of a large collection. Now fixed, but the fix missed the cut for RC2. Now fine for 400 jobs, but doesn't work for 1000 jobs in a collection.

savannah issue 15894 - dynamic scheduler plugin on glite-CE doesn't provide correct information.
Temporary fix:
<verbatim>
# sed -i '{s/jobmanager/blah/}' /opt/lcg/libexec/lcg-info-dynamic-scheduler
</verbatim>

savannah issue 15643 - proxy renewal works, job aborts after renewal. VOMS credentials are dropped.

savannah issue 15688 - Jobs stay in ready state. The situation is still not entirely clear; it may just be a configuration problem.

Publishing software tags by user. Not solved yet; we will add a gridFTP server later.

In configuring a UI you may see complaints about the absence of files in vomsdir. Please ignore this, as the script is making an invalid assumption about the naming convention of files in there.

---++Notes

Other issues to remain aware of:

Between LCG-2_7_0 and gLite 3.0, !MySQL has been upgraded from 4.0 to 4.1. There has been a change in the password encryption; please keep this in mind.

Pointers to documentation on the components of this release are being compiled here: http://www.grid.kfki.hu/afs/gdebrecz/web/LCG/the-LCG-directory.html
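
The temporary fix for savannah issue 15894 listed under Outstanding bugs can be wrapped so that a copy of the original file is kept and a second run does nothing. This is a sketch under stated assumptions: the function name =patch_dynamic_scheduler= and the =.orig= backup name are hypothetical, and only the sed substitution itself comes from these notes.

```shell
#!/bin/sh
# Sketch only: patch_dynamic_scheduler is a hypothetical wrapper around the
# documented sed workaround; it is not part of the release.
patch_dynamic_scheduler() {
    file="${1:-/opt/lcg/libexec/lcg-info-dynamic-scheduler}"

    [ -f "$file" ] || { echo "not found: $file" >&2; return 1; }

    # Nothing to do if the string is absent, for example after an earlier run.
    if ! grep -q 'jobmanager' "$file"; then
        echo "no change needed"
        return 0
    fi

    # Keep the first backup only, so the unpatched file can be restored later.
    [ -e "$file.orig" ] || cp -p "$file" "$file.orig"

    # Same substitution as the documented command: first match on each line.
    sed -i 's/jobmanager/blah/' "$file"
    echo "patched"
}
```

Called with no argument it acts on the path given in the fix above; passing a path to a copy of the file lets you inspect the result first.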
Topic revision: r17 - 2006-11-28 - LaurenceField
Copyright © 2008-2021 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.