---+ ROOT and POOL persistency, IO and IOHelper

%TOC%

---++ *Foreword*

Hopefully the switch from *POOL* to *ROOT* will be so transparent that you never notice anything has happened. If that is not the case, or you need to work outside of the box, you should read this TWiki.

---++ *Quick-start guide, POOL->ROOT*

Do you want to test out *ROOT* persistency in an application which doesn't come with it as default? If you're an expert and just want to give this a go, or want a reminder, here's the recipe.

<font color='green'>If you want to switch from *POOL* to *ROOT*, and it *is* supported in your application</font>
   * add =DaVinci().Persistency="ROOT"= to your options (e.g. if DaVinci is your application)
   * in Ganga, add =job.application.args=["--option='from GaudiConf import IOHelper; IOHelper("+'"ROOT", "ROOT"'+ ").postConfigServices()'"]= to your job

<font color='red'>If you want to switch from *POOL* to *ROOT*, but it *is not* supported in your application</font>
   * see [[https://savannah.cern.ch/task/?20701][task #20701]]
   * Due to a bug in Gaudi, and missing services, no stacks older than the DaVinci v29r1 stack support ROOT.
   * You can approximately achieve support by doing the following, but it will only work in some limited cases.
      * =getpack !GaudiConf v15r5=
      * =getpack Online/RootCnv v1r12=
      * =getpack !GaudiSvc v18r16=
      * =make= (obviously)
      * if the build fails, see [[https://twiki.cern.ch/twiki/bin/view/LHCb/PersistencyMigration#What_if_the_build_fails][below]]
      * add =from !GaudiConf import IOHelper; IOHelper("ROOT","ROOT").postConfigServices()= to your options if you are running gaudi directly
      * or in Ganga add =job.application.args=["--option='from !GaudiConf import !IOHelper; !IOHelper("+'"ROOT", "ROOT"'+ ").postConfigServices()'"]= to your job

<font color='blue'>If you want to find out if *ROOT* is supported or not:</font>
   * see [[https://twiki.cern.ch/twiki/bin/view/LHCb/PersistencyMigration#Is_Root_Supported][Is Root Supported? below]]

If you're not an expert, and/or want to do something strange, or the above recipe does not work in your tiny corner of phase space, see below.

----------

---++ *Introduction*

---+++ - POOL

*POOL* is the Gaudi-developed service for storing objects to files.
   * It was in use as default by LHCb up until the middle of 2011, Gaudi v22r5.
   * It is a Root-based file structure, where every event data object is stored inside its own tree.
   * Initially this was thought to be an efficient storage system, since knowing the size of one object enables you to quickly navigate to any of the objects in the tree.
   * Unfortunately Root's underlying IO was then re-optimised to be faster on other types of data structures, namely those with one tree or a small number of trees per file.
   * This makes the *POOL* implementation incredibly inefficient in terms of IO and memory usage.
   * *POOL* will be deprecated from Gaudi v23r0.

The *POOL* format is very inefficient for *reading*, especially from remote storage, e.g. over Castor. As long as your input file is in *POOL* format, you will have memory usage and IO problems.

---+++ - ROOT

*ROOT* is the name we give to a simpler service more in keeping with the data format for which Root is optimized.
   * It is the new default file format, as of Stripping17, i.e. from the end of 2011, Gaudi v22r5, !DaVinci v29r1.
   * It is a Root-based file structure where there is only one tree containing the entire event information, then another tree containing FSRs etc.
   * *ROOT* was trialed with LHCb versions LHCb v32r4 onwards, but can only be considered usable from the v33r0 stack with Gaudi v22r5.
   * *ROOT* is the only available persistency service from Gaudi v23r0.

When a file containing the same data has been stored in *ROOT* format, reading with the *ROOT* services will be much more efficient. If the file has been stored in *POOL* format, it can also be read in with *ROOT*, but with no gain in performance.
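The version thresholds quoted in this section can be checked mechanically. The following is only a minimal sketch, assuming release tags follow the usual =vXrY= or =vXrYpZ= pattern; the function names are hypothetical and are not part of !GaudiConf. Tags such as =v18r8-pre= are deliberately rejected rather than guessed at.

```python
import re

# Hypothetical helper names; nothing here is part of GaudiConf.
_TAG = re.compile(r"^v(\d+)r(\d+)(?:p(\d+))?$")


def parse_tag(tag):
    """Turn a release tag such as 'v22r5' or 'v28r4p2' into a sortable tuple."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError("unrecognised release tag: %r" % (tag,))
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def at_least(tag, threshold):
    """Numeric comparison, so that v10r0 correctly sorts after v9r9."""
    return parse_tag(tag) >= parse_tag(threshold)


def gaudi_is_root_only(gaudi_tag):
    # From this page: ROOT is the only persistency service from Gaudi v23r0.
    return at_least(gaudi_tag, "v23r0")


def davinci_stack_may_have_root(davinci_tag):
    # From this page: no stack older than the DaVinci v29r1 stack supports ROOT.
    # This is a necessary condition only, not a guarantee.
    return at_least(davinci_tag, "v29r1")
```

Comparing parsed tuples instead of raw strings avoids the trap where ="v9r0" > "v10r0"= as text.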
---+++ - Backwards/Forwards/Sideways Compatibility

   * *POOL* files can be read with *ROOT* services.
   * *ROOT* files cannot be read with *POOL* services.
   * *POOL* files can only be written by *POOL* services.
   * *ROOT* files can only be written by *ROOT* services.
   * *ROOT* and *POOL* services <font color='red'>cannot co-exist within the same Gaudi job</font>.

----------

---++ *IOHelper and IOExtension*

We have written two helper classes, which are not just for the experts, to help you configure your jobs regardless of persistency.

*IOHelper* [[http://lhcb-release-area.web.cern.ch/LHCb-release-area/DOC/davinci/releases/v28r4p2/doxygen/py/db/d82/class_gaudi_conf_1_1_i_o_helper_1_1_i_o_helper.html][doxygen]] [[https://svnweb.cern.ch/trac/lhcb/browser/LHCb/trunk/GaudiConf/python/GaudiConf/IOHelper.py#L12][svn]] is the main way applications, and you, will interact with input/output. It can handle:
   * "dressing" filenames so that they are understood by Gaudi
   * Setting up persistency services in almost any situation
   * Writing output files and FSRs correctly
   * Appending input files

*IOExtension* [[http://lhcb-release-area.web.cern.ch/LHCb-release-area/DOC/davinci/releases/v28r4p2/doxygen/py/db/d15/class_gaudi_conf_1_1_i_o_helper_1_1_i_o_extension.html][doxygen]] [[https://svnweb.cern.ch/trac/lhcb/browser/LHCb/trunk/GaudiConf/python/GaudiConf/IOHelper.py#L766][svn]] is an alternative class which uses the three-letter extension at the end of the file name to decide which services to use. It can also handle:
   * "dressing" filenames so that they are understood by Gaudi
   * Writing output files and FSRs correctly
   * Appending input files

These classes live in the same module in the !GaudiConf package.
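To make "dressing" a little more concrete, here is a rough sketch of the idea. It is not IOHelper's actual implementation: the function names are hypothetical, and the =TYP= and =SVC= values are simply copied from the old-format Gaudi card examples in the datacards section of this page.

```python
# Hypothetical sketch of filename "dressing"; NOT the real IOHelper code.
# The TYP/SVC values are taken from the old-format card examples on this page.
_DRESSING = {
    "POOL": "TYP='POOL_ROOTTREE'",
    "ROOT": "SVC='Gaudi::RootEvtSelector'",
}


def dress(filename, persistency="ROOT"):
    """Wrap a bare file name in an EventSelector-style input string."""
    try:
        selector = _DRESSING[persistency]
    except KeyError:
        raise ValueError("unknown persistency: %r" % (persistency,))
    return "DATAFILE='%s' %s OPT='READ'" % (filename, selector)


def dress_all(filenames, persistency="ROOT"):
    """Dress a list of file names with the same persistency."""
    return [dress(name, persistency) for name in filenames]
```

In a real job you would let IOHelper do this for you; the point of the sketch is only that the same bare file name produces a different input string depending on the persistency.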
The inbuilt python help and documentation is the best place to start with these classes.
%SYNTAX{ syntax="python"}%
from GaudiConf import IOHelper
help(IOHelper)
from GaudiConf import IOExtension
help(IOExtension)
%ENDSYNTAX%

----------

---++ *Migration*

Many of our applications come with *ROOT* as an option. [[https://lhcb-tag-collector.web.cern.ch/lhcb-tag-collector/display.html?project=LHCb&version=v32r4][LHCb v32r4]] and higher come with the possibility to switch to *ROOT* already embedded. Older applications will gradually be instrumented with *ROOT* to make sure we can still, for example, run the trigger on older data. From Gaudi v23r0, [[https://lhcb-tag-collector.web.cern.ch/lhcb-tag-collector/display.html?project=LHCb&version=v32r4][LHCb v34r0]], *ROOT* is the only option.

Using *ROOT* when it is not the default persistency requires the correct combination of three things:
   1 packages,
   1 options,
   1 datacards.

---+++ - What type is a certain file?

A little script in !AppConfig will tell you, given a file PFN, whether it is in *POOL* or *ROOT* format. If it is a remote file, you may wish to copy an example locally first to check.
%SYNTAX{ syntax="sh"}%
$ SetupProject SomeProject SomeVersion
$ $APPCONFIGROOT/scripts/DetectFileType.py <PFN>
%ENDSYNTAX%

---+++ - Is Root Supported?

For most purposes, the answer to this question is the same as "is my version part of the same stack as DaVinci v29r1, or later?"

To tell whether you have all the correct items to use *ROOT* persistency:
%SYNTAX{ syntax="python"}%
$ SetupProject SomeProject SomeVersion
$ python
from GaudiConf import IOHelper
print IOHelper().isRootSupported()
%ENDSYNTAX%

If this script succeeds without raising an exception and prints "True", then Root is supported in this application. Unfortunately, for some versions this may return True when Root is actually not supported, due to bugs in the underlying Gaudi.
It will give you some idea, and we are working on a better solution, but ultimately it comes down to "is my version part of the same stack as DaVinci v29r1, or later?" Updating your local version of IOHelper will pick up the check on the underlying version of !GaudiSvc. Only !GaudiSvc >= v18r3 supports Root properly.

---+++ - What is the default?

The default persistency is controlled by IOHelper. If IOHelper does not exist, then only *POOL* is supported.
%SYNTAX{ syntax="python"}%
$ SetupProject SomeProject SomeVersion
$ python
from GaudiConf import IOHelper
print IOHelper().defaultPersistency()
%ENDSYNTAX%

If this script throws an exception, then assume *POOL*; otherwise the default is the persistency of the default-constructed IOHelper.

---+++ - What packages do I need?

Assuming *ROOT* isn't already supported, you need to get:
%SYNTAX{ syntax="sh"}%
getpack GaudiConf v15r5
getpack Online/RootCnv v1r12
getpack GaudiSvc v18r16
%ENDSYNTAX%

Once these are compiled and correctly on your path, checking if Root is supported will return True.

---++++ -> What if the build fails

   * *The head of !GaudiSvc and also !RootCnv cannot be combined for certain Gaudi versions which are sufficiently old...*
   * <font color='green'>For *Gaudi >= v22r4*, the build should work as given.</font>
      * see the method above
   * <font color='blue'>For *Gaudi >= v22r0* (but older than v22r4), !GaudiSvc will not compile, because of the missing !msgLevel function, so you'll need to getpack the version of !GaudiSvc which was actually released with that Gaudi version, and patch it yourself:</font>
      * e.g.
%SYNTAX{ syntax="sh"}%
getpack GaudiSvc
WARNING : Version not specified for package 'GaudiSvc'
Select a version (v19r9_v16r6, v18r17, v18r16, v18r15, v18r14, v18r13, v18r12, v18r11, v18r10, v18r9, v18r8, v18r8-pre, v18r7, v18r7-pre, v18r6, v18r5, v18r5-pre, v18r4, v18r4-pre, v18r3, v18r3-pre, v18r2, v18r2-pre, v18r1, v18r0, v17r3, v17r2, v17r1, v17r0, v16r5, v16r4, v16r3, v16r2, v16r1, v16r0, v15r3, v15r2, v15r1, v15r0, v14r13, v14r12, v14r11, v14r10, v14r9, v14r8, v14r7, v14r6p1, v14r6, v14r5, v14r4, v14r3, v14r2, v14r1, v14r0, v13r4, v13r3, v13r2, v13r1, v13r0, v12r7, v12r6, v12r5, v12r4, v12r3, v12r2p1, v12r2, v12r1, v12r0, v11r7, v11r6p1, v11r6, v11r5, v11r4, v11r3, v11r2, v11r1, v11r0, v10r3p1, v10r3, v10r2, v10r1p3, v10r1p2, v10r1p1, v10r1, v10r0p1, v10r0, v9r0p1, v9r0, v8r4, v8r3, v8r2, v8r1, v8r0, v7r3, v7r2, v7r1, v7r0, v5r1, v4r1, (h)ead, (q)uit [v18r14]):
...
cd GaudiSvc
svn merge svn+ssh://svn.cern.ch/reps/gaudi/Gaudi/trunk/GaudiSvc -c6649 .
#this will probably give a conflict warning; if it's about release.notes, it can be safely ignored
Conflict discovered in 'doc/release.notes'.
Select: (p) postpone, (df) diff-full, (e) edit,
        (mc) mine-conflict, (tc) theirs-conflict,
        (s) show all options: p
--- Merging r6649 into '.':
C    doc/release.notes
U    src/PersistencySvc/OutputStream.cpp
Summary of conflicts:
  Text conflicts: 1
%ENDSYNTAX%
   * <font color='red'>For Gaudi < v22r0, !RootCnv will not compile, due to the missing =DataObject.update()= method. There is no way to get around that right now; the entire stack would need to be rebuilt up to your version with a later/better Gaudi version.</font>
      * contact your release manager

---+++ - What about the options?

   1 If *ROOT* is supported, is the default, and the application is a recent one, you do not need to change anything in order to pick up the *ROOT* persistency.
   1 If *ROOT* is supported without you needing to getpack anything, but *POOL* is the default, you can probably set the application to pick up *ROOT* using the configurable for the application, such as:%SYNTAX{ syntax="python"}%
DaVinci().Persistency="ROOT"
#or
Brunel().Persistency="ROOT"
%ENDSYNTAX%
   1 If *ROOT* is supported, but only because you did the getpacking of the correct packages, you will instead need to force the switch of the services by appending a !PostConfigAction:%SYNTAX{ syntax="python"}%
from GaudiConf import IOHelper
IOHelper("ROOT","ROOT").postConfigServices()
%ENDSYNTAX%

Using =postConfigServices= is a catch-all way of always ensuring you switch to *ROOT*, or back to *POOL*, no matter what the other options in your file are or were, and no matter what your datacard was.

---+++ - What about the DataCards?

Datacards have hard-coded within them the type of services they should be read with. This is not usually a problem, since the python configuration can and does take care of swapping backwards and forwards between *POOL* and *ROOT* services. However, if you have a use case where IOHelper is not involved, such as:
   * some obscure application
   * some !GaudiPython script
   * Data added with its own postConfigAction
   * Data added or edited during run time

it is advisable to first convert your datacard into a better format, with the correct persistency.

---++++ - Old format Gaudi Cards

The old format is to set the property of the !EventSelector directly:
%SYNTAX{ syntax="python"}%
from Gaudi.Configuration import *
EventSelector().Input = [
    " DATAFILE='LFN:/lhcb/MC/MC09/DST/00004871/0000/00004871_00000001_1.dst' TYP='POOL_ROOTTREE' OPT='READ'",
    ...
]
%ENDSYNTAX%

In *ROOT* this becomes:
%SYNTAX{ syntax="python"}%
from Gaudi.Configuration import *
EventSelector().Input = [
    " DATAFILE='LFN:/lhcb/MC/MC09/DST/00004871/0000/00004871_00000001_1.dst' SVC='Gaudi::RootEvtSelector' OPT='READ'",
    ...
]
%ENDSYNTAX%

Notice how this overwrites any previous content of the !EventSelector, and remember that for the next step.

---++++ - New format Gaudi Cards

With IOHelper the setting of input files is much easier:
%SYNTAX{ syntax="python"}%
from GaudiConf import IOHelper
IOHelper().inputFiles([
    "LFN:/lhcb/MC/MC09/DST/00004871/0000/00004871_00000001_1.dst",
    ...
])
%ENDSYNTAX%

You only set the file name, and do not worry about the persistency itself. IOHelper handles that for you.

*NB: The files in inputFiles are ADDED to the !EventSelector, no previous content is removed*

If you want to remove the previous content, set the keyword =clear=True=
%SYNTAX{ syntax="python"}%
from GaudiConf import IOHelper
IOHelper().inputFiles([
    "LFN:/lhcb/MC/MC09/DST/00004871/0000/00004871_00000001_1.dst",
    ...
], clear=True)
%ENDSYNTAX%

---++++ - Modifying the format or persistency of DataCards

There is another simple script in !AppConfig which will parse any number of options files and translate the data to IOHelper format. It will only fail if you have data set inside a different configurable or inside a postConfigAction.

<verbatim>
$ SetupProject SomeProject SomeVersion
$ $APPCONFIGROOT/scripts/ConvertGaudiCard.py --help
usage: ConvertGaudiCard.py <inputfile> [<inputfile>...] [--new/--ext/--old] [--persistency=<POOL,ROOT,MDF>] [--setpersistency]
inputfile: gaudi card or options file to parse
--new: (default) output in new IOHelper style
--ext: output in new IOExtension style
--old: output in old GaudiCard style
--persistency: Convert persistency to given format, otherwise use default
--setpersistency: explicitly give the persistency in the card, even if it is the default
</verbatim>

---+++ - Inside Ganga?

<font color='green'> *You do not need to do anything if you are using the default persistency, with the most recent Ganga/DaVinci* </font>
   * <font color='red'>Ganga v506r8 is the minimum version required to support ROOT persistency</font>.
      * https://savannah.cern.ch/bugs/?82532

*If* you need to use a non-default persistency:
   * Ganga flattens options files, which can mean IOHelper gets in a spot of bother when trying to interpret your input.
   * There are two potential ways to circumvent this within Ganga:
      1 *Command line arguments to =gaudirun.py=, which can be used to circumvent the flattening and set the correct options.*
         * This method is a catch-all which should always work, for both ROOT and POOL
         * If you are migrating to a non-default persistency inside of Ganga, you should set the command line argument to do the migration:%SYNTAX{ syntax="python"}%
job.application.args=["--option='from GaudiConf import IOHelper; IOHelper("+'"ROOT", "ROOT"'+ ").postConfigServices()'"]
%ENDSYNTAX%
      1 *Directly in the !LHCbDataset*
         * This method may not work in all cases; it relies on you having already set up the correct services, so that only the datacard needs to change
         * In this case Ganga allows the dataset itself to know about its persistency, so you can specify:%SYNTAX{ syntax="python"}%
job.inputdata.Persistency="ROOT"
%ENDSYNTAX%

----------

---++ *What can possibly go wrong?*

Because you need the combination of !DataCard, Packages and Options all to be correct, there are many permutations of many different scenarios, most of which lead to failure of one sort or another. Attached is a flow chart of everything which may go wrong.

[[%ATTACHURL%/InputData.pdf]]

----------

---++ *FAQ*

---+++ - !TypeErrors in Ganga when parsing dataset.

*Errors of the type:* =!TypeError: Type of file xxxx could not be determined use !IOHelper with specified persistency instead=
   * *If the file name has a "?" somewhere* in it to designate the svcClass
      * this is a known bug.
        Getpack a version of !GaudiConf >= v15r2
   * *If the file extension is weird*, such as .fishpastesandwiches, .spamandeggs or .xmsdst
      * !IOExtension cannot determine the file type from weird extensions
      * Extensions are gradually being added to !IOExtension. Contact your release manager if you find one is missing, but first try the latest !GaudiConf

------

-- Main.RobLambert - 05-Aug-2011

*Topic attachments*
| *Attachment* | *History* | *Size* | *Date* | *Who* | *Comment* |
| InputData.pdf | r1 | 182.9 K | 2011-08-05 - 17:12 | RobLambert | Flow chart of all possible failure modes |

Topic revision: r13 - 2012-02-06 - RobLambert
Copyright © 2008-2023 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.