Difference: CMSUniandesGroupGangaCMS (26 vs. 27)

Revision 27 2009-11-03 - AndresOsorio

Line: 1 to 1
 
META TOPICPARENT name="CMSUniandesGroup"

<!-- /ActionTrackerPlugin -->
Line: 145 to 145
 

Job splitting

Changed:
<
<

SplitByFiles

>
>

SplitByFiles

  A very simple job splitter was implemented: SplitByFiles. As its name indicates, a job is split into n subjobs according to a partition of the input dataset. You need to construct the splitter object by passing a Ganga File that contains a plain list of the dataset file names, and state what type of data is going to be used ("local" prepends the prefix "file:", "castor" prepends "rfio:", etc.). Here I show a snippet of the splitter definition:
Line: 155 to 155
 myjob.inputdata = fdata %ENDCODE%
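To make the idea behind SplitByFiles concrete, here is a minimal plain-Python sketch of the partitioning step. It is not the plugin source: the function name `split_by_files`, the `PREFIXES` table and the contiguous-chunk strategy are assumptions made for illustration. Only the two prefixes ("file:" and "rfio:") come from the description above.

```python
# Illustrative sketch only; names and chunking strategy are hypothetical,
# not taken from the SplitByFiles implementation.
PREFIXES = {"local": "file:", "castor": "rfio:"}  # the two prefixes named in the text


def split_by_files(file_names, n_subjobs, data_type):
    """Partition file_names into at most n_subjobs contiguous chunks,
    prepending the prefix that matches data_type to every name."""
    if n_subjobs < 1:
        raise ValueError("n_subjobs must be at least 1")
    prefix = PREFIXES[data_type]  # raises KeyError for types this sketch does not cover
    # Drop blank lines and surrounding whitespace, then add the prefix.
    prefixed = [prefix + name.strip() for name in file_names if name.strip()]
    base, extra = divmod(len(prefixed), n_subjobs)
    chunks, start = [], 0
    for i in range(n_subjobs):
        # The first `extra` chunks receive one additional file.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # fewer files than requested subjobs
        chunks.append(prefixed[start:start + size])
        start += size
    return chunks
```

Each inner list would then correspond to the input files of one subjob. If there are fewer files than requested subjobs, this sketch simply returns fewer chunks.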
Changed:
<
<

ArgSplitter

>
>

ArgSplitter

  Ganga comes with an argument splitter: you provide a list of n argument sets and Ganga builds n subjobs, each with its own set of arguments. We adapted this functionality to modify the configuration file that drives the cmsRun application. You first need to identify the parameter to modify, its type, and its value (the argument):
Line: 208 to 208
  More: ArgSplitter in the Ganga documentation.
Added:
>
>

ArgSplitterX

ArgSplitterX is an extended version of the ArgSplitter described above. It works the same way as the ArgSplitter, but it adds the following feature(s):

  • For MC simulation, each subjob is assigned a unique random seed for the generator. How to use it:

  • Keep the same script structure you used for the ArgSplitter, but replace the splitter with the ArgSplitterX type:

<!-- SyntaxHighlightingPlugin -->
myjob.splitter = ArgSplitterX()
myjob.splitter.args = arguments  # where arguments are the ones you have previously defined
<!-- end SyntaxHighlightingPlugin -->

  • To have the random seed added to each subjob, you just need to add the following line to your script:

<!-- SyntaxHighlightingPlugin -->
myjob.splitter.AddRndSeed()
<!-- end SyntaxHighlightingPlugin -->

  • That's all; in principle this should work fine. How it works: for the moment, the random seed is derived from the current time in milliseconds (plus minutes and seconds). This is not ideal yet, but it is a good first approximation given the limitation I have on the size of the integer. (More work and tests to be done.)
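As a rough illustration of the time-based seeding described above, the following plain-Python sketch builds a bounded integer seed from the minutes, seconds and milliseconds of a timestamp. It is not the code behind `AddRndSeed()`: the function name `time_based_seed`, the cap `MAX_SEED` and the per-subjob offset are all assumptions introduced here to show one way such a scheme could keep seeds distinct and within a fixed integer range.

```python
from datetime import datetime

# Assumed cap standing in for the integer-size limitation mentioned above;
# the real limit used by the plugin is not stated in this page.
MAX_SEED = 900000000


def time_based_seed(now, subjob_index=0, max_seed=MAX_SEED):
    """Return a seed in 1..max_seed derived from minutes, seconds and
    milliseconds of `now`, shifted by the subjob index (hypothetical)."""
    millis = now.microsecond // 1000
    # Milliseconds elapsed within the current hour: always below 3,600,000.
    raw = (now.minute * 60 + now.second) * 1000 + millis
    # The offset keeps subjobs created in the same millisecond apart.
    return (raw + subjob_index) % max_seed + 1


if __name__ == "__main__":
    stamp = datetime.now()
    print([time_based_seed(stamp, i) for i in range(3)])
```

Note that a seed built this way repeats every hour, which is one reason a purely time-based scheme is only a first approximation.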
 

Job merging

Ganga comes with some great merging plugins, among them RootMerger, which collects the ROOT output from all subjobs and merges it (using hadd). The RootMerger has the attributes files, overwrite and ignorefailed, all of them self-explanatory. Here is an example of the RootMerger definition:

 