The Configuration File (valSetup.config)
We can now look at how to configure the valSetup.config file for the tasks. The file is divided into multiple sections that are delimited using #’s. The following table describes each delimiter.
| Descriptor | Definition |
| #### | Defines the start and end of the document. This is possibly no longer necessary. |
| ### | Defines a new major section. |
| ## | Defines the task name to use when under the TAGS section. |
| # | A comment. |
| ###VERSION### | The section used to define the version of the submission. This is a tag for the output dataset. |
| ###RESUBMISSION### | This section is used when we want to resubmit specific tasks. |
| ###EXCLUDESITES### | A comma-separated list (no spaces allowed) of sites you wish to exclude. Use this if particular sites are giving problems. |
| ###EXTRACODE-0### | Section used to describe what extra code can be added to the job options file. |
| ###TAGS### | In this section we define the tasks that we are going to submit. |
Some of these require a little further information. As a note before explaining,
please use the same format for the configuration as already exists. Changing the
format could break the way scripts work unless otherwise stated. In the following
sections I leave out the #’s when describing sections but please remember they
are actually there.
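As an illustration of the delimiter table, the following minimal Python sketch classifies a single line of the file by its leading #’s. The function name classify_line and the returned labels are hypothetical and not part of the validation package; the real scripts may treat edge cases differently.

```python
def classify_line(line):
    """Return a (kind, name) pair for one line of valSetup.config.

    Hypothetical helper: the labels below are illustrative only.
    """
    text = line.strip()
    if not text:
        return ("blank", "")
    if text.startswith("####"):
        # Start/end-of-document marker.
        return ("document", "")
    if text.startswith("###"):
        # Major section such as ###VERSION### or ###TAGS###.
        return ("section", text.strip("#").strip())
    if text.startswith("##"):
        # Task name inside the TAGS section, e.g. ##T1.
        return ("task", text[2:].strip())
    if text.startswith("#"):
        return ("comment", text[1:].strip())
    # Anything else is the body of the current section.
    return ("content", text)


print(classify_line("###VERSION###"))
print(classify_line("##T1"))
```

The checks run from the longest prefix to the shortest, so a major section is never mistaken for a task name or a comment.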
Version
For the VERSION section we are essentially giving a unique identifier that will be used for the output dataset. However, it is not quite that simple, since the tag used here is also used to move data into the correct folders throughout the validation. For this reason it is important to always use the format yyyy-mm-dd-version. The version is part of this identifier in case we do something wrong and need to redo the validation. Start on V1 and increment (V2, V3, etc.) each time you need to redo all tasks. If you only need to redo a specific task, use the resubmission functionality.
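To make the yyyy-mm-dd-version format concrete, here is a small Python sketch that checks a version tag and produces the next one. It assumes the version part is written as "V" followed by an integer (as in V1, V2, V3) and that the date is kept when redoing all tasks; both function names are hypothetical and not part of the package.

```python
import re
from datetime import date

# Assumed layout: four-digit year, two-digit month and day, then V<number>.
VERSION_TAG = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-V(\d+)$")


def parse_version_tag(tag):
    """Split a tag such as 2013-11-28-V1 into a date and a version number."""
    match = VERSION_TAG.match(tag.strip())
    if match is None:
        raise ValueError(f"not in yyyy-mm-dd-version format: {tag!r}")
    year, month, day, version = (int(group) for group in match.groups())
    # date() raises ValueError for impossible dates such as month 13.
    return date(year, month, day), version


def next_version_tag(tag):
    """Return the tag to use when all tasks have to be redone."""
    day, version = parse_version_tag(tag)
    return f"{day.isoformat()}-V{version + 1}"


print(next_version_tag("2013-11-28-V1"))
```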
Resubmission
Sometimes only part of the validation has gone wrong, and redoing everything is costly. For this reason the functionality to resubmit only specific tasks was added. For instance, if a certain task fails to build on the grid (occasionally this is a site problem, and using pbook to resubmit the job does not help), set the line under this heading to 1 and comment out any tasks you do not wish to resubmit (change ## to # in the heading of each commented-out task). Then rerun the GSA. This will resubmit only the requested tasks. If you want to do another resubmission after this, change the resubmission value to 2, and so on for any further resubmissions. At the start of a new set of tasks the RESUBMISSION section should be empty.
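The commenting-out step can be sketched in Python as below. This is only an illustration of the "change ## to #" instruction: the function name is hypothetical, the task names are made up, and only the headings are altered, exactly as the text describes. How the GSA handles the sample lines under a commented-out heading is not covered here.

```python
def select_tasks_for_resubmission(lines, keep):
    """Comment out the heading of every task whose name is not in `keep`.

    Hypothetical helper: a task heading is taken to be a line that starts
    with exactly two #'s; all other lines are returned unchanged.
    """
    result = []
    for line in lines:
        text = line.strip()
        is_task_heading = text.startswith("##") and not text.startswith("###")
        if is_task_heading and text[2:].strip() not in keep:
            # Dropping one # turns the task heading into a comment.
            result.append("#" + text[2:])
        else:
            result.append(line)
    return result


config = ["###TAGS###", "##T1", "sample-a", "##T2", "sample-b"]
print(select_tasks_for_resubmission(config, {"T2"}))
```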
Extra Code
Occasionally we are required to use extra code in our job options file. The code we need can be pasted into EXTRACODE sections and it will be automatically added to the job options file for the samples we specify (see the TAGS section below). There are cases when different samples need different snippets of code added to their job options file. When this is the case you can just add more EXTRACODE sections. For example, if there are two snippets of code, paste the first snippet under EXTRACODE-0, then create another EXTRACODE section called EXTRACODE-1 (don’t forget the #’s) and paste the next snippet in there. When these snippets are referenced, they are read in using a top-down approach. By this I mean that the numbers specified in the section names play no role in the order in which the snippets are stored and indexed. The first EXTRACODE section to appear in the config file will be indexed by 0, the next by 1, and so on. The use of the word "indexed" will become clear in the TAGS section.
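The top-down indexing can be illustrated with a short Python sketch. It is not the package's own reader: the function name is hypothetical, and it assumes that everything between an EXTRACODE heading and the next ### heading belongs to the snippet.

```python
def collect_extra_code(lines):
    """Return the EXTRACODE snippets in order of appearance in the file.

    The number in the section name is ignored on purpose: the position of
    a snippet in the returned list is its index.
    """
    snippets = []
    current = None
    for line in lines:
        text = line.strip()
        if text.startswith("###"):
            name = text.strip("#").strip()
            if name.startswith("EXTRACODE"):
                current = []
                snippets.append(current)
            else:
                # Any other major section ends the current snippet.
                current = None
        elif current is not None:
            current.append(line)
    return ["\n".join(snippet) for snippet in snippets]


config = [
    "###EXTRACODE-5###", "a = 1",
    "###EXTRACODE-0###", "b = 2",
    "###TAGS###", "##T1",
]
print(collect_extra_code(config))
```

In this made-up input the section named EXTRACODE-5 appears first, so its snippet gets index 0 even though its name says 5.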
Tags
In this section we describe what datasets are in the tasks and how to process the submission of them. It is broken down into subsections delimited by ## as described in the table. This is an important section to understand, since this is where most of your configuration will happen. We will look at how a subsection works, and then you should be able to apply this to extra subsections.
Each subsection describes a task or part of a task. More specifically, it describes a test sample configuration and then the configuration of the references it will be validated against. To start a subsection we give it a name after the ##. This name describes the task. For example, if this is Task1, give it the name ##T1. On the next line we need to configure the test sample. Every line after the test sample that does not start with a # and is not blank will be read in as the configuration of a reference to use for the described test sample. After this, a new task can be defined using a new ## subsection. To configure a sample we use the following format:
tag;athena;jobo:jobOption;code:extra-code-index;cmt:cmt;tid:tid;validNum:validNumber;skip:skip-bool
In the line above, tag, athena, jobOption, extra-code-index, cmt, tid, validNumber and skip-bool are placeholders for the values you supply. The only required variables are the tag and the athena release, and they should be specified in the order shown. All other variables are optional and should only be used if required. For each optional variable one needs to specify the variable name (jobo, code, cmt, tid, validNum or skip), followed by a colon and then the value. Variable declarations are separated by semicolons.
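A sample line in this format can be read with a few lines of Python. The sketch below is not the package's parser: the function name is hypothetical, and the tag and suffix in the example are made-up values.

```python
# Optional variable names taken from the format line above.
OPTIONAL_KEYS = ("jobo", "code", "cmt", "tid", "validNum", "skip")


def parse_sample_line(line):
    """Turn 'tag;athena;key:value;...' into a dictionary of strings."""
    fields = [field.strip() for field in line.strip().split(";") if field.strip()]
    if len(fields) < 2:
        raise ValueError("a sample line needs at least a tag and an athena release")
    # The two required variables are positional.
    sample = {"tag": fields[0], "athena": fields[1]}
    for field in fields[2:]:
        key, separator, value = field.partition(":")
        if not separator or key not in OPTIONAL_KEYS:
            raise ValueError(f"unrecognised field: {field!r}")
        sample[key] = value
    return sample


print(parse_sample_line("e1234;17.2.0.2;jobo:64bitOldTauID;skip:true"))
```

Values are kept as strings on purpose, since the text does not say how the real scripts convert them.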
Compulsory variables
tag: Tags are usually of the form eXXXX sXXXX rXXXX. To find the full tag used for the samples you may need to open the panda links supplied with the validation description or look for them using the "dq2 -ls" command (dq2 should be set up first).
Athena: The Athena release to be used for the submission. The release to use is always written in the second half of the bracket next to the sample tag in the email describing the tasks (note that the letter ’s’ can be used as the release if the release is 17.2.0.2). The automatic validation setup does allow specific production releases of Athena to be used, such as 17.2.10.1-TrigMC. Where the standard setup uses commas (,), the validation package uses dashes (-). One can also specify the slc, gcc and bit version (example: 17.2.11.1-slc5-gcc43).
The Athena release can cause a problem if you are not running on the correct slc machine. The validations may come with a mixture of Athena requirements. If, for example, you plan to run an slc5 Athena version on a sample from an slc6 machine, you should specify something like this: 17.2.0.2-slc5. All Athena releases lower than 17.7 will probably need an slc5 setup, and all above will probably need a 64-bit slc6 setup.
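The dash-separated release string and the rule of thumb above can be sketched in Python as follows. Both function names are hypothetical, and the treatment of 17.7 itself (counted here as not needing slc5) is an assumption, since the text only speaks of releases lower than and above 17.7.

```python
def parse_release(spec):
    """Split a string such as 17.2.11.1-slc5-gcc43 into release and options."""
    parts = spec.strip().split("-")
    # The text states that 's' can stand for release 17.2.0.2.
    release = "17.2.0.2" if parts[0] == "s" else parts[0]
    return {"release": release, "options": parts[1:]}


def probably_needs_slc5(release):
    """Rule of thumb from the text: releases lower than 17.7 likely need slc5."""
    major_minor = tuple(int(part) for part in release.split(".")[:2])
    return major_minor < (17, 7)


spec = parse_release("17.2.11.1-slc5-gcc43")
print(spec, probably_needs_slc5(spec["release"]))
```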
Additional (optional) variables
jobo: The suffix of the job options file, which checks out additional packages. The jobOption scripts are all found in the share/JobOptions/ directory. By default the package uses the newTauID files (TauValTop_AutoConf_0.py and TauValTopBkg_AutoConf_0.py). These are the standard 32-bit job options. To access the 64-bit file, jobOption should be ’64-bit’. One only needs to specify the last part of the filename to change the jobOptions file. For example, if you need to use the TauValTop_AutoConf_64bitOldTauID.py file, then you would write:
jobo:64bitOldTauID;
code: Some additional code snippets in the jobOption files may be needed when running on upgrade samples (IBL, ITK, ...), since these have been produced in special releases and configurations. The snippets are added to the jobOptions file automatically when this variable is set. Its value is the index of the extra code to use for the submission, as discussed previously. More information about the additional code is given here.
cmt: The cmt config to use for Athena.
tid: Sometimes a task requires one to run not over the entire dataset, but only over a certain tid of the dataset. In such a case, one can specify the tid in this field.
validNum: The valid2* type sample names did not fit into the framework set out by the TauValidation package, and this variable should be set to "2" if the sample is of valid type 2.
skip: The recent migration from SLC5 to SLC6 has caused a couple of problems in the automatic tau validation. The problem lies in running a sample generated with SLC5 on an SLC6 machine, and vice versa. The 'hack' implemented in our package allows one to submit the tasks on different machines. For example, if the test is SLC6 and the reference is SLC5, one would log in to an SLC6 machine and run the validation with "skip:true" for the test and "skip:false" for the reference. One then has to log in to an SLC5 machine and run in the opposite manner.
--
GuillermoHamity - 28 Nov 2013