Difference: DiracPopularityService (1 vs. 7)

Revision 7, 2012-02-26 - DanielaRemenska

Line: 1 to 1
 
META TOPICPARENT name="DanielaRemenskaWork"

Points to remember (compiled from meetings and notes):

Line: 101 to 101
 ALERT! The AnalyseXMLLogFile implementation parses the application log AND the summary.xml, but it uses LHCbDIRAC/Core/Utilities/ProductionXMLLogAnalysis.py, which requires step_commons['listoutput'] to be defined. User jobs may lack it, so it crashes :-/
 Example output
Changed:
<
<
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Initializing $Id: DiracPopularityService.txt,v 1.5 2012/01/21 22:38:58 danielar_40nikhef_2enl Exp $
>
>
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Initializing $Id: DiracPopularityService.txt,v 1.6 2012/02/26 16:17:19 danielar_40nikhef_2enl Exp $
 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'ParametricInputData': '', 'TotalSteps': '1', 'JobName': 'Name', 'Priority': '1', 'SoftwarePackages': 'DaVinci.v29r2', 'JobReport': <DIRAC.WorkloadManagementSystem.Client.JobReport.JobReport instance at 0xd11bcf8>, 'LogLevel': 'debug', 'OutputSandbox': '*.log;summary.data;summary.xml', 'JobType': 'User', 'SystemConfig': 'ANY', 'JOB_ID': '00000000', 'StdError': 'std.err', 'Request': <DIRAC.RequestManagementSystem.Client.RequestContainer.RequestContainer instance at 0xd0eec68>, 'AccountingReport': <DIRAC.AccountingSystem.Client.DataStoreClient.DataStoreClient instance at 0xd19a830>, 'ParametricInputSandbox': '', 'JobGroup': 'lhcb', 'StdOutput': 'std.out', 'Origin': 'DIRAC', 'Site': 'ANY', 'PRODUCTION_ID': '00000000', 'MaxCPUTime': '5000', 'LogFilePath': '/project/bfys/dremensk/ctmdev/LHCbDirac_v6r8p2/etc', 'InputData': ''} 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'applicationName': 'DaVinci', 'STEP_DEFINITION_NAME': 'DaVinciStep1', 'applicationVersion': 'v29r2', 'JOB_ID': '00000000', 'optionsLine': '', 'STEP_NUMBER': '1', 'StartStats': (4.2199999999999998, 0.23999999999999999, 0.0, 0.0, 8676995.6999999993), 'STEP_INSTANCE_NAME': 'RunDaVinciStep1', 'inputDataType': 'DATA', 'applicationLog': 'Step1_DaVinci_v29r2.log', 'optionsFile': '/project/bfys/dremensk/DaVinci-Default.py', 'PRODUCTION_ID': '00000000', 'STEP_ID': '00000000_00000000_1', 'StartTime': 1326251676.3210549, 'inputData': 'LFN:/lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/00012957_00000753_1.bhadron.dst'} 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Input data defined in workflow for this Gaudi Application step
Line: 262 to 262
  Question I guess in such a case we can only report the SITE and the PFN, not the SE?
Changed:
<
<
In either case it's not really possible (or not trivial, for now) to extract the exact SE within LHCbDIRAC itself, so the "/LocalSite/LocalSE" parameter will be used instead, to send only the SITE parameter together with the hit count.
>
>
In either case it's not really possible (or not trivial, for now) to extract the exact SE within LHCbDIRAC itself, so the "/LocalSite/LocalSE" parameter will be used instead, to send only the SITE parameter together with the hit count. In the future this could be modified to query the ReplicaCatalog and find the precise SE from the SITE, but querying the RC for each file is a heavy and unnecessary operation for now.
 

Local testing of new modules:

In dirac.cfg, the important settings for local testing are: use the LHCb-Development setup to test on volhcb12; comment out the other servers and leave only volhcb18, since it times out with the rest; and make sure LocalArea and SharedArea are defined in LocalSite.

Revision 6, 2012-02-26 - DanielaRemenska

Line: 1 to 1
 
META TOPICPARENT name="DanielaRemenskaWork"

Points to remember (compiled from meetings and notes):

Line: 101 to 101
 ALERT! The AnalyseXMLLogFile implementation parses the application log AND the summary.xml, but it uses LHCbDIRAC/Core/Utilities/ProductionXMLLogAnalysis.py, which requires step_commons['listoutput'] to be defined. User jobs may lack it, so it crashes :-/
 Example output
Changed:
<
<
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Initializing $Id: DiracPopularityService.txt,v 1.5 2012/01/21 22:28:54 danielar_40nikhef_2enl Exp $
>
>
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Initializing $Id: DiracPopularityService.txt,v 1.5 2012/01/21 22:38:58 danielar_40nikhef_2enl Exp $
 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'ParametricInputData': '', 'TotalSteps': '1', 'JobName': 'Name', 'Priority': '1', 'SoftwarePackages': 'DaVinci.v29r2', 'JobReport': <DIRAC.WorkloadManagementSystem.Client.JobReport.JobReport instance at 0xd11bcf8>, 'LogLevel': 'debug', 'OutputSandbox': '*.log;summary.data;summary.xml', 'JobType': 'User', 'SystemConfig': 'ANY', 'JOB_ID': '00000000', 'StdError': 'std.err', 'Request': <DIRAC.RequestManagementSystem.Client.RequestContainer.RequestContainer instance at 0xd0eec68>, 'AccountingReport': <DIRAC.AccountingSystem.Client.DataStoreClient.DataStoreClient instance at 0xd19a830>, 'ParametricInputSandbox': '', 'JobGroup': 'lhcb', 'StdOutput': 'std.out', 'Origin': 'DIRAC', 'Site': 'ANY', 'PRODUCTION_ID': '00000000', 'MaxCPUTime': '5000', 'LogFilePath': '/project/bfys/dremensk/ctmdev/LHCbDirac_v6r8p2/etc', 'InputData': ''} 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'applicationName': 'DaVinci', 'STEP_DEFINITION_NAME': 'DaVinciStep1', 'applicationVersion': 'v29r2', 'JOB_ID': '00000000', 'optionsLine': '', 'STEP_NUMBER': '1', 'StartStats': (4.2199999999999998, 0.23999999999999999, 0.0, 0.0, 8676995.6999999993), 'STEP_INSTANCE_NAME': 'RunDaVinciStep1', 'inputDataType': 'DATA', 'applicationLog': 'Step1_DaVinci_v29r2.log', 'optionsFile': '/project/bfys/dremensk/DaVinci-Default.py', 'PRODUCTION_ID': '00000000', 'STEP_ID': '00000000_00000000_1', 'StartTime': 1326251676.3210549, 'inputData': 'LFN:/lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/00012957_00000753_1.bhadron.dst'} 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Input data defined in workflow for this Gaudi Application step
Line: 262 to 262
  Question I guess in such a case we can only report the SITE and the PFN, not the SE?
Added:
>
>
In either case it's not really possible (or not trivial, for now) to extract the exact SE within LHCbDIRAC itself, so the "/LocalSite/LocalSE" parameter will be used instead, to send only the SITE parameter together with the hit count.
 

Local testing of new modules:

In dirac.cfg, the important settings for local testing are: use the LHCb-Development setup to test on volhcb12; comment out the other servers and leave only volhcb18, since it times out with the rest; and make sure LocalArea and SharedArea are defined in LocalSite.

Revision 5, 2012-01-21 - DanielaRemenska

Line: 1 to 1
 
META TOPICPARENT name="DanielaRemenskaWork"

Points to remember (compiled from meetings and notes):

Line: 7 to 7
 
  • Implementation should be inside LHCbDirac
  • The only information the job should send is the list of LFNs and their respective Storage Elements. More precisely, a new module should be added to the finalization step, which already exists for user jobs as well. The information will be taken from:
    • The XML file (summary.xml), where the list of input files (LFNs) is reported. The "XML summary" is produced by the LHCb applications independently. It is inspected by LHCbDIRAC to know if the application reached the end of the computation (i.e. if it has processed all the input events).
Added:
>
>
Federico
The "XML file" of the job is the "jobDescription.xml" file that is uploaded with all the logs of the jobs. It contains the job description (which is the DIRAC workflow) in an XML format. It is created before the job is submitted, and contains information on what to run. The "XML summary" is instead produced by the LHCb applications independently. (Note to self: this means forget about summary.xml.)
 
    • The SE has to be obtained from DIRAC, it is not in the XML file. We can also record the site if easier, but formally the SE is the real interesting entity. The job knows it when resolving the input data.
  • Information should be passed onto the finalization step on the files that were actually used by the job, not requested (in case for example someone has 20 input files but reads only 10 events). This also means that partially used files are reported to the Popularity Service?
Line: 100 to 101
 ALERT! The AnalyseXMLLogFile implementation parses the application log AND the summary.xml, but it uses LHCbDIRAC/Core/Utilities/ProductionXMLLogAnalysis.py, which requires step_commons['listoutput'] to be defined. User jobs may lack it, so it crashes :-/
 Example output
Changed:
<
<
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Initializing $Id: DiracPopularityService.txt,v 1.4 2012/01/17 16:39:30 danielar_40nikhef_2enl Exp $
>
>
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Initializing $Id: DiracPopularityService.txt,v 1.5 2012/01/21 22:28:54 danielar_40nikhef_2enl Exp $
 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'ParametricInputData': '', 'TotalSteps': '1', 'JobName': 'Name', 'Priority': '1', 'SoftwarePackages': 'DaVinci.v29r2', 'JobReport': <DIRAC.WorkloadManagementSystem.Client.JobReport.JobReport instance at 0xd11bcf8>, 'LogLevel': 'debug', 'OutputSandbox': '*.log;summary.data;summary.xml', 'JobType': 'User', 'SystemConfig': 'ANY', 'JOB_ID': '00000000', 'StdError': 'std.err', 'Request': <DIRAC.RequestManagementSystem.Client.RequestContainer.RequestContainer instance at 0xd0eec68>, 'AccountingReport': <DIRAC.AccountingSystem.Client.DataStoreClient.DataStoreClient instance at 0xd19a830>, 'ParametricInputSandbox': '', 'JobGroup': 'lhcb', 'StdOutput': 'std.out', 'Origin': 'DIRAC', 'Site': 'ANY', 'PRODUCTION_ID': '00000000', 'MaxCPUTime': '5000', 'LogFilePath': '/project/bfys/dremensk/ctmdev/LHCbDirac_v6r8p2/etc', 'InputData': ''} 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'applicationName': 'DaVinci', 'STEP_DEFINITION_NAME': 'DaVinciStep1', 'applicationVersion': 'v29r2', 'JOB_ID': '00000000', 'optionsLine': '', 'STEP_NUMBER': '1', 'StartStats': (4.2199999999999998, 0.23999999999999999, 0.0, 0.0, 8676995.6999999993), 'STEP_INSTANCE_NAME': 'RunDaVinciStep1', 'inputDataType': 'DATA', 'applicationLog': 'Step1_DaVinci_v29r2.log', 'optionsFile': '/project/bfys/dremensk/DaVinci-Default.py', 'PRODUCTION_ID': '00000000', 'STEP_ID': '00000000_00000000_1', 'StartTime': 1326251676.3210549, 'inputData': 'LFN:/lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/00012957_00000753_1.bhadron.dst'} 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Input data defined in workflow for this Gaudi Application step

Revision 4, 2012-01-17 - DanielaRemenska

Line: 1 to 1
 
META TOPICPARENT name="DanielaRemenskaWork"

Points to remember (compiled from meetings and notes):

Line: 100 to 100
 ALERT! The AnalyseXMLLogFile implementation parses the application log AND the summary.xml, but it uses LHCbDIRAC/Core/Utilities/ProductionXMLLogAnalysis.py, which requires step_commons['listoutput'] to be defined. User jobs may lack it, so it crashes :-/
 Example output
Changed:
<
<
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Initializing $Id: DiracPopularityService.txt,v 1.3 2012/01/16 09:53:02 danielar_40nikhef_2enl Exp $
>
>
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Initializing $Id: DiracPopularityService.txt,v 1.4 2012/01/17 16:39:30 danielar_40nikhef_2enl Exp $
 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'ParametricInputData': '', 'TotalSteps': '1', 'JobName': 'Name', 'Priority': '1', 'SoftwarePackages': 'DaVinci.v29r2', 'JobReport': <DIRAC.WorkloadManagementSystem.Client.JobReport.JobReport instance at 0xd11bcf8>, 'LogLevel': 'debug', 'OutputSandbox': '*.log;summary.data;summary.xml', 'JobType': 'User', 'SystemConfig': 'ANY', 'JOB_ID': '00000000', 'StdError': 'std.err', 'Request': <DIRAC.RequestManagementSystem.Client.RequestContainer.RequestContainer instance at 0xd0eec68>, 'AccountingReport': <DIRAC.AccountingSystem.Client.DataStoreClient.DataStoreClient instance at 0xd19a830>, 'ParametricInputSandbox': '', 'JobGroup': 'lhcb', 'StdOutput': 'std.out', 'Origin': 'DIRAC', 'Site': 'ANY', 'PRODUCTION_ID': '00000000', 'MaxCPUTime': '5000', 'LogFilePath': '/project/bfys/dremensk/ctmdev/LHCbDirac_v6r8p2/etc', 'InputData': ''} 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'applicationName': 'DaVinci', 'STEP_DEFINITION_NAME': 'DaVinciStep1', 'applicationVersion': 'v29r2', 'JOB_ID': '00000000', 'optionsLine': '', 'STEP_NUMBER': '1', 'StartStats': (4.2199999999999998, 0.23999999999999999, 0.0, 0.0, 8676995.6999999993), 'STEP_INSTANCE_NAME': 'RunDaVinciStep1', 'inputDataType': 'DATA', 'applicationLog': 'Step1_DaVinci_v29r2.log', 'optionsFile': '/project/bfys/dremensk/DaVinci-Default.py', 'PRODUCTION_ID': '00000000', 'STEP_ID': '00000000_00000000_1', 'StartTime': 1326251676.3210549, 'inputData': 'LFN:/lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/00012957_00000753_1.bhadron.dst'} 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Input data defined in workflow for this Gaudi Application step
Line: 176 to 176
 DONE Federico: There's no need to inspect the XML summary files to know the input files. Plus, and more importantly, user jobs do not write the XML summary! What can be inspected instead is the JDL file, or the jobDescription: it comes in the form of a file, jobDescription.xml, but also in the form of a Python dictionary, which all the modules can access easily. We had better have a private chat (even better, a video one) on the implementation, because it's a bit tricky to understand. There is no documentation available, sorry. The SE information is also there. We cannot force it (not for users), but as said we don't need to.
Changed:
<
<
Question: Yes, as far as I understand, in the workflow_commons and step_commons all the parameters from the JDL file are present. However, this doesn't give a realistic picture on which files the job actually accessed. Philippe's point in a previous email conversation was that we want to see which input files the job accessed. It could be that the InputData list was long but only the first 1000 events were processed. If this is not what we want, things are simpler.
>
>
Question What if the input data is defined in the options files? Then it will not be in the JDL or in the jobDescription.xml file! It will only be in the summary.xml.

Question: As far as I understand, in the workflow_commons and step_commons all the parameters from the JDL file are present. However, this doesn't give a realistic picture on which files the job actually accessed. Philippe's point in a previous email conversation was that we want to see which input files the job accessed. It could be that the InputData list was long but only the first 1000 events were processed. If this is not what we want, things are simpler.

  DONE Federico: Mmm, I'm not convinced by this argument. Suppose that 1000 events are requested, out of 10 files. If each file has 500 events, you'd report just 2 of these files, but that does not actually mean that the other 8 files are less popular than the 2 used. The inputs are in self.workflow_commons['InputData']. What you get in self.step_commons['InputData'] is the input data of the step, which is equal to self.workflow_commons['InputData'] only for the first step.
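
A minimal sketch of reading the declared job inputs from the workflow parameters, as discussed above. It assumes workflow_commons behaves like a plain dictionary and that 'InputData' holds either a list or a semicolon-separated string of entries with an optional 'LFN:' prefix. The separator and the helper name get_input_lfns are assumptions made for illustration, not taken from LHCbDIRAC.

```python
def get_input_lfns(workflow_commons):
    """Return the list of input LFNs declared in the job description."""
    raw = workflow_commons.get('InputData', '')
    # Assumption: a string value is a semicolon-separated list of entries
    entries = raw.split(';') if isinstance(raw, str) else list(raw)
    lfns = []
    for entry in entries:
        entry = entry.strip()
        if not entry:
            continue
        # Drop the optional 'LFN:' prefix seen in the job parameters
        if entry.startswith('LFN:'):
            entry = entry[len('LFN:'):]
        lfns.append(entry)
    return lfns
```

An empty 'InputData' value (as in the example output, where the data was not declared in the job description) simply yields an empty list, so the caller can decide whether to report nothing or fall back to another source.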

Revision 3, 2012-01-16 - DanielaRemenska

Line: 1 to 1
 
META TOPICPARENT name="DanielaRemenskaWork"

Points to remember (compiled from meetings and notes):

Line: 36 to 36
 However, theoretically a job can have many steps, some of which can be standard Gaudi applications, but also custom (Root, Python) scripts.

Question Are we interested in collecting data access information only for user jobs with standard Gaudi applications like Brunel and DaVinci, or also for custom scripts and executables? NB: In case of multiple steps, the UserJobFinalization module will enable itself only at the end of the workflow.

Changed:
<
<
Question But the "finalization step" referred to in meetings is actually a module, not a step (a DIRAC step) ? It is in fact added at the end of each step of a multi-step job, but enabled only once, at the end of the job execution...so which "finalization step" are we talking about? Presumably, a separate module should be developed for collecting and sending the file usage data to the Popularity service. This UserJobFinalization module seems to be responsible only for uploading job output data.
>
>
DONE Federico: We care about input files, not the reason why they are used.

Question But the "finalization step" referred to in meetings is actually a module, not a step (a DIRAC step)? It is in fact added at the end of each step of a multi-step job, but enabled only once, at the end of the job execution. So which "finalization step" are we talking about?

DONE Federico: A step is made of modules. For user jobs, the finalization step is made of just one module, but for production jobs the finalization step contains many modules.

Question: This is the UserJobFinalization module, but it's added in the finalization step separately in LHCbJob.py, depending on job type: GaudiApplication, or Root Executable, BenderModule etc. Actually, it is correct that for Gaudi applications there is only ONE single step instantiated, and two modules defined in it: GaudiApplication and UserJobFinalization, right? That's why I was puzzled as to what is a "finalization" step, since it's only one.

DONE Federico: Not completely true: for Production jobs (created within Production.py) it is:

      modulesNameList = gConfig.getValue( '%s/GaudiStep_Modules' % self.csSection, ['GaudiApplication',
                                                                                    'AnalyseXMLSummary',
                                                                                    'ErrorLogging',
                                                                                    'BookkeepingReport'] )

      gaudiStepDef = getStepDefinition( 'Gaudi_App_Step', modulesNameList = modulesNameList,
                                        parametersList = parametersList )
      self.LHCbJob.workflow.addStep( gaudiStepDef )


And

      modulesNameList = gConfig.getValue( '%s/FinalizationStep_Modules' % self.csSection, ['UploadOutputData',
                                                                                           'FailoverRequest',
                                                                                           'UploadLogFile'] )

      jobFinalizationStepDef = getStepDefinition( 'Job_Finalization', modulesNameList = modulesNameList )
      self.LHCbJob.workflow.addStep( jobFinalizationStepDef )
For user jobs, instead, the workflow is much simpler, and it is what you say: one step, with 2 modules (GaudiApplicationScript and UserJobFinalization).

Question: Yes, clear. And can you please correct me if I'm wrong: we want to add a third module in the step, and only for user jobs. These jobs can be of any kind: Gaudi applications, Root macros, executables, etc. Currently in the LHCbJob.py implementation, the second module (UserJobFinalization) is added separately for each of them, but only enabled when it is the last step in the entire job workflow. So I'd say the same needs to be done for the new module.

DONE Federico: We want to account for the usage not only of user jobs, but also of WG-productions (Working-Group productions), which are in fact analysis jobs in production form. For user jobs the new module should go between the existing 2.
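
As a rough illustration of "between the existing 2", the sketch below inserts a new module name into a step's module list just before the finalization module. The helper place_before is hypothetical, and 'FileUsage' is only the working name suggested in these notes; the real change would be made where LHCbJob.py builds the step.

```python
def place_before(module_names, new_module, anchor):
    """Return a copy of module_names with new_module inserted before anchor."""
    ordered = list(module_names)
    # Fall back to appending when the anchor module is not in the step
    position = ordered.index(anchor) if anchor in ordered else len(ordered)
    ordered.insert(position, new_module)
    return ordered


# User job step: application module first, finalization module last
user_step = place_before(['GaudiApplication', 'UserJobFinalization'],
                         'FileUsage', 'UserJobFinalization')
```

Keeping the usage module ahead of UserJobFinalization means the report is prepared before the output upload runs, which matches the ordering requested above.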

Presumably, a separate module should be developed for collecting and sending the file usage data to the Popularity service. This UserJobFinalization module seems to be responsible only for uploading job output data.

 This means changing the logic of the LHCbJob class to include this module in the workflow steps. Let's say we care only about standard Gaudi applications, then in the getGaudiApplicationStep(...) method, the following additions should be roughly made: (say FileUsage.py is a new module in LHCbDIRAC/Workflow/Modules)
Line: 56 to 93
 VS
Changed:
<
<
Are they any different? ALERT! The AnalyseXMLLogFile implementation parses the application log AND the summary.xml, but it uses LHCbDIRAC/Core/Utilities/ProductionXMLLogAnalysis.py, which requires step_commons['listoutput'] to be defined. User jobs may lack it, so it crashes :-/
>
>
Are they any different?

DONE Federico: The analyseXMLSummary will replace the analyseLogFile from the next release, but as said before, you don't need to look into it.

ALERT! The AnalyseXMLLogFile implementation parses the application log AND the summary.xml, but it uses LHCbDIRAC/Core/Utilities/ProductionXMLLogAnalysis.py, which requires step_commons['listoutput'] to be defined. User jobs may lack it, so it crashes :-/

 Example output
Changed:
<
<
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Initializing $Id: DiracPopularityService.txt,v 1.2 2012/01/11 13:41:50 danielar_40nikhef_2enl Exp $
>
>
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Initializing $Id: DiracPopularityService.txt,v 1.3 2012/01/16 09:53:02 danielar_40nikhef_2enl Exp $
 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'ParametricInputData': '', 'TotalSteps': '1', 'JobName': 'Name', 'Priority': '1', 'SoftwarePackages': 'DaVinci.v29r2', 'JobReport': <DIRAC.WorkloadManagementSystem.Client.JobReport.JobReport instance at 0xd11bcf8>, 'LogLevel': 'debug', 'OutputSandbox': '*.log;summary.data;summary.xml', 'JobType': 'User', 'SystemConfig': 'ANY', 'JOB_ID': '00000000', 'StdError': 'std.err', 'Request': <DIRAC.RequestManagementSystem.Client.RequestContainer.RequestContainer instance at 0xd0eec68>, 'AccountingReport': <DIRAC.AccountingSystem.Client.DataStoreClient.DataStoreClient instance at 0xd19a830>, 'ParametricInputSandbox': '', 'JobGroup': 'lhcb', 'StdOutput': 'std.out', 'Origin': 'DIRAC', 'Site': 'ANY', 'PRODUCTION_ID': '00000000', 'MaxCPUTime': '5000', 'LogFilePath': '/project/bfys/dremensk/ctmdev/LHCbDirac_v6r8p2/etc', 'InputData': ''} 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'applicationName': 'DaVinci', 'STEP_DEFINITION_NAME': 'DaVinciStep1', 'applicationVersion': 'v29r2', 'JOB_ID': '00000000', 'optionsLine': '', 'STEP_NUMBER': '1', 'StartStats': (4.2199999999999998, 0.23999999999999999, 0.0, 0.0, 8676995.6999999993), 'STEP_INSTANCE_NAME': 'RunDaVinciStep1', 'inputDataType': 'DATA', 'applicationLog': 'Step1_DaVinci_v29r2.log', 'optionsFile': '/project/bfys/dremensk/DaVinci-Default.py', 'PRODUCTION_ID': '00000000', 'STEP_ID': '00000000_00000000_1', 'StartTime': 1326251676.3210549, 'inputData': 'LFN:/lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/00012957_00000753_1.bhadron.dst'} 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Input data defined in workflow for this Gaudi Application step
Line: 130 to 171
 xmlsum=XMLSummarySvc("CounterSummarySvc")
Changed:
<
<
Question How will we force jobs to produce the summary.xml when it is optionally set by the users? Or should we parse the data access directly from the Job log file? (example Step1_DaVinci_v29r2.log)
>
>
Question How will we force jobs to produce the summary.xml when it is optionally set by the users? Or should we parse the data access directly from the Job log file? (example Step1_DaVinci_v29r2.log)

DONE Federico: There's no need to inspect the XML summary files to know the input files. Plus, and more importantly, user jobs do not write the XML summary! What can be inspected instead is the JDL file, or the jobDescription: it comes in the form of a file, jobDescription.xml, but also in the form of a Python dictionary, which all the modules can access easily. We had better have a private chat (even better, a video one) on the implementation, because it's a bit tricky to understand. There is no documentation available, sorry. The SE information is also there. We cannot force it (not for users), but as said we don't need to.

Question: Yes, as far as I understand, in the workflow_commons and step_commons all the parameters from the JDL file are present. However, this doesn't give a realistic picture on which files the job actually accessed. Philippe's point in a previous email conversation was that we want to see which input files the job accessed. It could be that the InputData list was long but only the first 1000 events were processed. If this is not what we want, things are simpler.

DONE Federico: Mmm, I'm not convinced by this argument. Suppose that 1000 events are requested, out of 10 files. If each file has 500 events, you'd report just 2 of these files, but that does not actually mean that the other 8 files are less popular than the 2 used. The inputs are in self.workflow_commons['InputData']. What you get in self.step_commons['InputData'] is the input data of the step, which is equal to self.workflow_commons['InputData'] only for the first step.

Question: Well, the argument is that unless the remaining 8 files are touched by the job, they should not be counted as popular. Note that even if only one event is read from a file, it will be reported as used. From what I remember, the strongest argument for using the job summary report was that one can see which files were actually used, rather than reporting every file listed as a job input. If we want to account for all input files in the job description (hence, contained in workflow_commons['InputData']), then why do we need a brand new module? The JobWrapper itself can send the list of all files.

DONE Federico: The JobWrapper is in DIRAC only; this is an LHCbDIRAC development.

  Four possible statuses of the files reported in summary.xml after the payload execution:
Line: 168 to 221
  We should report only in case the application execution is successful, and the final step is reached.
Added:
>
>
DONE Federico: When a file is requested by a job, it is used.

 ALERT! However, we can't force users to use LFNs. For instance, a job can be specified with the PFNs directly, and it will still work:
from Configurables import DaVinci

Revision 2, 2012-01-11 - DanielaRemenska

Line: 1 to 1
 
META TOPICPARENT name="DanielaRemenskaWork"

Points to remember (compiled from meetings and notes):

Line: 14 to 14
 
Changed:
<
<

Implementation details

>
>

Implementation details/concerns

 The existing steps and modules for jobs are available here. For LHCb these could be e.g. Gauss, Boole, Brunel, DaVinci, Bender, etc. The LHCb job workflow is defined in LHCbDIRAC/Interfaces/API/LHCbJob.py
Line: 56 to 56
 VS
Changed:
<
<
Are they any different? ALERT! The AnalyseXMLLogFile implementation parses the application summary.xml, but it uses LHCbDIRAC/Core/Utilities/ProductionXMLLogAnalysis.py, which requires step_commons['listoutput'] to be defined. User jobs may lack it, so it crashes :-/
>
>
Are they any different? ALERT! The AnalyseXMLLogFile implementation parses the application log AND the summary.xml, but it uses LHCbDIRAC/Core/Utilities/ProductionXMLLogAnalysis.py, which requires step_commons['listoutput'] to be defined. User jobs may lack it, so it crashes :-/
 Example output
Changed:
<
<
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Initializing $Id: DiracPopularityService.txt,v 1.1 2012/01/11 03:43:11 danielar_40nikhef_2enl Exp $
>
>
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Initializing $Id: DiracPopularityService.txt,v 1.2 2012/01/11 13:41:50 danielar_40nikhef_2enl Exp $
 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'ParametricInputData': '', 'TotalSteps': '1', 'JobName': 'Name', 'Priority': '1', 'SoftwarePackages': 'DaVinci.v29r2', 'JobReport': <DIRAC.WorkloadManagementSystem.Client.JobReport.JobReport instance at 0xd11bcf8>, 'LogLevel': 'debug', 'OutputSandbox': '*.log;summary.data;summary.xml', 'JobType': 'User', 'SystemConfig': 'ANY', 'JOB_ID': '00000000', 'StdError': 'std.err', 'Request': <DIRAC.RequestManagementSystem.Client.RequestContainer.RequestContainer instance at 0xd0eec68>, 'AccountingReport': <DIRAC.AccountingSystem.Client.DataStoreClient.DataStoreClient instance at 0xd19a830>, 'ParametricInputSandbox': '', 'JobGroup': 'lhcb', 'StdOutput': 'std.out', 'Origin': 'DIRAC', 'Site': 'ANY', 'PRODUCTION_ID': '00000000', 'MaxCPUTime': '5000', 'LogFilePath': '/project/bfys/dremensk/ctmdev/LHCbDirac_v6r8p2/etc', 'InputData': ''} 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'applicationName': 'DaVinci', 'STEP_DEFINITION_NAME': 'DaVinciStep1', 'applicationVersion': 'v29r2', 'JOB_ID': '00000000', 'optionsLine': '', 'STEP_NUMBER': '1', 'StartStats': (4.2199999999999998, 0.23999999999999999, 0.0, 0.0, 8676995.6999999993), 'STEP_INSTANCE_NAME': 'RunDaVinciStep1', 'inputDataType': 'DATA', 'applicationLog': 'Step1_DaVinci_v29r2.log', 'optionsFile': '/project/bfys/dremensk/DaVinci-Default.py', 'PRODUCTION_ID': '00000000', 'STEP_ID': '00000000_00000000_1', 'StartTime': 1326251676.3210549, 'inputData': 'LFN:/lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/00012957_00000753_1.bhadron.dst'} 2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile INFO: Input data defined in workflow for this Gaudi Application step
Line: 169 to 168
  We should report only in case the application execution is successful, and the final step is reached.
Added:
>
>
ALERT! However, we can't force users to use LFNs. For instance, a job can be specified with the PFNs directly, and it will still work:
from Configurables import DaVinci
from Gaudi.Configuration import *
##############################################################################
# The input is given directly as a PFN, so no LFN appears in the job description
DaVinci().DataType = "2011"
DaVinci().EvtMax = 15000
EventSelector().Input = [ "DATAFILE='dcap://bee37.grid.sara.nl:22125/pnfs/grid.sara.nl/data/lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/00012957_00000753_1.bhadron.dst' TYP='POOL_ROOTTREE' OPT='READ'" ]

In this case the job inputData parameter is an empty list, and the JobWrapper does NOT go through the InputDataResolution step (no Input Data Policy is applied). But the job reads events successfully, and the summary.xml file will look like:

<?xml version="1.0" encoding="UTF-8"?>
<summary version="1.0" xsi:noNamespaceSchemaLocation="$XMLSUMMARYBASEROOT/xml/XMLSummary.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <success>True</success>
   <step>finalize</step>
   <usage>
      <stat unit="KB" useOf="MemoryMaximum">578596.0</stat>
   </usage>
   <input>
      <file GUID="4C889DC0-8929-E111-AAE4-0030487DF702" name="PFN:dcap://bee37.grid.sara.nl:22125/pnfs/grid.sara.nl/data/lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/00012957_00000753_1.bhadron.dst" status="full">12092</file>
   </input>
   <output />
   <counters>
      <counter name="DaVinciInitAlg/Events">12092</counter>
      <counter name="CounterSummarySvc/handled">12094</counter>
   </counters>
   <lumiCounters />
</summary>
This means that the LFNs are not reported in the summary.xml, but the PFNs are. Do we consider this use case, or just ignore it? AFAIK people often (especially for CASTOR) specify the job input with
 Input = "DATAFILE='PFN:castor:/castor/cern.ch/user/t/test/blabla.dst' TYP='POOL_ROOTTREE' OPT='READ'" 
An example of such a job at /afs/cern.ch/user/d/dremensk/DEBUG_2867 (Setup: LHCb-Development JobID=2867)

Question I guess in such a case we can only report the SITE and the PFN, not the SE?
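
To make the PFN-versus-LFN case concrete, here is a minimal sketch that reads the input entries of an XML summary laid out like the example above. It uses only the Python standard library; the function name read_input_files is made up for illustration, and the 'LFN:' prefix for catalog-resolved files is an assumption by analogy with the 'PFN:' prefix shown in the example.

```python
import xml.etree.ElementTree as ET


def read_input_files(summary_xml_text):
    """Return (success, files) where files is a list of
    (kind, path, status, events) tuples taken from <input>/<file>."""
    root = ET.fromstring(summary_xml_text)
    success = (root.findtext('success') or '').strip() == 'True'
    files = []
    for node in root.findall('./input/file'):
        name = node.get('name', '')
        # The name carries a 'PFN:' (or, by assumption, 'LFN:') prefix
        kind, sep, path = name.partition(':')
        if not sep or kind not in ('LFN', 'PFN'):
            kind, path = 'UNKNOWN', name
        # The element text is the number of events read from the file
        events = int(node.text or 0)
        files.append((kind, path, node.get('status'), events))
    return success, files
```

A reporting module could keep the entries of kind 'LFN' for the directory counts and treat 'PFN' entries separately, for example by reporting only the site for them.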

 

Local testing of new modules:

In dirac.cfg, the important settings for local testing are: use the LHCb-Development setup to test on volhcb12; comment out the other servers and leave only volhcb18, since it times out with the rest; and make sure LocalArea and SharedArea are defined in LocalSite.

Revision 1, 2012-01-11 - DanielaRemenska

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="DanielaRemenskaWork"

Points to remember (compiled from meetings and notes):

  • The objective of the popularity service is only to rank how popular a dataset is, not to identify lost/corrupted replicas or the source of data access failures.
  • The unit of information will be the LFC directory.
  • Implementation should be inside LHCbDirac
  • The only information that should be sent by the job are the LFNs and respective Storage Elements. More precisely, a new module should be added to the finalization step, which already exists for user jobs as well. The information will be taken from:
    • The XML file (summary.xml), where the list of input files (LFNs) is reported. The "XML summary" is produced by the LHCb applications independently. It is inspected by LHCbDIRAC to know if the application reached the end of the computation (i.e. if it has processed all the input events).
    • The SE has to be obtained from DIRAC, it is not in the XML file. We can also record the site if easier, but formally the SE is the real interesting entity. The job knows it when resolving the input data.
  • Information should be passed onto the finalization step on the files that were actually used by the job, not requested (in case for example someone has 20 input files but reads only 10 events). This also means that partially used files are reported to the Popularity Service?

  • This new module should send the information to the Popularity Service, which will then populate the StorageUsageDB. Currently this database (on the volhcb12 development machine) has two tables: Popularity and DirMetadata. The Popularity table contains traces (DID, Count, SEName, InsertTime) of individual jobs. The jobs themselves only send entries of type (SE, DirDict), where DirDict = { dir1: count1, dir2: count2, ... }. Change: instead of inserting the Directory ID, which implied a query to the table with the LFN directories, simply insert the LFN. This saves a query. Then, asynchronously, when the Popularity agent creates the reports, it will check whether the LFN paths stored in the Popularity table exist in the LFN directory table, and will raise an error if they do not. DirMetadata contains bookkeeping information: configName, configVersion, Conditions, Proc.Pass, eventType, fileType, production, etc.
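As an illustration of the DirDict structure, here is a small sketch that groups a list of LFNs by directory and counts the files per directory. The function name build_dir_dict is hypothetical, and the trailing slash on directory keys is an assumption chosen to match the style of the keys in the insertion example on this page.

```python
import posixpath
from collections import Counter


def build_dir_dict(lfns):
    """Group LFNs by directory and count files per directory.

    Accepts names with or without the 'LFN:' prefix and returns a plain
    dict of the form {directory: count}.
    """
    counts = Counter()
    for lfn in lfns:
        path = lfn[4:] if lfn.startswith("LFN:") else lfn
        directory = posixpath.dirname(path)
        # Assumption: directories are keyed with a trailing slash.
        if not directory.endswith("/"):
            directory += "/"
        counts[directory] += 1
    return dict(counts)
```

Two files from /lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/ would give a single key for that directory with count 2.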

Implementation details

The existing steps and modules for jobs are available here. For LHCb these could be e.g. Gauss, Boole, Brunel, DaVinci, Bender, etc. The LHCb job workflow is defined in LHCbDIRAC/Interfaces/API/LHCbJob.py.

For example, in the method

def setApplication(...)
there is a call to
 __getGaudiApplicationStep(..)
which controls the definition for a Gaudi application step. Two modules are included by default (and several parameters): GaudiApplication.py and UserJobFinalization.py.

Another example, in the method

 setRootPythonScript(...)
there is a call to
__getRootApplicationStep(...) 
the two defined modules for a Root script step are: RootApplication.py and UserJobFinalization.py

etc.

However, theoretically a job can have many steps, some of which can be standard Gaudi applications, but also custom (Root, Python) scripts.

Question Are we interested in collecting data access information only for user jobs with standard Gaudi applications like Brunel and DaVinci, or also for custom scripts and executables? NB: In case of multiple steps, the UserJobFinalization module will enable itself only at the end of the workflow.

Question But the "finalization step" referred to in meetings is actually a module, not a step (a DIRAC step)? It is in fact added at the end of each step of a multi-step job, but enabled only once, at the end of the job execution... so which "finalization step" are we talking about?

Presumably, a separate module should be developed for collecting and sending the file usage data to the Popularity Service. The UserJobFinalization module seems to be responsible only for uploading job output data. This means changing the logic of the LHCbJob class to include the new module in the workflow steps. Let's say we care only about standard Gaudi applications; then in the __getGaudiApplicationStep(...) method, roughly the following additions should be made (say FileUsage.py is a new module in LHCbDIRAC/Workflow/Modules):

    moduleName = 'FileUsage'
    fileUsage = ModuleDefinition( moduleName )
    fileUsage.setDescription( 'Blabla module ')
    body = 'from %s.%s import %s\n' % ( self.importLocation, moduleName, moduleName )
    fileUsage.setBody( body )
...
    step.addModule( fileUsage )
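To make the body line concrete, this is what the string formatting produces. The value of import_location is an assumption: it stands in for self.importLocation and is taken to be the Python package path of LHCbDIRAC/Workflow/Modules.

```python
# Assumption: self.importLocation points at the package that would hold
# the new module, i.e. LHCbDIRAC/Workflow/Modules.
import_location = "LHCbDIRAC.Workflow.Modules"
module_name = "FileUsage"

# Same formatting as in the snippet above: the module body is just the
# import statement for the module class.
body = "from %s.%s import %s\n" % (import_location, module_name, module_name)
```

Under that assumption, body is the single line "from LHCbDIRAC.Workflow.Modules.FileUsage import FileUsage".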

Question There seems to be a Savannah task on the use of the XML summary. Is this considered finished? It is not included in the job workflow as far as I can see. I see two (identical?) modules developed:

VS

Are they any different? ALERT! The AnalyseXMLLogFile implementation parses the application summary.xml, but it uses LHCbDIRAC/Core/Utilities/ProductionXMLLogAnalysis.py, which requires step_commons['listoutput'] to be defined. User jobs may lack it, so it crashes :-/ Example output:

2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Initializing $Id: DiracPopularityService.txt,v 1.1 2012/01/11 03:43:11 danielar_40nikhef_2enl Exp $ 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'ParametricInputData': '', 'TotalSteps': '1', 'JobName': 'Name', 'Priority': '1', 'SoftwarePackages': 'DaVinci.v29r2', 'JobReport': <DIRAC.WorkloadManagementSystem.Client.JobReport.JobReport instance at 0xd11bcf8>, 'LogLevel': 'debug', 'OutputSandbox': '*.log;summary.data;summary.xml', 'JobType': 'User', 'SystemConfig': 'ANY', 'JOB_ID': '00000000', 'StdError': 'std.err', 'Request': <DIRAC.RequestManagementSystem.Client.RequestContainer.RequestContainer instance at 0xd0eec68>, 'AccountingReport': <DIRAC.AccountingSystem.Client.DataStoreClient.DataStoreClient instance at 0xd19a830>, 'ParametricInputSandbox': '', 'JobGroup': 'lhcb', 'StdOutput': 'std.out', 'Origin': 'DIRAC', 'Site': 'ANY', 'PRODUCTION_ID': '00000000', 'MaxCPUTime': '5000', 'LogFilePath': '/project/bfys/dremensk/ctmdev/LHCbDirac_v6r8p2/etc', 'InputData': ''} 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile DEBUG: {'applicationName': 'DaVinci', 'STEP_DEFINITION_NAME': 'DaVinciStep1', 'applicationVersion': 'v29r2', 'JOB_ID': '00000000', 'optionsLine': '', 'STEP_NUMBER': '1', 'StartStats': (4.2199999999999998, 0.23999999999999999, 0.0, 0.0, 8676995.6999999993), 'STEP_INSTANCE_NAME': 'RunDaVinciStep1', 'inputDataType': 'DATA', 'applicationLog': 'Step1_DaVinci_v29r2.log', 'optionsFile': '/project/bfys/dremensk/DaVinci-Default.py', 'PRODUCTION_ID': '00000000', 'STEP_ID': '00000000_00000000_1', 'StartTime': 1326251676.3210549, 'inputData': 'LFN:/lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/00012957_00000753_1.bhadron.dst'} 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Input data defined in workflow for this Gaudi Application step 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  VERB: Job has no input data requirement 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  VERB: Performing log file analysis for Step1_DaVinci_v29r2.log 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Resolved the step input data to be:
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: LFN:/lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/00012957_00000753_1.bhadron.dst 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Resolved the job input data to be:
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO:  
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Attempting to open log file: Step1_DaVinci_v29r2.log 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Attempting to parse xml log file: summaryDaVinci_00000000_00000000_1.xml 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Check application ended successfully e.g. searching for "Application Manager Finalized successfully" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "Terminating event processing loop due to errors" meaning job would fail with "Event Loop Not Terminated" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "SysError in <TDCacheFile::ReadBuffer>: error reading from file" meaning job would fail with "DCACHE connection error" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for " glibc " meaning job would fail with "Problem with glibc" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "Failed to resolve" meaning job would fail with "IODataManager error" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "Writer failed" meaning job would fail with "Writer failed" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "Not found DLL" meaning job would fail with "Not found DLL" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "Cannot connect to database" meaning job would fail with "error database connection" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "Standard std::exception is caught" meaning job would fail with "Exception caught" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "Error: connectDataIO" meaning job would fail with "connectDataIO error" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "Bus error" meaning job would fail with "Bus error" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "segmentation violation" meaning job would fail with "segmentation violation" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "Error:connectDataIO" meaning job would fail with "connectDataIO error" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "GaussTape failed" meaning job would fail with "GaussTape failed" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "Could not connect" meaning job would fail with "CASTOR error connection" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: Checking for "User defined signal 1" meaning job would fail with "User defined signal 1" 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: XMLSummary reports success = True  
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: XMLSummary reports step finalized 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: 0 file(s) on fail status, 0 file(s) on part status, 1 file(s) on full status, 0 file(s) on other status, 0 file(s) on mult status 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile ERROR: {'Data': {}, 'Message': '0 file(s) on fail status, 0 file(s) on part status, 1 file(s) on full status, 0 file(s) on other status, 0 file(s) on mult status', 'OK': False} 
Exception while module execution
Module DaVinciStep1 AnalyseXMLLogFile
'listoutput'
== EXCEPTION ==
<type 'exceptions.KeyError'>: 'listoutput'

  File "/project/bfys/dremensk/cmtdev/LHCbDirac_v6r8p2/InstallArea/python/DIRAC/Core/Workflow/Step.py", line 272, in execute
    result = step_exec_modules[mod_inst_name].execute()

  File "/project/bfys/dremensk/cmtdev/LHCbDirac_v6r8p2/InstallArea/python/LHCbDIRAC/Workflow/Modules/AnalyseXMLLogFile.py", line 108, in execute
    self.__finalizeWithErrors( result['Message'] )

  File "/project/bfys/dremensk/cmtdev/LHCbDirac_v6r8p2/InstallArea/python/LHCbDIRAC/Workflow/Modules/AnalyseXMLLogFile.py", line 258, in __finalizeWithErrors
    self.workflow_commons['outputList'] = self.step_commons['listoutput']
===============
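The KeyError in the traceback comes from an unconditional lookup of step_commons['listoutput']. A possible guard is sketched below, assuming that an empty list is an acceptable default for user jobs; whether the downstream code tolerates an empty outputList has not been checked. The helper name resolve_output_list is hypothetical.

```python
def resolve_output_list(step_commons, default=None):
    """Return step_commons['listoutput'] if present, else a default.

    Assumption: user jobs that do not define 'listoutput' can be
    treated as having no declared outputs.
    """
    if default is None:
        default = []
    return step_commons.get("listoutput", default)
```

The failing assignment would then read workflow_commons['outputList'] = resolve_output_list(step_commons).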
All the data access information is effectively there in the last printed lines and can be reused:
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: XMLSummary reports success = True  
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: XMLSummary reports step finalized 
2012-01-11 03:14:59 UTC dirac-jobexec/AnalyseXMLLogFile  INFO: 0 file(s) on fail status, 0 file(s) on part status, 1 file(s) on full status, 0 file(s) on other status, 0 file(s) on mult status 
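If the counts were to be reused directly from such a log line, a regular expression is enough. This is only a sketch with a hypothetical function name; it assumes the message format stays exactly as printed above.

```python
import re

# Matches fragments such as "1 file(s) on full status".
STATUS_RE = re.compile(r"(\d+) file\(s\) on (\w+) status")


def parse_status_counts(line):
    """Extract {status: count} from an AnalyseXMLLogFile status message."""
    return {status: int(count) for count, status in STATUS_RE.findall(line)}
```

Parsing summary.xml itself is likely more robust than parsing the log message, since the log text is free-form.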

To enable producing the summary.xml at the end of a job execution, the following should be put in the job description:

danielar@herault etc $ cat DaVinci-Default.py
from Configurables import DaVinci

d = DaVinci()
DaVinci().DataType = "2011"
DaVinci().EvtMax = 15000

from Configurables import LHCbApp
LHCbApp().XMLSummary='summary.xml'
from Configurables import XMLSummarySvc
xmlsum=XMLSummarySvc("CounterSummarySvc")

Question How will we force jobs to produce the summary.xml when it is optionally set by the users? Or should we parse the data access directly from the Job log file? (example Step1_DaVinci_v29r2.log)

Four possible statuses of the files reported in summary.xml after the payload execution:

- full : the file has been fully read

- part : the file has been partially read

- mult : the file has been read multiple times

- fail : failure while reading the file

An example:

danielar@herault 2743 $ cat summary.xml 
<?xml version="1.0" encoding="UTF-8"?>

<summary version="1.0" xsi:noNamespaceSchemaLocation="$XMLSUMMARYBASEROOT/xml/XMLSummary.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <success>True</success>
   <step>finalize</step>
   <usage>
      <stat unit="KB" useOf="MemoryMaximum">747096.0</stat>
   </usage>
   <input>
      <file GUID="" name="LFN:/lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/00012957_00000753_1.bhadron.dst" status="full">12092</file>
      <file GUID="" name="LFN:/lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/00012957_00000583_1.bhadron.dst" status="part">2908</file>
   </input>
   <output />
   <counters>
      <counter name="DaVinciInitAlg/Events">15000</counter>
      <counter name="CounterSummarySvc/handled">15003</counter>
   </counters>
   <lumiCounters />
</summary>

We should report only in case the application execution is successful, and the final step is reached.
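A sketch of that rule with the standard library XML parser: the input files are returned only if success is True and the step is finalize. The function name used_lfns is hypothetical, and the set of statuses counted as "used" (full, part, mult) is an assumption, since whether partially read files should be reported is still an open question above. Entries that are not LFNs (e.g. PFN: names) are skipped.

```python
import xml.etree.ElementTree as ET

# Assumption: every status except 'fail' counts as a real access.
USED_STATUSES = ("full", "part", "mult")


def used_lfns(summary_xml):
    """Return the LFNs read by the job, or [] if the job did not
    finish successfully in the finalize step."""
    root = ET.fromstring(summary_xml)
    success = (root.findtext("success") or "").strip()
    step = (root.findtext("step") or "").strip()
    if success != "True" or step != "finalize":
        return []
    lfns = []
    for entry in root.findall("input/file"):
        name = entry.get("name", "")
        if name.startswith("LFN:") and entry.get("status") in USED_STATUSES:
            lfns.append(name[len("LFN:"):])
    return lfns
```

For the example summary.xml above, both input files (one full, one part) would be returned under this assumption.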

Local testing of new modules:

In dirac.cfg, the important lines for local testing are: Setup set to LHCb-Development (to test on volhcb12); all configuration servers except volhcb18 commented out, since the others time out; and LocalArea and SharedArea defined in LocalSite:

DIRAC
{
  Setup = LHCb-Development
...
Configuration
  {
    Version = 2011-11-28 08:54:23.394934
    Name = LHCb-Prod
    #@@-rgracian@diracAdmin - /DC=es/DC=irisgrid/O=ecm-ub/CN=Ricardo-Graciani-Diaz
    EnableAutoMerge = yes
    Servers = dips://volhcb18.cern.ch:9135/Configuration/Server
    #Servers += dips://volhcb12.cern.ch:9135/Configuration/Server
    #Servers += dips://lhcb-kit.gridka.de:9135/Configuration/Server
    #Servers += dips://volhcb19.cern.ch:9135/Configuration/Server
    #Servers += dips://kot.nikhef.nl:9135/Configuration/Server
    #Servers += dips://vobox07.pic.es:9135/Configuration/Server
    #Servers += dips://volhcb30.cern.ch:9135/Configuration/Server
    #Servers += dips://cclcglhcb01.in2p3.fr:9135/Configuration/Server
    #Servers += dips://lcgvo-s3-03.gridpp.rl.ac.uk:9135/Configuration/Server
    #Servers += dips://lhcbprod.pic.es:9135/Configuration/Server
    #Servers += dips://ui01-lhcb.cr.cnaf.infn.it:9135/Configuration/Server
    MasterServer = dips://volhcb18.cern.ch:9135/Configuration/Server
  }
...
LocalSite
{
  #@@-rgracian@diracAdmin - 2011-06-22 17:01:24
  FileCatalog = LcgFileCatalogCombined
  LocalSE = SARA-RAW
  LocalSE += SARA-RDST
  LocalSE += SARA-ARCHIVE
  LocalSE += SARA-DST
  LocalSE += SARA_M-DST
  LocalSE += SARA-USER
  Architecture = x86_64-slc5-gcc43-opt
  Site = LCG.NIKHEF.nl
  LocalArea = .
  SharedArea = /project/bfys/lhcb/sw
}

Jobs can be tested without submitting them to the WMS and involving the entire DIRAC machinery by using mode='local' (the other mode is 'agent'):

from LHCbDIRAC.Interfaces.API.DiracLHCb import DiracLHCb
from LHCbDIRAC.Interfaces.API.LHCbJob import LHCbJob
j = LHCbJob()
j.setCPUTime(5000)
j.setApplication('DaVinci','v29r2','/project/bfys/dremensk/DaVinci-Default.py', inputData=['/lhcb/LHCb/Collision11/BHADRON.DST/00012957/0000/00012957_00000753_1.bhadron.dst'])
j.setOutputSandbox(['*.log','summary.data','summary.xml'])
j.setLogLevel('debug')
j.setInputDataType('DATA')
dirac = DiracLHCb()
jobID = dirac.submit(j,mode='local')
print 'Submission Result: ',jobID
print j._toXML()
The last line will generate an .xml description of the job requirements. The workflow modules, steps, parameters and input data are defined here, and can be modified to include new modules/steps. An example can be found at: /afs/cern.ch/user/d/dremensk/JobInput.xml

export VO_LHCB_SW_DIR=/project/bfys/lhcb/sw
SetupProject LHCbDIRAC v6r8p2
export DIRACSYSCONFIG=/project/bfys/dremensk/cmtdev/LHCbDirac_v6r8p2/etc/dirac.cfg
dirac-jobexec JobInput.xml -o LogLevel=debug

Once the module is implemented....

A simple example to insert entries in the Popularity table:
from DIRAC.Core.Base.Script import parseCommandLine
parseCommandLine()
from DIRAC.Core.DISET.RPCClient import RPCClient
s = RPCClient("DataManagement/DataUsage")
se = 'CERN-DST'
dirDict = {}
dirDict['/test/'] = 2
dirDict['/lhcb/certification/test/ALLSTREAMS.DST/00000002/0000/'] = 5
s.sendDataUsageReport( se, dirDict )

-- DanielaRemenska - 11-Jan-2012

 