MCProdCoordination

Revision 1, 2017-11-26 - JunichiTanaka

2015 - Sep 2017

MC Production

  • site=CERN-PROD, site=IN2P3-CC, site=INFN-T1

  • Memo
    • ScaleTaskLength=0.2 (r7772 vs r8084)
    • ramcount=6000;ramUnit=MBPerCore
    • r7980 SDR-34+TRTOnlyArgon=SDR-35
    • r6869 (25ns v4) ATLAS-R2-2015-03-01-00
      • r7192-r7195 (mu=20,40,60,80 constant)
      • r7146 with TrigMC 20.1.5.6.5
      • r7245 to keep CalHits and PID
      • r7291 with ATLAS-R2-2015-03-06-00
      • Digit+rec w/o pileup
        • r7059 with ATLAS-R2-2015-03-01-00; don't use r7059, since we need to add DataRunNumber=222500 for no-pileup; no track slimming
        • r7279 with ATLAS-R2-2015-03-06-00
        • r7504-r7508 with ATLAS-R2-2015-03-01-00, DataRunNumber=222500, no track slimming
        • r7509 with ATLAS-R2-2015-03-01-00, DataRunNumber=222500 (based on r7059)
        • r7051
    • a777 (25ns v2)
    • r7267 (25ns MC15b v1) ATLAS-R2-2015-03-01-00
      • r7359 HITtoRDO:from Digitization.DigitizationFlags import digitizationFlags;digitizationFlags.overrideMetadata+=["SimLayout","PhysicsList"];
    • r7326 (25ns MC15b v2)
      • r7532 HITtoRDO:from Digitization.DigitizationFlags import digitizationFlags;digitizationFlags.overrideMetadata+=["SimLayout","PhysicsList"];
    • s2726 (MC15a default at 2015 Oct etc) 19.2.4.9, ATLAS-R2-2015-03-01-00 and MC15aPlus truth strategy
      • s2758-s2761 for Req4471: I modified postExec and preExec.
        • s2758 with ATLAS-R2-2015-03-06-00_VALIDATION; this is "updated IBL descriptions and pixel conditions"
        • s2774 with ATLAS-R2-2015-03-06-01_VALIDATION 10% more IBL Material
        • s2759 with ATLAS-R2-2015-03-01-04_VALIDATION 10% more IBL/Pixel Material
        • s2760 with ATLAS-R2-2015-03-01-18_VALIDATION 25% more IBL/Pixel Material
        • s2761 with ATLAS-R2-2015-03-01-19_VALIDATION 50% more IBL/Pixel Material
      • s2763-s2768 for Req4624: the same as s2726 except for geometryVersion
        • s2763 with ATLAS-R2-2015-03-01-02_VALIDATION
        • s2764 with ATLAS-R2-2015-03-01-11_VALIDATION
        • s2765 with ATLAS-R2-2015-03-01-12_VALIDATION
        • s2766 with ATLAS-R2-2015-03-01-13_VALIDATION
        • s2767 with ATLAS-R2-2015-03-01-14_VALIDATION
        • s2768 with ATLAS-R2-2015-03-01-15_VALIDATION
    • AODMerge_tf.py
      • r6282 ... MC15b etc (20.1.4.7)
      • r7351 ... 20.7.2.2
      • r7393 ... 20.7.3.2
      • r7434 ... 20.1.5.10.1
      • r7443 ... 20.1.5.10
      • r7577 ... 20.3.0.3
      • r7586 ... 20.3.3.1
      • r7501 ... 20.7.3.6
      • r7536 ... 20.7.3.8
      • r7564 ... 20.1.8.5 with updated argument names (3 of 4) event_per_job is still old.
      • r7576 ... 20.1.9.3 with updated argument names (3 of 4) event_per_job is still old.
      • r7592 ... 20.1.6.4 with updated argument names (3 of 4) event_per_job is still old.
      • r7605 ... 20.7.4.2 with updated argument names (3 of 4) and event_per_job is removed.
      • r7676 ... 20.7.5.1
    • Important JIRA

  • Pileup
See the attached "pileup_input.txt". Our default mu-profile is optimized to 10k events. We also need at least 5k events to get the full mu distribution (= sum of events per mu bin); see https://svnweb.cern.ch/trac/atlasoff/browser/Simulation/RunDependentSim/RunDependentSimData/trunk/share/configLumi_run284500_v2.py We can reduce it by using a preExec like "ScaleSampleBy=0.2".
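The constraint above can be made concrete with a small sketch. The bin contents below are made up for illustration; the real profile lives in configLumi_run284500_v2.py (RunDependentSimData), and "ScaleSampleBy" is the preExec knob quoted above:

```python
# Hedged sketch: a run-dependent mu profile maps each mu value to a number
# of events, and the task must be large enough to populate every bin.
# These bin contents are hypothetical, not the real configLumi values.
profile = {10: 500, 20: 2000, 30: 1500, 40: 800, 50: 200}

# The profile is optimized for a total of sum(events-per-bin) events.
total = sum(profile.values())
print(total)  # 5000 for these toy numbers

# ScaleSampleBy=0.2 shrinks every bin by the same factor, so a smaller
# task can still cover the full mu distribution.
scaled = {mu: int(n * 0.2) for mu, n in profile.items()}
print(sum(scaled.values()))  # 1000
```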

  • Rod
The task asks for 2400 HS06sPerEvent, but some jobs run for 30 hours, giving cputime = 30*3600*8*10/1000 = 8640. (HS06s/10 is almost "1 real second"; 2400 means 240 sec = 4 min per event [per core].)
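Spelling out the arithmetic above (the factors 8, 10 and 1000 are my reading of the formula: 8 cores, ~10 HS06 per core, 1000 events per job; they are not stated explicitly in the source):

```python
# HS06s-per-event from a 30-hour job, per the formula quoted above.
hours = 30            # observed wall time
cores = 8             # assumed cores per job
hs06_per_core = 10    # assumed HS06 score per core
events_per_job = 1000 # assumed events per job

hs06s_per_event = hours * 3600 * cores * hs06_per_core / events_per_job
print(hs06s_per_event)  # 8640.0, well above the requested 2400

# HS06s/10 is roughly one real second, so the 2400 request corresponds to:
print(2400 / 10)  # 240 sec = 4 min per event per core
```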

  • Rod and Andrej
I think the target size is around 5GB, but be careful not to hit 10GB, as there is an athena limit. The WN scratch is at least 14GB, but can be much more for some sites and for MCORE. I would say not more than 5GB except in some rare cases; more than that could be a problem for network transfers, e.g. a 1h timeout could come into effect. (14GB is for one core.)

  • Zach's e-mails and Josh's e-mail
"number of low-pt min bias (events)" and "number of high-pt min bias (events)": notice that one is a factor of ~100 higher than the other. That's all that matters if the preInclude or preInclude_h2r lists include files that look like they are setting up run-dependent MC conditions (as these do). In that case, the actual mu value is taken from a database, and those two numbers are only used to normalize the minbias samples relative to one another.

You should look for those pre-includes to see what is really going on, and AMI lists those. If you see something that says "run dependent monte carlo", "lumi profile" with a run number, etc, then you should expect that the number of X pt minbias parameters are being overridden by values in the database. If you don't see any pre-includes like that, then mu is something like numberoflowptminbias+numberofhighptminbias.

As Zach says, numberoflowptminbias+numberofhighptminbias sets the initial value; if there is no run-dependent MC configuration, this is constant throughout the job. Otherwise it is overridden by the specific value for each event. In this case the initial value is an upper bound on the values that can be found in the dataset (necessary to make sure various internal pile-up machinery is configured correctly).

  • Guillaume's e-mails with my questions
larRODFlags.NumberOfCollisions.set_Value_and_Lock(20); larRODFlags.nSamples.set_Value_and_Lock(4) ... I can imagine the 2nd one, which might be the number of samples used in OF, but I cannot understand the 1st one. What is it?

The first option sets the mu value for the pileup optimization to 20; this is what we have been using for all of MC15.

Is this value very sensitive to the pileup condition? I mean, does the LAr performance change so much if, for example, it is 30 instead of 20?

BTW, do you know about the next one? This might not be for LAr, but if you know, please teach me: from BeamFlags import jobproperties; jobproperties.Beam.numberOfCollisions.set_Value_and_Lock(20.0)

The LAr noise is not very sensitive to the values used for the OFC optimization, i.e. if the real mu is 25, using mu=20 or 30 in the optimization will not have a dramatic effect on the noise, though one will probably see small changes.

> from BeamFlags import jobproperties;
> jobproperties.Beam.numberOfCollisions.set_Value_and_Lock(20.0)
this is probably used by other systems as well

We introduced larRODFlags.NumberOfCollisions to be able to define the mu for the LAr OFC optimization separately from this global flag (and if this NumberOfCollisions flag is not set, it defaults to Beam.numberOfCollisions, if I recall correctly).
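Putting the flags from this exchange together, a reco preExec fragment might look like the following. This is a sketch assembled from the snippets quoted above; the exact import paths (AthenaCommon.BeamFlags, LArROD.LArRODFlags) are my assumption and should be checked against the release:

```python
# Sketch of a preExec combining the flags discussed above (config fragment,
# only meaningful inside an Athena job; import paths are assumptions).

# Global flag, read by several systems, not only LAr:
from AthenaCommon.BeamFlags import jobproperties
jobproperties.Beam.numberOfCollisions.set_Value_and_Lock(20.0)

# LAr-specific flags: mu used for the OFC optimization, and the number of
# samples. If NumberOfCollisions is not set, it defaults to
# Beam.numberOfCollisions.
from LArROD.LArRODFlags import larRODFlags
larRODFlags.NumberOfCollisions.set_Value_and_Lock(20)
larRODFlags.nSamples.set_Value_and_Lock(4)
```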

  • Answer from Zach on EVNTtoHITS:simFlags.SimBarcodeOffset.set_Value_and_Lock(200000)
It depends a bit on the release, but 200k is the default in most of the old ones. For the new releases (and MC16), 1M will be the default. I'm afraid I've forgotten whether it's 1M already in 19.2.X…

  • Condition DB tag
    • CondDB tags (see Oda-san's e-mail at 11/21)
    • Run numbers in MC
      • 222500 ... MC15a 50ns but mu=0
      • 222510 ... MC15a 50ns
      • 222525 ... MC15a 25ns
      • 222526 ... MC15b 25ns
      • Simulation (s-tag)
        • Zach's answer (18 Jan 2016): The run number is used to pick the conditions, but only some conditions are used in simulation: coarse alignment and beamspot are the two most important. Those two runs should have the same values for those folders, so they should be interchangeable for simulation. Indeed, MC15b simulation and MC15a simulation (and MC15c simulation too!) can all go with the same tag if you wish.
      • Reconstruction (r-tag)
        • We should use the correct one.
    • Many links about CondDB
    • Mini Guide for folder tag

  • Trigger DB keys
    • MC trigger DB MCRECO:DBF:TRIGGERDBMC:2019,12,16 is SMK 2019, L1 12, HLT 16
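The key layout above can be unpacked mechanically; a throwaway sketch (field names follow the SMK / L1 / HLT reading given above):

```python
# Split a trigger DB key like the one above into database name and keys.
key = "MCRECO:DBF:TRIGGERDBMC:2019,12,16"

parts = key.split(":")
db = parts[2]  # "TRIGGERDBMC"
# Last field is "SMK,L1 prescale key,HLT prescale key"
smk, l1_key, hlt_key = (int(x) for x in parts[3].split(","))
print(db, smk, l1_key, hlt_key)  # TRIGGERDBMC 2019 12 16
```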

  • Tools with Jose's scripts
    • Requests
    • Analytics
    • Generator versions
    • Task search
    • Timing summary ... totalEvents is the number of finished events.
      • Useful to get fractions of finished events of specific request id.
        • ex 4686 in the "Request Id" field then select "recon" from the drop down menu, then select "event summary only" and then click submit.

  • Grace period
    • 48h, with an exception for a reprocessing 1 day ago (it was set to 12h)

  • LHE files, external files
    • web interface
      • before using this interface, we need to add account name to /afs/cern.ch/atlas/groups/MCProd/www/register/index.php

  • Tips
    • ignoreTrfParams=inputLogsFile
    • Rod: It is very important that sim tasks have the --DBRelease=current argument, OR the task needs wnconnectivity=http.

  • "UseFrontier.py: Deprecation Warning - please use PyJobTransforms/UseFrontier.py instead"

  • _debug
    • This request has special word "_debug" in the description so it cannot be processed automatically. To process it you need to remove "_debug" from the description.

  • Zach pileup
    • >Just to clarify the original accounting: the most dangerous pileup is in-time. The ratio of high to low is something like 320:1. For mu of 40, that means 1 in 8 events has a high-pT minbias event in time. So for our largest single datasets, which are about 100M events, we need around 100M/8 = 12.5M high-pT events. I believe the old math was done with mu=30 or so, so we were just on the other side of rounding to 10M.
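The accounting quoted above, spelled out (the 320:1 ratio and mu values come from Zach's mail; the rest is arithmetic):

```python
ratio = 320        # one high-pT minbias per ~320 events, per the mail
mu = 40

# Fraction of events with an in-time high-pT minbias event:
frac_high = mu / ratio
print(frac_high)   # 0.125 -> 1 event in 8

# Largest single datasets are ~100M events:
dataset_events = 100e6
high_pt_needed = dataset_events * frac_high
print(high_pt_needed)  # 12.5M high-pT minbias events
```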

  • Error message
    • ORA-06512: at "ATLAS_RUCIO.CHECK_DID_UNIQUENESS", line 12
    • ORA-04088: error during execution of trigger 'ATLAS_RUCIO.CHECK_DID_UNIQUENESS'
      • Rod's answer: this is because someone retried the task creating the unmerged AOD. The task output was already deleted, because it was merged. Panda bravely tries to create the datasets anyway, but cannot due to the uniqueness constraint. The deleted datasets are still in the rucio DB.

  • About "sub" datasets in rucio: they will be deleted about 2 weeks after the corresponding task is finished. The production system also has a protection: we cannot use datasets with "sub" as input.

 