P1 Data Taking
This twiki covers data taking at P1. For data taking at EMF, please refer to the
EMFdataTake twiki.
It is the RC Shadow's responsibility to keep this twiki up-to-date!
If there is something incomplete, incorrect or unclear as you are taking a calibration, please consult the RC Shadow from the previous week, and make the proper updates to the twiki immediately.
Introduction
Steps to take Phase-1 data at Point1 during LS2.
Call the LAr RC before any action on the system!
Requirements:
- Log in to Point-1 via the ATLAS gateway:
ssh -Y your_account@atlasgw.cern.ch
- From atlasgw (or via an SCR machine), you can log in to the FELIX machine:
ssh -Y pc-lar-felix-03
and to the LAR MON machine:
ssh pc-lar-mon-01
- Apply for the roles LAR:remote (for pc-lar-scr-** in SCR) and LAR:DAQ:expert (for pc-lar-felix-03 in USA15).
Steps to start the long run
- Setup LTDB
- Configure partition
- Setup LATOME
- Start the Run
- Check logs, post an elog and add the run number to the list of runs (google spreadsheet)
Setup LTDB
- Log in (from atlasgw): ssh -Y pcatllar05
Launch the WinCCOAProject GUI:
WCCOAui -proj ATLLARLTDB -p main.pnl
Follow the steps under "Use WinCCOAProject" in the
LtdbConfigurationEmfPeripheral twiki.
(You can find more information about the WinCCOAProject and LTDB/felix setup on the
LtdbConfigurationP1Peripheral twiki.)
Configure the Partition
From atlasgw, do the following:
cd /det/lar/project/shiftarea/CurrentVersion
# setup the lar environment
source /det/lar/project/scripts/env/sod.sh
# setup the EMECA partition (LTDB A03L is in EMECA)
source setup.sh -t EMECA
# launch the tdaq gui
daq
In the TDAQ IGUI panel, do the following:
- click INITIALIZE
- open the LAr Master Panel: click (at the top) Load Panels > Larg.Panels.MasterPanel
- under the LAr tab, select the Calibration Manager tab
- Select your desired Sequence Type (typically Pedestal)
- Get the calibration db information and publish it to IS: at the bottom right of the Calibration Manager panel, click the black arrow and then the blue wheel.
- under the Run Informations and Settings tab on the right, select the Settings tab, make sure the RunType is consistent with the calibration sequence type (i.e. LArPedestals), then click Set Values
- click CONFIG
Do not click START yet. Leave the TDAQ IGUI and terminal session open (for now).
Setup LATOME
- Connect to pc-lar-mon-01:
cd /detwork/lar/LATOME/LATOME_config/LATOME_config-v2.2.1
./check_firmware_version.sh
./reset_latome.sh # Resets, sets up and calibrates the LATOME. The BCR signal must be stable, i.e. your partition must have passed the "CONFIGURE" FSM transition.
If all the steps show "PASS" on your terminal, the LATOME is ready to send data.
If some of the steps show "FAIL" and there is a file connection issue in the logfiles, you can try to redo the LOCx2 scan in the WinCCOAProject GUI.
Start the Run
Start the partition run: in the TDAQ IGUI open in the atlasgw terminal session, click START.
From the same terminal session on pc-lar-mon-01, do the following to start taking data:
cd /det/lar/project/firmware/LATOME/datatake_script/launch_long_run
./Start.sh
Check the LATOME Registers
This step is really important, as it is a check to make sure the LATOME is taking data properly!
Check the LATOME registers with the firmware control script:
- sod
- /det/lar/project/firmware/LATOME/config_test/LATOME_FW-v2.2.1/LATOME/projects/firmware_control/shell.latome_012.sh
ISTAGE: Status (live) -> check that ISTAGE is working
- istage.sync_box.bcid_calibration.value should be around 250 for all fibers except fibers 0, 1, 12, 13, 14, 24, 25, 35, 36 and 37.
TTC: TTC status (live) -> check the TTC rate
- bcr_counter should be increasing; the rate is around 11245 Hz.
- l1a_counter should be increasing if you are sending L1A.
- period_info.bad_period_counter should be 1. If it is increasing, something is wrong in the TTC path (mention it in the elog).
MON: MON Status (live)
- Check that L1A is working and that the packet count is increasing.
- With L1A, mon.moni_recipex_count should be increasing.
- If mon.moni_overrun_count is increasing, then something is wrong!
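To judge whether bcr_counter advances at roughly the expected 11245 Hz, two readings taken a known time apart can be converted into an average rate. The sketch below assumes the counter values are copied by hand from the firmware control shell; the helper name counter_rate is hypothetical and counter wrap-around is not handled.

```shell
# counter_rate FIRST SECOND SECONDS
# Hypothetical helper: prints the average rate in Hz (integer division)
# between two counter readings taken SECONDS apart.
counter_rate() {
    local first="$1" second="$2" seconds="$3"
    echo $(( (second - first) / seconds ))
}
```

For example, counter_rate 1000000 1112450 10 prints 11245. The same helper can be used for l1a_counter, where the expected rate depends on the trigger configuration.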
Make sure that the MON and ADC readout are enabled: once the run has been started, cross-check on pc-lar-mon-01 that the output files are being produced (with a non-negligible size) and check the logs:
# MON data are recorded under :
watch -n3 ls -lht /data/via_script/
ls log_LDPB_data_taking_YYMMDD_HHMMSS.log
#ADC Mean data are under:
watch -n3 ls -lht /detwork/lar/LATOME/LATOME_config/LATOME_config-v2.2.1/outputs_adc_mean/
ls /detwork/lar/LATOME/LATOME_config/LATOME_config-v2.2.1/log_LATOME_adc_readout_YYMMDD_HHMMSS.log
#Input Stage data are under:
watch -n3 ls -lht /detwork/lar/LATOME/LATOME_config/LATOME_config-v2.2.1/output_IS_adc/
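The size check on the output files can also be done with find instead of reading the ls listing by eye. This is a minimal sketch: the helper name small_files is hypothetical, and the byte threshold is an assumption to be chosen by the shifter, since this twiki does not define what counts as a negligible size.

```shell
# small_files DIR MIN_BYTES
# Hypothetical helper: lists regular files directly inside DIR that are
# smaller than MIN_BYTES bytes (candidates for empty or truncated output).
small_files() {
    local dir="$1" min_bytes="$2"
    find "$dir" -maxdepth 1 -type f -size "-${min_bytes}c"
}
```

For example, small_files /data/via_script/ 1024 lists every file below 1024 bytes in the MON output directory. Files that are still being written may show up temporarily.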
The data are automatically copied to EOS :
/eos/atlas/atlascerngroupdisk/larg-upgrade/LATOME
In the atlasgw terminal session, you may now close the TDAQ IGUI and exit the terminal session, but do not exit the partition. Hint: you may want to set the access mode to "display" and close the GUI. This way, the partition stays running and somebody else can take control if needed (though as long as there is no other IGUI open, one should be able to take control anyway).
Write an elog
Write an elog using this template and add the run to the list.
To check the LArC FW version, go to pc-lar-mon-01:
cd /det/lar/project/firmware/LArC/LArC-scripts/LArC-v18011000/scripts
python larcRegs.py 10
End the Run
Stop the LATOME Run
- Connect to pc-lar-mon-01 again:
cd /det/lar/project/firmware/LATOME/datatake_script/launch_long_run
./Stop.sh
Keep this window open.
Exit the Partition
From atlasgw, do the following:
cd /det/lar/project/shiftarea/CurrentVersion
# setup the lar environment
source /det/lar/project/scripts/env/sod.sh
# setup the EMECA_DS partition (LTDB A03L is in the EMECA)
source setup.sh -t EMECA_DS
# launch the tdaq gui
daq
In the TDAQ IGUI panel, do the following:
- take control of the partition: at the top of the panel, click Access Control > Control
- click STOP
- click UNCONFIG
- click SHUTDOWN
- Close the IGUI and exit the partition: at the top of the panel, click File > Close Igui and Exit Partition
Check Logfiles
Check the following logfiles on pc-lar-mon-01:
# MON data are recorded under :
ls -lhtr /data/via_script/
ls log_LDPB_data_taking_YYMMDD_HHMMSS.log
#ADC Mean data are under:
ls -lhtr /detwork/lar/LATOME/LATOME_config/LATOME_config-v2.2.1/outputs_adc_mean
ls /detwork/lar/LATOME/LATOME_config/LATOME_config-v2.2.1/log_LATOME_adc_readout_YYMMDD_HHMMSS.log
#Input Stage data are under:
ls -lhtr /detwork/lar/LATOME/LATOME_config/LATOME_config-v2.2.1/output_IS_adc
Check LATOME registers
A full snapshot of the LATOME registers is taken every 10 min and dumped to log files that can be found in:
/det/lar/project/firmware/LATOME/LATOME_config/logfile/
Check the registers below for errors and report any problems, and the time at which they occurred, in the elog. The time corresponds to the first log file in which the error appears, so it has a granularity of 10 min.
To check if there are bitslip errors on the input stage fibers (indicating that the fiber lost synchronization) you have to check for this register:
istage.istage_top_box.status.bitslip_error_X.value
where "X" is the fiber number going from 1 to 48. An example command that loops over all files produced in the last day and looks at the bitslip error for fiber 5 is below (in zsh):
#should provide a script
foreach f (`find -newermt '-1 day'`)
grep --with-filename istage.istage_top_box.status.bitslip_error_5.value $f
end
This code produces the following output:
2019-06-07_00-06-49.r_nodes.txt:READ 0x20000098 PASS istage.istage_top_box.status.bitslip_error_5.value: 0=0x0
2019-06-07_00-18-59.r_nodes.txt:READ 0x20000098 PASS istage.istage_top_box.status.bitslip_error_5.value: 0=0x0
2019-06-07_00-31-09.r_nodes.txt:READ 0x20000098 PASS istage.istage_top_box.status.bitslip_error_5.value: 0=0x0
2019-06-07_00-43-16.r_nodes.txt:READ 0x20000098 PASS istage.istage_top_box.status.bitslip_error_5.value: 432848=0x69ad0
2019-06-07_00-55-24.r_nodes.txt:READ 0x20000098 PASS istage.istage_top_box.status.bitslip_error_5.value: 1632004=0x18e704
2019-06-07_01-07-41.r_nodes.txt:READ 0x20000098 PASS istage.istage_top_box.status.bitslip_error_5.value: 2842892=0x2b610c
2019-06-07_01-19-54.r_nodes.txt:READ 0x20000098 PASS istage.istage_top_box.status.bitslip_error_5.value: 2842892=0x2b610c
Here one can clearly see the bitslip error counter start increasing around 00:43 and become stable again at 01:07.
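Reading the snapshots by eye gets tedious for long runs, so the comparison between consecutive files can be automated. The bash sketch below assumes every snapshot file contains lines of the form shown in the output above (READ, address, PASS, register name followed by a colon, then decimal=hex); the helper name register_changes is hypothetical.

```shell
# register_changes REGISTER FILE...
# Hypothetical helper: prints "<file> <previous> -> <current>" whenever the
# decimal value of REGISTER differs from the value in the previous file.
# Files are processed in the order given on the command line.
register_changes() {
    local reg="$1"; shift
    local prev="" file value
    for file in "$@"; do
        # The last field of the matching line is "<decimal>=<hex>".
        value=$(awk -v reg="${reg}:" \
            'index($0, reg) { split($NF, a, "="); print a[1]; exit }' "$file")
        if [ -z "$value" ]; then continue; fi   # register not in this file
        if [ -n "$prev" ] && [ "$value" != "$prev" ]; then
            echo "$file $prev -> $value"
        fi
        prev="$value"
    done
}
```

A possible call is register_changes istage.istage_top_box.status.bitslip_error_5.value 2019-06-07_*.r_nodes.txt. The shell expands the glob in lexical order, which is chronological for these file names. The first file printed gives the time to report in the elog.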
Other registers to check are:
- BCR error counter: ttc.period_monitor.period_info.bad_period_counter_0
- L1A counter (this one should increase continuously; otherwise the LATOME is not receiving L1A): ttc.monitoring_counters.l1a_counter_0.value
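For a counter that is expected to grow, such as the L1A counter, the interesting snapshots are those where the value did not increase. The bash sketch below filters the output of grep --with-filename for such lines. It assumes the L1A register is dumped in the same "decimal=hex" format as the bitslip example; the helper name not_increasing is hypothetical.

```shell
# not_increasing
# Hypothetical helper: reads `grep --with-filename` output on stdin and
# prints every line whose decimal value is not larger than the value on
# the previous line.
not_increasing() {
    awk '{
        split($NF, a, "=")              # last field is "<decimal>=<hex>"
        if (NR > 1 && a[1] + 0 <= prev) print
        prev = a[1] + 0
    }'
}
```

A possible call is grep --with-filename ttc.monitoring_counters.l1a_counter_0.value *.r_nodes.txt | not_increasing, run in the logfile directory. No output means the counter grew in every snapshot. A counter wrap-around would also be reported and needs to be judged by hand.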
Write elog
- Reply to your previous elog using this template. Be sure to mention any and all issues/problems that occurred.
- Add the run information to the list (google spreadsheet).
--
KileyElizabethKennedy1 - 2019-09-17