DAQ strategy for CLOUD prototype beamtest

André David

July 3, 2006 DRAFT


The purpose of having a DAQ strategy for the CLOUD prototype test in the CERN T11 beam in September-November 2006 is to ensure that the information collected by the different instruments brought to CERN can be correlated in time a posteriori.

To start, two very important pieces of information are required from every person responsible for a detector or instrument:

  • Which software tool is used to store the data, so that we can identify a point from which the data could be sent to the CLOUD DAQ

  • Which data format is used for analysis, so that we can agree on a common format

This information can be entered by editing the topics in the table of section Sources of Data. Editing this TWiki page only requires that you are a registered user at CERN and have a computer account. The user registration procedure requires you to be physically present at CERN, so it cannot be done until your next visit. For the moment, therefore, please send any information to me (Andre David) by email and I will edit the page for you.

Once this exercise is completed we can proceed to identify the best way of implementing an architecture for the CLOUD beam test and - in the spirit of a prototype test - learn more about the possible architecture for the future running of the experiment.

Section (a possible) Architecture presents several ideas that can be used as a starting point.

Independent of what the architecture will look like are the specifications of what data, and how much data per unit time (kByte/sec), each detector produces.

The most important parameters are the maximum event size $ s $ and maximum event rate $ r $. These two can be replaced by the maximum data rate $ R $, calculated as the maximum of the product $ s_i r_i $, where $ i $ is an index that covers all possible operating scenarios $ \{ s_i, r_i \} $ for a given data source.
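As a minimal sketch of the definition above, the maximum data rate of one source can be computed from a list of its operating scenarios. The function name and the scenario values are illustrative assumptions, not figures from any CLOUD detector.

```python
def max_data_rate(scenarios):
    """Return R = max_i(s_i * r_i) in bytes per second.

    scenarios: iterable of (event_size_bytes, event_rate_hz) pairs,
    one pair per operating scenario of a single data source.
    """
    return max(size * rate for size, rate in scenarios)


# Hypothetical operating scenarios for one data source (illustrative values).
scenarios = [
    (2000, 1.0),   # large events at a low rate
    (100, 50.0),   # small events at a high rate
]

R = max_data_rate(scenarios)
print(f"R = {R / 1000:.1f} kB/s")
```

Note that $ R $ is taken over scenarios that actually occur together, so it can be smaller than the product of the largest event size and the largest event rate.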

If your detector is not yet included in the Table, please tell me and I will correct it.

Sources of Data

| Name | Institute | Contact Person | Existing DAQ interface | Maximum data rate | Data format |
| H2SO4 mass spectrometer, CIMS | Heidelberg | Heinfried Aufmhoff | Windows 2k | 1 MB / 20 sec with PCM (1.25x nominal 40 kB/s) | written to hard disk |
| Ion Mobility Spectrometer | DNSC | Michael Avngaard | Labview 8.0 on WinNT | 2.3 kB / 20 sec (1.0x nominal 10 MB per day) | text files written to hard disk |
| Atmospheric Ion Spectrometer, AIS | Helsinki | Mikko Sipila and Sander Mirme |  |  |  |
| Condensation Particle Counter 1 | Helsinki | Mikko Sipila and Sander Mirme | Sectops (Win/Lin, but Win tested) | 70 kB / 20 sec at 1 spectrum/sec (70x nominal 1 minute cycles) | CSV file on disk per cycle |
| CPC 2,3 | Mainz | Joachim Curtius | In-house DAQ, control & OS (V25pro CPU + Pascal) <=> RS232/485 (Windows/DOS) | 60 kB / 20 sec at 10 Hz sampling (10x nominal) | ASCII table on a PCMCIA memory card |
| CPC 4,5 | PSI | Ernest Weingartner |  |  |  |
| Scanning Mobility Particle Sizer, NanoSMPS | PSI | Ernest Weingartner |  |  |  |
| Differential Mobility Analyser, DMA+CPC | PSI | Ernest Weingartner |  |  |  |
| Water CPC | PSI | Ernest Weingartner |  |  |  |
| Electrostatic Precipitator | PSI | Ernest Weingartner |  |  |  |
| Gas and aerosol conditions |  |  |  |  |  |
| Carrier gas composition | CERN | Ferdi Hahn |  |  |  |
| Carrier gas flow rate | CERN | Ferdi Hahn |  |  |  |
| Water vapour | DNSC | Michael Avngaard |  |  |  |
| Trace gas: O3 | DNSC | Michael Avngaard |  |  |  |
| Trace gas: SO2 | DNSC | Michael Avngaard |  |  |  |
| Trace gas: NOx | DNSC | Michael Avngaard |  |  |  |
| Run conditions |  |  |  |  |  |
| Tent temperature | DNSC | Michael Avngaard |  |  |  |
| Chamber pressure | DNSC | Michael Avngaard |  |  |  |
| UV intensity | DNSC | Michael Avngaard |  |  |  |
| Field cage voltage | DNSC | Michael Avngaard |  |  |  |
| Beam conditions |  |  |  |  |  |
| T11 beamline | CERN | Andre David | Software adapter | Read back settings |  |
| Beam telescope | CERN | Andre David | VME2PC | less than 64 bytes per second |  |

(a possible) Architecture

In CLOUD the detectors and other data sources will most likely run continuously and independently of each other. It seems that the best approach is for all data to be saved and tagged with a timestamp. Provided we synchronise all the PCs with a time server, this timestamp will ensure synchronisation of the data at the 5 ms level (see below). I assume this is adequate; if not, please inform me straight away.

This way, offline analyses can identify and extract all the data from the detectors that were operating at any given calendar time, as well as the run conditions.

Types of time tags

At present we have identified three types of data sources as regards time tagging:

  • Sources pushing acquired data, for which each datum should be accompanied by a timestamp identifying the time interval during which the datum was acquired. This can range from a few microseconds for single reads to a few seconds for long scans.

  • Sources pushing manually set data, for which there must be an implicit assumption that the value is valid until a new value is set. For instance, if the beam type is set to "protons" at 12:00 and to "pions" at 14:00, then it is assumed that it was protons from 12:00 to 14:00.

  • Sources being polled, for which there must be an implicit assumption that each value is valid from halfway back to the previous reading until halfway forward to the next reading. For example, if the temperature of the gas is read every $ \Delta t $ seconds, then the value taken at instant $ t_i $ is valid in the interval of time from $ t_i-\frac{\Delta t}{2} $ to $ t_i+\frac{\Delta t}{2} $.
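The three conventions above can be sketched as small functions that turn time tags into validity intervals. This is only an illustration: the function names are hypothetical, times are plain numbers in seconds, and an open-ended interval is represented by `None`.

```python
def acquired_interval(start, end):
    """Pushed, acquired data: the datum carries its own acquisition interval."""
    return (start, end)


def manual_intervals(set_times):
    """Manually set values: each value is valid until the next one is set.

    The most recent value has no known end yet, so its end is None.
    """
    ends = list(set_times[1:]) + [None]
    return list(zip(set_times, ends))


def polled_interval(t_i, dt):
    """Polled values read every dt seconds: valid from t_i - dt/2 to t_i + dt/2."""
    return (t_i - dt / 2.0, t_i + dt / 2.0)


# Beam type set at 12:00 and again at 14:00, as seconds since midnight.
noon, two_pm = 12 * 3600, 14 * 3600
print(manual_intervals([noon, two_pm]))

# Gas temperature polled every 10 s; reading taken at t = 100 s.
print(polled_interval(100.0, 10.0))
```

With these intervals, an offline analysis can ask of any source whether a value was valid at a given calendar time, regardless of how that source produces its data.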

PC synchronisation

Following the idea of tagging all pieces of data from every data source with the time at which they were produced (or are valid), the problem of making sure that all tagging clocks are running synchronously becomes important.

Supposing that CLOUD will never require a time tagging accuracy better than 5 ms, the NTP protocol seems to be appropriate for synchronising PCs, in particular because there is a CERN NTP service.

This means that we do not expect any detector to have an acquisition rate faster than 200 Hz, i.e. 200 individually timed samples per second (one sample every 5 ms).

(In case of increased needs, it is still possible to improve the timing accuracy by a factor of 5 by adding a dedicated timing server that has a GPS clock connected.)
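To illustrate how a PC could check that its clock is within the 5 ms target, the sketch below applies the standard NTP clock-offset and round-trip-delay formulas to the four timestamps of one request/reply exchange. The function name, the tolerance constant and the timestamp values are illustrative assumptions; no real server is queried.

```python
TOLERANCE_S = 0.005  # the 5 ms tagging accuracy assumed in the text


def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP estimate from one request/reply exchange.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)
    Returns (offset, delay) in seconds; offset is server minus client.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Illustrative exchange: client clock 2 ms behind the server,
# 1 ms network latency each way, 0.5 ms processing on the server.
offset, delay = ntp_offset_and_delay(10.0000, 10.0030, 10.0035, 10.0025)
print(f"offset = {offset * 1000:.1f} ms, round trip = {delay * 1000:.1f} ms")
print("within tolerance" if abs(offset) <= TOLERANCE_S else "clock needs correction")
```

The offset estimate is exact only when the network latency is the same in both directions, which is one reason the achievable accuracy depends on the network between the PCs and the time server.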

Compensating for the delays in different detectors

Now that we know that each PC can tag pieces of data to within an accuracy of 5 ms, the final problem is to make sure that we have a way to compensate for the different amounts of time that it takes for different measurements to be tagged.

This, for the most part, will be due to the different delay paths from when the front-end acquires the data to when the PC tags the data.

A way to circumvent this difficulty is to consider that every measurement is framed between a start time tag and an end time tag.
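A minimal sketch of this framing, assuming the PC records one time tag just before it starts a read and another when the data arrive, so that the true acquisition time lies somewhere between the two. The names `framed_read` and `may_coincide` are hypothetical, and the read function stands in for whatever call a given instrument provides.

```python
import time


def framed_read(read_fn, clock=time.time):
    """Call read_fn() and return (start_tag, end_tag, value).

    The start tag is taken before the read is requested and the end tag
    after the value has arrived on the PC.
    """
    start = clock()
    value = read_fn()
    end = clock()
    return start, end, value


def may_coincide(a, b):
    """True if two (start, end, value) frames overlap in time."""
    return a[0] <= b[1] and b[0] <= a[1]


# Stand-in for an instrument read; a real one would talk to the front-end.
frame = framed_read(lambda: 21.5)
print(frame)
```

Two measurements from different detectors can then be treated as simultaneous whenever their frames overlap, without needing to know the exact delay of either path.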

Run conditions

Alongside the data from the sampling instruments, we need to record all the operating conditions for the run - some of which will be read automatically (e.g. temperature, pressure...) and some of which may be manually entered at the start of the run (e.g. field cage voltage). These so-called slow control values and settings will also be timestamped and recorded, so that we will have an unambiguous record of the run conditions associated with the corresponding data from the sampling instruments.


Once all the different data sources are guaranteed to be tagged with times that are synchronised with respect to each other, the problem of data storage is simply one of data access: once the format is agreed upon, the only offline work is to assemble the pieces belonging to the same time.

Perhaps the simplest way is for files to be stored on some central disk service in parallel, while a central database is populated with pairs of time tags and links to the file locations.
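One way such a central index could look is sketched below using SQLite from the Python standard library. The table layout, function names, source names and file paths are all hypothetical; the point is only that each file is registered with its start and end time tags and can then be found by calendar time.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) the index of time tags and file locations."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " source TEXT, t_start REAL, t_end REAL, location TEXT)"
    )
    return db


def register(db, source, t_start, t_end, location):
    """Record that `location` holds data from `source` for [t_start, t_end]."""
    db.execute(
        "INSERT INTO files VALUES (?, ?, ?, ?)",
        (source, t_start, t_end, location),
    )
    db.commit()


def files_at(db, t):
    """Return (source, location) for every file whose interval contains t."""
    rows = db.execute(
        "SELECT source, location FROM files"
        " WHERE t_start <= ? AND ? <= t_end ORDER BY source",
        (t, t),
    )
    return rows.fetchall()


db = open_index()
register(db, "CPC1", 0.0, 60.0, "/data/cpc1/cycle_0001.csv")   # hypothetical path
register(db, "IMS", 20.0, 40.0, "/data/ims/scan_0007.txt")     # hypothetical path
register(db, "CPC1", 60.0, 120.0, "/data/cpc1/cycle_0002.csv") # hypothetical path
print(files_at(db, 30.0))
```

Keeping only time tags and links in the database leaves each instrument free to write its files in parallel, while the offline analysis needs a single query to find everything recorded at a given time.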


  • The T11 beamline will deliver one or two pulses of charged pions, each of duration 460 ms, during the so-called PS supercycle of 14.4 s. The beam momentum is 3.5 GeV/c, which means the pions are in the "minimum ionising" region of energy loss, typical of most cosmic rays in the troposphere. The beam size with the present magnets and CLOUD optics settings is about 1.1 m (horizontal) by 1.7 m (vertical). The time-averaged beam intensity can be adjusted between zero and 100 times the galactic cosmic ray (GCR) intensity at ground level (which is about a factor of 2 times the GCR intensity at the top of the troposphere). The beam intensity can be finely adjusted in this range and will be measured to about 5% precision or better with the beam telescope.
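As a small worked example using only the pulse duration and supercycle length quoted above, the fraction of time during which beam is delivered can be computed as follows. The function and constant names are illustrative.

```python
PULSE_DURATION_S = 0.460  # duration of one pulse, from the text
SUPERCYCLE_S = 14.4       # length of the PS supercycle, from the text


def beam_on_fraction(pulses_per_supercycle):
    """Fraction of the supercycle during which beam is delivered."""
    return pulses_per_supercycle * PULSE_DURATION_S / SUPERCYCLE_S


for n in (1, 2):
    print(f"{n} pulse(s): beam on {100 * beam_on_fraction(n):.1f}% of the time")
```

This fraction is what relates the intensity during a pulse to the time-averaged intensity quoted above.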

Topic revision: r13 - 2006-08-04 - AndreDavid