SPSCollaboration

This page collects information for the CERN-LARP collaboration on the development of a fast feedback system for single-bunch instabilities at the SPS.

A temporary page is here.

DRAFT for a data exchange protocol

I would like to present a set of proposals for data exchange. I first concentrate on the physical quantities we are interested in, defining a code name for each. Unless otherwise specified, each quantity is a real number in SI units. This is the essential part: we must agree that the data are unambiguously defined and sufficient for our analysis. I then propose several data formats and briefly list their pros and cons, and I conclude with the data hosting problem.

Please add your comments, modifications and concerns. I hope we can quickly define a simple standard.

I start with the physical quantities that we may want to study.

Simulated centroid motion

Useful for coherent effects and bridging physics with feedback system

| name | description |
| zs | z position of the slice |
| xoff | centroid x offset |
| yoff | centroid y offset |
| xpoff | centroid px/p (x prime) offset |
| ypoff | centroid py/p (y prime) offset |
| sx | standard deviation of the distribution in x |
| sy | standard deviation of the distribution in y |
| npart | number of particles per slice |
| nquant | number of physical quantities (8 in this case) |
| nslice | total number of slices |
| nkick | total number of kick locations |
| nturn | total number of turns |
| sim_centroid_full | whole data set: multidimensional array of nturn x nkick x nslice x nquant elements |
| sim_centroid | turn-by-turn data set at a single kick station: multidimensional array of nturn x nslice x nquant elements |
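As a minimal sketch of these two arrays in NumPy (the sizes are invented for illustration, and taking the quantity index last in the order of the table above is my assumption, not an agreed convention):

```python
import numpy as np

# Illustrative sizes only -- not taken from any real run
nturn, nkick, nslice, nquant = 100, 4, 64, 8

# sim_centroid_full: nturn x nkick x nslice x nquant elements
sim_centroid_full = np.zeros((nturn, nkick, nslice, nquant))

# sim_centroid: the same data restricted to a single kick station,
# leaving an nturn x nslice x nquant array
sim_centroid = sim_centroid_full[:, 0, :, :]

# e.g. xoff of slice 10 at turn 5, taking quantity index 1 = xoff
# per the table order (an assumed convention)
xoff = sim_centroid[5, 10, 1]
```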

Simulated individual particle motion

Useful for studying incoherent motion

| name | description |
| q | charge of the macroparticle in units of abs(e_charge) |
| m | rest mass of the macroparticle in eV |
| x | x position offset w.r.t. the reference particle |
| y | y position offset w.r.t. the reference particle |
| z | z position offset w.r.t. the reference particle |
| p | reference momentum in GeV/c |
| xp | px/p relative momentum |
| yp | py/p relative momentum |
| zp | pz/p relative momentum |
| nquant | number of physical quantities (9 in this case) |
| npart | number of particles |
| nkick | total number of kick locations |
| nturn | total number of turns |
| sim_part_full | whole data set: multidimensional array of nturn x nkick x npart x nquant elements |
| sim_part | turn-by-turn data set at a single kick station: multidimensional array of nturn x npart x nquant elements |
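A sketch of the particle data set, again with invented sizes and with the table order as an assumed index convention; it also shows how the incoherent data reduce to the centroid quantities of the previous section:

```python
import numpy as np

# Illustrative sizes only
nturn, nkick, npart, nquant = 50, 2, 1000, 9

rng = np.random.default_rng(0)
sim_part_full = rng.normal(size=(nturn, nkick, npart, nquant))

# sim_part: turn-by-turn data at a single kick station
sim_part = sim_part_full[:, 0, :, :]

# e.g. the x centroid per turn, taking quantity index 2 = x
# per the table order (q, m, x, ...) -- an assumed convention
xoff_per_turn = sim_part[:, :, 2].mean(axis=1)
```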

Wide band pickup raw data

| name | description |
| cycleid | id number of the machine super cycle |
| cycledate | date-time, in seconds from the epoch, of the start of the machine super cycle |
| cycledatestr | string representation of cycledate in RFC 822 format (e.g. Wed, 18 Feb 2009 12:37:35 -0500) |
| fs | sampling frequency |
| nframe | number of sampled frames |
| nsample | number of samples per frame |
| framedate | date-time, in seconds from the epoch, of the trigger time |
| framesecfrac | fractional second of the trigger time |
| frametoffset | trigger time offset in units of 1/fs |
| sigma | sigma (sum) signal: array of nframe x nsample 8-bit integers |
| sigmascale | voltage scale of the LSB of the sigma signal |
| delta | delta signal: array of nframe x nsample 8-bit integers |
| deltascale | voltage scale of the LSB of the delta signal |
| sigmaatt | wide-band attenuation of the sigma signal (attenuators) |
| deltaatt | wide-band attenuation of the delta signal (attenuators) |
| bandwidth | scope bandwidth |
| impedance | scope input impedance |
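A hedged sketch of how the raw 8-bit samples and the timing fields might be reconstructed. All numbers here are hypothetical, and I am assuming sigmascale is in volts per LSB and the attenuation is given in dB; the table does not fix either unit:

```python
import numpy as np

# Hypothetical acquisition parameters (units are my assumptions)
fs = 4e9                 # sampling frequency [Hz]
sigmascale = 0.004       # volts per LSB of the sigma signal
sigmaatt = 20.0          # attenuation in dB (assumed convention)

# Two frames of four raw 8-bit samples each (made-up values)
sigma = np.array([[100, -50, 0, 25],
                  [12, -128, 127, 3]], dtype=np.int8)

# Scale LSBs to volts at the scope input, then undo the attenuation
sigma_volts = sigma.astype(np.float64) * sigmascale * 10**(sigmaatt / 20.0)

# Absolute trigger time of a frame from the three timing fields
framedate, framesecfrac, frametoffset = 1234971455, 0.25, 10
t_trigger = framedate + framesecfrac + frametoffset / fs
```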

These data are basically a collection of multidimensional arrays plus a few parameters. A common data format would be useful to store everything in a consistent way.

Data formats

SPS measurements data have a variety of data formats:

* the exponential stripline pickup uses the internal Tektronix '.wfm' format

* the long stripline pickup (also called the headtail monitor) uses ASCII SDDS data

* the wall current monitor uses a proprietary ASCII file

The Headtail code uses ASCII files, with spaces, line breaks, and double line breaks encoding the three dimensions of a 3D data set.
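A minimal reader for such a file could look like this, assuming spaces separate values, single line breaks separate rows, and blank lines separate the outermost blocks; the exact Headtail convention should be checked against a real output file:

```python
def read_headtail_ascii(text):
    """Parse space / line-break / double-line-break encoded 3D data
    into a nested list: blocks -> rows -> floats."""
    blocks = []
    for block in text.strip().split("\n\n"):
        blocks.append([[float(v) for v in line.split()]
                       for line in block.splitlines()])
    return blocks

# Tiny made-up example: 2 blocks of 2 rows of 2 values
data = read_headtail_ascii("1 2\n3 4\n\n5 6\n7 8\n")
```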

I don't know the native formats for WARP.

Claudio and John use MATLAB for post-processing.

I use Python with the NumPy library for post-processing.

The data format should have the following properties:

1. hold a large quantity of data efficiently: it must hold 60M samples for each of hundreds of acquisitions

2. be compatible with the simulation and post-processing tools

The alternatives I see are:

0. keep the raw data as they are and provide tools for reading them from MATLAB, Python, C, and Fortran. Some of these tools are already available.

1. gzipped ASCII files. ASCII is easy to create and read, and gzip saves some space, but the combination is expensive to write and parse: 60M points as ASCII floating-point numbers correspond to a 1.8 GB uncompressed file and roughly a 180 MB compressed file, instead of about 20 MB for compressed binary.

2. raw binary file + description. Compact, but can be a little tricky to create and read.

3. SDDS file format (http://www.aps.anl.gov/Accelerator_Systems_Division/Operations_Analysis/SDDSInfo.shtml). Open source. It holds metadata and can be binary. However, it does not support compression, it is not widespread outside accelerator labs, it is not optimized for access speed, and it is not endian-transparent. Reading and writing require external libraries that must be compiled from source, which is not always straightforward.

4. HDF5 format (http://www.hdfgroup.org/HDF5/). Open source. Similar to SDDS but with more features: it supports compression natively, it is optimized for access speed (only the requested subset is loaded into memory, not the whole file), and it is widespread. It is native in MATLAB (http://www.mathworks.com/access/helpdesk/help/techdoc/ref/hdf5.html), compiled libraries are available for Linux and Windows (http://www.hdfgroup.org/HDF5/release/platforms5.html), and libraries exist for Python (http://h5py.alfven.org/), Octave, R, C, C++, Fortran 77/90, and more.
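To make option 2 concrete, here is a sketch with NumPy of a raw binary file plus a plain-text description; the description layout (a small JSON file with dtype, shape and byte order) is invented purely for illustration, not a proposed standard:

```python
import numpy as np
import json, os, tempfile

# Invented example data set: 3 frames of 16 8-bit samples
nframe, nsample = 3, 16
data = np.arange(nframe * nsample, dtype=np.int8).reshape(nframe, nsample)

base = os.path.join(tempfile.mkdtemp(), "acq")

# Raw binary file ...
data.tofile(base + ".bin")
# ... plus a description of how to interpret the bytes
desc = {"dtype": "int8", "shape": [nframe, nsample], "byteorder": "little"}
with open(base + ".json", "w") as f:
    json.dump(desc, f)

# A reader needs only the description to recover the array
with open(base + ".json") as f:
    d = json.load(f)
restored = np.fromfile(base + ".bin", dtype=d["dtype"]).reshape(d["shape"])
```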

Data hosting

FTP directories with a README seem the easiest way to get data across several labs; ssh + rsync is probably the most efficient and safest.

The quantity of data, even if we are very careful, will easily exceed tens of GB. I already have 33 GB.

CERN or other labs may provide this space, but they usually require people to have local accounts to access it.

In any case, if the data volume is limited, I have some space on CERN AFS and CERN web space, but uploading requires a CERN account.

I also have personal web space with good bandwidth that can hold up to 10 GB before anyone starts complaining. It should be good for the short term.

Comments

Provided that the physical quantities are well defined, we have a few options for data format and data hosting.

For the data format I would choose option 4 (HDF5) because it offers the best performance and maximum flexibility, is open source, and has compiled libraries available for essentially any system and language. Otherwise, option 2 (binary + description, where the description is compilable C code) would offer the same flexibility and performance, but would require more work.

For data hosting I can provide FTP + HTTP for the short term, but I need a better solution for the long term. Maybe a shared account behind the CERN firewall would suffice. Alternative options are welcome.

These are my first thoughts.

-- RiccardoDeMaria - 18 Feb 2009

Topic revision: r5 - 2009-04-03 - RiccardoDeMaria