Automation of Data Validation
Introduction
To assist with the validation of the accounting data, a script has been written to download the data from the two sources, calculate their ratios and publish the results on the Site Status Board.
The script is set to be executed automatically on a monthly basis.
Particular emphasis was placed on the visualisation of the data.
This work was done by Dimitrios Christidis as part of the CERN openlab Summer Student Programme 2016, under the supervision of Julia Andreeva.
The full report is available on ZENODO.
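The ratio calculation mentioned above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name is hypothetical, and the direction of the ratio (experiment value divided by site value, expressed as a percentage) is an assumption made for the example.

```python
from typing import Optional


def ratio_percent(experiment_value: float, site_value: float) -> Optional[float]:
    """Return experiment_value / site_value as a percentage.

    Assumption: the ratio is experiment over site. The direction used by
    the real script is not stated on this page.
    Returns None when the site value is zero, since the ratio is undefined.
    """
    if site_value == 0:
        return None
    return 100.0 * experiment_value / site_value
```

A ratio of 100% means the two sources agree exactly; the further the value is from 100%, the larger the discrepancy.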
Information for users
Main view
The Accounting Validation view shows the latest data in a table format.
The data is grouped by experiment, then by metric.
Figure 1: The main view on the Site Status Board.
Graphs
The true usefulness of the Site Status Board comes from the ability to visualise the data over time.
Click on one of the Ratio metrics in the table header.
This will take you to the Site View for the requested metric.
Specify the desired Time Period and Site Selection, then press Refresh.
Figure 2: One of the graphs produced by the Site Status Board.
For easy access, below are direct links to the data from January 2016 to June 2017:
Colour coding
The colour reflects the distance of a particular ratio from 100%.
As the distance grows, the colour changes from green to yellow and finally to red.
Three thresholds are defined to specify which colour to choose.
To allow for more flexibility, they are set on a per-experiment basis.
The values that are currently used are shown below.
Table 1: Colour-coding thresholds, as of 2016-10-17.
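The colour selection can be illustrated with a short sketch. How the three thresholds map onto the colours is not spelled out here, so the example assumes that each threshold is the upper bound of one colour band and that anything beyond the last bound keeps the last colour. The experiment name and the threshold values are hypothetical placeholders, not the values of Table 1.

```python
# Hypothetical per-experiment bands: (upper bound of the distance from
# 100%, in percentage points; colour). These are NOT the real thresholds.
EXAMPLE_THRESHOLDS = {
    "experiment_a": [(10.0, "green"), (20.0, "yellow"), (30.0, "red")],
}


def colour_for_ratio(ratio_percent, bands):
    """Pick a colour from the distance of a ratio from 100%.

    Assumption: bands are sorted by increasing upper bound, and a distance
    beyond the last bound is given the last colour.
    """
    distance = abs(ratio_percent - 100.0)
    for upper_bound, colour in bands:
        if distance <= upper_bound:
            return colour
    return bands[-1][1]
```

For example, with the placeholder bands above, a ratio of 115% lies 15 points from 100% and falls into the yellow band. Keeping the bands in a per-experiment mapping mirrors the per-experiment thresholds described in the text.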
Raw data
Clicking anywhere on the graph will display a JSON file.
The value of the Status key holds the raw data for that particular site and month as comma-separated numbers.
The numbers are: HS06 Experiment, HS06 Site, HS06 Ratio, Raw Experiment, Raw Site and Raw Ratio.
Figure 3: Display of raw data in JSON file.
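For anyone who wants to process the raw data programmatically, the following sketch parses the Status value into named fields. It assumes that Status is a top-level key of the JSON document, that the six numbers appear in the order listed above, and that all of them are numeric; the type and function names are hypothetical.

```python
import json
from typing import NamedTuple


class AccountingRecord(NamedTuple):
    """The six numbers of a Status value, in the assumed order."""
    hs06_experiment: float
    hs06_site: float
    hs06_ratio: float
    raw_experiment: float
    raw_site: float
    raw_ratio: float


def parse_status(document: str) -> AccountingRecord:
    """Parse a JSON document and split its Status value into six floats.

    Assumption: "Status" is a top-level key holding a comma-separated string.
    """
    payload = json.loads(document)
    fields = [part.strip() for part in payload["Status"].split(",")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 values in Status, got {len(fields)}")
    return AccountingRecord(*(float(field) for field in fields))
```

The length check makes the sketch fail loudly if the layout of the Status value differs from the one assumed here.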
Offline spreadsheets
The Site Status Board is not suitable for the display of older data in table format.
For this reason, the automation script also produces Microsoft Excel files.
Information for maintainers
Documentation is provided with the source code on CERN GitLab.
-- DimitriosChristidis - 2017-03-02