Monitoring using JSON metadata files (JSON Collector)

Metadata file formats for the FileBasedEVF

BU

Simple data file (.jsn) and definition / legend file (.jsd)

Simple data file (.jsn)
  • File type: JSON
  • Output by: BU
  • Data fields
    • number of events
    • total size
    • others...
  • Other fields
    • definition: location of the legend file for this format
    • source: string identifying the source of this data

Example:

{
   "data" : [ "1022", "122122"],
   "definition" : "/path/to/def.jsd",
   "source" : "bu-1"
}

Definition / Legend file (.jsd)
  • File type: JSON
  • Output by: ??
  • Fields
    • legend: array of name/operation pairs, one per monitored field
    • file: path of this legend file; data JSON files refer to it through their definition field

{
   "legend" : [
      {
         "name" : "Events",
         "operation" : "sum"
      },
      {
         "name" : "Total size",
         "operation" : "sum"
      }
   ],
   "file" :"/path/to/def.jsd"
}

NB: The number of elements in the data array of the DATA file must match the number of elements in the legend array of the LEGEND file!
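
The consistency rule above can be checked before any aggregation is attempted. The following is a minimal sketch, assuming both files have already been parsed into memory; LegendEntry, DataFile and matchesLegend are hypothetical names used for illustration and are not part of JSONCollector.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory view of one legend entry (not a JSONCollector type).
struct LegendEntry {
    std::string name;       // e.g. "Events"
    std::string operation;  // e.g. "sum"
};

// Hypothetical in-memory view of a .jsn data file (not a JSONCollector type).
struct DataFile {
    std::vector<std::string> data;  // values are stored as strings in the file
    std::string definition;         // path to the .jsd legend
    std::string source;             // e.g. "bu-1"
};

// Returns true when every data value has exactly one legend entry.
bool matchesLegend(const DataFile& file, const std::vector<LegendEntry>& legend) {
    return file.data.size() == legend.size();
}
```

A file that fails this check should probably be rejected rather than padded or truncated, since the position of each value is its only link to a legend entry.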

FU

Fast output file (.fast), Data+histogram file (.jsh) and definition/legend file (.jsd)

Fast output file (.fast)
  • File type: CSV
  • Output by: CMSSW process
  • Data fields
    • first line is the path to the definition file, which has the same format as above; microstates will have a "histogram" operation instead of the regular sum, avg, etc.
    • each following line holds the comma-separated values of the different counters; new lines are appended to the end of the file
      • number of events processed
      • number of events accepted by any stream
      • ministate
      • microstate
      • more...

Example:

/path/to/def.jsd
100,50,12,15
200,70,12,800
...
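
Reading this layout back is straightforward. The sketch below assumes that every counter is an integer and that blank lines can be skipped; FastFile and parseFast are hypothetical names and not part of JSONCollector. The "..." in the example above stands for further rows and is not valid input for this parser.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for a parsed .fast file (not a JSONCollector type).
struct FastFile {
    std::string definitionPath;            // first line of the file
    std::vector<std::vector<long>> rows;   // one entry per appended line
};

// Reads the definition path, then one row of comma-separated integers per line.
// std::stol throws if a field is not a number.
FastFile parseFast(std::istream& in) {
    FastFile result;
    std::getline(in, result.definitionPath);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;  // assumption: blank lines carry no data
        std::vector<long> row;
        std::stringstream fields(line);
        std::string field;
        while (std::getline(fields, field, ',')) {
            row.push_back(std::stol(field));
        }
        result.rows.push_back(row);
    }
    return result;
}
```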

Data+histogram file (.jsh)
  • File type: JSON
  • Output by: Aggregating .fast files
  • Data fields (same as .fast files, but with histogram vectors for the states)
    • number of events processed
    • number of events accepted by any stream
    • ministate histogram array
    • microstate histogram array
    • more...

Example:

{
   "data" : [ "100", "50"],
   "ministates" : "[0,0,0,0,4,0...]",
   "microstates" : "[0,0,0,0,4,0...]",
   "source" : "fu-pid"
}

Definition/legend file (.jsd)
  • Same as the BU-type legend file, but with arrays of ministate and microstate names

{
   "legend" : [
      {
         "name" : "Events processed",
         "operation" : "sum"
      },
      {
         "name" : "Events accepted by any stream",
         "operation" : "sum"
      }
   ],
   "ministates"   : ["Mname1", "Mname2", ...],
   "microstates" : [ "mname1", "mname2", ...],
   "file" :"/path/to/def.jsd"
}

Handling these formats

  • Simple data files (BU-type) are aggregated by using operations specified in the definition file
  • Fast output files are aggregated by looking at the definition (1st line) and:
    1. if the field has a regular operation, take only the last value
    2. if it is a microstate, take the corresponding value in each row
    3. place these values in a Data+histogram type JSON file
  • Data+histogram files are aggregated using operations specified in the definition file
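
The fast-file rules above can be sketched as follows. This is one interpretation rather than the JSONCollector implementation: it assumes that "take corresponding value in each row" means counting how often each state number occurs, and that the number of states is known in advance. AggregatedFast and aggregateFast are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type (not a JSONCollector type).
struct AggregatedFast {
    std::vector<long> lastValues;               // one per regular column
    std::vector<std::vector<long>> histograms;  // one per histogram column
};

// rows: the parsed CSV lines of a .fast file.
// isHistogram[c]: true if the legend gives column c a histogram operation.
// stateCount: assumed number of distinct state ids (0 .. stateCount-1).
AggregatedFast aggregateFast(const std::vector<std::vector<long>>& rows,
                             const std::vector<bool>& isHistogram,
                             std::size_t stateCount) {
    AggregatedFast out;
    for (std::size_t col = 0; col < isHistogram.size(); ++col) {
        if (!isHistogram[col]) {
            // Rule 1: regular counter, keep only the most recent value.
            out.lastValues.push_back(rows.empty() ? 0 : rows.back().at(col));
            continue;
        }
        // Rule 2: state column, count the occurrences of each state id.
        std::vector<long> histogram(stateCount, 0);
        for (const std::vector<long>& row : rows) {
            long state = row.at(col);
            if (state >= 0 && static_cast<std::size_t>(state) < stateCount) {
                ++histogram[static_cast<std::size_t>(state)];
            }
        }
        out.histograms.push_back(histogram);
    }
    return out;
}
```

Rule 3 then amounts to serializing lastValues into the "data" array and the histograms into the state arrays of a Data+histogram file.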

Writing JSON metadata files

The following C++ types are currently monitorable:

  • IntJ: wraps int
  • DoubleJ: wraps double
  • StringJ: wraps std::string

Below is an example showing how to use the JSONCollector output API to generate monitoring files in the format above. This approach is useful when the monitoring output should be configurable by editing only the definition (an external file), without changing the code.

Writing simple data files (BU-type)

#include "JSONCollector/interface/JsonMonitorable.h"
#include "JSONCollector/interface/DataPointMonitor.h"
#include "JSONCollector/interface/JSONSerializer.h"

#include <iostream>
#include <vector>
#include <string>

using namespace std;
using namespace jsoncollector;

class Monitored {

public:
        // some variables to monitor
        // types defined in JsonMonitorable.h
	IntJ nEvents;
	DoubleJ totalSize;
	StringJ someString;

};

int main() {

	Monitored mon;

        // set names of the variables to be matched with JSON Definition
        // (the names must match the "name" entries in the legend)
	mon.nEvents.setName("Events");
	mon.totalSize.setName("Sizes");
	mon.someString.setName("SomeString");

        // create a vector of all monitorable parameters to be passed to the monitor
	vector<JsonMonitorable*> monParams;
	monParams.push_back(&mon.nEvents);
	monParams.push_back(&mon.totalSize);
	monParams.push_back(&mon.someString);

        // create a DataPointMonitor using vector of monitorable parameters and a path to a JSON Definition file
	DataPointMonitor monitor (monParams, "/path/to/simple_def.jsd");

        // give some values to the monitored parameters
	mon.nEvents = 1023;
	mon.totalSize = 512223;

        // create a DataPoint object and take a snapshot of the monitored data into it
	DataPoint dp;
	monitor.snap(dp);

        // serialize the DataPoint and output it
	string output;
	JSONSerializer::serialize(&dp, output);
	cout << output << endl;

        // write this string to a file
        // ...

        return 0;
}

For the above code we need a definition file at the specified path. The output format will be given by the definition at /path/to/simple_def.jsd.

{
   "legend" : [
      {
         "name" : "Sizes",
         "operation" : "sum"
      },
      {
         "name" : "Events",
         "operation" : "sum"
      }
   ],
   "file" : "/path/to/simple_def.jsd"
}
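
To illustrate what "given by the definition" means here, the sketch below orders monitored values by legend name. It is an assumption about how name matching could work, not the documented behaviour of DataPointMonitor; orderByLegend is a hypothetical helper.

```cpp
#include <map>
#include <string>
#include <vector>

// Builds the "data" array in legend order from name -> value pairs.
// Assumptions: monitored values whose name is not in the legend are left out,
// and legend names without a monitored value yield an empty string.
std::vector<std::string> orderByLegend(
        const std::map<std::string, std::string>& valuesByName,
        const std::vector<std::string>& legendNames) {
    std::vector<std::string> data;
    for (const std::string& name : legendNames) {
        std::map<std::string, std::string>::const_iterator it = valuesByName.find(name);
        data.push_back(it == valuesByName.end() ? std::string() : it->second);
    }
    return data;
}
```

Under these assumptions the definition above would place "Sizes" before "Events" in the output, and "SomeString" would not appear because the legend does not list it.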

Writing fast files (FU-type)

These files are not JSON, but CSV. They will be converted to JSON by the aggregation process.

#include "JSONCollector/interface/JsonMonitorable.h"
#include "JSONCollector/interface/FastMonitor.h"
#include "JSONCollector/interface/JSONSerializer.h"

#include <vector>

using namespace std;
using namespace jsoncollector;

class Monitored {

public:
	IntJ processedEvents;
	IntJ acceptedEvents;
	IntJ microstate;
	IntJ macrostate;

};

int main() {

	Monitored mon;

        // set names of the variables to be matched with JSON Definition
	mon.processedEvents.setName("Processed Events");
	mon.acceptedEvents.setName("Accepted Events");
	mon.microstate.setName("Microstate");
	mon.macrostate.setName("Macrostate");

        // create a vector of all monitorable parameters to be passed to the monitor
	vector<JsonMonitorable*> monParams;
	monParams.push_back(&mon.processedEvents);
	monParams.push_back(&mon.acceptedEvents);
	monParams.push_back(&mon.macrostate);
	monParams.push_back(&mon.microstate);

        // create a FastMonitor using vector of monitorable parameters, a path to a JSON Definition file and the output file path
	FastMonitor
			monitor(
					monParams,
					"/path/to/histo_def.jsd",
					"/path/to/output.fast");

        // change the monitored parameters
	mon.processedEvents = 100;
	mon.acceptedEvents = 76;
	mon.microstate = 3;
	mon.macrostate = 1;

	monitor.snap();

         // change the monitored parameters again
	mon.processedEvents = 200;
	mon.acceptedEvents = 150;
	mon.microstate = 9;
	mon.macrostate = 1;

	monitor.snap();

        // do something else ...

	monitor.snap();

	return 0;
}

For the above code we need a definition file at the specified path. The output format will be given by the definition at /path/to/histo_def.jsd.

{
   "legend" : [
      {
         "name" : "Processed Events",
         "operation" : "sum"
      },
      {
         "name" : "Accepted Events",
         "operation" : "sum"
      },
      {
         "name" : "Microstate",
         "operation" : "mHisto"
      },
      {
         "name" : "Macrostate",
         "operation" : "MHisto"
      }
      
   ],
   "file" : "/path/to/histo_def.jsd"
}

Aggregating JSON metadata files

API

Below is an example of using the API to aggregate JSON metadata files read from a directory.

#include "JSONCollector/interface/JSONFileCollector.h"
#include <vector>
#include <string>

using std::vector;
using std::string;

int main() {

   string inputFolder = "/path/to/jsnfiles/";
   string outputFile = "/output/path/out.jsn";
   vector<string> inputJSONFilePaths;

   string outcomeString;
   
   // get a list of .jsn files that respect the regular expression <mon.*> for the file name
   JSONFileCollector::getJSONFileList(inputFolder, inputJSONFilePaths, outcomeString, "mon.*");

   // aggregate json files and write the output file
   // the last argument (formatForDisplay) is set to false to keep the input format for the output
   // set it to true for a more readable output (the output file will no longer respect the input format)
   int outcome = JSONFileCollector::collectAndOutput(inputJSONFilePaths, outputFile, false);

   return outcome;
}
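
What the aggregation does per legend entry can be sketched independently of the library. The code below is an illustration under stated assumptions, not the body of collectAndOutput: it handles only the "sum" and "avg" operations (assuming those are the spellings used in the legend), treats all values as numbers, and rejects inputs whose size does not match the legend. aggregateColumn and aggregateFiles are hypothetical names.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Combines the values of one legend entry across all input files.
double aggregateColumn(const std::vector<double>& values, const std::string& operation) {
    if (values.empty()) throw std::invalid_argument("no values to aggregate");
    double total = 0.0;
    for (double v : values) total += v;
    if (operation == "sum") return total;
    if (operation == "avg") return total / static_cast<double>(values.size());
    throw std::invalid_argument("unsupported operation: " + operation);
}

// files[f][c] is the value of column c in input file f;
// operations[c] is the operation of legend entry c in the shared legend.
std::vector<double> aggregateFiles(const std::vector<std::vector<double>>& files,
                                   const std::vector<std::string>& operations) {
    std::vector<double> result;
    for (std::size_t col = 0; col < operations.size(); ++col) {
        std::vector<double> column;
        for (const std::vector<double>& file : files) {
            if (file.size() != operations.size()) {
                throw std::invalid_argument("data/legend size mismatch");
            }
            column.push_back(file[col]);
        }
        result.push_back(aggregateColumn(column, operations[col]));
    }
    return result;
}
```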

Command line tool

The aggregator is also available as a command-line tool. It loads JSON data files from a directory (optionally filtering file names with a regex) and outputs the aggregated result according to the legend. All input files must have the same legend and be consistent.

Usage:

./JSONCollector [-d] [-r <regex>] -o <outfile> -i <indir1> <indir2>...<infileN>
where:
  • [-d] if set, the output is formatted for display: the Data and Legend files are merged into one. This file can no longer be re-aggregated.
  • [-r <regex>] regular expression that the names of the input JSON files must match
  • -o <outfile> the single output file of the operation
  • -i list of inputs for aggregation; each may be an individual file or a directory containing files

CODE

Code is available here: /afs/cern.ch/user/a/aspataru/public/JSONCollector

Open issues

  • Meaning of microstate numbers: only required for visualization, so the mapping between state number and name will be defined somewhere else

-- AndreiSpataru - 28-Nov-2012

Topic attachments
  • JSONCollector_classes.png (r1, 353.1 K, 2012-10-10 18:02, AndreiSpataru): Class diagram
Topic revision: r17 - 2012-12-17 - AndreiSpataru
 