This page is a short reference on NeuroBayes, an advanced neural network implementation. It mainly covers information specific to its usage at CERN. More general information and documentation can be found at http://www.neurobayes.de.
Important information:
NEW: New version 3.16.0, licence server address changed!
Due to the changed infrastructure of LXPLUS, a licence server was installed in August 2013. It has to be used to run NeuroBayes from now on.
To use this licence server, set the environment variable PHIT_LICENCE_SERVER to the host lcgapp-slc6-physical2 followed by the port number:
export PHIT_LICENCE_SERVER=lcgapp-slc6-physical2:16820
Since the beginning of April 2011, the "latest" symlink to the newest NeuroBayes installation in AFS is no longer provided. Please use the explicit version instead (currently 3.16.0; versions 10 and 11 were named after the release year and are older than version 3).
Thus your NEUROBAYES shell variable should look like:
export NEUROBAYES=/afs/cern.ch/sw/lcg/external/neurobayes/3.16.0/x86_64-slc6-gcc44-opt
A full setup script thus reads:
#!/bin/bash
export PHIT_LICENCE_SERVER=lcgapp-slc6-physical2:16820
export NEUROBAYES=/afs/cern.ch/sw/lcg/external/neurobayes/3.16.0/x86_64-slc6-gcc44-opt
export HOST=`hostname`
export LD_LIBRARY_PATH=$NEUROBAYES/lib:$LD_LIBRARY_PATH
export PATH=$NEUROBAYES/external:$PATH
See also the lxplus setup below.
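After sourcing a setup script like the one above, it can be useful to confirm that the variables actually took effect before starting a training. The following is a minimal sketch of such a check; the function name check_neurobayes_env is hypothetical and not part of NeuroBayes, and the check only inspects the environment and the local file system (it does not contact the licence server).

```shell
#!/bin/bash
# Hypothetical helper (not shipped with NeuroBayes): verify that the
# environment variables from the setup script are present and consistent.
check_neurobayes_env() {
    local status=0
    if [ -z "${PHIT_LICENCE_SERVER:-}" ]; then
        echo "PHIT_LICENCE_SERVER is not set" >&2
        status=1
    fi
    if [ -z "${NEUROBAYES:-}" ]; then
        echo "NEUROBAYES is not set" >&2
        status=1
    elif [ ! -d "$NEUROBAYES/lib" ]; then
        echo "no lib directory under $NEUROBAYES" >&2
        status=1
    fi
    # The library directory must appear as one entry of LD_LIBRARY_PATH.
    case ":${LD_LIBRARY_PATH:-}:" in
        *":${NEUROBAYES:-}/lib:"*) ;;
        *) echo "LD_LIBRARY_PATH does not contain \$NEUROBAYES/lib" >&2
           status=1 ;;
    esac
    return "$status"
}
```

A possible usage is `check_neurobayes_env && echo "setup looks fine"`, run in the same shell in which the setup script was sourced.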
NeuroBayes® is, as the name suggests, an advanced multivariate analysis tool that combines neural network techniques with Bayesian statistics in order to yield a well-performing, fast and overtraining-safe algorithm for data analysis.
It started as a tool for event selection in high energy physics analyses. Over the years it was developed further and made its way into economic fields.
Nowadays, the company <Phi-T> GmbH develops, maintains and supports NeuroBayes. Since NeuroBayes started as a tool for physicists, <Phi-T> wants to continue supporting scientific users and offers them NeuroBayes licences at special rates. A special agreement has been arranged for CERN, allowing users to use NeuroBayes on all lxplus machines free of charge for training with the so-called NeuroBayesTeacher.
Afterwards, NeuroBayes can be applied to data on any machine that is software compatible, because scientific users get a special version of the so-called NeuroBayesExpert which does not require a licence. In particular, it will also run on the GRID.
The NeuroBayes workflow is depicted in the following image, which also shows the relation between the NeuroBayes Teacher, the Expert and the expertise.
How can NeuroBayes be used at CERN?
Users at CERN can use NeuroBayes on lxplus machines. Every lxplus machine can get a licence from a licence server, and NeuroBayes is installed in CERN AFS.
In order to use it you have to set the following environment variables:
export NEUROBAYES=/afs/cern.ch/sw/lcg/external/neurobayes/<version>/<architecture> #currently the newest version is 3.16.0 which only works on SLC6, users depending on SLC5 should use 3.7.0
export PHIT_LICENCE_SERVER=lcgapp-slc6-physical2:16820
export LD_LIBRARY_PATH=$NEUROBAYES/lib:$LD_LIBRARY_PATH
export PATH=$NEUROBAYES/external:$PATH
Alternatively you can just run:
export PHIT_LICENCE_SERVER=lcgapp-slc6-physical2:16820
. /afs/cern.ch/sw/lcg/external/neurobayes/3.16.0/x86_64-slc6-gcc44-opt/setup_neurobayes.sh
Please note that this AFS folder is only accessible from within CERN. If you want to use the NeuroBayesExpert at another site that mounts the /afs file system, please use:
export NEUROBAYES=/afs/cern.ch/sw/lcg/external/neurobayes_expert/<version>/<architecture>
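The <version> and <architecture> placeholders in the paths above have to be filled in by hand. As an illustration only, the sketch below composes the path from its parts; the function name neurobayes_prefix and its "full"/"expert" argument are hypothetical, and the function only builds the string without checking that the directory exists for a given version and architecture.

```shell
#!/bin/bash
# Hypothetical helper: build the AFS installation path from its components.
# Usage: neurobayes_prefix <version> <architecture> [full|expert]
neurobayes_prefix() {
    local version="$1" arch="$2" flavour="${3:-full}"
    local base=/afs/cern.ch/sw/lcg/external
    case "$flavour" in
        full)   echo "$base/neurobayes/$version/$arch" ;;
        expert) echo "$base/neurobayes_expert/$version/$arch" ;;
        *)      echo "unknown flavour: $flavour" >&2
                return 1 ;;
    esac
}

# Example with the version and architecture quoted on this page.
export NEUROBAYES="$(neurobayes_prefix 3.16.0 x86_64-slc6-gcc44-opt)"
```

Passing `expert` as the third argument yields the corresponding path under neurobayes_expert.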
For a test run you could use a simple C tutorial, which you can find at
/afs/cern.ch/user/s/sroecker/public/NeuroBayes-c_tutorial.tgz
and which is also explained below.
How can I use NeuroBayes in my analysis?
There are several ways to use NeuroBayes in your analysis. The basic approach in current HEP analyses is to use the C++ interface to communicate with NeuroBayes directly. There are also interfaces for using NeuroBayes in TMVA or in MVA from CMSSW, either of which can be more convenient if you are already using it for your classifier.
General documentation
The general NeuroBayes documentation can be found in the User's Guide.
C++ interface
The NeuroBayes C++ interface is documented in this PDF file.
Three small tutorial examples using the C++ interface together with ROOT can be downloaded from here. Explanations for the tutorial can be found at http://neurobayes.phi-t.de/index.php/tutorials/cc.
A quick try on lxplus can be done by issuing the following commands. Please make sure you have set up a ROOT version in advance. In order to link your program on SLC6 you have to use the same g++ version that was used to compile SLC6, or a newer one; the commands below use gcc 4.6.2.
. /afs/cern.ch/sw/lcg/external/gcc/4.6.2/x86_64-slc6-gcc46-opt/setup.sh
. /afs/cern.ch/sw/lcg/app/releases/ROOT/5.34.09/x86_64-slc5-gcc46-opt/root/bin/thisroot.sh
wget http://neurobayes.phi-t.de/downloads/NeuroBayes-c_tutorial.tgz
tar -zxf NeuroBayes-c_tutorial.tgz
cd NeuroBayes/c-tutorial
./setup_neurobayes.sh
make
make trainingdata
./train2
Note: the tutorial's setup_neurobayes.sh refers to /afs/cern.ch/user/g/gedum/public/NeuroBayes/software, which seems outdated. Update it to use the setup script advertised at the very top of this page.
This will run a quick NeuroBayes training for you. The whole tutorial and its explanations can be found here.
TMVA interface
A TMVA interface for NeuroBayes has been developed for the TMVA versions included in ROOT since 5.18. Due to an interface redesign in TMVA-4, the mechanism had to be readapted, but it works again if one applies a small patch to ROOT. Detailed information on how to obtain and set up the interface and how to use NeuroBayes in the context of TMVA can be found at http://neurobayes.phi-t.de/index.php/tutorials/tmva. An updated version with additional patches can be found at https://github.com/sroecker/tmva-neurobayes.
Training modes
As of now NeuroBayes can be trained in three essentially different ways. The parameters that select the training mode are NB_DEF_PRE (global preprocessing flag) and NB_DEF_ITER (number of iterations).
Normal iterative training
This performs an ordinary iterative training that minimizes the loss function. The settings are:
nb->NB_DEF_ITER(100); // number of training iterations
nb->NB_DEF_PRE(612); // global preprocessing flag
//optional parameters
nb->NB_DEF_METHOD("BFGS"); // faster training mode with automatic stop if optimal point is reached
Zero iteration mode
In this training mode no real training of a neural network takes place. Only the preprocessing is performed, followed by another analytical transformation of the inputs into the discriminator.
Settings:
nb->NB_DEF_ITER(0); // number of training iterations (zero: no network training)
nb->NB_DEF_PRE(622); // global preprocessing flag
//optional parameters
nb->NB_DEF_SHAPE("DIAG"); // perform spline fit on the output to bring
// discriminator in linear correlation to the signal probability
This training mode is extremely fast. In terms of discrimination power it is usually as good as a network training, but the output is not necessarily linearly correlated with the signal probability (although it can be transformed automatically), and the discriminator distribution is sometimes not smooth, which may look odd to the human eye.
Internal boost
Most MVA techniques can be boosted, which means one trains a second MVA that assigns high weights to those events which the first classifier misclassified.
NeuroBayes can perform a two-step boosting process internally, consisting of a zero-iteration first step and an iterative training in the second step. The settings needed for this are a mixture of those for iterative and zero-iteration training.
nb->NB_DEF_ITER(100); // number of training iterations
nb->NB_DEF_PRE(622); // global preprocessing flag
//optional parameters
nb->NB_DEF_METHOD("BFGS"); // faster training mode with automatic stop if optimal point is reached
nb->NB_DEF_SHAPE("DIAG"); // perform spline fit on the output to bring
// discriminator in linear correlation to the signal probability
Support
The main source of support for NeuroBayes should be www.neurobayes.de. There you will find documentation, tutorials, a helpdesk and more. If for some reason you need urgent help, you may contact neurobayes@blue-yonder.com.
--
SteffenRoecker - 2015-07-20