Chapter 10: Software Infrastructure
10.1 Installing CMS Software
CMSSW Software Installed at a Generic Site
CMSSW does not need to be installed locally at every site. Users can access it via the LXPLUS cluster at CERN or at some of the T1/T2 centers. Otherwise, to install it locally, follow the instructions at Installing software using apt. Because the installation depends on the local environment, unexpected problems may come up. In that case, please report them to the software development tools hypernews list.
If you are using a local installation, you may need to use slight variations of the standard instructions in this workbook for some tasks. Any variations should be documented in the workbook page Specific Information for Remote Sites and Institutions. If not, please contact your system admin and ask that the necessary information be added or links provided.
Responsible:
SudhirMalik
Last reviewed by:
AndreasPfeiffer 12 Feb 2008
10.2 Developing Software
Goals of this page
When you finish this page, you should understand the basic concepts to consider when developing your own software. Additionally, you should know how to document your code so that it appears in the CMS online documentation.
Eventually this page will also contain instructions for contributing analysis documentation to the main documentation repository.
Instructions for writing code
Tools and instructions for writing code can be found in the Offline Guide Framework section. In particular, to avoid problems with memory leaks, you should follow the guidelines for using pointers.
Writing Code to run in the CMSSW Framework
Instructions for developing CMSSW code, adding packages, checking new code into the CVS repository, and maintaining your code are currently found in the CMSSW Developers Guide.
Writing private code to run in CMSSW
This section is intended for users who want to write "user" code which will never go into CMSSW public software releases. If you intend your code to go into CMSSW releases, you should request a normal package with the normal procedure via the Tag Collector. For this, see the section "Writing Code to run in the CMSSW Framework" above.
Creating a user area in the CMSSW repository
There is an area called "UserCode" in the CMSSW repository where you can
create a package for yourself to store your private user code and cfg files, for example:
UserCode/PElmer
To create such a user CVS area for yourself, do the following:
1) Choose a packagename for yourself. Typically this should be your
name (e.g. with some capitalization as above, or your CERN unix
username, etc.) so that it is semi-obvious whose package it is.
2) Then create your package (here for an example "JohnDoe" user):
a) Set your CVSROOT to point to the CMSSW CVS repository.
You do this by doing the following, for example:
***********************************************
[malik@cmslpc02 ~/test]$ kserver_init
Enter your username at CERN:malik
Getting your Kerberos 5 credentials from CERN
Password for malik@CERN.CH:
Ticket cache: FILE:/tmp/krb_cern_6398
Default principal: malik@CERN.CH
Valid starting     Expires            Service principal
01/21/10 23:03:12  01/22/10 23:03:12  krbtgt/CERN.CH@CERN.CH
        renew until 01/28/10 23:03:12
***********************************************
b) 'cd' to some temporary area and do:
cvs co UserCode/README
c) mkdir UserCode/JohnDoe
d) cd UserCode
e) cvs add JohnDoe
In general it is advisable to structure it like a normal CMSSW package,
i.e. with the same subdirectory structure. To do this, you can do:
mkdir JohnDoe/src JohnDoe/interface
cvs add JohnDoe/src JohnDoe/interface
mkdir JohnDoe/data JohnDoe/test
cvs add JohnDoe/data JohnDoe/test
If you do not think you need some of these subdirectories at the
moment, you simply do not 'mkdir' and 'cvs add' them.
3) Then you can 'cd' into the "src" area of some scram working/developer
area and do:
cvs co UserCode/JohnDoe
and begin to add your code and cfg files to the subdirectories. If you
need a BuildFile, you can copy an example from a normal CMSSW package
and edit it to serve your purposes.
Commit your files to your own subdirectory
Please note that you should not commit files to the UserCode
directory itself, but only to your equivalent of the JohnDoe
sub-directory.
You should then find your code under UserCode in CVS.
General Principles for Software Development
Test your new code locally, on small data samples
When writing your own code, it is important to work in your local environment and to test the analysis code on a small number of events. This saves time and resources.
You should develop your own code locally, and compile it, then run on small test samples to enable you to quickly track down and fix bugs.
You should also save and test your code regularly - otherwise it can be very time-consuming to track down errors which might produce obscure messages.
Once you are confident your user code is working correctly, you should run a short test job on the sample that you plan to use in your analysis (either data or Monte Carlo). This will enable you to test the compatibility between your code and the specific sample you want to analyse. Additionally, it will enable you to test whether your code runs properly when the main job uses a different submission process from local running (i.e., batch submission compared to interactive use). Sometimes certain commands or characters cause the code to behave differently.
Batch systems for job submission are prioritized to provide fair access for all users, and very short jobs attract a high priority - therefore small test jobs can be run quickly. This can provide a significant saving of time on a busy system.
Collating your results
At the end of your analysis job, you will want to "summarise" your analysis jobs and produce plots of information from the analysis and reconstruction. This is typically a less compute-intensive job, and tends to be something people do in their local environment.
Contributing documentation
While you are developing your own software, it is likely that you will find areas in which you can improve the online documentation so that other users can benefit from your experiences.
There are three principal ways in which you can do this.
- The first is the CMS Offline WorkBook, of which the current page is a part. If there are pages relevant to your code, you should give feedback and suggest improvements.
- The second is the CMS Offline Guide, which gives all the details of the CMS software. Whenever your code becomes public, you should provide a page which explains how it works and how to use it.
- The third is to make your code "self-documenting" by adding descriptive comments to your code which can later be harvested by the Doxygen system.
The CMS Software Documentation Policy is described in SWGuideDocumentationPolicy.
Contribute to the CMS WorkBook
Contributions to the documentation of the CMS WorkBook project are very welcome. Please see the Guide for contributors.
Contribute to the CMSSW Reference Manual
The CMSSW Reference Manual contains automatically generated documentation, produced by Doxygen, describing the classes and packages. The manual can be found at
http://cmsdoc.cern.ch/cms/cpt/Software/html/General/gendoxy-doc.php
An explanation of how contributors can annotate their code to add material to the Reference Manual is available by following the link at the bottom of the Reference Manual main page.
Review status
Responsible:
SudhirMalik
Last reviewed by:
SudhirMalik - 21 Jan 2010
10.3 Optimizing your Code
Optimising your code is simple. The process is as follows:
- Determine a good moment for effective optimisation.
- Establish a baseline reference you know to be correct. You will not be able to conclude your changes were an improvement until you have been able to verify the results remain correct. Obviously enough this implies you need to have a realistic test suite.
- Establish the goals you are trying to achieve in the light of most significant performance factors, so you know when you are done. Assign priority and amount of effort that can be afforded to reach the goals.
- Measure performance of representative tasks. Make sure the tests cover a sufficient part of the phase space of real life usage. The measurements need to be repeatable and unambiguous.
- Analyse the benchmark results for performance bottlenecks and memory-related problems.
- Review the data structures and algorithms used. If deep code changes are required, make these corrections first and repeat until order-of-magnitude issues have been addressed.
- Address remaining hot spots. Measure again until satisfied.
It is very common that only one issue can be addressed at a time.
One frequently runs into a single "dragon head" that needs to
be chopped off before anything else can even be seen, and then
other heads will pop up and can be addressed. Many such
issues are "low-hanging fruit," requiring very modest effort
to identify.
We have collected common performance issues you might wish to review.
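The baseline and measurement steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a CMSSW tool: the task (a sum of squares) and the names sumSquaresReference, sumSquaresCandidate, agreesWithBaseline and bestTimeSeconds are all hypothetical. The idea is to check a candidate against a trusted reference first, and only then to time it repeatably.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical reference implementation: the baseline known to be correct.
double sumSquaresReference(const std::vector<double>& v) {
  double sum = 0.0;
  for (std::size_t i = 0; i < v.size(); ++i) sum += v[i] * v[i];
  return sum;
}

// Hypothetical candidate that we would like to validate and time.
double sumSquaresCandidate(const std::vector<double>& v) {
  return std::inner_product(v.begin(), v.end(), v.begin(), 0.0);
}

// True if the two results agree within a relative tolerance.
bool agreesWithBaseline(double reference, double candidate, double relTol) {
  const double scale = std::max(std::fabs(reference), std::fabs(candidate));
  return std::fabs(reference - candidate) <= relTol * scale;
}

// Runs the task several times and returns the best wall-clock time in
// seconds. Taking the minimum reduces the influence of unrelated system
// activity; the volatile sink keeps the compiler from removing the call.
template <typename F>
double bestTimeSeconds(F&& task, int repetitions) {
  volatile double sink = 0.0;
  double best = std::numeric_limits<double>::max();
  for (int i = 0; i < repetitions; ++i) {
    const auto start = std::chrono::steady_clock::now();
    sink = task();
    const auto stop = std::chrono::steady_clock::now();
    best = std::min(best, std::chrono::duration<double>(stop - start).count());
  }
  (void)sink;
  return best;
}
```

A typical use would be to assert agreesWithBaseline(...) on a representative input before comparing bestTimeSeconds for the reference and the candidate. A timing difference only means something once the results have been shown to agree.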
When to begin optimizing your code
Performance measurement and optimisation can begin when the code is correct and essentially complete. Until then, effort is better spent on reaching these two goals. It is an excellent idea to work iteratively: first produce a complete but limited system, then test, validate and benchmark it, and use the experience to build a better system. This allows performance to be addressed early in a healthy manner. Premature and uninformed optimisation rarely yields the desired gains and frequently reduces performance, at the high price of distorting the programme design unnecessarily and reducing maintainability.
For the optimisation effort to yield measurable improvements on representative tasks, one must know whether and where effort is needed. A well-informed optimisation depends on sufficiently realistic, detailed and accurate measurements, because developers are notoriously bad at guessing where optimisation is necessary. The optimisation effort should be proportional to its value. In particular, substantial further development is likely to change the performance characteristics, so early efforts should be duly limited.
Factors contributing to performance
A number of design and programming factors affect performance,
roughly in the following order of significance:
- It seems obvious perhaps, but one needs to understand the problem the programme is solving. Usually there is vast latitude in how any one problem can be solved, and it is important to know how much of that latitude can be used and has already been tried.
- Algorithmic and data structure design changes have orders of magnitude more performance impact than tuning a specific algorithm.
- Mapping concepts well to an implementation is very important. Specifically watch for over-reliance on strings, recomputing values or lookups unnecessarily, excessive sorting and copying large objects.
- Every class should have a single clear mission and there needs to be a clear object life time and ownership policy. Apart from the design aspects that otherwise require these, it is next to impossible to optimise anything that serves numerous purposes or has ambiguous ownership.
- Simple linear code that is easily understood is preferable. Hiding simple linear logic in hundreds of seemingly unrelated small code fragments, or in an impenetrable thicket of indirection layers, is just as bad as the more traditional monstrous algorithms of several thousand lines of complex branching and looping patterns.
- Memory usage patterns and performance correlate well. High-performance code should not allocate memory only to release it soon afterwards. Use the least necessary amount of memory required by the algorithm and aim for memory locality. Modern CPUs take a big hit for poor memory access patterns and the problems take an expert to analyse. Exceeding CPU cache limits, frequent pointer dereferences and pointer walks around memory compound to a high price.
- Most "clever tricks" make things worse, not better. Only use tricks where they have clearly measurable significant impact.
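Two of the factors listed above, repeated lookups and copying of large objects, can be illustrated with a small sketch. The functions scaledSumSlow and scaledSumFast and the "scale" key are hypothetical names invented for this example. Both functions return the same result, but the first copies its arguments on every call and repeats a string-keyed lookup for every element.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Pattern to watch for: both arguments are passed by value (copied on every
// call), and the string-keyed map lookup is repeated for every element.
double scaledSumSlow(std::vector<double> values,
                     std::map<std::string, double> constants) {
  double sum = 0.0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    sum += values[i] * constants.at("scale");
  }
  return sum;
}

// Same result: arguments are passed by const reference (no copies) and the
// lookup is done once, outside the loop.
double scaledSumFast(const std::vector<double>& values,
                     const std::map<std::string, double>& constants) {
  const double scale = constants.at("scale");
  double sum = 0.0;
  for (double value : values) {
    sum += value * scale;
  }
  return sum;
}
```

Whether the difference matters in a given programme should be established by measurement. The sketch only shows what the two patterns look like.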
Measuring Performance
Under construction
Identifying bottlenecks
Valgrind/kcachegrind can be very helpful for identifying bottlenecks.
Here is a recipe of how to use them:
- in the cfg file something like this:
service = ProfilerService {
untracked int32 firstEvent = 2
untracked int32 lastEvent = 51
untracked vstring paths = { "p1"}
}
- then run valgrind as a single command (the backslash continues the line):
valgrind --tool=callgrind --combine-dumps=yes --instr-atstart=no \
    --dump-instr=yes --separate-recs=1 cmsRun ****.cfg
- use kcachegrind to interpret the valgrind output, which is written to a file whose name starts with callgrind.out and ends with the PID of the job. You may need to add something to your path, similar to the example
setenv PATH /afs/cern.ch/cms/bin/i386_linux2:${PATH}
kcachegrind callgrind.out.xxxxx
If the results indicate that no part of the programme is a bottleneck, but the code is too slow, this usually indicates there is an overall structural problem.
Identifying memory performance issues
- Use the valgrind memcheck tool (the redirection shown is csh/tcsh syntax):
valgrind --tool=memcheck --leak-check=full cmsRun run.py >& vglog.out &
- Or, suppressing the leaks reported from ROOT:
valgrind --tool=memcheck --leak-check=full --suppressions=${ROOTSYS}/etc/valgrind-root.supp cmsRun run.py >& vglog.out &
Making performance improvements
Under construction
Some very short practical hints
- prefer fixed data structures, reused for each event, to dynamic push_backs per event. Of course this works only if you know a maximum size.
- avoid unnecessary new/delete
- avoid unnecessary use of operator=
- avoid unnecessary creation of objects on the stack
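The first hint can be sketched as follows. This is an illustration only: kMaxHits, selectHitsAllocating and HitSelector are hypothetical names, and the upper bound of 1024 is an assumption made for the example. The class owns one fixed buffer and overwrites it for each event, instead of building a new vector every time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical upper bound on the number of selected hits per event.
constexpr std::size_t kMaxHits = 1024;

// Per-event allocation: a new vector is created, grown by push_back and
// destroyed again for every event.
std::vector<double> selectHitsAllocating(const std::vector<double>& energies,
                                         double threshold) {
  std::vector<double> selected;
  for (double e : energies) {
    if (e > threshold) selected.push_back(e);
  }
  return selected;
}

// Fixed storage: the buffer is allocated once with the object and reused
// for every event.
class HitSelector {
public:
  // Fills the buffer with the energies above threshold and returns how many
  // were stored. Entries beyond kMaxHits are dropped in this sketch; real
  // code would have to decide how to handle such an overflow.
  std::size_t select(const std::vector<double>& energies, double threshold) {
    size_ = 0;
    for (double e : energies) {
      if (e > threshold && size_ < kMaxHits) buffer_[size_++] = e;
    }
    return size_;
  }

  // Only the first select() return value entries are valid for the event.
  const std::array<double, kMaxHits>& buffer() const { return buffer_; }

private:
  std::array<double, kMaxHits> buffer_{};
  std::size_t size_ = 0;
};
```

A HitSelector would typically be a data member of the module that processes the events, so that it lives as long as the job does.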
Optimal programming of EventSetup issues
The following code example, called once per event, is fast, but still slower than needed:
edm::ESHandle<EcalTPGFineGrainEBGroup> theEcalTPGFineGrainEBGroup_handle;
setup.get<EcalTPGFineGrainEBGroupRcd>().get(theEcalTPGFineGrainEBGroup_handle);
const EcalTPGFineGrainEBGroup * ecaltpgFgEBGroup =
    theEcalTPGFineGrainEBGroup_handle.product();
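You can skip the retrieval unless the record's cacheIdentifier() has changed since the previous event. The sketch below assumes that theCacheIDThatWasKept_ and ecaltpgFgEBGroup_ (a hypothetical name) are data members of your module, so that they persist between events, and that the record checked is the one actually retrieved:
const EcalTPGFineGrainEBGroupRcd & record = setup.get<EcalTPGFineGrainEBGroupRcd>();
if (record.cacheIdentifier() != theCacheIDThatWasKept_) {
  // the record content changed: fetch the product again and remember the new identifier
  edm::ESHandle<EcalTPGFineGrainEBGroup> theEcalTPGFineGrainEBGroup_handle;
  record.get(theEcalTPGFineGrainEBGroup_handle);
  ecaltpgFgEBGroup_ = theEcalTPGFineGrainEBGroup_handle.product();
  theCacheIDThatWasKept_ = record.cacheIdentifier();
}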
This is particularly worthwhile when several EventSetup records have to be retrieved.
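The caching idea can be tried outside the framework with stand-in types. In the sketch below, MockRecord and CachedConsumer are hypothetical classes written for this example and are not part of CMSSW. Only the cacheIdentifier() check mirrors the pattern described above. The mock counts how often the (notionally expensive) retrieval is performed.

```cpp
#include <cassert>

// Stand-in for an EventSetup record; NOT the CMSSW class. It counts how
// often its payload is retrieved so that the effect of caching is visible.
struct MockRecord {
  unsigned long long cacheId = 1;  // changes whenever the payload changes
  int payload = 42;
  mutable int fetchCount = 0;

  unsigned long long cacheIdentifier() const { return cacheId; }
  const int* get() const {
    ++fetchCount;
    return &payload;
  }
};

// Keeps the pointer to the product and the identifier it was obtained with,
// and fetches again only when the identifier has changed.
class CachedConsumer {
public:
  int process(const MockRecord& record) {
    if (record.cacheIdentifier() != cachedId_) {
      product_ = record.get();
      cachedId_ = record.cacheIdentifier();
    }
    return *product_;
  }

private:
  // Assumption of this mock: 0 is never used as a valid identifier, so the
  // first call to process() always fetches the product.
  unsigned long long cachedId_ = 0;
  const int* product_ = nullptr;
};
```

Calling process() twice with an unchanged record performs a single retrieval. A second retrieval happens only after the record's identifier changes.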
Review status
Responsible:
PeterElmer
Last reviewed by:
UrsulaBerthon 26 Feb 2008