source $VO_CMS_SW_DIR/slc4_ia32_gcc345/external/perfreport/2.0.0/profile.d/init.sh
or
source $VO_CMS_SW_DIR/slc4_ia32_gcc345/external/perfreport/2.0.0/profile.d/init.csh
where $VO_CMS_SW_DIR points to your CMSSW installation directory. If the PerfReport path does not exist, ask the person in charge to install the PerfReport package (as found in the official CMS repository) or, to do it yourself, follow the instructions for installing packages at https://twiki.cern.ch/twiki/bin/view/CMS/CMSSW_bootstrap.
On LXPlus (64bit SLC4), a working command to get perfreport in your path is
source /afs/cern.ch/cms/sw/slc4_ia32_gcc345/cms/perfreport/2.0.0/etc/profile.d/init.sh
(or init.csh if you use tcsh instead of the bash shell)
gunzip it first. After that, type
perfreport -fi -i (your igprof file) -o (output directory)
PerfReport will then create your HTML report in the output directory you specified. The entry page is called overall.html.
setenv VALGRIND_LIB /afs/cern.ch/user/m/moserro/public/vgfcelib
or
export VALGRIND_LIB=/afs/cern.ch/user/m/moserro/public/vgfcelib
and then run Callgrind as
valgrind --tool=callgrind --fce=(outputfile).fce cmsRun (...your configuration...)
or
valgrind --tool=callgrind --fce=(outputfile).fce --instr-atstart=no cmsRun (...your configuration...)
if you want to use ProfilerService/ProfilerAnalyzer. After that, type
perfreport -ff -i (your fce output file) -o (output directory)
PerfReport will then create your HTML report in the output directory you specified. The entry page is called overall.html.
perfreport -fe -i (your edmeventsize file) -o (output directory)
PerfReport will then create your HTML report in the output directory you specified. The entry page is called objects_pp.html.
perfreport -fi -i (newer profile) -r (older profile) -o (output directory)
for IgProf files or
perfreport -ff -i (newer profile) -r (older profile) -o (output directory)
for FCE files. PerfReport will then create your HTML report in the output directory you specified. The entry page is called overall.html.
perfreport -f(f|i|e) -i (profile)
The switches -f and -i are mandatory. If omitted, the program halts:
-f(f|i|e)
-i (inputfile)
-c (regconfig)
-d (reportdesc)
-o (outdir)
-r (profile)
-t (tempdir)
-m 'key,value'
-y (counter)
See https://twiki.cern.ch/twiki/bin/view/CMS/IgProfAnalysis for explanations about the various IgProf counters.
-p
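
The invocations shown above can be assembled programmatically. The following is a minimal sketch, assuming only the switches that appear in the examples on this page (-f, -i, -r, -d, -o); the helper name `perfreport_command` and the file names are hypothetical and not part of PerfReport.

```python
import shlex

# Format switches as used in the examples on this page:
# -ff for Callgrind FCE, -fi for IgProf, -fe for EdmEventSize.
FORMATS = {"fce": "-ff", "igprof": "-fi", "edmeventsize": "-fe"}


def perfreport_command(fmt, profile, outdir=None, reference=None, report_desc=None):
    """Build an argument list for perfreport (hypothetical helper)."""
    args = ["perfreport", FORMATS[fmt], "-i", profile]  # -f and -i are mandatory
    if reference is not None:
        args += ["-r", reference]      # older profile for a regression report
    if report_desc is not None:
        args += ["-d", report_desc]    # custom report description (XML)
    if outdir is not None:
        args += ["-o", outdir]
    return args


# Hypothetical file names, for illustration only.
cmd = perfreport_command("igprof", "new.igprof", outdir="report", reference="old.igprof")
print(shlex.join(cmd))
# prints: perfreport -fi -i new.igprof -r old.igprof -o report
```

The list returned by the helper can be passed to `subprocess.run` once perfreport is in your path.
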
ls $PERFREPORT_PATH/*.xml
you will obtain a list of the (currently 4) standard report descriptions that PerfReport uses by default if you omit the -d switch. Here is what these XML files are used for:
perfreport ... -d $PERFREPORT_PATH/pr2cmsswfltmsg.xml ...
and combine with any other options you like.
<Report> ...(options) ... <ReportSection> ... </ReportSection> <ReportSection> ... </ReportSection> ... </Report>
Every report section defines a specific view of the data represented in the report and has its own entry page together with, possibly, subpages linked from there. To specify global report options, you can use the following tags:
subelement | purpose |
---|---|
GlobalFilters | Use this tag to specify a chain of filters which will be applied to the profiling data before it is analyzed, aggregated and reported. The syntax of the tag is as follows: <GlobalFilters> <Filter> ... </Filter> <Filter> ... </Filter> </GlobalFilters>. The filters are applied one after another in the order you specify. See the section on filters below to understand how to describe the properties of each filter. |
CallstackAnalysis | Put yes inside this element to have PerfReport create an extra page outside of any report section, showing global callstack statistics (top-cost callstacks, callstack depth histogram, ...). Defaults to no if missing. |
<Report> <ReportSection> </ReportSection> </Report>
It will produce a standard report with default options which shows the top 20 functions overall in a table and a pie chart. Optionally however, you can add any of the following subtags:
subelement | purpose |
---|---|
MainTitle | Specifies the main heading for the entry page of the reportsection. Try to meaningfully describe here what the page shows. |
SubTitle | Specifies the subtitle for the entry page of the reportsection. Use it to describe the content of the page in more detail. |
DisplayItems | Put an integer in this subtag to set the maximum number of entries printed in the list on this page and on its subpages. If any of the lists exceeds this number, the remainder of the report items is summarized in an {others} entry. If the reportsection lacks this tag, the default value of 20 is used. Set the value to zero if you want no limit at all. Be aware that if your input data yields a symbol table with a lot of entries (as is the case with CMSSW), then producing lists of unlimited length will make the report both huge and unbrowsable. |
OutFile | Specifies the filename for the entry page of the reportsection. Provide only the filename and NO path; the utility will concatenate the name with the output path specified at invocation time. |
ProducePieCharts | Put the value yes in this tag to activate production of pie charts on this page and its subpages. Put no to deactivate it. Any other value or a missing tag is interpreted as no. |
ProduceLegends | Put the value yes in this tag to activate production of legends for the pie charts on this page and its subpages. Put no to deactivate it. Any other value or a missing tag is interpreted as no. If you have deactivated production of charts on this page, this value is ignored. |
ProduceInfoPages | Put the value yes in this tag to activate production of function info pages (showing callers and callees) for every function appearing on this page and its subpages. Put no to deactivate it. Any other value or a missing tag is interpreted as no. |
Aggregation | Use this tag to specify in what way data should be aggregated in this report section. The description uses templates or custom aggregates. If you omit the tag, single cost elements (functions) are shown in the top-level list without applying aggregation. |
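
A report description built from the subtags in the table above can also be generated with a script. The sketch below uses Python's standard `xml.etree.ElementTree` module; the tag names come from the table, while the helper `report_section`, the order of the subtags, and the title and file name are assumptions made for illustration. Whether PerfReport accepts this exact layout has not been verified here.

```python
import xml.etree.ElementTree as ET


def report_section(main_title, out_file, display_items=20, pie_charts=False):
    """Build one <ReportSection> element (hypothetical helper)."""
    section = ET.Element("ReportSection")
    ET.SubElement(section, "MainTitle").text = main_title
    # OutFile takes a bare file name, no path.
    ET.SubElement(section, "OutFile").text = out_file
    # 0 would mean "no limit"; 20 is the documented default.
    ET.SubElement(section, "DisplayItems").text = str(display_items)
    ET.SubElement(section, "ProducePieCharts").text = "yes" if pie_charts else "no"
    return section


report = ET.Element("Report")
report.append(
    report_section("Top functions by self cost", "top.html",
                   display_items=50, pie_charts=True)
)
xml_text = ET.tostring(report, encoding="unicode")
print(xml_text)
```

The resulting string can be written to a file and passed to perfreport with the -d switch.
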
subelement | purpose |
---|---|
ShowCallCount | Shows the number of times a given function has been called. This column is visible by default, given that the report type is either Callgrind FCE or IgProf memory. Put no to override. In case of IgProf performance or EdmEventSize, this option is ignored. |
ShowInclusives | Shows inclusive cost numbers. This column is visible by default. |
ShowSelfRelatives | Shows percentages of self cost relative to the total cost in the list. This column is visible by default. |
value | meaning |
---|---|
FunctionName | Displays only the name of the function without namespace or type information. |
FunctionNameWithSignature | Displays the function name inclusive of return type and argument list. |
FullFunctionName | Displays a fully qualified function name including all levels of preceding namespaces. This is the default. |
FullFunctionNameWithSignature | Displays a fully qualified function name including all levels of preceding namespaces plus return type and argument list. |
IntelligentName | Tries to guess what the most 'useful' information is: displays the top-level namespace, then, if necessary, dots (...), then the lowest-level namespace (which is most likely the class name), then the function name. Removes all templating information and the signature (templates are replaced by dots if the template collapser is deactivated). However, if the given function is the definition of a C++ operator, then signature and return type information is displayed. Help us improve these rules by sending us your suggestions! |
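
To make the difference between the name formats concrete, here is a rough sketch of the namespace-shortening part of IntelligentName. It is an illustration only: the exact output format PerfReport produces (separator placement around the dots) is an assumption, and the sketch ignores templates, signatures and the operator rule described above. The function names are hypothetical.

```python
def split_qualified(full_name):
    """Split 'A::B::C::f' into (['A', 'B', 'C'], 'f').

    Simplification: assumes a plain qualified name without
    templates, return type or argument list.
    """
    parts = full_name.split("::")
    return parts[:-1], parts[-1]


def function_name(full_name):
    """FunctionName format: the bare name without namespaces."""
    return split_qualified(full_name)[1]


def intelligent_name(full_name):
    """Top-level namespace, dots if levels were dropped, innermost namespace, name."""
    namespaces, func = split_qualified(full_name)
    if len(namespaces) <= 2:
        shown = namespaces                      # nothing to abbreviate
    else:
        shown = [namespaces[0], "...", namespaces[-1]]
    return "::".join(shown + [func])


print(function_name("A::B::C::f"))     # prints: f
print(intelligent_name("A::B::C::f"))  # prints: A::...::C::f
print(intelligent_name("A::B::f"))     # prints: A::B::f
```
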
value | meaning |
---|---|
SelfCost | Sorts by descending self cost. This is the default. Note: in case of edmEventSize profiles, this refers to the plain object size. |
InclusiveCost | Sorts by descending inclusive cost. Note: in case of edmEventSize profiles, this refers to the compressed object size. |
FullFunctionName | Sorts by the full function name (inclusive of namespace/class information), alphabetically. |
FunctionName | Sorts by the function name (without namespace), alphabetically. |
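
The interplay of the default SelfCost ordering with the DisplayItems limit can be sketched as follows. This is a simplified model, not PerfReport's implementation: the function `top_items` is hypothetical, and it assumes that the {others} entry is appended in addition to the displayed items rather than counted against the limit.

```python
def top_items(self_costs, display_items=20):
    """Rank functions by descending self cost and fold the tail into {others}.

    self_costs    -- dict mapping function name to self cost
    display_items -- maximum number of entries; 0 means no limit
    """
    ranked = sorted(self_costs.items(), key=lambda kv: kv[1], reverse=True)
    if display_items == 0 or len(ranked) <= display_items:
        return ranked
    shown = ranked[:display_items]
    remainder = sum(cost for _, cost in ranked[display_items:])
    return shown + [("{others}", remainder)]


# Hypothetical cost numbers, for illustration only.
costs = {"a": 50, "b": 30, "c": 15, "d": 5}
print(top_items(costs, display_items=2))
# prints: [('a', 50), ('b', 30), ('{others}', 20)]
```

Sorting by FunctionName or FullFunctionName would replace the key function with one that returns the name instead of the cost.
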
subelement | purpose |
---|---|
IncludeNavHeader | Put yes inside this element to place the header on the page that displays the current reportsection. It will be possible to navigate among all reportsections that include the header. Those which don't are not reachable via links and constitute separate entry points into the report. The default is no. You can omit the tag. |
NavLevel1Caption | Provide a short name for the highest-level category the current reportsection belongs to. To be displayed in the leading row of the header. |
NavLevel2Caption | Provide a short name for the mid-level category the current reportsection belongs to. To be displayed in the middle row of the header. |
NavLevel3Caption | Provide a short name for the current reportsection. To be displayed in the lowest row of the header. |
<Filter> <Type>...</Type> <ItemSet> ... </ItemSet> </Filter>
and in the other case simply
<Filter> <Type>...</Type> </Filter>
Within the <Type> tag, specify the kind of filter you want to apply. The following exist:
Type | filter action and purpose |
---|---|
CollapseItemSet | Collapses a given set of functions to be reported as only one function. The set has to be specified using the <ItemSet>-tag of the filter element (see dedicated section below). |
CollapseTemplates | Collapses all functions that were generated by the compiler from the same template, e.g. multiply<int> and multiply<float> are collapsed and treated as one function. The collapser does not distinguish between the different places in which a template can occur, so vector<int>::push_back and vector<double>::push_back are likewise collapsed. If different instances of a templated function reside in different libraries, they are nevertheless collapsed. The 'library' of the collapsed element will be reported as a list of all the libraries where instances of the template occur. This filter does not have an ItemSet property. |
CollapseTemplatesSeparateLibs | Same as CollapseTemplates, but elements are not collapsed across libraries. That is, if instances of the same function template are linked into different libraries, they are kept apart in the performance report. The flat list will show identical names for such functions (since the templates have been stripped). By navigating to the function profile, you can check that the elements reside in distinct libraries. This filter does not have an ItemSet property to it. |
Prune | Removes a given set of functions (specified by the ItemSet subelement) from the report by pruning the respective branches in the profile tree. The costs of the given elements as well as all costs in any descendants will disappear from the report. Typical use cases for this type of filter:
|
SubtreeInline | Removes a given set of functions (specified by the ItemSet subelement) from the report by inlining the inclusive weight of the respective branches into their callers. That is, the total call cost to any of these elements will be added to the self cost of their respective callers, and the element as well as all of its descendants will disappear from the report. Typical use cases for this type of filter:
|
NodeInline | Removes a given set of functions (specified by the ItemSet subelement) from the report by inlining the self weight of the respective nodes into their callers. Only the self cost of each node matching the item set will be inlined into the caller. The descendants called by the function removed will be inherited by the caller. Typical use cases for this type of filter:
|
Skip | Removes a given set of functions (specified by the ItemSet subelement) from the report by making their callers inherit their descendants but deleting the self costs arising in them. There is no typical use case for this filter. Skip was the only removal mode in PerfReport 1 which had to deal with incomplete information coming from standard Callgrind and can be used to emulate this behaviour in order to be able to compare new results with old reports. |
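
The four removal filters differ only in what happens to the self cost and the descendants of a removed function. The sketch below models them on a tiny call tree. It is an illustrative model written from the descriptions in the table, not PerfReport's code: the `Node` class, the `apply_filter` function and the cost numbers are all hypothetical, and a plain name predicate stands in for an ItemSet.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    self_cost: int
    children: list = field(default_factory=list)

    def inclusive(self):
        """Self cost plus the inclusive cost of all callees."""
        return self.self_cost + sum(c.inclusive() for c in self.children)


def apply_filter(node, mode, matches):
    """Return a filtered copy of the tree; the root itself is never removed."""
    result = Node(node.name, node.self_cost)
    pending = list(node.children)
    while pending:
        child = pending.pop(0)
        if not matches(child.name):
            result.children.append(apply_filter(child, mode, matches))
        elif mode == "Prune":
            pass                                    # subtree and its cost vanish
        elif mode == "SubtreeInline":
            result.self_cost += child.inclusive()   # whole branch goes to the caller
        elif mode == "NodeInline":
            result.self_cost += child.self_cost     # only the self cost is inlined
            pending = child.children + pending      # caller inherits the descendants
        elif mode == "Skip":
            pending = child.children + pending      # descendants kept, self cost dropped
        else:
            raise ValueError(f"unknown filter type: {mode}")
    return result


# main (1) calls a (10) and c (3); a calls b (5). Total inclusive cost: 19.
tree = Node("main", 1, [Node("a", 10, [Node("b", 5)]), Node("c", 3)])
for mode in ("Prune", "SubtreeInline", "NodeInline", "Skip"):
    filtered = apply_filter(tree, mode, lambda name: name == "a")
    print(mode, filtered.self_cost, filtered.inclusive())
```

With `a` removed, the total cost drops to 4 under Prune (the whole branch is gone) and to 9 under Skip (only the self cost of `a` is lost), while both inlining modes preserve the total of 19: SubtreeInline raises the self cost of `main` to 16, NodeInline raises it to 11 and attaches `b` to `main`.
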
<ItemSet> <Type> ... </Type> ... </ItemSet>
Depending on the type of set you select inside the <Type> subtag, there are different options you can select below. The following item set categories are available:
Type | purpose and available options |
---|---|
ExactMatch | The set contains every report item (function) for which a given property matches a given string value. Specify the property you want to use inside a <Property>-tag following the type. Specify the value you want to match it to inside a <Value>-tag following the property. Be careful to use &gt; and &lt; inside value tags if you want to use > or < signs inside your string constants. The list of properties available for the property tag can be found below. |
RegExpMatch | This set contains every report item (function) for which a given property matches a given pattern, which is interpreted as an extended POSIX regular expression. Specify the property you want to use inside a <Property>-tag following the type. Specify the pattern you want to match it against inside a <Pattern>-tag following the property. Be careful to use &gt; and &lt; inside pattern tags if you want to use > or < signs inside your regular expression. The list of properties available for the property tag can be found below. |
And | The And ItemSet contains every report item (function) that is contained in both of two given operand ItemSets. For example, use <ItemSet> <Type>And</Type> <LeftOperand> ... </LeftOperand> <RightOperand> ... </RightOperand> </ItemSet>. Inside the <LeftOperand> and <RightOperand>-blocks you can use standard <ItemSet>-blocks to describe the operand sets recursively. |
Or | The Or-ItemSet contains a report item iff one of the two specified operand ItemSets contains it. The syntax is analogous to the And-case. |
Not | Write <ItemSet> <Type>Not</Type> ... </ItemSet> to build a filter through which a report item passes if it does not pass the given operand filter. |
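
The recursive structure of item sets can be sketched as a small evaluator. The nested dictionaries below mirror the XML tags named in the table (Type, Property, Value, Pattern, LeftOperand, RightOperand); the key `Operand` for the Not case is a hypothetical name, since the tag is not given on this page. Python's `re` module is used as a stand-in for extended POSIX regular expressions, which it resembles but does not match exactly, and treating a pattern as matching anywhere in the string is also an assumption.

```python
import re


def contains(item_set, item):
    """Decide whether `item` (a dict of property values) belongs to `item_set`."""
    kind = item_set["Type"]
    if kind == "ExactMatch":
        return item[item_set["Property"]] == item_set["Value"]
    if kind == "RegExpMatch":
        # Assumption: the pattern may match anywhere in the property value.
        return re.search(item_set["Pattern"], item[item_set["Property"]]) is not None
    if kind == "And":
        return (contains(item_set["LeftOperand"], item)
                and contains(item_set["RightOperand"], item))
    if kind == "Or":
        return (contains(item_set["LeftOperand"], item)
                or contains(item_set["RightOperand"], item))
    if kind == "Not":
        return not contains(item_set["Operand"], item)  # hypothetical key name
    raise ValueError(f"unknown item set type: {kind}")


# Everything in libraries starting with "libFoo", except functions named "malloc".
selection = {
    "Type": "And",
    "LeftOperand": {"Type": "RegExpMatch", "Property": "Library", "Pattern": "^libFoo"},
    "RightOperand": {
        "Type": "Not",
        "Operand": {"Type": "ExactMatch", "Property": "FunctionName", "Value": "malloc"},
    },
}

# Hypothetical report items.
print(contains(selection, {"Library": "libFoo.so", "FunctionName": "push_back"}))  # prints: True
print(contains(selection, {"Library": "libFoo.so", "FunctionName": "malloc"}))     # prints: False
```
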
Property | meaning |
---|---|
Library | The library where the function is stored. An appropriate value could for example be "mylibrary.so". |
FirstNamespace | The name of the namespace the function belongs to. Only the first level of nesting is used, so if the function is called A::B::C::f(), then the value of FirstNamespace is A. |
FullNamespace | The full name of the namespace the function belongs to. All levels of nesting are used, so if the function is called A::B::C::f(), then the value of FullNamespace is A::B::C. |
LastNamespace | The name of the innermost namespace the function belongs to. Only the last level of nesting is used, so if the function is called A::B::C::f(), then the value of LastNamespace is C. This usually corresponds to the name of the class. |
FunctionName | The name of the function. Namespace and class information, return types, function signature, etc. are all removed. For instance, if the function is called void A::B::C::f(int x), then the value of FunctionName is f. |
FullFunctionName | The full name of the function, inclusive of namespace(s) and class. Return types, function signature, etc. are removed. For instance, if the function is called void A::B::C::f(int x), then the value of FullFunctionName is A::B::C::f. |
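
The property values in the table can be reproduced for simple symbols with a few string operations. The sketch below is a deliberately naive illustration, assuming a symbol of the form `void A::B::C::f(int x)`; it does not handle templates, operators, or return types that themselves contain spaces or `::`, and the function name `properties` is hypothetical.

```python
def properties(symbol, library):
    """Derive the ItemSet properties for a simple C++ symbol (naive sketch)."""
    before_args = symbol.split("(", 1)[0]   # drop the argument list
    qualified = before_args.split()[-1]     # drop the return type
    parts = qualified.split("::")
    namespaces = parts[:-1]
    return {
        "Library": library,
        "FirstNamespace": namespaces[0] if namespaces else "",
        "FullNamespace": "::".join(namespaces),
        "LastNamespace": namespaces[-1] if namespaces else "",
        "FunctionName": parts[-1],
        "FullFunctionName": qualified,
    }


props = properties("void A::B::C::f(int x)", "mylibrary.so")
for key, value in props.items():
    print(key, "=", value)
```

For the example symbol this yields FirstNamespace `A`, FullNamespace `A::B::C`, LastNamespace `C`, FunctionName `f` and FullFunctionName `A::B::C::f`, in line with the table.
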
<RegressionConfig> <ThresholdDecrease>...</ThresholdDecrease> <ThresholdIncrease>...</ThresholdIncrease> <Renamings> ... </Renamings> </RegressionConfig>
<Renaming> <Search>...</Search> <Replace>...</Replace> <Part>...</Part> <Lookup>...</Lookup> </Renaming>
The various elements have the following meaning and usage:
element | meaning and usage |
---|---|
Search | Here you provide the name which has changed. As usual, make sure you use &lt; for < and &gt; for >. |
Replace | Here you provide the name after the change. |
Part | Here you specify in which part of the function names the change has occurred. The following values can be used:
|
Lookup | Here you specify the direction of the change. Admissible values are:
|
Reviewer/Editor and Date (copy from screen) | Comments |
---|---|
KatiLassilaPerini - 15 Mar 2007 | created template page |
VincenzoInnocente - 16 Mar 2007 | page content last edited |
Robin Moser - 29 Mar 2007 | page content last edited |
JennyWilliams - 17 Aug 2007 | moved deprecated PerfReport version 1 information to here |