caliper(1)

Requires Optional HP Caliper Software
For Integrity Systems Only
Caliper 4.3
HP-UX 11i Version 2: December 2007 Update
NAME

caliper — measure, report, and analyze program performance data

SYNOPSIS

caliper measurement [collect_options] [report_options] program [program_args]

caliper measurement [collect_options] [report_options] pid [pid]...

caliper measurement [collect_options] [report_options] -w

caliper {report|merge|diff} [report_options] [database]...

caliper advise [advise_options] [database]...

caliper info [info_options]

caliper -g [--jre path_to_java]

caliper { -h | -H }

caliper -v

Remarks

This command requires installation of the optional HP Caliper software, which is not included in the standard operating system.

DESCRIPTION

The caliper command measures, reports on, and analyzes the performance of native Itanium(R) and Itanium 2 programs.

Quick Start

This Quick Start section provides examples of common command invocations and frequently used options. Note that additional quick start help is available from HP Caliper's Quick Start Reference card located at:

caliper_root/doc/quick_start.pdf

and also downloadable from:

http://www.hp.com/go/caliper.

To make a sampled call-graph measurement and report results:

caliper my_app

To measure and report sampled data cache misses:

caliper dcache my_app

To find a process's (for example, pid 1234) hottest functions over a 60-second period:

caliper fprof -e60 1234

To find the hottest functions across the entire system (for both kernel and processes) over a 120-second period:

caliper fprof -o output_file -w -e120

To re-report the last measurement run with source and assembly code:

caliper report -rall

To analyze the latest performance runs and provide advice:

caliper advise

To see the difference between two collection runs of the same application:

caliper diff database1 database2

To start the caliper GUI to interactively measure and explore performance data:

caliper -g

Common Measurements

cpu

cpu execution statistics

cstack

sampled call stack profile

dcache/icache

cache miss profiles

ecount

total cpu event counts

fprof

flat execution profile

scgprof

sampled call graph profile

Common Options

Save measurement results to named database:

-d DATABASE

Set length of measurement (in seconds):

-e ELAPSED_TIME

Specify which cpu metric sets to measure:

  • -m EVENT_SET[:all|:user|:kernel][,EVENT_SET]...

  • where EVENT_SET is one of:

  • overview|cpi|fp|l1dcache|l1icache|l2cache|l3cache|stall|threadswitch

Specify which cpu events to measure:

  • -m CPU_EVENT[:EVENT_PARAM]...[,CPU_EVENT]...

Save textual performance report to file:

-o TEXT_FILE

Selectively measure application processes:

  • -p {all|root|root-forks|PATTERN[,PATTERN]...}

Control reporting details:

-r [all|none]

Control the sampling specification for profiles:

-s PERIOD[,VARIATION[,CPU_EVENT]]

Measure all threads individually:

-t

Perform measurement system wide (not on specific processes):

-w

Get help:

-h for short help, -H for full help

General Description

The caliper command measures, reports on, and analyzes the performance of native Itanium programs. To obtain performance data, caliper measures some metrics using the Itanium's Performance Monitoring Unit (PMU) and, on some platforms, measures other metrics by inserting probe code into the running program. No special preparation of the measured program is necessary.

caliper is available on HP-UX 11i v2 and later and Linux/Itanium (2.6.5 kernel and later). Note that not all features are available on all platforms; see the PLATFORM-SPECIFIC ADDENDA section below for details.

caliper operates in one of two broad collection modes: in per-process mode (using --scope=process) or in system-wide mode (using --scope=system).

In per-process mode, caliper tracks and measures individual processes, optionally following fork and exec calls.

In system-wide mode, caliper measures data across all CPUs on the system, and then attributes samples to individual processes. Using system-wide mode is a good way of understanding the broad performance picture on a system, before "drilling down" using caliper in per-process mode.

caliper can measure programs which are built 32-bit or 64-bit, shared bound or minshared bound, optimized or debug. Applications can be written in C, C++, Fortran 9x, assembly (if standard runtime conventions are followed) or a mixture of these languages.

You can specify the program to be measured using an absolute path, a relative path or a simple file name. caliper searches $PATH when you specify a simple file name and looks in the current directory only if $PATH includes it.

Alternatively, you can specify one or more already running processes to measure by listing their process IDs on the command line. caliper measures the processes until they terminate, until the time set with the --duration option elapses, or until you stop caliper by sending it a SIGINT (for example, using Ctrl-C in a terminal window). Stopping caliper with a SIGINT still generates a performance report or writes performance data to a database.

Note that stopping caliper with a SIGINT (Ctrl-C) when dynamic instrumentation is being used (measurements acount, cgprof, fcount, and fcover) will cause caliper to immediately and forcibly terminate all processes being measured before writing data. Using SIGINT with any other measurement will stop caliper while allowing measured processes to continue normally.

caliper can both collect data and optionally generate reports (in ASCII, CSV, HTML, or any combination) and/or create a caliper database in a single run (using the syntax caliper measurement ...).

caliper can also:

  • Generate a report from a previously created database (caliper report),

  • Merge or diff the data from multiple databases (caliper merge and caliper diff),

  • Analyze performance metrics from one or more previously created databases (caliper advise), or

  • Generate descriptions of reports and CPU events (caliper info).

The measurement argument to the caliper command is really the name of a measurement configuration file which determines what measurement to make and how to make it. You can use the standard measurement configuration files supplied with caliper or you can create your own. You can also override measurement configuration file settings on the command line or in a caliper initialization file (.caliperinit).

One example of a measurement is fprof, which tells caliper to collect the data needed to produce a flat profile report based on CPU cycles.

Another example of a measurement is dcache, which produces a report detailing where data cache misses have occurred during execution of the program. See the EXAMPLES section for examples of using fprof, dcache, and other measurements.

If an executable has been stripped of local symbols, caliper can only report names for global functions. If no symbol table (or debug information) exists at all, caliper will only report address information.

Like all performance measurement tools, caliper can affect the runtime performance characteristics of the program being measured.

Some measurements, such as ecount, have negligible impact while dynamic instrumentation-based measurements (acount, cgprof, fcount, fcover) can have a large effect. Take this "Heisenberg" effect into consideration when interpreting any performance data.

When making measurements, performance data is always saved to a caliper database. You can use the --database option to specify the name and location of the database caliper creates.

If you do not use --database, then the database (named for the type of measurement made) will be saved in the ./.hp_caliper_databases directory (this default can be changed with the CALIPER_DATABASES environment variable).

The most recently created measurement database is pointed to by an automatic "latest" symlink in the CALIPER_DATABASES directory. If a simple database name is given on a caliper report or advise run, then the CALIPER_DATABASES directory is searched after the current directory for the database. For example, caliper report fprof reports from the most recent fprof measurement run, and caliper report latest reports from the latest measurement run of any type.
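The default-location rule above can be mirrored in a small shell helper. This is only a sketch: the function name caliper_db_dir is hypothetical (it is not part of caliper), and it relies on nothing beyond the documented default directory and the CALIPER_DATABASES variable.

```shell
# Hypothetical helper (not part of caliper): print the directory in which
# caliper is documented to keep databases when --database is not given.
caliper_db_dir() {
    # CALIPER_DATABASES overrides the documented default directory.
    printf '%s\n' "${CALIPER_DATABASES:-./.hp_caliper_databases}"
}

# The "latest" symlink is documented to live in that same directory.
echo "database directory: $(caliper_db_dir)"
echo "most recent run:    $(caliper_db_dir)/latest"
```

A wrapper script could use such a helper to locate the newest database before calling caliper report.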

In addition to measuring and reporting performance data, caliper can also analyze collected data and make suggestions for improving your program's performance. This analysis is driven by a set of rules which look for specific data metrics indicating typical performance problems.

You can also write your own rules customized to your application. This is an iterative process where you first make one or more caliper performance runs saving the results in databases, run caliper advise on those databases, review the suggestions, make changes to your program and/or how it is built, and repeat.

Note that because every program is unique, not all of the suggestions will apply.

There are several defaults assumed when portions of the caliper command line are omitted. If measurement is omitted, but either a program name or process ID list is given, then the scgprof measurement is assumed. If report is omitted, but a database is given, then report is assumed. If report is given, but no databases are listed, then latest is assumed. Finally, if advise is given, but no databases are listed, then all of the databases in the CALIPER_DATABASES directory are analyzed.

Finally, there is a full Graphical User Interface (GUI) available as an alternative to the command-line user interface. The GUI can be used to set up and make measurements as well as graphically explore collected performance data. There are two ways to invoke the GUI:

  • The first is to run it on the local machine with the command caliper -g.

  • The second is to install a remote GUI client on a Windows or Linux desktop system, run that program, and remotely connect to an Integrity server on which the measurements will be made.

Measurement Configuration Files

The measurement argument to the caliper command is actually the name of a measurement configuration file. Measurement configuration files determine which measurements caliper makes. You can use any of caliper's standard measurement configuration files or you can create and use your own.

The measurement parameter in the command line can be a simple, relative, or absolute file name. If a simple file name is given, then caliper first looks in the current working directory for the file. If not found, then caliper looks in the caliper_root/config directory. By default, caliper is installed in /opt/caliper on HP-UX, and /opt/hp-caliper on Linux.

Many of the settings in the measurement configuration file can be overridden by their corresponding options on the command line.

Note that when making your own measurement configuration file, the first line of the file must be a comment which begins with

# caliper

to pass a caliper validity check.
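Before handing a custom measurement configuration file to caliper, the first-line requirement can be checked from the shell. The sketch below is an illustration only: has_caliper_header is a hypothetical helper, and it tests nothing except that the first line starts with the "# caliper" comment.

```shell
# Hypothetical helper (not part of caliper): succeed if the first line of
# the file starts with the "# caliper" comment described above.
has_caliper_header() {
    first_line=$(head -n 1 "$1") || return 1
    case $first_line in
        "# caliper"*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Example: build a throwaway file and check it. Only the first line matters
# here; the rest of a real configuration file is not shown.
tmp_cfg=$(mktemp)
printf '%s\n' '# caliper' > "$tmp_cfg"
if has_caliper_header "$tmp_cfg"; then
    echo "header ok"
else
    echo "missing '# caliper' first line"
fi
rm -f "$tmp_cfg"
```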

HP Caliper provides the following standard measurement configuration files. Also see the PLATFORM-SPECIFIC ADDENDA section for additional platform-specific measurement configuration files that may be supplied.

alat

Measures failed ALAT checks.

branch

Measures branch (mis-)predictions.

cstack

Measures sampled call stack profile.

cycles

Measures sampled cycle profiles. Available only on dual-core processors.

dcache

Measures data cache misses.

dtlb

Measures data tlb misses.

ecount

Measures total cpu event counts.

fprof

Measures a flat profile of sampled instruction addresses.

icache

Measures instruction cache misses.

itlb

Measures instruction tlb misses.

pmu_trace

Collects per-(kernel)thread traces of sampled cache misses, TLB misses, ALAT misses, branch mispredictions, instruction addresses, and CPU events.

scgprof

Creates a call graph profile using sampled branch data.

traps

Profiles traps, interrupts, and faults.

Options

There are a number of command line options available which alter caliper's operation. Option names and literal arguments can be abbreviated to their shortest, non-ambiguous spelling. Although the command line synopsis above shows caliper options following measurement, in reality they can precede and/or follow it.

In the option descriptions below, lower-case text (computer/bold font) is a literal which is typed as shown and upper-case text (italics or underline font) is descriptive and must be replaced with real values.

See PLATFORM-SPECIFIC ADDENDA below for additional options that may be available.

The following option can be used to supply caliper options from a text file:

--options-file=OPTIONS_FILE or -f=OPTIONS_FILE

Specifies a text file (OPTIONS_FILE) containing a list of caliper command-line options (separated by spaces or newlines). You can also use an options file to specify a caliper measurement configuration file as well as the application to be profiled and its arguments.
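As an illustration, an options file can be written from the shell and then named on the command line. The sketch uses only options documented on this page; the one-option-per-line layout, the report file name, and the ./my_app program are assumptions made for the example.

```shell
# Write an options file that uses only options documented on this page.
# One option per line is an assumption; the text says options may be
# separated by spaces or newlines.
opts_file=$(mktemp)
cat > "$opts_file" <<'EOF'
--duration=60
--output-file=fprof_report.txt
EOF

# Print the command rather than running it, so the sketch also works
# where caliper is not installed. ./my_app is a placeholder program.
echo "caliper fprof --options-file=$opts_file ./my_app"
```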

Data Collection Options

The command line options which affect data collection are:

--branch-sampling-spec=EVT_PERIOD[,VARIATION[,CPU_EVENT[:EVENT_PARAM]...]]

  • Controls the branch sampling rate for an scgprof collection. See the --sampling-spec option below for argument details.

--database=DATABASE[,unique]

(Can also be specified with the -d option.)

This saves measurement results to the named database.

Performance data is always saved to a caliper database, whether one is explicitly specified or not. If the --database option is used to specify the explicit name and location of the database, then that database is used.

If the --database option is not given, then a database (named for the type of measurement made) will be saved in the ./.hp_caliper_databases directory (this default can be changed with the CALIPER_DATABASES environment variable). The most recently created measurement database is pointed to by an automatic "latest" symlink in the CALIPER_DATABASES directory.

The optional unique qualifier will append the process ID of the HP Caliper process to the name of the database it is writing to, e.g., mydb-21952.

--data-summary

Valid only for the dcache measurement, this option specifies that caliper should additionally record (and report) global variables and process regions (stack, heap, data, etc.) associated with data cache misses.

--duration=ELAPSED_TIME

(Can also be specified with the -e option).

Specifies how long the measured application should run (in seconds) before caliper stops measuring it. When caliper stops measuring, the application is resumed and runs freely.

--event-defaults=EVENT_PARAM[:EVENT_PARAM]...

Specifies the default CPU event parameters. These apply to all CPU events unless an event-specific setting is provided for the given parameter.

EVENT_PARAM is defined as:

[{privilege-level-mask|PLM}=]{all|user|kernel} | {threshold|T}=THRESHOLD

PLM determines the privilege level setting for a given metric. By default, metrics are measured when your application runs in user space. The privilege levels available are: user, kernel, and all.

THRESHOLD is an integer value that determines the semantics of the counts reported by HP Caliper. When the threshold is zero (the default), the reported count is the number of occurrences of the event. When the threshold is not zero, the reported count is the number of CPU cycles during which the number of occurrences of the event met or exceeded the threshold.
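The effect of THRESHOLD on the reported count can be modeled with a few lines of shell. This is a toy model of the rule described above, not caliper code: reported_count is a hypothetical function, and the per-cycle occurrence numbers are made up for the example.

```shell
# Hypothetical model (not caliper code): given a threshold and the number
# of event occurrences in each CPU cycle, print the count that would be
# reported under the rule described above.
reported_count() {
    threshold=$1
    shift
    count=0
    for occurrences in "$@"; do
        if [ "$threshold" -eq 0 ]; then
            # Threshold 0: count every occurrence of the event.
            count=$((count + occurrences))
        elif [ "$occurrences" -ge "$threshold" ]; then
            # Non-zero threshold: count cycles that meet or exceed it.
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Four cycles with 0, 2, 1, and 3 occurrences of the event.
reported_count 0 0 2 1 3   # prints 6 (total occurrences)
reported_count 2 0 2 1 3   # prints 2 (cycles with at least 2 occurrences)
```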

--frame-depth=COUNT

Valid only for the cstack measurement, this option specifies the maximum number of frames to unwind while collecting call stack samples. The default depth is 32.

--metrics=EVENT_SET[:all|:user|:kernel][,EVENT_SET]...

--metrics=CPU_EVENT[:EVENT_PARAM]...[,CPU_EVENT]...

  • (Can also be specified with the -m option.)

  • Specifies the event set or CPU events to measure. If no event is specified (or --metrics= is specified), then no metrics will be reported. (Note: Specifying --metrics= is not valid for the cpu and ecount measurements.)

  • EVENT_SET is a predefined collection of CPU events you specify only with the measurement, cpu. (See CPU Metrics EVENT_SET Description below for more information.)

  • CPU_EVENT is a CPU event as recorded by the Performance Monitoring Unit (PMU) of the processor. You can change the default CPU events recorded for any of the following measurements: alat, branch, cycles, dcache, dtlb, ecount, fprof, icache, itlb, pmu_trace, traps.

  • You can list CPU events available (along with descriptions) by using caliper info (see Information Options below). Defaults specific to a given measurement can be found in the measurement's configuration file.

  • EVENT_PARAM allows you to change the privilege level (default: user) and threshold (default: 0) used when counting events. See --event-defaults above for the syntax of EVENT_PARAM.

--module-default=all|none

Specifies the default setting for load module inclusion in the measurement.

--module-exclude=MODULE[:MODULE]...

Specifies the load modules to be excluded from measurement. Module names can be given as a simple file name (libapplib1.so) which matches libraries of this name in any directory; a full-path file name (/home/dev/libs/libapplib1.so) which matches only this one specific library; or a full-path directory name (/usr/lib/) which matches all libraries within this directory or any lower sub-directories.

Note that the trailing / is required to distinguish a directory name.

For instrumentation-based measurements (acount, cgprof, fcount, fcover), the specified load modules are not instrumented; for all other measurements, any samples in the specified load modules are simply discarded.

--module-include=MODULE[:MODULE]...

Specifies the load modules to be included in the measurement. Module names can be given as a simple file name (libapplib1.so) which matches libraries of this name in any directory; a full-path file name (/home/dev/libs/libapplib1.so) which matches only this one specific library; or a full-path directory name (/usr/lib/) which matches all libraries within this directory or any lower sub-directories.

Note that the trailing / is required to distinguish a directory name.
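The three forms of MODULE accepted by --module-exclude and --module-include can be sketched as a shell matcher. This is a hypothetical model for illustration, not caliper's implementation; module_matches and the sample paths are invented for the example.

```shell
# Hypothetical model (not caliper code) of the three documented forms of
# MODULE: directory (trailing /), full path, and simple file name.
module_matches() {
    pattern=$1
    path=$2
    case $pattern in
        */) # Directory: matches anything in it or below it.
            case $path in
                "$pattern"*) return 0 ;;
            esac ;;
        /*) # Full path: matches only that one file.
            [ "$path" = "$pattern" ] && return 0 ;;
        *)  # Simple name: matches that file name in any directory.
            [ "${path##*/}" = "$pattern" ] && return 0 ;;
    esac
    return 1
}

module_matches libapplib1.so /home/dev/libs/libapplib1.so && echo "simple name matches"
module_matches /usr/lib/ /usr/lib/sub/libfoo.so && echo "directory matches"
module_matches /usr/lib /usr/lib/sub/libfoo.so || echo "no trailing /: treated as a file, no match"
```

The last call shows why the trailing / matters: without it, the name is read as a single file rather than a directory.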

--module-search-path=DIRECTORY[:DIRECTORY]...

Specifies a list of directories to be searched when a load module file (executable or shared library) cannot be found. A load module may not be found if the measured process uses chroot(2) or chdir(2) and then loads libraries or executes other binaries using relative paths. (See also the entry for this option in the section, Reporting Options.)

--process={default|all|root|root-forks|[some:][OPT,...]PATTERN[,PATTERN]...}

  • (Can also be specified with the -p option).

  • Specifies which processes in an application's process tree should be measured.

  • Use root to measure only your application's root process.

  • Use root-forks to measure your application's root process and any processes it forks.

  • Use all (the default) to measure all your application's processes.

  • For more information, see Process Selection below.

--sampling-spec=TIME_PERIOD

--sampling-spec=EVT_PERIOD[,VARIATION[,CPU_EVENT[:EVENT_PARAM]...]]

  • (Can also be specified with the -s option).

  • Controls the sampling rate and the event that triggers samples.

  • The first form (--sampling-spec=TIME_PERIOD) is used only for the cpu and cstack measurements.

  • The second form is used for all other sample-based measurements (fprof, dcache, icache, etc.).

  • TIME_PERIOD is a sampling period in seconds, milliseconds, or microseconds (specified as Ns, Nms, or Nus, respectively, where N is an integer).

  • For the cpu measurement, the default TIME_PERIOD is 8ms of CPU time (see also --cpu-aggregation=COUNT).

  • For the cstack measurement, the default TIME_PERIOD is 100 milliseconds (wall-clock time). Note that time is measured in CPU cycles (for the cpu measurement) or real time (for the cstack measurement).

  • EVT_PERIOD specifies how many sampling events should occur between samples.

  • VARIATION specifies how much to vary the number of events between samples (may be specified as either an exact count or as a percentage of the sampling rate if followed by %).

  • CPU_EVENT specifies the CPU event to use for sampling. You can list CPU events available (along with descriptions) by using caliper info (see Information Options below).

  • EVENT_PARAM allows you to change the privilege level (default: user) and threshold (default: 0) in effect for the CPU event counter used to trigger samples. See --event-defaults above for the syntax of EVENT_PARAM.
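To compare TIME_PERIOD values written with different unit suffixes, they can be normalized to microseconds. The helper below is a sketch: period_to_us is a hypothetical name, and it handles only the Ns, Nms, and Nus spellings described above.

```shell
# Hypothetical helper (not part of caliper): convert a TIME_PERIOD written
# as Ns, Nms, or Nus into microseconds.
period_to_us() {
    case $1 in
        *us) echo "${1%us}" ;;
        *ms) echo $(( ${1%ms} * 1000 )) ;;
        *s)  echo $(( ${1%s} * 1000000 )) ;;
        *)   echo "unrecognized TIME_PERIOD: $1" >&2; return 1 ;;
    esac
}

period_to_us 8ms     # prints 8000   (documented cpu default)
period_to_us 100ms   # prints 100000 (documented cstack default)
period_to_us 2s      # prints 2000000
```

The us and ms patterns are tested before the plain s pattern, because every value ending in "us" or "ms" also ends in "s".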

--scope={process|system|PSET_LIST}[,attr-mod|,attr-proc|,attr-none]

Defines a measurement's scope.

PSET_LIST is defined as: pset=pset_id [:pset_id]...

Caliper can measure activity on individual processes (process scope), on all CPUs in the system (system scope) or, on HP-UX systems, on all CPUs in selected processor sets (pset scope).

The system and pset scopes are supported for all PMU-based measurements (alat, branch, cpu, cycles, dcache, dtlb, ecount, fprof, icache, itlb, pmu_trace, traps). The default scope is process.

With system and pset scopes, the qualifiers attr-mod, attr-proc, and attr-none can be specified. For measurements involving PMU samples, this determines how such samples are attributed. attr-mod is the default, and tells caliper to attribute samples to individual processes and their load modules, whenever possible.

attr-proc causes attribution simply to processes alone; attr-none specifies no process attribution at all.

In all three qualifier cases, samples recorded in the kernel will be attributed to kernel modules, if possible. The -w option is a shortcut for --scope=system,attr-mod.

When the scope is system, the command-line arguments program and program_args should not be provided. For more information, see System-Wide Measurements below.

--thread=sum-all|all

-t=sum-all|all

Specifies how thread data should be collected and reported. Specify all to collect and report data per thread. (The -t option is a shortcut for --thread=all). Specify sum-all to collect and report data summed across threads. Default: sum-all.

Note that this option is currently only supported for the following measurements: alat, branch, cstack, cycles, dcache, dtlb, fprof, icache, itlb, traps.

Reporting Options

The command line options which affect reporting are:

--callpath-cutoff=PERC_CUTOFF[,CUM_PERC_CUTOFF[,MIN_COUNT]]

Valid only for the cstack measurement, this option specifies a cutoff value that limits hot call paths reported in Hot Call Paths sections.

Reporting of call paths stops when, for the given sort metric, a call path is encountered whose associated metric percentage is below PERC_CUTOFF (default 1.0) or when the CUM_PERC_CUTOFF has been met or exceeded (default 100.0).

The MIN_COUNT argument sets the minimum number of call paths to be displayed (default: 5) regardless of the settings for PERC_CUTOFF and CUM_PERC_CUTOFF.

--context-lines=COUNT_SOURCE[,COUNT_DISASSEMBLY]

Specifies that function details should show at least COUNT_SOURCE source lines (default: 5 for source-only reports or 0 for reports with disassembly code) before and after reporting a source line entry with associated performance data.

Set COUNT_SOURCE to all to report all source lines for reported functions. As with COUNT_SOURCE, set COUNT_DISASSEMBLY to show context disassembly lines (default: 3).

Applies to PMU histogram reports only.

--csv-file=CSV_FILE[,append|,create][,per-process|,shared][,unique]

Specifies a file in which to write a caliper performance report in Comma-Separated-Values format (CSV) for use in a spreadsheet or for further processing.

The file can be opened in append or create mode (default: create). Multi-process reports can be generated per-process (exec name is appended to each file) or to a single, shared file (default: shared). Specify unique to have the process ID appended to the report file name.

--description-details={all|none|[target][:processor][:run][:times][:sampling][:images][:help]}

  • Controls which subsections are included in the description section at the top of each report.

  • Default: target:processor:times:sampling.

  • Specify all to include all subsections, none to exclude all subsections, or specify a list of the subsections you want included. The list can include one or more of the following subsection identifiers (shown with their associated subsection title):

target

Target Application

processor

Processor Information

run

Run Information

times

Target Execution Time

sampling

Sampling Specification

images

Load Modules Included

help

Report Help

--detail-cutoff=PERC_CUTOFF[,CUM_PERC_CUTOFF[,MIN_COUNT]]

Specifies a cutoff value that limits functions reported in function details sections.

Reporting of functions stops when, for the given sort metric, a function is encountered whose associated metric percentage is below PERC_CUTOFF (default 1.0) or when the CUM_PERC_CUTOFF has been met or exceeded (default 100.0).

The MIN_COUNT argument sets the minimum number of functions to be displayed (default: 0) regardless of the settings for PERC_CUTOFF and CUM_PERC_CUTOFF.

Applies to PMU histogram reports only.
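The interaction of PERC_CUTOFF, CUM_PERC_CUTOFF, and MIN_COUNT can be illustrated with a small filter. This is one reading of the rule described above, not caliper's own code: apply_cutoff is a hypothetical function, and it assumes the percentages are passed already sorted in descending order by the sort metric.

```shell
# Hypothetical model (not caliper code).
# usage: apply_cutoff PERC_CUTOFF CUM_PERC_CUTOFF MIN_COUNT pct...
# Percentages must already be sorted from largest to smallest.
apply_cutoff() {
    perc=$1
    cum=$2
    min=$3
    shift 3
    printf '%s\n' "$@" | awk -v perc="$perc" -v cumcut="$cum" -v min="$min" '
        {
            # Once MIN_COUNT entries are out, stop at the first entry below
            # PERC_CUTOFF or when the running total has reached the
            # cumulative cutoff.
            if (NR > min && ($1 < perc || total >= cumcut)) exit
            print $1
            total += $1
        }'
}

apply_cutoff 1.0 100.0 0 40 30 20 0.5 0.4   # prints 40, 30, 20 (one per line)
apply_cutoff 1.0 100.0 5 40 30 20 0.5 0.4   # prints all five values
```

In the second call MIN_COUNT forces the two small entries into the output even though they fall below PERC_CUTOFF.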

--group-by=executable|module|none

Specifies how data for matching processes and modules (those with matching basenames) is combined in reports.

Specify executable to have data for matching processes combined whenever possible. This is the default.

Specify module to ignore individual processes, and create a "module-centric" report for matching modules across all measured processes.

Specify none to combine no data across any processes.

--html=[OUTPUT_DIR[,STYLE[,ENTRIES_PER_PAGE]]]

Writes performance data as an HTML-formatted report in directory OUTPUT_DIR (default: ./Caliper_HTML).

STYLE specifies the color theme of the report. STYLE can be set to black, gold, or white (default: white).

ENTRIES_PER_PAGE specifies the number of entries on each web page (default: 20). The --html option is supported for the following reports: alat, branch, cgprof, cycles, dcache, dtlb, fprof, icache, itlb, scgprof, and traps.

--info

Causes HP Caliper to append help information to the end of textual reports.

--kernel-path=PATH

Specifies the path to the kernel file you want HP Caliper to use for symbol lookup and disassembly.

This option only applies when kernel profiling is involved, typically when the sampling specification for the measurement has a privilege level mask of kernel or all.

--latency-buckets=TRUE|FALSE

Specifies whether or not the latency bucket information should appear in dcache reports (default: TRUE).

This option is used only with the dcache measurement.

--module-search-path=DIRECTORY[:DIRECTORY]...

Specifies a list of directories to be searched when a load module file (executable or shared library) cannot be found.

A load module file may not be found if the load module is not available from the location recorded at data collection time.

--output-file=OUTFILE[,append|,create][,per-process|,shared][,unique]

(Can also be specified with the -o option.)

Specifies the file caliper writes its report to (default is stdout). The file can be opened in append or create mode (default: create).

Multi-process reports can be generated per-process (exec name is appended to each file) or to a single, shared file (default: shared).

Specify unique to have the process ID appended to the report file name.

--per-module-data=TRUE|FALSE

Specifies whether or not caliper should report functions in one list (default: --per-module-data=FALSE), or report functions grouped by load module (--per-module-data=TRUE).

The following measurements support --per-module-data=TRUE: alat, branch, cycles, dcache, dtlb, fprof, icache, itlb, and traps.

--percent-columns={total|cumulative|total:cumulative}

Specifies what types of percentages are shown in reports.

Specify total to have a column showing percentage of samples with respect to total number of samples taken.

Specify cumulative to have a column showing cumulative percentage of samples.

Applies to PMU histogram reports only.

--process-cutoff=PERC_CUTOFF[,CUM_PERC_CUTOFF[,MIN_COUNT]]

Specifies a cutoff value that limits processes reported in the process summary section.

Reporting of processes stops when, for the given sort metric, a process is encountered whose associated metric percentage is below PERC_CUTOFF (default 2.0) or when the CUM_PERC_CUTOFF has been met or exceeded (default 100.0).

The MIN_COUNT argument sets the minimum number of processes to be displayed (default: 5) regardless of the settings for PERC_CUTOFF and CUM_PERC_CUTOFF.

Applies to PMU histogram reports only.

--read-init-file=TRUE|FALSE

Specifies whether or not caliper should look for and read .caliperinit, the caliper initialization file (default: TRUE).

Caliper looks for a .caliperinit file in the current directory; if not found, caliper then looks in your home directory.

Settings in an initialization file take precedence over measurement configuration file settings. Command-line options take precedence over settings in an initialization file.
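The precedence order can be summarized as "the command line overrides .caliperinit, which overrides the measurement configuration file". A minimal sketch of that ordering, using a hypothetical effective_setting function that is not part of caliper:

```shell
# Hypothetical model (not caliper code): pick the effective value of one
# setting. Pass an empty string for a source that does not set it.
effective_setting() {
    config_value=$1    # from the measurement configuration file
    init_value=$2      # from .caliperinit
    cmdline_value=$3   # from the command line
    echo "${cmdline_value:-${init_value:-$config_value}}"
}

effective_setting statement all ""            # prints all
effective_setting statement all instruction   # prints instruction
```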

--report-details={all|none|statement|instruction|statement:instruction}

(Can also be specified with the -r option). For PMU histogram reports only. Specifies level of program detail reported (default: statement).

Specify statement to have data aggregated by source statement.

Specify instruction to obtain reports at the lowest level of granularity available.

Specify none to disable detail reports entirely. Specify all to print both source and instruction level details.

--report-details=[module][:directory][:file][:function][:unknown]

(Can also be specified with the -r option). For function coverage reports only. Specifies which coverage reports are produced.

Default: module:directory:file:function:unknown.

Specify module for the load module summary report, directory for the source directory summary report, file for the source file summary report, and/or function for the function detail report. Additionally specify unknown to include functions from unknown source files in the summary and detail coverage reports.

--skip-functions=FUNC[,FUNC]...

Valid only for cstack; specifies functions that are of no interest. Call stacks are not reported if their leaf routine is one of the specified functions.

--sort-by=METRIC

Specifies that performance data is to be sorted by values of given metric. (Default: See Metrics for Sorts/Cutoffs below.)

--source-path-map=PATHMAP[:PATHMAP]...

Specifies the PATHMAP used in finding source files (used for reporting source statements).

PATHMAP entries are separated by a colon (:) and applied in order until a file match is found.

Simple entries are prepended to file names; comma-separated entries specify to substitute the path to the left of the comma with the path to the right of the comma. Perl regular expressions are allowed in the left half of a substitution.

Applies to PMU histogram reports only.
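How a PATHMAP is walked can be sketched in shell. This model is an assumption-laden illustration, not caliper's implementation: resolve_source is a hypothetical function, substitution entries are treated as literal path prefixes (the Perl regular expression support is not modeled), and simple entries are joined to the file name with a /.

```shell
# Hypothetical model (not caliper code).
# usage: resolve_source PATHMAP FILE
# Prints the first candidate that exists as a file; fails if none does.
resolve_source() {
    pathmap=$1
    file=$2
    old_ifs=$IFS
    IFS=:
    set -f                      # no globbing while splitting on ":"
    for entry in $pathmap; do
        case $entry in
            *,*)                # substitution entry: FROM,TO
                from=${entry%%,*}
                to=${entry#*,}
                case $file in
                    "$from"*) candidate=$to${file#"$from"} ;;
                    *)        continue ;;
                esac ;;
            *)                  # simple entry: prepend to the file name
                candidate=$entry/$file ;;
        esac
        if [ -f "$candidate" ]; then
            IFS=$old_ifs
            set +f
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    IFS=$old_ifs
    set +f
    return 1
}

# Demo with a throwaway directory standing in for a relocated source tree.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/src"
: > "$demo_dir/src/main.c"
resolve_source "/no/such/dir:/build,$demo_dir" /build/src/main.c
rm -rf "$demo_dir"
```

In the demo the first entry yields no existing file, so the second entry is applied and /build is replaced with the temporary directory.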

--summary-cutoff=PERC_CUTOFF[,CUM_PERC_CUTOFF[,MIN_COUNT]]

Specifies a cutoff value that limits functions reported in function summary sections.

Reporting of functions stops when, for the given sort metric, a function is encountered whose associated metric percentage is below PERC_CUTOFF (default 0.1) or when the CUM_PERC_CUTOFF has been met or exceeded (default 100.0).

The MIN_COUNT argument sets the minimum number of functions to be displayed (default: 5) regardless of the settings for PERC_CUTOFF and CUM_PERC_CUTOFF.

Applies to PMU histogram reports only.

--system-model=MODEL_NUMBER

Specifies the system model number for reporting latency buckets with the dcache measurement.

This option is necessary if you want HP Caliper to report system-specific latency buckets on Linux. If you do not use this option, a default set of latency buckets will be used.

On HP-UX, HP Caliper automatically obtains the model number using the model command (see model(1)).

--traps-reported=TRAP_NAME[,TRAP_NAME]...

Specifies which traps, faults, and interrupts (collectively referred to as traps) caliper should report.

The traps measurement collects samples for 34 different traps, but only 6 trap types can be shown in a single report. You can re-report from the same HP Caliper database to see different sets of traps.

TRAP_NAME is an abbreviation for a trap name. The default TRAP_NAME values and their associated traps are:

ITLB

Instruction TLB fault

DTLB

Data TLB fault

UADREF

Unaligned Data Reference fault

GEXCP

General Exception

FPFLT

Floating-Point Fault

FPTRP

Floating-Point Trap

For a list of all available values for TRAP_NAME, run the caliper command:

caliper info -r traps

Advise Options

The command line options which affect advise-only runs are:

--advice-classes=[all]|[[general][:cpu][:memory][:io][:system]]

Specifies which classes of advice to report (default: all). Every piece of advice is classified by the performance area it applies to.

Specify one or more of general for general advice, cpu for advice related to basic instruction execution, memory for memory-related information, io for input-output performance issues, and/or system for advice about items such as system calls. Only advice in the selected classes will be printed.

--advice-cutoff=MIN_INDEX[,MIN_COUNT[,MAX_COUNT]]

Specifies how much advice to report. Each bit of advice has an index value reflecting its relative importance.

Advice is sorted with the most important items first and the list stops when the index value of an item is below MIN_INDEX (default: 5.0) or MAX_COUNT (default: 15) items have been printed.

The MIN_COUNT (default: 5) argument sets the minimum amount of advice to report regardless of the setting for MIN_INDEX.
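
As a rough Python sketch of how MIN_INDEX, MIN_COUNT, and MAX_COUNT combine (the function name is hypothetical, and the behavior when MIN_COUNT exceeds MAX_COUNT is not described in this manual and is not modeled):

```python
def select_advice(index_values, min_index=5.0, min_count=5, max_count=15):
    """Return the index values of the advice items that would be printed."""
    selected = []
    for value in sorted(index_values, reverse=True):
        if len(selected) >= max_count:
            break  # MAX_COUNT items have been printed
        if value < min_index and len(selected) >= min_count:
            break  # below MIN_INDEX and the minimum is already satisfied
        selected.append(value)
    return selected


print(select_advice([40.0, 22.0, 9.0, 6.0, 4.0, 3.0, 2.0, 1.0]))
```

In this example the item with index 4.0 is still printed, even though it is below MIN_INDEX, because only four items precede it.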

--advice-details=[all]|[[description][:improvement][:measurement][:explanation][:rule]]

Specifies how much detail to report for each piece of advice (default: all).

Each piece of advice may contain a brief description of what it is focused on (description), a suggestion for improving performance in this area (improvement), additional performance measurements which can be made to further explore this area (measurement), a more detailed explanation of what this performance area is (explanation), and the name of the rule generating this advice (rule).

Most rules will not have all of these components. One or more of these can be given to limit what gets reported.

--analysis-focus={[executable:]all|[executable:]NAME[,[executable:]NAME]...}

Specifies which executable program(s) to report on. By default, all (all) executables which have performance data in the given database(s) will be analyzed and a separate report produced for each.

To analyze only selected executables, list them by giving only their simple filename (no path information).

--rule-files=RULEFILE1[,RULEFILE2]...

Specifies a list of files which contain the analysis rules to use (default: default). Only the rule files listed (and any rule files that they include) will be used.

Each RULEFILE can be either a relative path, an absolute path, or a simple filename. Simple file names are first searched for in the current directory, then (if not found) in the user's personal rules directory (~/caliper_advisor), and finally (if still not found) in the directory of the running caliper (caliper_root/rules).
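
The search order for a simple rule file name can be sketched in Python. The function name and its parameters are hypothetical; only the three directories and their order are taken from the text above.

```python
import os


def find_rule_file(name, caliper_root, cwd=None, home=None):
    """Return the path of the rule file that would be used, or None."""
    cwd = os.getcwd() if cwd is None else cwd
    home = os.path.expanduser("~") if home is None else home
    if os.path.dirname(name):
        # Relative or absolute path: taken as given, no directory search.
        path = name if os.path.isabs(name) else os.path.join(cwd, name)
        return path if os.path.isfile(path) else None
    search_dirs = (
        cwd,                                   # 1. current directory
        os.path.join(home, "caliper_advisor"), # 2. personal rules directory
        os.path.join(caliper_root, "rules"),   # 3. rules shipped with caliper
    )
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

A file of the same name in the personal rules directory therefore hides the one under caliper_root/rules, and a copy in the current directory hides both.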

GUI Options

The command line options which affect the graphical user interface are:

--gui

(Can also be specified with the -g option.)

Start the caliper GUI to interactively measure and explore performance data.

--jre

Specify the location of the Java Runtime Environment (JRE) that will run the GUI. If --jre is not specified, caliper will first check if the environment variable JAVA_HOME is set to a JRE; if not, caliper will attempt to find a JRE via the PATH environment variable.

Information Options

The command line options which affect information-only runs are:

--cpu-counter=PARTIAL_COUNTER_NAME|KEYWORD|all

(Can also be specified with the -c option.)

Specifies that information about the cpu counters which match the given (partial) name or keyword be output.

The information fields to search are given with the --search option.

The information fields to output are given with the --report option. Use all to report on all cpu counters.

The --cpu-counter and --report options are mutually exclusive. If neither is given, then --cpu-counter is assumed.

--detail={all|[name][:abbreviation][:category][:title][:description]}

(Can also be specified with the -d option.)

Specifies which information fields to include in cpu counter reports. It can be all, or any combination of name, abbreviation, category, title, and description, separated by colons.

Default: name:abbreviation:title.

--output-file=OUT_FILE[,append|,create]

(Can also be specified with the -o option.)

Specifies the file in which to write the caliper information report (default is stdout).

The file can be opened in append or create mode (default: create).

--report=all|REPORT_TYPE

(Can also be specified with the -r option.)

Specifies that information about the report types which match the given (partial) REPORT_TYPE name be output. This is the same information which is included in measurement reports when the --info option is used.

Report types are alat, branch, dcache, dtlb, ecount, fprof, icache, itlb, pmu_trace, or scgprof.

Use all to report on all report types.

The --cpu-counter and --report options are mutually exclusive.

--search={all|[name][:abbreviation][:category][:title][:description]}

(Can also be specified with the -s option.)

Specifies which information fields are to be searched for cpu counter reports. It can be all, or any combination of name, abbreviation, category, title, and description, separated by colons.

Default: name:abbreviation.

Help and Version Options

The following general options can be used to get syntax help or print the caliper version:

-h or -?

Prints the quick help text. When used, must be used alone on the caliper command line.

--help or -H

Prints the long help text. When used, must be used alone on the caliper command line.

--version or -v

Prints the caliper version identification. When used, must be used alone on the caliper command line.

Specifying Settings with an Initialization File

You can save settings in a file, named .caliperinit, that HP Caliper automatically uses at start-up. Putting the options in an initialization file simplifies the command line you use to launch HP Caliper.

For example, you can specify global settings for all of your reports, such as system libraries to exclude and output file locations. With your preferences in the initialization file, you can then simply type:

$ caliper fprof

The resulting report would use your predefined preferences. Using this approach you could, for example, change your preferences without having to change the HP Caliper command line in a Makefile.

Note: Any option specified on the command line overrides the corresponding setting in an initialization file.

There are a number of reporting options not available from the command line that you can set in an option file. These are:

disasm_mark_branch_targets=TRUE|FALSE

Determines if targets of branch instructions are preceded by a colon (:) in disassembly.

Default: False

disasm_target_name_limit=LIMIT

Specifies the maximum number of characters to print for branch target symbols in disassembly.

Default: 30

use_parens_for_statement_data=TRUE|FALSE

If TRUE, statement-level data in reports is placed in parentheses.

Default: True

suppress_statement_data=TRUE|FALSE

If TRUE, no statement-level data will be reported.

Default: False

suppress_init_warnings=TRUE|FALSE

If TRUE, no warnings will be issued if unrecognized variables are detected in the initialization file or in measurement configuration files.

Default: False

The initialization file can be in the current working directory, your home directory, or both. When you start HP Caliper, it searches for an initialization file in this order:

1. .caliperinit in the current working directory

2. .caliperinit in your home directory

When caliper finds one, it runs with the settings in that initialization file.

An initialization file is a Python script, similar to a Caliper measurement configuration file. Here is a sample .caliperinit file:

application="ls"
if caliper_config_file == 'branch':
    process='all'
elif caliper_config_file == 'my_count':
    application="/opt/mpi/bin/mpirun"
    arguments = "-np 2 /proj/dynopt/test_fe/mpi_hello_world"
elif caliper_config_file == 'dcache':
    application="/opt/mpi/bin/mpirun"
    arguments = "-np 2 /proj/dynopt/test_fe/mpi_hello_world"
elif caliper_config_file == 'itlb':
    application="/opt/mpi/bin/mpirun"
    arguments = "-np 2 /proj/dynopt/test_fe/mpi_hello_world"
    module_exclude="/usr/bin/sh"

The syntax inside the initialization file is the same as in measurement configuration files. In particular, most long options that can be specified on the command line can also be specified in the initialization file. To form the variable name, replace each dash (-) in the long option name with an underscore (_).
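
The naming rule can be illustrated with a small Python helper. The helper itself is hypothetical and is not part of HP Caliper; it only shows how a long option maps to an assignment of the kind used in the sample file, and it assumes the option takes a string value.

```python
def option_to_assignment(option):
    """Turn '--long-option=value' into the equivalent variable assignment."""
    name, _, value = option.lstrip("-").partition("=")
    # Only the option name is rewritten; the value is left untouched.
    return "%s = %r" % (name.replace("-", "_"), value)


print(option_to_assignment("--module-exclude=/usr/lib/"))
print(option_to_assignment("--user-regions=rum-sum"))
```

Note that the dash inside the value rum-sum is preserved; only the option name changes.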

Process Selection

When dealing with multi-process applications, it is important to be able to select processes to be measured in a process tree. This section explains how to do this selection by using the --process option.

Caliper has a choice between three behaviors when considering what to do with a process:

measure

The process is measured. Caliper is informed of new processes generated via fork, vfork, or exec.

track

The process is not measured. Caliper is informed of new processes generated via fork, vfork, or exec.

ignore

The process is not measured. Caliper is not informed of new processes generated via fork, vfork, or exec.

Caliper will pick which behavior is chosen depending on the process option. See PLATFORM-SPECIFIC ADDENDA below for additional information on process selection.

This section uses the term root process. The root process is the process at the root of the process tree. It is either the process started by Caliper or the process to which Caliper attaches.

The simple options are:

--process=root

Only the root process is measured.

--process=root-forks

Only the root process and processes forked from the root process are measured.

--process=all

Every process in the process tree is measured. This is the default for all measurements.

--process=default

This option is available to be able to explicitly request the default behavior. This is equivalent to specifying all for all measurements.

The complex options are:

--process=[some:][OPT1,...]PATTERN

Each --process=some:... argument provided to Caliper is interpreted as an additional filter. Those filters are applied in order to each new process. If a filter matches the process, the behavior (measure, track, ignore) associated with the process is memorized. Caliper will use the behavior of the last matching filter to determine what to do with the process.

When no OPT1,... component is provided, the default interpretation is as follows: the PATTERN component is interpreted as a list of glob patterns separated by colons (:); if the basename of the executable matches any of those patterns, the process is measured, otherwise it is tracked.

The presence of keywords in OPT1,... modifies those semantics as follows:

measure

Processes matching this filter will be measured.

track

Processes matching this filter will be tracked.

ignore

Processes matching this filter will be ignored.

glob

The PATTERN is interpreted as a list of colon-separated glob patterns.

regexp

The PATTERN is interpreted as a Python/Perl regular expression that is tested using the search() function (i.e., any non-empty match will be considered a positive match).

file

The string used against the PATTERN is the basename of the main executable of the process.

arg0|argv0

The string used against the PATTERN is argument 0 of the process.

arg1|argv1

The string used against the PATTERN is argument 1 of the process.

root

The filter only matches the root process.

fork

The filter only matches processes created via fork().

exec

The filter only matches processes created by exec().

For keyword families measure|track|ignore, glob|regexp and file|arg0|arg1, only the last keyword used in each family is considered. For the keyword family root|fork|exec, multiple keywords will be considered as specifying a logical OR operation between the keywords.

The prefix some: is only necessary when no option is provided and the PATTERN could be mistaken for one of the simple options (root, root-forks, all, default).
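
The rule that the last matching filter decides the behavior can be modeled in a few lines of Python. This is a simplified sketch, not Caliper's implementation: filters are given as already-parsed (behavior, pattern type, PATTERN) tuples, only the file keyword (basename of the executable) is modeled, and treating a process that matches no filter as tracked is an assumption.

```python
import fnmatch
import os
import re


def filter_matches(kind, pattern, text):
    """Test one filter's PATTERN against the chosen string."""
    if kind == "regexp":
        # search() semantics: a match anywhere in the string counts.
        return re.search(pattern, text) is not None
    # glob: a colon-separated list in which any pattern may match.
    return any(fnmatch.fnmatchcase(text, p) for p in pattern.split(":"))


def choose_behavior(executable, filters, fallback="track"):
    """Return 'measure', 'track', or 'ignore' for a new process."""
    behavior = fallback
    name = os.path.basename(executable)
    for action, kind, pattern in filters:
        if filter_matches(kind, pattern, name):
            behavior = action  # a later match overrides an earlier one
    return behavior


filters = [
    ("measure", "glob", "*"),      # measure everything ...
    ("ignore", "regexp", "^sh$"),  # ... except the shell
]
print(choose_behavior("/usr/bin/ls", filters))
print(choose_behavior("/usr/bin/sh", filters))
```

Because filters are applied in order, placing the more specific filter last lets it override the general one.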

--process=custom:FUNCTION

Allows you to specify a Python function to be used as a filter for processes.

System-Wide Measurements

Only PMU-based measurements are available in system-wide mode (across all CPUs in the system, instead of selected processes).

Measurements involving dynamic instrumentation (see Measurement Categories) and cstack are not supported in system-wide mode. The measurement can occur at any privilege level; the default privilege level for system-wide mode is all: user and kernel space.

By default, samples are attributed to both a process and a load module, whenever possible. Alternatively, you can specify (via --scope) that samples be attributed to processes only or neither processes nor load modules. Both alternatives reduce the overhead of collecting and reporting performance data.

(Note that the attribution settings of --scope do not affect the attribution of samples in the kernel.)

In pmu_trace, addresses referring to user-space modules will not get resolved regardless of the sample attribution requested.

HP Caliper cannot locate an executable or a shared library on HP-UX if it is invoked using a relative path. In addition, at certain times, executables and shared libraries cannot be located even if they are specified with complete paths. If this problem occurs, the result can be a large number of samples reported as "unattributed". The workaround is to use the --module-search-path option to specify a list of directories where the executables and shared libraries are located.

Usage model:

  • Replace the executable invocation with --scope=system (or -w)

  • Define a measurement duration (--duration=seconds or -eseconds) or measure until SIGINT (Ctrl-C) is received.

On HP-UX, the --scope pset=pset_id[:pset_id]... option can be used to measure activity on all CPUs belonging to the specified processor sets. For example,

--scope pset=1:2

measures activity on all CPUs in processor sets 1 and 2. You can use the psrset command (see psrset(1)) with the -i option to find processor assignment for all processor sets in the system.

Metrics for Sorts/Cutoffs

The following report types support the use of the following metrics for sorting and applying cutoffs, where the default metric for sorting is shown enclosed in [ ]:

alat

[sampled-misses]

branch

target, branch-ways, [mispredict], back-end-only-mispredict

cstack

[samples], samples-running (HP-UX only), samples-blocked (HP-UX only)

cycles

[samples]

dcache

sampled-misses, [latency], avg-latency

dtlb

[sampled-misses], l2-fills, hpw-fills, soft-fills

fprof

[samples]

icache

sampled-misses, [latency], avg-latency

itlb

[sampled-misses], hpw-fills, soft-fills

scgprof

[samples], call-count, msecs-per-call

traps

[samples]

By default, traps are always sorted on the first trap specified with --traps-reported (or ITLB if --traps-reported is not used). Specifying --sort-by=samples sorts based on values in the "Trap Samples" column.

Cutoff settings are based on the same metric as the sort, by default. Use --summary-cutoff or --detail-cutoff to override the default behavior.

EXTERNAL INFLUENCES

Environment Variables

caliper recognizes the following environment variables:

CALIPER_DATABASES

Specifies the databases directory where implicit caliper databases (those not specified with a --database option) are stored. The default databases directory is ./.hp_caliper_databases.

CALIPER_OPTS

Specifies a set of caliper options which are used for every measurement run. The contents of CALIPER_OPTS are prepended to the command line before it is processed. It is possible to specify all caliper arguments and options via CALIPER_OPTS.
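
The effect of CALIPER_OPTS can be pictured with the following Python sketch. The function name is hypothetical, and splitting the variable with shell-like quoting rules (shlex) is an assumption; the manual states only that the contents are prepended to the command line.

```python
import shlex


def effective_arguments(argv, environ):
    """Return the argument list after CALIPER_OPTS has been prepended."""
    extra = shlex.split(environ.get("CALIPER_OPTS", ""))
    return extra + list(argv)


print(effective_arguments(
    ["fprof", "program"],
    {"CALIPER_OPTS": "--module-exclude=/usr/lib/"},
))
```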

PLATFORM-SPECIFIC ADDENDA

HP-UX

There are some situations where caliper cannot insert probe code in a program or portions of a program due to non-standard or unusual conditions, such as when an assembly routine violates standard runtime conventions. In such situations, caliper will issue warning messages and proceed to measure as much of the program as it can.

The following additional measurements are supplied with caliper on HP-UX:

acount

Measures basic block arc counts.

cgprof

Measures call graph profile. (This is an enhanced version of the gprof command.)

cpu

Measures per-process metrics based on sampled CPU events.

fcount

Measures function counts.

fcover

Measures function coverage.

The following additional options, option arguments, or option features are available on HP-UX:

--bus-speed=MHZ

Specifies the bus speed in MHz for the sysbus event set. If you specify the sysbus event set, you must use the --bus-speed option to provide bus speed.

--inlines|--noinlines

The --inlines option specifies that caliper should record and report inline functions, if data is available in the binary to discover such inlines.

--cpu-aggregation=COUNT

This option is valid only when using the cpu measurement. Specifies how many samples (sampling period specified using --sampling-spec=TIME_PERIOD option) will be aggregated into one aggregated sample.

If COUNT is 0 or 1, the samples will not be aggregated. By default 125 low-level samples will be aggregated into one user-reported sample.

--cpu-details=[statistics|means][:samples]

This option is valid only when using the cpu measurement. Specifies whether to print all 7 statistical metrics (statistics) or just the mean and standard deviation (means). samples controls the printing of each sample in addition to reporting the summary statistics.

The default is means (report only the mean and standard deviation values).

--duration

This option is not supported for cgprof, fcount, fcover, and acount runs.

--exclude-caliper=TRUE|FALSE

The option is only valid when -w (--scope=system) is specified. This option is used to exclude/include the Caliper process activity as part of the measurement.

The default is TRUE (exclude).

Note that this option is not available on Linux and the behavior there is equivalent to a setting of FALSE (include Caliper).

--exclude-idle=TRUE|FALSE

The option is only valid when -w (--scope=system) is specified. This option is used to exclude/include the idle task as part of the measurement.

The default is TRUE (exclude).

Note that this option is not available on Linux and the behavior there is equivalent to a setting of FALSE (include the idle task).

--html

This option is also supported for cgprof reports.

--kernel-path

The default kernel path used is: /stand/current/vmunix (HP-UX 11i v2 and later).

--measure-on-interrupts=on|off|only

This option controls whether the Itanium PMU is enabled while processing interrupts/traps.

on

means that the PMU is enabled all the time (during regular processing as well as interrupt processing).

off

means that the PMU is only enabled during regular processing and disabled during interrupt processing.

only

means that the PMU is enabled during interrupt processing and disabled during regular processing.

The default is on for --scope=system and cpu measurements.

The default is off for the following measurements when the measurement scope is process (--scope=process): alat, branch, cycles, dcache, dtlb, ecount, fprof, icache, itlb, pmu_trace, scgprof, traps.

Note that this option is not available on Linux and the behavior there is equivalent to a setting of on (PMU is enabled during regular processing as well as interrupt processing).

--memory-usage={all|[begin][:timed][:end][:PERIOD[s|m|h]]}

Controls the collection and reporting of memory usage data. Current memory use can be measured at any or all (all) of:

  • the beginning (begin) of the run,

  • periodically (timed) throughout the run, or

  • at the end (end) of the run.

When making timed measurements, current memory use is sampled every PERIOD number of seconds (default if no qualifier is given), minutes, or hours.

Only samples which show a difference in memory utilization from the previous sample are saved and reported to reduce the volume of data. No memory usage data is collected or reported unless the option is used.

This measurement can be made in conjunction with any Caliper measurement. This option is only available on HP-UX 11i v2 or later and with --scope=process measurement runs.
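
Two details of --memory-usage, the PERIOD qualifier and the rule that only changed samples are kept, can be sketched in Python. Both function names are hypothetical, and the sketch assumes that PERIOD is a whole number.

```python
def period_seconds(period):
    """Convert PERIOD[s|m|h] to seconds; seconds is the default unit."""
    units = {"s": 1, "m": 60, "h": 3600}
    if period[-1] in units:
        return int(period[:-1]) * units[period[-1]]
    return int(period)


def changed_samples(samples):
    """Keep only samples whose memory use differs from the previous kept one."""
    kept = []
    for timestamp, usage in samples:
        if not kept or kept[-1][1] != usage:
            kept.append((timestamp, usage))
    return kept


print(period_seconds("5m"))
print(changed_samples([(0, 100), (30, 100), (60, 120), (90, 120), (120, 100)]))
```

In the second call, the samples at 30 and 90 seconds are discarded because memory use did not change since the previous sample.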

--pbo-data-type=arc-stride|dcache

Controls the type of data collected when you use the HP compilers' +Oprofile=collect option.

Choose arc-stride to collect arc counts, stride data, or both (depending on which have been enabled at compile time).

Choose dcache to collect data cache miss data.

Default: arc-stride.

Alternatively, you can choose the type of data collected by assigning arc-stride or dcache to the environment variable, PBO_DATA_TYPE.

--report

This option also supports these report types: acount, cgprof, fcount, and fcover.

--system-usage={all|runstatus|syscalls|runstatus:syscalls}

Controls the collection and reporting of system usage data. Two types of system usage data can be collected: runstatus (how much time each process spent running, eligible to run but not running, and waiting) and syscalls (the count and time spent in every syscall called by a process). Specify all (the default) to collect both.

No system usage data is collected or reported unless the option is used.

This measurement can be made in conjunction with any Caliper measurement. This option is only available on HP-UX 11i v2 or later and with --scope=process measurement runs.

--user-regions=default|rum-sum

For runs involving the Itanium PMU, specifies whether the data should be collected for the entire run (default), or only in regions delimited by the PMU enable/disable instructions (rum-sum). For more information, see Limiting PMU Measurements below.

When attaching to a process to perform acount, cgprof, fcount, or fcover measurements, the dependent shared libraries of the program must be mapped as private before you can attach to the process. You can enable private mapping of the shared libraries by using the chatr command with the +dbg enable option on the program file (see chatr(1)). When attaching to a process for acount, cgprof, fcount, or fcover runs, caliper will remain attached to the target process until it exits (see also --duration=SECONDS).

Stopping caliper with the use of SIGINT (e.g., Ctrl-C in a terminal window) for cgprof, fcount, fcover, and acount runs will result in all processes being forcibly terminated after Caliper generates a performance report or writes data to a database.

CPU Metrics EVENT_SET Description

CPU Metrics measurement type requires HP-UX 11i V2 September 2004 OE (B.11.23.0409) or later. You can specify the event sets and sampling period with the --metrics and --sampling-spec options, respectively.

The --cpu-aggregation=COUNT option specifies how many samples will be aggregated into one aggregated sample. You can measure multiple event sets in the same run. By default, the overview metrics consisting of the following 8 event sets will be measured: cpi, stall, dispersal, l1icache, l1dcache, l2cache, tlb, fp.

Example:

caliper cpu -o cpu.txt program

This will run program, measuring and reporting the following metrics by taking one sample every 8 milliseconds: cpi, stall, dispersal, l1icache, l1dcache, l2cache, tlb, fp. By default 125 low-level samples will be aggregated into one user-reported sample resulting in one aggregated sample collected per second. The result is saved in the text file, cpu.txt.
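
The relationship between the sampling period and --cpu-aggregation is simple arithmetic, shown here as a Python sketch (the function name is hypothetical):

```python
def aggregated_interval_seconds(sampling_period_ms, aggregation_count):
    """Time covered by one user-reported sample, in seconds."""
    # A COUNT of 0 or 1 means that samples are not aggregated.
    samples_per_report = max(aggregation_count, 1)
    return sampling_period_ms * samples_per_report / 1000.0


# Defaults from the example above: 8 ms per sample, 125 samples aggregated.
print(aggregated_interval_seconds(8, 125))
```

With the defaults, 8 ms multiplied by 125 gives one aggregated sample per second.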

You can specify a comma-separated list of one or more predefined event sets. The following event sets are available; overview is the default.

brpath

Provides information on the dynamic mix of branch types, branch path distribution, branch per instruction, etc.

brpred

Provides metrics that are useful in assessing the effectiveness of branch prediction.

c2c

Provides metrics related to cache coherence activity.

cpi

Provides metrics related to Cycles Per Instruction (CPI)

cpubus

Provides information on the demand that a specific CPU presents to the CEC chip set, and the demand the CPU experiences due to the CEC traffic initiated by other CPUs or I/O components in the system.

cspec

Provides metrics on the effectiveness of control speculation.

dispersal

Provides a qualitative view of the parallelism that is available as seen at instruction dispersal.

dspec

Provides metrics on the effectiveness of data speculation.

l1dcache

Provides miss rate information for the L1 data cache.

l1icache

Provides miss and prefetch usage information for the L1 instruction cache.

l2cache

Provides miss rate information for the L2 unified cache. Not available on dual-core processors.

l2dcache

Provides miss rate information for the L2 data cache. Only available on dual-core processors.

l2icache

Provides miss rate information for the L2 instruction cache. Only available on dual-core processors.

l3cache

Provides miss rate information for the L3 unified cache.

overview

Provides an overview of processor activity by collecting multiple event sets.

On non-dual-core processors, the event sets used are: cpi, stall, dispersal, l1dcache, l1icache, l2cache, tlb, fp.

On dual-core processors, the event sets used are: cpi, stall, dispersal, l1dcache, l1icache, l2dcache, l2icache, tlb, fp, threadswitch.

Note that, on dual-core processors, specifying overview is equivalent to specifying:

--metrics=cpi,stall,dispersal,\

l1dcache,l1icache,l2dcache,l2icache,\

tlb,fp,threadswitch

queues

Provides BRQ (Bus Request Queue) metrics that may give some insight into possible system bus related performance problems.

stall

Provides metrics on primary CPU performance limiters by breaking the CPI into seven components.

sysbus

Provides metrics on system bus utilization. If you specify the sysbus event set, you must use the --bus-speed option to provide bus speed in MHz. For example, --bus-speed=200.

tlb

Provides metrics related to TLB misses.

threadswitch

Provides metrics on hyperthreading thread switch behavior. Only available on dual-core processors.

Limiting PMU Measurements

For measurements that involve the Itanium PMU, you can restrict measurements to specific parts of your application. The supported measurements are:

  • alat, branch, cycles, dcache, dtlb, ecount, fprof, icache, itlb, pmu_trace, scgprof, traps.

By default, HP Caliper measures PMU events for your entire program. However, using --user-regions=rum-sum allows you to restrict measurements to performance-sensitive regions of code.

To use this feature:

  • Modify the application source code to use the header file provided with HP Caliper. The default location of the header file is caliper_root/include/caliper_control.h.

  • In your source code, add the HP Caliper macros to enable and disable the Itanium PMU.

    • To enable the PMU, insert: CALIPER_PMU_ENABLE();

      Using CALIPER_PMU_ENABLE() enables the PMU for the current thread until the next CALIPER_PMU_DISABLE(). When the PMU is already enabled, CALIPER_PMU_ENABLE() does not have any effect.

    • To disable the PMU, insert: CALIPER_PMU_DISABLE();

      When the PMU is already disabled, CALIPER_PMU_DISABLE() does not have any effect.

  • Use the command-line option --user-regions=rum-sum or place user_regions="rum-sum" in a measurement configuration file.

    This option causes HP Caliper to allow the measured applications to control the PMU. When specified, the PMU is initially disabled and HP Caliper will not measure the application until the first CALIPER_PMU_ENABLE() is executed.

If you do not specify the --user-regions=rum-sum option, CALIPER_PMU_ENABLE() and CALIPER_PMU_DISABLE() do not have any effect and the instructions behave as no-ops.

Metrics for Sorts/Cutoffs Specific to HP-UX

Here is additional information on "Metrics for Sorts/Cutoffs" specific to HP-UX.

The following additional report types support the use of the following metrics for sorting and applying cutoffs, where the default metric for sorting is enclosed in [ ]:

acount

arc-count, [taken-count]

cgprof

[samples], seconds, call-count, msecs-per-call

fcount

[call-count]

fcover

address, name, reached-count, reached-percent, unreached-count, [unreached-percent]

Additional Environmental Variable on HP-UX

The following additional environment variable is available on HP-UX:

CALIPER_HOME

Specifies the (non-default) caliper_root location when caliper is automatically invoked.

This is only needed when caliper is not installed in its default location (/opt/caliper) and a program compiled with the +Oprofile=collect option (profile based optimization) is run.

Limitations

The current HP-UX version of caliper has the following limitations:

  • Only aggregated results can be produced for multi-threaded programs by the acount, cgprof, ecount, fcount, and fcover measurements.

  • Handwritten assembly functions which do not follow the standard language runtime conventions may not be properly measured for instrumentation-based reports: acount, cgprof, fcount, and fcover.

  • Only native Itanium programs produced by the HP C, C++ and Fortran 9x compilers can be measured. PA-RISC programs, although they can run on Itanium systems, cannot be measured.

  • The option --scope=system is only supported on HP-UX B.11.23.0409 or later.

  • The option --scope=system cannot be used while any other PMU measurement is running on the system.

  • The option --scope=system can only be used by privileged users, unless this security measure is disabled by setting the kernel tunable perfmon_allow_user_per_cpu to the value 1.

  • DLKM components are only listed if the user is privileged. Function-level information is not available for those modules.

LINUX

--kernel-path

There is no default kernel path used when sampling is done while in kernel mode. By default, only kernel module and function information is produced for samples; this option must be used to display disassembled instructions for kernel modules.

Limitations:

The current Linux version of caliper has the following limitations:

  • Correlating sample data to source files created with GNU compilers requires debug information created with the -g compiler option.

EXAMPLES

Here are some examples of common uses of caliper:

caliper ecount program

  • This will run program, measuring and reporting the total number of Itanium instructions executed (IA64_INST_RETIRED), the total number of nops executed (NOPS_RETIRED) and the total number of CPU cycles expended (CPU_CYCLES).

caliper pmu_trace --sampling-spec=10000,0,CPU_CYCLES program

  • This will run program, measuring and reporting the number of Itanium instructions executed (IA64_INST_RETIRED), the number of nops executed (NOPS_RETIRED), and the number of CPU cycles (CPU_CYCLES) every 10,000 CPU cycles (with no sampling variation). The pmu_trace measurement default is to sample every 50,000,000 CPU cycles.

CALIPER_OPTS="--module-exclude=/usr/lib/" caliper fprof program

  • This will run program, measuring and reporting a flat profile of sampled instruction addresses, excluding all system libraries.

caliper cstack program

  • This will run program, measuring and reporting a call stack profile by periodically sampling the application program counter and the call stack of each of its threads.

caliper report --detail=0 fprof

  • This will re-report the last fprof measurement run with all functions included in the report (no matter how little CPU time they used).

caliper info --detail=all L2

  • This will produce an information report including all details on all CPU events with "L2" in their names.

caliper info -r itlb

  • This will produce an information report describing the itlb measurement report.

caliper scgprof -w -e10

  • This will collect all activity across all CPUs in the system for a duration of 10 seconds, producing a sample-based call graph report.

caliper ecount -pall \
    sh -c '/usr/bin/ls; /usr/bin/echo done'

  • This will measure the /usr/bin/sh, /usr/bin/ls, and /usr/bin/echo executions.

caliper ecount --process=ls:echo \
    sh -c '/usr/bin/ls; /usr/bin/echo done'

  • This will measure both /usr/bin/ls and /usr/bin/echo processes.

caliper ecount --process=*[ho] \
    sh -c '/usr/bin/ls; /usr/bin/echo done'

  • This will measure both /usr/bin/sh and /usr/bin/echo processes.
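The two name-based --process examples above can be checked with a small Python sketch. It assumes that each colon-separated entry is a shell-style pattern compared against the base name of the executable, which is consistent with the results described above but is not taken from HP Caliper's implementation. The helper name_matches is hypothetical.

```python
import fnmatch
import os


def name_matches(path: str, process_option: str) -> bool:
    """Return True if the base name of path matches any ':'-separated pattern."""
    base = os.path.basename(path)
    return any(
        fnmatch.fnmatchcase(base, pattern)
        for pattern in process_option.split(":")
    )


candidates = ["/usr/bin/sh", "/usr/bin/ls", "/usr/bin/echo"]
for option in ("ls:echo", "*[ho]"):
    # Keep only the processes this option value would select.
    selected = [path for path in candidates if name_matches(path, option)]
    print(option, "->", selected)
```

Under these assumptions, ls:echo selects ls and echo, while *[ho] selects the names ending in "h" or "o", that is, sh and echo.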

caliper ecount --process='(arg1,regexp)tmp$' \
    sh -c '/usr/bin/ls /var/tmp; /usr/bin/echo tmp listed'

  • This will measure both the /usr/bin/ls and /usr/bin/echo processes, since the regular expression tmp$ matches their argument 1.
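The argument-based selection above can be illustrated in Python. This sketch assumes that the regular expression is searched for anywhere in argument 1 of each process, and it uses Python's re module, whose dialect may differ from the one HP Caliper accepts. The helper arg1_matches is hypothetical.

```python
import re
import shlex


def arg1_matches(command: str, pattern: str) -> bool:
    """Return True if argument 1 of command matches the regular expression."""
    argv = shlex.split(command)
    return len(argv) > 1 and re.search(pattern, argv[1]) is not None


commands = [
    "sh -c '/usr/bin/ls /var/tmp; /usr/bin/echo tmp listed'",
    "/usr/bin/ls /var/tmp",
    "/usr/bin/echo tmp listed",
]
for command in commands:
    print(command, "->", arg1_matches(command, r"tmp$"))
```

Argument 1 of the sh process is -c, which does not end in "tmp", so under these assumptions only the ls and echo processes are selected.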

caliper fprof -d fprof.db1 cc -g himom.c
caliper fprof -d fprof.db2 cc -O himom.c
caliper diff -o output fprof.db1 fprof.db2

  • This will create a report with the difference between the data collected in the two collection runs.
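The measure-twice-then-diff sequence above could be driven from a script. The following Python sketch is one possible wrapper, using only the commands and options shown in the example. The helper run_steps is hypothetical, and stopping at the first failing step is a design choice of this sketch, not documented caliper behavior.

```python
import shutil
import subprocess

# The three steps from the example above, as argument lists.
steps = [
    ["caliper", "fprof", "-d", "fprof.db1", "cc", "-g", "himom.c"],
    ["caliper", "fprof", "-d", "fprof.db2", "cc", "-O", "himom.c"],
    ["caliper", "diff", "-o", "output", "fprof.db1", "fprof.db2"],
]


def run_steps(commands):
    """Run each command in order; stop at the first failure."""
    if shutil.which("caliper") is None:
        print("caliper not found on PATH; nothing was run")
        return False
    for argv in commands:
        if subprocess.run(argv).returncode != 0:
            print("step failed:", " ".join(argv))
            return False
    return True


ok = run_steps(steps)
```

Stopping early avoids producing a diff report from a database that a failed collection run never wrote.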

caliper advise DB1 DB2

  • This will analyze the data in HP Caliper databases DB1 and DB2, and make suggestions for performance improvements.

HP-UX ONLY EXAMPLES

caliper cgprof program

  • This will run program, measuring and reporting an extended gprof-like call graph profile.

caliper cgprof --html=HTML program

  • This performs the same cgprof performance measurement as the previous example but produces an HTML-formatted report in directory HTML for browsing.

AUTHOR

HP Caliper was developed by the Hewlett-Packard Company.

FILES

caliper_root

Anchor location of caliper installation, default /opt/caliper (HP-UX) or /opt/hp-caliper (Linux).

caliper_root/LICENSE

The caliper license terms.

caliper_root/THIRDPARTYLICENSEREADME.txt

Contains the license terms for third-party software used by caliper.

caliper_root/bin/caliper

caliper executable.

caliper_root/config/

Directory containing standard measurement configuration files.

caliper_root/contrib/

Holds useful contributed files.

caliper_root/doc/

Online documentation directory.

caliper_root/examples/

Example files.

caliper_root/gui/

Contains the local GUI client files.

caliper_root/gui_clients/

Remote GUI client installation files.

caliper_root/html/

Contains support files for generated HTML measurement reports.

caliper_root/lib/python2.3/

Python support directory.

caliper_root/man/

Manpage directory.

caliper_root/rules/

Shared analysis rules directory.

~/caliper_advisor/

User personal analysis rules directory.

./.hp_caliper_databases/

Implicit databases storage directory.
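The caliper_root entries above can be expanded into absolute paths once the installation root is known. This Python sketch uses the two default roots listed under caliper_root. The operating system names used as dictionary keys and the helper expand_entries are assumptions made for illustration, and an actual installation may use a different root.

```python
import posixpath

# Default installation roots as listed under caliper_root above.
DEFAULT_ROOTS = {"HP-UX": "/opt/caliper", "Linux": "/opt/hp-caliper"}

# A few entries from the FILES list, relative to caliper_root.
RELATIVE_ENTRIES = ["bin/caliper", "config/", "doc/", "rules/"]


def expand_entries(system: str) -> list:
    """Return absolute paths for the listed entries on the given OS name."""
    root = DEFAULT_ROOTS[system]
    return [posixpath.join(root, entry) for entry in RELATIVE_ENTRIES]


for system in DEFAULT_ROOTS:
    print(system, expand_entries(system))
```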

SEE ALSO

aCC(1), cc(1), chatr(1), f90(1), ld(1).

Online help is available at caliper_root/doc/index.html (HTML format).

The online HP Caliper User Guide is located at caliper_root/doc/caliperug.pdf (PDF version) and caliper_root/doc/html/caliper/C/caliperug.html (HTML version).

The online HP Caliper Rule Writer Guide is located at caliper_root/doc/rule_writer_guide.pdf (PDF version).

Detailed information files describing each HP Caliper measurement report are located at caliper_root/doc/text/*.help.

There are complete lists of cpu_event values with full descriptions in caliper_root/doc/text/itanium_cpu_counters.txt and caliper_root/doc/text/itanium2_cpu_counters.txt.

cpu_event values (PMU events) are also described in the Intel(R) Itanium(tm) Processor Reference Manual for Software Development document.

The caliper website is at http://www.hp.com/go/caliper and contains additional technical information and updates.

Reference and context-sensitive help for the HP Caliper GUI is available by selecting the "Help Contents" or "Context-sensitive Help" items in the "Help" menu.