perfex 2.5.5 (Tungsten/Xeon)
The perfex tool from IA-32 PerfCtr is available on the Tungsten cluster.
Location: /usr/apps/tools/perfctr/bin
Available documentation (from the perfex source) is below.
Rick
NAME
perfex - a command-line interface to x86 performance counters
SYNOPSIS
perfex [-e event] .. [--p4pe=value] [--p4pmv=value] [-o file] command
perfex { -i | -l | -L }
DESCRIPTION
perfex executes the given command and, once it completes,
prints the values of the selected hardware performance counters.
OPTIONS
-e event | --event=event
Specify an event to be counted.
Multiple event specifiers may be given, limited by the
number of available performance counters in the processor.
The full syntax of an event specifier is "evntsel/escr@pmc".
All three components are 32-bit processor-specific numbers,
written in decimal or hexadecimal notation.
"evntsel" is the primary processor-specific event selection
code to use for this event. This field is mandatory.
"/escr" is used to specify additional event selection data
for Pentium 4 processors. "evntsel" is put in the counter's
CCCR register, and "escr" is put in the associated ESCR
register.
"@pmc" describes which CPU counter number to assign this
event to. When omitted, the events are assigned in the
order listed, starting from 0. Either all or none of the
event specifiers should use the "@pmc" notation.
Explicit counter assignment via "@pmc" is required on
Pentium 4 and VIA C3 processors.
The counts, together with an event description, are written
to the result file (default is stderr).
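The "evntsel/escr@pmc" syntax described above can be illustrated with a small shell sketch that splits a specifier into its three parts. The helper name parse_event_spec is hypothetical and is not part of perfex; it only mirrors the textual syntax and does not validate the numbers.

```sh
# Split a perfex event specifier "evntsel/escr@pmc" into its parts.
# parse_event_spec is a hypothetical helper, not part of perfex.
# Results are left in the variables evntsel, escr and pmc;
# escr and pmc are empty when the specifier omits them.
parse_event_spec() {
    spec=$1
    pmc=""
    escr=""
    case $spec in
        *@*) pmc=${spec##*@}; spec=${spec%@*} ;;    # text after "@"
    esac
    case $spec in
        */*) escr=${spec#*/}; spec=${spec%%/*} ;;   # text after "/"
    esac
    evntsel=$spec                                   # mandatory field
}

parse_event_spec "0x00039000/0x04000204@0x8000000C"
printf 'evntsel=%s escr=%s pmc=%s\n' "$evntsel" "$escr" "$pmc"
```

A specifier with only the mandatory field, such as 0x004100C0, leaves escr and pmc empty.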
--p4pe=value | --p4_pebs_enable=value
--p4pmv=value | --p4_pebs_matrix_vert=value
Specify the value to be stored in the auxiliary control
register PEBS_ENABLE or PEBS_MATRIX_VERT, which are used
for replay tagging events on Pentium 4 processors.
Note: Intel's documentation states that bit 25 should be
set in PEBS_ENABLE, but this is incorrect, and the driver
will reject a value that has it set.
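Because of the note on bit 25, it can help to check a candidate --p4pe value before passing it to perfex. The following is a minimal sketch; has_bit25 is a hypothetical helper, and the check only tests the bit, it does not predict any other driver validation.

```sh
# Bit 25 of PEBS_ENABLE, the bit the note above says the driver disallows.
PEBS_BIT25=$(( 1 << 25 ))

# has_bit25 (hypothetical helper): succeeds if bit 25 is set in $1.
has_bit25() {
    [ $(( $1 & PEBS_BIT25 )) -ne 0 ]
}

value=0x01000001
if has_bit25 "$value"; then
    echo "$value: bit 25 is set; clear it before using --p4pe"
else
    echo "$value: bit 25 is clear"
fi
```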
-i | --info
Instead of running a command, generate output which
identifies the current processor and its capabilities.
-l | --list
Instead of running a command, generate output which
identifies the current processor and its capabilities,
and lists its countable events.
-L | --long-list
Like -l, but list the events in a more detailed format.
-o file | --output=file
Write the results to file instead of stderr.
EXAMPLES
The following commands count the number of instructions retired
in user mode on an Intel P6 processor:
perfex -e 0x004100C0 some_program
perfex --event=0x004100C0 some_program
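The value 0x004100C0 can be taken apart with shell arithmetic. This sketch assumes the P6 event-select register layout from Intel's manuals (event code in bits 0-7, unit mask in bits 8-15, USR in bit 16, OS in bit 17, Enable in bit 22); that layout is not stated in the perfex documentation itself.

```sh
# Decode a P6 evntsel value, assuming the Intel P6 event-select layout.
evntsel=0x004100C0
event=$((      $evntsel         & 0xFF ))   # event code (0xC0 here)
unit_mask=$(( ($evntsel >> 8)   & 0xFF ))   # unit mask
usr=$((       ($evntsel >> 16)  & 1 ))      # count in user mode
os=$((        ($evntsel >> 17)  & 1 ))      # count in kernel mode
en=$((        ($evntsel >> 22)  & 1 ))      # counter enable
printf 'event=0x%02X unit_mask=0x%02X USR=%d OS=%d EN=%d\n' \
    "$event" "$unit_mask" "$usr" "$os" "$en"
```

Under that assumed layout, the value selects event 0xC0 with the USR and Enable bits set and the OS bit clear, which matches the "user mode" description.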
The following command does the same on an Intel Pentium 4 processor:
perfex -e 0x00039000/0x04000204@0x8000000C some_program
Explanation: Program IQ_CCCR0 with required flags, ESCR select 4
(== CRU_ESCR0), and Enable. Program CRU_ESCR0 with event 2
(instr_retired), NBOGUSNTAG, CPL>0. Map this event to IQ_COUNTER0
(0xC) with fast RDPMC enabled.
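The three numbers in this specifier can be rebuilt from the fields named in the explanation. The sketch below assumes the Pentium 4 bit positions from Intel's documentation (CCCR: required bits 16-17, ESCR select in bits 13-15, Enable in bit 12; ESCR: event select in bits 25-30, event mask starting at bit 9, USR in bit 2) and that bit 31 of the pmc value is the fast RDPMC flag. It also assumes a shell with 64-bit arithmetic.

```sh
# Rebuild the Pentium 4 instr_retired specifier from its fields.
# Bit positions are assumptions taken from Intel's Pentium 4 documentation.
cccr=$(( (3 << 16) | (4 << 13) | (1 << 12) ))  # required bits, ESCR select 4, Enable
escr=$(( (2 << 25) | (1 << 9)  | (1 << 2) ))   # event 2, mask bit 0 (NBOGUSNTAG), USR (CPL>0)
pmc=$((  (1 << 31) | 0xC ))                    # fast RDPMC flag, counter 0xC (IQ_COUNTER0)

spec=$(printf '0x%08X/0x%08X@0x%08X' "$cccr" "$escr" "$pmc")
echo "perfex -e $spec some_program"
```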
The following command counts the number of L1 cache read misses
on a Pentium 4 processor:
perfex -e 0x0003B000/0x12000204@0x8000000C --p4pe=0x01000001 --p4pmv=0x1 some_program
Explanation: IQ_CCCR0 is bound to CRU_ESCR2, CRU_ESCR2 is set up
for replay_event with non-bogus uops and CPL>0, and PEBS_ENABLE
and PEBS_MATRIX_VERT are set up for the 1stL_cache_load_miss_retired
metric. Note that bit 25 is NOT set in PEBS_ENABLE.
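As a cross-check of this explanation, the values on the command line can be decoded field by field. The bit positions are assumptions from Intel's Pentium 4 documentation (ESCR select in CCCR bits 13-15, event select in ESCR bits 25-30, event mask from ESCR bit 9, USR in ESCR bit 2); the mapping of ESCR select 5 to CRU_ESCR2 is likewise assumed from that documentation rather than from the perfex text.

```sh
# Decode the replay_event example, assuming Intel's Pentium 4 field layout.
cccr=0x0003B000
escr=0x12000204
pebs_enable=0x01000001

escr_select=$((  ($cccr >> 13) & 0x7 ))     # which ESCR the CCCR is bound to
event_select=$(( ($escr >> 25) & 0x3F ))    # event number
event_mask=$((   ($escr >> 9)  & 0xFFFF ))  # event mask bits
usr=$((          ($escr >> 2)  & 1 ))       # count at CPL>0
pebs_bit25=$((   ($pebs_enable >> 25) & 1 ))

printf 'ESCR select=%d event=%d mask=0x%04X USR=%d PEBS_ENABLE bit25=%d\n' \
    "$escr_select" "$event_select" "$event_mask" "$usr" "$pebs_bit25"
```

The decoded PEBS_ENABLE bit 25 is 0, consistent with the note that it must not be set.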
DEPENDENCIES
perfex works only on Linux/x86 systems that have been modified
to include the perfctr driver. This driver is available at
http://www.csd.uu.se/~mikpe/linux/perfctr/.
NOTES
perfex is superficially similar to the IRIX perfex(1) command.
The -a, -mp, -s, and -x options are not yet implemented.
Copyright (C) 1999-2003 Mikael Pettersson