*nix Documentation Project
·  Home
 +   man pages
·  Linux HOWTOs
·  FreeBSD Tips
·  *niX Forums

  man pages->IRIX man pages -> prof (1)              
Title
Content
Arch
Section
 

Contents


PROF(1)								       PROF(1)


NAME    [Toc]    [Back]

     prof - analyze SpeedShop performance data

SYNOPSIS    [Toc]    [Back]

     prof [options] [speedshop_data_file  | pixie_counts_file] ...
     prof [options] executable-name [speedshop_data_file  | pixie_counts_file]
     ...

DESCRIPTION    [Toc]    [Back]

     prof analyzes one or more data files generated by the SpeedShop
     performance tools and produces a report. (Note that most reports are
     formatted with long lines,	and should be viewed in	a window that is 135
     characters	wide, and printed in wide format.)  The	second form is used to
     analyze data files	generated by the SpeedShop performance tools if	the
     target program is not in the same directory as the	data files (in which
     case <executable-name> should be the path to the target program).

     Multiple files can	be included only if they are recorded from the same
     executable	with the same experiment type.	The default listing for	all
     experiments lists functions in descending order of	the appropriate
     exclusive (meaning	from within the	function, rather than included from
     calls it makes) performance metric.  (See below, FUNCTION LIST.)  Options
     allow sorting by calls or by inclusive metrics, for those experiments
     where the recorded	data supports them.  Where applicable, the
     -b[utterfly] option also produces a listing of callers and	callees	for
     each function, with attribution percentages and time or counts.  (See
     below, BUTTERFLY LIST.)  Addition listings	may also be generated; see
     below.

     The current implementation	supports the following SpeedShop experiments:

     usertime (callstack profiling, user+system	time trigger) causes the
     program to	be interrupted every 30	milliseconds during its	running	time
     (user or system mode, but not including any wait time), and to record the
     callstack at each interrupt.  It can show both inclusive and exclusive
     time.  usertime data is statistical in nature, and	will show some
     variance from run to run.

     totaltime (callstack profiling, wall-clock	time trigger) causes the
     program to	be interrupted every 30	milliseconds of	wall-clock time	during
     the run, and to record the	callstack at each interrupt.  It can show both
     inclusive and exclusive time.  totaltime data is statistical in nature,
     and will show some	variance from run to run.

     [f]pcsamp[x] asks the kernel to look at the user PC every 10
     milliseconds, and record a	histogram of the value of the program counter
     at	each clock tick, using 16-bit bins, one	for each PC value.  It can
     only show exclusive data, that is data about where	the program counter
     was, not the callstack to get there.  The [f] prefix specifies 1
     millisecond profiling, instead of 10 milliseconds.	 The [x] suffix
     specifies 32-bit count bins, instead of 16-bit.  *pcsamp* data is
     statistical in nature, and	will show some variance	from run to run.



									Page 1






PROF(1)								       PROF(1)



     *_hwc asks	the kernel to look at the user PC every	time the hardware
     performance counter specified by the experiment overflows,	and record a
     histogram of the value of the program counter at overflow.	 These
     experiments can only be run on R10000 machines; other machines do not
     have the hardware performance counters.  There are	a number of these
     experiments defined; see speedshop(1).  They can only show	exclusive
     data, that	is data	about where the	program	counter	was, not the callstack
     to	get there.  The	particular counter used	and its	overflow value are
     specified in the experiment.  Some	of the Hardware	Performance counter
     sampling experiments are statistical in nature, and will show some
     variance from run to run; others are exact, provided that the program
     executes the exact	same sequence of instructions.

     Among the interesting counter prefixes to use are:
      cy_ for cycle counting;
      gi_ for graduated	instructions;
      ic_ for primary instruction cache	misses;
      sic_ for secondary instruction cache misses;
      dc_ for primary data cache misses;
      sc_ for secondary	data cache misses;
      tlb_ for TLB misses;
      gfp_ for graduated floating-point	instructions; and
      fsc_ for failed store-conditional	instructions.

     *_hwctime (callstack profiling, R10K hardware counter overflow trigger)
     causes the	program	to be interrupted at every N overflows of the
     particular	counter	and to record the callstack at each interrupt.	The
     value of N	depends	on the particular counter chosen.  It can take the
     same set of prefixes as above, but	since callstacks are recorded, it can
     show both inclusive and exclusive time.  *_hwctime	data is	statistical in
     nature, and will show some	variance from run to run.

     ideal causes the code to be instrumented to count the number of times
     each basic	block is executed.  (A basic block is a	region of the program
     that can be entered only at the beginning and exited only at the end.)
     The data recorded also contains counts for	all function pointer calls.
     From this data, a machine model is	used to	compute	the exclusive time
     (cycles) spent in each function.

     Inclusive time computations, performed when the -b[utterfly] flag is
     specified to prof,	calculates the exclusive time for each routine as
     above, and	then propagates	the time to each caller	of each	routine	in
     proportion	to the number of calls.	 For example, if sin(x)	takes 1000
     cycles, and its callers - procedures foo()	and bar() - call it  25	and 75
     times respectively, 250 cycles are	attributed  to foo() and 750 to	bar().
     By	propagating cycles this	way, __start() should end up with all the
     cycles counted in the program.

     Note that the propagation according to the	number of calls	may not	be
     reasonable	for some routines, and may lead	to misleading reports.	For
     example, if a matrix-multiply routine is substituted for sin(x) in	the
     above example, and	bar's calls are	for 2X2	matrices, while	foo's calls



									Page 2






PROF(1)								       PROF(1)



     are for 100X100 matrices, the attribution distributes 3/4 of the time to
     bar, whereas nearly all the time really should be attributed to foo.

     fpe asks the floating-point exception library to trace all	floating-point
     exceptions, with their callstacks.	 These experiments show	listings
     similar to	those for usertime experiments,	except the data	reported is a
     count of floating-point exceptions, rather	than projected CPU time.

     io	asks for a trace of all	IO calls made by the program.  These
     experiments will show calls attributed to functions, and with
     -b[utterfly], can show which functions made which calls.

OUTPUT REPORTS    [Toc]    [Back]

     prof writes an analysis of	the performance	data to	stdout.	 The first
     thing in the list is a summary of the experiment, and description of the
     environment in which it was recorded.  That is followed by	a header that
     summarized	the particular data recorded.  Following that is the function
     list;  if -b[utterfly] is given, and the data supports it,	a butterfly
     list will be presented.

     If	-h[eavy] or -l[ines] is	given, for *pcsamp*, *hwc, and ideal
     experiments, a report of data at the source line level is appended,
     sorted by the performance metric computed on a line basis,	or by
     functions,	and then by line numbers within	a function, respectively.  For
     other experiments,	these options are ignored.

     If	-basicblocks is	given, for ideal experiments only, A report of data at
     the basic-block level is appended.	 If -archinfo is given,	also for ideal
     experiments only, a summary report	of register usage, instruction usage,
     and various other statistics is appended.	For other experiments, these
     options are ignored.

     If	-dsolist is given, a list of the DSOs used by the program is appended.

     If	-usage is given, a summary of the resources used by the	program	is
     appended.

FUNCTION LIST    [Toc]    [Back]

     The default output	for any	experiment is a	function list, sorted in the
     order of exclusive	values of the primary metric--the performance cost of
     the function, as computed from the	data recorded.	For many experiments,
     this metric is a time, printed in the report in seconds; for others, it
     is	a count	of events-FPEs,	IO calls, or R10K counter overflow counts.

     The report	begins with a legend line, naming each of the columns of data.
     Each line in the list has an index; if -b[utterfly] is not	specified, the
     index will	be in numerical	order; if it is	specified, the index will be
     in	order of functions as sorted by	the inclusive metric.  The index
     serves as a poor-person's hyperlink through the butterfly,	and between
     the butterfly and the function list.





									Page 3






PROF(1)								       PROF(1)



     To	the left of the	index, is the primary metric, followed by its
     representation as a percentage of the whole program, followed by the
     cumulative	percentage up to that point in the list.  The next two columns
     are inclusive value of the	metric., and its representation	as a
     percentage.

     Following that may	be additional columns of data, as appropriate to the
     particular	data recorded.	Finally, rightmost on the line will be the
     name of the function, and the DSO and source file basenames.

BUTTERFLY LIST    [Toc]    [Back]

     The butterfly list	is a set of records that show the callers and callees
     of	each function.	The list is sorted in the order	of inclusive values of
     the primary metric.  For each function, its callers are shown above it,
     and its callees below.  The center	line is	for the	function itself, and
     shows the index at	the beginning and end of the line.  The	second and
     third columns are the inclusive percentages of the	primary	metric and its
     absolute value.  The next two columns of the center line are the
     exclusive percentage and the exclusive value of the metric, followed by
     the function identification.

     Callees are shown below the function, with	attribution percentages	and
     values lined up below the self percentages	and values of the center
     function.	In absolute mode (the default),	the percentages, including the
     center node's self	value, should add up to	the inclusive percentage of
     the center	node;  in relative mode, specified with	-rel[ative] flag, the
     percentages should	add up to 100 %.  In either mode, the attribution
     values should add up to the inclusive time	of the central function.  The
     attribution data for callees is followed by the inclusive value for that
     caller and	callsite, followed by the callsite identification, with	an
     address, a	source file, and a line	number.	 For callees, the source file
     will be that of the central function.

     The callers are shown above the central function, also with attribution
     percentages and values, but these have a different	meaning, and are
     aligned with the central nodes inclusive values to	so indicate.  The
     attribution percentage and	value in a caller's line represents the
     percentage	and value, respectively, of the	central	function's metric that
     was attributed to that callsite.  In absolute mode, the percentages
     should add	up to the central function's percentage; in relative mode,
     they should add up	to 100%.  The attribution values should	add to the
     central function's	values.	 The attribution data for callers is followed
     by	the inclusive value for	that caller and	callsite, followed by the
     callsite identification, with an address, a source	file, and a line
     number.  For callers, the source file will	be that	of the calling
     function.

LINE-LEVEL LISTS    [Toc]    [Back]

     When invoked with the -h[eavy] or -l[ines]	arguments, a list of linelevel
 data	is produced.  When -h[eavy] is used, the line list is sorted
     by	the primary metric associated with each	line, with lines from the
     various DSOs and source files intermixed.	When -l[ines] is used, the



									Page 4






PROF(1)								       PROF(1)



     lines are sorted based on the function they came from, with all lines
     with non-zero values from each function printed before any	lines from the
     next most important function.

     For each line, the	leftmost column	contains the value for the primary
     metric computed from the particular experiment, followed by its
     representation as a percentage, followed by the cumulative	percentage
     over all lines printed thus far.  These data are followed by any
     additional	metrics	for the	particular experiment, and then	by the name of
     the function, its DSO, file and line number.

BASIC BLOCK LIST    [Toc]    [Back]

     When invoked with the -basicblocks	argument (applicable ideal experiments
     only), a list of all the basic blocks in the program that were executed
     is	generated.  It is preceded by a	header,	and a column header line, and
     then the list of basic blocks, in order of	total cycles.

     Each basic	block is printed with its index, the number of cycles per
     execution of the block, the count of executions, the total	cycles,	and
     the total cycles represented as a percentage.  Trailing that on each
     line, is the function, and	the address of the beginning of	the line, with
     the DSO, source file, and source line number.

ARCHITECTURAL INFORMATION REPORT    [Toc]    [Back]

     When invoked with the -archinfo argument (applicable ideal	experiments
     only), a list of various metrics concerning execution of the program is
     printed.  The report consists of a	header,	and a number of	subsections,
     each with the appropriate headers.

     For integer registers, counts are printed of the number of	times each
     register was used,	and its	percentage, the	number of times	each register
     was used as a base	register, and its percentage, and the number of	times
     each register is used as a	destination register, and its percentage.  For
     floating point registers, the same	data, less the base count statistics,
     is	printed.

     Following the register usage statistics, a	number of counts of
     instruction types or sequences are	printed, each with a description,
     followed by a list	of all the different instructions used,	sorted by the
     number of times each was executed.	 For each instruction, the dynamic
     count of executions, and its representation as a percentage is printed.
     That is followed by the cumulative	execution percentage, and the count of
     the number	of distinct instructions of that type that were	executed one
     or	more times, and	that count represented as a percentage.

DSO LIST    [Toc]    [Back]

     When invoked with the -dsolist argument, a	report of summary information
     about the DSOs in the execution of	the program is printed.	 For each DSO,
     it	gives the name,	a count	of instructions, functions, source-files, and
     source-lines, the high and	low addresses, and the full pathname to	the
     DSO.  If the DSO is ignored in computation, either	because	it belongs to
     the SpeedShop runtime, and	the -showss flag is not	provided, or because



									Page 5






PROF(1)								       PROF(1)



     the DSO was excluded by being listed as with a -xdso argument, or the DSO
     was excluded by virtue of its not being listed with a -dso	argument, the
     DSO is flagged as ignored.

RESOURCE USAGE    [Toc]    [Back]

     When invoked with the -usage argument, a report of	summary	usage data, as
     measured by the kernel during the run, is produced.  The data consists of
     both per-process (per-file) and per-system	metrics.  Included are realtime
 (wall	clock time), user, system and wait times.  The per-process
     data is shown as a	sum of the data	from the experiment files; the realtime,
 and the system-wide statistics are printed as maxima	over the data
     in	each file.  Included are the accounting	timers,	giving the time	spent
     in	each of	the various process states.  The sum of	the accounting timers
     should be approximately equal to the elapsed real-time, since the process
     must always be in one of the states.  There is some skew in the reading
     of	the data, so some discrepancy should be	expected.  Other summary
     statistics	include	bytes read and written,	page faults, context switches,
     system calls, and process-size statistics.

DISASSEMBLY AND	SOURCE LISTS
     When invoked with the -dis	argument, an assembly listing of the program
     is	generated.  If the -dis	argument is given as well, the source code for
     the disassmebly is	intermixed with	it.

OPTIONS    [Toc]    [Back]

     -calipers	[n1,]n2
	  flag causes prof to compute the data between caliper points n1 and
	  n2, rather than for the entire experiment.  If n1 >= n2, an error is
	  reported, otherwise if n1 is negative	it is set to the beginning of
	  the experiment, and if n2 is greater than the	maximum	recorded, it
	  is set to the	maximum.  If n1	is omitted, zero is assumed.

     -b[utterfly]
	  or -gprof causes prof	to print a report showing the callers and
	  callees of each function, with inclusive time	attributed to each.
	  For ideal experiments, the attribution is based on a heuristic,
	  while	for the	various	callstack sampling/tracing experiments the
	  attribution is precise (although usertime and	totaltime, as well as
	  some *_hwctime experiments are statistical in	nature).  This flag is
	  ignored for experiments where	the data does not support inclusive
	  calculations.

     -p[threads] <pthread_id1>,<pthread_id2>,...,<pthread_idn>
	  For usertime,	totaltime, *_hwctime, io, and fpe experiments on
	  applications that use	pthreads (on Irix 6.5 or later), analyze data
	  only for the specified pthreads.  This flag is ignored for other
	  experiments.

     -u[sage]
	  Print	a report on system statistics and timers.





									Page 6






PROF(1)								       PROF(1)



     -dis[assemble]
	  Disassemble and annotate the analyzed	object code with cycle times
	  or number of PC samples.  This option	can be used when generating
	  reports for ideal, pcsamp, or	prof_hwc experiments.

     -S(-source)
	  Disassemble and annotate the analyzed	object code with cycle times
	  (or PC samples) and interleave the source code.  This	option can be
	  used when generating reports for ideal, pcsamp, or *_hwc
	  experiments.

     -h[eavy]
	  Reports the most heavily used	lines in descending order of use.
	  This option can be used when generating reports for ideal, pcsamp,
	  or prof_hwc experiments.  It is ignored for other experiments.

     -l[ines]
	  Like -h[eavy], but group lines by procedure, with procedures sorted
	  in descending	order of use.  Within a	procedure, lines are listed in
	  source order.	 This option can be used when generating reports for
	  ideal, pcsamp, or prof_hwc experiments.  It is ignored for other
	  experiments.

     -[no]cordfb
	  Disables or enables cord feedback file generation for	the executable
	  only.	Cord feedback is used to arrange procedures in the binary in
	  an optimal ordering, to improve both paging and instruction-cache
	  performance.	Users can use cord(1) (or, in the near-term future,
	  ld(1)) to actually do	the procedure-ordering.

     -cordfball
	  Enables cord feedback	for the	executable and all DSOs.

     -feedback
	  Produces files with information that can be used to (a) tell the
	  compiler how to optimize compilation of the program next time	and
	  (b) arrange procedures in the	binary in an optimal ordering.	Users
	  can invoke "cc(1) -fb	<cfb-filename>"	to use the compilationoptimization
 feedback	file for subsequent compilations.  To disable
	  cord feedback, while producing compiler feedback, use	the options
	  -feedback -nocordfb together.
	  Procedures are normally ordered by their measured invocation counts;
	  if -gprof is also specified, procedures are ordered using call graph
	  counts (that capture caller-callee relationships as well), rather
	  than invocation counts.  The cord feedback file is named <a.out>.fb
	  or <lib*so*>.fb.  The	-feedback option produces cord feedback	for
	  the executable only. To get cord feedback for	all the	DSOs as	well,
	  use the options -feedback -cordfball together.
	  This option also produces a file with	information that the compiler
	  system can use to recompile, optimizing by using measured branch
	  frequencies, etc.  The feedback file is produced for the executable
	  only.	 It is named <a.out>.cfb and is	a binary file.	It may be



									Page 7






PROF(1)								       PROF(1)



	  dumped using the fbdump(1) routine.  This option can only be used
	  with pixie (ideal-time) data files, and the data should be recorded
	  on a binary that was compiled	-O0.  Recording	data on	binaries with
	  higher optimization will generate a feedback file that does not have
	  the appropriate correspondence between source	lines and machine
	  code.

     -ws  Generate a working-set file for the current caliper setting, for the
	  executable only.

     -wsall
	  Generate a working-set file for the current caliper setting, for the
	  executable and all the non-ignored DSOs.

     -showss
	  Enables display of functions from the	SpeedShop runtime.  Normally
	  such functions are suppressed	from the reports and computations.  In
	  addition, some statistics for	prof's own memory usage	will be
	  printed.

     -dsolist
	  List all the DSO's in	the program and	their start and	end text
	  addresses.

OUTPUT CONTROLS    [Toc]    [Back]

     -rel[ative]
	  Show percentage attribution in a butterfly report relative to	the
	  central function.  The default is to show percentages	as absolute
	  percentages over the whole run.

     -inclusive
	  Sort function	list by	inclusive data,	rather than by exclusive data.
	  This option can only be used when generating reports for those
	  experiments which have inclusive data; it is ignored for others.

     -calls
	  Sort function	list by	procedure calls	rather than by time.  This
	  option can only be used when generating reports for ideal
	  experiments, or for basic block counting data	obtained with pixie.

     -q[uit]  n
	  Truncates the	 -h[eavy], -l[ines] and	-b[utterfly] listings after
	  the first n procedures or lines have been listed.

     -q[uit] n%
	  Truncates the	 -h[eavy], -l[ines] and	-b[utterfly] listings after
	  those	procedures or lines up to the one which	takes more than	n
	  percent of the total.

     -q[uit] ncum%
	  Truncates the	 -h[eavy], -l[ines] and	-b[utterfly] listings after
	  those	procedures or lines up to the one which	brings the cumulative



									Page 8






PROF(1)								       PROF(1)



	  total	to n percent.  (For -b[utterfly], it behaves the same as -quit
	  n%.)	For example, "-q[uit] 15" truncates each part of the listing
	  after	15 lines of text, "-q[uit] 15%"	truncates each part after the
	  first	line that represents less than 15 percent of the whole,	and
	  "-q[uit] 15cum%" truncates each part after the line that brought the
	  cumulative percentage	above 15 percent.

     -dislimit n
	  Disassemble only those basic blocks with frequency >=	n.

     -nh  Suppress various header blocks from the output.

     SELECTIVITY OPTIONS

     -dso dso_name
	  Report only on the named DSO.	 Only the basename of the DSO need be
	  specified not	the full pathname to the DSO; the .so suffix is
	  required.  Multiple instances	of the -dso flag can be	given, and the
	  executable is	considered a DSO, like any other.  All the DSOs	from
	  an experiment	can be listed with the -dsolist	flag.

     -xdso dso_name
	  Exclude the named DSO	from any reports.  Only	the basename of	the
	  DSO need be specified	not the	full pathname to the DSO; the .so
	  suffix is required.  Multiple	instances of the -xdso flag can	be
	  given.

     -o[nly] procedure_name
	  If you use one or more -o[nly] options, the profile listing includes
	  only the named procedures, rather than the entire program. If	any
	  option uses an uppercase "O" for -O[nly], prof uses only the named
	  procedures, rather than the entire program, as the base upon which
	  it calculates	percentages.

     -e[xclude]	procedure_name
	  If you use one or more -e[xclude] options, the profiler omits	the
	  specified procedure from the listing.	 If any	option uses an
	  uppercase "E"	for -E[xclude] , prof also omits that procedure	from
	  the base upon	which it calculates percentages.

     CPU OPTIONS

     prof normally uses	the scheduling model for the processor on which	it is
     being run to perform the analysis.	 The user can override the default
     with any of the following options:

     {-r10000 |	-r8000 | -r5000	| -r4000 | -r3000}
	  Note that these options are only meaningful for ideal	time and
	  pixie-counts data.






									Page 9






PROF(1)								       PROF(1)



     -clock megahertz
	  Set the CPU clock speed to  megahertz	 MHz.  Alters the appropriate
	  parts	of the listing to reflect the clock speed.  The	default	value
	  is the clock speed of	the machine on which the experiment was
	  performed.

     -cycle nanosecond
	  Set cycle time to nanosecond ns. This	is the same as -clock
	  1000/nanosecond.

DEBUGGING OPTIONS    [Toc]    [Back]

     -debug:debug_flags
	  Where	debug_flags can	be combinations	of the following.

	   GPROF_FLAG	   0x00000001

	   COUNTS_FLAG	   0x00000002

	   SAMPLE_FLAG	   0x00000004

	   MISS_FLAG	   0x00000008

	   FEEDBACK_FLAG   0x00000010

	   CORD_FLAG	   0x00000020

	   USERPC_FLAG	   0x00000040

	   MDEBUG_FLAG	   0x00000080

	   BEAD_FLAG	   0x00000100

DIAGNOSTICS    [Toc]    [Back]

     prof prints warnings and fatal errors on stderr.  With inclusive cycle
     counting, prof prints a list of functions at the end which	are called but
     not defined. Any functions	starting with _rld listed here is normal
     behavior. They appear because rld is not instrumented.

     One way to	sanity-check inclusive cycle counts is to look at the
     percentage	cycles for __start().  If the value is anything	less than 98-
     99%, the inclusive	report is suspect.  Look for other warnings which
     indicate that prof	didn't take into account certain procedures.

     There are a number	of known cases when prof fails to list cycles of a
     procedure in the inclusive	listing.  The reasons can be one of the
     following:

	- init & fini sections,	mips stubs are not part	of any procedure.

	- calls	to procedures which don't use a	jump and link are
	  not recognized as procedure calls.




								       Page 10






PROF(1)								       PROF(1)



	- executions of	global procedures with same name in different DSOs.
	  In this case,	only one of them is listed.

     All these exceptions are listed at	the end	of the -gprof listing, under a
     separate section.

LIMITATIONS    [Toc]    [Back]

     With both pc-sampling and basic block counting, prof may report times for
     procedures	named with a prefix of *DF*, for example *DF*_hello.init_2. DF
     stands for	Dummy Function,	and indicates cycles spent in parts of text
     which are not in any function, for	example, init and fini sections, and
     MIPS.stubs	sections.  For pc-sampling, and	hardware counter overflow pcsampling,
 a dummy function, *DF*_Other is used to report sampling hits in
     an	overflow bin; these hits include time spent in rld, and	any other text
     regions not listed	in any DSO.

     If	any of the object files	linked into the	application have been stripped
     of	line-number information	(with ld -x for	example), prof warns about the
     affected procedures.  The instruction counts for such procedures are
     shown as a	procedure total, not on	a per-basic-block basis.  Where	a line
     number would normally appear in a report on a function without line
     numbers question marks appear instead.

     prof does not take	into account interactions between basic	blocks in
     ideal time	runs. prof computes the	cycles for one execution of a basic
     block, assuming all registers are free at entry to	the block, and it
     multiplies	this count by the number of times that basic block is
     executed.	In real	programs, a block may be entered with a	result not yet
     ready in a	register, or with a function unit busy,	so the cycle count
     computed could be either higher or	lower than the correct value.
     Extending the computations	to include inter-block state would be
     prohibitively expensive.

     When run on a program using shared	libraries, prof	sometimes combines the
     times of real but anonymous procedures in a shared	library	into the
     preceding (according the library memory layout) externally	visible
     function.	The times reported for your procedures are not affected	by
     this attribution-error, which is normally minor.

     prof cannot be run	on programs which have been stripped.

     Compiler optimization level 3 does	procedure inlining.  This can result
     in	extremely misleading profiles since the	time spent in the inlined
     procedure shows up	in the profile as time spent in	the procedure into
     which it was inlined.  It is generally better to use compiler
     optimization level	2 or less when gathering an execution profile.

     Fortran alternate entry point times are attributed	to the main
     function/subroutine, since	there is no general way	for prof to separate
     the times for the alternate entries.





								       Page 11






PROF(1)								       PROF(1)


SEE ALSO    [Toc]    [Back]

      
      
     speedshop(1), ssrun(1), ssdump(1),	sscord(1), ssorder(1), pixie(1),
     fbdump(1)


								       PPPPaaaaggggeeee 11112222
[ Back ]
 Similar pages
Name OS Title
ssrun IRIX set up and run a process to collect SpeedShop performance data
ssdump IRIX print out the contents of SpeedShop performance experiment data files
ssperf IRIX SpeedShop Performance Analyzer
kernprof IRIX special executable for SpeedShop performance measurements on the unix kernel
cvexprnt IRIX print out the contents of WorkShop performance experiment records or data files
pixstats IRIX analyze program execution
ssapi IRIX SpeedShop runtime library
ssapi IRIX SpeedShop runtime library
io_ss IRIX SpeedShop I/O tracing library
malloc_ss IRIX SpeedShop memory allocation library
Copyright © 2004-2005 DeniX Solutions SRL
newsletter delivery service