Add "trim path" option which can be used to relocate sources. (#366)
When pprof is asked to show annotated source for a profile collected on
another machine for a Go program, the profile contains absolute paths
which may not exist on the local machine. In that case pprof currently
fails to locate the source with no option to help it. The new option
adds a way to specify one or several source path prefixes that should be
trimmed from the source paths in the profile before the search path
option is applied.
For example, in the case from the issue the source file
path in the profile is
/home/teamcitycpp/agent09/work/56cbaf9067/_gopath/src/badoo/lakafka/main.go
and the local path is /home/marko/lakafka/main.go. The user may
specify
`-trim-path=/home/teamcitycpp/agent09/work/56cbaf9067/_gopath/src/badoo`
to make pprof find the source. The source path doesn't need to be
specified if the current working dir is anything at or under
`/home/marko/`.
When the trim path is not specified, it is guessed heuristically based
on the basenames of the configured search paths. In the example above,
setting `-search-path=/home/marko/lakafka` would be sufficient to
activate the heuristic successfully. So would running pprof from
`/home/marko/lakafka`, since the search path defaults to the
current working directory. Note that the heuristic currently
does not attempt to walk up the configured search paths like the search
does. This keeps it simple; use `-trim-path` explicitly in more
complicated cases.
Fixes #262.
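The relocation described above can be sketched roughly as follows. This is an illustration only, not pprof's actual code: the helper names `relocate` and `guessRelocate` are hypothetical, paths are assumed to be slash-separated, and the heuristic is reduced to a single search directory.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// relocate is a hypothetical helper: it strips the first matching trim
// prefix from a profile path and re-roots the remainder under searchDir.
func relocate(profilePath string, trimPrefixes []string, searchDir string) string {
	for _, prefix := range trimPrefixes {
		prefix = strings.TrimSuffix(prefix, "/") + "/"
		if strings.HasPrefix(profilePath, prefix) {
			return path.Join(searchDir, strings.TrimPrefix(profilePath, prefix))
		}
	}
	return profilePath // no prefix matched; leave the path untouched
}

// guessRelocate is a hypothetical version of the heuristic: it looks for the
// basename of searchDir as a component of profilePath and re-roots whatever
// follows that component under searchDir.
func guessRelocate(profilePath, searchDir string) (string, bool) {
	marker := "/" + path.Base(searchDir) + "/"
	i := strings.Index(profilePath, marker)
	if i < 0 {
		return "", false
	}
	return path.Join(searchDir, profilePath[i+len(marker):]), true
}

func main() {
	const src = "/home/teamcitycpp/agent09/work/56cbaf9067/_gopath/src/badoo/lakafka/main.go"
	trim := []string{"/home/teamcitycpp/agent09/work/56cbaf9067/_gopath/src/badoo"}

	fmt.Println(relocate(src, trim, "/home/marko"))
	fmt.Println(guessRelocate(src, "/home/marko/lakafka"))
}
```

Both calls resolve to the local path from the example, `/home/marko/lakafka/main.go`.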
weblist output disassembly now contains inlined source. (#235)
In addition, fixed the line number assigned to instructions from inlined calls.
As an illustration, see the before and after output for a "pprof -weblist" command for a simple function:
[before](https://ghemawat.github.io/scratch/tmp/list1.html)
[after](https://ghemawat.github.io/scratch/tmp/list2.html)
Note that in the "before" page, the 7.26 second line cannot be expanded, but in the "after" page clicking on it reveals a wealth of information.
* Add support for tag units
Updated tests for adding tag units
* updated proto docs to describe tag units
* fixed formatting
* added additional test
* require specific units for graph tag to be recognized
* updated tests and formatting
* Fixed formatting
* Fixed style error
* modify proto_test to be sure tag units not overwritten
* Clarified label docs
* call legacy profile memory allocations size, not byte
* updated tests to call bytes tags in legacy profile size
* Revert "updated tests to call bytes tags in legacy profile size"
This reverts commit 9289378af2.
* Revert "call legacy profile memory allocations size, not byte"
This reverts commit 6b18973562.
* Updated so units not modified when profile written out
* Fixed formatting errors
* Removed field for inferred numeric label units from profile
* Modified profile String() output
* fixed formatting error
* changed name of function used to identify units for numeric labels
* Refactor String() for profile
* Modified for clarity and address comments
* renamed function for clarity
* Made numeric label units field separate in profile
* Modified test
* Refactored identifyNumLabelUnits()
* Updated comments
* Fixed formatting error
* Addressed comments
* Fixed style error
* clarify comment
* Refactored to address comments
* addressed comments
* addressed comments
* addressed comments
* Make top table sortable.
* Recompute Sum% column contents after sorting.
* Cleanups identified during review.
* Simplify top table sorting by generating it in Javascript.
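Recomputing the Sum% column after a sort amounts to taking a running total over the rows in their new order. The actual change does this in Javascript; the Go sketch below only illustrates the arithmetic, and the `row` type and `sumPercents` function are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical, simplified top-table entry.
type row struct {
	Name string
	Flat int64
}

// sumPercents returns the Sum% column for rows in their current order:
// the running total of Flat as a percentage of total.
func sumPercents(rows []row, total int64) []float64 {
	out := make([]float64, len(rows))
	var running int64
	for i, r := range rows {
		running += r.Flat
		if total != 0 {
			out[i] = 100 * float64(running) / float64(total)
		}
	}
	return out
}

func main() {
	rows := []row{{"b", 30}, {"a", 50}, {"c", 20}}

	// Re-sort by name, then recompute Sum% for the new order.
	sort.Slice(rows, func(i, j int) bool { return rows[i].Name < rows[j].Name })
	fmt.Println(sumPercents(rows, 100))
}
```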
Use common header for remaining web views (source code, disassembly, peek) (#214)
Details:
* Made it easier to reuse css, javascript, etc. across views.
* Added two new templates: sourcelisting, plaintext.
* Use new templates for showing source code, disassembly, peek.
* Merged Functions menu into View menu.
* View change menu entries are now always enabled.
* Removed now unused "newWindow" argument to navigate.
* Removed redundancy in web handlers.
* Renamed /weblist URL to /source.
* Assembly and source code listings are now limited to 500 functions
sorted by decreasing flat value (to prevent huge delays when
selecting asm/source view across the entire profile).
Extend web UI to contain a tabular "top" display (#194)
* Split template into multiple defined templates for ease of sharing.
* Add a menu bar at the top and tie all buttons to it.
* Add top view.
* Handle top table clicks.
* Handle regexp based selection of top entries.
* Added test for /top view.
* Reduced code duplication between / and /top handlers.
* Add Refine menu to Top view as well.
Details:
* Pass base URL to web page.
* Use base URL in navigate() to stay in same view.
* Change Reset to stay in same view.
* Handled review comments + increased top display from 50 to 500 rows:
* Moved "inline" indicator to separate column of top display.
* Moved "Reset" button to "Refine" menu.
* Simplified some Javascript.
* Handle all meta characters in quotemeta.
* Removed unnecessary css "display: inline" from closedetails.
Signed-off-by: Sanjay Ghemawat <sanjay@alum.mit.edu>
* Added an interactive web interface triggered by passing -http=port
on the command line. The interface is available by visiting
localhost:port in a browser.
Requirements:
* Graphviz must be installed.
* Browser must support Javascript.
* Tested in recent stable versions of chrome and firefox.
Features:
* The entry point is a dot graph display (equivalent to "web" output).
* Nodes in the graph can be selected by clicking.
* A regular expression can also be typed in for selection.
* The current selection (either list of nodes or a regexp)
can be focused, ignored, or hidden.
* Source code or disassembly of the current selection can be displayed.
* Remove unused function.
* Skip graph generation test if graphviz is not installed.
* Added -http port and the various modes of using pprof to the
usage message.
* Web interface now supports "show" option.
* Web interface automatically opens the browser pointed at
the page corresponding to command line arguments.
* Some tweaks for firefox.
* Handle review comments (better usage message, more testing).
* Handled review comments:
1. Capture and display errors like "Focus expression matched no samples".
2. Re-ordered buttons to match other interfaces.
3. Use UI.PrintErr to print error messages.
* Handle javascript code review comments (a bunch of cleanups).
Also added pprof binary to .gitignore.
* Align file names in weblist disassembly
The disassembly displayed by weblist attempts to align file:line
information after each instruction, but two things interfere with this
alignment: 1) the instruction text often has tabs in it, which defeats
the length-based alignment and generally interacts poorly with HTML
rendering and 2) the instruction text often has special HTML
characters like < and > in it, but we pad the string after escaping
it, so the length is again not representative of how it will display.
For example, the following shows what this looks like for disassembly
that contains < and >:
. . 41c634: cmp $0x200,%rdx mgcsweepbuf.go:130
. . 41c63b: jae 41c64f <runtime.(*gcSweepBuf).pop+0x4f> mgcsweepbuf.go:130
. . 41c63d: mov (%rax,%rdx,8),%rcx mgcsweepbuf.go:130
. . 41c64f: callq 424800 <runtime.panicindex> mgcsweepbuf.go:130
. . 41c654: ud2 mgcsweepbuf.go:130
Fix these problems by replacing tab characters with spaces and padding
the string before escaping it. After this, the file names are properly
aligned:
. . 41c634: cmp $0x200,%rdx                             mgcsweepbuf.go:130
. . 41c63b: jae 41c64f <runtime.(*gcSweepBuf).pop+0x4f> mgcsweepbuf.go:130
. . 41c63d: mov (%rax,%rdx,8),%rcx                      mgcsweepbuf.go:130
. . 41c64f: callq 424800 <runtime.panicindex>           mgcsweepbuf.go:130
. . 41c654: ud2                                         mgcsweepbuf.go:130
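The order of operations matters here: padding has to happen on the text as it will be displayed, before HTML escaping lengthens it. A minimal sketch of that ordering, using a hypothetical `formatInstruction` helper rather than pprof's real code:

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// formatInstruction is a hypothetical helper showing the fix: replace tabs,
// pad to the column width, and only then escape for HTML. Escaping first
// would count "&lt;" as four characters instead of the one that is displayed.
func formatInstruction(text string, width int) string {
	text = strings.ReplaceAll(text, "\t", " ")
	padded := fmt.Sprintf("%-*s", width, text)
	return html.EscapeString(padded)
}

func main() {
	instructions := []string{
		"cmp\t$0x200,%rdx",
		"callq\t424800 <runtime.panicindex>",
	}
	for _, ins := range instructions {
		fmt.Println(formatInstruction(ins, 40) + " mgcsweepbuf.go:130")
	}
}
```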
* Separate discontiguous assembly blocks
Currently, weblist's disassembled output for each line simply lists
all instructions that are marked with that line number. This leads to
confusing disassembly because a lot of things look like straight-line
control flow when they aren't. For example, index panics look like
they always happen:
. . 41c62c: test %al,(%rax)
. . 41c62e: and $0x1ff,%edx
. . 41c634: cmp $0x200,%rdx
. . 41c63b: jae 41c64f <runtime.(*gcSweepBuf).pop+0x4f>
. . 41c63d: mov (%rax,%rdx,8),%rcx
. . 41c64f: callq 424800 <runtime.panicindex>
. . 41c654: ud2
In reality, 41c64f is at the end of the function, but it looks like we
call it immediately after a successful index operation.
Fix this by adding a vertical ellipsis to separate blocks of assembly
that have other instructions between them. With this change, the above
looks like:
. . 41c62c: test %al,(%rax)
. . 41c62e: and $0x1ff,%edx
. . 41c634: cmp $0x200,%rdx
. . 41c63b: jae 41c64f <runtime.(*gcSweepBuf).pop+0x4f>
. . 41c63d: mov (%rax,%rdx,8),%rcx
⋮
. . 41c64f: callq 424800 <runtime.panicindex>
. . 41c654: ud2
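One way to detect such a gap is to check whether each listed instruction starts where the previous one ended. The sketch below assumes each instruction's length is known; the `instruction` type, the `withGaps` function, and the byte sizes in `main` are illustrative assumptions, not taken from pprof.

```go
package main

import "fmt"

// instruction is a hypothetical, simplified disassembly entry.
type instruction struct {
	Addr uint64
	Size uint64 // length in bytes (assumed known for this sketch)
	Text string
}

// withGaps renders insts (sorted by address) and inserts a vertical ellipsis
// wherever an instruction does not start right after the previous one.
func withGaps(insts []instruction) []string {
	var lines []string
	for i, in := range insts {
		if i > 0 {
			prev := insts[i-1]
			if prev.Addr+prev.Size != in.Addr {
				lines = append(lines, "⋮")
			}
		}
		lines = append(lines, fmt.Sprintf("%x: %s", in.Addr, in.Text))
	}
	return lines
}

func main() {
	insts := []instruction{
		{0x41c63b, 2, "jae 41c64f <runtime.(*gcSweepBuf).pop+0x4f>"},
		{0x41c63d, 4, "mov (%rax,%rdx,8),%rcx"},
		{0x41c64f, 5, "callq 424800 <runtime.panicindex>"},
		{0x41c654, 2, "ud2"},
	}
	for _, line := range withGaps(insts) {
		fmt.Println(line)
	}
}
```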
internal/report: change the format of tags command output
This change also includes:
- Instead of printing and sorting based on the first value in the
internal value slice, use the sample type the user selected.
- Use the unit and print numbers in a human-friendly form.
- Add a flag '-update' to internal/driver/driver_test to update the
golden files.
TODO: apply to other tests that use other golden files.
*: golint, go tool vet, unused, gosimple, staticcheck (#113)
* `git ls-files -- '*.go' | xargs gofmt -s -w`
* driver: unexport InternalOptions
This was an exported method that returned an internal type, so it's
unlikely that anyone depends on its presence.
* internal/binutils: remove newline in error string
golint reported: error strings should not be capitalized or end with
punctuation or a newline
* internal/driver: s/buildId/buildID/ per golint
* internal/elfexec: remove punctuation in error string
golint reported: error strings should not be capitalized or end with
punctuation or a newline
* internal/graph: correct comment
Found with golint.
* internal/graph: add method comment
golint reported: exported method Edge.WeightValue should have comment
or be unexported
* internal/report: remove newline in error string
golint reported: error strings should not be capitalized or end with
punctuation or a newline
* *: remove unused code
Found with honnef.co/go/tools/cmd/unused:
internal/binutils/disasm_test.go:137:8: const objdump is unused (U1000)
internal/driver/driver_test.go:353:6: type testProfile is unused (U1000)
internal/graph/graph.go:838:6: func countEdges is unused (U1000)
profile/proto.go:148:6: func encodeStringOpt is unused (U1000)
* *: simplify code
Found with honnef.co/go/tools/cmd/gosimple:
internal/driver/commands.go:471:24: should omit comparison to bool constant, can be simplified to !b (S1002)
internal/driver/driver.go:163:5: should omit comparison to bool constant, can be simplified to !trim (S1002)
internal/driver/driver.go:168:5: should omit comparison to bool constant, can be simplified to !focus (S1002)
internal/driver/driver.go:172:5: should omit comparison to bool constant, can be simplified to !tagfocus (S1002)
internal/driver/driver.go:176:5: should omit comparison to bool constant, can be simplified to !hide (S1002)
internal/graph/dotgraph.go:213:34: should use make(map[string][]*Tag) instead (S1019)
internal/report/report.go:1061:3: should replace loop with label = append(label, rpt.options.ProfileLabels...) (S1011)
profile/proto.go:157:5: should omit comparison to bool constant, can be simplified to !x (S1002)
* *: correct mistakes
Found with honnef.co/go/tools/cmd/staticcheck:
driver/driver.go:280:20: infinite recursive call (SA5007)
internal/driver/driver_focus.go:54:11: this value of err is never used (SA4006)
profile/proto.go:74:32: x | 0 always equals x (SA4016)
* Add linters to .travis.yml
Do not honor call_tree option when generating non-visual reports.
There is a check to only honor the call_tree option when generating visual
reports, but it wasn't being applied in all appropriate places, causing a
mismatch that triggered a panic.
Populate the profile.proto more carefully in the topproto output
Instead of putting a pretty-printed string in function.name,
populate function.name, function.file and the location line with the
information from the node. This makes the information easier to
extract.
Also add some caching to reuse functions, and add a small test.
Previously, loc information on the assembly listing was being printed for
every line, while the intent was to print it only when it changes.
Also update the tests to expose and test that case, and regenerate
other tests to match.
Disassembly reports generated by pprof -disasm will now include line number
information as generated by objdump. This will make the generated assembly more
readable.
As part of this I've introduced a new assemblyInstruction struct. Previously
the code was reusing the graph.Node to represent assembly instructions but it
seems better to have a dedicated type for this.
When -mean is selected, currently pprof divides the sample value
by value[0], which is expected to be the number of samples. This
is intended to produce mean value per sample. These means cannot
be added. Instead, we should add the value and the number of samples
independently and perform the division at the end.
To do this we will create a separate function to get the number of samples,
accumulate it independently from the sample value (weight), and apply
the division after the accumulation is completed.
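The difference between the two approaches shows up with a small example. The `sample` type and function names below are hypothetical; only the arithmetic reflects the description above.

```go
package main

import "fmt"

// sample is a hypothetical, simplified profile sample.
type sample struct {
	Count int64 // number of samples, i.e. value[0]
	Value int64 // the selected sample value (weight)
}

// sumOfMeans shows the old behavior: divide per sample, then add.
func sumOfMeans(samples []sample) float64 {
	var sum float64
	for _, s := range samples {
		if s.Count != 0 {
			sum += float64(s.Value) / float64(s.Count)
		}
	}
	return sum
}

// meanOfTotals shows the new behavior: accumulate values and counts
// independently, then divide once at the end.
func meanOfTotals(samples []sample) float64 {
	var value, count int64
	for _, s := range samples {
		value += s.Value
		count += s.Count
	}
	if count == 0 {
		return 0
	}
	return float64(value) / float64(count)
}

func main() {
	samples := []sample{{Count: 1, Value: 10}, {Count: 3, Value: 30}}
	fmt.Println(sumOfMeans(samples), meanOfTotals(samples))
}
```

With these two samples the per-sample means add up to 20, while the mean over the accumulated totals is 40/4 = 10.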
When generating callgrind format output, produce cost lines at
instruction granularity. This allows visualizers supporting the
callgrind format to display instruction-level profiling information.
We also need to provide the object file (ob=) in order for tools to find
the object file to disassemble when displaying assembly.
We opportunistically group cost lines corresponding to the same
function together, reducing the number of superfluous description lines.
Subposition compression (relative position numbering) is also used to
reduce the output size.
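Subposition compression in the callgrind format lets a position be written relative to the previous cost line: `+N` or `-N` for an offset and `*` for an unchanged position. A minimal sketch of that encoding follows; the `compress` function is a hypothetical name, and pprof's own encoder may make different choices (for example about when to fall back to absolute positions).

```go
package main

import (
	"fmt"
	"strconv"
)

// compress is a hypothetical encoder: the first position is absolute, and
// each later one is written relative to its predecessor.
func compress(positions []int64) []string {
	out := make([]string, len(positions))
	for i, p := range positions {
		switch {
		case i == 0:
			out[i] = strconv.FormatInt(p, 10)
		case p == positions[i-1]:
			out[i] = "*"
		case p > positions[i-1]:
			out[i] = "+" + strconv.FormatInt(p-positions[i-1], 10)
		default:
			out[i] = "-" + strconv.FormatInt(positions[i-1]-p, 10)
		}
	}
	return out
}

func main() {
	fmt.Println(compress([]int64{16, 17, 17, 15}))
}
```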
Disambiguate names for kcachegrind under the call_tree option
When using the call_tree option and generating a graph for kcachegrind,
it will merge back nodes that are distinct on the tree, producing some
confusing results. Add a suffix so that these entries are kept separate.
This addresses the problem described in
http://yosefk.com/blog/how-profilers-lie-the-cases-of-gprof-and-kcachegrind.html ,
particularly the summary "Choosing a profiler is hard" section.
This is to keep the new TrimTree functionality from breaking any code
currently using the public interface. We do this by separating NodeSet
from nodePtrSet and creating different functions for each.
In a graph, NodeInfo maps one to one to the nodes, so it suffices to
just find the top NodeInfos and only keep those nodes in the graph. In
a tree, however, a single NodeInfo may map to many nodes. As of this
commit, a call to 'web 10' in pprof on a tree will return all the nodes
corresponding to the top 10 NodeInfos.
Before this change, the node count used in the label was the amount
requested by the user. If some nodes were trimmed and the graph ended up with
fewer nodes than the user asked for, the report label will now reflect this.
Add source_path option to point pprof to source files
Currently pprof will look for source files only in the current directory
and its parents. This makes it hard to examine sources on jobs where
there are multiple source trees (e.g., from different libraries).
Add a variable to provide a search path for source files. It will default
to the cwd, so there will be no change in behavior by default.
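As a rough sketch of the idea, a lookup over a search path can try each configured directory and then its parents. The `findSource` function below is hypothetical and much simpler than pprof's real lookup; it takes an `exists` callback in place of the file system so the example runs anywhere.

```go
package main

import (
	"fmt"
	"path"
)

// findSource is a hypothetical lookup: for each directory in searchPath it
// tries the directory itself and then each of its parents, returning the
// first candidate for which exists reports true.
func findSource(file string, searchPath []string, exists func(string) bool) (string, bool) {
	for _, start := range searchPath {
		for dir := path.Clean(start); ; dir = path.Dir(dir) {
			candidate := path.Join(dir, file)
			if exists(candidate) {
				return candidate, true
			}
			if dir == "/" || dir == "." {
				break // reached the top of this tree
			}
		}
	}
	return "", false
}

func main() {
	// Stand-in for the file system (an assumption of this sketch).
	files := map[string]bool{"/src/lib/foo/bar.go": true}
	exists := func(p string) bool { return files[p] }

	fmt.Println(findSource("foo/bar.go", []string{"/work/app", "/src/lib/sub"}, exists))
}
```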
When generating a call tree, pprof was using a map to keep track of all the
inline nodes for a location. That is incorrect as it may cause inline functions
at different nesting levels to reuse the same node, causing the resulting graph
to not be a tree.
When creating a tree, nodes with the same info may appear in multiple places
in the tree. Keeping one of them preserves them all, which may cause disconnected
nodes to remain. To ensure the resulting graph is a connected tree, do not include
the children of any removed node. This is suitable for the normal tree refinement
(nodecount and nodefraction) but does not allow visual refinement, which may eliminate
intermediate nodes. Disable visual mode refinement for call_tree to avoid this issue.
Add new trimproto option that generates a new profile while removing
symbol information for functions below nodefraction. This reduces the
profile sizes significantly.
Separate implementation of graph and tree creation to speed it up.
The graph implementation maps all locations to sequences of nodes up front;
the tree implementation uses a per-parent map to keep track of a different
node per location per parent.
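The per-parent map is what keeps a call tree from collapsing into a graph: the same function reached through different callers gets a separate node. A minimal sketch, keyed by function name rather than by location, with hypothetical `node`, `newNode` and `addStack` names:

```go
package main

import "fmt"

// node is a hypothetical tree node. Each node owns its own child map, so a
// callee reached through different parents is stored as different nodes.
type node struct {
	name     string
	value    int64
	children map[string]*node
}

func newNode(name string) *node {
	return &node{name: name, children: map[string]*node{}}
}

// addStack walks a stack from outermost caller to leaf, creating child nodes
// on demand and adding value to every node along the path.
func (n *node) addStack(stack []string, value int64) {
	cur := n
	for _, fn := range stack {
		child, ok := cur.children[fn]
		if !ok {
			child = newNode(fn)
			cur.children[fn] = child
		}
		child.value += value
		cur = child
	}
}

func main() {
	root := newNode("root")
	root.addStack([]string{"main", "a", "c"}, 1)
	root.addStack([]string{"main", "b", "c"}, 2)

	viaA := root.children["main"].children["a"].children["c"]
	viaB := root.children["main"].children["b"].children["c"]
	fmt.Println(viaA.value, viaB.value, viaA != viaB)
}
```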