Force saving the module name for unsymbolized frames (#394)
Force saving the module name for unsymbolized frames
In some cases pprof isn't saving the module name in graph nodes, so
reports end up with nodes of the form '<unknown>'.
The module name is often stripped to allow symbolized frames from
different versions of a binary to be merge.
Force saving the objfile name if the name is unknown so we can print
some information for unsymbolized frames.
* Add support for tag units
Updated tests for adding tag units
* updated proto docs to describe tag units
* fixed formatting
* added additional test
* require specific units for graph tag to be recognixed
* updated tests and formatting
* Fixed formatting
* Fixed style error
* modify proto_test to be sure tag units not overwritten
* Clarified label docs
* call legacy profile memory allocations size, not byte
* updated tests to call bytes tags in legacy profile size
* Revert "updated tests to call bytes tags in legacy profile size"
This reverts commit 9289378af2.
* Revert "call legacy profile memory allocations size, not byte"
This reverts commit 6b18973562.
* Updated so units not modified when profile written out
* Fixed formatting errors
* Removed field for inferred numeric label units from profile
* Modified profile String() output
* fixed formatting error
* changed name of function used to identify units for numeric labels
* Refactor String() for profile
* Modified for clarity and address comments
* renamed function for clarity
* Made numeric label units field seperate in profile
* Modified test
* Refactored identifyNumLabelUnits()
* Updated comments
* Fixed formatting error
* Addressed comments
* Fixed style error
* clarify comment
* Refactored to address comments
* addressed comments
* addressed comments
* addressed comments
There is an issue where decreasing nodefraction (which should produce
more nodes) ended up reducing the number of displayed nodes. This is
because we have an upper limit on number of nodes to display and the
number of node-lets counts against that limit. This can cause a problem
if a node with large number of node-lets shows up near the front of
the list of nodes (e.g., std::allocator_traits::allocate in a heapz
profile had 102 node-lets, which caused all subsequent nodes to be
dropped).
Fixed by limiting the count of charged node-lets per node to
maxNodelets (4), which is the maximum number of node-lets we
display per node.
*: golint, go tool vet, unsued, gosimple, staticcheck (#113)
* `git ls-files -- '*.go' | xargs gofmt -s -w`
* driver: unexport InternalOptions
This was an exported method that returned an internal type, so it's
unlikely that anyone depends on its presence.
* internal/binutils: remove newline in error string
golint reported: error strings should not be capitalized or end with
punctuation or a newline
* internal/driver: s/buildId/buildID/ per golint
* internal/elfexec: remove punctuation in error string
golint reported: error strings should not be capitalized or end with
punctuation or a newline
* internal/graph: correct comment
Found with golint.
* internal/graph: add method comment
golint reported: exported method Edge.WeightValue should have comment
or be unexported
* internal/report: remove newline in error string
golint reported: error strings should not be capitalized or end with
punctuation or a newline
* *: remove unused code
Found with honnef.co/go/tools/cmd/unused:
internal/binutils/disasm_test.go:137:8: const objdump is unused (U1000)
internal/driver/driver_test.go:353:6: type testProfile is unused (U1000)
internal/graph/graph.go:838:6: func countEdges is unused (U1000)
profile/proto.go:148:6: func encodeStringOpt is unused (U1000)
* *: simply code
Found with honnef.co/go/tools/cmd/gosimple:
internal/driver/commands.go:471:24: should omit comparison to bool constant, can be simplified to !b (S1002)
internal/driver/driver.go:163:5: should omit comparison to bool constant, can be simplified to !trim (S1002)
internal/driver/driver.go:168:5: should omit comparison to bool constant, can be simplified to !focus (S1002)
internal/driver/driver.go:172:5: should omit comparison to bool constant, can be simplified to !tagfocus (S1002)
internal/driver/driver.go:176:5: should omit comparison to bool constant, can be simplified to !hide (S1002)
internal/graph/dotgraph.go:213:34: should use make(map[string][]*Tag) instead (S1019)
internal/report/report.go:1061:3: should replace loop with label = append(label, rpt.options.ProfileLabels...) (S1011)
profile/proto.go:157:5: should omit comparison to bool constant, can be simplified to !x (S1002)
* *: correct mistakes
Found with honnef.co/go/tools/cmd/staticcheck:
driver/driver.go:280:20: infinite recursive call (SA5007)
internal/driver/driver_focus.go:54:11: this value of err is never used (SA4006)
profile/proto.go:74:32: x | 0 always equals x (SA4016)
* Add linters to .travis.yml
Do not remove a residual edge between A and B if there isn't another path
going between A and B. We previously would remove it if A and B had a common
ancestor, but that caused some value duplication as A could become a leaf
but its value would still be accounted for in B.
When -mean is selected, currently pprof divides the sample value
by value[0], which is expected to be the number of samples. This
is intended to produce mean value per sample. These means cannot
be added. Instead, we should add the value and the number of samples
independently and perform the division at the end.
To do this we will create a separate function to get the number of samples,
and accumulate it independently from the sample value (weigth) and apply
the division after the accumulation is completed.
When generating callgrind format output, produce cost lines at
instruction granularity. This allows visualizers supporting the
callgrind format to display instruction-level profiling information.
We also need to provide the object file (ob=) in order for tools to find
the object file to disassemble when displaying assembly.
We opportunistically group cost lines corressponding to the same
function together, reducing the number of superfluous description lines.
Subposition compression (relative position numbering) is also used to
reduce the output size.
Disambiguate names for kcachegrind under the call_tree option
When using the call_tree option and generating a graph for kcachegrind,
it will merge back nodes that are distinct on the tree, producing some
confusing results. Add a suffix so that these entries are kept separate.
This addresses the problem described in
http://yosefk.com/blog/how-profilers-lie-the-cases-of-gprof-and-kcachegrind.html ,
particularly the summary "Choosing a profiler is hard" section.
This is to keep the new TrimTree functionality from breaking any code
currently using the public interface. We do this by separating NodeSet
from nodePtrSet and creating different functions for each.
If we are deleting a node N, if the edge between N and its parent and
the edge between N and its child are both inline, then the resulting
residual edge from N's parent and the child will also be inline.
In a graph, NodeInfo maps one to one to the nodes, so it suffices to
just find the top NodeInfo s and only keep those nodes in the graph. In
a tree however, a single NodeInfo may map to many nodes. As of this
commit, a call to 'web 10' in pprof on a tree will return all the nodes
corresponding with the top 10 NodeInfo s.
Allow weblist to work even if assembly is not available
Weblist provides source and assembly combined in a web document.
It needs access to the binary to print the assembly, but currently
refuses to generate the source if the binary can't be find.
Fall back to just generating the source if the binary isn't found.
Graphs are built using Go maps which do not provide a determinist
traversal. We sort the nodes to remove the non-determinism, but
the sort fails to provide a fully deterministic ordering in cases
where the attributes being sorted have equal values. Detect those
cases and fallback to a full comparison of all fields.
When generating a call tree, pprof was using a map to keep track of all the
inline nodes for a location. That is incorrect as it may cause inline functions
at different nesting levels to reuse the same node, causing the resulting graph
to not be a tree.
When creating a tree nodes with the same info may appear on multiple places
in the tree. Keeping one of them preserves them all, which may cause disconnected
nodes to remain. To ensure the resulting graph is a connected tree, do not include
children on any removed node, which is suitable for the normal tree refinement
(nodecount and nodefraction) but does not allow visual refinement, which may eliminate
intermediate nodes. Disable visual mode refinement for call_tree to avoid this issue.
Add new trimproto option that generates a new profile while removing
symbol information for functions below nodefraction. This reduces the
profile sizes significantly.
Separate implementation of graph and tree creation to speed it up.
Graph implementation maps upfront all locations to sequences of nodes,
tree implementation uses a per-parent map to keep track of a different
node per location per parent.
When reconciling samples, cleanup file paths to allow functions to be
combined if their source filenames match after cleanup (eg "src.cpp" and
"./src.cpp").
Also clean up existing uses of filepath to avoid passing empty strings
to it: filepath.Base("") returns "."