Only do Seek if there is an EOF in the perf fetcher
This handles files that are smaller than the perf.data header but are
still a valid profile.proto, or can be converted to one by other
means.
When generating callgrind format output, produce cost lines at
instruction granularity. This allows visualizers supporting the
callgrind format to display instruction-level profiling information.
We also need to provide the object file (ob=) in order for tools to find
the object file to disassemble when displaying assembly.
We opportunistically group cost lines corresponding to the same
function together, reducing the number of superfluous description lines.
Subposition compression (relative position numbering) is also used to
reduce the output size.
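A minimal sketch of relative position numbering, assuming the address column is written as an absolute hex value on the first line and as "+N", "-N", or "*" afterwards, whichever is shorter. The function name and signature are illustrative and not taken from the pprof source.

```go
package main

import "fmt"

// subposition returns the address column for a callgrind cost line.
// prev is the address written on the previous cost line; hasPrev is false
// for the first line, which must use an absolute address.
// (Hypothetical helper for illustration only.)
func subposition(prev, curr uint64, hasPrev bool) string {
	abs := fmt.Sprintf("%#x", curr)
	if !hasPrev {
		return abs
	}
	if curr == prev {
		return "*" // same position as the previous line
	}
	// Unsigned subtraction wraps; converting to int64 recovers the sign.
	rel := fmt.Sprintf("%+d", int64(curr-prev))
	// Use the relative form only when it is actually shorter.
	if len(rel) < len(abs) {
		return rel
	}
	return abs
}
```

Consecutive instructions in the same function tend to sit a few bytes apart, so most cost lines shrink to a short offset.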
Speed up proto postprocessing phase in proto decoder
The proto decoder creates pointers to reference each
location/function/mapping based on their id. These ids can be
arbitrary uint64s, but often they are generated in sequence from
1 to N.
The overhead of keeping these indices in a hash is about 20% of
the cost of decoding a profile. Speed it up by using an array to
track values from 1 to N, and a hash for values outside that range.
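The array-plus-hash lookup described above could look roughly like this. The type and method names are hypothetical, and Location stands in for any of location/function/mapping.

```go
package main

// Location stands in for any decoded entity that carries an id.
type Location struct {
	ID uint64
}

// locationIndex looks up locations by id. Ids in 1..n are stored in a
// slice; any other id falls back to a map.
type locationIndex struct {
	dense  []*Location          // dense[id] for 1 <= id < len(dense); slot 0 is unused
	sparse map[uint64]*Location // ids outside the dense range
}

// newLocationIndex sizes the dense part for ids 1..n.
func newLocationIndex(n int) *locationIndex {
	return &locationIndex{
		dense:  make([]*Location, n+1),
		sparse: make(map[uint64]*Location),
	}
}

func (x *locationIndex) put(l *Location) {
	if l.ID > 0 && l.ID < uint64(len(x.dense)) {
		x.dense[l.ID] = l
		return
	}
	x.sparse[l.ID] = l
}

func (x *locationIndex) get(id uint64) *Location {
	if id > 0 && id < uint64(len(x.dense)) {
		return x.dense[id]
	}
	return x.sparse[id] // nil if the id was never stored
}
```

When ids really are sequential, every lookup is a bounds check and a slice index; arbitrary ids still work through the map.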
Disambiguate names for kcachegrind under the call_tree option
When using the call_tree option and generating a graph for kcachegrind,
kcachegrind merges nodes that are distinct in the tree, producing
confusing results. Add a suffix to the names so that these entries are
kept separate.
This addresses the problem described in
http://yosefk.com/blog/how-profilers-lie-the-cases-of-gprof-and-kcachegrind.html ,
particularly the summary "Choosing a profiler is hard" section.
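One way to keep same-named tree nodes apart is to append a position-based suffix to each name. The " [i/n]" format and the function name below are assumptions made for this sketch; the commit message does not specify the suffix.

```go
package main

import "fmt"

// uniqueNames appends an " [i/n]" suffix to every name so that entries
// sharing a function name are still distinct strings.
// (Hypothetical helper; the suffix format is an assumption.)
func uniqueNames(names []string) []string {
	out := make([]string, len(names))
	for i, name := range names {
		out[i] = fmt.Sprintf("%s [%d/%d]", name, i+1, len(names))
	}
	return out
}
```

Because the suffix depends on the node's position rather than its name, two tree nodes for the same function never collapse into one entry.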
This will enable symbolization support for Go on Mac OS
It re-enables symbolization using debug/pprof/symbol on
Go profiles in the legacy format, and implements basic
Mach-O support in the binutils package.
This is to keep the new TrimTree functionality from breaking any code
currently using the public interface. We do this by separating NodeSet
from nodePtrSet and creating different functions for each.
When deleting a node N, if the edge between N's parent and N and the
edge between N and its child are both inline, then the residual edge
from N's parent to the child is also inline.
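The inline rule can be sketched as a logical AND over the two edges being replaced. The Edge type and its fields are simplified stand-ins for this example, not the actual graph package types.

```go
package main

// Edge is a simplified call edge; the field names are hypothetical.
type Edge struct {
	Src, Dest string
	Inline    bool
}

// residual builds the edge that replaces parent->N and N->child once N
// is removed. It is inline only if both original edges were inline.
func residual(in, out Edge) Edge {
	return Edge{
		Src:    in.Src,
		Dest:   out.Dest,
		Inline: in.Inline && out.Inline,
	}
}
```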
In a graph, NodeInfo maps one to one to the nodes, so it suffices to
just find the top NodeInfos and only keep those nodes in the graph. In
a tree, however, a single NodeInfo may map to many nodes. As of this
commit, a call to 'web 10' in pprof on a tree will return all the nodes
corresponding to the top 10 NodeInfos.
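A rough sketch of that selection on a tree: rank NodeInfos by a total value, then keep every node that carries one of the top k. The types are simplified stand-ins, and ranking by summed flat value is an assumption made for illustration.

```go
package main

import "sort"

// NodeInfo and Node are simplified stand-ins for this sketch.
type NodeInfo struct{ Name string }

type Node struct {
	Info NodeInfo
	Flat int64
}

// keepTop returns, in original order, every node whose NodeInfo is among
// the k infos with the largest summed Flat value. In a tree several nodes
// can share one NodeInfo, so more than k nodes may be returned.
func keepTop(nodes []*Node, k int) []*Node {
	totals := make(map[NodeInfo]int64)
	for _, n := range nodes {
		totals[n.Info] += n.Flat
	}
	infos := make([]NodeInfo, 0, len(totals))
	for info := range totals {
		infos = append(infos, info)
	}
	// Largest total first; break ties by name for a deterministic result.
	sort.Slice(infos, func(i, j int) bool {
		if totals[infos[i]] != totals[infos[j]] {
			return totals[infos[i]] > totals[infos[j]]
		}
		return infos[i].Name < infos[j].Name
	})
	if k < len(infos) {
		infos = infos[:k]
	}
	keep := make(map[NodeInfo]bool, len(infos))
	for _, info := range infos {
		keep[info] = true
	}
	var kept []*Node
	for _, n := range nodes {
		if keep[n.Info] {
			kept = append(kept, n)
		}
	}
	return kept
}
```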