Make '-noinlines' a separate flag, introduce '-filefunctions' granularity. (#420)
This change consists of two relatively independent parts, but they are
both about handling the granularity, so bundling them together.
First, in #415 there is a discussion that having a granularity mode by
source lines but with inlines hidden would be useful. The agreement is
also that adding `-linenoinlines` granularity would make the granularity
flags too messy (and they are already somewhat messy with `-addresses`
and `-addressnoinlines`. So, it was proposed to make `-noinlines` a
separate flag, which is what this change does. Note that the flag is now
pulled out of the granularity group so it's a bit of backward
incompatible change but I think it is acceptable. For the example in
issue #415 the user would now be able to specify `-list foo -noinlines`
to produce annotated source where the metrics from the inlined functions
are attributed to the calling inliner line.
With this change, I am also dropping the `-addressnoinlines` granularity
which is now supported as `-addresses -noinlines` combination. I
couldn't find any usage of this option at least internally at Google, so
I think it's safe to remove it.
Second, I am adding a separate `-filefunctions` granularity which groups
the data by both function and file. This is a follow-up to #110 from the
past where we changed the `-functions` granularity to not group by file
(it used to), and since then there was a couple of reports where using
just function name alone would over-aggregate the data in cases when a
function with the same name is contained in multiple source files (e.g.
see b/18874275 internally).
Also, make a number of assorted documentation and `-help` fixes.
For certain formats (eog, evince, gv, web, weblist), the visualizer
would be invoked regardless of whether the user specified a specific
output file to write to. In order to ensure that output is properly
written, the invokeVisualizer function has a check for whether the
output is os.Stdout and alters behavior based on that. This is the
wrong abstraction layer to do that work.
At the core of the problem is that PostProcess is used to both
do data post-processing that is inherent to the format, and also
to invoke some visualization for the data (to Stdout or browser).
We split these two steps apart and make it obvious which is which
by adding a visualizer field to Command.
This seperation of concerns allows us to simplify the code.
When generating callgrind format output, produce cost lines at
instruction granularity. This allows visualizers supporting the
callgrind format to display instruction-level profiling information.
We also need to provide the object file (ob=) in order for tools to find
the object file to disassemble when displaying assembly.
We opportunistically group cost lines corressponding to the same
function together, reducing the number of superfluous description lines.
Subposition compression (relative position numbering) is also used to
reduce the output size.