Do not honor call_tree option when generating non-visual reports.
There is a check to only honor the call_tree option when generating non-visual
reports, but it wasn't being checked in all appropriate places, causing a
mismatch that triggered a panic.
Previous loc information on assembly listing was being printed for
every line, while the intent was to print it only when it changes.
Also update the tests to expose and test that case, and remaster
other tests to match.
Disassembly reports generated by pprof -disasm will now include line number
information as generated by objdump. This will make the generated assembly more
readable.
As part of this I've introduced a new assemblyInstruction struct. Previously
the code was reusing the graph.Node to represent assembly instructions but it
seems better to have a dedicated type for this.
When generating callgrind format output, produce cost lines at
instruction granularity. This allows visualizers supporting the
callgrind format to display instruction-level profiling information.
We also need to provide the object file (ob=) in order for tools to find
the object file to disassemble when displaying assembly.
We opportunistically group cost lines corressponding to the same
function together, reducing the number of superfluous description lines.
Subposition compression (relative position numbering) is also used to
reduce the output size.