Home > Archive > Compilers > November 2004 > Instrumenting code for profiling.
Instrumenting code for profiling.
| Greetings!
I had a question regarding how compilers do instrumentation to
collect profile information. Specifically, how do compilers handle the
part of mapping the collected profile back to the original program
(which is without the instrumentation).
Say a program X is instrumented with profiling calls for gathering
branch execution frequency. If I use the virtual address of the branch
instruction as the index or key against which I collect the frequency
of that branch, then I have a problem since the virtual address would
be completely different in the instrumented version and in the
non-instrumented version. So how do I maintain this mapping between
the profile data that I collect and the actual instruction in the
program? I think I need some sort of 'unique' id for each branch in
the original program. Could someone give me some pointers about how
this is done in the commercial compilers?
Warm regards,
Parianth
| |
| glen herrmannsfeldt 2004-11-17, 4:00 pm |
| PK wrote:
> I had a question regarding how compilers do instrumentation to
> collect profile information. Specifically, how do compilers handle the
> part of mapping the collected profile back to the original program
> (which is without the instrumentation).
Not long after I started learning Fortran (a long time ago by now), I
found a program called FETE. Actually, it was a cataloged procedure
on an IBM system, which would run the FETE program and then the
Fortran compiler.
FETE would read in a Fortran 66 program and then write out a new
Fortran program that would count the number of times each statement
was executed. For logical IF statements, it would separately count
the number of times the condition was true. At the end it would then
print out the source program with execution counts, approximate times,
and number of times an IF was true. The normal compiler source output
was turned off, so the only source listing you saw was that generated
by the executing program.
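A rough idea of what the rewritten program might look like, sketched in C rather than Fortran 66. This is not FETE's actual output; the table names, the `count_if` helper, and the listing format are all made up for illustration, and the timing estimates are left out.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical tables a source-level instrumenter might emit:
   one slot per numbered statement of the ORIGINAL source. */
enum { N_STMTS = 4 };
static unsigned long exec_count[N_STMTS]; /* times each statement ran */
static unsigned long true_count[N_STMTS]; /* times an IF condition held */

/* Original source text, kept so the listing can be printed with counts. */
static const char *const source_text[N_STMTS] = {
    "s = 0;",
    "for (i = 0; i < n; i++)",
    "    if (i % 3 == 0)",
    "        s += i;",
};

/* Record one evaluation of the IF at statement `id`; returns cond unchanged. */
static int count_if(int id, int cond)
{
    assert(id >= 0 && id < N_STMTS);
    exec_count[id]++;
    if (cond)
        true_count[id]++;
    return cond;
}

/* Instrumented copy of the four source lines above. */
static int sum_multiples_of_three(int n)
{
    int i, s;
    exec_count[0]++; s = 0;
    for (i = 0; i < n; i++) {
        exec_count[1]++;                 /* one count per loop iteration */
        if (count_if(2, i % 3 == 0)) {
            exec_count[3]++; s += i;
        }
    }
    return s;
}

/* Print the original source annotated with the collected counts. */
static void print_listing(FILE *out)
{
    int id;
    for (id = 0; id < N_STMTS; id++)
        fprintf(out, "%8lu %8lu  %s\n",
                exec_count[id], true_count[id], source_text[id]);
}
```

Calling `sum_multiples_of_three(10)` and then `print_listing(stdout)` prints each original line with its execution count and, for the IF, the number of times the condition was true. The statement number is the key, so the counts map back to the uninstrumented source directly.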
Not so long ago I tried to track down FETE, but it seems that
it doesn't exist anymore.
Presumably a similar system could be done for other languages.
-- glen
| |
| Steven Bosscher 2004-11-17, 4:00 pm |
| par_ianth@yahoo.com (PK) wrote
> I had a question regarding how compilers do instrumentation to
> collect profile information. Specifically, how do compilers handle the
> part of mapping the collected profile back to the original program
> (which is without the instrumentation).
In GCC, the profile is read in at the same point in the compilation
process where the instrumentation is added to gather the profile
information.
The cfg instrumentation adds edge and block counters, so when you read
the profile, your control flow graph must match the one that was
instrumented. Similarly, for value profiling, the instructions you
instrumented must still be in the same place (i.e., the same basic block
and instruction) when you want to use the value profile information.
This means that GCC really *must* read the fed back profile at the
same point in the compilation process where the CFG was previously
instrumented for the test runs. Otherwise there is no mapping from
the profile data to the intermediate language.
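One way to guard against using a profile with the wrong CFG is to store a checksum of the instrumented graph next to the counters and compare it when the profile is read back. The sketch below only illustrates that idea; it is not taken from GCC's sources, and the record layout, the hash, and all names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A CFG edge between two basic-block indices. */
struct edge { uint32_t src, dst; };

/* Hypothetical per-function profile record, as it might be stored on disk. */
struct func_profile {
    uint32_t cfg_checksum;     /* checksum of the CFG that was instrumented */
    size_t n_counters;         /* one counter per edge */
    const uint64_t *counters;
};

/* Multiply-and-xor hash over the block count and the edge list
   (an illustrative choice; any stable hash would do). */
static uint32_t cfg_checksum(uint32_t n_blocks,
                             const struct edge *edges, size_t n_edges)
{
    uint32_t h = 2166136261u;
    size_t i;
    h = (h ^ n_blocks) * 16777619u;
    for (i = 0; i < n_edges; i++) {
        h = (h ^ edges[i].src) * 16777619u;
        h = (h ^ edges[i].dst) * 16777619u;
    }
    return h;
}

/* Non-zero only if the profile was gathered on a CFG with the same shape. */
static int profile_matches(const struct func_profile *p, uint32_t n_blocks,
                           const struct edge *edges, size_t n_edges)
{
    assert(p != NULL);
    return p->n_counters == n_edges
        && p->cfg_checksum == cfg_checksum(n_blocks, edges, n_edges);
}
```

With a check like this, counter `i` can simply mean "edge `i` of this function", because the feedback compile rebuilds the same CFG at the same point in the pipeline and rejects the data otherwise.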
This also means that when you change the source, any profile
information you might have is no longer useful. Other compilers
apparently don't have this restriction. My understanding is that the
DEC compiler could still use some profile information even when the
source code has been modified. If anyone knows more about this, I'd
like to hear... ;-)
Gr.
Steven
| |
| Nick Maclaren 2004-11-19, 3:57 am |
| glen herrmannsfeldt <gah@ugcs.caltech.edu> writes:
|>
|> FETE would read in a Fortran 66 program and then write out a new
|> Fortran program that would count the number of times each statement
|> was executed. ...
Yes. There were other, similar ones, too.
|> Presumably a similar system could be done for other languages.
If you can't, it would be pretty hard to compile! With Fortran or
BCPL, you could take short cuts and not do a full parsing job; with C
or Algol 68, you would have to do a full parse, and would find it
easiest to hack a compiler.
Regards,
Nick Maclaren.
| |
| Ira Baxter 2004-11-21, 3:58 am |
| "PK" <par_ianth@yahoo.com> wrote
> I had a question regarding how compilers do instrumentation to
> collect profile information. Specifically, how do compilers handle the
> part of mapping the collected profile back to the original program
> (which is without the instrumentation). ...
I don't know how it is done in commercial compilers.
We do it by instrumenting the source code with profiling probes, using
full language parsers for each language to enable us to capture
accurate source line information and associate it with the probe.
This works very well indeed for our present set of profiling tools for
C, C++, COBOL, Java, and C#.
The technical details can be found in a white paper
at http://www.semdesigns.com/Products/TestCoverage
--
Ira D. Baxter, Ph.D., CTO 512-250-1018
Semantic Designs, Inc. www.semdesigns.com
| |
| Michael Tiomkin 2004-11-21, 3:58 am |
| nmm1@cus.cam.ac.uk (Nick Maclaren) wrote
> glen herrmannsfeldt <gah@ugcs.caltech.edu> writes:
> |>
> |> FETE would read in a Fortran 66 program and then write out a new
> |> Fortran program that would count the number of times each statement
> |> was executed. ...
>
> Yes. There were other, similar ones, too.
>
> |> Presumably a similar system could be done for other languages.
>
> If you can't, it would be pretty hard to compile! With Fortran or
> BCPL, you could take short cuts and not do a full parsing job; with C
> or Algol 68, you would have to do a full parse, and would find it
> easiest to hack a compiler.
Well, with some natural assumptions, you don't need to do full
parsing for C profiling. You can use the line info in the executable
to find the images of the "statements" (BTW, the line num/file name
can be used as an ID for a statement). In C, the only problem can be
jump tables in switch statements, but you can find the pattern that
your compiler uses and update the jump tables as well.
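To make the file/line idea concrete: if both builds carry line information, an address from either build can be turned into a (file, line) key, and the keys agree even though the addresses do not. The sketch assumes the line table has already been decoded into a sorted array and that each line holds at most one statement of interest; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One row of a (hypothetical, already decoded) line table: code starting
   at `addr` belongs to `file`:`line`.  Rows are sorted by ascending addr. */
struct line_row { uintptr_t addr; const char *file; int line; };

/* Find the row covering `pc`: the last row whose address is <= pc.
   Returns NULL if `pc` lies before the first row. */
static const struct line_row *row_for_pc(const struct line_row *tab,
                                         size_t n, uintptr_t pc)
{
    const struct line_row *hit = NULL;
    size_t i;
    assert(tab != NULL);
    for (i = 0; i < n && tab[i].addr <= pc; i++)
        hit = &tab[i];
    return hit;
}

/* Two addresses from different builds name the same statement when
   their rows agree on file name and line number. */
static int same_statement(const struct line_row *a, const struct line_row *b)
{
    return a != NULL && b != NULL
        && a->line == b->line && strcmp(a->file, b->file) == 0;
}
```

A count recorded against an address in the instrumented build would be stored under the row's file and line, and looked up the same way from the uninstrumented build.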
For a function with one entry (the case of C), it's not very
complicated to make a copy of the function extended with
instrumentation. You don't even need to analyze the executable: a
good disassembler can help in this task, and then you can easily
insert profiling instructions.
The question is whether you really need to do the profiling yourself. Most
compilers will happily do it for you, and there are other tools
that profile executables, as Intel's VTune did for the Win/x86
platform.
Michael
| |
| Ben L. Titzer 2004-11-21, 3:58 am |
| par_ianth@yahoo.com (PK) wrote
> I had a question regarding how compilers do instrumentation to
> collect profile information. Specifically, how do compilers handle the
> part of mapping the collected profile back to the original program
> (which is without the instrumentation).
>
> Say a program X is instrumented with profiling calls for gathering
> branch execution frequency. If I use the virtual address of the branch
> instruction as the index or key against which I collect the frequency
> of that branch, then I have a problem since the virtual address would
> be completely different in the instrumented version and in the
> non-instrumented version. So how do I maintain this mapping between
> the profile data that I collect and the actual instruction in the
> program? I think I need some sort of 'unique' id for each branch in
> the original program. Could someone give me some pointers about how
> this is done in the commercial compilers?
One thing that you can do, depending on the situation, is to run the
program inside an instruction-level simulator that is capable of
gathering the profiling information that you want. For example, a
simulator might collect an execution count for each instruction, a
branch direction count, etc. You can then use this information and the
mapping back to the source program to get source frequency
information.
Depending on how accurate the simulation is (for example, the
simulator may not mimic the real processor's pipeline), you can get
more or less accurate results.
In my work, I have written a simulator for an embedded architecture
that fully simulates the behavior of the hardware at the clock cycle
level. It then allows "probes" to be inserted at various instructions,
and the probes "fire" when that instruction executes. Thus the probe
can collect profiling information, branch frequency, enable other
probes, etc.
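A toy version of the probe mechanism, to show the shape of it. This is not the simulator described above: the three-instruction machine is invented, nothing here is cycle-accurate, and a probe is reduced to a pair of counters rather than arbitrary code.

```c
#include <assert.h>
#include <stddef.h>

enum op { OP_ADDI, OP_BNEZ, OP_HALT };          /* toy instruction set */
struct insn { enum op op; int reg; int arg; };  /* arg: immediate or target */

enum { N_REGS = 4, MAX_PC = 16 };

struct probe_stats { unsigned long fired, taken; };

struct sim {
    int reg[N_REGS];
    struct probe_stats *probe[MAX_PC];  /* NULL where no probe is attached */
};

/* Attach a probe to the instruction at `pc`; the program is untouched. */
static void sim_insert_probe(struct sim *s, size_t pc, struct probe_stats *st)
{
    assert(pc < MAX_PC);
    s->probe[pc] = st;
}

/* Run until HALT, firing the probe (if any) on each executed instruction. */
static void sim_run(struct sim *s, const struct insn *prog, size_t n)
{
    size_t pc = 0;
    assert(n <= MAX_PC);
    while (pc < n && prog[pc].op != OP_HALT) {
        const struct insn *in = &prog[pc];
        struct probe_stats *st = s->probe[pc];
        size_t next = pc + 1;
        int taken = 0;
        assert(in->reg >= 0 && in->reg < N_REGS);
        if (in->op == OP_ADDI) {
            s->reg[in->reg] += in->arg;
        } else if (s->reg[in->reg] != 0) {  /* OP_BNEZ, register non-zero */
            next = (size_t)in->arg;
            taken = 1;
        }
        if (st != NULL) {
            st->fired++;
            if (taken)
                st->taken++;
        }
        pc = next;
    }
}
```

Because the probe is keyed by the program counter of the unmodified program, the counts already refer to the original instructions, and no mapping between an instrumented and an uninstrumented binary is needed.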
This is a good approach if you can get it--it doesn't require hacking
the program or the compiler.
| |
| Nick Maclaren 2004-11-27, 3:56 am |
| Michael Tiomkin <tmk@netvision.net.il> wrote:
>
> Well, with some natural assumptions, you don't need to do full
>parsing for C profiling. You can use the line info in the executable
>to find the images of the "statements" (BTW, the line num/file name
>can be used as an ID for a statement). In C, the only problem can be
>jump tables in switch statements, but you can find the pattern that
>your compiler uses and update the jump tables as well.
>
> For a function with one entry (the case of C), it's not very
>complicated to make a copy of the function extended with
>instrumentation. You don't even need to analyze the executable: a
>good disassembler can help in this task, and then you can easily
>insert profiling instructions.
Well, the original posts were referring to inserting profiling in the
source, but in fact my remarks apply to both, for similar but unrelated
reasons.
Your 'solution' doesn't work in general, for a great many reasons,
including (but not limited to):
The mapping between statements and lines can be very unclear,
ambiguous and even meaningless, especially when the preprocessor is
used heavily.
The mapping between statements and locations in the code is much
the same if any optimisation is used, especially global optimisations
such as inlining.
There may be no disassembler that generates compilable code, nor
even any documentation - even if there is, its output may need
reverse engineering to insert instructions.
There is often not even any documentation on the exact output
format produced by the compiler.
> The question is if you really need to do profiling by yourself. Most
>compilers would happily do this for you, and there are other tools
>that do profiling for executables, like Vtune of Intel did for Win/x86
>platform.
The original poster was asking about how to write such a tool, and
the answer is that you have to write a partial parser for Fortran,
BCPL etc., but effectively a full one for Algol 68, C etc.
In the case of C, you need a full preprocessor stage to extract the
actual statements, and even then inserting code can have some subtle
effects on the code surrounding it. In C90, the main problem was
to do with whether blocks contained declarations, but C99 has added
delightful little "gotchas" like:
6.5.2.5 Compound literals
[#6] The value of the compound literal is that of an unnamed
object initialized by the initializer list. If the compound
literal occurs outside the body of a function, the object
has static storage duration; otherwise, it has automatic
storage duration associated with the enclosing block.
No, you can no longer replace:
a = <expression>;
by:
{++count_table[<statement number>]; a = <expression>;}
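A small C99 illustration of the problem, with one possible workaround for plain expression statements: put the counter in a comma expression so that no new block is introduced. The `bump` helper and the function names are made up for the example, and this is only a sketch for this one case, not a general recipe.

```c
#include <assert.h>

enum { N_STMTS = 1 };
static unsigned long count_table[N_STMTS];

/* Increment the counter for statement `id` (hypothetical helper). */
static void bump(int id)
{
    assert(id >= 0 && id < N_STMTS);
    ++count_table[id];
}

/* Uninstrumented: the compound literal lives until the function body ends. */
static int sum_plain(void)
{
    int *a;
    a = (int[]){1, 2, 3};
    return a[0] + a[1] + a[2];
}

/* Instrumented with a comma expression instead of a new block, so the
   compound literal is still associated with the function body.
   Wrapping the statement as { bump(0); a = (int[]){1, 2, 3}; } would end
   the array's lifetime at the closing brace, and the reads in the return
   statement would then be undefined behaviour. */
static int sum_counted(void)
{
    int *a;
    (bump(0), a = (int[]){1, 2, 3});
    return a[0] + a[1] + a[2];
}
```

Both functions return the same value, and `count_table[0]` records how often the assignment ran.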
Regards,
Nick Maclaren.