Code Comments

Programming Forum and web based access to our favorite programming groups.
For Programmers: Free Programming Magazines | New: Database administration forum
Registration is free! Edit your profileCalendarFind other membersFrequently Asked QuestionsSearch -> 
Post New Thread











Thread
Author

register allocation: basic blocks, liveness and next use
Hi,

I've just implemented register allocation by splitting three address
code instructions into basic blocks, and computing liveness and next
use for each block. I'm assuming that all variables are live at the
end of the block, and all temporaries dead, and move backwards.

This technique works well, except for the following scenario.
Assume the following code segment: print(a(2) + a(-3));

The TAC would be something of this sort:

== Block 1 =========
PUSHPARAM 2
CALL t1 A                       //call A and store the return value in
t1
== Block 2 =========
PUSHPARAM -3
CALL t2 A
== Block 3 =========
ADD t3 = t1, t2
PRINT t3
=================

The problem is that the temporaries in both block 1 (t1) and block 2
(t2) would be discarded (assumed dead since they are at the end of the
block) and block 3 can't compute the addition.

Am I missing something here?

Many thanks for all your help,
K. Phillips.

p.s. I know about the graph colouring technique, I just want to try
both techniques.

Report this thread to moderator Post Follow-up to this message
Old Post
kphillips
03-23-08 03:32 AM


Re: register allocation: basic blocks, liveness and next use
On Mar 22, 8:22 pm, kphillips <kevin.phillip...@yahoo.com> wrote:
> The TAC would be something of this sort:
>
> == Block 1 =========
> PUSHPARAM 2
> CALL t1 A                       //call A and store the return value in
> t1
> == Block 2 =========
> PUSHPARAM -3
> CALL t2 A
> == Block 3 =========
> ADD t3 = t1, t2
> PRINT t3
> =================
>
> The problem is that the temporaries in both block 1 (t1) and block 2
> (t2) would be discarded (assumed dead since they are at the end of the
> block) and block 3 can't compute the addition.
>
> Am I missing something here?

Perhaps the definition of a basic block.

The definition of a temporary requires that none of its defs can reach
the end of the corresponding block.  Obviously your t1..3 don't meet
this requirement.


Report this thread to moderator Post Follow-up to this message
Old Post
Gene
03-24-08 12:38 AM


Re: register allocation: basic blocks, liveness and next use
kphillips <kevin.phillips83@yahoo.com> writes:

> ...
> I've just implemented register allocation by splitting three address
> code instructions into basic blocks, and computing liveness and next
> use for each block. I'm assuming that all variables are live at the
> end of the block, and all temporaries dead, and move backwards.
>
> This technique works well, except for the following scenario.
> [expression containing procedure calls as subexpressions]
> ...
> Am I missing something here?...

Because the lifetime of each of your temporaries is contained within a
single expression, and because an expression is usually contained
within a single basic block, you are expecting the lifetime of each
temporary to be contained within a single basic block.  But sometimes
an expression can contain control flow, and hence be divided across
multiple basic blocks, resulting in multi-block lifetimes for
temporaries.  You found one case of this.  Another case arises if you
have conditional (if-then-else) expressions, such as the ternary ?:
operator in C-family languages, or short-circuit boolean operators,
such as the && and || operators in C-family langauges.

If your language has only procedure calls as intra-expression control
flow, and if the only reason you are breaking the code into basic
blocks is for your local register allocation, then the simplest
solution would be to treat procedure calls as block boundaries.  The
blocks won't be basic blocks any more -- but they probably will be
what you want.

But if you are going to accommodate other control flow within
expressions (such as from ?:, &&, ||), then I think that you are going
to have to make some more major change in the structure of your
compiler.  Off the top of my head, here are some options that seem
plausible:

(0) Bite the bullet and do global analysis -- but I know you are
setting that option aside.

(1) Transform the source program (for example, by AST rewriting prior
to translation to a three-address code) so as to not have any
control flow within an expression.  This would entail introducing
extra variables and breaking statements containing complex
expressions into multiple statements.

(2) Rather than breaking the three-address code into blocks based only
on information within the three-address code (such as the control
flow instructions or the non-procedure-call control flow
instructions), break the three-address code into blocks at
boundaries that correspond to the source-level statements.  That
way,q temporaries' lifetimes will be contained within blocks.  (The
blocks will again not be basic blocks.)  This may require some
restructuring of your compiler if the information about
source-level statements is gone before you do the breaking into
blocks.

Report this thread to moderator Post Follow-up to this message
Old Post
Max Hailperin
03-24-08 12:38 AM


Re: register allocation: basic blocks, liveness and next use
This is a correction to my own earlier post.  I accidentally missed a
"not" in the following:

> ... then the simplest
> solution would be to treat procedure calls as block boundaries.  The
> blocks won't be basic blocks any more -- but they probably will be
> what you want.

I meant to suggest *not* treating procedure calls as block
boundaries.


Report this thread to moderator Post Follow-up to this message
Old Post
Max Hailperin
03-24-08 12:38 AM


Re: register allocation: basic blocks, liveness and next use
Gene <gene.ressler@gmail.com> writes:

> On Mar 22, 8:22 pm, kphillips <kevin.phillip...@yahoo.com> wrote: 
>
> Perhaps the definition of a basic block.

Actually, the original poster doesn't seem to have missed the
definition of a basic block.  The CALL instructions do create basic
block boundaries.  And what you say in the remainder of your message
doesn't contradict this.  Instead, you seem to be suggesting that he
missed the definition of a temporary.

> The definition of a temporary requires that none of its defs can reach
> the end of the corresponding block.  Obviously your t1..3 don't meet
> this requirement.

I'm not convinced that "temporary" has a consensus meaning, the way
"basic block" does.  I've seen plenty of authors other than the
original poster use "temporary" to mean "a name introduced by the
compiler, as opposed to appearing in the source code."

But even if we accept your definition of "temporary" as meaning
"local" (to a basic block), the way you would do that would not be by
stipulating that the definitions can't reach the end of the block, but
rather by stipulating that they not be live at the end of the block.
The last of the definitions for any given name will always reach the
end of the block.  (Recall, a definition "reaches" a point if control
must have passed through the definition on the way to the point
without passing through any other definition of the same name in
between.)


Report this thread to moderator Post Follow-up to this message
Old Post
Max Hailperin
03-24-08 12:38 AM


Re: register allocation: basic blocks, liveness and next use
Many thanks for the suggestions given. I will definitely analyse the
pros and cons and hopefully come up with a solution. It does look
strange that no book pointed this out. But at least I know that I'm on
the right track. Once again, thanks for your help.

Regards,
Kevin Phillips


Report this thread to moderator Post Follow-up to this message
Old Post
kphillips
03-24-08 12:38 AM


Re: register allocation: basic blocks, liveness and next use
On Mar 23, 1:16 pm, Max Hailperin <m...@gustavus.edu> wrote:
> Gene <gene.ress...@gmail.com> writes: 
> 
>
> Actually, the original poster doesn't seem to have missed the
> definition of a basic block.  The CALL instructions do create basic
> block boundaries.  And what you say in the remainder of your message
> doesn't contradict this.  Instead, you seem to be suggesting that he
> missed the definition of a temporary.
> 
>
> I'm not convinced that "temporary" has a consensus meaning, the way
> "basic block" does.  I've seen plenty of authors other than the
> original poster use "temporary" to mean "a name introduced by the
> compiler, as opposed to appearing in the source code."
>
> But even if we accept your definition of "temporary" as meaning
> "local" (to a basic block), the way you would do that would not be by
> stipulating that the definitions can't reach the end of the block, but
> rather by stipulating that they not be live at the end of the block.
> The last of the definitions for any given name will always reach the
> end of the block.  (Recall, a definition "reaches" a point if control
> must have passed through the definition on the way to the point
> without passing through any other definition of the same name in
> between.)

You're right I should have said temps must be dead at the end of the
block, not unreachable.  Terminology brain cramp.  Sorry

On the other hand, I disagree about basic blocks. I was refering to
the ASU definition -- maximal sequences of code with no gotos.  The
intent is for block boundaries to fall at branches and convergences of
flow.  Call/returns don't cause flow branches.  In this case, the OP
is apparently using library calls for array indexing and output,
certainly non-branching operations.


Report this thread to moderator Post Follow-up to this message
Old Post
Gene
03-25-08 12:35 AM


Re: register allocation: basic blocks, liveness and next use
kphillips <kevin.phillips83@yahoo.com> writes:

> Many thanks for the suggestions given. I will definitely analyse the
> pros and cons and hopefully come up with a solution. It does look
> strange that no book pointed this out. But at least I know that I'm on
> the right track. Once again, thanks for your help.

Actually, it's not strange to me at all that no book you read pointed
this out.  It's at the level of detail that is often skipped. (see
below) It has no theoretical implications and at the practical level
any of Max's suggestions will work equally well.  I worked on lots of
code generators in the days when this was a relevant issue and each
author picked a different one of the possibilities, mostly without
thinking that it was worth commenting on.  (Actually, I think each of
the compilers I worked on did explain the exact choices made, but only
in the parts of the code where the choices were implemented.)

Reaching back into the dim recesses of my brain, I recall that Bob
Freiberghouse (author of optimizing PL/I, FORTRAN, et. al. compilers)
used different numbering schemes for temporaries which could cross a
basic block boundary and ones that could not.  Ones within a block
were given positive numbers and were always freed by the end of the
block.  Ones that could cross a basic block were given negative
numbers and explicitly freed by an instruction which marked their last
use.  If I recall correctly, for temporary allocation a call did not
break a basic block.  However, for other optimizations it did.

Fred Chow (author of Pascal, C, et. al. compilers) treated temporaries
like variables, but had a local numbering scheme for all variable uses
within a basic block and a global numbering scheme for uses of a
variable without respect to block boundaries.  From any given use, one
could always access both numbering schemes.  I believe calls always
broke basic blocks.

My recollection of other compilers is more sketchy, as I did
significantly more work on the above suites than on PCC, GCC, or any
other compilers.  However, the issue was well on the mind of C
compiler writers as you will find that the standard (at one time at
least) talked about which values had to be preserved/restored over
setjmp/longjmp calls.  The motivation for that was to allow
temporaries to be kept in registers across calls.

In general, most compilers that I have seen treat temporaries like
variables.  That is generally fairly simple to implement and is easy
to get semantically correct.  If one uses that as the general case,
but then detects and special cases the temporaries which live entirely
in one basic block, one gets most of the easy-to-find important cases.

I've seen more divergence in whether a call breaks a block or not.
There are definite advantages both ways.  The problem with calls is
the semantics that often get layered on top of them, such as non-local
gotos (exceptions or signals) or aliasing and non-local reference
issues.

below: While I am not surprised you haven't found a book that covered
that aspect, I suspect that it isn't that it is never addressed in
compiler texts.  For example, I would be willing to bet that Bob
Morgan's book covers it, as did Bill Wulf's book, and the book on the
VAX PL/I compiler.  I would also be surprised if it wasn't addressed
somewhere in Steve Muchnick's book.  I know that Fred Chow's PhD
thesis explains the choices he made in that area.  Unfortunately, most
of those books are either old or not at the introductory level.
Moreover, I think there are many fine compiler texts that wouldn't be
better if they mentioned that kind of detail.  I suspect most student
compilers implement languages where not breaking a block at a call
boundary solves the problem and they don't have any semantics where
doing that causes things to break.

Hope this helps,
-Chris

 ****************************************
************************************
**
Chris Clark              Internet: christopher.f.clark@compiler-resources.co
m
Compiler Resources, Inc.       or: compres@world.std.com
23 Bailey Rd             Web Site: http://world.std.com/~compres
Berlin, MA  01503           voice:  (508) 435-5016
USA                           fax:  (978) 838-0263  (24 hours)
----------------------------------------------------------------------------
--

Report this thread to moderator Post Follow-up to this message
Old Post
Chris F Clark
03-25-08 12:35 AM


Re: register allocation: basic blocks, liveness and next use
> I suspect most student
> compilers implement languages where not breaking a block at a call
> boundary solves the problem and they don't have any semantics where
> doing that causes things to break.

Thanks for your in-depth reply. I'm still relatively new to the area
and started off through books that are intended for academic courses.
Augmenting the grammar / features does in fact lead you to situations
not addressed by the same books (although most of the questions can be
answered in the dragon book ...). Still, this mailing list archive has
been a great resource so far in looking for answers and leads.


Regards,
K. Phillips


Report this thread to moderator Post Follow-up to this message
Old Post
kphillips
03-28-08 09:55 AM


Sponsored Links




Last Thread Next Thread Next
Search this forum -> 
Post New Thread

Compilers archive

Show a Printable Version Send to friend Email This Page to Someone! subscribe to this thread Receive updates to this thread
Computer Consultants
Programming Jobs
Visual Basic Controls
SQL Server Programming
Webservices
Java Security
Visual Studio
C# Programming
Visual J++
Software engineering
Open source Software
Perl Programming
PHP Programming
ASP Programming
ASP .NET Programming
Visual Basic Programming
Windows Scripting Host
Java Programming
Java Help
Java Beans
VBScript
Cobol
MAC Applications
Unix Programming
Forum Jump:
All times are GMT. The time now is 09:52 PM.

 
Free MCSE Braindumps | Real Estate Topics

Programming forum archive

Copyrights CodeComments.com 2004 - 2006

Powered by vBulletin Copyright 2000-2006 Jelsoft Enterprises Limited.