Home > Archive > Compilers > August 2007 > latest trends in compiler optimization research?
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
latest trends in compiler optimization research?
|
|
| nikita desai 2007-07-29, 4:27 am |
| I'm from gujarat , a state in india .
i am currently pursuing my Masters in computer engginerring and have
to do research work for it , and i want to know what is latest
research work happening in the code optimization field of compilers
.... can anyone suggest me some website for it? thanks
Ms Nikita Desai,
Lecturer ,
IT department,
DDU, Nadiad
| |
| Jason Lee Eckhardt 2007-07-29, 7:10 pm |
| nikita desai <npd_ddit@yahoo.com> wrote:
>i am currently pursuing my Masters in computer engginerring and have
>to do research work for it , and i want to know what is latest
>research work happening in the code optimization field of compilers
>... can anyone suggest me some website for it? thanks
Try searching for PLDI, MICRO, JILP, IEEE TC, TOPLAS, TACO, etc.
That is, read the conference proceedings and journals in the area
to get a feel for the trends in code optimization/compiler research.
There are many more, but that should be plenty to get you started.
Once you have read some of those papers, their bibliographies will
lead you to many more publications.
| |
| SM Ryan 2007-07-29, 7:10 pm |
| nikita desai <npd_ddit@yahoo.com> wrote:
# I'm from gujarat , a state in india .
#
# i am currently pursuing my Masters in computer engginerring and have
# to do research work for it , and i want to know what is latest
# research work happening in the code optimization field of compilers
# ... can anyone suggest me some website for it? thanks
I'm out of the bleeding edge, but an important lesson I got some
years ago: the best way to optimise code is provide programmers
with a notation that let's them express optimal code. I worked
many years ago on the Cyber 205 Fortran vectoriser; as good as we
got the vectoriser, most people still preferred to directly use
the vector notation added to Fortran.
The greatest speeds up are how the original code is written
not how you tinker with the code as written. So give programmers
the tools to write good code.
--
SM Ryan http://www.rawbw.com/~wyrmwif/
[It is indeed amazing how much advanced program analysis is just
recovering the program that was in the programmer's head when he wrote
the code. -John]
| |
| Ray Dillinger 2007-07-31, 4:24 am |
| nikita desai wrote:
> I'm from gujarat , a state in india .
>
> i am currently pursuing my Masters in computer engginerring and have
> to do research work for it , and i want to know what is latest
> research work happening in the code optimization field of compilers
> ... can anyone suggest me some website for it? thanks
I don't know what the latest research work or trends are, but
I can say what some of the biggest problems that compiler
writers need to solve in modern architectures are.
Effective use of multi-core CPU's,
Effective use of multi-CPU architectures,
Effective use of shared memory and cache in multi-CPU architectures,
Effective use of per-CPU memory or per-core memory in multi-CPU
architectures.
If you can find anyone who has a general and useful set of
theoretical or practical tools to address these problems, I
would say you're way ahead of the game.
Bear
| |
| Amit Gupta 2007-07-31, 10:11 pm |
| On Jul 30, 10:34 am, Ray Dillinger <b...@sonic.net> wrote:
> Effective use of multi-core CPU's,
> Effective use of multi-CPU architectures,
> Effective use of shared memory and cache in multi-CPU architectures,
> Effective use of per-CPU memory or per-core memory in multi-CPU
> architectures.
Right on the mark, Bear.
I would like to make another request. Can I solicit good references
for any of the above topics (books, preferably research papers). If
different poeple make suggestions over a period of time, I can repost
later collecting all the references in one single post.
Thanks
| |
|
| Well, as in any science, the more you discover, the less you know...
In compiler optimisation, it depends if you target computer engineering
or computer science.
In computer engineering, you can check the latest or the upcoming
architectures of the market (multicore, DSP, VLIW, etc.) and see what
you can do for them. Any empirical result would be interesting for
quick publication in conferences, but such results become obsolete
after few years (even if they would be cited for many years).
In computer science, you can take an open problem (there are too many,
from backend to frontend), often highlighted by computer engineering,
and try to have a mathematical result. It is a more difficult process,
but your results would stay valid forever.
S
| |
| Roberto Waltman 2007-08-07, 7:16 pm |
| Ray Dillinger wrote:
>...some of the biggest problems that compiler
>writers need to solve in modern architectures are.
>
>Effective use of multi-core CPU's,
>Effective use of multi-CPU architectures,
I understand the implications of sharing (or not) on-chip caches,
differences in bandwidth when transferring data on and off chip, etc.
but still...
Are these two really two different problems?
Roberto Waltman
| |
| Tony Finch 2007-08-07, 7:16 pm |
| Ray Dillinger <bear@sonic.net> wrote:
>
>I don't know what the latest research work or trends are, but
>I can say what some of the biggest problems that compiler
>writers need to solve in modern architectures are.
>
>Effective use of multi-core CPU's,
>Effective use of multi-CPU architectures,
>Effective use of shared memory and cache in multi-CPU architectures,
>Effective use of per-CPU memory or per-core memory in multi-CPU
> architectures.
I don't think these are problems that can be tackled just by compiler
writers: they are more in the realm of language designers or system
architects (depending on whether you put the features in the language
or the library). Promising approaches include:
* message-passing, such as in Erlang
* software transactional memory
* data-parallel array ops
all of which require the right kind of application design to work.
Of course the compiler writer has to make them work *well*, but that's
out of scope if the compiler implements C.
Tony.
--
f.a.n.finch <dot@dotat.at> http://dotat.at/
| |
|
| On Jul 29, 8:49 am, nikita desai <npd_d...@yahoo.com> wrote:
> I'm from gujarat , a state in india .
>
> i am currently pursuing my Masters in computer engginerring and have
> to do research work for it , and i want to know what is latest
> research work happening in the code optimization field of compilers
> ... can anyone suggest me some website for it? thanks
Parallellizing compilers.
you better visit http://gcc.gnu.org/wiki
Onkar Mahajan
TCS Mumbai
| |
| Anton Lokhmotov 2007-08-07, 7:16 pm |
| Dear SM Ryan,
> I worked many years ago on the Cyber 205 Fortran vectoriser; as good
> as we got the vectoriser, most people still preferred to directly
> use the vector notation added to Fortran.
I really like this comment of yours. I am writing a PhD on programming
embedded SIMD architectures, and would love to quote something like
that. Could you please let me know if there's any particular source I
could quote (papers, manuals, memoires, etc.)?
Of course, I'd be happy to hear any other comments from the members of
our esteemed community :).
Kind regards,
Anton.
--
http://www.cl.cam.ac.uk/~al407/
| |
| SM Ryan 2007-08-07, 7:16 pm |
| Anton Lokhmotov <al407@cam.ac.uk> wrote:
# Dear SM Ryan,
# > I worked many years ago on the Cyber 205 Fortran vectoriser; as good
# > as we got the vectoriser, most people still preferred to directly
# > use the vector notation added to Fortran.
#
# I really like this comment of yours. I am writing a PhD on programming
# embedded SIMD architectures, and would love to quote something like
# that. Could you please let me know if there's any particular source I
# could quote (papers, manuals, memoires, etc.)?
I don't know if any documentation has survived; this was twenty to
thirty years ago. The 205 and Kuck vectorisers had the same problem
that is also hitting modern parallelisers: the transform cannot modify
the semantics of the original serial code, and the original serial
code contains more order constraints than the programmer
intended. Which means the programmer has to somehow tell the optimiser
that there are no conflicts, or the conflicts are not signficant.
A rather simple example, is that the 205 had vector instructions for
do 1 i=1,n
1 a(x(i)) = s*a(x(i))
with vector gather, vector multiply, and vector scatter, but it's
impossible in general to prove the loop is interference free. So a
vectoriser cannot vectorise it. However the programmer may know
absolutely that x has no repeated values and is actually interference
free, but the serial notation does not permit this to be expressed.
Fortran programmers instead of trying to craft serial loops that
could be recognised, often wrote exactly what they wanted
in the vector notation extension to the compiler. I don't recall
exactly what it was, but it was something like
call v8gathr(t,a,x,n)
t(1;n) = s*t(1;n)
call v8scatr(a,t,x,n)
so the programmer expresses exactly what they intend and take
full responsibility for proving no interference upon themselves.
I eventually realised it's better use of everyone's effort to give
programmers a good notation for expressing themselves accurately and
clearly without unnecessary constraints to express their algorithm at
a high level. (The 205 vector notation was not very good; in
retrospect it would've been a better use of resources to improve the
vector notation than to worry about the vectoriser.)
I lost track of Fortran before 90 was standarised, but the array
notation of 8X was seriously broken: while theoretically letting
programmers express themselves at a high level, it forced serial
semantics whether the programmer intended that or no; thus the
notation could not be vectorised or parallelised any easier than if
explicit do loops had been used.
Transforming code to a parallel version assuming no conflicts is
always trivial. There is no real advantage to a special array notation
just for that. The intractable part is proving there is no
interference so that the reordered execution of the parallelised code
is semantically equivalent to the serial code; a notation that does
not require that often impossible proof is a big win for the
programmer, assuming the programmer accepts the proof responsibility
for themselves.
--
SM Ryan http://www.rawbw.com/~wyrmwif/
One of the drawbacks of being a martyr is that you have to die.
| |
| wclodius@lanl.gov 2007-08-09, 7:16 pm |
| On Aug 7, 2:03 pm, SM Ryan <wyrm...@tsoft.org> wrote:
> Anton Lokhmotov <al...@cam.ac.uk> wrote:
><sni>
> A rather simple example, is that the 205 had vector instructions for
>
> do 1 i=1,n
> 1 a(x(i)) = s*a(x(i))
>
> with vector gather, vector multiply, and vector scatter, but it's
> impossible in general to prove the loop is interference free. So a
> vectoriser cannot vectorise it. However the programmer may know
> absolutely that x has no repeated values and is actually interference
> free, but the serial notation does not permit this to be expressed.
>
> Fortran programmers instead of trying to craft serial loops that
> could be recognised, often wrote exactly what they wanted
> in the vector notation extension to the compiler. I don't recall
> exactly what it was, but it was something like
>
> call v8gathr(t,a,x,n)
> t(1;n) = s*t(1;n)
> call v8scatr(a,t,x,n)
>
> so the programmer expresses exactly what they intend and take
> full responsibility for proving no interference upon themselves.
>
> I eventually realised it's better use of everyone's effort to give
> programmers a good notation for expressing themselves accurately and
> clearly without unnecessary constraints to express their algorithm at
> a high level. (The 205 vector notation was not very good; in
> retrospect it would've been a better use of resources to improve the
> vector notation than to worry about the vectoriser.)
>
> I lost track of Fortran before 90 was standarised, but the array
> notation of 8X was seriously broken: while theoretically letting
> programmers express themselves at a high level, it forced serial
> semantics whether the programmer intended that or no; thus the
> notation could not be vectorised or parallelised any easier than if
> explicit do loops had been used.
I would not call the array notation seriously broken. Yes it does not
provide semantics that are not present in the equivlent assignment
using do loops, and optimizing array expression code is equivalent to
optimizing the equivalent loops. But it does provide a more compact way
of writing that equivalent code, and that code is as a result quicker
to write, easier to read, and less prone to errors. The most
straightforward optimizations for such codes were also already
imlemented inthe Fortran banckend (optimizations involving examining
sequences of such assignments took awhile). Further sequential
semantics is the safest semantics for this type of notation.
FWIW Fortran 95 and 2003 have the FORALL assignment statement that
provides the semantics you desire, however it has not been popular in
the modern codebase for a variety of reasons. Novices tend to view the
FORALL syntax as an alternative form of DO syntax, and not an
assignment statement, i.e., they might use it in error prone ways.
Most processors in use today, particularly the desktop machines
commonly used in early development, do not have gather/scatter
capabilities, so that semantics in practice often does not provide the
optimization that it allows. (Although similar semantics are useful
for networks of such processors.) While the gather/scatter semantics is
useful for some types of codes, e.g., sparse matrix calculations, they
are only a subset of scientific codes, and not always the time
critical parts.
>
> Transforming code to a parallel version assuming no conflicts is
> always trivial. There is no real advantage to a special array notation
> just for that. The intractable part is proving there is no
> interference so that the reordered execution of the parallelised code
> is semantically equivalent to the serial code; a notation that does
> not require that often impossible proof is a big win for the
> programmer, assuming the programmer accepts the proof responsibility
> for themselves.
| |
| Brooks Moses 2007-08-09, 10:14 pm |
| SM Ryan wrote:
> I lost track of Fortran before 90 was standarised, but the array
> notation of 8X was seriously broken: while theoretically letting
> programmers express themselves at a high level, it forced serial
> semantics whether the programmer intended that or no; thus the
> notation could not be vectorised or parallelised any easier than if
> explicit do loops had been used.
The problem with the actual standardized Fortran 90 semantics is
arguably the opposite: it forces parallel semantics whether the
programmer intended that or no. For example,
a(x) = s * a(y)
(where x and y are vectors) allows x to contain some of the same
elements as y, and requires results that are consistent with calculating
the entire vector of right-hand sides prior to doing any assignments.
This tends, in cases where the scalarizer isn't sufficiently clever (or
sufficiently well-informed), to result in the creation of unnecessary
array temporaries -- which can be a substantial performance hit in cases
where those temporaries are large.
- Brooks
--
The "bmoses-nospam" address is valid; no unmunging needed.
| |
| Tony Finch 2007-08-10, 7:15 pm |
| SM Ryan <wyrmwif@tsoft.org> wrote:
>
>I lost track of Fortran before 90 was standarised, but the array
>notation of 8X was seriously broken: while theoretically letting
>programmers express themselves at a high level, it forced serial
>semantics whether the programmer intended that or no; thus the
>notation could not be vectorised or parallelised any easier than if
>explicit do loops had been used.
There was an interesting paper about High Performance Fortran presented
at the recent HOPL III conference. HPF adds parrallelization pragmas to
Fortran 90.
http://portal.acm.org/citation.cfm?id=1238844.1238851
Tony.
--
f.a.n.finch <dot@dotat.at> http://dotat.at/
| |
| srimks11@gmail.com 2007-08-11, 7:15 pm |
| > i am currently pursuing my Masters in computer engginerring and have
> to do research work for it , and i want to know what is latest
> research work happening in the code optimization field of compilers
> ... can anyone suggest me some website for it? thanks
>
> Ms Nikita Desai,
> Lecturer ,
> IT department,
> DDU, Nadiad
Please check http://www.hipeac.net
(High-Performance Embedded Architecture and Compilation)
It contains depth of research works related to Compilers & Embedded
Computing.
HIH
Mukesh K Srivastava
srimks11@gmail.com
| |
| glen herrmannsfeldt 2007-08-12, 8:12 am |
| wclodius@lanl.gov wrote:
(snip)
PL/I has the serial semantics, where, for example,
A=A+A(5);
the new value of A(5) is used in additions for elements past A(5).
Fortran instead requires the opposite, that the effect is as if the
whole right side was evaluated before any change is made to the left
side. If the compiler can't verify no side effects, a temporary is
needed.
(snip)
[color=darkred]
> FWIW Fortran 95 and 2003 have the FORALL assignment statement that
> provides the semantics you desire, however it has not been popular in
> the modern codebase for a variety of reasons.
I don't believe that FORALL does what you say, though I think it
should have done that. From Fortran 2003:
"Execution of a forall-assignment-stmt that is an assignment-stmt
causes the evaluation of expr and all 15 expressions within
variable for all active combinations of index-name values.
These evaluations may be 16 done in any order. After all these
evaluations have been performed, each expr value is assigned to
the 17 corresponding variable. The assignments may occur in any
order."
The right side can be evaluated in any order, and the values assigned
in any order, but all evaluations are, in theory, done before any
assignments. Again, temporaries may be required.
(snip)
-- glen
| |
|
|
|
|
|