Syntax directed compilation
| Barry Kelly 2007-05-27, 7:11 pm |
| I recall, a long time back on this group, people pointing out that
languages supporting redefinable or user extensible grammars have
never taken off, and that like heavy armour on insects, it's a feature
more notable of the extinct than the extant. The argument against them
seems to be "too much power, users write their own languages and then
can't understand one another's". That always seemed like a weak
argument to me, and in this day and age of DSLs and indirect program
rewriting in dynamic languages, I wonder if it just hasn't been done
correctly yet.
I've recently taken to thinking of source code as being a program to
produce an executable, where compilers are interpreters of the text.
That is, a declaration like 'int x;' in a C-like language is an
imperative statement in the compiler's language to 'declare a variable
called "x", of type "int", and add it to the current scope'.
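A minimal sketch of this view, in Python rather than any real compiler's internals. The `Compiler` class and its `declare`, `lookup` and `run` methods are hypothetical names chosen for illustration; the toy input language (only `type name;` declarations and `{ }` blocks) is likewise an assumption. Each declaration is simply executed as a command that mutates the compiler's scope stack:

```python
class Compiler:
    """Toy compiler state: a stack of scopes mapping variable names to type names."""

    def __init__(self):
        self.scopes = [{}]  # the innermost scope is the last element

    def declare(self, type_name, var_name):
        """The imperative action that a declaration such as 'int x;' stands for."""
        scope = self.scopes[-1]
        if var_name in scope:
            raise NameError(f"{var_name!r} is already declared in this scope")
        scope[var_name] = type_name

    def lookup(self, var_name):
        """Search from the innermost scope outwards."""
        for scope in reversed(self.scopes):
            if var_name in scope:
                return scope[var_name]
        raise NameError(f"{var_name!r} is not declared")

    def run(self, source):
        """Interpret the source text statement by statement."""
        # Treat braces as statements of their own so one loop handles everything.
        padded = source.replace("{", " { ; ").replace("}", " } ; ")
        for statement in padded.split(";"):
            words = statement.split()
            if not words:
                continue
            if words == ["{"]:
                self.scopes.append({})      # entering a block opens a scope
            elif words == ["}"]:
                self.scopes.pop()           # leaving a block discards it
            elif len(words) == 2:
                self.declare(*words)        # 'int x' -> declare("int", "x")
            else:
                raise SyntaxError(f"cannot interpret {statement.strip()!r}")


compiler = Compiler()
compiler.run("int x; { char y; }")
print(compiler.lookup("x"))  # x is still visible; y left scope with its block
```

Nothing here generates code; the point is only that the "program" the compiler runs is the sequence of declarations itself.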
In this view, it is quite noticeable that most languages don't support
much abstraction during this compile-time interpretation. C++
templates seem like a limited and clumsy form of functions over parse
trees, while C#/.NET-style type generics are single-level type
constructors, and method generics method constructors. C macros are a
very crude manipulation of the source text, with almost no syntax or
type awareness. But this is all small stuff compared to the adventures
of Ruby and its ilk.
As far as I can see, the closest living embodiment of the concept is
Lisp, but Lisp programs seem not to have explored this particular
level of abstraction - viewing the source read by the Lisp interpreter
/ compiler as being instructions on how to create an
executable. Instead, while much macro magic is available, ultimately
it only affects the code subsequently read, rather than defining the
back-end of a compiler.
To old fogies of 27 like me, who like statically verified type safety,
one of the major advantages of the approach I'm describing is that it
enables both (a) Lisp-level abstraction, symbolic manipulation and
program rewriting and (b) compile-time type safety with a classically
efficient target executable. It brings the dynamic touch to the static
world. It also moves much, if not most, of the body of the compiler into
the run-time library of the language.
Has there been research in this area that I've missed on my searches
(using 'syntax directed compilation' as my main phrase)?
Is the idea trivially dismissible by some argument I'm not aware of?
Or is the idea completely bonkers, and am I clearly advertising my
insanity by outlining it?
-- Barry
--
http://barrkel.blogspot.com/
[IMP72 let you extend the syntax either as macros or by writing calls
to the back end code generator. In practice it had all of the same
problems of write-only code as other extensible languages. -John]
| |
| Aaron Gray 2007-05-29, 8:08 am |
| "Barry Kelly" <barry.j.kelly@gmail.com> wrote in message
>I recall, a long time back on this group, people pointing out that
> languages supporting redefinable or user extensible grammars have
> never taken off, and that like heavy armour on insects, it's a feature
> more notable of the extinct than the extant. The argument against them
> seems to be "too much power, users write their own languages and then
> can't understand one another's". That always seemed like a weak
> argument to me, and in this day and age of DSLs and indirect program
> rewriting in dynamic languages, I wonder if it just hasn't been done
> correctly yet.
Yes, me too.
> Has there been research in this area that I've missed on my searches
> (using 'syntax directed compilation' as my main phrase)?
Dylan is worth looking at; it has a macro system that gives a front-end
approach.
http://www.opendylan.org/books/drm/
Also you can look at 'prop' :-
http://prop-cc.sf.net
I ported it to MSVC and modern GCC some time ago but have not really done
much with it. It's a preprocessor for C++, with compiler-compiler and
functional facilities.
> Is the idea trivially dismissible by some argument I'm not aware of?
>
> Or is the idea completely bonkers, and am I clearly advertising my
> insanity by outlining it?
No, it's just a very difficult thing to do and get right in a language. It
really needs a proper architecture and/or an "open compiler", to coin a
term.
Aaron
| |
| Steven Nichols 2007-05-29, 8:08 am |
| Barry Kelly <barry.j.kelly@gmail.com> wrote:
> I recall, a long time back on this group, people pointing out that
> languages supporting redefinable or user extensible grammars have
> never taken off, and that like heavy armour on insects, it's a feature
> more notable of the extinct than the extant. The argument against them
> seems to be "too much power, users write their own languages and then
> can't understand one another's". That always seemed like a weak
> argument to me, and in this day and age of DSLs and indirect program
> rewriting in dynamic languages, I wonder if it just hasn't been done
> correctly yet.
I agree; there are macro languages that allow detailed control over
type checking rules, code generation, etc. I wrote one called ML1 (which
is not ML/1); it runs on DOS and will also compile for non-80X86
systems. It allows you to define the rules for types, the output
code, code optimization (local optimization), etc. It's a macro
compiler, and it's written in its own language. What it does is load
scripts that define the commands, and as the commands are encountered
the scripts (macros) are called to process them. It has support for
object-oriented language definitions. It comes with a script that
defines a structured, low-level, BASIC-like language. You can design
your own control structures, custom types, data coercion, etc.
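The general shape of such a macro compiler (commands defined by loadable scripts, each called when its command is encountered) can be sketched in Python. This is not ML1's implementation or syntax; `MacroCompiler`, `define`, `emit`, `compile_source`, the `LET` and `PRINT` commands and the `LOAD`/`STORE`/`CALL` output lines are all hypothetical names invented for the illustration:

```python
class MacroCompiler:
    """Toy macro-driven compiler: every command is handled by a registered macro."""

    def __init__(self):
        self.macros = {}   # command name -> handler function
        self.output = []   # emitted target-code lines

    def define(self, name):
        """Decorator that registers a handler for the command 'name'."""
        def register(handler):
            self.macros[name.upper()] = handler
            return handler
        return register

    def emit(self, line):
        self.output.append(line)

    def compile_source(self, source):
        """Dispatch each source line to the macro named by its first word."""
        for lineno, line in enumerate(source.splitlines(), 1):
            words = line.split()
            if not words:
                continue
            handler = self.macros.get(words[0].upper())
            if handler is None:
                raise SyntaxError(f"line {lineno}: unknown command {words[0]!r}")
            handler(self, *words[1:])
        return self.output


# The "language definition script": the core above knows no commands at all.
mc = MacroCompiler()

@mc.define("LET")
def let(c, target, equals_sign, value):
    c.emit(f"LOAD {value}")
    c.emit(f"STORE {target}")

@mc.define("PRINT")
def print_command(c, name):
    c.emit(f"LOAD {name}")
    c.emit("CALL print")

code = mc.compile_source("LET x = 5\nPRINT x")
print("\n".join(code))
```

Because the dispatcher has no built-in commands, swapping the definition script changes the accepted language without touching the core, which is the property described above.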
The default runtime control structure has an Exit command that allows
the equivalent of a forward GOTO to anywhere, without labels.
You can download it and use it free of charge (commercial use included)
at: www.ml1compiler.org
The compiler itself is only a 45K executable.
Steve
| |
| thomas.mertes@gmx.at 2007-05-30, 4:18 am |
| On 26 May, 13:45, Barry Kelly <barry.j.ke...@gmail.com> wrote:
> I recall, a long time back on this group, people pointing out that
> languages supporting redefinable or user extensible grammars have
> never taken off, and that like heavy armour on insects, it's a feature
> more notable of the extinct than the extant. The argument against them
> seems to be "too much power, users write their own languages and then
> can't understand one another's". That always seemed like a weak
> argument to me, and in this day and age of DSLs and indirect program
> rewriting in dynamic languages, I wonder if it just hasn't been done
> correctly yet.
It is also my opinion that it just hasn't been done correctly by
most programming languages.
> I've recently taken to thinking of source code as being a program to
> produce an executable, where compilers are interpreters of the text.
> That is, a declaration like 'int x;' in a C-like language is an
> imperative statement in the compiler's language to 'declare a variable
> called "x", of type "int", and add it to the current scope'.
This is the way Seed7 works. The declaration statements are executed at
compile time. They declare new objects and add them to the current
scope.
That means: A Seed7 program is just a sequence of statements which are
executed at compile time. Since these statements are (most of the time)
declaration statements, this declares the objects of the program.
A 'writeln' statement at top level just prints a message at compile
time. The 'include' command (except for the first one, which boots
Seed7) is also a statement which is executed at compile time. You are
able to declare new declaration (and other) statements which can be
used to change the language. But this is not as easy as it sounds. All
the different things need to fit into an overall concept. In the future
I plan to create an 'import' statement (for a module / package system)
which is based on the 'include' statement. After compile time the
interpreter looks for a function named 'main' and starts it. In the
compiler (which compiles to C) the code generation phase is started
instead.
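The two phases described here can be modelled with a small Python sketch. It is an analogy only, not Seed7's implementation: the tuple encoding of statements and the names `execute_declarations` and `interpret` are assumptions made for the example. Top-level statements run first and fill the scope; only afterwards is 'main' looked up and started:

```python
def execute_declarations(statements, scope, log):
    """'Compile time': execute every top-level statement in order."""
    for statement in statements:
        kind = statement[0]
        if kind == "const":                 # ("const", name, value) declares an object
            _, name, value = statement
            if name in scope:
                raise NameError(f"{name!r} is already declared")
            scope[name] = value
        elif kind == "writeln":             # ("writeln", text) prints while compiling
            log.append(statement[1])
        elif kind == "include":             # ("include", statements) runs another file
            execute_declarations(statement[1], scope, log)
        else:
            raise SyntaxError(f"unknown statement {kind!r}")


def interpret(statements):
    """Run the declarations, then look for 'main' and start it."""
    scope, log = {}, []
    execute_declarations(statements, scope, log)
    main = scope.get("main")
    if main is None:
        raise NameError("no function named 'main' was declared")
    return log, main(scope)                 # 'run time' begins here


library = [("const", "greeting", "hello")]
program = [
    ("include", library),
    ("writeln", "compiling..."),
    ("const", "main", lambda scope: scope["greeting"] + " world"),
]
log, result = interpret(program)
print(log, result)
```

A compiler built on the same front end would replace the final call to `main` with a code generation pass over the populated scope.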
> In this view, it is quite noticeable that most languages don't support
> much abstraction during this compile-time interpretation. C++
> templates seem like a limited and clumsy form of functions over parse
> trees, while C#/.NET-style type generics are single-level type
> constructors, and method generics method constructors. C macros are a
> very crude manipulation of the source text, with almost no syntax or
> type awareness.
The Seed7 templates are also functions which are executed at compile
time. These functions contain other declarations in their body and
declare the necessary things this way. For example:
const proc: FOR_DECLS (in type: aType) is func
  begin
    const proc: for (inout aType: variable) range (in aType: low) to
        (in aType: high) do
        (in proc: statements) end for is func
      begin
        variable := low;
        if variable <= high then
          statements;
          while variable < high do
            incr(variable);
            statements;
          end while;
        end if;
      end func;
  end func;

FOR_DECLS(char);
FOR_DECLS(boolean);
As you can see: The instantiation of the templates must be explicit.
With 'FOR_DECLS(char)' the declaration statement 'const proc: for ...'
is executed, which declares a for loop for char. Types are just
used as normal parameters. There is no special template / generic
syntax with '<' and '>' as in other languages.
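For readers who do not know Seed7, a rough Python analogue of the template above may help. It is only an analogy under stated assumptions: Python has no compile-time phase, the `for_loops` registry and the explicit `incr` parameter are inventions of this sketch (Seed7 finds the right 'incr' for the type by itself), and a one-character `str` stands in for char:

```python
for_loops = {}   # registry: type -> the 'for' loop declared for that type


def FOR_DECLS(a_type, incr):
    """Declare a 'for' loop for a_type; the type is an ordinary parameter."""

    def for_range(low, high, statements):
        # Same shape as the Seed7 body: never increment past 'high'.
        variable = low
        if variable <= high:
            statements(variable)
            while variable < high:
                variable = incr(variable)
                statements(variable)

    for_loops[a_type] = for_range


# Explicit instantiation: no loop exists for a type until this is executed.
FOR_DECLS(str, lambda c: chr(ord(c) + 1))   # stands in for char
FOR_DECLS(bool, lambda b: True)             # the successor of False is True

letters = []
for_loops[str]("a", "c", letters.append)

flags = []
for_loops[bool](False, True, flags.append)

print(letters, flags)
print(int in for_loops)   # no loop was declared for int
```

The body is run once for 'low' before the first increment, so the loop reaches 'high' without ever computing a value beyond it.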
> To old fogies of 27 like me, who like statically verified type safety,
I also consider "statically verified type safety" very important.
> one of the major advantages of the approach I'm describing is that it
> enables both (a) Lisp-level abstraction, symbolic manipulation and
> program rewriting and (b) compile-time type safety with a classically
> efficient target executable. It brings the dynamic touch to the static
> world. It also moves much, if not most, of the body of the compiler into
> the run-time library of the language.
This is exactly my approach.
> Has there been research in this area that I've missed on my searches
> (using 'syntax directed compilation' as my main phrase)?
Look for "extensible programming language".
I also suggest you look for Seed7 :-)
Greetings Thomas Mertes
Seed7 Homepage: http://seed7.sourceforge.net
Project page: http://sourceforge.net/projects/seed7
| |
| Chris F Clark 2007-05-30, 4:18 am |
| "Aaron Gray" <ang.usenet@gmail.com> writes:
> "Barry Kelly" <barry.j.kelly@gmail.com> wrote in message
> Dylan is worth looking at it has a macro system that gives a frontended
> approach.
>
> http://www.opendylan.org/books/drm/
....
> No, it just a very difficult thing to do and get right in a language. It
> really needs a proper archetecture and/or an "open compiler", to coin a
> term.
Actually, the term open compiler has an established meaning which may
or may not suit the original poster's needs, but I suspect OpenDylan is
following this concept, which comes out of the Lisp community and the
Meta-Object Protocol and evolved into aspect-oriented
programming--that is, if I have my terms right. There was even an
OpenC++ implementation, if I recall correctly. The name this is
associated with in my mind is Gregor Kiczales. I believe some of
this work became mainstream as "reflection".
This is not my area (although I really like what I've read about
AspectJ, and hope to someday incorporate aspect oriented ideas into
Yacc++), so I may have muddled three or more unrelated things
together.
Hope this helps,
-Chris
****************************************
*************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
23 Bailey Rd voice : (508) 435-5016
Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)
------------------------------------------------------------------------------
| |
| Matthew X. Economou 2007-05-31, 10:10 pm |
| >>>>> "Barry" == Barry Kelly <barry.j.kelly@gmail.com> writes:
Barry> As far as I can see, the closest living embodiment of the
Barry> concept is Lisp, but Lisp programs seem not to have
Barry> explored this particular level of abstraction - viewing the
Barry> source read by the Lisp interpreter / compiler as being
Barry> instructions on how to create an executable. Instead, while
Barry> much macro magic is available, ultimately it only affects
Barry> the code subsequently read, rather than defining the
Barry> back-end of a compiler.
I'm not a Lisp expert, but maybe you can prototype something similar
to what you propose with reader macros (especially the read-time-eval
macro "#.") and compiler macros. It might be worth posting your
comments on comp.lang.lisp.
Best wishes,
Matthew
| |
| Barry Kelly 2007-06-20, 10:09 pm |
| Barry Kelly wrote:
> I recall, a long time back on this group, people pointing out that
> languages supporting redefinable or user extensible grammars have
> never taken off
> Has there been research in this area that I've missed on my searches
> (using 'syntax directed compilation' as my main phrase)?
Replying to my own post, because I've found some interesting related
work not previously mentioned:
http://www.chrisseaton.com/katahdin/
"Katahdin is a programming language where the syntax and semantics are
mutable at runtime."
[...]
"New constructs such as expressions and statements can be defined, or a
new language can be implemented from scratch. It is built as an
interpreter on the Mono implementation of the .NET framework. "
Public domain source is available from that site.
-- Barry
--
http://barrkel.blogspot.com/