Home > Archive > Compilers > June 2006 > problem with C grammar
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
problem with C grammar
|
|
| A.T.Hofkamp 2006-05-30, 7:11 pm |
| Hello all,
<the.malfunction@googlemail.com> wrote
> As it seems, the lexical scanner is proposed to analyse whether an
> identifier is a type or not. I rather would like to let the parser do
> this job. Is there any way to change the grammar such that I can use
> IDENTIFIER instead of TYPE_NAME here without having all those conflicts?
As Ivan A. Kosarev already stated, you can neither do this in the scanner nor
in the parser, you need semantical information to decide which parse variant
you should pick.
yacc & friends are not designed to handle this situation, although you can
work around this limiation if you feed semantical information back to the
parser and scanner. That does imply however that you will get an intertwined
scanner/parser/type checker.
If you dislike the idea of having the syntactical phase and semantical phase
intertwined, you may want to use a parser that delivers a parse forest (that
is, all parse trees derivable from the grammar for your input) instead of a
single tree. In the (seperate) semantical phase you can then decide which
trees of the forest you should use.
Two approaches that do this are
1. T.A. Wagner and S.L. Graham, Incremental analysis of real programming
languages. In proceedings of 1997 ACM SIGPLAN conference om Programming
language design and implementation, pages 31--43, ACM Press 1997
2. P. Klint and E. Visser, Using filters for the disambiguation of
context-free grammars, In G. Pighizzini and P. San Pietro, editors Proc.
ASMICS Workshop on Parsing Theory, pages 1--20, Milano Italy, 1994, Tech Rep
126-1994, Dipartimento di Scienze dell'Informazione, Universit\`a di Milano.
Of the latter, there is also an implementation, described in
M.G.J. van den Brand, S. Klusener, L. Moonen, and J.J. Vinju. Generalized
Parsing and Term Rewriting: Semantics Driven Disambiguation, in: Proceedings
of Third Workshop on Language Descriptions, Tools and Applications (LDTA'2003)
(eds. B.R. Bryant and J. Saraiva), volume 82 of Electronic Notes in
Theoretical Computer Science, Elsevier, 2003.
Albert
| |
| Ira Baxter 2006-06-03, 7:05 pm |
| "A.T.Hofkamp" <a.t.hofkamp@tue.nl> wrote in message
>
> <the.malfunction@googlemail.com> wrote
>
> If you dislike the idea of having the syntactical phase and semantical phase
> intertwined, you may want to use a parser that delivers a parse forest
(that
> is, all parse trees derivable from the grammar for your input) instead of
a
> single tree. In the (seperate) semantical phase you can then decide which
> trees of the forest you should use.
>
> Two approaches that do this are
>
> 1. T.A. Wagner and S.L. Graham, Incremental analysis of real programming
> languages. In proceedings of 1997 ACM SIGPLAN conference om Programming
> language design and implementation, pages 31--43, ACM Press 1997
>
> 2. P. Klint and E. Visser, Using filters for the disambiguation of
> context-free grammars, In G. Pighizzini and P. San Pietro, editors Proc.
> ASMICS Workshop on Parsing Theory, pages 1--20, Milano Italy, 1994, Tech
Rep
> 126-1994, Dipartimento di Scienze dell'Informazione, Universit\`a di
Milano.
>
> Of the latter, there is also an implementation, described in
>
> M.G.J. van den Brand, S. Klusener, L. Moonen, and J.J. Vinju. Generalized
> Parsing and Term Rewriting: Semantics Driven Disambiguation, in:
Proceedings
> of Third Workshop on Language Descriptions, Tools and Applications
(LDTA'2003)
> (eds. B.R. Bryant and J. Saraiva), volume 82 of Electronic Notes in
> Theoretical Computer Science, Elsevier, 2003.
The DMS Software Reengineering Toolkit implements a variation of the
1. above, using a tightly integrated attribute evaluator. In
particular, it uses a GLR parsing engine to produce a parse DAG with
embedded ambiguities and shared subterms (this is a property of good
GLR parsers). So-called "Semantical" information is really just an
inference of facts from the syntax tree (sometimes even from the
ambiguous part, many langauges have rules to the effect that "if xxx
is ambiguous, then the meaning is yyy").
It is quite effective to encode such inferences as an attribute-value
computation. The DMS attribute evaluators allow the specification of
such computations in a typical functional-program form, but unusually
allows side effects. One special side effect is the signaling of
"semantic errors", which cause branches under and ambiguity node to be
automatically eliminated. Combining name and type resolution with the
signalling of semantcial errors deletes the "wrong subtrees" of
ambiguities.
We use this in most of our langauge front ends, and it
works beautifully for C++.
--
Ira Baxter, CTO
www.semanticdesigns.com
|
|
|
|
|