Home > Archive > Compilers > June 2007 > parsing ISO C++(1998/2003)
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
parsing ISO C++(1998/2003)
|
|
| parthaspanda2005@yahoo.com 2007-04-14, 10:11 pm |
| Hi,
No matter how many shots at it, I havent been
able to get bison/yacc to parse the C++ grammar
specified in the reference manual.
Factoring and other smart tricks leave the grammar
as much different from the original grammar
and no matter what, shift/reduce and reduce/reduce
conflicts run galore.
Is there any trick that I may be missing?
How do I get the grammar in the manual
not to change much as well as pass under
bison/yacc?
Sincerely,
Partha Sarathi Panda
[C++ is context sensitive. You can't parse it using a yacc parser
unless you do some serious external hackery to look ahead and feed
hints back to the parser. This has been discussed at length in the
past, so look at the archives. -John]
| |
| Aaron Gray 2007-04-19, 4:17 am |
| <parthaspanda2005@yahoo.com> wrote in message
> No matter how many shots at it, I havent been
> able to get bison/yacc to parse the C++ grammar
> specified in the reference manual.
>
> Factoring and other smart tricks leave the grammar
> as much different from the original grammar
> and no matter what, shift/reduce and reduce/reduce
> conflicts run galore.
Its needs/is an ambigous grammar.
> Is there any trick that I may be missing?
GLR (Bison has GLR) was ment to be the answer, although it is slow compared
to other techniques.
Elkhound and Elsa tried GLR approach and failed.
All the major C++ parsers are hand coded, recursive decent and precedance
based binary ops.
> How do I get the grammar in the manual
> not to change much as well as pass under
> bison/yacc?
Thats probably impossible.
Aaron
| |
| parthaspanda2005@yahoo.com 2007-04-19, 4:17 am |
| On Apr 18, 10:46 pm, "Aaron Gray" <ang.use...@gmail.com> wrote:
> <parthaspanda2...@yahoo.com> wrote in message
>
> GLR (Bison has GLR) was ment to be the answer, although it is slow compared
> to other techniques.
>
> Elkhound and Elsa tried GLR approach and failed.
Did Bison(GLR) succeed? I mean has anyone reported it as having
worked? As a side note, I notice Bison(2.3) as not being able to dump
statistics of GLR specifics(merged RHS for e.g.).
Sincerely,
Partha Sarathi Panda
| |
| Aaron Gray 2007-04-23, 8:08 am |
| <parthaspanda2005@yahoo.com> wrote in message
news:07-04-069@comp.compilers...
> On Apr 18, 10:46 pm, "Aaron Gray" <ang.use...@gmail.com> wrote:
>
> Did Bison(GLR) succeed? I mean has anyone reported it as having
> worked?
I am not sure about Bison. I believe that it works though, have not heard
anthing to contradict that belief.
> As a side note, I notice Bison(2.3) as not being able to dump
> statistics of GLR specifics(merged RHS for e.g.).
Dumping stats is not really necessary.
What would be a good tool is something that takes an ambiguous grammar an
generates a GLR grammar. I am not sure whether this problem is decidable or
not though.
Aaron
| |
| Hans Aberg 2007-04-23, 8:08 am |
| parthaspanda2005@yahoo.com wrote:
> Did Bison(GLR) succeed? I mean has anyone reported it as having
> worked?
I haven't used it myself: but sure.
Hans Aberg
| |
| Aaron Gray 2007-04-28, 8:06 am |
| "Ira Baxter" <idbaxter@semdesigns.com> wrote in message
> "Aaron Gray" <ang.usenet@gmail.com> wrote in message
>
> Failed? From what I hear, Elsa can parse virtually all of ANSI C++.
> What's the failure, exactly?
Okay from the Elsa web page it looks a lot better now than it did a
six months ago, I'll say that
I would be interested in how well it would parse Boost "libraries".
>
> Guess our DMS Software Reengineering Toolkit isn't a major C++ parser.
> Nonetheless, it is NOT hand code. It uses
> GLR to parse variety of C++ dialects (ANSI, GNU,
> MS including 2005), with grammar rules taken pretty directly from
> from ANSI standard (and adjusted according to dialects, etc.).
> See http://www.semanticdesigns.com/Prod...pFrontEnds.html
This is indeed encouraging.
>
> Use GLR. There are several proof by examples.
I am aware of Elsa, but no other open C++ GLR parsers.
> So this is very practical.
>
> However, you'll find that there is an enormous amount of work to build
> a working C++ front end above and beyond parsing.
>
> If you think of parsing C++ as climbing foothills, getting the rest of
> the front end right (preprocessor, name and type resolution, managing
> the real dialects used, linking) is like climbing the Himalayas.
> People don't seem to understand this very well. Maybe it because they
> never get past parsing.
Yes I am well aware from looking at GCC source code.
The only other issue brought up was about parsing and disambiguation time of
GLR as compared to combined recursive decent and operator preceedance
parsing.
Aaron
| |
| Aaron Gray 2007-04-28, 8:06 am |
| "Torben "Ęgidius" Mogensen" <torbenm@app-7.diku.dk> wrote in message
> "Aaron Gray" <ang.usenet@gmail.com> writes:
>
>
>
> GLR parsers are able to parse ambiguous grammars, so this is trivially
> true.
The programmer has to program the points at which the GLR actually branches.
> What you need, however, is a way to choose which one of the
> many parse trees that the GLR parser gives you that you want, and I
> can see no way of getting this automatically -- unless you can
> formally specify disambiguation rules (such as operator precedence
> rules). Operator precendence rules are insufficient to make the C++
> grammar unambiguous, though.
It's a case of transforming (or pruning) the AST. This may be done by
transformation rules but probably better done by bespoke code for
efficiency reasons.
Aaron
| |
| Ira Baxter 2007-04-29, 7:07 pm |
| "Torben Ęgidius Mogensen" <torbenm@app-7.diku.dk> wrote in message
> "Aaron Gray" <ang.usenet@gmail.com> writes:
>
> GLR parsers are able to parse ambiguous grammars, so this is trivially
> true. What you need, however, is a way to choose which one of the
> many parse trees that the GLR parser gives you that you want, and I
> can see no way of getting this automatically -- unless you can
> formally specify disambiguation rules (such as operator precedence
> rules). Operator precendence rules are insufficient to make the C++
> grammar unambiguous, though.
In DMS, we solve this problem by building attribute grammars that
evaluate the (ambiguous) parse tree to do name and type resolution,
including operator precedence evaluation; this builds a scoped symbol
table as a side effect. A special operator in our attribute grammar
declares the "current" computation to be in error; this is invoked
when there is some inconsistency in type declarations and actual
usage. Our attribute grammar evaluator engine deletes any subtree in
which such an error occurs. The remaining trees are not inconsistent,
and in fact if you encode all the C++ type resolution rules, the
resulting tree doesn't have any ambiguities.
So the extraction of the "right" parse trees occurs automatically in
one sense. Somebody still has to encode the C++ rules to detect the
wrong cases. That's quite a lot of effort, that we've already done.
--
Ira Baxter, CTO
www.semanticdesigns.com
| |
| Ira Baxter 2007-04-29, 7:07 pm |
| "Aaron Gray" <ang.usenet@gmail.com> wrote
> "Ira Baxter" <idbaxter@semdesigns.com> wrote in message
>
[color=darkred]
> I am aware of Elsa, but no other open C++ GLR parsers.
Dunno what you mean by "open" If you mean "free source",
I agree, there aren't any others I know about. If you
mean "available source code", ours is commercially available.
> The only other issue brought up was about parsing and disambiguation
> time of GLR as compared to combined recursive decent and operator
> preceedance parsing.
It is true that our GLR parsers are not as fast as hand-coded
recursive descent parsers.
That's because we aren't trying to compete with compiler front ends.
On the other hand, typical compiler front ends don't capture comments
and formats of literals, can't transform the resulting trees as C++
trees, and can't reproduce C++ source code (I think EDG actually can
do this) that is still compilable. We are optimizing for different
goals. Speed wasn't at the top of our list.
--
Ira Baxter, CTO
www.semanticdesigns.com
| |
| Ira Baxter 2007-04-29, 7:07 pm |
| "Aaron Gray" <ang.usenet@gmail.com> wrote
> "Torben "Ęgidius" Mogensen" <torbenm@app-7.diku.dk> wrote in message
>
> The programmer has to program the points at which the GLR actually
> branches.
How so? All we write are simple context free grammar rules. These
rules may be collectively ambiguous. But other than that, we don't
write anything else. The GLR engine determines the branches by
looking at conflicts in the parsing tables. They are pretty easy to
detect.
--
Ira Baxter, CTO
www.semanticdesigns.com
| |
|
|
|
|
|
|
|
|
|