Code Comments

Programming Forum and web based access to our favorite programming groups.
For Programmers: Free Programming Magazines | New: Database administration forum
Registration is free! Edit your profileCalendarFind other membersFrequently Asked QuestionsSearch -> 
Post New Thread











Thread
Author

ll(1) and variable assignment...
hi,

I want to write a recursive descent parser, and wonder if there is a
LL(1) grammar for "c-style" variable assignement
(i.e. "id=id=id+id" but not "id=id=id+id=2").

The following grammar is LL(2) I think:

A := ID = A
A := E

E := T E'
E':= + T E'
E':=

T := ID
T := ( A )


I tried to convert it to LL(1), then I got something like that:

A := ID A'
A := ( A )
A':= = A
A':= + T E'

E := T E'
E':= + T E'
E':=

T := ID
T := ( A )

But then I lose the context. I don't no if the "id" in rule one is a
left value or not.
Does an LL(1) grammar exist for this problem? Or am I absolutely wrong
with something?

bye
Chris


Report this thread to moderator Post Follow-up to this message
Old Post
Chris
11-06-04 08:57 PM


Re: ll(1) and variable assignment...
weltraum@astrocat.de (Chris) schreibt:

>I want to write a recursive descent parser, and wonder if there is a
>LL(1) grammar for "c-style" variable assignement
>(i.e. "id=id=id+id" but not "id=id=id+id=2").

IMO the problem is the required lookahead in the very last production,
that can be a rvalue in the last place, or must be an lvalue if
followed by another "=".

In a typical solution the difference between rvalue and lvalue will be
handled on semantical level, not on syntactical level in the
grammar. Consider how fruitless a syntactical distinction between
rvalue and lvalue expressions is in practice, when a parser has to try
both ways anyhow. Parsing for an rvalue should always succeed, but
must result in a mismatch when a "=" follows.  Likewise a parse for an
lvalue, in the first guess, will fail for most right hand sides of
such an assignment expression. Parsing for both rvalues and lvalues at
the same time is impossible in LL(1).

An LL(1) grammar for this case IMO is possible only with dramatic
restrictions, as that the lvalue grammar/language must be distinct
from the rvalue language, distinguishable by the very first token of
every expression.

DoDi

Report this thread to moderator Post Follow-up to this message
Old Post
VBDis
11-07-04 08:57 PM


Re: ll(1) and variable assignment...
weltraum@astrocat.de (Chris) wrote:
# hi,
#
# I want to write a recursive descent parser, and wonder if there is a
# LL(1) grammar for "c-style" variable assignement
# (i.e. "id=id=id+id" but not "id=id=id+id=2").
#
# The following grammar is LL(2) I think:
#
# A := ID = A
# A := E
#
# E := T E'
# E':= + T E'
# E':=
#
# T := ID
# T := ( A )

<A> ::= ID = <a-expression>
FIRST:	{ID}
<a-expression> ::= ID <a-suffix>
FIRST:	{ID}
| (<expression> ) <suffix>
FIRST: {"("}
<a-suffix> :: = <a-expression>
FIRST:	{"="}
| <suffix>
FIRST:	{"+"}
<expression> ::= ID <a-suffix>
FIRST:	{ID}
| (<expression> ) <suffix>
FIRST:	{"("}
<suffix> :: = + <expression>
FIRST:	{"+"}
| empty
FIRST: FOLLOWER(A)

The parse tree is going to be a bear to untangle, though.

The only problem is that in C, the thingie before the '=', the l-value, has 
unbounded
size and can be nested deeply in an embedding-(-) production; potentially yo
u won't
know it's not a l-value until deep inside the parse and shifting any number 
of tokens.
That means it won't be LL(k) for any k. But if the l-value can be described 
as something
akin to a regular expression, then it is still LL(k).

--
SM Ryan http://www.rawbw.com/~wyrmwif/
Title does not dictate behaviour.


Report this thread to moderator Post Follow-up to this message
Old Post
SM Ryan
11-07-04 08:57 PM


Re: ll(1) and variable assignment...
Chris asks:
> I want to write a recursive descent parser, and wonder if there is a
> LL(1) grammar for "c-style" variable assignement
> (i.e. "id=id=id+id" but not "id=id=id+id=2").
...
> But then I lose the context. I don't no if the "id" in rule one is a
> left value or not.  Does an LL(1) grammar exist for this problem? Or
> am I absolutely wrong with something?

A little analysis will answer your question.

1) The RHS of an assignment can be a single ID, via A->E->T->ID
2) It can also be an assignment, A->ID = A

Therefore, to distinguish those two cases, you need at least two
tokens ID = (vs ID + or ID EOF).  So, when you factor the ID out of
the common prefix for those possibilities, you lose the distinction of
how ID is used.  You need the two tokens to see ID in context.

Thus, your second grammar captures the fact that ID cannot be used in
the non-desired way.  That is, your second grammar does not allow
ID+ID=ID.  However, it does not build the types of AST trees that
allow you to conveniently process the input.

The problem gets worse if you try to allow an even richer C subset,
e.g. *ID = 2 or ID[I+1] = ID[I].  In general, one needs more unbounded
look-ahead (i.e. to the end of the expression) to distinguish all the
cases.  This means one needs an LR rather than LL grammar.

This is an example of a typical class of problems (typed expressions)
where grammars are not a good solution.  I know of two other problems
that are similar in nature. The first is distinguishing boolean
vs. integer expressions when booleans can "convert" to integers.
The second is dealing with NULL rows as special in SQL expressions.

In both cases, there are grammatic solutions to the problem.  That is,
there are grammars that allow only "legal" programs to be accepted and
enforce the desired "type" rules.  However, in both cases, the grammar
solution is horrendous (in that large portions of the grammar are
duplicated with slight permutations, resulting in a maintenance
nightmare).  It might be possible to do a clean grammatic solution to
"type" problems with 2-level aka VW grammars.  However, your typical
CFG (LL/LR grammar) is not likely to cut it.

A better solution to these kind of problems is a 2nd pass (or phase).
Build the parse tree with a "natural" grammar that doesn't try to
enforce the "type" rules and allows syntactically correct but
semantically wrong inputs.  Then, apply the semantics to the
tree.  Look for left-subtrees of assignments that are disallowed
expressions.  Propragate the boolean/integer null/non-null-ness up the
tree to places where it needs checking.  Attribute grammars do this
relatively well, and one can hand-implement most attribute traversals
by hand quite easily if one doesn't have an attribute evaluator
generator.

Hope this helps,
-Chris

 ****************************************
************************************
*
Chris Clark                    Internet   :  compres@world.std.com
Compiler Resources, Inc.       Web Site   :  http://world.std.com/~compres
23 Bailey Rd                   voice      :  (508) 435-5016
Berlin, MA  01503  USA         fax        :  (978) 838-0263  (24 hours)
----------------------------------------------------------------------------
--


Report this thread to moderator Post Follow-up to this message
Old Post
Chris F Clark
11-15-04 08:57 AM


Sponsored Links




Last Thread Next Thread Next
Search this forum -> 
Post New Thread

Compilers archive

Show a Printable Version Send to friend Email This Page to Someone! subscribe to this thread Receive updates to this thread
Computer Consultants
Programming Jobs
Visual Basic Controls
SQL Server Programming
Webservices
Java Security
Visual Studio
C# Programming
Visual J++
Software engineering
Open source Software
Perl Programming
PHP Programming
ASP Programming
ASP .NET Programming
Visual Basic Programming
Windows Scripting Host
Java Programming
Java Help
Java Beans
VBScript
Cobol
MAC Applications
Unix Programming
Forum Jump:
All times are GMT. The time now is 05:34 AM.

 
Free MCSE Braindumps | Real Estate Topics

Programming forum archive

Copyrights CodeComments.com 2004 - 2006

Powered by vBulletin Copyright 2000-2006 Jelsoft Enterprises Limited.