For Programmers: Free Programming Magazines  


Home > Archive > Compilers > April 2004 > BNF grammar interpretation









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author BNF grammar interpretation
valentin tihomirov

2004-04-03, 10:36 am

[1] I have made an attemt to make an edif (LISP-like file)
parser/translator. I want it to make universal. The file is a tree of
identical elements.
A ::= '(''plus' X Y')'

'plus' determines type of the expression, identifier is a terminal (string
token) and X and Y are imperative (non-optional). I was trying to make a
universal parser traveling through the tree and generating events to
translator. As far as I've seen the imperative sub-expressions in edif
grammar are usually defined like this:
X ::= ( identifier | M | N )
M ::= '(''m' identifier ')'
N ::= '(''n' identifier ')'

As you see, the X selects one expression, whether terminal or M or N. This
conforms to the highr-level A expression that requires only one imperative
sub-expression. I could not find expressions like:
X ::= { identifier | M }

But what is the way to parse it if the above declaration is encountered; the
{} braces mean optional expression in BNF? This looks like a paradox
(conflict) of declarations. X-element becomes optional as opposed to
A-parent element expectations.




[2] The second paradox is similar to the first one.
A ::= ( identifier [X] [Y] )
X ::= (identifier | B)
Y ::= (identifier | C)
Here, expression A has 3 sub-expressions. The first child is simple terminal
identifier, second is optional X element followed by optional Y expression.
If X is optinal, how do I know that identifier returned by lexer at X
position is X element rather than Y?

I beleive that grammar allows more sinthax errors in general. What do you
recommend to read for preventing such stupid questions?

Simon

2004-04-29, 2:22 pm

"valentin tihomirov" <valentin_NOSPAM_NOWORMS@abelectron.com> wrote in message

> A ::= '(''plus' X Y')'
>
> 'plus' determines type of the expression, identifier is a terminal (string
> token) and X and Y are imperative (non-optional). I was trying to make a
> universal parser traveling through the tree and generating events to
> translator. As far as I've seen the imperative sub-expressions in edif
> grammar are usually defined like this:
> X ::= ( identifier | M | N )
> M ::= '(''m' identifier ')'
> N ::= '(''n' identifier ')'
>
> As you see, the X selects one expression, whether terminal or M or N. This
> conforms to the highr-level A expression that requires only one imperative
> sub-expression. I could not find expressions like:
> X ::= { identifier | M }
>
> But what is the way to parse it if the above declaration is encountered; the
> {} braces mean optional expression in BNF? This looks like a paradox
> (conflict) of declarations. X-element becomes optional as opposed to
> A-parent element expectations.
>


The curly braces {S} means "0 or more occurences of S".
Optional would be [S] - more exactly "0 or 1 occurence of S".

Also note that this is EBNF (Extended), though you can rewrite a BNF
grammar to give the same meaning, it's just shortcuts.

If you want to rewrite
X ::= { identifier | M }
so that there must be at least one occurence, try
X ::= identifier | M { identifier | M }

or in the simpler, BNF form, write
X ::= identifier | M |( identifier | M ) X
This left-recursion might give you problems later on, if so you can
rewrite further or augment your grammar with extra rules.

>
>
> [2] The second paradox is similar to the first one.
> A ::= ( identifier [X] [Y] )
> X ::= (identifier | B)
> Y ::= (identifier | C)
> Here, expression A has 3 sub-expressions. The first child is simple terminal
> identifier, second is optional X element followed by optional Y expression.
> If X is optinal, how do I know that identifier returned by lexer at X
> position is X element rather than Y?


Yes that is possible... its equivalent to having separate rules for A
saying:
A ::= (identifier | identifier X | identifier Y | identifier X
Y)

how this is handled depends on the parser... you can tell because the
parser would eventually reach a B or a C. The tokens should have
something to identify them, though, so your rule shouldnt be a problem

hope this helps,

Simon
Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com