Code Comments

Programming Forum and web based access to our favorite programming groups.
For Programmers: Free Programming Magazines | New: Database administration forum
Registration is free! Edit your profileCalendarFind other membersFrequently Asked QuestionsSearch -> 
Post New Thread











Thread
Author

Earley parser
On 2005.05.01.21.43, arthur wrote:
| I have implemented a Earley's recognizer, now I am looking for any
| references to build a parse tree from it.  I looked at the Aho/Ulman
| algorithm and the Grune/Jacob description.  Does anybody have any
| others algorithms or references?

Hi Arthur,

Not so long ago I've introduced new algorithm, which was named as
Earley's Algorithm Adapted to build parse trees. The main problem with
Earley's algorithm is it does not correct build parse trees if the
input grammar is ambigous. That's the problem and answer why Earley
didn't provide parsing algorithm, only recognizing one. The approach
to build parse tree is following: we are extending Eraley's item,
which consists of three components [A-->alpha*beta, i, omega] where
A-->alpha*beta - input grammar's rule, i - number of Earley's state
(list) where this item was burn and omega - lookahead string. As it
was expained long before (in 70th) lookahead string is not needed here
because it only improves Predictor operation and only in few cases.
So, drop this and let Earley's item will consists of thow components
[A-->alpha*beta, i]. Now add two extended components to Earley's item:

1. The reference to item, which was the reason the current item was
added to this Earley's state. This is the item, which was the
parameter of Completer operation (I'll explain this later in details).
It is named as low-reference.

2. The reference to item within the same components but where the
point in the rule A-->alpha*beta is one simbol left (if we have item
[A-->ab*c,2], this is item [A-->a*bc,2]). It is clear the reference is
bull if the point is in the beginning of the rule. It is named as
left-reference.

So, here is the example. The grammar is
E-->E+n
E-->n
E - nonterminal, n, + - terminals.
The input string is "n".

The parsing is:
List 0:
1.[E-->*E+n,0,null,null]
2.[E-->*n,0,null,null]

List 1:
1.[E-->n*,0,null,0.2] (0 is the list's number, 2 is the nu,ber of
item).

So, we have parse tree:
E
|
n

But, such the references keeping may be incorrect if the grammar have
the rules with empty right part. I've intended keep the list of
low-references instead the single one. And, if there will be ambigous
parsing the list of low-references will reflect this in correct way.
So, the item now is [A-->alpha*beta,i,<low-ref>,left-ref]. Every time
Completer operation runs on item I, if it adds a new item I1 the
low-ref to I1 is added to the list of low-refs in item I.

This algorithm was a part of my PhD. It is in russian. If you know
this I can send you the text within the algorithm description and
correctness prove as well.

Vladimir.

Report this thread to moderator Post Follow-up to this message
Old Post
Vladimir
05-03-05 09:01 PM


Sponsored Links




Last Thread Next Thread Next
Search this forum -> 
Post New Thread

Compilers archive

Show a Printable Version Send to friend Email This Page to Someone! subscribe to this thread Receive updates to this thread
Computer Consultants
Programming Jobs
Visual Basic Controls
SQL Server Programming
Webservices
Java Security
Visual Studio
C# Programming
Visual J++
Software engineering
Open source Software
Perl Programming
PHP Programming
ASP Programming
ASP .NET Programming
Visual Basic Programming
Windows Scripting Host
Java Programming
Java Help
Java Beans
VBScript
Cobol
MAC Applications
Unix Programming
Forum Jump:
All times are GMT. The time now is 09:40 AM.

 
Free MCSE Braindumps | Real Estate Topics

Programming forum archive

Copyrights CodeComments.com 2004 - 2006

Powered by vBulletin Copyright 2000-2006 Jelsoft Enterprises Limited.