Code Comments
Programming Forum and web based access to our favorite programming groups.On 2005.05.01.21.43, arthur wrote: | I have implemented a Earley's recognizer, now I am looking for any | references to build a parse tree from it. I looked at the Aho/Ulman | algorithm and the Grune/Jacob description. Does anybody have any | others algorithms or references? Hi Arthur, Not so long ago I've introduced new algorithm, which was named as Earley's Algorithm Adapted to build parse trees. The main problem with Earley's algorithm is it does not correct build parse trees if the input grammar is ambigous. That's the problem and answer why Earley didn't provide parsing algorithm, only recognizing one. The approach to build parse tree is following: we are extending Eraley's item, which consists of three components [A-->alpha*beta, i, omega] where A-->alpha*beta - input grammar's rule, i - number of Earley's state (list) where this item was burn and omega - lookahead string. As it was expained long before (in 70th) lookahead string is not needed here because it only improves Predictor operation and only in few cases. So, drop this and let Earley's item will consists of thow components [A-->alpha*beta, i]. Now add two extended components to Earley's item: 1. The reference to item, which was the reason the current item was added to this Earley's state. This is the item, which was the parameter of Completer operation (I'll explain this later in details). It is named as low-reference. 2. The reference to item within the same components but where the point in the rule A-->alpha*beta is one simbol left (if we have item [A-->ab*c,2], this is item [A-->a*bc,2]). It is clear the reference is bull if the point is in the beginning of the rule. It is named as left-reference. So, here is the example. The grammar is E-->E+n E-->n E - nonterminal, n, + - terminals. The input string is "n". The parsing is: List 0: 1.[E-->*E+n,0,null,null] 2.[E-->*n,0,null,null] List 1: 1.[E-->n*,0,null,0.2] (0 is the list's number, 2 is the nu,ber of item). So, we have parse tree: E | n But, such the references keeping may be incorrect if the grammar have the rules with empty right part. I've intended keep the list of low-references instead the single one. And, if there will be ambigous parsing the list of low-references will reflect this in correct way. So, the item now is [A-->alpha*beta,i,<low-ref>,left-ref]. Every time Completer operation runs on item I, if it adds a new item I1 the low-ref to I1 is added to the list of low-refs in item I. This algorithm was a part of my PhD. It is in russian. If you know this I can send you the text within the algorithm description and correctness prove as well. Vladimir.
Post Follow-up to this message
Show a Printable Version
Email This Page to Someone!
Receive updates to this thread
Powered by vBulletin
Copyright 2000-2006 Jelsoft Enterprises Limited.