Code Comments

Programming Forum and web based access to our favorite programming groups.
For Programmers: Free Programming Magazines
Registration is free! Edit your profileCalendarFind other membersFrequently Asked QuestionsSearch -> 
Post New Thread











Thread
Author

Earley Parser
Hi all,

I search for a parser with the following features

- a true parser (not just a recognizer)
- processes ALL context free grammars (including epsilon productions,
chain productions, and cyclic productions including those)
- processes ambiguous grammars
- returns ALL parse trees (or a DAG)
- return the parse trees in terms of the original grammar (not a derived
one to cope with the epsilon stuff)
- a really usable implementation (rather an API than an applet
demonstrating the basic technique)

I have been looking around for Earley parsers but did not find an
appropriate one. I read across

Jay Earley, An Efficient Context-Free Parsing Algorithm, 1970
Laszlo Szathmary, Jerly: a Java implementation of Earley's algorithm
Peter Schroeder-Heister, Formale Sprachen und Berechenbarkeit
Grune/Jacobs, Parsing Techniques - A Practical Guide
Aycock/Horspool, Practical Earley Parsing, 2002
James Power, Notes on Formal Language Theory and Parsing

But I (as a parsing noob) could not understand enough to reimplement
the parsing step (unger's method, after recognition). Earley's
proposal is sait to be incorrect in certain situations (Tomita,
1986). Others do not seem to regard (be able to cope with) epsilon
productions or ambique grammars. For Aycock/Horspool's approach i lack
to much background knowledge, I fear. Can anybody help?

Thanks, Tom.

--
Dipl.-Inform. Tom Gelhausen
Institut für Programmstrukturen und Datenorganisation (IPD)
Universität Karlsruhe - Deutschland (Germany)

Report this thread to moderator Post Follow-up to this message
Old Post
Tom Gelhausen
10-02-05 08:57 AM


Re: Earley Parser
Tom Gelhausen wrote:
> Hi all,
>
> I search for a parser with the following features
>
> - a true parser (not just a recognizer)
> - processes ALL context free grammars (including epsilon productions,
> chain productions, and cyclic productions including those)
> - processes ambiguous grammars
> - returns ALL parse trees (or a DAG)
> - return the parse trees in terms of the original grammar (not a derived
> one to cope with the epsilon stuff)
> - a really usable implementation (rather an API than an applet
> demonstrating the basic technique)

The following implementation
http://cocom.sourceforge.net/ammunition-13.html satisfies all your
requirements.  Plus:

o it consumes few resources (e.g. parsing 30K lines of C program per
second on 500Mhz P3 machine and consuming 5MB memory to make translation
of 10K lines of C).

o it makes minimal cost error recovery.

o it produces any simple syntax directed translation.

o it can produce translation with the minimal cost when there are more
one of them for ambiguous grammar and we ask only one translation.

o compact representation (using DAG) of all translations when there are
many of them.

The implementation algorithm is based on Earley's idea but it is
very far away from original Earley's algorithm.

The package (with C and C++ interface) is a part of COCOM toolset
http://cocom.sf.net and distributed under GPL (sorry not LGPL, I spent a
few years to implement the parser and don't want to make it free for
proprietary software).

Report this thread to moderator Post Follow-up to this message
Old Post
Vladimir N. Makarov
10-03-05 08:57 AM


Re: Earley Parser
Tom Gelhausen wrote:
> Hi all,
>
> I search for a parser with the following features
>
> - a true parser (not just a recognizer)
> - processes ALL context free grammars (including epsilon productions,
> chain productions, and cyclic productions including those)
> - processes ambiguous grammars
> - returns ALL parse trees (or a DAG)
> - return the parse trees in terms of the original grammar (not a derived
> one to cope with the epsilon stuff)
> - a really usable implementation (rather an API than an applet
> demonstrating the basic technique)

You can check Kakuy! at

www.ucse.edu.ar/fma/sepa

Kakuy! it's an algorithm animation tool designed to teach Earley's and YCK
parsing techniques. Kakuy! can process all kind of CFG (ambiguous
grammars, epsilon productions, cyclic productions, chain productions) and
generate ALL parse trees (well this is not 100% true in the case of
infinite ambiguous grammars! In this case you can select the max amount of
trees). You can find a full description of  Kakuy! in the post

http://compilers.iecc.com/comparch/article/04-12-074

Probably Kakuy! do not fit your needs (too visual/interactive!). We also
developed a non-intereactive/visual applicantion (ie a command line app)
named Urutaú!. This app implement Earley's algorithm and has the same
capabilities of Kakuy!
The use of Urutaú! is very simple, you must give it the GFG (in Yacc,
BNF,EBNF or plain format) and the input string and it will return (all)
the parse trees coded in a plain file.
To obtain Urutaú! just mail us to sepa@ucse.edu.ar. Also, if you need some
functionality not yet supported by Urutaú! we will be glad to develop it.

Ah! Kakuy! and Urutaú! are free software.

Best regards

Salvador Valerio Cavadini
Project SEPa!
Facultad de Matemática Aplicada
Universidad Católica de Santiago del Estero (Argentina)
web: http://www.ucse.edu.ar/fma/staff/svcavadini

Report this thread to moderator Post Follow-up to this message
Old Post
scavadini@ucse.edu.ar
10-07-05 12:00 AM


Sponsored Links




Last Thread Next Thread Next
Search this forum -> 
Post New Thread

Compilers archive

Show a Printable Version Send to friend Email This Page to Someone! subscribe to this thread Receive updates to this thread
Computer Consultants
Programming Jobs
Visual Basic Controls
SQL Server Programming
Webservices
Java Security
Visual Studio
C# Programming
Visual J++
Software engineering
Open source Software
Perl Programming
PHP Programming
ASP Programming
ASP .NET Programming
Visual Basic Programming
Windows Scripting Host
Java Programming
Java Help
Java Beans
VBScript
Cobol
MAC Applications
Unix Programming
Forum Jump:
All times are GMT. The time now is 04:26 AM.

 

Programming forum archive

Copyrights CodeComments.com 2004 - 2006

Powered by vBulletin Copyright 2000-2006 Jelsoft Enterprises Limited.