Writing a parser using AWK
| Cesar A. K. Grossmann 2004-07-05, 8:55 pm |
| Hi
Is it possible to write a parser using AWK, the same way you write a
parser for C, using lex/yacc (flex/bison, if you use GNU software)?
I wrote a small parser using C this way, and I was wondering if it is
possible to do the same using an AWK script. I googled the internet, but
the most common references to parsers and awk I get were about the
process of compiling gawk (which uses flex/bison to generate part of the
code), so I'm turning to USENET.
The file I have to parse has only two kinds of records:
IDENTIFIER = NUMBER STRING STRING;
IDENTIFIER = STRING { NUMBER IDENTIFIER [, NUMBER IDENTIFIER ] };
IDENTIFIER is a sequence of letters, digits and '_', starting with a
letter, STRING is a sequence of chars between "", and NUMBER is a
number, with or without a decimal point. That's it. The records can be
spread over several lines, or sit on one line by themselves, e.g.:
IDENTIFIER =
NUMBER
STRING
STRING
;
IDENTIFIER =
STRING {
NUMBER IDENTIFIER,
NUMBER IDENTIFIER,
...
}
;
The parser I wrote in C works well. I just have a "programmer's itch"...
If someone knows about a page that deals with such things, please, tell
me about that.
TIA
P.S.: Sorry for the "engrish".
--
..O. Cesar A. K. Grossmann ICQ UIN: 35659423
...O http://www.LinuxByGrossmann.cjb.net/
OOO Quidquid Latine dictum sit, altum viditur
| |
| Ulrich M. Schwarz 2004-07-06, 3:55 am |
| "Cesar A. K. Grossmann" <cakgguard-usenet2004@yahoo.com.br> writes:
> The file I have to parse has only two kinds of records:
>
> IDENTIFIER = NUMBER STRING STRING;
>
> IDENTIFIER = STRING { NUMBER IDENTIFIER [, NUMBER IDENTIFIER ] };
You'll run into problems if your strings may contain ";". Otherwise,
I'd set RS to ";" and go from there.
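A minimal sketch of the RS=";" idea, assuming (as warned above) that no string contains ";". The function name split_records and the sample input are made up for illustration. Note that the $1 = $1 trick also squeezes runs of blanks inside quoted strings, so it is only good for a first look at the data.

```sh
# Treat every ";"-terminated statement as one awk record.
# Assumption: ";" never occurs inside a quoted string.
split_records() {
    awk 'BEGIN { RS = ";" }
         NF {                  # skip the whitespace-only record after the last ";"
             $1 = $1           # rebuild $0: newlines and blank runs become single spaces
             print NR ": " $0
         }'
}

# Hypothetical sample input, one record spread over several lines.
printf 'alpha =\n  1.5\n  "x"\n  "y"\n;\nbeta = "s" { 1 a, 2 b };\n' | split_records
```

With default FS, newlines inside a record count as field separators, so a record spread over several lines splits into the same fields as a one-line record.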
If you need a more complex RS: gawk lets RS be a regular expression and
automatically sets the variable RT to the text that actually matched RS.
(In my case, fields were marked by starting in column 0. Urgh!)
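A sketch of what that looks like, gawk only. The regex "[{};]" and the function name show_rt are assumptions picked for this file format, not anything from my own case:

```sh
# gawk-only: RS is a regex, and RT tells which delimiter ended each record.
show_rt() {
    gawk 'BEGIN { RS = "[{};]" }
          RT != "" {                          # the final chunk has no terminator; skip it
              gsub(/[[:space:]]+/, " ")       # flatten newlines for display
              gsub(/^ | $/, "")               # trim the ends
              printf "text=[%s] ended-by=[%s]\n", $0, RT
          }'
}

# Hypothetical sample input; only runs where gawk is installed.
if command -v gawk >/dev/null 2>&1; then
    printf 'beta = "s" {\n  1 a,\n  2 b\n}\n;\n' | show_rt
else
    echo "gawk not found; RT is a gawk extension" >&2
fi
```

Knowing whether a chunk ended at "{", "}" or ";" is enough to tell a record header from a list body without a full tokenizer, still assuming none of those characters occur inside strings.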
Ulrich
--
"P=NP. At least for N=1."
-- Florian Diedrich
| |
| Cesar Rabak 2004-07-06, 3:56 pm |
| Cesar A. K. Grossmann wrote:
> Hi
>
> Is it possible to write a parser using AWK, the same way you write a
> parser for C, using lex/yacc (flex/bison, if you use GNU software)?
Doing it by hand (w/o flex/bison) should be possible. IIRC, Henry
Spencer's "Amazing awk Assembler"
(http://www.math.utah.edu/docs/info/gawk_22.html) does it that way.
--
Cesar Rabak
| |
| Stephan Titard 2004-07-13, 3:55 am |
| Hi
Something lighter but quite mind-opening is the preprocessor (written in
awk) that Axel-Tobias Schreiner uses in his "Object-oriented programming
in C".
Google for ooc-94.2.11.tar.gz.
Enjoy!
HTH
| |
| Stephan Titard 2004-07-13, 8:55 am |
| I forgot to mention that there is no tool like lex/yacc for awk:
*you* have to write the parser by hand. But awk is quite capable of
expressing top-down parsers.
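To make that concrete, here is a rough sketch of a hand-written top-down (recursive-descent) parser in plain awk for the two record types from the original post. The function names (advance, expect, statement, item, die), the output format and the exact token regexes are all made up for illustration. It assumes strings contain no embedded double quotes and that the bracketed ", NUMBER IDENTIFIER" part may repeat.

```sh
parse_records() {
    awk '
    { src = src $0 "\n" }                      # slurp the whole input into one string

    END {
        advance()
        while (tok_type != "EOF") statement()
    }

    # Lexer: drop leading whitespace, then classify and consume one token.
    function advance() {
        sub(/^[ \t\n]+/, "", src)
        if (src == "") { tok_type = "EOF"; tok = ""; return }
        if      (match(src, /^[A-Za-z][A-Za-z0-9_]*/)) tok_type = "IDENT"
        else if (match(src, /^[0-9]+(\.[0-9]+)?/))     tok_type = "NUMBER"
        else if (match(src, /^"[^"]*"/))               tok_type = "STRING"
        else if (match(src, /^[={},;]/))               tok_type = substr(src, 1, 1)
        else die("unexpected character " substr(src, 1, 1))
        tok = substr(src, 1, RLENGTH)
        src = substr(src, RLENGTH + 1)
    }

    # Require a token of the given type, consume it, and return its text.
    function expect(type,   text) {
        if (tok_type != type) die("expected " type ", got " tok_type " (" tok ")")
        text = tok
        advance()
        return text
    }

    # statement := IDENT "=" ( NUMBER STRING STRING
    #                        | STRING "{" item { "," item } "}" ) ";"
    function statement(   name, num, s1, s2) {
        name = expect("IDENT")
        expect("=")
        if (tok_type == "NUMBER") {
            num = expect("NUMBER"); s1 = expect("STRING"); s2 = expect("STRING")
            printf "simple %s number=%s strings=%s,%s\n", name, num, s1, s2
        } else {
            s1 = expect("STRING")
            expect("{")
            printf "list %s label=%s\n", name, s1
            item()
            while (tok_type == ",") { advance(); item() }
            expect("}")
        }
        expect(";")
    }

    # item := NUMBER IDENT
    function item(   num, id) {
        num = expect("NUMBER"); id = expect("IDENT")
        printf "  item %s %s\n", num, id
    }

    function die(msg) {
        print "parse error: " msg > "/dev/stderr"
        exit 1
    }
    '
}

# Hypothetical sample input covering both record types.
printf 'alpha =\n  1.5\n  "x"\n  "y"\n;\nbeta = "s" {\n  1 a,\n  2 b\n}\n;\n' | parse_records
```

Each grammar rule becomes one awk function, and the single-token lookahead in tok_type is enough to pick between the two record forms. A real version would build arrays instead of printing.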
Another good starting point is m1, written by John Bentley.
HTH
stephan
| |
| Stepan Kasal 2004-07-13, 8:55 am |
| Hello,
In article <40F39019.27C79A90@rest.is.fake>, Stephan Titard wrote:
> Object-oriented programming in C.
> google with ooc-94.2.11.tar.gz
Google for ooc-02.01.04.tar.gz to get an updated re-release.
Stepan Kasal
| |
| Stepan Kasal 2004-07-13, 8:55 am |
| Hello,
> another good starting point is m1 written by John Bentley
As a convenience to others: the m1 macro processor was written by
*Jon* Bentley (slightly different spelling) and was published in the last
chapter of "sed & awk" by Dale Dougherty and Arnold Robbins (O'Reilly).
When you go to http://oreilly.com/catalog/sed2 and download the examples,
you get not only m1.awk (the code) but also m1.ps (an article about m1).
Happy reading,
Stepan Kasal