Walking backwards with a regex
sigzero@gmail.com

2005-06-10, 8:58 pm

I have a log file that always ends with: (some data)

I want to get at the info between the parens. Since it happens at the
end of the file can I walk a regex backwards to get what I want?

Robert

Donal K. Fellows

2005-06-10, 8:58 pm

sigzero@gmail.com wrote:
> I have a log file that always ends with: (some data)
>
> I want to get at the info between the parens. Since it happens at the
> end of the file can I walk a regex backwards to get what I want?


If you can guess a reasonable value always bigger than your data to
extract, you can optimize using something like this:

set f [open theLogFile]
seek $f -1024 end ;# Assume 1kB is enough
set data [read $f]
close $f
regexp {\((match-your-data-here)\)$} $data -> interestingBit

Remember, Tcl's REs usually think \n is a normal character.
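
A possible way to package the seek-and-read step, offered as a sketch rather than as Donal's own code: `readTail` is a hypothetical helper name, and the sample string is made up. It also illustrates the point about \n: a log file normally ends with a newline, and since `$` only matches at the very end of the string (unless `-line` is used), the pattern needs to allow for trailing whitespace.

```tcl
# readTail is a hypothetical helper name, not part of Tcl.
# It returns the last maxBytes bytes of a file, or the whole file if smaller.
proc readTail {path {maxBytes 1024}} {
    set f [open $path r]
    # Seeking to a position before the start of the file can raise an
    # error, so only seek when the file is larger than the window.
    if {[file size $path] > $maxBytes} {
        seek $f [expr {-$maxBytes}] end
    }
    set data [read $f]
    close $f
    return $data
}

# Made-up sample with the trailing newline a log file usually has.
set sample "mopremoremore\nprinter(HP Deskjet 440)\n"

# Without -line, $ matches only at the very end of the string,
# so the final newline makes this attempt fail.
set strict [regexp {\(([^()]*)\)$} $sample -> printer]

# Allowing optional whitespace before the end of the string fixes that.
set relaxed [regexp {\(([^()]*)\)\s*$} $sample -> printer]
```

Trimming the data first with `string trimright` would work just as well as adding `\s*` to the pattern.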

Donal.
sigzero@gmail.com

2005-06-11, 3:58 am

I should have mentioned that the data between the parens is not always
the same, but its position is (always at the end of the data), like:

blahblahblah
datadatadata
mopremoremore
printer(HP Deskjet 440) <<< the printer type is what will always be
different

Neil Madden

2005-06-11, 3:58 pm

sigzero@gmail.com wrote:
> I should have mentioned that the data between the parens is not always
> the same, but its position is (always at the end of the data), like:
>
> blahblahblah
> datadatadata
> mopremoremore
> printer(HP Deskjet 440) <<< the printer type is what will always be
> different
>


Yup - Donal's regexp should deal with that. The $-sign at the end of a
regexp means "anchor to the end of the string". So, you want something like:

regexp {\(([^\)]*)\)$} $input -> stuff_between_parens

Alternatively, you could use {\((.*?)\)$}, but I've been bitten by
unpredictable performance of non-greedy regexps in the past so tend to
avoid them. If your file is particularly large then you might want to
also use Donal's trick of seeking to near the end before trying the
regexp (if it fails, you can then back up a bit and try again).
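
The "back up a bit and try again" idea might look roughly like the sketch below. It is only one way to do it: `trailingParens` is a hypothetical name, the window-doubling strategy is an assumption, and the pattern is a slight variation on the one above (`[^()]*` instead of `[^\)]*`, plus `\s*` to tolerate a trailing newline), which assumes the final parenthesised text contains no nested parens.

```tcl
# trailingParens is a hypothetical helper, not part of Tcl.
# It reads a window at the end of the file and doubles the window
# until the anchored pattern matches or the whole file has been tried.
proc trailingParens {path {chunk 1024}} {
    set size [file size $path]
    set f [open $path r]
    while 1 {
        # Read the last $chunk bytes, or everything once the window
        # covers the whole file.
        if {$chunk >= $size} {
            seek $f 0 start
        } else {
            seek $f [expr {-$chunk}] end
        }
        set data [read $f]
        # $ anchors the match to the end of the string; \s* allows a
        # trailing newline after the closing paren.
        if {[regexp {\(([^()]*)\)\s*$} $data -> inner]} {
            close $f
            return $inner
        }
        if {$chunk >= $size} {
            close $f
            error "no trailing (...) found in $path"
        }
        # The opening paren was not in the window: back up and retry.
        set chunk [expr {$chunk * 2}]
    }
}
```

Called as `trailingParens theLogFile`, this should return the printer name for a log ending in a line like `printer(HP Deskjet 440)`.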

-- Neil
Robert

2005-06-11, 3:58 pm

On 6/11/05 8:30 AM, in article CPAqe.20943$cN2.5729@newsfe4-gui.ntli.net,
"Neil Madden" <nem@cs.nott.ac.uk> wrote:

> sigzero@gmail.com wrote:
>
> Yup - Donal's regexp should deal with that. The $-sign at the end of a
> regexp means "anchor to the end of the string". So, you want something like:
>
> regexp {\(([^\)]*)\)$} $input -> stuff_between_parens
>
> Alternatively, you could use {\((.*?)\)$}, but I've been bitten by
> unpredictable performance of non-greedy regexps in the past so tend to
> avoid them. If your file is particularly large then you might want to
> also use Donal's trick of seeking to near the end before trying the
> regexp (if it fails, you can then back up a bit and try again).
>
> -- Neil


I will try it on Monday. Thank you both for the answer.

Robert

Darren New

2005-06-11, 3:58 pm

sigzero@gmail.com wrote:
> I want to get at the info between the parens. Since it happens at the
> end of the file can I walk a regex backwards to get what I want?


Is there something wrong with
[string last ( $file_data]
to give you the index?
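
For what it's worth, that approach could be fleshed out along these lines. This is a sketch under assumptions: `lastParenText` is a made-up helper name, and it assumes the closing paren to use is the first one after the last opening paren.

```tcl
# lastParenText is a hypothetical helper, not part of Tcl.
# It returns the text inside the last "(...)" in data, or an empty
# string if there is no such pair.
proc lastParenText {data} {
    # Index of the last opening paren, or -1 if there is none.
    set start [string last "(" $data]
    if {$start < 0} {
        return ""
    }
    # First closing paren at or after that position.
    set stop [string first ")" $data $start]
    if {$stop < 0} {
        return ""
    }
    return [string range $data [expr {$start + 1}] [expr {$stop - 1}]]
}

# Made-up sample data in the shape described earlier in the thread.
set file_data "blahblahblah\ndatadatadata\nprinter(HP Deskjet 440)\n"
set printer [lastParenText $file_data]
```

No regexp is involved, so the trailing newline is not a concern here.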

--
Darren New / San Diego, CA, USA (PST)
The samba was clearly inspired
by the margarita.
Copyright 2008 codecomments.com