Walking backwards with a regex
| sigzero@gmail.com 2005-06-10, 8:58 pm |
| I have a log file that always ends with: (some data)
I want to get at the info between the parens. Since it happens at the
end of the file can I walk a regex backwards to get what I want?
Robert
| |
| Donal K. Fellows 2005-06-10, 8:58 pm |
| sigzero@gmail.com wrote:
> I have a log file that always ends with: (some data)
>
> I want to get at the info between the parens. Since it happens at the
> end of the file can I walk a regex backwards to get what I want?
If you can guess a reasonable value always bigger than your data to
extract, you can optimize using something like this:
set f [open theLogFile]
seek $f -1024 end ;# Assume 1kB is enough
set data [read $f]
close $f
regexp {\((match-your-data-here)\)$} $data -> interestingBit
Remember, Tcl's REs usually think \n is a normal character.
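
A minimal sketch of the same idea wrapped in procs, for anyone who wants to paste and try it. The names readTail and lastParenData are made up for illustration and are not from the thread. It assumes the log is plain single-byte text and that the file may end with a newline, which is trimmed first because $ only anchors at the very end of the string by default.

```tcl
# Hypothetical helper (not from the thread): return at most $maxBytes
# from the end of the file at $path.
proc readTail {path {maxBytes 1024}} {
    set f [open $path r]
    # Only seek when the file is longer than the window; seeking to a
    # position before the start of the file may raise an error.
    if {[file size $path] > $maxBytes} {
        seek $f -$maxBytes end
    }
    set data [read $f]
    close $f
    return $data
}

# Hypothetical helper: return the text inside the final "(...)" of the
# file, or an empty string when the tail does not end with such a group.
proc lastParenData {path} {
    # Drop trailing whitespace so that $ sits right after the ")".
    set data [string trimright [readTail $path]]
    if {[regexp {\(([^)]*)\)$} $data -> interestingBit]} {
        return $interestingBit
    }
    return ""
}
```

Usage would then be something like: puts [lastParenData theLogFile]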
Donal.
| |
| sigzero@gmail.com 2005-06-11, 3:58 am |
| I should have mentioned that the data between the parens is not always
the same, but its position is (always at the end of the data), like:
blahblahblah
datadatadata
mopremoremore
printer(HP Deskjet 440) <<< the printer type is what will always be
different
| |
| Neil Madden 2005-06-11, 3:58 pm |
| sigzero@gmail.com wrote:
> I should have mentioned that the data between the parens is not always
> the same, but its position is (always at the end of the data), like:
>
> blahblahblah
> datadatadata
> mopremoremore
> printer(HP Deskjet 440) <<< the printer type is what will always be
> different
>
Yup - Donal's regexp should deal with that. The $-sign at the end of a
regexp means "anchor to the end of the string". So, you want something like:
regexp {\(([^\)]*)\)$} $input -> stuff_between_parens
alternatively, you could use {\((.*?)\)$} but I've been bitten by
unpredictable performance of non-greedy regexps in the past so tend to
avoid them. If your file is particularly large then you might want to
also use Donal's trick of seeking to near the end before trying the
regexp (if it fails, you can then back up a bit and try again).
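
The "back up a bit and try again" idea could be sketched roughly as below. The proc name findTrailingParens and the doubling of the window are assumptions made for illustration, not something specified in the thread; the sketch also assumes plain single-byte text, so that byte offsets and characters line up.

```tcl
# Hypothetical helper: read a window from the end of the file and keep
# doubling it until the trailing "(...)" matches or the whole file has
# been read. Returns the text between the parens, or "" if none found.
proc findTrailingParens {path {window 1024}} {
    set size [file size $path]
    set f [open $path r]
    set result ""
    while 1 {
        if {$window >= $size} {
            # The window covers the whole file: read from the start.
            seek $f 0 start
        } else {
            seek $f -$window end
        }
        # Trim a trailing newline so $ anchors right after the ")".
        set data [string trimright [read $f]]
        if {[regexp {\(([^)]*)\)$} $data -> result]} break
        # Nothing more to read once the whole file has been tried.
        if {$window >= $size} break
        set window [expr {$window * 2}]
    }
    close $f
    return $result
}
```

Note that an empty return value cannot distinguish "no match" from an empty "()", so a real version might return a flag as well.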
-- Neil
| |
| Robert 2005-06-11, 3:58 pm |
| On 6/11/05 8:30 AM, in article CPAqe.20943$cN2.5729@newsfe4-gui.ntli.net,
"Neil Madden" <nem@cs.nott.ac.uk> wrote:
> sigzero@gmail.com wrote:
>
> Yup - Donal's regexp should deal with that. The $-sign at the end of a
> regexp means "anchor to the end of the string". So, you want something like:
>
> regexp {\(([^\)]*)\)$} $input -> stuff_between_parens
>
> alternatively, you could use {\((.*?)\)$} but I've been bitten by
> unpredictable performance of non-greedy regexps in the past so tend to
> avoid them. If your file is particularly large then you might want to
> also use Donal's trick of seeking to near the end before trying the
> regexp (if it fails, you can then back up a bit and try again).
>
> -- Neil
I will try it on Monday. Thank you both for the answer.
Robert
| |
| Darren New 2005-06-11, 3:58 pm |
| sigzero@gmail.com wrote:
> I want to get at the info between the parens. Since it happens at the
> end of the file can I walk a regex backwards to get what I want?
Is there something wrong with
[string last ( $file_data]
to give you the index?
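
Spelled out, the index-based approach might look like the sketch below. The proc name lastParenGroup is hypothetical, and the check that the data ends with ")" is an added assumption based on the log format described earlier in the thread.

```tcl
# Hypothetical helper: extract the text of the final "(...)" group
# using string indices instead of a regexp.
proc lastParenGroup {data} {
    # Ignore a trailing newline or other trailing whitespace.
    set data [string trimright $data]
    set start [string last "(" $data]
    # Require an opening paren somewhere and a closing paren at the end.
    if {$start < 0 || [string index $data end] ne ")"} {
        return ""
    }
    return [string range $data [expr {$start + 1}] end-1]
}
```

One difference from a regexp such as {\(([^)]*)\)$}: with nested or stray open parens in the last group, string last picks the innermost "(", so the two approaches can return different text.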
--
Darren New / San Diego, CA, USA (PST)
The samba was clearly inspired
by the margarita.