one-liner multi-line regex problem
PERL Beginners archive, April 2005

Kevin Horton

2005-04-25, 3:56 pm

I'm trying to write a perl one-liner that will edit an iCalendar
format file to remove To Do items. The file contains several
thousand lines, and I need to remove several multi-line blocks. The
blocks to remove start with a line "BEGIN:VTODO" (without the quotes)
and end with a line "END:VTODO" (also without quotes).

I've tried the following one-liner,

perl -p -i.bak -e 's/BEGIN:VTODO.*END:VTODO//sg' file_name_to_edit

The .bak file is created, which tells me the one-liner is finding my
file, but the file is identical to the old one - i.e. the regex
doesn't seem to be matching anything.

I'm also wondering whether my proposed one-liner (if it worked) would
be too greedy. Would it pull out everything between the first
BEGIN:VTODO and the last END:VTODO?

I'd appreciate any hints.

Thanks,

Kevin Horton

John Doe

2005-04-25, 3:56 pm

Hi Kevin

just hints, no solution :-)

On Monday, 25 April 2005 12:59, Kevin Horton wrote:
> I'm trying to write a perl one-liner that will edit an iCalendar
> format file to remove To Do items. The file contains several
> thousand lines, and I need to remove several multi-line blocks. The
> blocks to remove start with a line "BEGIN:VTODO" (without the quotes)
> and end with a line "END:VTODO" (also without quotes).
>
> I've tried the following one-liner,
>
> perl -p -i.bak -e 's/BEGIN:VTODO.*END:VTODO//sg' file_name_to_edit


According to perldoc perlrun, -p reads the file _one_ line at a time, so
you can't match multi-line patterns this way.

> The .bak file is created, which tells me the one-liner is finding my
> file, but the file is identical to the old one - i.e. the regex
> doesn't seem to be matching anything.
>
> I'm also wondering whether my proposed one-liner (if it worked) would
> be too greedy.


Yes or no; it depends on the working implementation :-)

> Would it pull out everything between the first
> BEGIN:VTODO and the last END:VTODO?


Yes, if you apply the regex above to a string that contains the whole
file.

>
> I'd appreciate any hints.
>
> Thanks,
>
> Kevin Horton

Jay Savage

2005-04-25, 3:56 pm

On 4/25/05, Kevin Horton <khorton01@rogers.com> wrote:
> I'm trying to write a perl one-liner that will edit an iCalendar
> format file to remove To Do items. The file contains several
> thousand lines, and I need to remove several multi-line blocks. The
> blocks to remove start with a line "BEGIN:VTODO" (without the quotes)
> and end with a line "END:VTODO" (also without quotes).
>
> I've tried the following one-liner,
>
> perl -p -i.bak -e 's/BEGIN:VTODO.*END:VTODO//sg' file_name_to_edit
>
> The .bak file is created, which tells me the one-liner is finding my
> file, but the file is identical to the old one - i.e. the regex
> doesn't seem to be matching anything.



-p causes the file to be read one line at a time, which negates the
usefulness of /s. If you have sufficient RAM to read the entire file
into memory, you can use the -0 option to "slurp" the file:

perl -0777 -p -i.bak -e 's/BEGIN:VTODO.*?END:VTODO//sg' file_name_to_edit

See perldoc perlrun for details.
>
> I'm also wondering whether my proposed one-liner (if it worked) would
> be too greedy. Would it pull out everything between the first
> BEGIN:VTODO and the last END:VTODO?
>


Yes, it will: a greedy .* matches as much as possible, which is why the
one-liner above uses .*? instead.


HTH,

--jay

Dave Gray

2005-04-25, 3:56 pm

> I'm trying to write a perl one-liner that will edit an iCalendar
> format file to remove To Do items. The file contains several
> thousand lines, and I need to remove several multi-line blocks. The
> blocks to remove start with a line "BEGIN:VTODO" (without the quotes)
> and end with a line "END:VTODO" (also without quotes).
>
> I've tried the following one-liner,
>
> perl -p -i.bak -e 's/BEGIN:VTODO.*END:VTODO//sg' file_name_to_edit


Assuming you have enough disk space:

perl -ne 'print unless /^BEGIN:VTODO/ .. /^END:VTODO/' old > new

See perldoc perlrun for more info on perl's command-line parameters.

Kevin Horton

2005-04-26, 3:56 am


On 25-Apr-05, at 10:06 AM, Jay Savage wrote:

> On 4/25/05, Kevin Horton <khorton01@rogers.com> wrote:
>
>
> -p causes the file to be read one line at a time, which negates the
> usefulness of /s. If you have sufficient RAM to read the entire file
> into memory, you can use the -0 option to "slurp" the file:
>
> perl -0777 -p -i.bak -e 's/BEGIN:VTODO.*?END:VTODO//sg'


This seems to work perfectly. I've studied the output for five
minutes, and can't find a problem.

Thank you very much.
>
> see perldoc perlrun for details


I've learned a lot in the last few minutes, now that I know which of
the perldoc files to look in.
>
> Yes it will.


I looked at trying to use the "?" to stop the potential greediness, but
I didn't grok how it worked. Now that I have an example, I think I
understand it (again, as I thought I understood when I was first
puzzling through perl on vacation in Christmas 2003). Hopefully my
understanding this time is more lasting. :)

Thanks so much to the several people who responded.

Kevin Horton
Ottawa, Canada
RV-8 - Finishing Kit
http://www.kilohotel.com/rv8
