Search Pattern for Roman Numerals?

Siegfried Heintze

2005-06-03, 3:56 am

How do I write a pattern for removing roman numerals? The first 10 is
enough.

Thanks,
Siegfried

Jeff 'japhy' Pinyan

2005-06-03, 8:58 am

On Jun 2, Siegfried Heintze said:

> How do I write a pattern for removing roman numerals? The first 10 is
> enough.


Well, the first ten roman numerals are:

I, II, III, IV, V, VI, VII, VIII, IX, X

Just put those in a regex.

s/\b(I|II|...)\b//g;

would remove roman numerals, provided they aren't touching any word
characters.
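
For reference, here is a minimal sketch of that substitution with all ten alternatives written out. The sub name strip_numerals_literal is made up for illustration, and the non-capturing group (?:...) is swapped in for the plain parentheses only because nothing is captured.

```perl
use strict;
use warnings;

# All ten numerals spelled out. The \b on each side forces the engine to
# backtrack through the alternatives until one matches a whole word.
sub strip_numerals_literal {
    my ($text) = @_;
    $text =~ s/\b(?:I|II|III|IV|V|VI|VII|VIII|IX|X)\b//g;
    return $text;
}

print strip_numerals_literal("Chapter VII and Chapter IX"), "\n";
```

Anything longer than a single numeral from the list, such as "XI", is left alone because neither "X" nor "I" sits between two word boundaries there.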

--
Jeff "japhy" Pinyan % How can we ever be the sold short or
RPI Acacia Brother #734 % the cheated, we who for every service
http://japhy.perlmonk.org/ % have long ago been overpaid?
http://www.perlmonks.org/ % -- Meister Eckhart
Jay Savage

2005-06-03, 3:56 pm

On 6/3/05, Jeff 'japhy' Pinyan <japhy@perlmonk.org> wrote:
> Well, the first ten roman numerals are:
>
> I, II, III, IV, V, VI, VII, VIII, IX, X
>
> Just put those in a regex.
>
> s/\b(I|II|...)\b//g;
>
> would remove roman numerals, provided they aren't touching any word
> characters.



This isn't going to get them all; it says to match (between word
boundaries) "I" or "II" or any three non-newlines. So it will catch
"I", "II", "III", and "VII". It will also catch "I" where it's a
pronoun (assuming this is an English text file), and any three-letter
words/constructs.

I would try something like this:

s/\bI(?:I+|V|X)?|VI*|XI*\b//

Note that this will also match "I". You may want to go through and get those by
hand instead if there is any chance of "I" having another function.
If you can identify the context where the numerals appear, you can
make it easier on yourself.
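
As a rough sketch of the context idea: if the numerals only ever follow a label, the label can be made part of the match so that a bare "I" is never touched. The labels (Chapter, Section, Part) and the sub name strip_labelled_numerals are assumptions for illustration, not something from the original question.

```perl
use strict;
use warnings;

# Hypothetical labels; adjust to whatever actually precedes the numerals.
my $LABEL = qr/Chapter|Section|Part/;

sub strip_labelled_numerals {
    my ($text) = @_;
    # Keep the label ($1); drop the whitespace and the numeral after it.
    $text =~ s/\b($LABEL)\s+(?:I|II|III|IV|V|VI|VII|VIII|IX|X)\b/$1/g;
    return $text;
}

print strip_labelled_numerals("I read Chapter IV"), "\n";
```

The pronoun at the start of the sample line survives because it is not preceded by one of the labels.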


HTH,

-- jay
--------------------
daggerquill [at] gmail [dot] com
http://www.engatiki.org
Jeff 'japhy' Pinyan

2005-06-03, 3:56 pm

On Jun 3, Jay Savage said:

> This isn't going to get them all; it says to match (between word
> boundaries) "I" or "II" or any three non-newlines. So it will catch
> "I", "II", "III", and "VII". It will also catch "I" where it's a
> pronoun (assuming this is an English text file), and any three-letter
> words/constructs.


I'm sorry, that regex wasn't meant to be taken literally. I just didn't
feel the need to reproduce the alternations *again*.

> I would try something like this:
>
> s/\bI(?:I+|V|X)?|VI*|XI*\b//


This will get rid of the "I" in "Ishmael". Your \b anchors aren't
effective on the *entire* pattern. You're matching

\bI(?:I+|V|X)?
or
VI*
or
XI*\b
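
A small sketch of the difference, using the pattern quoted above unchanged and then the same branches wrapped in a group so both anchors apply to every branch. The sub names are made up for illustration.

```perl
use strict;
use warnings;

# The quoted pattern as-is: alternation splits it into three branches, so
# the leading \b guards only the first and the trailing \b only the last.
sub strip_unanchored {
    my ($text) = @_;
    $text =~ s/\bI(?:I+|V|X)?|VI*|XI*\b//;
    return $text;
}

# Same branches inside (?:...), so both \b anchors apply to every branch.
sub strip_grouped {
    my ($text) = @_;
    $text =~ s/\b(?:I(?:I+|V|X)?|VI*|XI*)\b//;
    return $text;
}

print strip_unanchored("Ishmael"), "\n";   # loses its leading "I"
print strip_grouped("Ishmael"), "\n";      # left alone
```

Grouping only fixes the anchoring. The branches themselves are still looser than the first ten numerals (I+ accepts "IIII", XI* accepts "XII"), and without /g only the first match is removed.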

The regex I would use would probably be

/\b(?:I{1,3}|IV|VI{0,3}|I?X)\b/
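
To check that pattern against the list, here is a short sketch that stores it with qr// and tests each of the ten numerals, then uses it in a global substitution. The variable and sub names are made up for illustration.

```perl
use strict;
use warnings;

# The pattern from above, precompiled so it can be reused.
my $ROMAN_1_TO_10 = qr/\b(?:I{1,3}|IV|VI{0,3}|I?X)\b/;

# True when the whole string is one of the numerals I through X.
sub is_roman_1_to_10 {
    my ($word) = @_;
    return $word =~ /\A$ROMAN_1_TO_10\z/ ? 1 : 0;
}

# Remove every standalone numeral I through X from the text.
sub strip_roman_1_to_10 {
    my ($text) = @_;
    $text =~ s/$ROMAN_1_TO_10//g;
    return $text;
}

for my $numeral (qw(I II III IV V VI VII VIII IX X XI IIII)) {
    printf "%-5s %s\n", $numeral, is_roman_1_to_10($numeral) ? "match" : "no match";
}
```

A standalone pronoun "I" still matches, so the earlier caveat about checking those by hand applies here as well.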
