PERL Beginners archive, June 2005
Search Pattern for Roman Numerals?
| Siegfried Heintze 2005-06-03, 3:56 am |
| How do I write a pattern for removing roman numerals? The first 10 is
enough.
Thanks,
Siegfried
| |
| Jeff 'japhy' Pinyan 2005-06-03, 8:58 am |
| On Jun 2, Siegfried Heintze said:
> How do I write a pattern for removing roman numerals? The first 10 is
> enough.
Well, the first ten roman numerals are:
I, II, III, IV, V, VI, VII, VIII, IX, X
Just put those in a regex.
s/\b(I|II|...)\b//g;
would remove roman numerals, provided they aren't touching any word
characters.
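A minimal sketch of that alternation written out in full, assuming the ten numerals listed above are the only ones of interest. The helper name strip_numerals and the sample sentence are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: remove the numerals I..X when they stand alone
# as whole words. The alternation lists all ten explicitly.
sub strip_numerals {
    my ($text) = @_;
    $text =~ s/\b(?:I|II|III|IV|V|VI|VII|VIII|IX|X)\b//g;
    return $text;
}

print strip_numerals("Chapter VIII follows Chapter VII"), "\n";
```

Because of the \b on both sides, the order of the alternatives does not matter here: a shorter branch such as "V" fails the trailing \b inside "VIII" and the engine moves on to the next branch. The whitespace around a removed numeral is left in place, and a standalone pronoun "I" is removed as well.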
--
Jeff "japhy" Pinyan % How can we ever be the sold short or
RPI Acacia Brother #734 % the cheated, we who for every service
http://japhy.perlmonk.org/ % have long ago been overpaid?
http://www.perlmonks.org/ % -- Meister Eckhart
| |
| Jay Savage 2005-06-03, 3:56 pm |
| On 6/3/05, Jeff 'japhy' Pinyan <japhy@perlmonk.org> wrote:
> Well, the first ten roman numerals are:
>
> I, II, III, IV, V, VI, VII, VIII, IX, X
>
> Just put those in a regex.
>
> s/\b(I|II|...)\b//g;
>
> would remove roman numerals, provided they aren't touching any word
> characters.
This isn't going to get them all; it says to match (between word
boundaries) "I" or "II" or any three non-newlines. So it will catch
"I", "II", "III", and "VII". It will also catch "I" where it's a
pronoun (assuming this is an English text file), and any three-letter
words/constructs.
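A small sketch of that literal reading, for anyone who wants to see it run. It assumes the "..." is kept as three regex dots, and the helper name matches_as_typed and the word list are invented for the demonstration.

```perl
use strict;
use warnings;

# The pattern exactly as typed: the "..." is three regex dots,
# each matching any single character except a newline.
my $as_typed = qr/\b(I|II|...)\b/;

# Hypothetical helper: does the whole word match the pattern as typed?
sub matches_as_typed {
    my ($word) = @_;
    return $word =~ /\A$as_typed\z/ ? 1 : 0;
}

printf "%-5s %s\n", $_, (matches_as_typed($_) ? 'match' : 'no match')
    for qw(I II III VII IV VIII cat);
```

The two-letter "IV" and the four-letter "VIII" fall through, while an ordinary three-letter word such as "cat" is accepted.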
I would try something like this:
s/\bI(?:I+|V|X)?|VI*|XI*\b//
Note that this will also match "I". You may want to go through and handle
those by hand instead if there is any chance of "I" having another function.
If you can identify the context where the numerals appear, you can
make it easier on yourself.
HTH,
-- jay
--------------------
daggerquill [at] gmail [dot] com
http://www.engatiki.org
| |
| Jeff 'japhy' Pinyan 2005-06-03, 3:56 pm |
| On Jun 3, Jay Savage said:
> On 6/3/05, Jeff 'japhy' Pinyan <japhy@perlmonk.org> wrote:
>
>
> This isn't going to get them all; it says to match (between word
> boundaries) "I" or "II" or any three non-newlines. So it will catch
> "I", "II", "III", and "VII". It will also catch "I" where it's a
> pronoun (assuming this is an English text file), and any three-letter
> words/constructs.
I'm sorry, that regex wasn't meant to be taken literally. I just didn't
feel the need to reproduce the alternations *again*.
> I would try something like this:
>
> s/\bI(?:I+|V|X)?|VI*|XI*\b//
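This will get rid of the "I" in "Ishmael". Your \b anchors aren't
effective on the *entire* pattern, because | binds more loosely than
everything else in the regex, so each \b belongs only to the branch it
sits in. You're matching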
\bI(?:I+|V|X)?
or
VI*
or
XI*\b
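A short sketch of the difference, assuming the same three branches are simply wrapped in a non-capturing group so that both anchors apply to all of them. The sub names are made up, and /g is added to both substitutions (the pattern as posted had none) so that every occurrence is handled.

```perl
use strict;
use warnings;

# The pattern as posted: \b applies only to the first and last branch.
sub strip_ungrouped {
    my ($text) = @_;
    $text =~ s/\bI(?:I+|V|X)?|VI*|XI*\b//g;
    return $text;
}

# The same three branches wrapped in (?:...) so that both \b anchors
# apply to whichever branch matches.
sub strip_grouped {
    my ($text) = @_;
    $text =~ s/\b(?:I(?:I+|V|X)?|VI*|XI*)\b//g;
    return $text;
}

my $line = "Call me Ishmael";
print strip_ungrouped($line), "\n";   # the leading "I" is removed
print strip_grouped($line),   "\n";   # the line is left alone
```

Grouping fixes only the anchoring. Branches such as VI* and XI* still accept strings beyond the first ten numerals, for example "XII" or "VIIII".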
The regex I would use would probably be
/\b(?:I{1,3}|IV|VI{0,3}|I?X)\b/
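A quick way to check that pattern against the ten numerals and a few near misses. This is only a sketch: the variable and helper names, and the test words, are assumptions chosen for the example.

```perl
use strict;
use warnings;

# The pattern from the post above, stored as a compiled regex.
my $roman_1_to_10 = qr/\b(?:I{1,3}|IV|VI{0,3}|I?X)\b/;

# Hypothetical helper: true if the whole string is one of I..X.
sub is_roman_1_to_10 {
    my ($word) = @_;
    return $word =~ /\A$roman_1_to_10\z/ ? 1 : 0;
}

my @numerals = qw(I II III IV V VI VII VIII IX X);
my @others   = qw(IIII VIIII XI Ishmael Vex);

print "$_: ", (is_roman_1_to_10($_) ? "numeral" : "not matched"), "\n"
    for @numerals, @others;

# Used as a substitution, as in the original question:
(my $cleaned = "Section IX and Section X") =~ s/$roman_1_to_10//g;
print "$cleaned\n";
```

As with any of these patterns, a standalone pronoun "I" still counts as a numeral, so the earlier advice about checking the context applies.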