Learning Reg expressions.
| I'm just learning reg expressions. Can anyone tell me how to do
something like:
Search for subjectpage1.htm
&
change it to titlepageX.htm
Where X is an integer 1-100
It's the changing part I don't know how to do - is it possible to specify it
as a reg expr.?
TIA
| |
| James Keasley 2005-01-06, 8:57 pm |
| On 2005-01-06, <127.0.0.1@127.0.0.1> wrote:
> I'm just learning reg expressions. Can anyone tell me how to do
> something like:
>
> Search for subjectpage1.htm
>
> &
>
> change it to titlepageX.htm
>
> Where X is an integer 1-100
>
> It's the changing part I don't know how to do - is it possible to specify it
> as a reg expr.?
easy, where the subjectpage and .htm parts don't change.
$var = "this has subjectpage1.htm in it, and subjectpage99.htm";
$var =~ s/subjectpage(\d+).htm/titlepage$1.htm/g;
print $var . "\n";
The brackets mean "save the part matched here to a special variable"; the
first bracketed section of the regex is saved to $1, the second to $2, and so on.
The g part just means apply this globally throughout the input, rather
than only to the first match.
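As a rough illustration of the two points above, the sketch below runs the same substitution with and without the g modifier. The sample string and the variable names are made up for this example, and the period is written as \. so that it matches only a literal period.

```perl
use strict;
use warnings;

# Hypothetical sample input containing two page names.
my $text = "see subjectpage1.htm and subjectpage99.htm";

# Without /g, only the first match is replaced.
(my $first_only = $text) =~ s/subjectpage(\d+)\.htm/titlepage$1.htm/;

# With /g, every match is replaced; $1 is set afresh for each match.
(my $all = $text) =~ s/subjectpage(\d+)\.htm/titlepage$1.htm/g;

print "$first_only\n";   # see titlepage1.htm and subjectpage99.htm
print "$all\n";          # see titlepage1.htm and titlepage99.htm
```

Copying the string before substituting, as in (my $copy = $text) =~ s/.../.../, leaves the original untouched, which makes the two results easy to compare.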
--
James jamesk[at]homeric[dot]co[dot]uk
I can read your mind, and you should be ashamed of yourself.
| |
| Matt Garrish 2005-01-07, 8:55 am |
|
"James Keasley" <james.keasley@gmail.invalid> wrote in message
news:slrnctrfhk.fjp.james.keasley@athena.homeric.co.uk...
> On 2005-01-06, <127.0.0.1@127.0.0.1> <127.0.0.1@127.0.0.1> wrote:
>
> easy, where the subjectpage and .htm parts don't change.
>
> $var = "this has subjectpage1.htm in it, and subjectpage99.htm";
> $var =~ s/subjectpage(\d+).htm/titlepage$1.htm/g;
> print $var . "\n";
>
> the brackets mean save that part to a special variable, the first of
> which is $1 for the first section of the regex in brackets.
>
> the g part just means apply this globally throughout the input, rather
> than just apply it to the first match.
>
While that will work, it may not be the most effective regular expression.
Assuming that "subjectpage" has no other meaning in the files, there's no
reason to capture the number following it just to reinsert it:
$var =~ s/subjectpage/titlepage/g;
No need to make your regular expressions more complicated than they need to
be (even in a relatively simple case like this). As a case in point, periods
have a special meaning in regular expressions (an unescaped . matches any
single character except a newline), so the .htm on the left-hand side of the
substitution may not always do what you expect. Escaping the period fixes that:
$var =~ s/subjectpage(\d+)\.htm/titlepage$1.htm/g;
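To make the difference concrete, here is a small sketch comparing the unescaped and escaped patterns. The input string is invented for the illustration: its second name deliberately has an "x" where the period should be.

```perl
use strict;
use warnings;

# Hypothetical input: the second name has no literal period before "htm".
my $text = "subjectpage1.htm subjectpage2xhtm";

# Unescaped: "." matches any single character, so both names match.
(my $loose = $text) =~ s/subjectpage(\d+).htm/titlepage$1.htm/g;

# Escaped: "\." matches only a literal period, so only the first name matches.
(my $exact = $text) =~ s/subjectpage(\d+)\.htm/titlepage$1.htm/g;

print "$loose\n";   # titlepage1.htm titlepage2.htm
print "$exact\n";   # titlepage1.htm subjectpage2xhtm
```

The period in the replacement half needs no escaping, since that side is treated as an interpolated string rather than as a pattern.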
The lack of clarity in the original post makes it hard to know exactly what
is needed in this case, though (does X = 1 or some other number, for
example), so neither may be what the OP is after.
Matt
| |
| James Keasley 2005-01-07, 8:55 am |
| On 2005-01-07, Matt Garrish <matthew.garrish@sympatico.ca> wrote:
> While that will work, it may not be the most effective regular expression.
> Assuming that "subjectpage" has no other meaning in the files, there's no
> reason to capture the number following it just to reinsert it:
>
> $var =~ s/subjectpage/titlepage/g;
Good point, always worth keeping it simple: less to go wrong.
> No need to make your regular expressions more complicated than they need to
> be (even in a relatively simple case like this).
Surely you jest, what's the point of using Perl if you don't use
hopelessly opaque regexes whenever possible? It's good as "job-security"
coding. ;)
> As a case in point, periods
> have a special meaning in regular expressions, so the .htm on the left-hand
> side of the substitution may not always do what you expect.
>
> $var =~ s/subjectpage(\d+)\.htm/titlepage$1.htm/g;
Bah, I _knew_ I had forgotten something, but that type of brainfart always
seems to be the hardest type to nail down: it does what is expected, it
just matches some extra stuff as well. BTDTGTT
--
James jamesk[at]homeric[dot]co[dot]uk
"Luge strategy? Lie flat and try not to die." -Carmen Boyle
(Olympic Luge Gold Medal winner 1996)
| |
| Joe Smith 2005-01-07, 8:55 am |
| James Keasley wrote:
> $var = "this has subjectpage1.htm in it, and subjectpage99.htm";
> $var =~ s/subjectpage(\d+).htm/titlepage$1.htm/g;
> print $var . "\n";
That's the way to do it if the number for titlepageX.htm is the
same as the number for subjectpageX.htm. But if the titlepageX
numbers need to be consecutive even when the subjectpageX numbers
are out of order, use ++ and s///e. The /e modifier causes the
replacement part of s/// to be evaluated as a Perl expression.
---------------------------
my @lines = ('First line has subjectpage1.htm',
             'subjectpage9.htm has been replaced the second',
             'Former fourth line move up one subjectpage4.htm');
my $currentpage = 81;   # first page number to hand out
foreach (@lines) {
    # /e evaluates the replacement as code; ++ advances the counter per match
    s/subjectpage\d+\.htm/'titlepage' . $currentpage++ . '.htm'/eg;
}
print join("\n", @lines), "\n";
---------------------------
First line has titlepage81.htm
titlepage82.htm has been replaced the second
Former fourth line move up one titlepage83.htm
---------------------------
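Since /e evaluates the replacement as code, the captured number can also take part in arithmetic. The sketch below is only an illustration of that idea, not something asked for in the thread: the offset of 10 and the sample line are assumptions made up for the example.

```perl
use strict;
use warnings;

my $offset = 10;   # hypothetical renumbering offset
my $line   = "links: subjectpage1.htm, subjectpage9.htm";

# /e evaluates the replacement as Perl code, so $1 can be used in arithmetic.
$line =~ s/subjectpage(\d+)\.htm/'titlepage' . ($1 + $offset) . '.htm'/eg;

print "$line\n";   # links: titlepage11.htm, titlepage19.htm
```

Here each new number depends on the old one, whereas the ++ version above ignores the old numbers entirely and hands out consecutive ones.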
-Joe