failed substitution
PERL Beginners archive, November 2006

Beginner

2006-11-24, 6:56 pm

Hi,

I have a number of jpegs I wanted to rename. I wrote a short script
to do it but the new file name is not always generated correctly. The
script should find the last letter in the filename (before the
extension) and substitute it for '_a'.

If you look at the results below you'll see that 'a' and 'b' fail but
'c' worked. I don't understand why.

DSC00092a.jpg -> DSC00092a.jpg a
DSC00093b.jpg -> DSC00093b.jpg b
DSC00094c.jpg -> DSC00094_a.jpg c
DSC00095d.jpg -> DSC00095d.jpg d
DSC00096e.jpg -> DSC00096e.jpg e
DSC00097f.jpg -> DSC00097f.jpg f
DSC00098g.jpg -> DSC00098g.jpg g
DSC00099h.jpg -> DSC00099h.jpg h
DSC00100i.jpg -> DSC00100i.jpg i
DSC00101j.jpg -> DSC00101_a.jpg j
DSC00102k.jpg -> DSC00102_a.jpg k
DSC00103l.jpg -> DSC00103l.jpg l
...snip

Here is the script, there isn't much to it. Can anyone explain why the
substitution fails?
TIA
Dp.


#!/bin/perl
# Active State 5.8.6.811

use strict;
use warnings;
use File::Basename;

my $dir = 'D:/Temp/jpegs/thumbs/';
my @files = glob("${dir}*.jpg");

foreach my $f (@files) {
(my $l) = ($f =~ /([a-z]|[a-z][a-z])\.jpg/);
(my $new = $f) =~ s/$l/_a/;
my $basef = basename($f);
my $basenew = basename($new);
print "$basef -> $basenew $l\n";
}


boyd

2006-11-24, 6:56 pm

In article <4566F811.7444.309FC868@dermot.sciencephoto.com>,
dermot@sciencephoto.com (Beginner) wrote:

> Hi,
>


> Here the script, there isn't much to it. Can anyone explain why the
> substitute fails?
> TIA
> Dp.
>
>
> #!/bin/perl
> # Active State 5.8.6.811
>
> use strict;
> use warnings;
> use File::Basename;
>
> my $dir = 'D:/Temp/jpegs/thumbs/';
> my @files = glob("${dir}*.jpg");
>
> foreach my $f (@files) {
> (my $l) = ($f =~ /([a-z]|[a-z][a-z])\.jpg/);
> (my $new = $f) =~ s/$l/_a/;
> my $basef = basename($f);
> my $basenew = basename($new);
> print "$basef -> $basenew $l\n";
> }


Strange. It worked for me - I used your data as input and got:

DSC00092a.jpg -> DSC00092_a.jpg a
DSC00093b.jpg -> DSC00093_a.jpg b
DSC00094c.jpg -> DSC00094_a.jpg c
DSC00095d.jpg -> DSC00095_a.jpg d
DSC00096e.jpg -> DSC00096_a.jpg e
DSC00097f.jpg -> DSC00097_a.jpg f
DSC00098g.jpg -> DSC00098_a.jpg g
DSC00099h.jpg -> DSC00099_a.jpg h
DSC00100i.jpg -> DSC00100_a.jpg i
DSC00101j.jpg -> DSC00101_a.jpg j
DSC00102k.jpg -> DSC00102_a.jpg k
DSC00103l.jpg -> DSC00103_a.jpg l

as expected. Maybe you have some hidden characters or something.

By the way, the regexp parser would have less work to do with:
(my $l) = ($f =~ /([a-z]|[a-z][a-z])\.jpg$/);
(in other words, tell it that the string ends in jpg).

Boyd


D. Bolliger

2006-11-24, 6:56 pm

Beginner am Freitag, 24. November 2006 14:48:
> Hi,
>
> I have a number of jpegs I wanted to rename. I wrote a short script
> to do it but the new file name is not always generated correctly. The
> script should find the last letter in the filename (before the
> extension) and substitute it for '_a'.


Hi Beginner

I assume that you mean "substitute with _a".

> If you look at the results below you'll see that 'a' and 'b' fail but
> 'c' worked. I don't understand why.
>
> DSC00092a.jpg -> DSC00092a.jpg a
> DSC00093b.jpg -> DSC00093b.jpg b
> DSC00094c.jpg -> DSC00094_a.jpg c
> DSC00095d.jpg -> DSC00095d.jpg d
> DSC00096e.jpg -> DSC00096e.jpg e
> DSC00097f.jpg -> DSC00097f.jpg f
> DSC00098g.jpg -> DSC00098g.jpg g
> DSC00099h.jpg -> DSC00099h.jpg h
> DSC00100i.jpg -> DSC00100i.jpg i
> DSC00101j.jpg -> DSC00101_a.jpg j
> DSC00102k.jpg -> DSC00102_a.jpg k
> DSC00103l.jpg -> DSC00103l.jpg l
> ...snip
>
> Here the script, there isn't much to it. Can anyone explain why the
> substitute fails?


I probably should not admit this in public ;-) but, just to demonstrate one way
to track down the reason for a malfunction:

Because I did not see an error at first glance, I...

> #!/bin/perl
> # Active State 5.8.6.811
>
> use strict;
> use warnings;
> use File::Basename;
>
> my $dir = 'D:/Temp/jpegs/thumbs/';
> my @files = glob("${dir}*.jpg");


...replaced these two lines with simply

my @files=qw(DSC00092a.jpg DSC00094c.jpg); # etc

and everything worked fine.
Then, I created these files in the current directory, and again everything
worked fine.

Then, I made a subdirectory, moved the files over, ...


> foreach my $f (@files) {
> (my $l) = ($f =~ /([a-z]|[a-z][a-z])\.jpg/);
> (my $new = $f) =~ s/$l/_a/;


...placed here a

warn "new=$new";

and got (excerpt):

new=/home/d_ani/ramsch/thumbs/DSC00092a.jpg at ./script.pl line 17. # !!
new=/home/dani/ramsch/thum_as/DSC00093b.jpg at ./script.pl line 17. # !!
new=/home/dani/rams_ah/thumbs/DSC00094c.jpg at ./script.pl line 17. # !!
new=/home/dan_a/ramsch/thumbs/DSC00100i.jpg at ./script.pl line 17. # !!
new=/home/dani/ramsch/thumbs/DSC00101_a.jpg at ./script.pl line 17.

> my $basef = basename($f);
> my $basenew = basename($new);
> print "$basef -> $basenew $l\n";
> }


And now it's extraordinarily obvious that the error is

(my $new = $f) =~ s/$l/_a/;

which simply searches for the first occurrence of the contents of $l anywhere
in the full path and replaces it with '_a'. This makes the malfunction dependent
on the contents of $dir.

Instead, this line should be more specific, e.g.:

(my $new = $f) =~ s/$l\.jpg$/_a\.jpg/;

(Note that I anchor with $ since "DSC00092a.jpg" is a valid path name :-) )
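
A minimal sketch of the difference between the two substitutions. The path is hypothetical (modelled on the warn output above) and $l is set by hand instead of being captured; \Q...\E is an extra precaution not used in the thread:

```perl
use strict;
use warnings;

# Hypothetical path, modelled on the warn output above.
my $f = '/home/dani/ramsch/thumbs/DSC00093b.jpg';
my $l = 'b';    # the letter that sits in front of ".jpg"

# Unanchored: replaces the first "b" found anywhere in the path.
(my $unanchored = $f) =~ s/$l/_a/;

# Anchored: only a "b" directly in front of a final ".jpg" qualifies.
# \Q...\E quotes any regex metacharacters that $l might contain.
(my $anchored = $f) =~ s/\Q$l\E\.jpg$/_a.jpg/;

print "$unanchored\n";    # /home/dani/ramsch/thum_as/DSC00093b.jpg
print "$anchored\n";      # /home/dani/ramsch/thumbs/DSC00093_a.jpg
```

The first "b" in this path is in the directory name "thumbs", which is why the unanchored version leaves the file name alone.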

Of course it would have been sufficient to only present this last substitution
to lead you to an "aha!", but I think it's important to have a personal
strategy to search for errors in the dark :-)


btw, the foreach code can at least be shortened to:

foreach my $basef (map basename ($_), @files) {
(my $l) = ($basef =~ /([a-z]{1,2})\.jpg$/);
# above line is still problematic: What if the match fails?

(my $basenew = $basef) =~ s/$l\.jpg$/_a\.jpg/;
print "$basef -> $basenew $l\n";
}

and it can certainly be optimized further in several ways (e.g. if you don't need the
last print statement, $l could be eliminated), but I'm so tired and
brain dead at the moment :-)
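
One possible way to deal with the "what if the match fails?" case and to drop $l at the same time. This is only a sketch: the file list is made up, and skipping non-matching names with next is an assumption about what should happen to them:

```perl
use strict;
use warnings;
use File::Basename;

# Hypothetical file list; the second name has no lowercase suffix.
my @files = (
    'thumbs/DSC00092a.jpg',
    'thumbs/DSC00093.jpg',
    'thumbs/DSC00104ab.jpg',
);

my @renamed;
foreach my $basef (map { basename($_) } @files) {
    # s/// returns false when nothing was replaced, so the substitution
    # doubles as the test and $l is no longer needed.
    (my $basenew = $basef) =~ s/[a-z]{1,2}\.jpg$/_a.jpg/
        or next;
    push @renamed, "$basef -> $basenew";
}

print "$_\n" for @renamed;
```

With this list only the first and third names are reported; DSC00093.jpg is skipped because it has no letter in front of ".jpg".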

Dani


John W. Krahn

2006-11-24, 6:56 pm

Beginner wrote:
> Hi,


Hello,

> I have a number of jpegs I wanted to rename. I wrote a short script
> to do it but the new file name is not always generated correctly. The
> script should find the last letter in the filename (before the
> extension) and substitute it for '_a'.
>
> If you look at the results below you'll see that 'a' and 'b' fail but
> 'c' worked. I don't understand why.
>
> DSC00092a.jpg -> DSC00092a.jpg a
> DSC00093b.jpg -> DSC00093b.jpg b
> DSC00094c.jpg -> DSC00094_a.jpg c
> DSC00095d.jpg -> DSC00095d.jpg d
> DSC00096e.jpg -> DSC00096e.jpg e
> DSC00097f.jpg -> DSC00097f.jpg f
> DSC00098g.jpg -> DSC00098g.jpg g
> DSC00099h.jpg -> DSC00099h.jpg h
> DSC00100i.jpg -> DSC00100i.jpg i
> DSC00101j.jpg -> DSC00101_a.jpg j
> DSC00102k.jpg -> DSC00102_a.jpg k
> DSC00103l.jpg -> DSC00103l.jpg l
> ...snip
>
> Here the script, there isn't much to it. Can anyone explain why the
> substitute fails?
>
>
> #!/bin/perl
> # Active State 5.8.6.811
>
> use strict;
> use warnings;
> use File::Basename;
>
> my $dir = 'D:/Temp/jpegs/thumbs/';
> my @files = glob("${dir}*.jpg");
>
> foreach my $f (@files) {
> (my $l) = ($f =~ /([a-z]|[a-z][a-z])\.jpg/);


Perl's alternation tries the alternatives from left to right and keeps the first
one that lets the rest of the pattern match, so ([a-z]|[a-z][a-z]) only reaches
its second branch when a single letter cannot be followed by '.jpg'. Perhaps you
meant ([a-z]{1,2}) instead?
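
A quick way to check what each form captures. The file names are hypothetical and the %captured hash exists only for this illustration:

```perl
use strict;
use warnings;

# Hypothetical names: one single-letter suffix, one two-letter suffix.
my %captured;
for my $name ('DSC00092a.jpg', 'DSC00104ab.jpg') {
    my ($alt)   = $name =~ /([a-z]|[a-z][a-z])\.jpg/;
    my ($quant) = $name =~ /([a-z]{1,2})\.jpg/;
    $captured{$name} = [ $alt, $quant ];
    print "$name: alternation=$alt quantifier=$quant\n";
}
```

Both patterns should capture the same text for these names; the quantifier form just states the intent more directly.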


> (my $new = $f) =~ s/$l/_a/;


$f contains the complete path so you are probably modifying something other
than the file name.

You probably want something like:

( my $new = $f ) =~ s/([a-z]{1,2})(?=\.jpg\z)/_a/;


> my $basef = basename($f);
> my $basenew = basename($new);
> print "$basef -> $basenew $l\n";
> }



John
--
Perl isn't a toolbox, but a small machine shop where you can special-order
certain sorts of tools at low cost and in short order. -- Larry Wall


Beginner

2006-11-28, 7:57 am

On 24 Nov 2006 at 23:55, D. Bolliger wrote:

> Beginner am Freitag, 24. November 2006 14:48:
>
> Hi Beginner
>
> I assume that you mean "substitute with _a".
>
>
> I should not mention that in the public ;-) but, just to demonstrate one way
> to search for a reason for a malfunction:
>
> Because I did not see an error at first glance, I...
>
>
> ...replaced these two lines with simply
>
> my @files=qw(DSC00092a.jpg DSC00094c.jpg); # etc
>
> and everything worked fine.
> Then, I created these files in the current directory, and again everything
> worked fine.
>
> Then, I made a subdirectory, moved the file over, ...
>
>
>
> ...placed here a
>
> warn "new=$new";
>
> and got (excerpt):
>
> new=/home/d_ani/ramsch/thumbs/DSC00092a.jpg at ./script.pl line 17. # !!
> new=/home/dani/ramsch/thum_as/DSC00093b.jpg at ./script.pl line 17. # !!
> new=/home/dani/rams_ah/thumbs/DSC00094c.jpg at ./script.pl line 17. # !!
> new=/home/dan_a/ramsch/thumbs/DSC00100i.jpg at ./script.pl line 17. # !!
> new=/home/dani/ramsch/thumbs/DSC00101_a.jpg at ./script.pl line 17.
>
>
> And now it's extraordinarily obvious that the error is
>
> (my $new = $f) =~ s/$l/_a/;
>
> which simply searches for the first occurrence of the contents of $l anywhere
> in the full path and replaces it with '_a'. This makes the malfunction dependent
> on the contents of $dir.
>
> Instead, this line should be more specific, e.g.:
>
> (my $new = $f) =~ s/$l\.jpg$/_a\.jpg/;
>
> (Note that I anchor with $ since "DSC00092a.jpg" is a valid path name :-) )
>
> Of course it would have been sufficient to only present this last substitution
> to lead you to an "aha!", but I think it's important to have a personal
> strategy to search for errors in the dark :-)
>
>
> btw, the foreach code can at least be shortened to:
>
> foreach my $basef (map basename ($_), @files) {
> (my $l) = ($basef =~ /([a-z]{1,2})\.jpg$/);
> # above line is still problematic: What if the match fails?
>
> (my $basenew = $basef) =~ s/$l\.jpg$/_a\.jpg/;
> print "$basef -> $basenew $l\n";
> }
>
> and it can certainly be optimized further in several ways (e.g. if you don't need the
> last print statement, $l could be eliminated), but I'm so tired and
> brain dead at the moment :-)


Thanx Dani and John,

I should have realised that I was making the substitution
on the full path and not the basename.

I appreciate you showing me how to shorten the code. Can I ask if I
am reading it right?

foreach my $basef (map basename ($_), @files) {
(my $l) = ($basef =~ /([a-z]{1,2})\.jpg$/);

Does this basename everything in @files and make it $basef?


In John's example I am not sure what is happening with this RegEx:
( my $new = $f ) =~ s/([a-z]{1,2})(?=\.jpg\z)/_a/;

There are 2 sets of parentheses but one lvalue, $new. So is that any
character a-z, 1 or 2 times, and does the ? mean 1 or more times? What is
the \z switch here? I can't find it in perlre.

Thanx again.
Dp.



D. Bolliger

2006-11-28, 7:58 am

Beginner am Dienstag, 28. November 2006 11:55:

[snipped for brevity, sorry]

> Thanx Dani and John,
>
> I should have realised that I was making the substitution
> on the full path and not the basename.
>
> I appreciate you showing me how to shorten the code. Can I ask if I
> am reading it right.
>
> foreach my $basef (map basename ($_), @files) {
> (my $l) = ($basef =~ /([a-z]{1,2})\.jpg$/);
>
> Does this basename everything in @files and make it $basef?


Yes, every file in @files is "piped" through map, which applies the basename
function to it; foreach then sets $basef to each resulting name in turn.

For the powerful map function see:

perldoc -f map
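
Spelled out in two steps, the map/foreach combination looks roughly like this. The paths are hypothetical stand-ins for the glob() result, and @names is a temporary that the one-line version does not need:

```perl
use strict;
use warnings;
use File::Basename;

# Hypothetical paths standing in for the glob() result.
my @files = ('/tmp/thumbs/DSC00092a.jpg', '/tmp/thumbs/DSC00093b.jpg');

# map calls basename once per element and returns the list of results.
my @names = map { basename($_) } @files;

# foreach then walks that list, with $basef set to each name in turn.
foreach my $basef (@names) {
    print "$basef\n";
}
```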

> In John's example I am not sure what is happening with this RegEx:
> ( my $new = $f ) =~ s/([a-z]{1,2})(?=\.jpg\z)/_a/;


First, $f is copied into $new and the regex is applied to $new.

(?=something_here) is a positive lookahead: it asserts that something_here
follows, without making it part of the match. The regex says:
"match one or two a-z chars that are followed by the string '.jpg' at the very
end of the string, and replace this/these char(s) with '_a'". See

perldoc perlre
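
To see what the lookahead buys, here is a small comparison (the path is hypothetical). In the second form '.jpg' is part of the match and therefore has to be written again in the replacement:

```perl
use strict;
use warnings;

my $f = 'thumbs/DSC00094c.jpg';    # hypothetical path

# With the lookahead, ".jpg" is only checked, not consumed, so it stays
# in place and need not be repeated in the replacement.
(my $with_lookahead = $f) =~ s/([a-z]{1,2})(?=\.jpg\z)/_a/;

# Without it, ".jpg" is part of the match and has to be put back.
(my $without = $f) =~ s/([a-z]{1,2})\.jpg\z/_a.jpg/;

print "$with_lookahead\n";
print "$without\n";
```

Both lines should print the same new name. Of the two sets of parentheses, only the first one captures; (?=...) is a grouping that checks but does not capture.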

> There are 2 sets of parentheses but one lvalue, $new. So is that any
> character a-z, 1 or 2 times and the ? mean 1 or more times?


No, the question mark is part of the '(?=)' construct, all described in
perlre.

> What is the \z switch here? I can't find it in perlre.


It's under the paragraph "Perl defines the following zero-width assertions"

(btw, at least on linux, you can search within the perldoc pager by typing
a '/' while viewing)
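
In short, \z matches only at the very end of the string, while $ also matches just before a trailing newline. A small sketch, assuming a name that still carries its newline (for example one read from a file without chomp):

```perl
use strict;
use warnings;

# A name with a trailing newline, as read from a file without chomp.
my $name = "DSC00092a.jpg\n";

my $dollar = $name =~ /\.jpg$/  ? 'match' : 'no match';
my $z      = $name =~ /\.jpg\z/ ? 'match' : 'no match';

print "\$ : $dollar\n";    # match
print "\\z: $z\n";         # no match
```

For names that come from glob() there is no trailing newline, so both anchors behave the same there.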

Dani