Home > Archive > PERL Beginners > August 2005 > match basename file and s / / /;
match basename file and s / / /;
|
|
| Brian Volk 2005-08-03, 10:00 pm |
| Hi All~
my program below is not returning any errors but nothing is happening to the
.txt files like I hoped. Can someone pls take a look and let me know what
I'm doing wrong.
----- Thank you! ----
# If there is a .pdf file and a matching .txt file, open the .txt file and s
% http://.* % "$link" %
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;
my $pdf_dir = "j:/flash_host/ecomm/descriptions/product/MSDS";
opendir(PDFDIR, $pdf_dir) or die "Can't open the $pdf_dir: $!\n";
# read file/directory names in that directory into @htmls
my @pdfs = readdir(PDFDIR) or die "Unable to read current dir:$!\n";
closedir(PDFDIR);
my $text_dir = "c:/brian/descriptions/product/small";
opendir (TEXTDIR, $text_dir) or die "Can't open $text_dir: $!";
# read all the .txt files and load @ARGV for <> operator
@ARGV = map { "$text_dir/$_" } grep { !/^\./ } readdir TEXTDIR;
my %PDFDIR_LIST;
$PDFDIR_LIST{$_}=1 for @pdfs;
foreach my $text_file (<> ) {
my ($basename, $suffix) = fileparse($text_file,'.txt');
my $link = "descriptions/product/small/$basename.pdf";
if( $PDFDIR_LIST{"$basename.pdf"} ){
open FH, $text_file or die "can't open $text_file: $!";
s% http://.* % $link %;
next;}
}
close (FH);
closedir (TEXTDIR);
__END__
Brian Volk
HP Products
317.298.9950 x1245
<mailto:bvolk@hpproducts.com> bvolk@hpproducts.com
| |
| John Doe 2005-08-04, 9:01 am |
| Brian Volk wrote on Wednesday, 3 August 2005, 17:50:
> Hi All~
>
> my program below is not returning any errors but nothing is happening to
> the .txt files like I hoped. Can someone pls take a look and let me know
> what I'm doing wrong.
>
> ----- Thank you! ----
>
> # If there is a .pdf file and a matching .txt file, open the .txt file and
> s % http://.* % "$link" %
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
> use File::Basename;
>
> my $pdf_dir = "j:/flash_host/ecomm/descriptions/product/MSDS";
> opendir(PDFDIR, $pdf_dir) or die "Can't open the $pdf_dir: $!\n";
>
> # read file/directory names in that directory into @htmls
>
> my @pdfs = readdir(PDFDIR) or die "Unable to read current dir:$!\n";
>
> closedir(PDFDIR);
>
> my $text_dir = "c:/brian/descriptions/product/small";
> opendir (TEXTDIR, $text_dir) or die "Can't open $text_dir: $!";
>
> # read all the .txt files and load @ARGV for <> operator
>
> @ARGV = map { "$text_dir/$_" } grep { !/^\./ } readdir TEXTDIR;
>
> my %PDFDIR_LIST;
>
> $PDFDIR_LIST{$_}=1 for @pdfs;
>
> foreach my $text_file (<> ) {
This tries to open and read the files whose names are specified in @ARGV:
$ perl
local @ARGV=(qw /a b c/);
while (<> ) {print};
Can't open a: No such file or directory at - line 2.
Can't open b: No such file or directory at - line 2.
Can't open c: No such file or directory at - line 2.
>
> my ($basename, $suffix) = fileparse($text_file,'.txt');
> my $link = "descriptions/product/small/$basename.pdf";
>
> if( $PDFDIR_LIST{"$basename.pdf"} ){
> open FH, $text_file or die "can't open $text_file: $!";
> s% http://.* % $link %;
> next;}
>
> }
> close (FH);
> closedir (TEXTDIR);
>
> __END__
hth, joe
| |
| Brian Volk 2005-08-04, 5:01 pm |
|
> -----Original Message-----
> From: John Doe [mailto:security.department@tele2.ch]
> Sent: Thursday, August 04, 2005 8:03 AM
> To: beginners@perl.org
> Subject: Re: match basename file and s / / /;
>
>
> Brian Volk wrote on Wednesday, 3 August 2005, 17:50:
> [snip]
$^I = '.bak'; # I added this....
>
> This tries to open and read the files whose names are
> specified in @ARGV:
>
> $ perl
> local @ARGV=(qw /a b c/);
> while (<> ) {print};
>
> Can't open a: No such file or directory at - line 2.
> Can't open b: No such file or directory at - line 2.
> Can't open c: No such file or directory at - line 2.
>
Thank you for the reply, but I don't think I understand. The .txt files
that I am loading into @ARGV are the files that I want to open so I can
replace the URL with a local path. I thought that I had to include the
foreach $text_file so I could match the basename of the .txt file and the .pdf
file and then include the "$basename.pdf" in the $link. I'm pretty sure I
am matching the files up correctly, but the .txt files are being erased instead
of having the URL replaced with the local path... I think I'm ... :~)
print; # I added this too..
>
> hth, joe
| |
| Jay Savage 2005-08-04, 5:01 pm |
| On 8/4/05, Brian Volk <BVolk@hpproducts.com> wrote:
[snip]
> Thank you for the reply but I don't think I understand.. The .txt files
> that I am loading into @ARGV are the files that I want to open and
> substitute the URL for a local path.... I thought that I had to include the
> foreach $text_file so I could match the basename of the .txt file and .pdf
> file and then include the "$basename.pdf" in the $link... I'm pretty sure I
> am matching the file up correctly but the .txt file are being erased instead
> of sub'ing the url w/ the local path... I think I'm ... :~)
>
$ARGV, not $_, holds the name of the currently open file when using <>. In
scalar context <> returns one line at a time. It also concatenates the
files listed in @ARGV, treating them as essentially one file (at least
most of the time; it is possible to find out which file you're in; read
perlop). In list context, it returns a list of all the lines from the
concatenated files. So, let's say I have three files in @ARGV, a.a,
b.b, and c.c, each ten lines long.
while (<> ) {
#cycles through 30 times, not stopping between files
}
@lines = <>;
#returns a single list of all the lines in the files
# with $lines[0] being the first line of a.a
# and $lines[29] being the last line of c.c
What you probably want here is:
foreach my $text_file (@ARGV) {
# ...and then the rest of your code looks like it should work.
In your current code, the block executes once for each line of each
file, and the variable $text_file = 'a line from whichever file you
happen to be in at the moment'. Since 'a line from whichever file you
happen to be in at the moment'.pdf almost certainly doesn't exist,
nothing happens. But you've set $^I, so when <> opens each file it is
renamed $ARGV.bak and an empty file $ARGV is created.
Key point: <> != @ARGV (!= $ARGV)
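The difference between looping over @ARGV and looping over <> can be checked with a minimal sketch. It assumes two hypothetical scratch files (a.a and b.b, two lines each) created with File::Temp; none of these names come from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch files, two lines each (not from the thread).
my $dir = tempdir( CLEANUP => 1 );
my @names;
for my $name (qw(a.a b.b)) {
    my $path = "$dir/$name";
    open my $fh, '>', $path or die "can't write $path: $!";
    print {$fh} "$name line 1\n", "$name line 2\n";
    close $fh;
    push @names, $path;
}

# Iterating over @ARGV yields file NAMES: one pass per file.
@ARGV = @names;
my $name_count = 0;
$name_count++ for @ARGV;

# <> in list context yields LINES from all the files, concatenated.
my @lines = <>;

printf "%d names, %d lines\n", $name_count, scalar @lines;
```

With two files of two lines each, the loop over @ARGV runs twice while <> hands back four lines, which is why a foreach over <> never sees a file name.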
HTH
-- jay
--------------------------------------------------
This email and attachment(s): [ ] blogable; [ x ] ask first; [ ]
private and confidential
daggerquill [at] gmail [dot] com
http://www.tuaw.com http://www.dpguru.com http://www.engatiki.org
| |
| Brian Volk 2005-08-04, 5:01 pm |
|
> -----Original Message-----
> From: Jay Savage [mailto:daggerquill@gmail.com]
> Sent: Thursday, August 04, 2005 12:46 PM
> To: Brian Volk; beginners perl
> Subject: Re: match basename file and s / / /;
>
>
> On 8/4/05, Brian Volk <BVolk@hpproducts.com> wrote:
> [snip]
>
> $ARGV, not $_, holds the name of the currently open file when using <>. In
> scalar context <> returns one line at a time. It also concatenates the
> files listed in @ARGV, treating them as essentially one file (at least
> most of the time; it is possible to find out which file you're in; read
> perlop). In list context, it returns a list of all the lines from the
> concatenated files. So, let's say I have three files in @ARGV, a.a,
> b.b, and c.c, each ten lines long.
>
> while (<> ) {
> #cycles through 30 times, not stopping between files
> }
>
> @lines = <>;
> #returns a single list of all the lines in the files
> # with $lines[0] being the first line of a.a
> # and $lines[29] being the last line of c.c
>
> What you probably want here is:
>
> foreach my $text_file (@ARGV) {
>
> # ...and then the rest of your code looks like it should work.
>
> In your current code, the block executes once for each line of each
> file, and the variable $text_file = 'a line from whichever file you
> happen to be in at the moment'. Since 'a line from whichever file you
> happen to be in at the moment'.pdf almost certainly doesn't exist,
> nothing happens. But you've set $^I, so when <> opens each file it is
> renamed $ARGV.bak and an empty file $ARGV is created.
>
> Key point: <> != @ARGV (!= $ARGV)
>
> HTH
>
> -- jay
Jay, thank you very much for the reply and the detailed explanation. I
learned a lot. Unfortunately, I think there is something else wrong. When
I execute the program nothing happens. I'm just guessing, but I think my
problem lies somewhere with this line: if( $PDFDIR_LIST{"$basename.pdf"} ) {
If I take out the .pdf and the quotes I don't get the "Use of uninitialized
value" error, but still nothing happens. Anyway, thanks again for the
explanation, but for now it's back to the drawing board and waiting for that
light in my head to go off! :~)
Brian
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;
my $pdf_dir = "j:/flash_host/ecomm/descriptions/product/MSDS";
opendir(PDFDIR, $pdf_dir) or die "Can't open the $pdf_dir: $!\n";
# read file/directory names in that directory into @pdfs
my @pdfs = readdir(PDFDIR) or die "Unable to read current dir:$!\n";
closedir(PDFDIR);
my $text_dir = "c:/brian/descriptions/product/small";
opendir (TEXTDIR, $text_dir) or die "Can't open $text_dir: $!";
# read all the .txt files and load @ARGV for <> operator
@ARGV = map { "$text_dir/$_" } grep { !/^\./ } readdir TEXTDIR;
my %PDFDIR_LIST;
$PDFDIR_LIST{$_}=1 for @pdfs;
$^I = '.bak';
foreach my $text_file (@ARGV) {
my ($basename, $suffix) = fileparse($text_file,'.txt');
my $link = "descriptions/product/small/$basename.pdf";
if( $PDFDIR_LIST{"$basename.pdf"} ) {
s% http://.* % $link %;
print;
next;
}
}
closedir (TEXTDIR);
__END__
| |
| Adam Wuellner 2005-08-04, 5:01 pm |
| On 8/4/05, Brian Volk <BVolk@hpproducts.com> wrote:
> for now it's back to the drawing board and waiting for that light in my head
> to go off! :~)
Brian,
I took the last script you posted and rewrote it the way I would have
gone about it. It doesn't use in-place editing or @ARGV, because I
don't quite grasp how to make that work in this situation. Anyway,
see if it helps. I apologize if there are errors; I couldn't
quickly think of a way to test it.
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;
use File::Copy;
my $pdf_dir = "j:/flash_host/ecomm/descriptions/product/MSDS";
# read file/directory names in that directory into @pdfs
opendir PDFDIR, $pdf_dir
            or die "Can't opendir $pdf_dir: $!\n";
my @pdfs = readdir(PDFDIR);
closedir PDFDIR;
my $text_dir = "c:/brian/descriptions/product/small";
opendir TEXTDIR, $text_dir
            or die "Can't opendir $text_dir: $!";
my @txts = map { "$text_dir/$_" } grep { !/^\./ } readdir TEXTDIR;
closedir TEXTDIR;
my %PDFDIR_LIST;
$PDFDIR_LIST{$_}=1 for @pdfs;
foreach my $text_file (@txts) {
    my ($basename, $suffix) = fileparse($text_file,'.txt');
    my $link = "descriptions/product/small/$basename.pdf";

    if( $PDFDIR_LIST{"$basename.pdf"} ) {
        my $out_file = "$textfile.new";
        open INPUT, "< $text_file"
                    or die "can't open $text_file: $!";
        open OUTPUT, "> $out_file"
                    or die "can't open $out_file: $!";

        while (<INPUT> ) {
            s% http://.* % $link %; # are you sure you don't need the g option?
            print OUTPUT;
        }
        close INPUT; close OUTPUT;
        move( $text_file, "$text_file.bak" );
        move( $out_file, $text_file );
    }
}
__END__
--
Adam Wuellner
| |
| Brian Volk 2005-08-04, 10:00 pm |
|
---- Adam Wuellner <adam.wuellner@gmail.com> wrote:
>
> On 8/4/05, Brian Volk <BVolk@hpproducts.com> wrote:
>
> Brian,
>
> I took the last script you posted and re wrote it the way I would have
> gone about it. It doesn't use in-place editing, or @ARGV, because I
> don't quite grasp how to make that work in this situation. Anyway,
> see if it helps... I apologize if there are errors, I couldn't
> quickly think of a way to test it.
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
> use File::Basename;
> use File::Copy;
>
> my $pdf_dir = "j:/flash_host/ecomm/descriptions/product/MSDS";
> # read file/directory names in that directory into @pdfs
> opendir PDFDIR, $pdf_dir
>             or die "Can't opendir $pdf_dir: $!\n";
> my @pdfs = readdir(PDFDIR);
> closedir PDFDIR;
>
> my $text_dir = "c:/brian/descriptions/product/small";
> opendir TEXTDIR, $text_dir
>             or die "Can't opendir $text_dir: $!";
> my @txts = map { "$text_dir/$_" } grep { !/^\./ } readdir TEXTDIR;
> closedir TEXTDIR;
>
> my %PDFDIR_LIST;
> $PDFDIR_LIST{$_}=1 for @pdfs;
>
> foreach my $text_file (@txts) {
>     my ($basename, $suffix) = fileparse($text_file,'.txt');
>     my $link = "descriptions/product/small/$basename.pdf";
>
>     if( $PDFDIR_LIST{"$basename.pdf"} ) {
        my $out_file = "$textfile.new"; # just missed the underscore
>         open INPUT, "< $text_file"
>                     or die "can't open $text_file: $!";
>         open OUTPUT, "> $out_file"
>                     or die "can't open $out_file: $!";
>
>         while (<INPUT> ) {
>             s% http://.* % $link %; # are you sure you don't need the g option?
>             print OUTPUT;
>         }
>         close INPUT; close OUTPUT;
>         move( $text_file, "$text_file.bak" );
>         move( $out_file, $text_file );
>     }
> }
>
> __END__
>
> --
> Adam Wuellner
>
Adam.. you're a saint! +<:-) That works perfectly! I knew I had to use
"foreach" for the $basename to work correctly, and I was able to get the
substitution to work when I used while <> ... I just couldn't figure out how
to incorporate both of them... I DO NOW! Thank you very much!
John and Jay, thank you guys as well for teaching me.. I really am learning
a lot!
Brian Volk
| |
| John W. Krahn 2005-08-05, 4:00 am |
| Brian Volk wrote:
> Hi All~
Hello,
> my program below is not returning any errors but nothing is happening to the
> .txt files like I hoped. Can someone pls take a look and let me know what
> I'm doing wrong.
>
> ----- Thank you! ----
>
> # If there is a .pdf file and a matching .txt file, open the .txt file and s
> % http://.* % "$link" %
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
> use File::Basename;
>
> my $pdf_dir = "j:/flash_host/ecomm/descriptions/product/MSDS";
> opendir(PDFDIR, $pdf_dir) or die "Can't open the $pdf_dir: $!\n";
>
> # read file/directory names in that directory into @htmls
>
> my @pdfs = readdir(PDFDIR) or die "Unable to read current dir:$!\n";
You are reading ALL entries from the directory into @pdfs (including the
. and .. entries.)
> closedir(PDFDIR);
>
> my $text_dir = "c:/brian/descriptions/product/small";
> opendir (TEXTDIR, $text_dir) or die "Can't open $text_dir: $!";
>
> # read all the .txt files and load @ARGV for <> operator
>
> @ARGV = map { "$text_dir/$_" } grep { !/^\./ } readdir TEXTDIR;
>
> my %PDFDIR_LIST;
>
> $PDFDIR_LIST{$_}=1 for @pdfs;
>
> foreach my $text_file (<> ) {
You are using a foreach loop with <>. The empty readline operator <> is
"magical" in that it will open all the files listed in @ARGV and read
all the lines from all the files. Because you are using a foreach loop
that means that all the lines from all the files have to be read and
stored in memory first before the foreach loop will iterate over them.
It also means that inplace edit using $^I won't work and that $. won't
give you the current line number and that $ARGV won't give you the
current file name.
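For comparison, a minimal sketch of the while form, where $ARGV and $. do track the current file. It assumes two hypothetical scratch files (one.txt and two.txt) created with File::Temp, and uses the "close ARGV if eof" idiom to restart the line count for each file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp qw(tempdir);

# Hypothetical scratch files, two lines each (not from the thread).
my $dir = tempdir( CLEANUP => 1 );
@ARGV = ();
for my $name (qw(one.txt two.txt)) {
    my $path = "$dir/$name";
    open my $fh, '>', $path or die "can't write $path: $!";
    print {$fh} "first\n", "second\n";
    close $fh;
    push @ARGV, $path;
}

my @seen;
while ( my $line = <> ) {
    chomp $line;
    # $ARGV is the name of the file currently being read,
    # $. is the line number within it
    push @seen, basename($ARGV) . ":$.:$line";
    close ARGV if eof;    # end of this file: reset $. for the next one
}
print "$_\n" for @seen;
```

Because the loop reads one line at a time, each record can be tagged with its own file name and line number, which a foreach over <> cannot do.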
> my ($basename, $suffix) = fileparse($text_file,'.txt');
You are using File::Basename::fileparse() incorrectly, it returns three
values ($name,$path,$suffix) but since you are only using the first
value you haven't encountered a problem. Also the second argument is a
regular expression so the period has to be escaped to match a period and
only the suffixes in the list you supply will be parsed.
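A minimal sketch of those two points, assuming a hypothetical file name widget.txt under the text directory from the script. The names footxt and notes.pdf are likewise made up to show the unescaped period and an unlisted suffix.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical file name; the directory is the one used in the thread.
my $path = 'c:/brian/descriptions/product/small/widget.txt';

# fileparse returns three values: name, directory path, suffix.
my ( $name, $dir, $suffix ) = fileparse( $path, qr/\.txt/i );
print "name=$name dir=$dir suffix=$suffix\n";

# With an unescaped period, '.' matches any character, so the
# "suffix" removed from 'footxt' is 'otxt'.
my ( $odd_name, undef, $odd_suffix ) = fileparse( 'footxt', '.txt' );
print "name=$odd_name suffix=$odd_suffix\n";

# A suffix that is not in the list is left on the name.
my ( $pdf_name, undef, $pdf_suffix ) = fileparse( 'notes.pdf', qr/\.txt/i );
print "name=$pdf_name suffix='$pdf_suffix'\n";
```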
> my $link = "descriptions/product/small/$basename.pdf";
>
> if( $PDFDIR_LIST{"$basename.pdf"} ){
> open FH, $text_file or die "can't open $text_file: $!";
> s% http://.* % $link %;
> next;}
>
> }
> close (FH);
> closedir (TEXTDIR);
>
> __END__
If I understand correctly then something like this should work (untested):
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;
my $pdf_dir = 'j:/flash_host/ecomm/descriptions/product/MSDS';
opendir PDFDIR, $pdf_dir or die "Can't open the $pdf_dir: $!\n";
# read file/directory names in that directory into %PDFDIR_LIST
my %PDFDIR_LIST = map { /^(.+)\.pdf$/i ? ( $1, 1 ) : () } readdir PDFDIR;
closedir PDFDIR;
my $text_dir = 'c:/brian/descriptions/product/small';
opendir TEXTDIR, $text_dir or die "Can't open $text_dir: $!";
# read all the .txt files and load @ARGV for <> operator
@ARGV = map "$text_dir/$_", grep /\.txt$/i, readdir TEXTDIR;
closedir TEXTDIR;
$^I = '.bak'; # use inplace edit and backup the original
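my $link;
while ( <> ) {
    if ( $. == 1 ) { # first line of file
        # the suffix pattern must be a qr// object; a bare /.../ would be
        # matched against $_ right here instead of being passed to fileparse
        my ( $basename ) = fileparse( $ARGV, qr/\.txt/i );
        # no matching .pdf: leave $link undefined so the lines are copied
        # through unchanged (skipping the file without printing would leave
        # it empty, because inplace edit has already replaced it)
        $link = exists $PDFDIR_LIST{ $basename }
              ? "descriptions/product/small/$basename.pdf"
              : undef;
    }
    s% http://.* % $link %g if defined $link;
    print;
    close ARGV if eof; # reset the $. variable
}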
__END__
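The in-place edit mechanism itself can be tried safely on a scratch file first. This is a minimal sketch under stated assumptions: sample.txt, its one line of content, and the example.com URL are all hypothetical, and the file lives in a temporary directory rather than in the real text directory.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch file and URL (not from the thread).
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/sample.txt";
open my $out, '>', $file or die "can't write $file: $!";
print {$out} "link: http://example.com/old.pdf (MSDS)\n";
close $out;

{
    # With $^I set, <> renames each file to "$file.bak" and
    # redirects print into a new file with the original name.
    local $^I   = '.bak';
    local @ARGV = ($file);
    while ( <> ) {
        s% http://.* % descriptions/product/small/sample.pdf %;
        print;    # every line must be printed, or it is lost
    }
}

open my $in, '<', $file or die "can't read $file: $!";
my $new = <$in>;
close $in;
print "edited: $new";
print "backup exists\n" if -e "$file.bak";
```

Note that the pattern needs a space on both sides of the URL, so a URL at the very end of a line would not match as written.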
John
--
use Perl;
program
fulfillment