perl grep problem (PERL Miscellaneous archive, May 2005)
| demolitionz@gmail.com 2005-05-25, 3:56 pm |
| hey, wonder if anyone can help 'cause i'm fresh out of ideas why my
perl script isn't working!
basically the script reads all the data from files in a directory into
an array. i then want the user to be able to search that array for
keywords (in each line) and output the keywords to a file. i've got
the script to work using the following line:
@found = grep(/$ARGV[2]/i, @rf);
where @rf is the array that's being searched, @found is the array the
found words are stored to (i output it to a file later which also works
fine) and $ARGV[2] is the user input word to search for. the problem
with this script is that because the user inputs the search word as
$ARGV[2] the program can only search for one word per run, which means
when they want to search for another word they have to run the whole
program again and this slows things down as the @rf array has to be
created from scratch once more.
what i want to do (and what i've tried endlessly to do in the 2nd
remake of the script!) is to have a 2 step process, where the files
are read into the @rf array as step one, and then in step 2 the user
inputs the keyword to search for and we loop step 2 as many times as
the user wants. what i'm currently doing with that then, is this:
$keyword = <STDIN>;
chop $keyword;
@found = grep(/$keyword/i, @rf);
now i've printed to screen everything so as to debug it, and if the
user inputs "chickens" for example, then print $keyword; will return
"chickens" correctly. the problem is, no matter what i try, i cannot
get the grep(/$keyword/) bit to work and @found is *always* empty! i
don't really understand why grep would work fine with $ARGV[2] but not
with $keyword and it's drivin me crazy! i've tried @found =
grep(/"$keyword"/i, @rf); and i've tried chomp $keyword; and i've even
resorted to pushing $keyword into an array and calling the same value
from the array as a scalar (i got very very desperate by this point and
would try anything ;)) but nothing i do works.
can anyone help?! :)
cheers,
d
| |
| Mark Clements 2005-05-25, 3:56 pm |
| demolitionz@gmail.com wrote:
> $keyword = <STDIN>;
> chop $keyword;
> @found = grep(/$keyword/i, @rf);
You need to post a small, complete program that displays this
behaviour, as well as sample data and output, copying and pasting
rather than retyping. Check out the posting guidelines. I would have
suggested "chomp" rather than "chop", but you've tried that. It is also
possible that the data you are feeding to STDIN has something
unexpected in it. Bear in mind that you'll need to escape special
characters if you want to use them in a regex to match the, er, special
characters.
Mark
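
A minimal sketch of the escaping point above (not code from the thread). The sample lines and the keyword are made up for illustration; \Q...\E is the standard Perl way to make an interpolated string match literally.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the @rf array.
my @rf = ( "a.b\n", "axb\n", "chickens\n" );

# Imagine this came from <STDIN> and was already chomped.
my $keyword = 'a.b';

# Unescaped: "." is a regex metacharacter and matches any character,
# so both "a.b" and "axb" match.
my @loose = grep { /$keyword/i } @rf;

# \Q...\E escapes metacharacters, so only the literal text "a.b" matches.
my @exact = grep { /\Q$keyword\E/i } @rf;

print "loose: ", scalar(@loose), ", exact: ", scalar(@exact), "\n";
```

For a plain word like "chickens" both forms behave the same; the difference only shows up once the input contains characters such as . * + ? ( or [.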
| |
| demolitionz@gmail.com 2005-05-25, 3:56 pm |
| Okay, have read the posting guidelines and hopefully understood them,
sorry about that :)
Here's a scaled down version of the program that isn't working...
#!usr/bin/perl
$filenumber = 0;
do {
    print "Processing file $filenumber of $#rd\n"; # nb: this is just to debug
    opendir(DH,$ARGV[0]);
    @rd = readdir(DH);
    open(FH,"$ARGV[0]/$rd[$filenumber]");
    @rf = <FH>;
    $filenumber++;
}
while ($filenumber <= $#rd);
do {
    print "File to save to: ";
    $filename = <STDIN>;
    chomp $filename;
    print "Keyword to search for:";
    $searchterm = <STDIN>;
    chomp $searchterm;
    @found = grep(/$searchterm/i, @rf);
    open(SAVETOFILE,">>./new/$filename");
    print SAVETOFILE @found;
    print "rf array: @rf\n";
    print "keyword: $searchterm\n";
    print "found array: @found\n";
    print "Search again? y/n\n";
    $stop = <STDIN>;
    chop $stop;
    if ($stop eq "n") { exit; }
}
while ($filename ne "!exit");
exit;
Have also tried adding in $searchterm =~ s/[^A-Za-z0-9 .\\:-]*//g; but
doesn't seem to make a difference. (oh and the directory 'New' does
exist just in case you were wondering :)).
And here's some sample data (directory contains 4 txt files. 1.txt
contains word eggs, 2.txt contains word bacon, 3.txt contains word
chickens, 4.txt contains word flower)...
c:\scriptdir> new1.pl c:\prltest\
Processing file 0 of -1 #this is just cos i put the debug print in weird place :)
Processing file 1 of 5
Processing file 2 of 5
Processing file 3 of 5
Processing file 4 of 5
Processing file 5 of 5
File to save to: new.txt
Keyword to search for: chicken
rf array: flower
keyword: chicken
found array:
Search again? n
And that's basically it. As i say, it works absolutely fine with
$ARGV[2] as input so i'm stumped!
cheers,
d
ps this is only the 3rd script i've ever written in perl, so pls go
easy on me if i've done something obviously stupid ;)
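
A small sketch of why the debug line reads "0 of -1" and then "of 5". $#rd is the last index of @rd, which is -1 while the array is still empty. The directory listing below is an assumption: "." and ".." plus the four text files, which would account for six entries.

```perl
use strict;
use warnings;

my @rd;                     # still empty, as before the first readdir
my $before = $#rd;          # last index of an empty array
print "last index before readdir: $before\n";

# Hypothetical listing: "." and ".." plus the four text files.
@rd = ( '.', '..', '1.txt', '2.txt', '3.txt', '4.txt' );
my $after = $#rd;           # last index, one less than the element count
print "last index after readdir: $after\n";
print "number of entries: ", scalar(@rd), "\n";
```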
| |
| demolitionz@gmail.com 2005-05-25, 3:56 pm |
| oh and just to pre-empt anyone lol, i did actually copy and paste that
script so i assume some of the misformats are due to google's
newsreader - e.g. open(FH,"$ARGV[0]/$rd[$filenum ber]"); is not a
mistake in the script (it's actually
open(FH,"$ARGV[0]/$rd[$filenumber]"); in the script) :)
| |
| John Bokma 2005-05-25, 3:56 pm |
| demolitionz@gmail.com wrote:
> Okay, have read the posting guidelines and hopefully understood them,
Probably not entirely, so I added some guidelines ;-)
> #!usr/bin/perl
use strict;
use warnings;
> opendir(DH,$ARGV[0]);
check return value
> open(FH,"$ARGV[0]/$rd[$filenumber]");
check return value
> open(SAVETOFILE,">>./new/$filename");
check
> chop $stop;
use chomp if you want to chomp, see perldoc -f chomp
> if ($stop eq "n") { exit; }
> }
> while ($filename ne "!exit");
nicer:
while ( 1 ) {
    :
    last if $stop eq 'n';
}
--
John
Small Perl scripts: http://johnbokma.com/perl/
Perl programmer available: http://castleamber.com/
Happy Customers: http://castleamber.com/testimonials.html
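
A short sketch of the chop/chomp difference mentioned above, using made-up input strings. chomp only removes a trailing newline (strictly, the current value of $/), while chop removes the last character whatever it is.

```perl
use strict;
use warnings;

my $with_newline    = "n\n";   # typical line read from <STDIN>
my $without_newline = "n";     # e.g. input that ended without a newline

# chomp leaves the string alone if there is no trailing newline.
chomp( my $chomped_nl    = $with_newline );
chomp( my $chomped_plain = $without_newline );

# chop always removes one character, so it can eat real data.
my $chopped_nl = $with_newline;
chop $chopped_nl;
my $chopped_plain = $without_newline;
chop $chopped_plain;

printf "chomp: '%s' and '%s'\n", $chomped_nl, $chomped_plain;
printf "chop:  '%s' and '%s'\n", $chopped_nl, $chopped_plain;
```

With chop, an answer typed without a trailing newline would end up as an empty string, so a test like $stop eq "n" would fail.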
| |
| demolitionz@gmail.com 2005-05-25, 3:56 pm |
| Ok, I've added the following debug in which will check to see if
directory can be opened and files are being read properly...
open(FH,"$ARGV[0]/$rd[$filenumber]");
print "\nFH is: ";
print <FH>;
this returns...
Processing file 1 of 5
FH is:
Processing file 2 of 5
FH is: eggs
Processing file 3 of 5
FH is: bacon
Processing file 4 of 5
FH is: chickens
Processing file 5 of 5
FH is: flower
So return value for open(FH,"$ARGV[0]/$rd[$filenumber]"); (and
therefore opendir(DH,$ARGV[0]);) seem fine.
Also, the new file is created ok (and it writes to the new file ok when
using $ARGV[2]) so open(SAVETOFILE,">>./new/$filename"); seems fine
too, it's just annoying that it's always bleeding empty lol!
Have changed the ending of the script per your suggestion, and ty for
that :)
So I'm once again totally baffled, as all my debug checks seem to show
everything is working ok. The files in the directory are read to the
@rf array ok, the new file is created fine, the $keyword stdin works,
but the script just refuses to grep using the $keyword. And to top it
off google is doing its best to misformat these posts lol :) ty for
ongoing help btw, appreciate it :)
| |
| John Bokma 2005-05-25, 3:56 pm |
| demolitionz@gmail.com wrote:
Learn how to quote, otherwise you will notice that no one is going to reply
to your postings.
> Ok, I've added the following debug
wrong, try again.
(hint: open( ... ) or die "Can't open '$filename': $!"; )
BTW: I am not saying that it's going to fix your problem, but it might trap
errors now, or in your future work.
--
John
| |
| demolitionz@gmail.com 2005-05-25, 3:56 pm |
| mmkay, well these will be manual quotes then as google doesn't have a
quote feature that i can find, so hopefully they come out ok.
> (hint open( ... ) or die "Can't open '$filename': $!";
done on opendir(DH,$ARGV[0]); and open(FH,"$ARGV[0]/$rd[$filenumber]");
and open(SAVETOFILE,">>./new/$filename") and they all work fine, no
errors...
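
For reference, a sketch of what checking the return values can look like. This is not the poster's actual code: the temporary directory and the missing file name are assumptions that keep the example self-contained, and the eval is only there so the failure can be shown without ending the script.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Temporary directory standing in for the one passed as $ARGV[0].
my $dir = tempdir( CLEANUP => 1 );

opendir( my $dh, $dir ) or die "Can't open directory '$dir': $!";
my @entries = readdir($dh);
closedir($dh);

# Opening a file that does not exist: the failure is reported via $!
# instead of being silently ignored.
my $missing = "$dir/no-such-file.txt";
my $error   = '';
eval {
    open( my $fh, '<', $missing ) or die "Can't open '$missing': $!";
    close($fh);
    1;
} or $error = $@;

print "caught: $error";
```

Without the "or die", a failed open just leaves the filehandle unusable and reading from it returns nothing, which can look exactly like an empty file.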
| |
| Mark Clements 2005-05-25, 3:56 pm |
| demolitionz@gmail.com wrote:
> Okay, have read the posting guidelines and hopefully understood them,
> sorry about that :)
>
> Here's a scaled down version of the program that isn't working...
>
I've piped it through perltidy to make it semi-legible.
> $filenumber = 0;
> do {
>     print "Processing file $filenumber of $#rd\n";    # nb: this is just to debug
>     opendir( DH, $ARGV[0] );
>     @rd = readdir(DH);
>     open( FH, "$ARGV[0]/$rd[$filenumber]" );
>     @rf = <FH>;
>     $filenumber++;
> } while ( $filenumber <= $#rd );
You are doing opendir and reading the directory each time through the
loop. You don't need to do this. You aren't checking the return value
of your system calls. You aren't running with strict and warnings
(already pointed out). You are probably trying to open "." and ".." as
files. You are overwriting the value of @rf each time through the loop,
so @rf will only contain the contents of the last file found in the
directory, whatever that is.
use strict;
use warnings;
use Data::Dumper;

my $dirName = shift;
opendir DIRTOREAD, $dirName or die "could not open dir $dirName: $!";
# keep plain files only, which also skips "." and ".."
my @filesToSearch = grep { -f "$dirName/$_" } readdir DIRTOREAD;
closedir DIRTOREAD or die "error closing dir $dirName: $!";

my %fileData = ();
foreach my $fileName (@filesToSearch) {
    my $fileToSearch = "$dirName/$fileName";
    open IN, "<$fileToSearch"
      or die "could not open $fileToSearch: $!";
    # strip the newline from each line and keep the line itself
    my @lines = map { chomp; $_ } <IN>;
    close IN;
    $fileData{$fileName} = \@lines;
}
warn Dumper \%fileData;

while ( my ( $fileName, $lines ) = each %fileData ) {
    print "enter search term for $fileName: ";
    my $searchTerm = <STDIN>;
    chomp $searchTerm;
    last unless length $searchTerm;    # empty input ends the loop
    print "\n";
    my @foundLines = grep { /$searchTerm/ } @$lines;
    print "filename = $fileName searchTerm = $searchTerm\n";
    print "found " . Dumper( \@foundLines ) . "\n";
}
Use Data::Dumper to make sure that your arrays contain what you think
they contain.
Note that doing this loads *all* of the files in the directory into
memory; you may not want to do this.
Mark
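
On the memory note above, one possible alternative is sketched here: read each file line by line on every search and keep only the matching lines. search_dir is a hypothetical helper, not something from the thread, and the sample files are created in a temporary directory purely so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Sample files matching the test data described in the thread.
my $dir    = tempdir( CLEANUP => 1 );
my %sample = (
    '1.txt' => "eggs\n",
    '2.txt' => "bacon\n",
    '3.txt' => "chickens\n",
    '4.txt' => "flower\n",
);
for my $name ( keys %sample ) {
    open( my $out, '>', "$dir/$name" ) or die "Can't write '$dir/$name': $!";
    print {$out} $sample{$name};
    close($out) or die "Can't close '$dir/$name': $!";
}

# Hypothetical helper: re-reads the files on every call and keeps only
# the matching lines, so memory use does not grow with the file sizes.
sub search_dir {
    my ( $dir_name, $term ) = @_;
    opendir( my $dh, $dir_name ) or die "Can't open dir '$dir_name': $!";
    my @files = sort grep { -f "$dir_name/$_" } readdir($dh);
    closedir($dh);

    my @found;
    for my $file (@files) {
        open( my $in, '<', "$dir_name/$file" )
          or die "Can't open '$dir_name/$file': $!";
        while ( my $line = <$in> ) {
            push @found, "$file: $line" if $line =~ /\Q$term\E/i;
        }
        close($in);
    }
    return @found;
}

my @found = search_dir( $dir, 'chicken' );
print @found;
```

The trade-off is speed: every search touches the disk again, which is what the original poster wanted to avoid, so this only makes sense when the files are too large to hold in memory.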
| |
| John Bokma 2005-05-25, 3:56 pm |
| Mark Clements wrote:
> my $dirName = shift;
> opendir DIRTOREAD, $dirName or die "could not open dir $dirName: $!";
Isn't it more common to use:
opendir my $dh, etc
nowadays? (Also CamelCase is something I prefer not to use ;-) )
> my @lines = map { chomp; $_ } <IN>;
chomp( my @lines = <IN> ); ?
(Just curious, not nitpicking, ok a little).
--
John
| |
| demolitionz@gmail.com 2005-05-25, 3:56 pm |
| Thanks for your reply. Haven't used your code primarily because the
point of the exercise for me was just to try and learn some perl and
see if i could make the thing work (i'll move on to elegance later ;)),
but you did hit the nail on the head with this...
> You are overwriting the value of @rf each time through the loop,
> so @rf will only contain the contents of the last file found in the
> directory, whatever that is.
Have now changed the code from
@rf = <FH>;
to:
push(@rf, <FH> );
and it works fine :)
have also moved opendir(DH,$ARGV[0]); @rd = readdir(DH); out of the
first loop as you suggested.
many thanks to you both for your help :)
d
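
The difference between the two lines can be shown with a small sketch. In-memory filehandles stand in for the four text files here (an assumption made so the example needs no directory on disk).

```perl
use strict;
use warnings;

# In-memory "files" standing in for 1.txt to 4.txt from the thread.
my @contents = ( "eggs\n", "bacon\n", "chickens\n", "flower\n" );

my ( @overwritten, @accumulated );
for my $text (@contents) {
    open( my $fh, '<', \$text ) or die "Can't open in-memory file: $!";
    @overwritten = <$fh>;        # assignment replaces the earlier contents
    close($fh);

    open( $fh, '<', \$text ) or die "Can't open in-memory file: $!";
    push @accumulated, <$fh>;    # push appends to what is already there
    close($fh);
}

# After the loop @overwritten holds only the last file ("flower").
my @found_before = grep { /chicken/i } @overwritten;
my @found_after  = grep { /chicken/i } @accumulated;

print "with assignment: ", scalar(@found_before), " match(es)\n";
print "with push:       ", scalar(@found_after),  " match(es)\n";
```

This matches the debug output earlier in the thread, where the rf array contained only "flower" and the search for "chicken" came back empty.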
| |
| Mark Clements 2005-05-25, 3:56 pm |
| John Bokma wrote:
> Mark Clements wrote:
>
>
> Isn't it more common to use:
>
> opendir my $dh, etc
Sure - I was just following on from the OP's style, or, er, perhaps it
just didn't occur to me. On another point, I tend not to put the "my"
there unless it is eg at the start of a foreach. I think it makes
things clearer if the my is the first non-whitespace on the line.
> nowadays? (Also CamelCase is something I prefer not to use ;-) )
Yeah - I've been whistled on this one before :)
>
>
> chomp( my @lines = <IN> ); ?
>
Good point. I hadn't realised that chomp could be fed a list argument.
You learn something new every day.
regards,
Mark
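
Both points from this exchange, sketched together: a lexical filehandle in place of a bareword, and chomp applied to a whole list. The in-memory data is an assumption so the example runs without any files.

```perl
use strict;
use warnings;

my $data = "eggs\nbacon\nchickens\n";

# Lexical filehandle ($in) instead of a bareword such as IN; it is
# scoped like any other "my" variable.
open( my $in, '<', \$data ) or die "Can't open in-memory file: $!";

# chomp accepts a list, so reading the lines and stripping the
# newlines fits in a single statement.
chomp( my @lines = <$in> );
close($in);

print join( ',', @lines ), "\n";
```

The same pattern works for directories with opendir( my $dh, $dirName ).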
| |