Home > Archive > PERL Beginners > February 2007 > Complex splitting or alternative matching
Complex splitting or alternative matching
Karyn Williams 2007-02-15, 7:00 pm
I am comparing the passwd file to a file of numbers. The numbers match
GECOS info in the passwd file. I want the lines in the passwd file not
matching lines in the $students file printed to another file. I used
examples of others' code to come up with the following. It works OK. My
problem is that the data in /etc/passwd is quite variable and so sometimes
I have nothing in $two, and sometimes it is not what I want (i.e. the line
with "Jr."). Can someone educate me on a way to split conditionally, as in
split on the character before a number ?
#!/usr/bin/perl -w
use strict;
my ($one, $two, $three, $four, $five, $six, $line, $pair);
my $dir = "/usr/local/tools";
my $pwlist = "/etc/passwd";
my $students = "$dir/spr2007.txt";
my %sids;
open my($numbers), $students or die "can't read $students: $!";
open my($pfile), $pwlist or die "can't read $pwlist: $!";
while (<$numbers> ) {
chomp;
$sids{$_} = 1;
}
close $numbers;
open (RESULTS, "+>$dir/students_to_modify.txt");
my @line = <$pfile>;
foreach $pair (@line) {
($one, $two, $three, $four, $five, $six) = split(/:/, $pair);
($one, $two, $three) = split(/,/, $five);
$two =~ s/^\s+//;
printf RESULTS ("$two, $one, $three\n") unless $sids{$two};
}
close $pfile;
close RESULTS;
Here is an example of $students :
123
1234567
12334
300901
Here is an example of $pwlist (other than the usual entries):
nhl:*:15739:15739: Norm E Hill, Jr. , 123404 , FG-04 :/usr/nhl:/bin/nologin
rmaya:*:15742:15742: Ruika Manya , 24540 , QW-27 :/usr/rmaya:/bin/nologin
lpea:*:15755:15755:Luci Pea, NOID, Art, 1985:/usr/lpea:/bin/nologin
jpos:*:15758:15758:Jose Pos,12334,AR-34:/usr/jpos:/bin/nologin
bmier:*:15760:15760:Bennet Mier, Jr.,300901,AA-18:/usr/bmier:/bin/nologin
cotght:*:15762:15762:Tami Cotght,123333,ZZ-08:/usr/cotght:/bin/nologin
gcale:*:15763:15763:Gin Calebary,2448,BB-157:/usr/gcale:/bin/nologin
studc:*:15764:15764:Student Council,Student Affairs,Special:/usr/studc:/bin/nologin
blackc:*:15768:15768:Black Clock,Bob Foser,Special:/usr/blackc:/bin/nologin
jiee:*:15791:15791:JiYeo Lee:/usr/jiee:/bin/nologin
mfa04:*:15794:15794:Karen Ason:/usr/mfa04:/bin/nologin
--
Karyn Williams
Network Services Manager
California Institute of the Arts
karyn@calarts.edu
http://www.calarts.edu/network
Uri Guttman 2007-02-15, 9:59 pm
>>>>> "KW" == Karyn Williams <karyn@calarts.edu> writes:
KW> I am comparing the passwd file to a file of numbers. The numbers match
KW> GECOS info in the passwd file. I want the lines in the passwd file not
KW> matching lines in the $students file printed to another file. I used
KW> examples of others' code to come up with the following. It works OK. My
KW> problem is that the data in /etc/passwd is quite variable and so sometimes
KW> I have nothing in $two, and sometimes it is not what I want (i.e. the line
KW> with "Jr."). Can someone educate me on a way to split conditionally, as in
KW> split on the character before a number ?
/etc/passwd is well defined and should not be 'variable' as you say.
KW> #!/usr/bin/perl -w
KW> use strict;
KW> my ($one, $two, $three, $four, $five, $six, $line, $pair);
eww. why use poor names like that? and don't declare vars before you use
them. those (with better names) should be declared in the loop.
KW> my $dir = "/usr/local/tools";
KW> my $pwlist = "/etc/passwd";
KW> my $students = "$dir/spr2007.txt";
KW> my %sids;
KW> open my($numbers), $students or die "can't read $students: $!";
KW> open my($pfile), $pwlist or die "can't read $pwlist: $!";
KW> while (<$numbers> ) {
KW> chomp;
KW> $sids{$_} = 1;
KW> }
KW> close $numbers;
my %sids = map { chomp ; $_ => 1 } <$numbers> ;
or without the open/close
use File::Slurp ;
my %sids = map { chomp ; $_ => 1 } read_file( $students ) ;
KW> open (RESULTS, "+>$dir/students_to_modify.txt");
KW> my @line = <$pfile>;
no need to slurp in the whole file (even if i like slurping). use a
while loop.
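A minimal sketch of that line-at-a-time loop. It reads from an in-memory filehandle so it can run anywhere; the two sample records are copied from the post, and the variable names are made up for illustration:

```perl
use strict;
use warnings;

# Sample passwd-style records (copied from the post) stand in for /etc/passwd.
my $sample = <<'END';
jpos:*:15758:15758:Jose Pos,12334,AR-34:/usr/jpos:/bin/nologin
jiee:*:15791:15791:JiYeo Lee:/usr/jiee:/bin/nologin
END

# Opening a reference to a scalar gives an in-memory filehandle.
open my $pw_fh, '<', \$sample or die "can't open in-memory file: $!";

my @logins;
while ( my $line = <$pw_fh> ) {    # one record at a time, nothing slurped
    chomp $line;
    my ($login) = split /:/, $line;    # first field is the login name
    push @logins, $login;
}
close $pw_fh;

print "$_\n" for @logins;
```

With the real file, only the open line would change; the loop body stays the same.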
KW> foreach $pair (@line) {
KW> ($one, $two, $three, $four, $five, $six) = split(/:/, $pair);
why split to all if you toss out several?
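One way to keep only the fields that are needed is a list slice on the result of split. This is a sketch, and the names $login and $gecos are chosen here for illustration:

```perl
use strict;
use warnings;

my $line = 'jpos:*:15758:15758:Jose Pos,12334,AR-34:/usr/jpos:/bin/nologin';

# Field 0 is the login name and field 4 is the GECOS field; the other
# fields are never stored anywhere.
my ( $login, $gecos ) = ( split /:/, $line )[ 0, 4 ];

print "$login => $gecos\n";
```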
KW> ($one, $two, $three) = split(/,/, $five);
that is some very bad code. the names are useless and you reuse vars to
hold different things. use names that reflect actual usage of the
vars. i have no clue what those records contain and neither will you in
3 months (or anyone else who will read this code).
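A sketch of what descriptive names could look like for the GECOS field, which also sidesteps the "Jr." problem from the original question. The helper parse_gecos is hypothetical, and it assumes the student ID is the first comma-separated piece made only of digits; that is a guess about the data, and a record like the "NOID" one in the sample would need its own rule:

```perl
use strict;
use warnings;

# Hypothetical helper: split a GECOS field into name, student ID and the rest.
# Assumes the ID is the first comma-separated piece made only of digits.
sub parse_gecos {
    my ($gecos) = @_;

    # Split on commas, then trim the whitespace around each piece.
    my @pieces = split /,/, $gecos;
    s/^\s+|\s+$//g for @pieces;

    # Index of the first all-digit piece, if there is one.
    my ($id_index) = grep { $pieces[$_] =~ /^\d+$/ } 0 .. $#pieces;

    # No ID found: treat the whole field as the name.
    return ( join( ', ', @pieces ), undef, '' ) unless defined $id_index;

    # Everything before the ID is the name, everything after is the rest.
    my $full_name  = join ', ', @pieces[ 0 .. $id_index - 1 ];
    my $student_id = $pieces[$id_index];
    my $remainder  = join ', ', @pieces[ $id_index + 1 .. $#pieces ];

    return ( $full_name, $student_id, $remainder );
}

my ( $full_name, $student_id, $remainder )
    = parse_gecos(' Norm E Hill, Jr. , 123404 , FG-04 ');
print "$student_id: $full_name ($remainder)\n";
```

Because the name is rebuilt from every piece before the ID, a suffix such as "Jr." stays with the name instead of landing in the ID slot.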
KW> $two =~ s/^\s+//;
KW> printf RESULTS ("$two, $one, $three\n") unless $sids{$two};
KW> }
KW> close $pfile;
KW> close RESULTS;
why are you mixing lexical and bareword handles? use lexicals as you
seem to know how.
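A sketch of the output side with a lexical handle. It swaps in a three-argument open with '>' (the file is only written, so the read/write '+>' mode is not needed), checks that the open succeeded, and uses print rather than printf since no format is involved. The temporary directory is an assumption so the example can run anywhere:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A temporary directory stands in for /usr/local/tools (an assumption so
# the sketch can run anywhere).
my $dir         = tempdir( CLEANUP => 1 );
my $result_file = "$dir/students_to_modify.txt";

# Lexical handle, three-argument open, and a check that the open worked.
open my $results_fh, '>', $result_file
    or die "can't write $result_file: $!";
print {$results_fh} "24540, Ruika Manya, QW-27\n";
close $results_fh or die "can't close $result_file: $!";

# Read it back through another lexical handle.
open my $check_fh, '<', $result_file
    or die "can't read $result_file: $!";
my @lines = <$check_fh>;
close $check_fh;

print @lines;
```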
KW> Here is an example of $students :
KW> 123
KW> 1234567
KW> 12334
KW> 300901
it is a bunch of numbers! i can deduce that from the code (sorta)
KW> nhl:*:15739:15739: Norm E Hill, Jr. , 123404 , FG-04 :/usr/nhl:/bin/nologin
look at perldoc -f getpwent. no need to do your nameless parsing as
reading /etc/passwd is already builtin to perl. it will reduce your
script to almost nothing.
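A minimal sketch of that approach, assuming a Unix-like system where getpwent is available. In list context it returns one account per call, with the GECOS field as the seventh element; the hash name %gecos_for is made up for illustration:

```perl
use strict;
use warnings;

# Walk the password database with the builtin getpwent.
# Element 0 is the login name and element 6 is the GECOS field.
my %gecos_for;
while ( my @entry = getpwent() ) {
    my ( $login, $gecos ) = @entry[ 0, 6 ];
    $gecos_for{$login} = defined $gecos ? $gecos : '';
}
endpwent();    # close the password database when done

printf "%-12s %s\n", $_, $gecos_for{$_} for sort keys %gecos_for;
```

The GECOS string itself still has to be taken apart by hand, since its layout is a local convention and not something Perl knows about.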
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org