Complex splitting or alternative matching
Karyn Williams

2007-02-15, 7:00 pm

I am comparing the passwd file to a file of numbers. The numbers match
GECOS info in the passwd file. I want the lines in the passwd file not
matching lines in the $students file printed to another file. I used
examples of others' code to come up with the following. It works OK. My
problem is that the data in /etc/passwd is quite variable and so sometimes
I have nothing in $two, and sometimes it is not what I want (e.g. the line
with "Jr."). Can someone educate me on a way to split conditionally, as in
split on the character before a number?

#!/usr/bin/perl -w
use strict;

my ($one, $two, $three, $four, $five, $six, $line, $pair);
my $dir = "/usr/local/tools";
my $pwlist = "/etc/passwd";
my $students = "$dir/spr2007.txt";

my %sids;

open my($numbers), $students or die "can't read $students: $!";
open my($pfile), $pwlist or die "can't read $pwlist: $!";

while (<$numbers> ) {
chomp;
$sids{$_} = 1;
}
close $numbers;

open (RESULTS, "+>$dir/students_to_modify.txt");
my @line = <$pfile>;

foreach $pair (@line) {
($one, $two, $three, $four, $five, $six) = split(/:/, $pair);
($one, $two, $three) = split(/,/, $five);
$two =~ s/^\s+//;
printf RESULTS ("$two, $one, $three\n") unless $sids{$two};

}

close $pfile;
close RESULTS;

Here is an example of $students :

123
1234567
12334
300901

Here is an example of $pwlist (other than the usual entries):

nhl:*:15739:15739: Norm E Hill, Jr. , 123404 , FG-04 :/usr/nhl:/bin/nologin
rmaya:*:15742:15742: Ruika Manya , 24540 , QW-27 :/usr/rmaya:/bin/nologin
lpea:*:15755:15755:Luci Pea, NOID, Art, 1985:/usr/lpea:/bin/nologin
jpos:*:15758:15758:Jose Pos,12334,AR-34:/usr/jpos:/bin/nologin
bmier:*:15760:15760:Bennet Mier, Jr.,300901,AA-18:/usr/bmier:/bin/nologin
cotght:*:15762:15762:Tami Cotght,123333,ZZ-08:/usr/cotght:/bin/nologin
gcale:*:15763:15763:Gin Calebary,2448,BB-157:/usr/gcale:/bin/nologin
studc:*:15764:15764:Student Council,Student Affairs,Special:/usr/studc:/bin/nologin
blackc:*:15768:15768:Black Clock,Bob Foser,Special:/usr/blackc:/bin/nologin
jiee:*:15791:15791:JiYeo Lee:/usr/jiee:/bin/nologin
mfa04:*:15794:15794:Karen Ason:/usr/mfa04:/bin/nologin


--

Karyn Williams
Network Services Manager
California Institute of the Arts
karyn@calarts.edu
http://www.calarts.edu/network
Uri Guttman

2007-02-15, 9:59 pm

>>>>> "KW" == Karyn Williams <karyn@calarts.edu> writes:

KW> I am comparing the passwd file to a file of numbers. The numbers match
KW> GECOS info in the passwd file. I want the lines in the passwd file not
KW> matching lines in the $students file printed to another file. I used
KW> examples of others' code to come up with the following. It works OK. My
KW> problem is that the data in /etc/passwd is quite variable and so sometimes
KW> I have nothing in $two, and sometimes it is not what I want (e.g. the line
KW> with "Jr."). Can someone educate me on a way to split conditionally, as in
KW> split on the character before a number?

/etc/passwd is well defined and should not be 'variable' as you say.

KW> #!/usr/bin/perl -w
KW> use strict;

KW> my ($one, $two, $three, $four, $five, $six, $line, $pair);

eww. why use poor names like that? and don't declare vars long before you
use them. those (with better names) should be declared in the loop, where
they are first used.

KW> my $dir = "/usr/local/tools";
KW> my $pwlist = "/etc/passwd";
KW> my $students = "$dir/spr2007.txt";

KW> my %sids;

KW> open my($numbers), $students or die "can't read $students: $!";
KW> open my($pfile), $pwlist or die "can't read $pwlist: $!";

KW> while (<$numbers> ) {
KW> chomp;
KW> $sids{$_} = 1;
KW> }
KW> close $numbers;

my %sids = map { chomp ; $_ => 1 } <$numbers> ;

or without the open/close

use File::Slurp ;
my %sids = map { chomp ; $_ => 1 } read_file( $students ) ;

KW> open (RESULTS, "+>$dir/students_to_modify.txt");
KW> my @line = <$pfile>;

no need to slurp in the whole file (even if i like slurping). use a
while loop.
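
A minimal sketch of the line-at-a-time loop suggested above. The helper name logins_in() is hypothetical and not part of the original script; it only pulls out the login field to keep the example short.

```perl
use strict;
use warnings;

# Return the login names found in a passwd-style file, reading one line
# at a time so the whole file never sits in memory.
# logins_in() is a hypothetical helper name, not part of the original script.
sub logins_in {
    my ($path) = @_;
    open my $pfile, '<', $path or die "can't read $path: $!";

    my @logins;
    while ( my $line = <$pfile> ) {
        chomp $line;
        next unless length $line;          # skip blank lines
        my ($login) = split /:/, $line;    # keep the first field only
        push @logins, $login;
    }
    close $pfile;
    return @logins;
}
```

Calling logins_in('/etc/passwd') would walk the file once; the same loop body is where the per-record comparison against %sids would go.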

KW> foreach $pair (@line) {
KW> ($one, $two, $three, $four, $five, $six) = split(/:/, $pair);

why split to all if you toss out several?

KW> ($one, $two, $three) = split(/,/, $five);

that is some very bad code. the names are useless and you reuse vars to
hold different things. use names that reflect actual usage of the
vars. i have no clue what those records contain and neither will you in
3 months (or anyone else who will read this code).
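
One way the naming and splitting advice could look in practice, as a sketch only. parse_student() and its variable names are made up for illustration. It assumes the student ID is the first comma-separated piece of the GECOS field that consists only of digits, which sidesteps a "Jr." piece but would need adjusting for records where some other piece is also numeric.

```perl
use strict;
use warnings;

# Pull the login, full name and numeric ID out of one passwd line.
# Returns an empty list when the GECOS field has no all-digit piece.
# parse_student() is a hypothetical helper, not from the original script.
sub parse_student {
    my ($line) = @_;
    chomp $line;

    # A list slice keeps only field 0 (login) and field 4 (GECOS).
    my ( $login, $gecos ) = ( split /:/, $line )[ 0, 4 ];
    return unless defined $gecos;

    # Split the GECOS field on commas and trim whitespace from each piece.
    my @parts = split /,/, $gecos;
    s/^\s+|\s+$//g for @parts;

    # Assumption: the ID is the first piece made up only of digits.
    my ($id) = grep { /^\d+$/ } @parts;
    return unless defined $id;

    my $name = $parts[0];
    return ( $login, $name, $id );
}
```

With the sample data from the question, the "nhl" line would give the ID 123404 even though "Jr." sits between the name and the number, and the "jiee" line would give an empty list.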

KW> $two =~ s/^\s+//;
KW> printf RESULTS ("$two, $one, $three\n") unless $sids{$two};

KW> }

KW> close $pfile;
KW> close RESULTS;

why are you mixing lexical and bareword handles? use lexicals as you
seem to know how.
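
For the output side, a sketch with a lexical handle and three-argument open. write_report() is a hypothetical name; the mode is plain '>' on the assumption that the results file is only written, never read back.

```perl
use strict;
use warnings;

# Write each given line to $path through a lexical filehandle.
# write_report() is a hypothetical helper, not from the original script.
sub write_report {
    my ( $path, @lines ) = @_;

    # Three-argument open with an explicit write mode, checked for failure.
    open my $results, '>', $path or die "can't write $path: $!";
    print {$results} "$_\n" for @lines;
    close $results or die "can't close $path: $!";

    return scalar @lines;    # number of lines written
}
```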

KW> Here is an example of $students :

KW> 123
KW> 1234567
KW> 12334
KW> 300901

it is a bunch of numbers! i can deduce that from the code (sorta)

KW> nhl:*:15739:15739: Norm E Hill, Jr. , 123404 , FG-04 :/usr/nhl:/bin/nologin

look at perldoc -f getpwent. no need to do your nameless parsing as
reading /etc/passwd is already built in to perl. it will reduce your
script to almost nothing.
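
A rough sketch of the getpwent approach, assuming a Unix-like system. gecos_by_login() is a hypothetical name. Note that getpwent reads from whatever account databases the system is configured to use, which may be more than the /etc/passwd file itself.

```perl
use strict;
use warnings;

# Build a login => GECOS hash from the system's password database.
# gecos_by_login() is a hypothetical helper, not from the original script.
sub gecos_by_login {
    my %gecos;

    setpwent();    # rewind to the first entry
    while ( my @entry = getpwent() ) {
        # In list context getpwent returns
        # (name, passwd, uid, gid, quota, comment, gcos, dir, shell, expire),
        # so element 0 is the login and element 6 is the GECOS field.
        my ( $login, $gcos ) = @entry[ 0, 6 ];
        $gecos{$login} = $gcos;
    }
    endpwent();    # release the database

    return \%gecos;
}
```

The GECOS string for each login would still need to be split on commas to find the ID, but the colon-level parsing and the file handling go away.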

uri

--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org