











Author skip/delete lines with dup keys
Jim

2005-07-26, 10:01 pm


Hi

I have a CSV file with data something like this (header in caps):

LOAN_NO,SCORE,BAL
77585,740,452125
77585,741,450256
85669,658,125869
85669,658,122586

Looking for ideas on the best way to skip/delete a line if the LOAN_NO
repeats, even if the other fields in the record are not the same (need to
remove lines with dup key fields).
The code below seems to work, but there must be a better way.
Thanks for any help

Jim
-------------------------------------

# file already sorted by $id
my $cur_id = '';
while (<F>) {
    chomp;
    my ($id, $score, $bal) = split /,/, $_;
    next if $cur_id == $id;    # same key as the previous line: skip it
    print OUT "$id,$score,$bal\n";
    $cur_id = $id;
}


John W. Krahn

2005-07-26, 10:01 pm

Jim wrote:
>
> Hi


Hello,

> I have a CSV file with data something like this (header in caps):
>
> LOAN_NO,SCORE,BAL
> 77585,740,452125
> 77585,741,450256
> 85669,658,125869
> 85669,658,122586
>
> Looking for ideas on the best way to skip/delete a line if the LOAN_NO
> repeats, even if the other fields in the record are not the same (need to
> remove lines with dup key fields).
> The code below seems to work, but there must be a better way.
> Thanks for any help
>
> Jim
> -------------------------------------
>
> # file already sorted by $id
> my $cur_id = '';
> while (<F> ) {
> chomp;
> ($id,$score,$bal) = split (/,/, $_);
> next if $cur_id == $id;
> print OUT "$id,$score,$bal\n";
> $cur_id = $id;
> }


Don't fix it if it's not broken. :-)


John
--
use Perl;
program
fulfillment

Jeff 'japhy' Pinyan

2005-07-26, 10:01 pm

On Jul 26, Jim said:

> Looking for ideas on the best way to skip/delete a line if the LOAN_NO
> repeats, even if the other fields in the record are not the same (need to
> remove lines with dup key fields).
> The code below seems to work, but there must be a better way.


Your code works fine, assuming the duplicates are right after one another.
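
To illustrate that assumption, here is a minimal sketch that applies the same previous-key check to an in-memory list. The sub name dedup_adjacent and the unsorted row ordering are made up for illustration, and string comparison (eq) is used so that it runs cleanly under warnings.

```perl
use strict;
use warnings;

# Hypothetical helper: keeps a line only when its key differs from the
# key of the line immediately before it.
sub dedup_adjacent {
    my @lines  = @_;
    my $cur_id = '';
    my @kept;
    for my $line (@lines) {
        my ($id) = split /,/, $line;
        next if $cur_id eq $id;    # only compares against the previous key
        push @kept, $line;
        $cur_id = $id;
    }
    return @kept;
}

# Sorted input: the duplicate 77585 rows are adjacent, so one is dropped.
my @sorted = dedup_adjacent(
    '77585,740,452125', '77585,741,450256', '85669,658,125869',
);

# Unsorted input (assumed ordering): the second 77585 row is not adjacent
# to the first, so the check misses it.
my @unsorted = dedup_adjacent(
    '77585,740,452125', '85669,658,125869', '77585,741,450256',
);

print scalar(@sorted),   " lines kept from sorted input\n";    # 2
print scalar(@unsorted), " lines kept from unsorted input\n";  # 3
```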

--
Jeff "japhy" Pinyan % How can we ever be the sold short or
RPI Acacia Brother #734 % the cheated, we who for every service
http://japhy.perlmonk.org/ % have long ago been overpaid?
http://www.perlmonks.org/ % -- Meister Eckhart

Chris

2005-07-27, 4:00 am

Jeff 'japhy' Pinyan wrote:
> On Jul 26, Jim said:
>
>
>
> Your code works fine, assuming the duplicates are right after one another.
>

If not or if you want to avoid sorting then you need to use a hash:

use warnings;
use strict;

# other file opening stuff

my %hash;
while (<F>) {
    chomp;
    my ($id, $score, $bal) = split /,/, $_;
    next if exists $hash{$id};    # key already seen: skip this line
    print OUT "$_\n";             # chomp removed the newline, so add it back
    $hash{$id}++;
}
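
As a self-contained sketch of the hash approach, the snippet below runs the same check over an in-memory list instead of the F and OUT filehandles. The sub name dedup_by_key, the %seen hash, and the row ordering (rearranged so that duplicates are not adjacent) are assumptions made for illustration; the values come from the sample data in the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines whose first comma-separated
# field has not been seen before, in their original order.
sub dedup_by_key {
    my @lines = @_;
    my %seen;
    my @kept;
    for my $line (@lines) {
        my ($id) = split /,/, $line;
        next if $seen{$id}++;    # true from the second occurrence onward
        push @kept, $line;
    }
    return @kept;
}

my @rows = (
    'LOAN_NO,SCORE,BAL',
    '77585,740,452125',
    '85669,658,125869',
    '77585,741,450256',
    '85669,658,122586',
);

my @unique = dedup_by_key(@rows);
print "$_\n" for @unique;
```

The header line passes through because its key, LOAN_NO, occurs only once, and the first row seen for each loan number is the one that is kept. The hash holds one entry per distinct key, so memory use grows with the number of unique loan numbers rather than with the file size.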

Jim

2005-07-28, 4:00 am

> Don't fix it if it's not broken. :-)

Thanks John and Jeff. Just thought there may be a better way.
