Home > Archive > PERL Beginners > December 2007 > How to search a CDR file for duplicates and delete them
How to search a CDR file for duplicates and delete them
| Henrik Nielsen 2007-12-10, 7:59 am |
| Hi
I'm almost totally new to Perl! :-)
I need to write a program that searches a CDR file for duplicate lines and
then deletes them.
This is what I have found out by reading the Perl documentation and
this newsgroup, but I need a little more help.
I found this program in this newsgroup:
---------------
#!/usr/bin/perl
use strict;
use warnings;
my $in_file = '/path/my_in_file';
open my $in, '<', $in_file or die "Cannot open '$in_file' $!";
open my $out, '>', 'scripted_out_file'
    or die "Failed to script file for duplicates $!";
my %hash;
while ( <$in> ) {
my $key = join ',', ( split /,/ )[ 2, 3, 6, 7 ];
print $out $_ unless $hash{ $key }++;
}
close $out;
close $in;
---------------
This program should pick out fields 2, 3, 6 and 7 of each line in a
comma-separated file. My CDR file is space-separated.
How can I use 'readline EXPR'?
Any help?... it's probably pretty simple :-)
...//Henrik
| |
| Jeff Pang 2007-12-10, 7:01 pm |
| On Dec 10, 2007 6:58 PM, Henrik Nielsen <quercus1974@gmail.com> wrote:
> Hi
>
> I'm almost totally new to Perl! :-)
> I need to write a program that searches a CDR file for duplicate lines and
> then deletes them.
>
> This is what I have found out by reading the Perl documentation and
> this newsgroup, but I need a little more help.
>
Starting from the script you provided, you just need to replace two lines
to compare whole lines instead of selected fields.
Change:
> my $key = join ',', ( split /,/ )[ 2, 3, 6, 7 ];
> print $out $_ unless $hash{ $key }++;
to:
print $out $_ unless $hash{$_}++;
Good luck.
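The one-line change above can be sketched as a complete program. This is only a minimal sketch: `dedupe_lines` is a hypothetical helper name, and the sample records and in-memory filehandles are made up for illustration in place of the real CDR file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy lines from $in to $out, skipping any line that was already seen.
# dedupe_lines is a hypothetical name, not something from the thread.
sub dedupe_lines {
    my ( $in, $out ) = @_;
    my %seen;
    while ( my $line = <$in> ) {
        # $seen{$line}++ returns 0 (false) the first time a line appears,
        # so only the first copy of each line is printed.
        print {$out} $line unless $seen{$line}++;
    }
    return;
}

# Demonstration with in-memory filehandles and made-up records.
my $sample = "A 1 2\nB 3 4\nA 1 2\nC 5 6\nB 3 4\n";
my $result = '';
open my $in,  '<', \$sample or die "Cannot open in-memory input: $!";
open my $out, '>', \$result or die "Cannot open in-memory output: $!";
dedupe_lines( $in, $out );
close $out;
close $in;

print $result;    # prints the A, B and C records once each, in first-seen order
```

Note that the whole line, including its line ending, is the hash key, so a final line without a trailing newline would not match an earlier copy that has one.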
| |
| Dr.Ruud 2007-12-10, 7:01 pm |
| Henrik Nielsen schreef:
> I need to write a program that searches a CDR file for duplicate lines
> and then deletes them.
> [...]
> while ( <$in> ) {
> my $key = join ',', ( split /,/ )[ 2, 3, 6, 7 ];
> print $out $_ unless $hash{ $key }++;
> }
> [...]
> ---------------
> This program should pick out fields 2, 3, 6 and 7 of each line in a
> comma-separated file. My CDR file is space-separated.
For SP separators specifically, use:
my $key = join ' ', ( split / +/)[ 2, 3, 6, 7 ];
Often this is better:
my $key = join ' ', ( split ' ')[ 2, 3, 6, 7 ];
(split with ' ' works almost like /\s+/, except that leading whitespace
is skipped; see `perldoc -f split` for the details)
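The difference between the two forms can be seen in a small sketch. The records below are made up for illustration and are not real CDR data.

```perl
use strict;
use warnings;

# Made-up record with leading spaces and doubled spaces between fields.
my $line = "  alpha  beta gamma";

my @by_regex = split / +/, $line;    # leading empty field is kept
my @by_space = split ' ',  $line;    # leading whitespace is skipped

print scalar(@by_regex), "\n";       # 4: '', 'alpha', 'beta', 'gamma'
print scalar(@by_space), "\n";       # 3: 'alpha', 'beta', 'gamma'

# / +/ matches only the space character; ' ' splits on any whitespace.
my @tab_regex = split / +/, "x\ty";  # one field: "x\ty"
my @tab_space = split ' ',  "x\ty";  # two fields: 'x', 'y'

# Building the key from fields 2, 3, 6 and 7 of a made-up 8-field record.
# With ' ' the trailing newline is whitespace, so f7 comes out clean.
my $record = "f0 f1 f2 f3 f4 f5 f6 f7\n";
my $key = join ' ', ( split ' ', $record )[ 2, 3, 6, 7 ];
print "$key\n";                      # f2 f3 f6 f7
```

With / +/ the last field would still carry the newline, which is one more reason the ' ' form is often the safer choice for building keys.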
> How can I use 'readline EXPR'?
The "<$in>" in your loop already is a readline(): it reads the next line
from the filehandle $in.
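A short sketch of that equivalence, using made-up data and an in-memory filehandle in place of the real file:

```perl
use strict;
use warnings;

my $text = "first\nsecond\nthird\n";    # made-up input
open my $in, '<', \$text or die "Cannot open in-memory input: $!";

my $first  = <$in>;            # angle-bracket form
my $second = readline($in);    # the same operation, spelled out

# while ( <$in> ) { ... } is shorthand for the explicit loop below.
my @rest;
while ( defined( $_ = readline($in) ) ) {
    push @rest, $_;
}
close $in;

print $first, $second, @rest;  # prints the three lines in order
```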
--
Affijn, Ruud
"Gewoon is een tijger."
| |
| Henrik Nielsen 2007-12-10, 10:03 pm |
| Jeff Pang wrote:
> change:
> > my $key = join ',', ( split /,/ )[ 2, 3, 6, 7 ];
> > print $out $_ unless $hash{ $key }++;
> to:
> print $out $_ unless $hash{$_}++;
>
> Good luck.
How simple! It works :-)
Thx!
The next step is to work with file names and dates.
.../Henrik