Count Recurrence of Paired Values in Text File
jim

2007-02-26, 7:00 pm

Hi,

I'm trying to write what should be a simple script, but which I keep
getting hung up on. I've scoured past posts, but still haven't found
the answer I'm looking for.

I have a text file that will re-populate on a weekly basis. I want to
read this file into a hash and return a count of unique instances
based on an index of columns.

More specifically, of the 15 tab-delimited columns in the spreadsheet,
I would like to get a count for each recurrence of the column 1, 4, 13
combination (area + user + manager). The remaining fields contain
unique data (e.g. order numbers, etc.) that I'm not interested in for
this view.

I'm able to roll up based on a single factor, but because the same value
in one column can recur with different values in the others, it's
important that I be able to include multiple conditions. Any
recommendations or advice for places to look
would be appreciated. Here's my first stab (based on a single
condition):

#!C:\Perl\bin\perl.exe -w

$file = "data.txt";

open (FILE, $file) or die ("Can't open $file: $!");
while (<FILE>) {
    # [1] is the column (second) that contains area name
    $count{(split(/\t/))[1]}++;
}

foreach (sort keys %count) {
    print "$_: $count{$_}\n";
}

gf

2007-02-27, 4:01 am

On Feb 26, 5:48 pm, "jim" <jimmorga...@gmail.com> wrote:
> Hi,
>
> I'm trying to write what should be a simple script, but which I keep
> getting hung up on. I've scoured past posts, but still haven't found
> the answer I'm looking for.
>
> I have a text file that will re-populate on a weekly basis. I want to
> read this file into a hash and return a count of unique instances
> based on an index of columns.
>
> More specifically, of the 15 tab-delimited columns in the spreadsheet,
> I would like to get a count for each recurrence of the column 1, 4, 13
> combination (area + user + manager). The remaining fields contain
> unique data (e.g. order numbers, etc.) that I'm not interested in for
> this view.


I'd go after it this way. (Code is untested)...

#!/usr/bin/perl

# always use warnings and strict
use warnings;
use strict;

# I always pre-define my arrays and hashes as a visual reminder they're empty.
# It is always good to make your code as self-documenting as possible.
my %counts = ();

# Always use the three-parameter version of open().
# This tests open for a successful open...
if ( open( my $FILE, '<', 'data.txt' ) ) {
    while (<$FILE>) {

        # Get the columns we want. Remember Perl is usually 0-indexed.
        my @cols = ( split( "\t", $_ ) )[ 0, 3, 12 ];

        # "@cols" uses space-separated column data for readability later.
        $counts{"@cols"}++;
    }

    close $FILE;

    # loop through the keys...
    foreach (

        # descending sort on count...
        sort {
            ( $counts{$b} <=> $counts{$a} )

              # ...or ascending on keys.
              or ( $a cmp $b )
        }
        keys %counts
      )
    {

        # print the count and the key
        print $counts{$_}, "\t", $_, "\n";
    }
}
else {

    # ...and report the reason if the open failed.
    die "Can't open data.txt: $!";
}
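
One caveat with the "@cols" key above: because it joins the columns with spaces, two different combinations can collide if a field itself contains a space. Below is a minimal sketch of the same counting idea that joins on a tab instead, which cannot occur inside a tab-delimited field. The helper name count_combinations and the sample rows are made up for illustration and are not from the original post; skipping short lines is also an assumption about how you would want bad rows handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: count how often each combination of the chosen
# (0-based) columns occurs in a list of tab-delimited lines.
sub count_combinations {
    my ( $lines, @indexes ) = @_;
    my %counts;
    my $needed = max(@indexes) + 1;    # fields a line must have

    for my $line (@$lines) {
        chomp( my $copy = $line );

        # The -1 limit keeps trailing empty fields instead of dropping them.
        my @fields = split /\t/, $copy, -1;

        # Assumption: skip short lines rather than count undefined columns.
        next if @fields < $needed;

        # A tab cannot appear inside a field, so this key is unambiguous.
        $counts{ join "\t", @fields[@indexes] }++;
    }
    return \%counts;
}

# Made-up sample rows, 15 columns each; only columns 1, 4 and 13 matter.
my @sample = map { join "\t", @$_ } (
    [ 'East', 'o1', 'x', 'ann', ('x') x 8, 'Mr Lee', 'x', 'x' ],
    [ 'East', 'o2', 'x', 'ann', ('x') x 8, 'Mr Lee', 'x', 'x' ],
    [ 'East', 'o3', 'x', 'bob', ('x') x 8, 'Mr Lee', 'x', 'x' ],
    [ 'West', 'o4', 'x', 'ann', ('x') x 8, 'Mr Lee', 'x', 'x' ],
);

my $result = count_combinations( \@sample, 0, 3, 12 );

# Descending by count, then ascending by key, as in the script above.
for my $key ( sort { $result->{$b} <=> $result->{$a} or $a cmp $b }
    keys %$result )
{
    print $result->{$key}, "\t", $key, "\n";
}
```

With these sample rows, the East/ann/Mr Lee combination is counted twice and the other two once each. To run it against the real file, you would read the lines from data.txt and pass a reference to that list instead of @sample.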

