Author How to concatenate 'like' files in a dir?
wilson_work@yahoo.com

2006-02-18, 6:56 pm

Hi All,
I have a directory of .txt files and need to concatenate all files
belonging to each user (oldest first, no set number per user). The
username (M08x) is embedded in the filename, along with other info. I
would like to delete the smaller individual logs/files once they have
been concatenated. Any advice is greatly appreciated!

Here is a sample of the filenames...
-rw-r--r-- 1 christine christine 28046 Oct 11 21:40 KCD-M087326-NA-server.name-2005Oct11-20:42:14.txt
KCD-M087326-NA-server.name-2005Oct16-23:55:26.txt
KCD-M087326-NA-server.name-2005Oct17-22:44:00.txt
KCD-M087326-NA-server.name-2005Oct19-21:35:06.txt
KCD-M087350-NA-server.name-2005Oct03-19:20:56.txt
KCD-M087350-NA-server.name-2005Oct05-21:13:19.txt
KCD-M087350-NA-server.name-2005Sep27-19:26:09.txt
.......


Thank you,
Christine

A. Sinan Unur

2006-02-18, 6:56 pm

wilson_work@yahoo.com wrote in news:1140288000.457752.320040
@z14g2000cwz.googlegroups.com:

> Hi All,
> I have a directory of .txt files and need to concatenate all files
> belonging to each user (oldest first, no set number per user). The
> username (M08x) is embedded in the filename, along with other info. I
> would like to delete the smaller individual logs/files once they have
> been concatenated. Any advice is greatly appreciated!


Well, please first read the posting guidelines for this group. You have
a much better chance of getting useful help if you post some code.

> Here is a sample of the filenames...
> -rw-r--r-- 1 christine christine 28046 Oct 11 21:40 KCD-M087326-NA-server.name-2005Oct11-20:42:14.txt
> KCD-M087326-NA-server.name-2005Oct16-23:55:26.txt
> KCD-M087326-NA-server.name-2005Oct17-22:44:00.txt
> KCD-M087326-NA-server.name-2005Oct19-21:35:06.txt
> KCD-M087350-NA-server.name-2005Oct03-19:20:56.txt
> KCD-M087350-NA-server.name-2005Oct05-21:13:19.txt
> KCD-M087350-NA-server.name-2005Sep27-19:26:09.txt
> ......


Simple ... use a hash ;-)

1. opendir and readdir to read the filenames
2. Use a capturing regex match to grab the user name
3. Add the filename to the list of filenames belonging to the user
4. Custom sort routine to sort filenames by date component.
5. Open a file for user to write to.
6. Read and write each file in required order.

Here is something quick and dirty to get you started:

#!/usr/bin/perl

use strict;
use warnings;

my %months = ( Jan => '01', Feb => '02', Mar => '03',
               Apr => '04', May => '05', Jun => '06',
               Jul => '07', Aug => '08', Sep => '09',
               Oct => '10', Nov => '11', Dec => '12',
             );

my %users;

while ( my $filename = <DATA> ) {
    chomp $filename;
    if ( $filename =~ m{
            \A
            KCD-
            (M\d{6})-
            NA-server.name-
            (\d{4})(\w{3})(\d{2})-
            (\d{2}:\d{2}:\d{2})
            \.txt
            \z
        }x ) {
        my ($user, $date) = ($1, "$2$months{$3}$4$5");
        push @{ $users{$user} }, { filename => $filename, date => $date };
    }
}

for my $user (keys %users) {
    print "Files for $user:\n";
    my @files = sort {
        $b->{date} cmp $a->{date}
    } @{ $users{$user} };

    print $_->{filename}, "\n" for @files;
    print "\n";
}


__DATA__
KCD-M087326-NA-server.name-2005Oct11-20:42:14.txt
KCD-M087326-NA-server.name-2005Oct16-23:55:26.txt
KCD-M087326-NA-server.name-2005Oct17-22:44:00.txt
KCD-M087326-NA-server.name-2005Oct19-21:35:06.txt
KCD-M087350-NA-server.name-2005Oct03-19:20:56.txt
KCD-M087350-NA-server.name-2005Oct05-21:13:19.txt
KCD-M087350-NA-server.name-2005Sep27-19:26:09.txt

D:\Home\asu1\UseNet\clpmisc\dir> files
Files for M087350:
KCD-M087350-NA-server.name-2005Oct05-21:13:19.txt
KCD-M087350-NA-server.name-2005Oct03-19:20:56.txt
KCD-M087350-NA-server.name-2005Sep27-19:26:09.txt

Files for M087326:
KCD-M087326-NA-server.name-2005Oct19-21:35:06.txt
KCD-M087326-NA-server.name-2005Oct17-22:44:00.txt
KCD-M087326-NA-server.name-2005Oct16-23:55:26.txt
KCD-M087326-NA-server.name-2005Oct11-20:42:14.txt



--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(reverse each component and remove .invalid for email address)

comp.lang.perl.misc guidelines on the WWW:
http://mail.augustmail.com/~tadmc/c...guidelines.html
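
The script in the post above reads its filenames from __DATA__, so step 1 of the outline (opendir and readdir) is left open. Below is a minimal sketch of that step. It assumes the filename layout shown in the original post and sorts oldest first, as the question asked. parse_name and files_by_user are hypothetical helper names, not part of the posted code, and the month pattern is tightened from \w{3} to [A-Z][a-z]{2} on the assumption that months always look like "Oct" or "Sep".

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %months = ( Jan => '01', Feb => '02', Mar => '03',
               Apr => '04', May => '05', Jun => '06',
               Jul => '07', Aug => '08', Sep => '09',
               Oct => '10', Nov => '11', Dec => '12',
             );

# Hypothetical helper: return (user, sortable date key) for a matching
# filename, or nothing if the name does not fit the assumed layout.
sub parse_name {
    my ($filename) = @_;
    return unless $filename =~ m{
        \A KCD-
        (M\d{6})-
        NA-server\.name-
        (\d{4})([A-Z][a-z]{2})(\d{2})-
        (\d{2}:\d{2}:\d{2})
        \.txt \z
    }x;
    my $month = $months{$3} or return;    # unknown month name: skip the file
    return ( $1, "$2$month$4$5" );        # e.g. 2005101120:42:14
}

# Hypothetical helper: read $dir and return a hash ref mapping each user
# to an array ref of that user's filenames, oldest first.
sub files_by_user {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open directory '$dir': $!";
    my %users;
    while ( defined( my $filename = readdir $dh ) ) {
        my ( $user, $date ) = parse_name($filename) or next;
        push @{ $users{$user} }, { filename => $filename, date => $date };
    }
    closedir $dh;

    my %sorted;
    for my $user ( keys %users ) {
        $sorted{$user} = [
            map  { $_->{filename} }
            sort { $a->{date} cmp $b->{date} }    # ascending: oldest first
            @{ $users{$user} }
        ];
    }
    return \%sorted;
}

# Only runs when a directory is given on the command line.
if (@ARGV) {
    my $by_user = files_by_user( $ARGV[0] );
    for my $user ( sort keys %$by_user ) {
        print "Files for $user:\n";
        print "$_\n" for @{ $by_user->{$user} };
        print "\n";
    }
}
```

Run with the log directory as its only argument, it prints each user's files oldest first. The names come back without the directory part, so they need $dir prepended before being opened from elsewhere.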

it_says_BALLS_on_your_forehead

2006-02-18, 6:56 pm


A. Sinan Unur wrote:
> ...



i realize this is just an example, but why did you choose to sort such
that the oldest file is last?

A. Sinan Unur

2006-02-18, 6:56 pm

"it_says_BALLS_on_your_forehead" <simon.chao@gmail.com> wrote in
news:1140293637.091464.96260@g44g2000cwa.googlegroups.com:

> A. Sinan Unur wrote:

....
> i realize this is just an example, but why did you choose to sort such
> that the oldest file is last?


I misread the OP's statement and thought that she wanted oldest last,
not first.

In any case, please quote only the relevant parts of the message to
which you are replying.

Sinan

it_says_BALLS_on_your_forehead

2006-02-18, 6:56 pm


A. Sinan Unur wrote:
> "it_says_BALLS_on_your_forehead" <simon.chao@gmail.com> wrote in
> news:1140293637.091464.96260@g44g2000cwa.googlegroups.com:
>
>
> ...
>
>
>
> I misread the OP's statement and thought that she wanted oldest last,
> not first.


gotcha. i didn't know if there was some esoteric file concat method
that took a reverse sorted list as an argument. sorry about the
over-quoting.

A. Sinan Unur

2006-02-18, 6:56 pm

"it_says_BALLS_on_your_forehead" <simon.chao@gmail.com> wrote in
news:1140294427.032181.72650@z14g2000cwz.googlegroups.com:

>
> A. Sinan Unur wrote:
>
> gotcha. i didn't know if there was some esoteric file concat method
> that took a reverse sorted list as an argument.


No there isn't (not that I know of ;-). But my code was missing a map
that would have made life much easier:

for my $user (keys %users) {
    print "Files for $user:\n";
    my @files = map  { $_->{filename} }
                sort { $a->{date} cmp $b->{date} }
                @{ $users{$user} };
    system "cat @files > $user.txt";
}

> sorry about the over-quoting.


No problem.

Sinan
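
The cat one-liner in the post above goes through the shell and leaves the individual logs in place, while the original question also asked for them to be deleted after concatenation. Below is a minimal sketch of steps 5 and 6 in plain Perl with that cleanup added. concat_and_remove is a hypothetical helper name. The sketch assumes the output file is not itself one of the inputs, and the 64 KB chunk size is an arbitrary choice.

```perl
use strict;
use warnings;

# Hypothetical helper: write each file in @$files, in the order given, to
# $out (overwriting it, like "cat ... > out"), then delete the originals
# only after the output file has been closed successfully.
sub concat_and_remove {
    my ( $out, $files ) = @_;

    open my $out_fh, '>', $out or die "Cannot write '$out': $!";
    binmode $out_fh;

    for my $file (@$files) {
        open my $in_fh, '<', $file or die "Cannot read '$file': $!";
        binmode $in_fh;
        local $/ = \65536;    # read fixed-size chunks instead of lines
        while ( defined( my $chunk = <$in_fh> ) ) {
            print {$out_fh} $chunk or die "Write to '$out' failed: $!";
        }
        close $in_fh;
    }

    # A failed close can mean buffered data never reached the disk,
    # so nothing is deleted unless this succeeds.
    close $out_fh or die "Cannot close '$out': $!";

    my $removed = unlink @$files;    # returns the number of files deleted
    warn "Removed only $removed of ", scalar @$files, " files\n"
        if $removed != @$files;
    return $removed;
}
```

Inside the loop from the post, a call such as concat_and_remove("$user.txt", \@files) would take the place of the system line. Because any read or write error dies before the unlink, a failed run leaves the original logs untouched.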
