Home > Archive > PERL Beginners > October 2004 > Perl equivalent to the unix 'cut' command










Author Perl equivalent to the unix 'cut' command
Dave Kettmann

2004-10-14, 8:55 pm

Howdy list,

Subject speaks for itself. I need to find a function or module that works
like the 'cut' command. I have a file with multiple fields and I need to
'cut' out 2 of the fields. I looked thru CPAN for cut but didn't find
anything, then started searching CPAN on 'File', but I think I'm
just not thinking of the right keyword that Perl uses. I'm still new to
Perl, and I'm sorry for my ignorance if it should be right under my nose.

Thanks in advance,

Dave Kettmann
NetLogic
Chris Devers

2004-10-14, 8:55 pm

On Thu, 14 Oct 2004, Dave Kettmann wrote:

> Subject speaks for itself..


Okay then.

perldoc -f split

Also speaks for itself :-)




--
Chris Devers
Chris Devers

2004-10-14, 8:55 pm

On Thu, 14 Oct 2004, Chris Devers wrote:

> On Thu, 14 Oct 2004, Dave Kettmann wrote:
>
>
> Okay then.
>
> perldoc -f split
>
> Also speaks for itself :-)


To be less snarky, you probably need to open up your file, iterate over
it line by line, using split to break each line up into chunks, then
build a new array with the fields you want, in the order you want
them. This second array can then be written out to disk; if you want, you
could even read & write within the same loop.

But the key point is that split is often the easiest way to break apart
the fields in a file that is, for example, CSV formatted.

Give that a try, write some code to attempt it, and let the list know if
you have any problems in getting it to work.
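
For readers finding this thread later, here is a minimal sketch of the approach described above. It is not code from the original posts. It assumes tab-separated input and that the second and fourth fields are wanted; the helper name cut_fields and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the requested fields of one line,
# joined by the same delimiter. Field numbers are 1-based, like cut -f.
sub cut_fields {
    my ( $line, $delim, @wanted ) = @_;
    chomp $line;

    # A limit of -1 keeps trailing empty fields instead of dropping them.
    my @fields = split /\Q$delim\E/, $line, -1;

    # An array slice picks the fields in the order requested;
    # fields the line does not have become empty strings.
    return join $delim,
        map { defined $_ ? $_ : '' } @fields[ map { $_ - 1 } @wanted ];
}

# Made-up sample data, read through an in-memory filehandle so the
# example runs without any file on disk.
my $sample = "alice\t30\tNYC\tadmin\nbob\t25\tLA\tuser\n";
open my $in, '<', \$sample or die "Cannot open in-memory file: $!";

my @out;
while ( my $line = <$in> ) {    # one line at a time, never the whole file
    push @out, cut_fields( $line, "\t", 2, 4 );
}
close $in;

print "$_\n" for @out;
```

For a real file, the in-memory handle would be replaced by something like open my $in, '<', $filename. Unlike cut, this sketch returns the fields in the order they are requested, which matches the "order you want them" point above.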


--
Chris Devers
Dave Kettmann

2004-10-14, 8:55 pm



> -----Original Message-----
> From: Chris Devers [mailto:cdevers@pobox.com]
> Sent: Thursday, October 14, 2004 4:16 PM
> To: Perl List (E-mail)
> Cc: Dave Kettmann
> Subject: Re: Perl equivalent to the unix 'cut' command
>
> On Thu, 14 Oct 2004, Chris Devers wrote:
>
> To be less snarky, you probably need to open up your file, iterate over
> it line by line, using split to break each line up into chunks, then
> write out a new array with the fields you want and the order you want
> them. This second array can then be written out to disc; if you want you
> could even read & write within the same loop.
>
> But the key point is that split is often the easiest way to break apart
> the fields in a file that is, for example, CSV formatted.
>
> Give that a try, write some code to attempt it, and let the list know if
> you have any problems in getting it to work.
>
> --
> Chris Devers


Chris,

The reply was deserved :) Just another question before I go too far with
this... The files I am parsing (just needing 2 tabbed fields out of
them) are approximately 20,000 - 25,000 lines long apiece. Each of
these files will be globbed into one file, but that is something
completely different. I guess my question is, would I be better off
calling exec(cut) with files of this size for ease of use? Guess I
should have mentioned this in my previous email.

Thanks again,

Dave
Chris Devers

2004-10-14, 8:55 pm

On Thu, 14 Oct 2004, Dave Kettmann wrote:

> The reply was deserved :) Just another question before I go too far
> with this... The files I am parsing (just needing 2 tabbed fields out
> of them) are approximately 20,000 - 25,000 lines long apiece. Each of
> these files will be globbed into one file, but that is something
> completely different. I guess my question is, would I be better off
> calling exec(cut) with files of this size for ease of use? Guess I
> should have mentioned this in my previous email.


Not necessarily.

I seem to remember that as long as you're iterating over a small window
of the file at any given time, you don't necessarily end up slurping the
whole thing into memory at once.

How long is each line? How large are the files, bytewise? And how much
memory (etc) do you have to work with?

This is going to be a situation where benchmarks are invaluable.
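
One way to get such numbers is Perl's core Benchmark module. The sketch below is illustrative only: it compares a split-based extraction with a regex-based one on made-up tab-separated data. The sub names and the sample lines are assumptions, and it does not time the external cut command, which would need real files on disk.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up data: 25,000 tab-separated lines, roughly the file size
# mentioned in the question.
my @lines = map { join "\t", "host$_", "10.0.0.1", "up", $_, "comment text" }
    1 .. 25_000;

# Candidate 1: split the line and take a list slice (fields 2 and 4).
sub by_split {
    my ($line) = @_;
    return join "\t", ( split /\t/, $line, -1 )[ 1, 3 ];
}

# Candidate 2: capture the same two fields with a single regex.
sub by_regex {
    my ($line) = @_;
    my ( $second, $fourth ) =
        $line =~ /\A[^\t]*\t([^\t]*)\t[^\t]*\t([^\t]*)/;
    return join "\t", $second, $fourth;
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese( -1, {
    'split' => sub { by_split($_) for @lines },
    'regex' => sub { by_regex($_) for @lines },
} );
```

The actual timings depend on the machine and the Perl version, so none are shown here. As an aside on the question itself, exec replaces the running Perl process, so system or backticks would be the usual way to call an external cut from a script that needs to keep running.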




--
Chris Devers
Dave Kettmann

2004-10-14, 8:55 pm


> On Thu, 14 Oct 2004, Dave Kettmann wrote:
>
> Not necessarily.
>
> I seem to remember that as long as you're iterating over a small window
> of the file at any given time, you don't necessarily end up slurping the
> whole thing into memory at once.
>
> How long is each line? How large are the files, bytewise? And how much
> memory (etc) do you have to work with?
>
> This is going to be a situation where benchmarks are invaluable.
>
> --
> Chris Devers


Each line is probably 80-100 characters in length, the files are about
300 KB each (6 files total), working with 1 GB of memory. Looking at these
numbers, I don't know that these are really that big of a file, but they seem
like it when you look at them in vi ;)... I guess I will give slice a
shot and see what I can do with it. I will keep you and the list updated
:)

Dave Kettmann
Chris Devers

2004-10-14, 8:55 pm

On Thu, 14 Oct 2004, Dave Kettmann wrote:

> Each line is probably 80-100 characters in length, the files are about
> 300 KB each (6 files total) working with 1 GB of memory.


Oh, that's it? I'm sure you'll be fine then.


> Looking at these numbers, don't know that these are really that big of
> a file, but seem like it when you look at them in vi ;)...


Well, among other things, Vim handles big files more smoothly :-)




--
Chris Devers
John W. Krahn

2004-10-14, 8:55 pm

Dave Kettmann wrote:
> Howdy list,


Hello,

> Subject speaks for itself.. Need to find a function or module to work
> like the 'cut' command. I have a file with multiple fields and i need
> to 'cut' out 2 of the fields. Looked thru CPAN for cut, but didnt find
> anything,


Really? I found it right away.

http://search.cpan.org/~cwest/ppt-0...cut/cut.hewgill

> started looking thru CPAN searching on 'File' but I think
> I'm just not thinking of the right keyword that perl uses.


Use 'cut'.

> I'm still
> new to perl and I'm sorry for my ignorance if it should be right under
> my nose



John
--
use Perl;
program
fulfillment
Chris Cole

2004-10-15, 3:55 pm

On Thu, 14 Oct 2004 17:40:28 -0400, Chris Devers wrote:

> On Thu, 14 Oct 2004, Dave Kettmann wrote:
>
>
> Oh, that's it? I'm sure you'll be fine then


Most definitely. I sometimes iterate over 1 GB files on a system with 512 MB
of RAM with no problems. It usually takes ~20 seconds to complete, so speed
isn't too much of an issue either.

>
>
> Well, among other things, Vim handles big files more smoothly :-)

