











 

Author Recommendations of module to use on tab-delimited file?
Kevin Zembower

2005-04-20, 3:56 pm

I have to process a tab-delimited file, similar to a comma-delimited file
(.csv). I found these modules which would seem to work:
Text::Delimited at http://search.cpan.org/~bennie/Text...ted-1.93/lib/Text/Delimited.pm (Jul 2004)
Text::TabFile at http://search.cpan.org/~bennie/Text...1.00/lib/Text/TabFile.pm (Apr 2004)
Text::xSV at http://search.cpan.org/~tilly/Text-...lib/Text/xSV.pm (Apr 2005)
Text::CSV_XS at http://search.cpan.org/~jwied/Text-...-0.23/CSV_XS.pm (Oct 2001)

I'm inclined to use Text::xSV because of its recent update. I've used
Text::CSV_XS successfully before, but it hasn't been revised lately (maybe
it doesn't need to be revised?) and it seems more complex than the others,
requiring the use of IO::File:flock.

I've got to process about 320,000 records, so speed of execution is an
issue, but it's not the overriding concern.

Any recommendations on which module to pick?

Thank you for your advice and suggestions.

-Kevin

-----
E. Kevin Zembower
Internet Systems Group manager
Johns Hopkins University
Bloomberg School of Public Health
Center for Communications Programs
111 Market Place, Suite 310
Baltimore, MD 21202
410-659-6139
Chris Devers

2005-04-20, 3:56 pm

On Wed, 20 Apr 2005, KEVIN ZEMBOWER wrote:

> I'm inclined to use Text::xSV because of its recent update. I've used
> Text::CSV_XS successfully before, but it hasn't been revised lately
> (maybe it doesn't need to be revised?) and it seems more complex than
> the others, requiring the use of IO::File:flock.
>
> I've got to process about 320,000 records, so speed of execution is an
> issue, but it's not the overriding concern.


So use Text::CSV_XS then.

Text::CSV and Text::CSV_XS are the standard modules for this. If you
need to work with the files as they are now, and you need your code to
run fast, then the module to use is Text::CSV_XS. (Text::CSV should be
identical, but is implemented in pure Perl; this makes it more portable
than the C/XS version, but much slower with big data files like yours.)
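
To make that concrete, here is a minimal sketch of reading tab-delimited data with Text::CSV_XS. It assumes a reasonably recent Text::CSV_XS installed from CPAN (one whose getline accepts an ordinary filehandle); the read_tsv helper and the sample data are made up for illustration and are not from your file.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical helper: read every record from a tab-delimited filehandle
# and return a reference to an array of rows (each row is an array ref).
sub read_tsv {
    my ($fh) = @_;

    # sep_char switches the parser from commas to tabs;
    # binary => 1 lets fields contain non-ASCII or embedded newlines.
    my $csv = Text::CSV_XS->new({ sep_char => "\t", binary => 1 })
        or die "Cannot create parser: " . Text::CSV_XS->error_diag;

    my @rows;
    while (my $row = $csv->getline($fh)) {
        push @rows, $row;
    }

    # getline returns undef at end of file and also on a parse error,
    # so check which one stopped the loop.
    $csv->eof or die "Parse error: " . $csv->error_diag;

    return \@rows;
}

# Made-up sample data, read through an in-memory filehandle so the
# sketch runs without any external file.
my $sample = "id\tname\n1\tAlice\n2\tBob\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $rows = read_tsv($fh);
close $fh;

print scalar(@$rows), " rows read\n";
```

For a 320,000-record file you would probably process each row inside the while loop instead of collecting them all in memory; the array is only there to keep the sketch short.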

As an alternative, the next popular option is DBD::CSV, which lets you
treat the CSV (or TSV or whatever) files as if they were tables in a
relational database, allowing you to issue SQL statements against the
file contents. This can be useful -- especially if you're assuming that
the data will in fact be migrated to a proper database in the future --
but I'm not sure how it compares speedwise to Text::CSV_XS. If you just
need raw speed, it may not help you.
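
If you want to see what the DBD::CSV route looks like, here is a rough sketch. It assumes DBI and DBD::CSV are installed from CPAN; the file name records.tsv, the table name records, and the columns rec_id and city are invented for the example, and the column names are taken from the first line of the file.

```perl
use strict;
use warnings;
use DBI;
use File::Temp qw(tempdir);

# Write a small, made-up tab-delimited file into a temporary directory
# so the sketch is self-contained.
my $dir = tempdir(CLEANUP => 1);
open my $out, '>', "$dir/records.tsv" or die "Cannot write sample file: $!";
print {$out} "rec_id\tcity\n1\tBaltimore\n2\tBoston\n";
close $out or die "Cannot close sample file: $!";

# f_dir is the directory holding the data files; csv_sep_char tells the
# driver that fields are separated by tabs rather than commas.
my $dbh = DBI->connect("dbi:CSV:", undef, undef, {
    f_dir        => $dir,
    csv_sep_char => "\t",
    RaiseError   => 1,
});

# Map the SQL table name "records" onto the file records.tsv.
$dbh->{csv_tables}{records} = { f_file => "records.tsv" };

# Query the file with ordinary SQL and a bound placeholder.
my $sth = $dbh->prepare("SELECT rec_id, city FROM records WHERE city = ?");
$sth->execute("Boston");

my @matches;
while (my $row = $sth->fetchrow_arrayref) {
    push @matches, [@$row];    # copy, since DBI reuses the row buffer
}
$sth->finish;
$dbh->disconnect;

printf "%d matching row(s)\n", scalar @matches;
```

The appeal here is that the same prepare/execute code should keep working with little change if the data later moves into a real database; only the connect line would need to be different.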



--
Chris Devers







