Home > Archive > PERL Beginners > September 2004 > CSV Files
| reclusive monkey 2004-09-29, 10:45 am |
| Hello Everyone,
I am new to both programming and perl, so please excuse my ignorance.
I am looking to use Perl to 'correct' certain CSV files I use at
work for some reports. I export a report into CSV format from an
Oracle Database. The format of the records is
="70681601",S,"Training Expenses - Course Fees Expenses Employees And
Related Expenses Unblocking Barriers to Training Apr-Sept 02 Childrens
Services Unit",0,0,0,0
with a few thousand records. Most of the records do not present any
problems, however there are some codes which cause havoc. Some of our
expenditure codes are in the form of "6APR****" which once you save
the file from a CSV to an Excel file, Excel 'helpfully' converts these
to dates (6-Apr-04) etc. It also converts some of these codes to
exponential values. I ultimately put the CSV file into Access, but if
I try to import/link the CSV files as they are, this also has its
problems: some of the description fields contain commas,
which Access then takes as delimiters, even though they are enclosed
in quotes. So, getting to the point, what I would like to do is use
Perl to search for commas within the description field and remove
them.
I have tried to knock up some code to use pattern matching, and have
not had much success. I have googled around, and seen both
DBD::CSV and Text::CSV_XS mentioned when using CSV files, but this
seems to be more difficult to me to understand than writing the code
itself. For the moment I am concentrating on just getting the relevant
lines counted and printed out. Am I anywhere close here? It seems to
be the pattern matching I am having trouble with, the code I am
enclosing seems close; it finds 5 target records, all of which indeed
have a comma in the description. However it also misses some
descriptions with commas in. I have looked at these results and don't
seem to be able to spot my mistake; word,word and word, word are
represented in both the target results and null results. Not
surprising as the pattern matching is all still a mystery to me so
far. Thanks in advance for any advice anyone might have.
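
For illustration, the goal described above (removing commas that fall inside a quoted field while leaving the delimiting commas alone) might be sketched as follows. This is only a minimal sketch using core Perl: the helper name strip_quoted_commas and the sample record are made up for the example, and it assumes that a quoted field never contains an embedded or doubled quote character.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): delete every comma that sits
# between a pair of double quotes, leaving the delimiting commas alone.
# Assumption: quoted fields never contain an embedded or doubled quote.
sub strip_quoted_commas {
    my ($line) = @_;
    # Visit each "..." field in turn and rebuild it without its commas.
    $line =~ s{"([^"]*)"}{
        my $field = $1;
        $field =~ tr/,//d;
        qq{"$field"};
    }ge;
    return $line;
}

# Made-up sample record in the same shape as the exported report.
my $record = q{="6APR0001",S,"Course Fees, Employees, Training",0,0,0,0};
print strip_quoted_commas($record), "\n";
```

Because the substitution only ever touches text between a matched pair of quotes, the field count of each record stays the same. Files that do contain doubled quotes inside fields would need a real CSV parser instead.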
| |
| reclusive.monkey@gmail.com 2004-09-29, 10:45 am |
| reclusive monkey wrote:
> I have tried to knock up some code to use pattern matching, and have
> not had much success.
Not a good introduction; I forgot to paste the code!
#!/usr/bin/perl
use warnings;
# Ask for the csv file to be checked
print "Please enter the name of the csv file you wish to check: ";
my $csvfile = <STDIN>;
chomp $csvfile;
$target_records = 0;
$null_records = 0;
# Open the file
open FILE, $csvfile or die "Cannot read '$csvfile': $!";
# We are looking for commas within two quotes (" "). I'm not having much luck
while (<FILE> )
{
if ($_ =~ /"[\s\w]*,[\s\w]*"/)
{
print $_;
$target_records = $target_records + 1; # Count record as target
}
else
{
$null_records = $null_records + 1; # Count record as null
}
}
# Close the file
close FILE;
# Tell user how many records were found
print "There are $target_records target records and $null_records null records.\n";
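
A likely reason the pattern above misses some records: [\s\w] only allows whitespace and word characters between the quotes, so a description containing anything else (a hyphen, as in the sample record, or a second comma) can never match. One possible alternative, sketched below, is to walk through each quoted field and look for a comma inside it. The helper name has_quoted_comma and the sample lines are invented for the example, and the sketch assumes quoted fields never contain an embedded or doubled quote.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): true when a line has a comma
# somewhere between an opening and a closing double quote.
# Assumption: quoted fields never contain an embedded or doubled quote.
sub has_quoted_comma {
    my ($line) = @_;
    # Step through the quoted fields pair by pair, left to right.
    while ($line =~ /"([^"]*)"/g) {
        return 1 if index($1, ',') >= 0;
    }
    return 0;
}

# Made-up sample lines in the same shape as the exported report.
my @samples = (
    q{="70681601",S,"Training Expenses - Course Fees, Employees",0,0,0,0},
    q{="70681602",S,"Course Fees, Employees",0,0,0,0},
    q{="70681603",S,"Stationery",0,0,0,0},
);

for my $line (@samples) {
    # The pattern from the posted script, for comparison.
    my $original = $line =~ /"[\s\w]*,[\s\w]*"/ ? 'match' : 'no match';
    my $sketch   = has_quoted_comma($line)      ? 'match' : 'no match';
    print "original: $original, sketch: $sketch\n";
}
```

Stepping through the fields pair by pair is deliberate. A single pattern such as /"[^"]*,[^"]*"/ can start at the closing quote of one field and end at the opening quote of the next, so it would also report the delimiters in ",S," as a hit.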