Subject: Re: Modifying column values of file records and appending to end
Author: John W. Krahn

2005-11-21, 9:56 pm

Danny Fang wrote:
> Hi,


Hello,

> I'm new to PERL and would like to seek help for the task mentioned below:
>
> I'm attempting to read the contents of a file containing rows with the format shown below:
> version|exchange|area|date|time|callmod|callid|callno1|callno2|part2|start_date|start_time|spare|dur|flag_ini|indicator|length|ni|calling_nai|screening|address_i|num_plan_ind|publicservice_user|spare1|spare2|calling_no|line_no|spare3|spare4|called_nai|called_num_plan_ind|spare5|called_num|spare6|di_nai|di_num_plan_ind|spare7|dest_num|dest_num_type|spare8|doc|dccc_type|spare9|dt_update|spare10|subs_type|cug_code|spare11|teleservices|spare12|channel_isdn|calling_terminal|pulses|in_port|out_port|in_pop|out_pop|spare13|out_mod|spare14|link2bb_type|out_cic|in_cic|anomaly_code|anomaly_ind|spare15|spare16|location|cause_value|spare17|final_status|ingress_ip|egress_ip|inter_dur|bid_time|term_call_setup_delay|del_call_setup_delay|term_ccd|int_call_clear_delay|trans_delay|inter_jitter|send_packets|send_octets|rec_packets|rec_octets|lost_packets|lost_packets_out|packet_period|code_alg|spare20|silence_suppr|spare21|coi|spare22|
> E1440000100TT|006030766|0521|051018|191554|2|1529238851|63271|0|0|051018|184414|0|965|0|0|171|1|3|1|0|1|2|0|0|882030|255|0|0|3|1|0|052539028|0|3|1|0|052539028|1|0|1|0|0|0|0|SNO|0|0|0|0|0|000521882030|1|24086|27368|0|0|0|0|0|0|2|2|0|0|0|0|0|16|0|1|85.38.245.146|83.175.46.150|1|1129653851|142|0|0|0|17093|140|48227|1542980|47896|1532672|258|0||13|0|0|0|||
>
> I'm interested in modifying the values in the 3rd and 26th columns of one particular
> row in this file and duplicating that row until the file contains 4000 rows. There
> are currently 2123 rows in this file.
>
> Below is the script which I've written in order to modify the values at the column
> mentioned.
>
> However, I'm not sure how I could write the newly modified column values of that
> particular row back into the file. I want the row that had its columns modified
> to be duplicated and appended to the end of the current file a specific number
> of times (adding more rows with the duplicated row).
>
> version|exchange|area|date|time|callmod|callid|callno1|callno2|part2|start_date|start_time|spare|dur|flag_ini|indicator|length|ni|calling_nai|screening|address_i|num_plan_ind|publicservice_user|spare1|spare2|calling_no|line_no|spare3|spare4|called_nai|called_num_plan_ind|spare5|called_num|spare6|di_nai|di_num_plan_ind|spare7|dest_num|dest_num_type|spare8|doc|dccc_type|spare9|dt_update|spare10|subs_type|cug_code|spare11|teleservices|spare12|channel_isdn|calling_terminal|pulses|in_port|out_port|in_pop|out_pop|spare13|out_mod|spare14|link2bb_type|out_cic|in_cic|anomaly_code|anomaly_ind|spare15|spare16|location|cause_value|spare17|final_status|ingress_ip|egress_ip|inter_dur|bid_time|term_call_setup_delay|del_call_setup_delay|term_ccd|int_call_clear_delay|trans_delay|inter_jitter|send_packets|send_octets|rec_packets|rec_octets|lost_packets|lost_packets_out|packet_period|code_alg|spare20|silence_suppr|spare21|coi|spare22|
> E1440000100TT|006030766|AAA|BBBB|191554|2|1529238851|63271|0|0|051018|184414|0|965|0|0|171|1|3|1|0|1|2|0|0|882030|255|0|0|3|1|0|052539028|0|3|1|0|052539028|1|0|1|0|0|0|0|SNO|0|0|0|0|0|000521882030|1|24086|27368|0|0|0|0|0|0|2|2|0|0|0|0|0|16|0|1|85.38.245.146|83.175.46.150|1|1129653851|142|0|0|0|17093|140|48227|1542980|47896|1532672|258|0||13|0|0|0|||
> E1440000100TT|006030766|AAA|BBBB|191554|2|1529238851|63271|0|0|051018|184414|0|965|0|0|171|1|3|1|0|1|2|0|0|882030|255|0|0|3|1|0|052539028|0|3|1|0|052539028|1|0|1|0|0|0|0|SNO|0|0|0|0|0|000521882030|1|24086|27368|0|0|0|0|0|0|2|2|0|0|0|0|0|16|0|1|85.38.245.146|83.175.46.150|1|1129653851|142|0|0|0|17093|140|48227|1542980|47896|1532672|258|0||13|0|0|0|||
>
> Could anyone help me out?
>
>
> ##open file for reading
> open(INPUTFILE, $inputFile) || die "Cannot open $inputFile \n";


You should include the $! variable in the error message so you know why the
file failed to open.
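
As a minimal sketch of that advice (the path below is made up so that the
open is expected to fail, and open_for_reading() is a hypothetical helper,
not something from the original script):

```perl
use strict;
use warnings;

# Hypothetical path, chosen so that the open is expected to fail.
my $inputFile = '/no/such/dir/records.txt';

# Returns an open read handle, or dies with the system's reason taken from $!.
sub open_for_reading {
    my ($path) = @_;
    open my $fh, '<', $path
        or die "Cannot open $path: $!\n";
    return $fh;
}

# eval traps the die so the message can be shown without ending the script.
my $fh = eval { open_for_reading($inputFile) };
print "open failed: $@" unless $fh;
```

The text after the colon comes from the operating system (for example a
missing file or a permissions problem), which is usually the first thing
you need to know when an open fails.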


> @fileRecs = <INPUTFILE>;
> $totalRecs = scalar(@fileRecs)-1;


$totalRecs is a scalar, so the expression assigned to it is already evaluated
in scalar context and the scalar() function is redundant. You are subtracting
one from the number of elements in @fileRecs, so if @fileRecs contains 2,123
rows $totalRecs will be 2,122.
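
A small illustration of the context rule, using a made-up three-element
array in place of the real file contents:

```perl
use strict;
use warnings;

# Made-up stand-in for the lines read from the file.
my @fileRecs = ("header\n", "row 1\n", "row 2\n");

my $count      = @fileRecs;           # array in scalar context: element count
my $same       = scalar(@fileRecs);   # explicit scalar(), same result
my $minus_one  = @fileRecs - 1;       # one less than the element count
my $last_index = $#fileRecs;          # index of the last element

print "count=$count same=$same minus_one=$minus_one last_index=$last_index\n";
# prints: count=3 same=3 minus_one=2 last_index=2
```

scalar() only becomes necessary where the surrounding code would otherwise
supply list context, for example in the argument list of print.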


> print "Total records in $inputFile is $totalRecs \n";
> ##open file for writing output
>
> open(OUTFILE, ">$inputFile.tmp") || die "Cannot open $inputFile.tmp \n";


You should include the $! variable in the error message so you know why the
file failed to open.


> $secondRow = $fileRecs[2];


Array indexes start at zero so that is actually the third row.


> print "BEGINNING -- secondRow = $secondRow \n";
>
> @secondRowRec = split("|", $secondRow);


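The first argument to split is a regular expression, even when it is written
as a quoted string, and the '|' character has a special meaning (alternation)
in a regular expression. An unescaped '|' matches the empty string, so the row
is split into individual characters. If you want to match a literal '|'
character you have to put a backslash in front of it. The two-argument form of
split() also discards empty fields at the end of the row; pass a negative
number as the third argument to keep them.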
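
The three behaviours can be compared side by side on a short made-up row
(the real rows are much longer, but they end in empty fields the same way):

```perl
use strict;
use warnings;

# Made-up row: four leading fields, then three empty trailing fields.
my $row = 'a|b||d|||';

my @chars   = split "|",  $row;       # pattern matches the empty string: one element per character
my @trimmed = split /\|/, $row;       # literal pipe, trailing empty fields dropped
my @full    = split /\|/, $row, -1;   # negative limit keeps trailing empty fields

printf "trimmed=%d full=%d\n", scalar @trimmed, scalar @full;
# prints: trimmed=4 full=7

# Only the version with the negative limit rebuilds the original row.
print "round trip ok\n" if join('|', @full) eq $row;
```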


> $secondRowRec[2]="AAAA";
> $secondRowRec[3]="BBBBB";


"I'm interested in modifying the values at the 3rd and 26th column". The 26th
column would be $secondRowRec[25].
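
One way to keep the column numbers and the array indexes straight is to do
the subtraction in one place. This is only a sketch: set_column() is a
hypothetical helper and the 30-field row is made up so the positions are
easy to see.

```perl
use strict;
use warnings;

# Hypothetical helper: replaces 1-based column $col of a '|'-delimited row.
sub set_column {
    my ($row, $col, $value) = @_;
    my @fields = split /\|/, $row, -1;   # -1 keeps trailing empty fields
    $fields[ $col - 1 ] = $value;        # column N lives at index N - 1
    return join '|', @fields;
}

# Made-up row whose fields are simply numbered 1 to 30.
my $row = join '|', 1 .. 30;

$row = set_column($row, 3,  'AAAA');     # 3rd column, index 2
$row = set_column($row, 26, 'BBBBB');    # 26th column, index 25

print "$row\n";
```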


> print "\$secondRowRec[2] = $secondRowRec[2] and \$secondRowRec[3]=$secondRowRec[3] \n";
>
> print "\$secondRow is now $secondRow \n";
>
> $diffOfNewRec = 4000 - $totalRecs;
>
> print "Need to produce additional $diffOfNewRec \n";
>
> #making a backup copy
> `cp $inputFile $inputFile.tmp`;


perldoc -q "using backticks in a void context"
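
Backticks collect the command's output, which is then thrown away here, and
nothing checks whether cp worked. A sketch of one alternative, assuming the
core File::Copy module is acceptable (the scratch directory, the sample file
and the ".bak" suffix are made up for the demonstration):

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Temp qw(tempdir);

# Scratch directory and file name are made up for this demonstration.
my $dir       = tempdir(CLEANUP => 1);
my $inputFile = "$dir/records.txt";

# Create a small sample file to copy.
open my $out, '>', $inputFile or die "Cannot open $inputFile: $!\n";
print {$out} "a|b|c|\n";
close $out or die "Cannot close $inputFile: $!\n";

# copy() returns true on success and sets $! on failure; no shell is involved.
copy($inputFile, "$inputFile.bak")
    or die "Cannot copy $inputFile to $inputFile.bak: $!\n";

my $size = -s "$inputFile.bak";
print "backup written, $size bytes\n";
```

If you do want to run cp itself, system() with a list of arguments and a
check of its return value is the usual replacement for backticks whose
output is not used.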


> ## I need help here !! Not sure how I could re-join the elements modified back
> ## into the array containing that particular row and append it to the end of the file
> print OUTFILE for ($i=0;$i<$diffOfNewRec; $i++){

^^^
You are missing a semicolon at the end of the print statement.
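
For comparison, a sketch of two loop forms that do compile. The row and the
count are shortened, made-up values, and an in-memory filehandle stands in
for OUTFILE so the result is easy to inspect:

```perl
use strict;
use warnings;

my $secondRow    = 'E1|AAAA|BBBBB|';   # made-up, shortened row
my $diffOfNewRec = 3;                  # small count for the demonstration

# In-memory filehandle used in place of a real output file.
my $buffer = '';
open my $out, '>', \$buffer or die "Cannot open in-memory handle: $!\n";

# C-style loop: the print is a complete statement inside the block.
for (my $i = 0; $i < $diffOfNewRec; $i++) {
    print {$out} "$secondRow\n";
}

# Statement-modifier form: one print statement, repeated by the trailing for.
print {$out} "$secondRow\n" for 1 .. $diffOfNewRec;

close $out or die "Cannot close in-memory handle: $!\n";
print $buffer;
```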


> print OUTFILE "$secondRow\n";
> }
>
> close OUTFILE;


You can do what you want like this:

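open INPUTFILE, '<', $inputFile or die "Cannot open $inputFile: $!";

my @fileRecs = <INPUTFILE>;
close INPUTFILE;

# $fileRecs[ 1 ] is the second row; the -1 limit keeps trailing empty fields
my @secondRowRec = split /\|/, $fileRecs[ 1 ], -1;

# no "my" here: these assign to elements of the array declared above
$secondRowRec[ 2 ] = 'AAAA';
$secondRowRec[ 3 ] = 'BBBBB';

open OUTFILE, '>', "$inputFile.tmp" or die "Cannot open $inputFile.tmp: $!";

# original rows first, then the modified row repeated until there are 4000 rows
print OUTFILE @fileRecs, ( join '|', @secondRowRec ) x ( 4000 - @fileRecs );

close OUTFILE or die "Cannot close $inputFile.tmp: $!";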
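
If the rows really should be appended to the existing file rather than
written to a ".tmp" copy, opening the file in '>>' mode is one option. The
following is only a sketch under assumptions: it builds its own three-row
sample file in a scratch directory, and $target is a small stand-in for
4000.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir       = tempdir(CLEANUP => 1);
my $inputFile = "$dir/records.txt";    # made-up sample file
my $target    = 6;                     # stand-in for 4000

# Build a small sample: a header row and two data rows.
open my $fh, '>', $inputFile or die "Cannot open $inputFile: $!\n";
print {$fh} "c1|c2|c3|c4|\n", "r1|x|y|z|\n", "r2|x|y|z|\n";
close $fh or die "Cannot close $inputFile: $!\n";

open my $in, '<', $inputFile or die "Cannot open $inputFile: $!\n";
my @fileRecs = <$in>;
close $in;

# Modify columns 3 and 4 of the second row; the last field keeps the newline.
my @fields = split /\|/, $fileRecs[1], -1;
@fields[ 2, 3 ] = ( 'AAAA', 'BBBBB' );
my $newRow = join '|', @fields;

# Number of rows still needed; never negative.
my $missing = $target - @fileRecs;
$missing = 0 if $missing < 0;

# '>>' appends to the end of the file instead of truncating it.
open my $app, '>>', $inputFile or die "Cannot open $inputFile: $!\n";
print {$app} $newRow x $missing;
close $app or die "Cannot close $inputFile: $!\n";
```

Appending in place changes the original file, so making a backup copy first
is still a good idea.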





John
--
use Perl;
program
fulfillment