Code Comments


While loop slowness
I have a while inside a while inside a while that is very slow for
large reads. Here is the code (it is really long):

{ while read myline; do
    if [[ $myline = "<tlm-meas>"* ]];then
      read dayline
      firstpass="${dayline##<day-time>}"
      daytime="${firstpass%%</day-time>}"
      linevals=$daytime$comma
      read vehicletime
      read meascolumn
      read measvalue
      read limitsflag
      i=0
      while [[ $i -le $meascount ]];do
        firstpass="${meascolumn##<meas-column>}"
        meascol="${firstpass%%</meas-column>}"
        if [[ $meascol = $i ]];then
          firstpass="${measvalue##<meas-value>}"
          measval="${firstpass%%</meas-value>}"
          linevals=$linevals$measval$comma
          read meascolumn
          if [[ $meascolumn = "</tlm-meas>"* ]];then
            while [[ $i < $(($meascount-1)) ]];do
              linevals=$linevals$comma
              i=$(($i+1))
            done
            break
          fi
          read measvalue
          read limitsflag
        else
          linevals=$linevals$comma
        fi
        i=$(($i+1))
      done
    fi
    if [[ $myline = "</pds-datasort>"* ]];then
      break
    fi
    if [[ $myline = "</tlm-meas>"* || $meascolumn = "</tlm-meas>"* ]];then
      while [[ $i -le $(($meascount-1)) ]];do
        linevals=$linevals$comma
        i=$(($i+1))
      done
      linevals1="${linevals%,,}"
      print $linevals1 >> $3
      continue
    fi
  done } < $dspbfile

So, what this does is take data from one file that looks like this
(and this is just a partial file):

<tlm-meas>
<day-time>2008/035:23:08:09.803</day-time>
<vehicle-time>   83289.803</vehicle-time>
<meas-column>8</meas-column>
<meas-value>-25.0335</meas-value>
<limits-flag><<tlm-meas>
<day-time>2008/035:23:08:25.333</day-time>
<vehicle-time>   83305.333</vehicle-time>
<meas-column>9</meas-column>
<meas-value>0</meas-value>
<limits-flag></limits-flag>
<meas-column>11</meas-column>
<meas-value>3.22123e+09</meas-value>
<limits-flag></limits-flag>
</tlm-meas>
/limits-flag>
</tlm-meas>
</pds-datasort>

And prints it into a file that looks like this:
2008/035:23:08:09.803,,,,,,,,,-25.0335,,,
2008/035:23:08:25.333,,,,,,,,,,0,,3.22123e+09

The meas-column field says where the value gets put, and if there
is no value for a column (they are in order), that column just gets
a comma. There need to be commas for every mnemonic (I do know how
many there are), even if a mnemonic has no value.

When I have only 60 samples in the first file, it runs very quickly.
When I have 274,100 samples in the first file, it takes 2-3 hours to
run.

Is there a quicker way to do this? If not, that is ok.  I just can't
seem to find one. Thanks for any help.

Allyson






eskgwin@gmail.com
04-02-08 12:27 AM


Re: While loop slowness
eskgwin@gmail.com wrote:
> Is there a quicker way to do this? If not, that is ok.  I just can't
> seem to find one. Thanks for any help.

Have a look at xgawk (XML extended GNU awk) to process such data.

Janis

Janis Papanagnou
04-02-08 12:27 AM


Re: While loop slowness

On 4/1/2008 12:41 PM, eskgwin@gmail.com wrote:
> Is there a quicker way to do this? If not, that is ok.  I just can't
> seem to find one. Thanks for any help.

Shell loops are usually the wrong approach. I don't think your sample input is
quite right, as it has things in it like "<<tlm-meas>" and "/limits-flag>". It
appears that you're trying to get everything between "<tlm-meas>" and "</tlm-meas>"
into a single line. If so, take a look at this using GNU awk on a modified
version of your input file:

$ cat file
<tlm-meas>
<day-time>2008/035:23:08:09.803</day-time>
<vehicle-time>   83289.803</vehicle-time>
<meas-column>8</meas-column>
<meas-value>-25.0335</meas-value>
</tlm-meas>
<tlm-meas>
<day-time>2008/035:23:08:25.333</day-time>
<vehicle-time>   83305.333</vehicle-time>
<meas-column>9</meas-column>
<meas-value>0</meas-value>
<limits-flag></limits-flag>
</tlm-meas>
<tlm-meas>
<meas-column>11</meas-column>
<meas-value>3.22123e+09</meas-value>
<limits-flag></limits-flag>
</tlm-meas>
$ gawk -v RS="</tlm-meas>[[:space:]]*" -F'\n' '{
        for (i=2;i<NF;i++) {
                split($i,arr,"[<> ]+")
                printf "%s=\"%s\"\n",arr[2],arr[3]
        }
        print "----"
}' file
day-time="2008/035:23:08:09.803"
vehicle-time="83289.803"
meas-column="8"
meas-value="-25.0335"
----
day-time="2008/035:23:08:25.333"
vehicle-time="83305.333"
meas-column="9"
meas-value="0"
limits-flag="/limits-flag"
----
meas-column="11"
meas-value="3.22123e+09"
limits-flag="/limits-flag"
----

and if it seems to be roughly pulling out and grouping the right information, we
could tidy it up and figure out how to deal with the missing fields for each
record.

Ed.


Ed Morton
04-02-08 12:27 AM


Re: While loop slowness
On Apr 1, 11:05 am, Ed Morton <mor...@lsupcaemnt.com> wrote:
> and if it seems to be roughly pulling out and grouping the right information, we
> could tidy it up and figure out how to deal with the missing fields for each record.

The only data I need actually is the day-time and meas-value. I need
the meas-column to figure out where to put each value in the line. The
part that seems hard is to figure out how to deal with the missing
fields and getting the values in the right places. Thanks.

Allyson

eskgwin@gmail.com
04-02-08 12:27 AM


Re: While loop slowness

On 4/1/2008 1:18 PM, eskgwin@gmail.com wrote:
> The only data I need actually is the day-time and meas-value. I need
> the meas-column to figure out where to put each value in the line. The
> part that seems hard is to figure out how to deal with the missing
> fields and getting the values in the right places. Thanks.
>

OK, so given the input file I show above, we can do this:

gawk -v OFS="," -v RS="</tlm-meas>[[:space:]]*" -F'\n' '{
        dayTime=measColumn=measValue=""
        for (i=2;i<NF;i++) {
                split($i,arr,"[<> ]+")
                if (arr[2] == "day-time") {
                        dayTime=arr[3]
                }
                if (arr[2] == "meas-column") {
                        measColumn=arr[3]
                }
                if (arr[2] == "meas-value") {
                        measValue=arr[3]
                }
        }
        print dayTime,measColumn,measValue
}' file
2008/035:23:08:09.803,8,-25.0335
2008/035:23:08:25.333,9,0
,11,3.22123e+09

What needs to be done now?

Ed.



Ed Morton
04-02-08 12:27 AM


Re: While loop slowness
On Apr 1, 11:27 am, Ed Morton <mor...@lsupcaemnt.com> wrote:
> What needs to be done now?

It needs to look like this:

2008/035:23:08:09.803,,,,,,,,,-25.0335,,,
2008/035:23:08:25.333,,,,,,,,,,0,,3.22123e+09

with the 8 of the meas-column being the place to put the meas-value of
-25.0335. In the second line, the 9 is the column where the 0
meas-value goes and the 11 is the column where the 3.22123e+09 goes.

Also, when I try to use gawk on my unix box:

Machine hardware:   sun4u
OS version:         5.8
Processor type:     sparc
Hardware:           SUNW,Sun-Blade-100

I get this:
a.ksh[3]: gawk:  not found

I can't even do a man on it:
No manual entry for gawk.

Is there something equivalent that I can use? Thanks.

Allyson







eskgwin@gmail.com
04-02-08 12:27 AM


Re: While loop slowness

On 4/1/2008 1:36 PM, eskgwin@gmail.com wrote:
> It needs to look like this:
>
> 2008/035:23:08:09.803,,,,,,,,,-25.0335,,,
> 2008/035:23:08:25.333,,,,,,,,,,0,,3.22123e+09
>
> with the 8 of the meas-column being the place to put the meas-value of
> -25.0335. In the second line, the 9 is the column where the 0 meas-
> value goes and the 11 is the column where the 3.22123e+09 goes.

That's not a problem but before I do any more: is my guess at your input file
format correct or should it instead be this (deleted the 2 lines immediately
before <meas-column>11</meas-column>):

<tlm-meas>
<day-time>2008/035:23:08:09.803</day-time>
<vehicle-time>   83289.803</vehicle-time>
<meas-column>8</meas-column>
<meas-value>-25.0335</meas-value>
</tlm-meas>
<tlm-meas>
<day-time>2008/035:23:08:25.333</day-time>
<vehicle-time>   83305.333</vehicle-time>
<meas-column>9</meas-column>
<meas-value>0</meas-value>
<limits-flag></limits-flag>
<meas-column>11</meas-column>
<meas-value>3.22123e+09</meas-value>
<limits-flag></limits-flag>
</tlm-meas>

or should it really be something else? There are different solutions depending
on the correct input format.

> Also, when I try to use gawk on my unix box:
>
> Machine hardware:   sun4u
> OS version:         5.8
> Processor type:     sparc
> Hardware:           SUNW,Sun-Blade-100
>
> I get this:
> a.ksh[3]: gawk:  not found

Then gawk isn't in your PATH or it may not already be installed on your machine.
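
One way to check which awk variants are actually reachable is sketched below. It is only an illustration: find_awk is a made-up helper name, the candidate list is simply the names that come up in this thread, and it assumes a shell that supports the POSIX "command -v" builtin (an older /bin/sh may not; ksh users can try "whence" instead).

```shell
#!/bin/sh
# find_awk is a hypothetical helper: print the first awk-like command
# that can be found, trying the candidates in order of preference.
find_awk() {
    for candidate in gawk nawk /usr/xpg4/bin/awk awk; do
        # command -v succeeds when the name resolves to something runnable
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

find_awk || echo "no awk found in PATH" >&2
```

If this prints something other than gawk, the gawk-only features used above (such as a regular expression in RS) may not be available.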

> I can't even do a man on it:
> No manual entry for gawk.
>
> Is there something equivalent that I can use? Thanks.

You can use any awk that allows you to use a regular expression as its
record separator (RS), but the only awk I personally know of that supports that
is gawk. We could come up with workarounds, but gawk has many, many features that
make it a good choice of awk to use, so if I were you I'd download and install it
from http://www.gnu.org/software/gawk/ if you don't already have it.

Ed.


Ed Morton
04-02-08 12:27 AM


Re: While loop slowness
On 2008-04-01, Janis Papanagnou <Janis_Papanagnou@hotmail.com> wrote:
> eskgwin@gmail.com wrote:
<immense piece of home-grown XML code snipped>
>
> Have a look at xgawk (XML extended GNU awk) to process such data.
>
Perl has a number of XML handling packages that have a lot of
use and polishing behind them; I'd recommend going that route.

--
Christopher Mattern

NOTICE
Thank you for noticing this new notice
Your noticing it has been noted
And will be reported to the authorities

Chris Mattern
04-02-08 12:27 AM


Re: While loop slowness
Ed Morton wrote:
> You can use any awk that allows you to use a regular expression as its
> record separator (RS), but the only awk I personally know of that supports that
> is gawk. We could come up with workarounds, but gawk has many, many features that
> make it a good choice of awk to use, so if I were you I'd download and install it
> from http://www.gnu.org/software/gawk/ if you don't already have it.

IMHO nawk and /usr/xpg4/bin/awk would work in this case.

But it's a good idea to download GNU awk,
I would choose http://www.sunfreeware.com/
which will guide you to the needed GNU libraries.
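
For what it's worth, here is a rough sketch of how the conversion might look without a regular-expression RS, reading the file line by line instead. It is not tested on Solaris. It assumes the cleaned-up input format Ed asked about (one tag per line, several meas-column/meas-value pairs allowed per <tlm-meas> record), an awk that supports -v and user-defined functions (so not the oldest /usr/bin/awk), and columns numbered from 0. The names meas2csv, inner and ncols are made up for the example.

```shell
#!/bin/sh
# meas2csv is a hypothetical name. Usage: meas2csv NCOLS < input > output
# NCOLS is the number of measurement columns, numbered from 0.
meas2csv() {
    awk -v ncols="$1" '
    # Strip the opening and closing tag from a line, leaving the content.
    function inner(s) {
        sub(/^[ \t]*<[^>]*>[ \t]*/, "", s)
        sub(/[ \t]*<\/[^>]*>[ \t]*$/, "", s)
        return s
    }
    /^[ \t]*<tlm-meas>/    { split("", val); daytime = ""; next }  # new record
    /^[ \t]*<day-time>/    { daytime = inner($0); next }
    /^[ \t]*<meas-column>/ { col = inner($0) + 0; next }
    /^[ \t]*<meas-value>/  { val[col] = inner($0); next }
    /^[ \t]*<\/tlm-meas>/  {                    # end of record: emit one line
        line = daytime
        for (i = 0; i < ncols; i++)
            line = line "," val[i]              # empty when the column is unset
        print line
    }'
}

# Demo on the two-record sample, with 12 columns (0 through 11).
meas2csv 12 <<'EOF'
<tlm-meas>
<day-time>2008/035:23:08:09.803</day-time>
<vehicle-time>   83289.803</vehicle-time>
<meas-column>8</meas-column>
<meas-value>-25.0335</meas-value>
</tlm-meas>
<tlm-meas>
<day-time>2008/035:23:08:25.333</day-time>
<vehicle-time>   83305.333</vehicle-time>
<meas-column>9</meas-column>
<meas-value>0</meas-value>
<limits-flag></limits-flag>
<meas-column>11</meas-column>
<meas-value>3.22123e+09</meas-value>
<limits-flag></limits-flag>
</tlm-meas>
EOF
```

On that sample the demo should print the two lines asked for earlier in the thread (2008/035:23:08:09.803,,,,,,,,,-25.0335,,, and 2008/035:23:08:25.333,,,,,,,,,,0,,3.22123e+09). Because awk reads the file once and keeps only the current record in memory, this kind of single pass is typically far quicker than nested shell read loops on large files.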

--
Michael Tosch @ hp : com

Michael Tosch
04-03-08 12:41 AM

