apply a math function to a list of numbers
| If I have a Tcl list of numbers, what would be the fastest way to apply
a math function to each element?
| |
| Alan Anderson 2006-08-20, 7:01 pm |
| mitch <mitchu@houston.rr.com> wrote:
> if I have a tcl list of numbers, what would be the fastest way to apply
> a math function to each element ?
Be more specific. What do you mean by "apply a math function to each
element"? Do you want to add them all up and return the sum, do you
want to replace each element by the result of the function, or something
else entirely?
And why do you want to know the "fastest" way? It's possible that
there's some unobvious trick that can speed up whatever it is you want
to do, but it's also likely that doing it the straightforward way will
be fast enough.
| |
| Gerald W. Lester 2006-08-20, 7:01 pm |
| mitch wrote:
> if I have a tcl list of numbers, what would be the fastest way to apply
> a math function to each element ?
>
foreach
--
+--------------------------------+---------------------------------------+
| Gerald W. Lester |
|"The man who fights for his ideals is the man who is alive." - Cervantes|
+------------------------------------------------------------------------+
| |
| Donald Arseneau 2006-08-20, 7:01 pm |
| mitch <mitchu@houston.rr.com> writes:
> if I have a tcl list of numbers, what would be the fastest way to apply a
> math function to each element ?
>
There are packages for vector operations. math in Tcllib
as well as La are Tcl implementations, so may not be
particularly fast. BLT's vectors are binary, but I'm not
sure of the available operations.
--
Donald Arseneau asnd@triumf.ca
| |
| Uwe Klein 2006-08-20, 7:01 pm |
| Donald Arseneau wrote:
> mitch <mitchu@houston.rr.com> writes:
>
>
>
> There are packages for vector operations. math in Tcllib
> as well as La are Tcl implementations, so may not be
> particularly fast. BLT's vectors are binary, but I'm not
> sure of the available operations.
>
the ternary operator is missing
( (condition) ? resa : resb )
there are some examples in the blt distribution.
another example is on the wiki:
http://wiki.tcl.tk/15000 ( function plotter )
uwe
| |
|
| To be more specific ...
I have between 15 and 100 Tcl lists, with each list containing up to
900,000 integers (both positive and negative). From each list, I would like to
create a new list, accomplishing something like this:
(This makes me wonder whether lreplace might be faster than lappend.)
----------------------------------
set ofs 12.3
set fct 0.001807
puts " [ llength $mybiglist ]"
set newlist [list]
foreach nextval $mybiglist {
lappend newlist [ expr [ expr $nextval * $fct ] + $ofs]
}
----------------------
What I can do with awk in milliseconds takes dozens of seconds in Tcl.
I am wondering if I am just stupid about the right technique.
It's actually faster to save the data, run awk on it externally, and
reload it?!
....M'
Alan Anderson wrote:
> mitch <mitchu@houston.rr.com> wrote:
>
>
>
>
> Be more specific. What do you mean by "apply a math function to each
> element"? Do you want to add them all up and return the sum, do you
> want to replace each element by the result of the function, or something
> else entirely?
>
> And why do you want to know the "fastest" way? It's possible that
> there's some unobvious trick that can speed up whatever it is you want
> to do, but it's also likely that doing it the straightforward way will
> be fast enough.
| |
| Uwe Klein 2006-08-21, 7:02 pm |
| mitch wrote:
> ----------------------------------
> set ofs 12.3
> set fct 0.001807
>
> puts " [ llength $mybiglist ]"
> set newlist [list]
catch { unset newlist }
> foreach nextval $mybiglist {
> lappend newlist [ expr [ expr $nextval * $fct ] + $ofs]
lappend newlist [ expr { ( $fct * $nextval ) + $ofs} ]
# notice single call to expr and the curly braces
# see: http://wiki.tcl.tk/10225
> }
put your stuff into a proc:
proc run_expr_on inlist {
...
return $outlist
}
> ----------------------
> what I can do with awk in milliseconds, takes dozens of seconds in tcl.
> I am wondering if I am just stupid about the right technique.
> It's actually faster to save the data , run awk on it externally and
> reload it. ?!
> ...M'
>
>
> Alan Anderson wrote:
>
if you are not happy with the speed of plain tcl:
#!/usr/bin/tclsh
package require BLT
catch { namespace import ::blt::* } cerr
set ds1 [ vector #auto ]
set res [ vector #auto ]
$ds1 set $my_datalist_1
set offs 12.5
set fct 123.45
$res expr { ( $fct * $ds1 ) + $offs }
puts [ join [ $res range 0 end ] \n]
uwe
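A minimal sketch of the advice above written out as one complete proc, assuming the input is an ordinary Tcl list of numbers. The name scale_list and its argument names are made up for illustration and do not come from the thread.

```tcl
# Hypothetical helper: returns a new list holding fct*x + ofs for each x.
proc scale_list {inlist fct ofs} {
    set outlist [list]
    foreach x $inlist {
        # A single braced expr call, so the expression can be byte-compiled.
        lappend outlist [expr {$fct * $x + $ofs}]
    }
    return $outlist
}

puts [scale_list {1 2 3} 2 0.5]   ;# prints: 2.5 4.5 6.5
```

Running the loop inside a proc also means the loop variables are locals, which the bytecode compiler generally handles better than globals.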
| |
| Adrian Ho 2006-08-21, 7:03 pm |
| On 2006-08-21, mitch <mitchu@houston.rr.com> wrote:
> to be more specific ...
> I have between 15 and 100 tcl lists, with each list containing up to
> 900,000 integers (both pos and neg ).
Yikes! With datasets that size, I'd use stream processing techniques
where possible, instead of loading everything into memory and probably
swapping my machine to death even before I start manipulating it.
> From each list , I would like to
> create a new list, and accomplish something like this.:
> ( this makes me wonder if lreplace might be faster than lappend )
> ----------------------------------
> set ofs 12.3
> set fct 0.001807
>
> puts " [ llength $mybiglist ]"
> set newlist [list]
> foreach nextval $mybiglist {
> lappend newlist [ expr [ expr $nextval * $fct ] + $ofs]
> }
> ----------------------
> what I can do with awk in milliseconds, takes dozens of seconds in tcl.
I assume The Awk Way looks something like this?
awk '{print $1 * 0.001807 + 12.3}' < input.txt > output.txt
If so, it's no wonder it's faster -- it uses *much* less memory (in
theory, O(1) vs. your Tcl Way's O(n)).
> I am wondering if I am just stupid about the right technique.
> It's actually faster to save the data , run awk on it externally and
> reload it. ?!
It'll probably be faster still as a Unix pipeline:
process1 < input.txt | process2 | ... | processN > output.txt
where any, all or *none* of the processX's are written in Tcl. I really
like Tcl, but I wouldn't thrash my machine for it. 8-)
- Adrian
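For comparison, a rough sketch of what the stream-processing approach might look like in Tcl itself, assuming one number per line on the input channel. The proc name stream_scale and its parameters are hypothetical.

```tcl
# Hypothetical helper: reads one number per line from channel $in and
# writes fct*x + ofs to channel $out, holding only one line in memory.
proc stream_scale {in out fct ofs} {
    while {[gets $in line] >= 0} {
        set line [string trim $line]
        if {$line eq ""} continue   ;# skip blank lines
        puts $out [expr {$fct * $line + $ofs}]
    }
}
```

Called as `stream_scale stdin stdout 0.001807 12.3` at the end of a script, it should behave as a filter in a pipeline much like the awk one-liner, although nothing here says it will match awk's speed.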
| |
| Stephan Kuhagen 2006-08-22, 8:01 am |
| Adrian Ho wrote:
> awk '{print $1 * 0.001807 + 12.3}' < input.txt > output.txt
You can do something similar with Tcl; nevertheless, awk is faster:
|---list_add.tcl---
| #!/bin/sh
| #\
| exec tclsh "$0" "$@"
|
| set list [read stdin]
| foreach nextval $list {
| append result "[expr {(0.001807*$nextval)+12.3}]\n"
| }
| puts $result
|---end---
time ./list_add.tcl < data > result
real 0m3.744s
time awk '{print $1 * 0.001807 + 12.3}' < data > result2
real 0m2.694s
So, awk IS faster for that special task, but not milliseconds against dozens
of seconds... A faster foreach and string-concat would be great.
> If so, it's no wonder it's faster -- it uses much less memory (in
> theory, O(1) vs. your Tcl Way's O(n)).
I don't think this is the problem. The "data" file in my example contains
750,000 lines with ints as described and is just 3.8MB in size. Even if every
int had 10 digits, it would be only 3 times as big. This is why my example
first reads the whole file and then processes it, which is much faster than
reading line by line.
Regards
Stephan
| |
| Adrian Ho 2006-08-22, 8:01 am |
| On 2006-08-22, Stephan Kuhagen <stk@mevis.de> wrote:
> Adrian Ho wrote:
>
> I think, this is not the problem.
I think it's certainly a major contributing factor. Read on...
> The "data" file in my example contains 750.000 lines with ints as
> described and is just 3.8MB in size. Even if all int had 10 digits it
> would be just 3 times as big. This is, why my example first reads the
> whole file, then processes it, which is much faster than reading line
> by line.
As a counterexample, I wrote three scripts, and populated randnums.txt
with 900000 random integers between -16384 and 16383 to simulate the
OP's working conditions:
# Script 1 - just reads in the numbers (baseline)
set fp [open randnums.txt]
set list [read $fp]
close $fp
exec ps -eo cmd,size | grep tclsh > results.txt
# Script 2 - does calc, appends result (your example)
set fp [open randnums.txt]
set list [read $fp]
close $fp
foreach nextval $list {
append result "[expr {(0.001807*$nextval)+12.3}]\n"
}
exec ps -eo cmd,size | grep tclsh > results.txt
# Script 3 - does calc, lappends result (OP's example)
set fp [open randnums.txt]
set list [read $fp]
close $fp
foreach nextval $list {
lappend result [expr {(0.001807*$nextval)+12.3}]
}
exec ps -eo cmd,size | grep tclsh > results.txt
I then [source]d each script in a different tclsh and used "ps" to check
the memory usage of each tclsh. I also measured awk's memory consumption:
awk '
{X = $1 * 0.001807 + 12.3}
END {print system("ps -eo cmd,size | grep awk")}
' < randnums.txt > results.txt
The results are instructive:
awk: 1.234s @ 1096K
Script 1: 0.096s @ 8456K
Script 2: 12.576s @ 58292K
Script 3: 4.466s @ 74492K
Note in particular the large increases in memory consumption due to
in-memory processing of the entire list. It's certainly conceivable
that this magnitude of increase can trigger a serious amount of swapping,
especially if the OP's 15-100 datasets are all simultaneously processed
in the same tclsh.
- Adrian
| |
| Stephan Kuhagen 2006-08-22, 8:01 am |
| Adrian Ho wrote:
> awk: 1.234s @ 1096K
> Script 1: 0.096s @ 8456K
> Script 2: 12.576s @ 58292K
> Script 3: 4.466s @ 74492K
>
> Note in particular the large increases in memory consumption due to
> in-memory processing of the entire list. It's certainly conceivable
> that this magnitude of increase can trigger a serious amount of swapping,
> especially if the OP's 15-100 datasets are all simultaneously processed
> in the same tclsh.
That's true. My example was just the Tcl version of that awk one-liner. As I
said, Tcl is slower on that, and awk uses less memory. But with the
example scenario of your awk one-liner and its version in Tcl, memory is no
problem (assuming that 75MB doesn't cause swapping). Of course everything
looks completely different if you have 10 such lists in memory. But then I
would ask where there is potential for optimizing: maybe doing it in
sequence instead of in parallel (is it required to do it in parallel?), or not
reading the whole files in one step but in chunks sized to balance speed
against memory usage. The other question is, of course, whether it is worth
the effort. If he just needs to process his 10 lists once in his life, then
there is no need for optimization or any effort. If it is a common task that
for some reason should be done in Tcl instead of awk, then it might be worth
some thinking. I think we can agree that optimizing this task means
avoiding excessive memory consumption (and the swapping it causes) and slow
I/O. The foreach and the calculation are not the slow parts.
Regards
Stephan
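One way the chunked reading mentioned above could look, as a sketch only. It assumes whitespace-separated numbers on the input channel; the proc name chunk_scale and the default chunk size are invented here and would need tuning.

```tcl
# Hypothetical helper: processes channel $in in chunks of roughly
# $chunkSize characters and writes fct*x + ofs for each number to $out.
proc chunk_scale {in out fct ofs {chunkSize 65536}} {
    while {![eof $in]} {
        set chunk [read $in $chunkSize]
        # The chunk may end in the middle of a number; read on to the
        # end of that line so no number is split across two chunks.
        if {![eof $in]} {
            append chunk [gets $in]
        }
        set result ""
        foreach x $chunk {
            append result [expr {$fct * $x + $ofs}] \n
        }
        puts -nonewline $out $result
    }
}
```

Memory use is then bounded by the chunk size rather than by the file size, while the per-line overhead of gets is paid only once per chunk.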
| |
| Uwe Klein 2006-08-22, 8:01 am |
| the first (1,2) are your scripts run on my box
the second ( 4,5) are your ops done through blt.
Linux 2.6.5-7.111.19-default #1 Fri Dec 10 15:10:58 UTC 2004 i686 athlon i386 GNU/Linux
AMD Athlon(tm) XP 3000+
tcl_patchLevel -> 8.4.6
cat script1.tcl ; time tclsh script1.tcl ; cat results.txt
set fp [open randomdat.txt]
set list [read $fp]
close $fp
exec ps -eo cmd,size | grep tclsh > results.txt
real 0m0.079s
user 0m0.051s
sys 0m0.028s
tclsh script1.tc 8468
cat script2.tcl ; time tclsh script2.tcl ; cat results.txt
set fp [open randomdat.txt]
set list [read $fp]
close $fp
foreach nextval $list {
append result "[expr {(0.001807*$nextval)+12.3}]\n"
}
exec ps -eo cmd,size | grep tclsh > results.txt
real 0m4.662s
user 0m4.529s
sys 0m0.107s
tclsh script2.tc 71856
cat script4.tcl ; time tclsh script4.tcl ; cat results.txt
#!/usr/bin/tclsh
package require BLT
catch { namespace import ::blt::* } cerr
set ds1 [ vector #auto ]
set res [ vector #auto ]
set start [ clock clicks ]
$ds1 set [ read [ open randomdat.txt ] ]
set offs 12.5
set fct 123.45
exec ps -eo cmd,size | grep tclsh > results.txt
real 0m0.866s
user 0m0.751s
sys 0m0.112s
tclsh script4.tc 43728
cat script5.tcl ; time tclsh script5.tcl ; cat results.txt
#!/usr/bin/tclsh
package require BLT
catch { namespace import ::blt::* } cerr
set ds1 [ vector #auto ]
set res [ vector #auto ]
set start [ clock clicks ]
$ds1 set [ read [ open randomdat.txt ] ]
set offs 12.5
set fct 123.45
$res expr { ( $fct * $ds1 ) + $offs }
exec ps -eo cmd,size | grep tclsh > results.txt
exit
real 0m0.976s
user 0m0.838s
sys 0m0.127s
tclsh script5.tc 51924
G!
uwe
| |
| stephanearnold@yahoo.fr 2006-08-24, 4:04 am |
| Hello,
When I read the title I felt it was not numerical analysis,
but functional programming.
See the following link :
http://wiki.tcl.tk/lmap
then perform :
lmap x $mynumbers {expr cos($x)}
To apply cosine on all numbers in the list.
Hope it helps.
Regards,
Stéphane
| |
| Andreas Leitgeb 2006-08-24, 8:01 am |
| stephanearnold@yahoo.fr <stephanearnold@yahoo.fr> wrote:
> When I read the title I felt it was not numerical analysis,
> but functional programming.
> See the following link :
> http://wiki.tcl.tk/lmap
> then perform :
> lmap x $mynumbers {expr cos($x)}
> To apply cosine on all numbers in the list.
First, always use braces with expr!
lmap x $mynumbers {expr {cos($x)}}
Second, the wiki page implements this lmap procedure
using foreach and lappend, a combination that has
already been suggested. If there were a built-in
lmap, it might perform better, since it would
know in advance how much space to allocate for the
result list.
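For readers who have not seen it, a foreach-and-lappend map procedure of the kind described can be sketched roughly as follows. This is not the wiki's code; the name map_list is chosen here so it does not clash with the built-in lmap that Tcl 8.6 later added.

```tcl
# Hypothetical helper: evaluates $body once per element of $list, with
# the element stored in the caller's variable $varName, and collects
# the results in a new list.
proc map_list {varName list body} {
    upvar 1 $varName item
    set result {}
    foreach item $list {
        # Run the body in the caller's scope so it sees the loop variable.
        lappend result [uplevel 1 $body]
    }
    return $result
}

puts [map_list x {1 2 3} {expr {$x * 2}}]   ;# prints: 2 4 6
```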