Code Comments


Large binary data manipulation
Hi all,


I have a binary data file that contains interleaved data. I need to extract
the left and right data into separate variables, keeping them binary.

Input file  "L|R|L|R|L|R|L|R|L|R...."
Left output  L|L|L|L|L|L|L|....
Right output R|R|R|R|R|R|R|....

I tried building up the two variables as follows:

for {set i 0} {$i < $num_samples} {incr i} {
    set data_left $data_left[read $file_handle 2]
    set data_right $data_right[read $file_handle 2]
}




However, this took a long time, I presume because it had to reallocate
memory each time the next data point was assigned.

Is there a better way, using just core Tcl, i.e. no extensions?

P.S. I found it quicker to write to files and then read them back in!

Thanks in advance

Derek

Derek
08-26-04 01:57 AM


Re: Large binary data manipulation
Hi Derek,

You have two problems in your example.

The first problem is reading only 2 bytes at every step, which causes a
lot of I/O operations. The second problem is that you don't use the append
command: your method reallocates the buffer every time you set a new
variable value.
Just read larger buffers and the routines get a lot faster.

set data [read $file_handle 10240]
set len [string length $data]
set left_data ""
set right_data ""
for {set i 0} {$i < $len} {incr i 4} {
    append left_data [string range $data $i [expr $i+1]]
    append right_data [string range $data [expr $i+2] [expr $i+3]]
}
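
The snippet above handles a single 10240-byte buffer. A minimal sketch of extending it to a whole file follows, assuming 2-byte samples, a file length that is a multiple of 4, and a chunk size that is also a multiple of 4. The proc name deinterleave and its parameters are hypothetical, chosen only for illustration.

```tcl
# Hypothetical helper: split a file of interleaved 2-byte samples
# (L R L R ...) into two binary strings, reading one chunk at a time.
proc deinterleave {path {chunk_size 10240}} {
    set fh [open $path r]
    # Binary translation stops Tcl from converting line endings or encodings.
    fconfigure $fh -translation binary
    set left ""
    set right ""
    # Assumption: chunk_size is a multiple of 4, so a chunk never splits an L/R pair.
    while {[set data [read $fh $chunk_size]] ne ""} {
        set len [string length $data]
        for {set i 0} {$i < $len} {incr i 4} {
            append left  [string range $data $i [expr {$i + 1}]]
            append right [string range $data [expr {$i + 2}] [expr {$i + 3}]]
        }
    }
    close $fh
    return [list $left $right]
}
```

It could be called as `foreach {left_data right_data} [deinterleave sample.bin] break`, where sample.bin is a placeholder file name.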

regards
--
Try Code-Navigator on http://www.codenav.com
a source code navigating, analysis and developing tool. It supports
almost all languages on the scope.




--
Dipl.-Informatiker
Khamis Abuelkomboz
Rosenweg 124
58239 Schwerte
+49 2304 898560 (Telefon)
+49 2304 898561 (Fax)
http://www.wellcode.com

Wellcode
08-26-04 01:57 AM


Re: Large binary data manipulation
Wellcode wrote:

> Hi Derek
>
> you have two problems in your example:
>
> your first problem is the reading 2 bytes every step. this causes alot
> of io operations. The second problem, that you don't use the append
> command, your method reallocates the buffer every time you set a new
> variable value.
> Just read larger buffers and the routines get alot faster.
>
> set data [read $file_handle 10240]
> set len [string length $data]
> set left_data ""
> set right_data ""
> for {set i 0} {$i < $len} {incr i 4} {
>     append left_data [string range $data $i [expr $i+1]]
>     append right_data [string range $data [expr $i+2] [expr $i+3]]
> }
>
> regards

... and it will go even faster if you put {} around the expressions:

..
append left_data [string range $data $i [expr {$i+1}]]
..

Whether it's noticeable in your situation or not depends on a lot of
factors but it's a good habit to get into in any case.
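
A small sketch for checking the difference on your own machine, using the core time command. The variable names are illustrative, and the timings depend on the Tcl version and the hardware, so no figures are given here.

```tcl
set i 41

# Unbraced: the Tcl parser substitutes $i first, then expr parses the resulting text.
set unbraced [expr $i+1]

# Braced: expr receives the literal text $i+1 and performs the substitution itself.
set braced [expr {$i+1}]

puts "unbraced=$unbraced braced=$braced"

# time reports the average microseconds per iteration over the given count.
puts "unbraced: [time {expr $i+1} 100000]"
puts "braced:   [time {expr {$i+1}} 100000]"
```

Both forms give the same value here; only the cost of getting there differs.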


Bryan Oakley
08-26-04 01:57 AM


Re: Large binary data manipulation
"Bryan Oakley" <oakley@bardo.clearlight.com> wrote
>
> ... and it will go even faster if you put {} around the expressions:
>
>     ...
>     append left_data [string range $data $i [expr {$i+1}]]
>     ...
>
> Whether it's noticible in your situation or not depends on a lot of
> factors but it's a good habit to get into in any case.
>

That's interesting ... for us newbies, why is using {} in expr faster Bryan?
I should get into that habit as well!
Thanks!



USCode
08-26-04 01:57 AM


Re: Large binary data manipulation
Wellcode wrote:

> Hi Derek
>
> you have two problems in your example:
>
> your first problem is the reading 2 bytes every step. this causes alot
> of io operations. The second problem, that you don't use the append
> command, your method reallocates the buffer every time you set a new
> variable value.
> Just read larger buffers and the routines get alot faster.
>
> set data [read $file_handle 10240]
> set len [string length $data]
> set left_data ""
> set right_data ""
> for {set i 0} {$i < $len} {incr i 4} {
>     append left_data [string range $data $i [expr $i+1]]
>     append right_data [string range $data [expr $i+2] [expr $i+3]]
> }
>
> regards


This sounds like a good solution to me, but thinking of my programs in C
source code, I would never use i++ on a variable that has not been set.
So "set i 0" must be placed somewhere in your source code before calling
"incr i", and then you don't need to check that it exists.

It sounds like Tcl is safer than bad C source code :-)

regards
Khamis
08-26-04 01:57 AM


Re: Large binary data manipulation
Sorry, I replied to the wrong thread; this should have gone to "[incr] and counting occurrences".



Khamis Abuelkomboz
08-26-04 01:57 AM


Re: Large binary data manipulation
"USCode" <uscode@dontspam.me> writes:

[...]

> That's interesting ... for us newbies, why is using {} in expr
> faster Bryan?

Because expr does its own expansion of its argument, so putting the
argument in braces makes everything clearer to the bytecode compiler.

> I should get into that habit as well!

For stylistic reasons as much as anything.  "expr $a*$b" seems simple
until you notice the "set a {7*$c+2}" a few lines earlier.
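
The case described above can be sketched as follows. The values of b and c are made up for illustration; only a comes from the example in the text.

```tcl
set c 3
set b 2
set a {7*$c+2}   ;# a holds the literal text 7*$c+2, not a number

# Unbraced: the parser pastes the text of a into the expression, and expr
# then substitutes $c itself, so this evaluates 7*3+2*2.
set unbraced [expr $a*$b]

# Braced: expr treats $a as a single operand, and a non-numeric string
# used as an operand of * is an error, which catch reports as 1.
set failed [catch {expr {$a*$b}} msg]

puts "unbraced=$unbraced failed=$failed"
```

With braces the surprise becomes a visible error instead of a silently different calculation.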

Bruce Stephens
08-26-04 01:57 AM


Re: Large binary data manipulation
In article <412CF9E2.4070705@wellcode.com>, Wellcode <info@wellcode.com>
wrote:
> Hi Derek
>
> you have two problems in your example:
>
> your first problem is the reading 2 bytes every step. this causes alot
> of io operations. The second problem, that you don't use the append
> command, your method reallocates the buffer every time you set a new
> variable value.
> Just read larger buffers and the routines get alot faster.
>
> set data [read $file_handle 10240]
> set len [string length $data]
> set left_data ""
> set right_data ""
> for {set i 0} {$i < $len} {incr i 4} {
>      append left_data [string range $data $i [expr $i+1]]
>      append right_data [string range $data [expr $i+2] [expr $i+3]]
> }
>
> regards

Someone else already commented that you will get better performance
if you brace ({ ... }) your expressions.  Someone asked why.  The
reason is that if the expression is braced, the byte code compiler
recognizes the braced expression as a constant (to the parse step),
and so hard-codes that into the call to the expr command, which then
substitutes the variables.  If left unbraced, the byte-code compiler
has to compile in the evaluation of each of the parts, but the expr
command cannot tell it is receiving pre-parsed data through its
arguments, so it still has to re-parse them for substitutions.
Plus, there are certain degenerate cases where this double
substitution could break an expression.  (Although there was also a
degenerate case where I deliberately left the braces out just to get
double substitution, such tricks should be reserved for advanced
users and should be very well documented in a comment.)

If you really want to squeeze every last bit of performance out of
your application, compare and benchmark the above to these:

set data [read $file_handle 10240]
set len [string length $data]
set left_data ""
set right_data ""
set i 0
while {$i < $len} {
    append left_data [string range $data $i [incr i]]
    append right_data [string range $data [incr i] [incr i]]
    incr i
}

or,

set data [read $file_handle 10240]
set len [string length $data]
set left_data ""
set right_data ""
set i -1
while {$i < $len - 1} {
    append left_data [string range $data [incr i] [incr i]]
    append right_data [string range $data [incr i] [incr i]]
}

I'm not sure either one is any faster, but I have a hunch one of
them might be.  I'll leave it to you to benchmark and compare them.
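
One way to run that comparison is sketched below with the core time command. The proc names split_for and split_incr are hypothetical wrappers around the loops above, and the test buffer is synthetic; results depend on the Tcl version and the machine.

```tcl
# Variant with a for loop and braced expr calls.
proc split_for {data} {
    set len [string length $data]
    set left ""
    set right ""
    for {set i 0} {$i < $len} {incr i 4} {
        append left  [string range $data $i [expr {$i + 1}]]
        append right [string range $data [expr {$i + 2}] [expr {$i + 3}]]
    }
    return [list $left $right]
}

# Variant that advances the index with incr inside the string range calls.
proc split_incr {data} {
    set len [string length $data]
    set left ""
    set right ""
    set i -1
    while {$i < $len - 1} {
        append left  [string range $data [incr i] [incr i]]
        append right [string range $data [incr i] [incr i]]
    }
    return [list $left $right]
}

# Synthetic stand-in for one 10240-byte buffer of interleaved samples.
set data [string repeat "LLRR" 2560]

# time reports the average microseconds per call over 100 runs.
puts "for loop:  [time {split_for $data} 100]"
puts "incr loop: [time {split_incr $data} 100]"
```

Wrapping each loop in a proc matters for a fair comparison, since Tcl byte-compiles proc bodies.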

This might be faster or slower (or it might run you out of memory),
and since you are dealing with binary data, not character strings,
it might fail altogether:

set data [read $file_handle 10240]
set left_data ""
set right_data ""
set ldat [split $data {}]
foreach {0 1 2 3} $ldat {
    # append concatenates all of its value arguments, so no join is needed
    append left_data $0 $1
    append right_data $2 $3
}

Note:  0, 1, 2, and 3 are just variable names, nothing magic.
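
If treating the data as characters is the worry, another core-Tcl option is the binary command. This is only a sketch, assuming 2-byte samples and an even number of them; the proc name split_binary is hypothetical. The s format decodes 16-bit little-endian integers, and re-encoding with the same format reproduces the original bytes whatever the real byte order of the samples is.

```tcl
proc split_binary {data} {
    # Decode every 2-byte sample as a 16-bit integer.
    binary scan $data s* samples
    set left {}
    set right {}
    # Assumption: the sample count is even, so l and r are always both set.
    foreach {l r} $samples {
        lappend left $l
        lappend right $r
    }
    # Re-encode each list with the same format to get binary strings back.
    return [list [binary format s* $left] [binary format s* $right]]
}
```

Whether this beats the string range loops is something to benchmark rather than assume.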
--
Rich Wurth / rwurth@att.net / Rumson, NJ  USA

R. T. Wurth
08-26-04 08:57 AM


Re: Large binary data manipulation
"Bruce Stephens" <bruce+usenet@cenderis.demon.co.uk> wrote
> "USCode" <uscode@dontspam.me> writes:
>
> Because expr does its own expansion of its argument, so putting the
> argument in braces makes everything clearer to the bytecode compiler.

If speeding up your application code is important, also see the Tcl
Performance page at http://wiki.tcl.tk/348

Bob
--
Bob Techentin                   techentin.robert@NOSPAMmayo.edu
Mayo Foundation                                 (507) 538-5495
200 First St. SW                            FAX (507) 284-9171
Rochester MN, 55901  USA            http://www.mayo.edu/sppdg/




Bob Techentin
08-26-04 01:57 PM


Re: Large binary data manipulation
Hi all,

I replied to Wellcode directly but should have copied the list.

We were in a first phase of writing code, i.e. finding out whether it
could be done with Tcl.
Once it was working, we found the code that strips out the L and R
data to be a Tcl coding-style bottleneck. We have others, but these are
more target-processor related.

We knew there must be a better way to do what we were doing, but we
were focusing on the binary command rather than moving outside our
self-constructed box.

Of Wellcode's suggestions we used only the append, which improved the
performance of this section of code from 40+ seconds to... well, we
didn't bother measuring it, but it must be less than a second. This now
puts it off the radar as far as improving the performance of the code goes.
Other software is involved that may be reviewed if we need further
performance, but the upper-level Tcl is no longer an immediate concern.

I have been using Tcl/Expect for test automation for some time, mainly
dealing with regexps and small strings; this was the first time I had
looked at binary data.

It was a revelation that the string command could be used to handle
binary data as well.
It now puts other coding exercises at work more within the grasp of
Tcl than I had first expected.

Thanks to all who contributed. We may use more of the ideas if we
need to improve performance, or after our code review.

Again thanks

Derek Philip


Derek
08-28-04 01:57 AM

