For Programmers: Free Programming Magazines  


Home > Archive > C > August 2006 > [OT] Re: Storage of char in 64 bit machine









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author [OT] Re: Storage of char in 64 bit machine
Eric Sosman

2006-08-15, 7:56 am

Mikhail Teterin wrote:

> Eric Sosman wrote:
>
>
>
>
> Give me an example, of trickery, that YOU consider really , that is also
> portable...


Straying from topicality, hence the change in Subject ...

Once upon a time, in my rambunctious youth, I would have had
no trouble answering your question. It was to use `x&(x-1)'
to zero the lowest-order one-bit, it was to use "horizontal
addition" to count the one-bits or compute parity, it was to
use `POP PC' instead of `RETURN' (yes! it was faster!), it was
to use a computed GOTO instead of an IF (as you may deduce,
my R.Y. was some time ago ...)

But a funny thing happened: Progress.

The first computer I used had forty thousand decimal digits
of memory, of which one hundred locations were reserved for the
table that allowed it to add and subtract integers. All other
arithmetic -- integer multiplication and division, all kinds of
floating-point -- were done via subroutines. I no longer recall
the instruction timings for that machine, but I do remember that
when it was replaced by a newer system that could execute some
kinds of instructions in LESS THAN TWO MICROSECONDS it seemed
like an onrush of unthinkable speed.

On that machine, in my R.Y., " tricks" were a necessity,
a staple of daily existence. They were the difference between
a program that ran and a program that failed (sorry: your code
is twenty digits bigger than memory). Moving a few instructions
out of an inner loop was a big deal -- and the compilers of the
day were not very good at such things. After all, they had only
the resources of the same slow small machine at their disposal,
and were already stressed simply to get the translations done.
(It was not unusual for a compiler to be a program of five or
so sequential "passes" communicating intermediate results via
temporary files, sometimes on reels of magnetic tape or even on
decks of punched cards.) In my R.Y. the compilers needed all the
help we could give them.

Ah, but there's been Progress.

Not quite twenty years ago I bought my first x386 machine,
and it wasn't long before I had a sort of magical realization:
My very modest 640x480 display was being driven by a bare-bones
low-end video card with SIXTEEN TIMES THE MEMORY CAPACITY OF MY
ENTIRE COLLEGE CAMPUS! In a PERIPHERAL, for Crissakes!

It sort of brought home to me the degree to which progress
had changed my world -- and the sobering fact that the tricks
I once employed to squeeze three more instructions into the size
of one disk sector were simply no longer relevant. The economic
facts that once ruled my profession no longer held; a different
dynamic was loose in the world.

The world's finest maker of buggy whips goes to the poorhouse
when the automobile comes along. The careful choice of leathers
for different parts of the whip, the judicious use of oak for
stiffness and ash for springiness that gives the handle its
inimitable feel, the consummate craftsmanship in the double-looped
stitching to prevent water and snow from penetrating and rotting
the interior ... Irrelevant. Useless. Outmoded. Inane.

As one can admire the skills of the buggy whip makers and the
inordinate amount of labor they would expend on a single whip, so
one can admire the " tricks" of the programmer-craftsmen and
their perseverance at desk-checking to avoid messing up even one
of their couple of compiles per day. (I remember once poring over
a core dump trying to see whether a failed program had at least
computed the right intermediate results before dying, doing pencil-
and-paper long division IN HEXADECIMAL to check the results.) As
I say, one can admire the ingenuity, cleverness, sneakiness, and
sweat -- but the proper venue for such admiration is the same as
for admiring buggy whips: In a museum of antiquities, not in code
written for the present day.

Can you make a fire by rubbing two sticks together? It's
doable -- I've seen it done -- but do you feel any less "able"
if you cannot do it yourself?

Do you know how to flake flint to make a knife edge? If not,
do you find your daily life impeded by the loss of that trick?

What kind of sapling will make a good bow? What animal's
entrails will you wind to make a good bowstring, and how do you
get them out of the carcase intact, and how do you prevent them
from rotting -- how, for that matter, do you catch and kill the
animal?

All these once-essential skills are now become irrelevant.
Interesting to antiquarians, perhaps, and in that limited sense
still -- but of no serious consequence any more. You could
be the world's greatest fire-maker, the best flint-flaker ever
to "Ouch!" his thumb, and the best bowyer who ever bent birch,
and you would be ... what? Not chatting on Usenet, I imagine.

I have (figuratively) lived in caves and eaten tasty nematodes
scrabbled from the dirt (there's a trick to finding the good ones,
I'll show ya how it's done). Nowadays I live in a house and get my
food from a supermarket -- and y'know? I really don't miss the
nematodes all that much.

And *that's* .

--
Eric Sosman
esosman@acm-dot-org.invalid
Frederick Gotham

2006-08-16, 9:56 pm

Eric Sosman posted:

> 0.00000000713037 seconds per comparison.



7 million femtoseconds! Wow! ; )

--

Frederick Gotham
Mikhail Teterin

2006-08-17, 3:56 am

Eric Sosman wrote:

> Elsethread you have posted an attempt to quantify HOW MUCH,
> and your results suggest that the particular "" trick you
> favor will save ...
>
> 0.00000000713037 seconds per comparison.


It is 4-6 times faster -- that, the relative, rather than absolute
difference -- is what counts.

A human being can not distinguish a millisecond from a microsecond.

Until the operation needs to be repeated a million time, that is. Because
millisecond becomes 16 minutes, while microsecond -- only a second.

An EOD (End-of-Day) process in a big bank/hedge fund makes (sometimes) many
millions of such currency comparisions as well as assignments (which don't
_really_ require a memcpy/strcpy), etc.

Using pessimal code for the sake of maintaining portability to Digital
Signal Processing boards is even more wasteful than owning an SUV for the
sake of an imaginary (but possible) once-in-a-lifetime off-road ride.

> Mikhail, I can see your error and understand it and sympathize


Thank you, thank you. Agreeing to disagree.

-mi
Walter Roberson

2006-08-17, 6:56 pm

In article <1657552.jO3pM4m0Qg@aldan.algebra.com>,
Mikhail Teterin <usenet+meow@aldan.algebra.com> wrote:

>An EOD (End-of-Day) process in a big bank/hedge fund makes (sometimes) many
>millions of such currency comparisions as well as assignments (which don't
>_really_ require a memcpy/strcpy), etc.


>Using pessimal code for the sake of maintaining portability to Digital
>Signal Processing boards is even more wasteful than owning an SUV for the
>sake of an imaginary (but possible) once-in-a-lifetime off-road ride.


In that End-of-Day processing, is the bank/fund doing nothing when
the currencies do not match, or is the bank/fund doing a currency
conversion? If it is doing a conversion, then eliminate the check
by simply setting the conversion rate for all currencies to themselves
to be 1, and then unconditionally using the conversion routine
with the conversion factor indexed by both conversions.

But the point is moot, because we know that in the OP's code, the
OP is not primarily concerned with speed: the primary concern for
the OP is ease of debugging, and the code is intended by the OP to
be as fast as practical provided the ease of debugging is preserved.
If speed were the primary concern, then the OP's code would have
converted to an arbitrary currency index and then would compare the
indices "millions of times".
--
"law -- it's a commodity"
-- Andrew Ryan (The Globe and Mail, 2005/11/26)
Mikhail Teterin

2006-08-17, 6:56 pm

We went from "don't do this, you idiot, the compiler is much better
optimizing, than you will ever be" to "yeah, it is 4-6 times faster to do
things your way, but it is still not worth the effort".

I think, this is a considerable progress for one thread and will shut up for
a while...

-mi
Walter Roberson

2006-08-17, 6:56 pm

In article <1940995.IL8L2SCWzz@misha>,
Mikhail Teterin <usenet@aldan.algebra.com> wrote:
>We went from "don't do this, you idiot, the compiler is much better
>optimizing, than you will ever be" to "yeah, it is 4-6 times faster to do
>things your way, but it is still not worth the effort".


No we didn't. We went from "Don't do this, you idiot, that's
non-portable" to -you- saying "Yeah, but it's faster".

A couple of people then {perhaps unwisely} said "a good strcmp() can
probably do just as well", and you produced a benchmark that appeared
to show that your code is measurably faster... at least for the systems
you tried.

This was followed by a collective "So what if it executes faster?
The time differences is small enough to be inconsequential for most
systems, and in the meantime you've written unclear code that will
likely break on future systems. Paul Hsieh is the only one who
said that it's worth exploring, and it isn't clear that he is taking
into account relative costs of maintaince and execution time.

>I think, this is a considerable progress for one thread and will shut up for
>a while...


You haven't really answered to the costs vs portability concerns.
You mooted about banks and funds, but I pointed out that if
the execution time of the comparison was really such a key factor
then you wouldn't have used this approach at all.

Sooo... the space we are in right now is "Here's a non-portable
and less-clear hack that's often faster" and you wanting general
approval for your cleverness... or at the very least, you wanting
people to be content to have such hacks be topical here.

The only progress that I see that you've made is Paul Hseih's
"don't take the naysaying too seriously". Paul is [to me] obviously
fairly knowledgable in some areas, but it -looks- to me as if he
often ends up in arguments, and I get the -impression- that a fair
number of people don't pay careful attention to him (possibly
because of the -way- he says things...)
--
"It is important to remember that when it comes to law, computers
never make copies, only human beings make copies. Computers are given
commands, not permission. Only people can be given permission."
-- Brad Templeton
Keith Thompson

2006-08-17, 6:56 pm

Mikhail Teterin <usenet@aldan.algebra.com> writes:
> We went from "don't do this, you idiot, the compiler is much better
> optimizing, than you will ever be" to "yeah, it is 4-6 times faster to do
> things your way, but it is still not worth the effort".
>
> I think, this is a considerable progress for one thread and will shut up for
> a while...


Looking back in this thread's history, you weren't the OP, but you did
bring up something that triggered a lot of discussion:

| So, comparing, say, 4-char arrays (like currency codes) can NOT be done in
| the following way?
|
| typedef union {
| char acCUR[4];
| int32_t iCUR;
| } xCUR;
|
| int
| CurEqual(xCUR *c1, xCUR *c2)
| {
| if (c1->iCUR == c2->iCUR)
| printf("Same currency %s\n", c1->acCUR);
| else
| printf("%s and %s are different\n",
| c1->acCUR, c2->acCUR);
| }
|
| Having to call a strcmp() in such cases seems like a bad waste to me, but I
| don't see, how the compiler could possibly optimize such a code without the
| trick above...

In this case, it seems to me that there are solutions better than
either using strcmp() or pretending that a char[4] is an int32_t.

Presumably you want the operation of comparing two currency codes to
be as fast as possible, because that operation is a bottleneck in your
application.

You can probably achieve this by storing and comparing the currency
codes *as integers*. One simple way to do this is to compute the
numeric codes arithmetically from the string values. I think you said
that all the currency codes are two characters; if so, it's as simple
as:

numeric_code = (string_code[0] << CHAR_BIT) + string_code[1];

and the reverse conversion is also quite simple.

Note that the values of the numeric codes could vary with different
character sets, and if you store them in files, they could vary with
system byte ordering.

This avoids the need for strcmp() *and* it doesn't depend on type
punning.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Dik T. Winter

2006-08-17, 6:56 pm

In article <1940995.IL8L2SCWzz@misha> Mikhail Teterin <usenet@aldan.algebra.com> writes:
> We went from "don't do this, you idiot, the compiler is much better
> optimizing, than you will ever be" to "yeah, it is 4-6 times faster to do
> things your way, but it is still not worth the effort".


You quoted a gain of 0.00000000713037 seconds per comparison in EOD
processing at a bank. Even if in that EOD processing 1000 million of
such comparisons are made, your net gain is about 7 seconds. I think
that is peanuts compared to the total time required for EOD processing.

I would think that in the EOD process the comparisons are a small fraction
of the total time involved. Say 10% (which is, eh, quite large in my
opinion). If we reduce that by a factor 5, the total time will be
reduced by only 8%.

But even *if* it is significant, there are better methods to increase
speed. Set up a table with all existing currencies, and use indexing.
When a transaction is entered, look up the index numbers for the
transaction (takes about 0.00003 seconds if there are two currencies
involved), and use those indices for the remainder.

Even when loading a full table of conversion rates between two currencies,
the finding of an index would be peanuts (0.0003 seconds gain with your
method). I would think that reading the rates themselves would take
much more time.

This is called micro-optimisation. Optimising a part of the program
that does not use a significant amount of time compared to the
remainder. You should start at the bottle-necks, and for that you
have to measure what the bottle-necks are. Profile the program.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Edmond Dantes

2006-08-17, 9:56 pm

Walter Roberson wrote:

> In article <1940995.IL8L2SCWzz@misha>,
> Mikhail Teterin <usenet@aldan.algebra.com> wrote:
>
> No we didn't. We went from "Don't do this, you idiot, that's
> non-portable" to -you- saying "Yeah, but it's faster".


There must be an equivalent of the old aphorism, "penny-wise, pound-foolish"
that can be applied to optimizing code.

Today, with rare exception, there is hardly the need to optimize assembly
code by hand. I could see it for, say, an embedded system running on
limited resources where it may make sense. But in general? Nope.

--
-- Edmond Dantes, CMC
And Now for something Completely Different:
http://mini-cooper.ReallyHotOrNot.com
http://washers.industrialtips.com
http://embroidery.womencraft.com
http://commodity.yourtruemoney.com
http://Linux.Internet69.com
http://Honda.ReallyHotOrNot.com
http://playset.GadgetRUs.com


Posted Via Usenet.com Premium Usenet Newsgroup Services
----------------------------------------------------------
** SPEED ** RETENTION ** COMPLETION ** ANONYMITY **
----------------------------------------------------------
http://www.usenet.com
Flash Gordon

2006-08-19, 7:56 am

Walter Roberson wrote:
> In article <1940995.IL8L2SCWzz@misha>,
> Mikhail Teterin <usenet@aldan.algebra.com> wrote:
>
> No we didn't. We went from "Don't do this, you idiot, that's
> non-portable" to -you- saying "Yeah, but it's faster".
>
> A couple of people then {perhaps unwisely} said "a good strcmp() can
> probably do just as well", and you produced a benchmark that appeared
> to show that your code is measurably faster... at least for the systems
> you tried.


I just tried the benchmark on my machine and got the same result.
However, when I changed bench.c to call strcmp directly, which is how I
would actually do the string comparison inline the results switched to
strcmp being far faster. When, to be fair, I changed the integer
comparison to being inline I got the *same* results for both. So I don't
thing the comment about good strcmp implementations was unwise, I think
it was justifiable at least in part. Under the right conditions the
strcmp method *does* perform as efficiently because the compiler knows
what strcmp does, inlines it, and then optimises the inlined code in
context.

<OT>
I believe that this is the reason gcc has a number of "library"
functions implemented as built-ins rather than just calling the library.
The built ins *are* inlined before the optimiser is invoked.
</OT>

> This was followed by a collective "So what if it executes faster?
> The time differences is small enough to be inconsequential for most
> systems, and in the meantime you've written unclear code that will
> likely break on future systems. Paul Hsieh is the only one who
> said that it's worth exploring, and it isn't clear that he is taking
> into account relative costs of maintaince and execution time.


There is also the point that even if it save 3 hours each run it may
still not be worth it. My customers still have jobs that are run as
overnight jobs for very good reasons, they don't care how long it takes
as long as it is finished by morning. In other situations a 1ms saving
can be critical. In other words, you have to consider how valuable
execution time is as well as how much you save and what it costs you.

>
> You haven't really answered to the costs vs portability concerns.
> You mooted about banks and funds, but I pointed out that if
> the execution time of the comparison was really such a key factor
> then you wouldn't have used this approach at all.
>
> Sooo... the space we are in right now is "Here's a non-portable
> and less-clear hack that's often faster" and you wanting general
> approval for your cleverness... or at the very least, you wanting
> people to be content to have such hacks be topical here.


The reason I believe most people object to hacks like these is, I
believe, because they make it needlessly hard to understand the code
without any corresponding *significant* benefit. Note that even when
they do improve performance (some tricks people have posted would make
performance *worse* on some systems) the improved performance is often
not enough for the user to notice.

> The only progress that I see that you've made is Paul Hseih's
> "don't take the naysaying too seriously". Paul is [to me] obviously
> fairly knowledgable in some areas, but it -looks- to me as if he
> often ends up in arguments, and I get the -impression- that a fair
> number of people don't pay careful attention to him (possibly
> because of the -way- he says things...)


What gets me is his apparent assumption that just because optimisations
will work and produce better results on the systems he has experience
with means that they are generally worth while. See my response to his
post pointing out that his *assumptions* about performance are wrong
even for a bog-standard notebook running Linux.
--
Flash Gordon
Still sigless on this computer.
Malcolm

2006-08-19, 6:56 pm


"Edmond Dantes" <edmond@le-comte-de-monte-cristo.biz> wrote
> Walter Roberson wrote:
>
>
> There must be an equivalent of the old aphorism, "penny-wise,
> pound-foolish"
> that can be applied to optimizing code.
>
> Today, with rare exception, there is hardly the need to optimize assembly
> code by hand. I could see it for, say, an embedded system running on
> limited resources where it may make sense. But in general? Nope.
>

If you've got a very slow computer or a very fast computer, often you need
to optimise.
If you are using a 3GHz PC to run a medium-sized company accounts on a
spreadsheet, it is pretty pointless.
--
www.personal.leeds.ac.uk/~bgy1mm
freeware games to download.



Dave Thompson

2006-08-28, 3:56 am

On Fri, 18 Aug 2006 00:24:09 GMT, Keith Thompson <kst-u@mib.org>
wrote:

> Mikhail Teterin <usenet@aldan.algebra.com> writes:


> | So, comparing, say, 4-char arrays (like currency codes) can NOT be done in
> | the following way?
> |
> | typedef union {
> | char acCUR[4];
> | int32_t iCUR;
> | } xCUR;

<snip using iCUR>
> In this case, it seems to me that there are solutions better than
> either using strcmp() or pretending that a char[4] is an int32_t.


> You can probably achieve this by storing and comparing the currency
> codes *as integers*. One simple way to do this is to compute the
> numeric codes arithmetically from the string values. I think you said
> that all the currency codes are two characters; if so, it's as simple


He said they are 3 characters plus null, hence the int32=4*char test
would -- if not for that danged UB -- always give the same equality
result as strcmp, i.e. there will never be any trailing possibly
garbage byte(s) ignored by strcmp but included in int32.

Unsurprisingly, as the applicable official standard for interchange,
ISO 4217, defines codes of three characters = 2 characters country
code (mostly ISO 3166) + one letter allowing for multiple (i.e.
replacement) currencies, and IME is often used internally as well.
More relevant to the discussion here, ISO 4217 also provides numeric
codes which will fit easily in a (minimal standard) 16-bit short,
which are equally official but IME not as widely used.

> as:
>
> numeric_code = (string_code[0] << CHAR_BIT) + string_code[1];
>

<snip>
> This avoids the need for strcmp() *and* it doesn't depend on type
> punning.


But I suspect it isn't very likely to be more efficient than strcmp(),
which was the stated goal. (Setting aside the usual discussions,
already had, about whether/when/how to optimize.)

- David.Thompson1 at worldnet.att.net
Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2009 codecomments.com