Code Comments
Hi all;
Pls advise the perlophiliacs method of deciding a string is empty.
I am using
if ( $@ || $c_var eq "" ) {
but constantly read `eq` is expensive.
For example is
if ( $@ || ! length $c_var ) {
better, faster, cheaper
Regards
Ian
On 2004-12-21, ipellew@pipemedia.co.uk <ipellew@pipemedia.co.uk> wrote:
>
> Pls advise the perlophiliacs method of deciding a string is empty.
>
> I am using
> if ( $@ || $c_var eq "" ) {
> but constantly read `eq` is expensive.
...and using $@ in the absence of an eval is silly.
Is there something wrong with
if ($c_var)
? It's not exactly the same, but since you provide no context it's
hard to know what you really need.
--keith
--
kkeller-usenet@wombat.san-francisco.ca.us
(try just my userid to email me)
AOLSFAQ=http://wombat.san-francisco.ca.us/cgi-bin/fom
<ipellew@pipemedia.co.uk> wrote in comp.lang.perl.misc:
> Hi all;
>
> Pls advise the perlophiliacs method of deciding a string is empty.
>
> I am using
> if ( $@ || $c_var eq "" ) {
> but constantly read `eq` is expensive.
>
> For example is
> if ( $@ || ! length $c_var ) {
> better, faster, cheaper
If you really need to know, "use Benchmark", but that's futile
micro-optimization. The idiomatic way is to test for length.
Be sure the string is defined at all.
Anno
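As a minimal sketch of the advice above (test for length, and make sure the string is defined first), the check could look like the following. The helper name `is_empty` and the sample values are illustrative assumptions, not something from the original posts. Note that '0' is a false value in a plain `if ($c_var)` test but is not an empty string, which is why the two tests are "not exactly the same".

```perl
use strict;
use warnings;

# Hypothetical helper: true when the value is undef or the empty string.
sub is_empty {
    my ($value) = @_;
    return !defined($value) || !length($value);
}

# '0' is false in boolean context, yet it is not an empty string.
for my $candidate (undef, '', '0', 'perl') {
    my $label = defined $candidate ? "'$candidate'" : 'undef';
    printf "%-6s => %s\n", $label, is_empty($candidate) ? 'empty' : 'not empty';
}
```

Whether undef should count as "empty" depends on the program; drop the `defined` branch if the two cases need to be told apart.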
Keith Keller (kkeller-usenet@wombat.san-francisco.ca.us) wrote on
MMMMCXXX September MCMXCIII in
<URL:news:vq7k92xjqo.ln2@goaway.wombat.san-francisco.ca.us>:
== On 2004-12-21, ipellew@pipemedia.co.uk <ipellew@pipemedia.co.uk> wrote:
== >
== > Pls advise the perlophiliacs method of deciding a string is empty.
== >
== > I am using
== > if ( $@ || $c_var eq "" ) {
== > but constantly read `eq` is expensive.
==
== ...and using $@ in the absence of an eval is silly.
==
== Is there something wrong with
==
== if ($c_var)
Well, if you are going to comment about using $@ in the absence of
an eval, using 'if ($c_var)' is equally silly. There's no assignment
to $c_var, so it's false, and the then part of the if is never executed
anyway, so we could simply remove the entire then part.
Or perhaps we could say the then part isn't present, so the given
line won't compile, and hence, the program won't run. So we could
just replace the entire program with an empty file.
Or did you just assume the then part would be present in the real
code, and that there would be an assignment to $c_var as well?
If so, why couldn't you assume there would be an eval statement?
== ? It's not exactly the same, but since you provide no context it's
== hard to know what you really need.
Why would you think the OP wants something other than "deciding a string
is empty"? It's not a silly thing to do.
Abigail
--
perl -we '$@=" \145\143\150\157\040\042\112\165\163\164
\040\141\156\157\164".
" \150\145\162\040\120\145\162\154\040\110
\141\143\153\145\162".
" \042\040\076\040\057\144\145\166\057\164
\164\171";`$@`'
ipellew@pipemedia.co.uk wrote:
>
> I am using
> if ( $@ || $c_var eq "" ) {
> but constantly read `eq` is expensive.
>
> For example is
> if ( $@ || ! length $c_var ) {
> better, faster, cheaper
Dear Ian,
I'm not convinced that $var eq "" is necessarily more expensive
than length($var) . The reason I think this is because the eq
operator can report a false value as soon as it detects a character in
the variable it is examining, whereas the length() function must count
every single character in $var, even if $var is millions of characters
long.
The method that is more expensive really depends on the
implementation of the two functions/operators. If you really want to
know which one is more expensive for the task at hand, use the
Benchmark module (read "perldoc Benchmark" to find out how to use it).
But to be honest, it really doesn't matter which method is better,
faster, cheaper. They are pretty much the same in terms of efficiency.
Sure, one may use up a few more clock cycles than the other, but this
is a small constant value that is practically imperceptible, even by
computer standards (in fact, when I used the Benchmark module I saw
the warning: "(warning: too few iterations for a reliable count)" even
when I used a count of ten million).
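The Benchmark comparison mentioned above could be sketched roughly as follows. This is only an illustration: the candidate names `eq_empty` and `not_length` and the sample value of `$c_var` are assumptions, and the numbers it prints will vary by machine and Perl version. A negative count asks Benchmark to run each candidate for at least that many CPU seconds rather than a fixed number of iterations, which tends to avoid the "too few iterations" warning.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample value; substitute data typical of the real program.
my $c_var = '';

# Both candidates answer the same question: is $c_var the empty string?
my %candidates = (
    eq_empty   => sub { return $c_var eq '' },
    not_length => sub { return !length($c_var) },
);

# -1 means "run each candidate for at least one CPU second".
cmpthese(-1, \%candidates);
```

If the two rates come out within a few percent of each other, that is a strong hint the choice should be made on readability alone.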
A lot of programmers fall into the trap of thinking that if they
always use the faster, more efficient operators that their code will
run much faster than before. This is true only if the algorithms used
in these options behave better with large data (are you familiar with
Big-O notation?). So if your program can't handle large amounts of
data very well (that is, if it had a Big-O value of N-squared), simply
converting all your '$val eq ""' conditions to '!length($val)' isn't
going to make your program magically handle large amounts of data.
That's because eq and length() have roughly the same Big-O value. To
make your program run faster, you'd have to modify its algorithms so
that none of them are N-squared (or worse). At this point, the use of
eq versus length() is really a moot point.
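To make the Big-O point concrete, here is a hedged sketch comparing an N-squared approach with a linear one for the same task, detecting a duplicate in a list. The task, the function names, and the sample data are assumptions chosen for illustration; nothing like this appears in the original question. Swapping `eq` for `length` inside the quadratic version would not change its growth rate, whereas changing the algorithm does.

```perl
use strict;
use warnings;

# Quadratic: compares every pair, roughly N*N/2 string comparisons.
sub has_duplicate_quadratic {
    my @items = @_;
    for my $i (0 .. $#items) {
        for my $j ($i + 1 .. $#items) {
            return 1 if $items[$i] eq $items[$j];
        }
    }
    return 0;
}

# Linear on average: one hash lookup per element.
sub has_duplicate_linear {
    my %seen;
    for my $item (@_) {
        return 1 if $seen{$item}++;
    }
    return 0;
}

my @words = qw(apple pear plum fig pear);
print "quadratic: ", has_duplicate_quadratic(@words), "\n";
print "linear:    ", has_duplicate_linear(@words), "\n";
```

Both functions return the same answers; only the amount of work grows differently as the list gets longer.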
To illustrate, if using the length() function is one-millionth of a
second faster than using eq, it will only make a noticeable difference
if length() (or eq) is used (on the order of) one million times more
often than anything else (and then, the difference might only be one
second). That is, if you want to check for the existence of an empty
string only five, one hundred, or even a thousand times in your code,
it really won't make a difference whether you use eq or length().
Theoretically, one method will be faster than the other, but you
couldn't time this difference with a stopwatch, even if you had faster
reflexes than anybody else in the world. And like I mentioned above,
even Perl's Benchmark module has trouble perceiving this time
difference.
In my opinion, you should usually use the function/operation that is
more readable (and, of course, you have to decide for yourself which is
more readable). If you spend two minutes converting the code to
something that is theoretically faster, you might not even save one
second of total running time (from every time you run the program).
And if it takes someone in the future three extra minutes to figure out
what you were trying to do, that's more than four minutes and 59
seconds wasted changing your code, thinking that your code will become
faster, better, cheaper.
I realize I wrote a lot about this subject, but to summarize, let me
say this:
Making code run faster almost always means eliminating the
bottlenecks. Changing '$var eq ""' to '!length($var)' might make a
difference (probably super small) but it won't eliminate a bottleneck.
Here is a real-world analogy (if you like these kinds of things):
There is a ten-mile-long road that people drive their cars on. Most
of this road has two lanes. But for some reason, five miles along the
road, the two lanes merge into one lane, but only for 100 meters (after
which they become two lanes again).
Ordinarily this isn't a problem when there are few cars on the road.
As a car reaches the place where the two lanes become one, it switches
lanes (if needed), and then switches back when there are two lanes
again.
But during periods of heavy traffic, this lane merge causes a
bottleneck. Multiple cars are trying to squeeze into one lane at the
same time, creating a bottleneck and backing up traffic for miles.
This is unacceptable, and a solution must be found.
Someone might say that the speed limit should be raised from 55 mph
to 60 mph, because 60 mph is faster, and therefore more efficient, and
will make the cars move faster. Another person might say to make the
stretch of road that only has one lane shorter so that there is more of
the road with two full lanes.
Their intentions are good, but neither of these solutions eliminates the
bottleneck, which is what is slowing down traffic. A solution that is
much better than either of those just listed would be to insert a
second lane (where there is currently only one lane) for cars to use
instead of having to merge. (In fact, you could even reduce the speed
limit to 50 mph with this solution and it would still work better than
the solution to only raise the speed limit to 60 mph!)
And while raising the speed limit to 60 mph sounds good, it won't
even save you a full minute when the bottleneck is present. With the
bottleneck, the traffic might be backed up for hours, so just
eliminating one minute won't make all that much difference. Eliminate
the bottleneck and hours of driving time will be saved, even when the
speed limit is significantly slower.
And that's why I think you shouldn't worry about whether you should
use eq or length(). Just go with the one that is more readable and
easier to maintain and understand, and you will end up saving more time
in the future by not having to figure out some possibly convoluted code
that might not make much difference in the end at all.
This quote is widely attributed to Donald Knuth:
"Premature optimization is the root of all evil."
The point of the quote is that if you try to optimize a section of code
before you can prove that it needs to be optimized, you may end up
writing obfuscated, difficult-to-read code for nothing.
I hope this helps, Ian.
-- Jean-Luc Romano
>>>>> "jpc" == jl post@hotmail com <jl_post@hotmail.com> writes:
jpc> I'm not convinced that $var eq "" is necessarily more expensive
jpc> than length($var) . The reason I think this is because the eq
jpc> operator can report a false value as soon as it detects a character in
jpc> the variable it is examining, whereas the length() function must count
jpc> every single character in $var, even if $var is millions of characters
jpc> long.
why must length count all the chars? how will it know when the string
ends? does the string end in a zero byte? but perl strings can have any
binary data? so how does perl figure out the length of strings? hmmm.
<snip of overly massive tome on this subject>
jpc> This quote is widely attributed to Donald Knuth:
jpc> "Premature optimization is the root of all evil."
jpc> The point of the quote is that if you try to optimize a section of code
jpc> before you can prove that it needs to be optimized, you may end up
jpc> writing obfuscated, difficult-to-read code for nothing.
jpc> I hope this helps, Ian.
why didn't you just say that and cut out most of the rest (including
your comments on how length works in perl)?
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding -
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
jl_post@hotmail.com wrote:
> I'm not convinced that $var eq "" is necessarily more expensive
> than length($var) . The reason I think this is because the eq
> operator can report a false value as soon as it detects a character in
> the variable it is examining, whereas the length() function must count
> every single character in $var
Perl's length() function does not count characters.
The information is already present in the guts of a
scalar value. Therefore your reasoning is incorrect.
-Joe
Uri Guttman wrote:
>
> why must length count all the chars? how will it know
> when the string ends? does the string end in a zero
> byte? but perl strings can have any binary data? so
> how does perl figure out the length of strings? hmmm.
Hmmm... I didn't think of that. You bring up a good point.
Reflecting on what you just said, I'm remembering the Devel::Peek
module. The Devel::Peek::Dump() function lists a string's length, so
I'm guessing that the length() function could probably get that
attribute from the same place that Devel::Peek::Dump() does.
Thanks for pointing that out.
> <snip of overly massive tome on this subject>
>
> jpc> This quote is widely attributed to Donald Knuth:
> jpc> "Premature optimization is the root of all evil."
>
> why didn't you just say that and cut out most of the rest (including
> your comments on how length works in perl)?
Since you asked, I'll explain.
This subject has come up several times with my peers, and I'm still
amazed what some programmers will favor in the name of efficiency and
speed. For example, some people will refuse to ever use the line:
    $i++;
when $i is just an integer. Instead, they will say the code is wrong
unless it is written as:
    ++$i;
or:
    $i += 1;
The reason they think that using the post-increment operator is wrong
is because it makes an extra copy that is never used (which is slower
and less efficient). Now, they might have a point if $i is a blessed
reference pointing to a huge structure, but when $i is just an integer,
it won't make any noticeable difference to use pre-increment instead
of post-increment.
But I've had people challenge me on this. They say that if you're
writing code, it should be as efficient as possible because it could
get called in a very tight loop that gets called a large number of
times. And while I agree that code should be efficient, I point out
that if the code they write is running slowly, changing a
post-increment operator to a (presumably faster) pre-increment operator
isn't going to speed up the program by any satisfying (or noticeable)
amount. What will make the difference instead is to re-write any
algorithms with a Big-O notation of N-squared (or worse) to be ones
that have a Big-O notation of N log(N) (or better).
And no matter how many times I try to convince them that a
bottleneck won't be eliminated just by replacing something as trivial
as a post-increment operator with a pre-increment operator, the person
I'm talking with often ends the discussion with: "Well... I'm still
going to use the more efficient code." Unfortunately, all too often
that means that their code will be more difficult to read and
understand (for others, of course), especially when they omit comments
explaining what their code is attempting to do and why it was written
that way. And often, their "more efficient" code is more bug-prone
than the equivalent "inferior, inefficient" code.
It seemed like you understood my point. But a lot of people don't.
They hear a cute little quote like this one I read from
http://www-106.ibm.com/developerwor.../l-optperl.html :
> All of this help, though, comes at a slight performance
> cost. I keep warnings and strict on while programming
> and debugging, and I switch it off once the script is
> ready to be used in the real world. It won't save much,
> but every millisecond counts.
I totally disagree with this (I won't go into the reasons why). But
my point is that many people will read this and use this as their
manifesto not to use warnings and strict. I can counteract with
another cute quote, but I've found that if a person has been swayed by
a cute-sy quote, they generally won't get swayed back by another.
From the way the original poster worded his message, he seemed to
think that the faster method was good while all the rest were bad! He
may have obtained this notion the same way I did: when a computer
science professor gave a lecture on operations and how expensive they
are and how they ultimately cost money.
To answer your question, one quote alone is usually not enough to
sway a person's beliefs, so I felt the need to back it up with a
real-world example and scenario in the hopes that it would educate the
original poster.
I didn't mean to offend you or any other poster on this newsgroup
with my long response, but it's a pet peeve of mine when others write
obfuscated code in the name of efficiency, particularly when the
amount of time saved from the total run-times of every run of the
"efficient" program amounts to less than a second. That's why I felt
that a thorough response was in order.
I hope this makes sense, Uri. (And thanks for pointing out that
thing about using length().)
-- Jean-Luc
Joe Smith wrote:
>
> Perl's length() function does not count characters.
> The information is already present in the guts of a
> scalar value. Therefore your reasoning is incorrect.
I see I was wrong. Thanks for pointing that out.
I realized later that I could see this information by using the
Devel::Peek module, like this:
    > perl -MDevel::Peek -e "Dump('perl')"
    SV = PV(0x225208) at 0x1823e98
      REFCNT = 1
      FLAGS = (PADBUSY,PADTMP,POK,READONLY,pPOK)
      PV = 0x182ac34 "perl"\0
      CUR = 4
      LEN = 5
Again, thanks.
-- Jean-Luc
jl_post@hotmail.com wrote:
> This subject has come up several times with my peers, and I'm still
> amazed what some programmers will favor in the name of efficiency and
> speed. For example, some people will refuse to ever use the line:
>
> $i++;
>
> when $i is just an integer. Instead, they will say the code is wrong
> unless it is written as:
>
> ++$i;
>
> or:
>
> $i += 1;
>
> The reason they think that using the post-increment operator is wrong
> is because it makes an extra copy that is never used (which is slower
> and less efficient).
Tell them to take a class in basic compiler construction. Well, compile
time optimizations are an advanced topic, so they will have to take two
classes. But any compiler that does not fold all three statements into
the most efficient form is not worth its money, even if it's free.
jue
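As a small illustration of the three statements discussed above, the sketch below shows that they leave the variable in the same state when the result is discarded, and that the pre- and post-increment forms differ only when the value of the expression is actually used. The variable names are assumptions made for the example, and the sketch says nothing about what any particular compiler or interpreter does internally.

```perl
use strict;
use warnings;

my ($post, $pre, $add) = (0, 0, 0);

# Result discarded: all three leave the variable in the same state.
$post++;
++$pre;
$add += 1;
print "post=$post pre=$pre add=$add\n";   # prints "post=1 pre=1 add=1"

# The forms differ only when the expression's value is used.
my $n   = 5;
my $old = $n++;   # $old receives the value before the increment
my $new = ++$n;   # $new receives the value after the increment
print "old=$old new=$new n=$n\n";         # prints "old=5 new=7 n=7"
```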