Code Comments

Good practice to detect empty string?
Hi all;

Pls advise the perlophiliacs method of deciding a string is empty.

I am using
if ( $@ || $c_var eq "" ) {
but constantly read `eq` is expensive.

For example is
if ( $@ || ! length $c_var ) {
better, faster, cheaper

Regards
Ian


ipellew@pipemedia.co.uk
12-21-04 01:57 AM


Re: Good practice to detect empty string?
On 2004-12-21, ipellew@pipemedia.co.uk <ipellew@pipemedia.co.uk> wrote:
>
> Pls advise the perlophiliacs method of deciding a string is empty.
>
> I am using
> if ( $@ || $c_var eq "" ) {
> but constantly read `eq` is expensive.

...and using $@ in the absence of an eval is silly.

Is there something wrong with

if ($c_var)

?  It's not exactly the same, but since you provide no context it's
hard to know what you really need.

--keith

--
kkeller-usenet@wombat.san-francisco.ca.us
(try just my userid to email me)
AOLSFAQ=http://wombat.san-francisco.ca.us/cgi-bin/fom


Keith Keller
12-21-04 01:57 AM
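
To make the "not exactly the same" caveat concrete, here is a small sketch showing where a plain truth test and an emptiness test disagree. The helper names is_true and is_empty are made up for illustration. The string "0" is the classic case: it is false in boolean context, yet it is not empty.

```perl
use strict;
use warnings;

# Hypothetical helpers, named here only for illustration.
sub is_true  { my ($s) = @_; return $s ? 1 : 0 }
sub is_empty { my ($s) = @_; return (defined $s && length $s) ? 0 : 1 }

# Compare the two tests on a few sample strings.
for my $s ('', '0', '0.0', ' ', 'abc') {
    printf "%-6s true=%d empty=%d\n", "'$s'", is_true($s), is_empty($s);
}
```

Only '' and '0' come out false in the truth test, and only '' counts as empty, so if ($c_var) is a safe substitute only when "0" can never be a legitimate value.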


Re: Good practice to detect empty string?
<ipellew@pipemedia.co.uk> wrote in comp.lang.perl.misc:
> Hi all;
>
> Pls advise the perlophiliacs method of deciding a string is empty.
>
> I am using
> if ( $@ || $c_var eq "" ) {
> but constantly read `eq` is expensive.
>
> For example is
> if ( $@ || ! length $c_var ) {
> better, faster, cheaper

If you really need to know, "use Benchmark", but that's futile
micro-optimization.  The idiomatic way is to test for length.
Be sure the string is defined at all.

Anno

Anno Siegel
12-21-04 08:58 PM
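
A minimal sketch of the advice above: test for definedness first, then for length. The helper name is_blank is hypothetical. On Perls of this thread's era, calling length on an undefined value warned under "use warnings"; since Perl 5.12, length(undef) simply returns undef, but the explicit check still documents the intent.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $s is undef or the empty string.
sub is_blank {
    my ($s) = @_;
    return !defined($s) || !length($s);
}

my $c_var;                                   # never assigned, so undef
print "undef is blank\n"   if is_blank($c_var);
$c_var = '';
print "'' is blank\n"      if is_blank($c_var);
$c_var = '0';
print "'0' is not blank\n" unless is_blank($c_var);
```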


Re: Good practice to detect empty string?
Keith Keller (kkeller-usenet@wombat.san-francisco.ca.us) wrote on
MMMMCXXX September MCMXCIII in <URL:news:vq7k92xjqo.ln2@goaway.wombat.san-francisco.ca.us>:
==  On 2004-12-21, ipellew@pipemedia.co.uk <ipellew@pipemedia.co.uk> wrote:
== >
== > Pls advise the perlophiliacs method of deciding a string is empty.
== >
== > I am using
== > if ( $@ || $c_var eq "" ) {
== > but constantly read `eq` is expensive.
==
==  ...and using $@ in the absence of an eval is silly.
==
==  Is there something wrong with
==
==  if ($c_var)

Well, if you are going to comment about using $@ in the absence of
an eval, using 'if ($c_var)' is equally silly. There's no assignment
to $c_var, so it's false, and the then part of the if is never executed
anyway, so we could simply remove the entire then part.

Or perhaps we could say the then part isn't present, so the given
line won't compile, and hence, the program won't run. So we could
just replace the entire program with an empty file.

Or did you just assume the then part would be present in the real
code, and that there would be an assignment to $c_var as well?
If so, why couldn't you assume there would be an eval statement?

==  ?  It's not exactly the same, but since you provide no context it's
==  hard to know what you really need.

Why would you think the OP wants something other than "deciding a string
is empty"? It's not a silly thing to do.



Abigail
--
perl -we '$@="\145\143\150\157\040\042\112\165\163\164\040\141\156\157\164".
"\150\145\162\040\120\145\162\154\040\110\141\143\153\145\162".
"\042\040\076\040\057\144\145\166\057\164\164\171";`$@`'

Abigail
12-22-04 01:57 AM


Re: Good practice to detect empty string?
ipellew@pipemedia.co.uk wrote:
>
> I am using
> if ( $@ || $c_var eq "" ) {
> but constantly read `eq` is expensive.
>
> For example is
> if ( $@ || ! length $c_var ) {
> better, faster, cheaper


Dear Ian,

I'm not convinced that  $var eq ""  is necessarily more expensive
than  length($var) .  The reason I think this is because the eq
operator can report a false value as soon as it detects a character in
the variable it is examining, whereas the length() function must count
every single character in $var, even if $var is millions of characters
long.

The method that is more expensive really depends on the
implementation of the two functions/operators.  If you really want to
know which one is more expensive for the task at hand, use the
Benchmark module (read "perldoc Benchmark" to find out how to use it).
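
A minimal sketch of such a benchmark, assuming a 1,000-character sample string (the labels eq_empty and not_length are arbitrary). No timings are shown here because they vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $c_var = 'x' x 1_000;   # sample data; try '' and longer strings too

# A negative count means "run each sub for at least that many CPU seconds".
cmpthese(-1, {
    eq_empty   => sub { my $empty = $c_var eq ''    ? 1 : 0; return },
    not_length => sub { my $empty = !length($c_var) ? 1 : 0; return },
});
```

cmpthese prints a table of rates and relative percentages, which makes it easy to see whether the difference is large enough to care about.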

But to be honest, it really doesn't matter which method is better,
faster, cheaper.  They are pretty much the same in terms of efficiency.
Sure, one may use up a few more clock cycles than the other, but this
is a small constant value that is practically imperceptible, even by
computer standards (in fact, when I used the Benchmark module I saw
the warning:  "(warning: too few iterations for a reliable count)" even
when I used a count of ten million).

A lot of programmers fall into the trap of thinking that if they
always use the faster, more efficient operators, their code will
run much faster than before.  This is true only if the algorithms used
in these options behave better with large data (are you familiar with
Big-O notation?).  So if your program can't handle large amounts of
data very well (that is, if it had a Big-O value of N-squared), simply
converting all your '$val eq ""' conditions to '!length($val)' isn't
going to make your program magically handle large amounts of data.
That's because eq and length() have roughly the same Big-O value.  To
make your program run faster, you'd have to modify its algorithms so
that none of them are N-squared (or worse).  At this point, the use of
eq versus length() is really a moot point.
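
As an illustration of the kind of algorithmic change meant here, consider checking a list for duplicates. The task and the function names are assumptions chosen for the sketch, not something from the original question. The first version compares every pair (N-squared); the second remembers what it has seen in a hash (roughly linear).

```perl
use strict;
use warnings;

# N-squared: compare every element with every later element.
sub has_dup_quadratic {
    my @items = @_;
    for my $i (0 .. $#items) {
        for my $j ($i + 1 .. $#items) {
            return 1 if $items[$i] eq $items[$j];
        }
    }
    return 0;
}

# Roughly linear: remember what has been seen in a hash.
sub has_dup_linear {
    my %seen;
    for my $item (@_) {
        return 1 if $seen{$item}++;
    }
    return 0;
}

my @data = (1 .. 2_000, 1_000);   # one duplicate near the end
print has_dup_quadratic(@data) ? "dup\n" : "no dup\n";
print has_dup_linear(@data)    ? "dup\n" : "no dup\n";
```

Both versions use eq (directly or via hash lookup); what changes the running time on large inputs is the shape of the loops, not the choice of comparison operator.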

To illustrate, if using the length() function is one-millionth of a
second faster than using eq, it will only make a noticeable difference
if length() (or eq) is used (on the order of) one million times more
often than anything else (and then, the difference might only be one
second).  That is, if you want to check for the existence of an empty
string only five, one hundred, or even a thousand times in your code,
it really won't make a difference whether you use eq or length().
Theoretically, one method will be faster than the other, but you
couldn't time this difference with a stopwatch, even if you had faster
reflexes than anybody else in the world.  And like I mentioned above,
even Perl's Benchmark module has trouble perceiving this time
difference.

In my opinion, you should usually use the function/operation that is
more readable (and, of course, you have to decide for yourself which is
more readable).  If you spend two minutes converting the code to
something that is theoretically faster, you might not even save one
second of total running time (from every time you run the program).
And if it takes someone in the future three extra minutes to figure out
what you were trying to do, that's more than four minutes and 59
seconds wasted changing your code, thinking that your code will become
faster, better, cheaper.

I realize I wrote a lot about this subject, but to summarize, let me
say this:

Making code run faster almost always means eliminating the
bottlenecks.  Changing '$var eq ""' to '!length($var)' might make a
difference (probably super small) but it won't eliminate a bottleneck.

Here is a real-world analogy (if you like these kinds of things):

There is a ten-mile-long road that people drive their cars on.  Most
of this road has two lanes.  But for some reason, five miles along the
road, the two lanes merge into one lane, but only for 100 meters (after
which they become two lanes again).

Ordinarily this isn't a problem when there are few cars on the road.
As a car reaches the place where the two lanes become one, it switches
lanes (if needed), and then switches back when there are two lanes
again.

But during periods of heavy traffic, this lane merge causes a
bottleneck.  Multiple cars try to squeeze into one lane at the
same time, backing up traffic for miles.
This is unacceptable, and a solution must be found.

Someone might say that the speed limit should be raised from 55 mph
to 60 mph, because 60 mph is faster, and therefore more efficient, and
will make the cars move faster.  Another person might say to make the
stretch of road that only has one lane shorter so that there is more of
the road with two full lanes.

Their intentions are good, but neither of these solutions eliminates the
bottleneck, which is what is slowing down traffic.  A solution that is
much better than either of those just listed would be to insert a
second lane (where there is currently only one lane) for cars to use
instead of having to merge.  (In fact, you could even reduce the speed
limit to 50 mph with this solution and it would still work better than
the solution to only raise the speed limit to 60 mph!)

And while raising the speed limit to 60 mph sounds good, it won't
even save you a full minute when the bottleneck is present.  With the
bottleneck, the traffic might be backed up for hours, so just
eliminating one minute won't make all that much difference.  Eliminate
the bottleneck and hours of driving time will be saved, even when the
speed limit is significantly slower.

And that's why I think you shouldn't worry about whether you should
use eq or length().  Just go with the one that is more readable and
easier to maintain and understand, and you will end up saving more time
in the future by not having to figure some possibly convoluted code
that might not make much difference in the end at all.

This quote is widely attributed to Donald Knuth:

"Premature optimization is the root of all evil."

The point of the quote is that if you try to optimize a section of code
before you can prove that it needs to be optimized, you may end up
writing obfuscated, difficult-to-read code for nothing.
I hope this helps, Ian.

-- Jean-Luc Romano


jl_post@hotmail.com
12-23-04 02:13 AM


Re: Good practice to detect empty string?
>>>>> "jpc" == jl post@hotmail com <jl_post@hotmail.com> writes:

jpc> I'm not convinced that  $var eq ""  is necessarily more expensive
jpc> than  length($var) .  The reason I think this is because the eq
jpc> operator can report a false value as soon as it detects a character in
jpc> the variable it is examining, whereas the length() function must count
jpc> every single character in $var, even if $var is millions of characters
jpc> long.

why must length count all the chars? how will it know when the string
ends? does the string end in a zero byte? but perl strings can have any
binary data? so how does perl figure out the length of strings? hmmm.

<snip of overly massive tome on this subject>

jpc> This quote is widely attributed to Donald Knuth:

jpc> "Premature optimization is the root of all evil."

jpc> The point of the quote is that if you try to optimize a section of code
jpc> before you can prove that it needs to be optimized, you may end up
jpc> writing obfuscated, difficult-to-read code for nothing.
jpc> I hope this helps, Ian.

why didn't you just say that and cut out most of the rest (including
your comments on how length works in perl)?

uri

--
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org

Uri Guttman
12-23-04 02:13 AM


Re: Good practice to detect empty string?
jl_post@hotmail.com wrote:

> I'm not convinced that  $var eq ""  is necessarily more expensive
> than  length($var) .  The reason I think this is because the eq
> operator can report a false value as soon as it detects a character in
> the variable it is examining, whereas the length() function must count
> every single character in $var

Perl's length() function does not count characters.
The information is already present in the guts of a scalar value.
Therefore your reasoning is incorrect.
-Joe

Joe Smith
12-23-04 09:10 AM


Re: Good practice to detect empty string?
Uri Guttman wrote:
>
> why must length count all the chars? how will it know
> when the string ends? does the string end in a zero
> byte? but perl strings can have any binary data? so
> how does perl figure out the length of strings? hmmm.


Hmmm... I didn't think of that.  You bring up a good point.
Reflecting on what you just said, I'm remembering the Devel::Peek
module.  The Devel::Peek::Dump() function lists a string's length, so
I'm guessing that the length() function could probably get that
attribute from the same place that Devel::Peek::Dump() does.

Thanks for pointing that out.


> <snip of overly massive tome on this subject>
>
>   jpc> This quote is widely attributed to Donald Knuth:
>   jpc> "Premature optimization is the root of all evil."
>
> why didn't you just say that and cut out most of the rest (including
> your comments on how length works in perl)?


Since you asked, I'll explain.

This subject has come up several times with my peers, and I'm still
amazed what some programmers will favor in the name of efficiency and
speed.  For example, some people will refuse to ever use the line:

$i++;

when $i is just an integer.  Instead, they will say the code is wrong
unless it is written as:

++$i;

or:

$i += 1;

The reason they think that using the post-increment operator is wrong
is because it makes an extra copy that is never used (which is slower
and less efficient).

Now, they might have a point if $i is a blessed reference pointing
to a huge structure, but when $i is just an integer, using pre-increment
instead of post-increment won't make any noticeable difference.
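
A short sketch of the point, with variable names chosen only for illustration: as standalone statements the three forms leave the variable in the same state, and they differ only when the value of the expression itself is used.

```perl
use strict;
use warnings;

my ($post, $pre, $plus) = (0, 0, 0);

# As standalone statements, all three leave the variable in the same state.
$post++;
++$pre;
$plus += 1;
print "$post $pre $plus\n";   # prints "1 1 1"

# The forms only differ when the expression's value is used.
my $i   = 5;
my $old = $i++;               # $old gets the value before the increment
my $j   = 5;
my $new = ++$j;               # $new gets the value after the increment
print "$old $new\n";          # prints "5 6"
```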

But I've had people challenge me on this.  They say that if you're
writing code, it should be as efficient as possible because it could
get called in a very tight loop that gets called a large number of
times.

And while I agree that code should be efficient, I point out that if
the code they write is running slowly, changing a post-increment
operator to a (presumably faster) pre-increment operator isn't going to
speed up the program by any satisfying (or noticeable) amount.  What will
make the difference instead is to re-write any algorithms with a Big-O
notation of N-squared (or worse) to be ones that have a Big-O notation
of N log(N) (or better).

And no matter how many times I try to convince them that a
bottleneck won't be eliminated just by replacing something as trivial
as a post-increment operator with a pre-increment operator, the person
I'm talking with often ends the discussion with:  "Well... I'm still
going to use the more efficient code."  Unfortunately, all too often
that means that their code will be more difficult to read and
understand (for others, of course), especially when they omit comments
explaining what their code is attempting to do and why it was written
that way.  And often, their "more efficient" code is more bug-prone
than the equivalent "inferior, inefficient" code.

It seemed like you understood my point.  But a lot of people don't.
They hear a cute little quote like this one I read from
http://www-106.ibm.com/developerwor.../l-optperl.html :

>   All of this help, though, comes at a slight performance
>   cost. I keep warnings and strict on while programming
>   and debugging, and I switch it off once the script is
>   ready to be used in the real world. It won't save much,
>   but every millisecond counts.

I totally disagree with this (I won't go into the reasons why).  But my
point is that many people will read this and use this as their
manifesto not to use warnings and strict.

I can counteract with another cute quote, but I've found that if a
person has been swayed by a cute-sy quote, they generally won't get
swayed back by another.

By the way the original poster posted his message, he seemed to
think that the faster method was good while all the rest were bad!  He
may have obtained this notion the same way I did:  when a computer
science professor gave a lecture on operations and how expensive they
are and how they ultimately cost money.

To answer your question, one quote alone is usually not enough to
sway a person's beliefs, so I felt the need to back it up with a
real-world example and scenario in the hopes that it would educate the
original poster.

I didn't mean to offend you or any other poster on this newsgroup
with my long response, but it's a pet peeve of mine when others write
obfuscated code in the name of efficiency, particularly when the
amount of time saved from the total run-times of every run of the
"efficient" program amounts to less than a second.  That's why I felt
that a thorough response was in order.

I hope this makes sense, Uri.  (And thanks for pointing out that
thing about using length().)

-- Jean-Luc


jl_post@hotmail.com
12-23-04 09:10 AM


Re: Good practice to detect empty string?
Joe Smith wrote:
>
> Perl's length() function does not count characters.
> The information is already present in the guts of a
> scalar value.  Therefore your reasoning is incorrect.


I see I was wrong.  Thanks for pointing that out.

I realized later that I could see this information by using the
Devel::Peek module, like this:


> perl -MDevel::Peek -e "Dump('perl')"
SV = PV(0x225208) at 0x1823e98
REFCNT = 1
FLAGS = (PADBUSY,PADTMP,POK,READONLY,pPOK)
PV = 0x182ac34 "perl"\0
CUR = 4
LEN = 5


Again, thanks.

-- Jean-Luc


jl_post@hotmail.com
12-23-04 09:10 AM


Re: Good practice to detect empty string?
jl_post@hotmail.com wrote:
> This subject has come up several times with my peers, and I'm still
> amazed what some programmers will favor in the name of efficiency and
> speed.  For example, some people will refuse to ever use the line:
>
> $i++;
>
> when $i is just an integer.  Instead, they will say the code is wrong
> unless it is written as:
>
> ++$i;
>
> or:
>
> $i += 1;
>
> The reason they think that using the post-increment operator is wrong
> is because it makes an extra copy that is never used (which is slower
> and less efficient).

Tell them to take a class in basic compiler construction. Well, compile time
optimizations are an advanced topic, so they will have to take two classes.
But any compiler, that does not fold all three statements into the most
efficient form is not worth its money, even if it's free.

jue



Jürgen Exner
12-23-04 09:10 AM

