Home > Archive > PERL Miscellaneous > December 2004 > perl trim function
perl trim function
| Tassilo v. Parseval 2004-11-27, 8:55 am |
| Also sprach Shailesh Humbad:
> Does this look right? Any suggestions?
>
> http://www.somacon.com/blog/page14.php
Which is:
sub trimwhitespace($)
{
    my $string;
    $string = pop(@_);
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}
That looks right, yes. The argument handling is a bit clumsy, though.
The prototype of $ means the function always receives one argument,
never more, never less. The pop() suggests otherwise:
sub trimwhitespace ($) {
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}
Another thing you could do is add some convenience to the function, for
example by checking in which context it was called. If it was called in
void context, trim the argument in-place:
sub trim ($) {
    if (! defined wantarray) {
        $_[0] =~ s/^\s+//;
        $_[0] =~ s/\s+$//;
        return;
    }
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}
That will allow:
my $s1 = my $s2 = " \t string \n\n";
trim $s1; # $s1 trimmed in-place
$s2 = trim $s2; # returns trimmed string (does some copying)
But you could get fatal errors at runtime when doing this:
trim(" \t string \n\n");
In void context, the argument must not be read-only.
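
A minimal sketch of the failure mode described above, assuming a reasonably recent perl. The sub is the one from this post; the eval wrapper and the variable names $s, $copy and $ok are additions for illustration. The void-context call on a literal is expected to die with a "Modification of a read-only value attempted" error, which eval traps so the script can carry on:

```perl
use strict;
use warnings;

# The context-sensitive trim from the post above, with comments added.
sub trim ($) {
    if (! defined wantarray) {
        # Void context: edit the caller's variable through the @_ alias.
        $_[0] =~ s/^\s+//;
        $_[0] =~ s/\s+$//;
        return;
    }
    my $string = shift;    # any other context: work on a copy
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}

my $s = " \t string \n\n";
trim $s;                              # void context, $s is trimmed in place

# A literal is fine when the result is used: only the copy is modified.
my $copy = trim " \t string \n\n";

# Void-context call on a literal, wrapped in eval so a fatal error is trapped.
my $ok = eval { trim(" \t string \n\n"); 1 };
print $ok ? "no error\n" : "caught: $@";
```

Callers who cannot rule out read-only arguments can always use the value-returning form.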
Tassilo
--
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{re
htonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval
| |
| Tad McClellan 2004-11-27, 3:56 pm |
| Shailesh Humbad <noreply@nowhere.com> wrote:
> Does this look right? Any suggestions?
>
> http://www.somacon.com/blog/page14.php
The forward declaration is unnecessary when the sub is called
the way you are calling it.
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
| |
| Arndt Jonasson 2004-11-29, 4:05 pm |
|
"Tassilo v. Parseval" <tassilo.von.parseval@rwth-aachen.de> writes:
> sub trim ($) {
> if (! defined wantarray) {
> $_[0] =~ s/^\s+//;
> $_[0] =~ s/\s+$//;
> return;
> }
> my $string = shift;
> $string =~ s/^\s+//;
> $string =~ s/\s+$//;
> return $string;
> }
I think it's a good idea to always try to avoid duplicated code
(*). Using an auxiliary function and sending \$_[0] and \$string to
it, respectively, as below, seems to work. Since I'm new to Perl,
I have to ask: is it a good way? Are there other ways?
(If speed is important, one has to do benchmark tests and see if the
overhead of the extra function is acceptable.)
sub trim1 ($) {
    my $r = shift;
    $$r =~ s/^\s+//;
    $$r =~ s/\s+$//;
}
sub trim ($) {
    if (! defined wantarray) {
        trim1 \$_[0];
        return;
    }
    my $string = shift;
    trim1 \$string;
    return $string;
}
(*) Not that much can go wrong in this function as it stands, but if it
grows and changes semantics, the same logic will have to be entered in
two places every time, and one may easily miss something.
| |
| Uri Guttman 2004-11-29, 4:05 pm |
| >>>>> "AJ" == Arndt Jonasson <do-not-use@invalid.net> writes:
AJ> "Tassilo v. Parseval" <tassilo.von.parseval@rwth-aachen.de> writes:
AJ> sub trim1 ($) {
AJ> my $r = shift;
AJ> $$r =~ s/^\s+//;
AJ> $$r =~ s/\s+$//;
AJ> }
why pass in a ref when @_ is aliased to the args? you save a deref (if
you are looking for speed).
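
a rough sketch of that point, not code from the thread: the helper name trim_in_place is made up, and the prototypes are left off. the helper edits $_[0] directly, and because @_ aliasing carries through nested calls, passing $_[0] on still reaches the original caller's variable:

```perl
use strict;
use warnings;

# Hypothetical helper: edits its argument through the @_ alias, no reference.
sub trim_in_place {
    $_[0] =~ s/^\s+//;
    $_[0] =~ s/\s+$//;
    return;
}

sub trim {
    if (! defined wantarray) {
        trim_in_place($_[0]);    # $_[0] is still the caller's variable
        return;
    }
    my $string = shift;          # copy for the value-returning case
    trim_in_place($string);
    return $string;
}

my $s1 = my $s2 = " \t string \n\n";
trim($s1);                       # void context: $s1 is trimmed in place
my $trimmed = trim($s2);         # scalar context: $s2 is left alone
```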
AJ> sub trim ($) {
AJ> if (! defined wantarray) {
AJ> trim1 \$_[0];
AJ> return;
AJ> }
eww, prototypes! steer clear of them in general.
AJ> my $string = shift;
AJ> trim1 \$string;
AJ> return $string;
AJ> }
with a cleaner api you can easily fold that code and not need a helper
sub. something like (untested):
sub trim {
    my $str_ref = defined wantarray ? \"$_[0]" : \$_[0] ;
    $$str_ref =~ s/^\s+// ;
    $$str_ref =~ s/\s+$// ;
    return $$str_ref if defined wantarray ;
    return ;
}
there could be a better way to get a ref to a copy of $_[0] but that
should work ok.
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
| |
| Arndt Jonasson 2004-11-30, 9:04 am |
|
Uri Guttman <uri@stemsystems.com> writes:
> AJ> sub trim ($) {
> AJ> if (! defined wantarray) {
> AJ> trim1 \$_[0];
> AJ> return;
> AJ> }
>
> eww, prototypes! steer clear of them in general.
Oh. I thought they were meant to be useful, for example catching incorrect
calling usage. What are the arguments against them?
| |
| Anno Siegel 2004-11-30, 9:04 am |
| Uri Guttman <uri@stemsystems.com> wrote in comp.lang.perl.misc:
[...]
> sub trim {
>
> my $str_ref = defined wantarray ? \"$_[0]" : \$_[0] ;
>
> $$str_ref =~ s/^\s+// ;
> $$str_ref =~ s/\s+$// ;
>
> return $$str_ref if defined wantarray ;
> return ;
> }
>
> there could be a better way to get a ref to a copy of $_[0] but that
> should work ok.
That's the lack in Perl of an anonymous scalar ref constructor.
I don't see an attractive alternative to quoting (or $_[ 0] . '' and
equivalent). One could say "... ? my $x = shift : shift;" instead,
but that's neither cleaner nor clearer. Here's a ref-free
alternative:
sub trim {
    for ( defined wantarray ? my $x = shift : shift ) {
        s/^\s+//;
        s/\s+$//;
        return $_ if defined wantarray;
    }
}
Anno
| |
| Anno Siegel 2004-11-30, 9:04 am |
| Arndt Jonasson <do-not-use@invalid.net> wrote in comp.lang.perl.misc:
>
> Uri Guttman <uri@stemsystems.com> writes:
>
> Oh. I thought they were meant to be useful, for example catching incorrect
> calling usage. What are the arguments against them?
Tom Christiansen has written the "classical" paper on prototypes and
their drawbacks. Google for "tchrist prototypes".
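
As one small illustration of a commonly cited drawback (a sketch, not taken from that paper): a ($) prototype does not just check the argument count, it also forces scalar context on the argument. The sub names and the @words data below are made up for the example:

```perl
use strict;
use warnings;

# Same body twice; only the ($) prototype differs. Names are illustrative.
sub trim_proto ($) {
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}

sub trim_plain {
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}

my @words = ("  alpha  ", "  beta  ", "  gamma  ");

# ($) puts the argument in scalar context, so the sub receives the
# element count of @words instead of its first element.
my $from_proto = trim_proto(@words);

# Without a prototype the array is flattened and shift takes "  alpha  ".
my $from_plain = trim_plain(@words);

# Calling with & bypasses the prototype entirely.
my $from_amp = &trim_proto(@words);

print "prototype: '$from_proto'\n";   # prototype: '3'
print "plain:     '$from_plain'\n";   # plain:     'alpha'
print "ampersand: '$from_amp'\n";     # ampersand: 'alpha'
```

So the prototype silently changes what the caller's expression means, and it is skipped altogether for & calls.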
Anno
| |
| Uri Guttman 2004-11-30, 4:01 pm |
| >>>>> "AS" == Anno Siegel <anno4000@lublin.zrz.tu-berlin.de> writes:
AS> Uri Guttman <uri@stemsystems.com> wrote in comp.lang.perl.misc:
AS> [...]
AS> That's the lack in Perl of an anonymous scalar ref constructor.
yep.
AS> I don't see an attractive alternative to quoting (or $_[ 0] . '' and
AS> equivalent). One could say "... ? my $x = shift : shift;" instead,
AS> but that's neither cleaner nor clearer. Here's a ref-free
AS> alternative:
AS> sub trim {
AS> for ( defined wantarray ? my $x = shift : shift ) {
AS> s/^\s+//;
AS> s/\s+$//;
AS> return $_ if defined wantarray;
AS> }
AS> }
i like it but it took a while for my morning eyes to follow the aliasing
levels. i never realized that shift kept the same aliasing as $_[0] but
i rarely use shift on @_ or the aliasing like that. and we both have the
duplicate use of wantarray but i don't see any easy way around that.
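
a small sketch of the aliasing in question, with made-up sub names. shift hands back the very scalar the caller passed, so looping over it with for lets s/// change the caller's variable, while assigning the shifted value to a my variable makes a copy:

```perl
use strict;
use warnings;

# Hypothetical names, for illustration only.
sub rtrim_via_shift {
    for (shift) {          # $_ is aliased to the caller's variable
        s/\s+$//;
    }
    return;
}

sub rtrim_via_copy {
    my $copy = shift;      # assignment copies; the caller's variable is untouched
    $copy =~ s/\s+$//;
    return $copy;
}

my $x = "text   ";
my $y = "text   ";

rtrim_via_shift($x);               # $x is now "text"
my $result = rtrim_via_copy($y);   # $y keeps its trailing spaces

print "x='$x' y='$y' result='$result'\n";
```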
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
| |
| Tassilo v. Parseval 2004-11-30, 4:01 pm |
| Also sprach Uri Guttman:
> AS> I don't see an attractive alternative to quoting (or $_[ 0] . '' and
> AS> equivalent). One could say "... ? my $x = shift : shift;" instead,
> AS> but that's neither cleaner nor clearer. Here's a ref-free
> AS> alternative:
>
> AS> sub trim {
> AS> for ( defined wantarray ? my $x = shift : shift ) {
> AS> s/^\s+//;
> AS> s/\s+$//;
> AS> return $_ if defined wantarray;
> AS> }
> AS> }
>
> i like it but it took a while for my morning eyes to follow the aliasing
> levels. i never realized that shift kept the same aliasing as $_[0] but
> i rarely use shift on @_ or the aliasing like that. and we both have the
> duplicate use of wantarray but i don't see any easy way around that.
There is no need for the second wantarray check. It doesn't really harm
to return a value in void context. :-) As for the aliasing effect of
shift, I am not too much a fan of that either, so I'd write the whole
thing thusly:
sub trim ($) {
    local *_ = defined wantarray ? \(my $dummy = shift) : \$_[0];
    s/^\s+//;
    s/\s+$//;
    return $_;
}
Tassilo
--
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{re
htonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval
| |
| Uri Guttman 2004-11-30, 4:01 pm |
|
TvP> There is no need for the second wantarray check. It doesn't really harm
TvP> to return a value in void context. :-) As for the aliasing effect of
TvP> shift, I am not too much a fan of that either, so I'd write the whole
TvP> thing thusly:
TvP> sub trim ($) {
TvP> local *_ = defined wantarray ? \(my $dummy = shift) : \$_[0];
TvP> s/^\s+//;
TvP> s/\s+$//;
TvP> return $_;
TvP> }
and now we can drop the prototype and allow trim to work on $_ if there
are no arguments:
sub trim {
    local *_ = defined wantarray ? \(my $dummy = shift) : \$_[0] if @_ ;
    s/^\s+//;
    s/\s+$//;
    return $_;
}
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
| |
| Tassilo v. Parseval 2004-11-30, 4:01 pm |
| Also sprach Uri Guttman:
> TvP> sub trim ($) {
> TvP> local *_ = defined wantarray ? \(my $dummy = shift) : \$_[0];
> TvP> s/^\s+//;
> TvP> s/\s+$//;
> TvP> return $_;
> TvP> }
>
> and now we can drop the prototype and allow trim to work on $_ if there
> are no arguments:
>
> sub trim {
> local *_ = defined wantarray ? \(my $dummy = shift) : \$_[0] if @_ ;
> s/^\s+//;
> s/\s+$//;
> return $_;
> }
Even better. Amazing how a bunch of people can occupy their minds for
several days to find the best five lines of Perl code for a given minor
problem. :-)
Tassilo
--
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{re
htonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval
| |
| Uri Guttman 2004-11-30, 4:01 pm |
| >>>>> "TvP" == Tassilo v Parseval <tassilo.von.parseval@rwth-aachen.de> writes:
TvP> Even better. Amazing how a bunch of people can occupy their minds for
TvP> several days to find the best five lines of Perl code for a given minor
TvP> problem. :-)
but untested. i wonder if the local/if will work. a known similar thing
is my/if and that works by mistake. a workaround would be to use another
?: first with @_ and $_ and always localize $_.
local *_ = @_ ? defined wantarray ? \(my $dummy = shift) : \$_[0]
: \$_ ;
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
| |
| Tassilo v. Parseval 2004-11-30, 8:57 pm |
| Also sprach Uri Guttman:
>
>
> TvP> Even better. Amazing how a bunch of people can occupy their minds for
> TvP> several days to find the best five lines of Perl code for a given minor
> TvP> problem. :-)
>
> but untested. i wonder if the local/if will work. a known similar thing
> is my/if and that works by mistake. a workaround would be to use another
> ?: first with @_ and $_ and always localize $_.
>
> local *_ = @_ ? defined wantarray ? \(my $dummy = shift) : \$_[0]
> : \$_ ;
I don't think this is necessary. The problem with 'my' in a
statement-modifier stems from the fact that it's happening at
compiletime...it even happens when the modifier is false at compiletime
which is then optimized away as in
my $var if 0;
It probably works because the 'my $var' is processed before the whole
statement is optimized away. If the parser chose to reverse the order of
events for some reason, it would most likely stop working.
'local' propagates into the enclosing block at compiletime, too. But
unlike 'my', it is not used to create a new variable. So the line
local *_ = defined wantarray ? \(my $dummy = shift) : \$_[0] if @_ ;
works exactly as it should...and will continue to do so: assign to the
scalar slot of *_ after localizing it. You can scratch the whole line
and the function will continue working (only trimming $_, of course).
Quite unlike with a scratched modified 'my'. This will result in a
compiletime error.
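
For context, and as an assumption about motivation since the thread does not say so: the 'my $var if 0' construct was mostly used to get a variable that keeps its value between calls. Later perls deprecated it and, as far as I know, eventually made it a fatal error. A hedged sketch of two supported ways to get the same effect, with made-up sub names:

```perl
use strict;
use warnings;
use feature 'state';    # needs perl 5.10 or later

# Hypothetical counter: $count is initialised once and then persists.
sub next_id {
    state $count = 0;
    return ++$count;
}

# Closure-based equivalent that also works on older perls.
{
    my $count = 0;
    sub next_id_closure { return ++$count }
}

my @ids         = (next_id(), next_id(), next_id());
my @closure_ids = (next_id_closure(), next_id_closure());

print "@ids / @closure_ids\n";    # 1 2 3 / 1 2
```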
So I think the only things that should not be subject to statement
modifiers are things that have a visible compiletime effect. That would
be 'my', 'our' and I suspect 'use', too. As for the latter, it looks as
if perl doesn't like that at all:
use CGI if 0;
syntax error at - line 1, near "use CGI if"
Execution of - aborted due to compilation errors.
I wouldn't suggest any such practice, but I think it should at least get
past the parsing stage. IMHO, this could even be called a bug.
Tassilo
--
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{re
htonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval
| |
| Uri Guttman 2004-11-30, 8:57 pm |
| >>>>> "TvP" == Tassilo v Parseval <tassilo.von.parseval@rwth-aachen.de> writes:
TvP> I don't think this is necessary. The problem with 'my' in a
TvP> statement-modifier stems from the fact that it's happening at
TvP> compiletime...it even happens when the modifier is false at compiletime
TvP> which is then optimized away as in
as i said, untested :)
yeah, i can see local as being runtime so it isn't the same as my/if.
TvP> use CGI if 0;
TvP> syntax error at - line 1, near "use CGI if"
TvP> Execution of - aborted due to compilation errors.
there is a newish 'if' pragma that does that and i think it is core
now.
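
a short sketch of how the 'if' pragma is used, going by its documented form 'use if CONDITION, MODULE => ARGUMENTS'. the conditions and the module name No::Such::Module are invented for the example; List::Util is only there because it ships with perl:

```perl
use strict;
use warnings;

# Load List::Util and import sum() only when the condition is true.
# The condition is evaluated at compile time, like any use statement.
use if $] >= 5.006, 'List::Util' => qw(sum);

# False condition: the module is never required, so a name that does
# not exist (No::Such::Module is made up) causes no error.
use if $^O eq 'no-such-os', 'No::Such::Module';

print sum(1 .. 4), "\n";    # 10
```

one thing to watch: a bare number straight after 'use if' can be taken as a version requirement by the parser, so the condition is safer written as an expression.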
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
| |
| Anno Siegel 2004-11-30, 8:57 pm |
| Uri Guttman <uri@stemsystems.com> wrote in comp.lang.perl.misc:
> AS> Uri Guttman <uri@stemsystems.com> wrote in comp.lang.perl.misc:
[...]
> AS> That's the lack in Perl of an anonymous scalar ref constructor.
>
> yep.
>
> AS> I don't see an attractive alternative to quoting (or $_[ 0] . '' and
> AS> equivalent). One could say "... ? my $x = shift : shift;" instead,
> AS> but that's neither cleaner nor clearer. Here's a ref-free
> AS> alternative:
>
> AS> sub trim {
> AS> for ( defined wantarray ? my $x = shift : shift ) {
> AS> s/^\s+//;
> AS> s/\s+$//;
> AS> return $_ if defined wantarray;
> AS> }
> AS> }
>
> i like it but it took a while for my morning eyes to follow the aliasing
> levels. i never realized that shift kept the same aliasing as $_[0] but
> i rarely use shift on @_ or the aliasing like that.
I think I picked it up on clpm (the behavior of shift and pop, I mean),
years back. Can be useful, though infrequently since argument-modifying
subs are rare.
> and we both have the
> duplicate use of wantarray but i don't see any easy way around that.
I do: scratch the second one. What's the harm in returning $_ to
void context? :)
Anno
| |
| David Combs 2004-12-26, 3:56 am |
| In article <cohnrb$3k4$1@mamenchi.zrz.TU-Berlin.DE>,
Anno Siegel <anno4000@lublin.zrz.tu-berlin.de> wrote:
....
>
>Tom Christiansen has written the "classical" paper on prototypes and
>their drawbacks. Google for "tchrist prototypes".
google google-GROUPS for ...
David