[NEWBIE] newline question
| Jan Biel 2004-03-31, 1:34 pm |
| Hello!
From some tutorials on the web I managed to create a perl script which finds
and replaces certain occurrences in text files via regular expressions.
Then something happened which I cannot really explain, so I hope you can
clarify it for me.
The original perl script looks like this:
-------------------------------
$filein = 'a.txt';
$fileout = 'b.txt';
open(INFO, $filein);
open(INFO2, ">$fileout");
@lines = <INFO>;
grep(s/\n//g,@lines);
grep(s/ab/found/g,@lines);
print INFO2 @lines;
close(INFO);
close(INFO2);
--------------------------------
where a.txt is a file containing:
--------------------------------
a
b
c
--------------------------------
The resulting b.txt contains:
--------------------------------
abc
--------------------------------
So the second regular expression is ignored.
But if I write two perl scripts where each executes only one of the regular
expressions it works with the result:
--------------------------------
foundc
--------------------------------
as expected.
What is the mystery here?
I hope this wasn't too confusing :)
Janbiel
| |
| Richard Morse 2004-03-31, 1:34 pm |
| In article <c4eupc$mqe$1@ariadne.rz.tu-clausthal.de>,
"Jan Biel" <jan.biel@tu-clausthal.de> wrote:
> -------------------------------
> $filein = 'a.txt';
> $fileout = 'b.txt';
>
> open(INFO, $filein);
> open(INFO2, ">$fileout");
>
> @lines = <INFO>;
@lines = ( 'a\n', 'b\n', 'c\n' );
> grep(s/\n//g,@lines);
@lines = ( 'a', 'b', 'c');
> grep(s/ab/found/g,@lines);
At this point, no entry in @lines matched 'ab', so the substitute never
occurs.
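A minimal sketch of that point, assuming @lines holds the three one-character strings shown above (the counter $changed is only there for illustration):

```perl
use strict;
use warnings;

# Assumed stand-in for the lines of a.txt after the newlines are removed.
my @lines = ('a', 'b', 'c');

# s/// is applied to each element separately, so no single element
# ever contains 'ab' and nothing is replaced.
my $changed = 0;
for my $line (@lines) {
    $changed += ($line =~ s/ab/found/g) || 0;
}

print "substitutions: $changed\n";
print join('', @lines), "\n";
```

The elements only look like 'abc' once print writes them out back to back.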
Try this:
#!/usr/bin/perl
# always use these next two lines
use strict;
use warnings;
my $filein = 'a.txt';
my $fileout = 'b.txt';
open(my $in, "<", $filein) or die("Can't open $filein: $!");
# slurp all of the data into one string, since you really don't
# care about newline separations
my $data;
{
local $/;
$data = <$in>;
}
close($in);
# remove any newline characters
$data =~ s/\n//g;
# change 'ab' to 'found'
$data =~ s/ab/found/g;
# save the data
open(my $out, ">", $fileout) or die("Couldn't open >$fileout: $!");
print $out $data, "\n";
close($out);
__END__
HTH,
Ricky
--
Pukku
| |
| Gunnar Hjalmarsson 2004-03-31, 1:34 pm |
| Jan Biel wrote:
> The original perl script looks like this:
use strict; # Make Perl help you detect errors
use warnings; # likewise
> $filein = 'a.txt';
> $fileout = 'b.txt';
my $filein = 'a.txt';
my $fileout = 'b.txt';
----^^
Declare variables with my()
> open(INFO, $filein);
> open(INFO2, ">$fileout");
open INFO, $filein or die $!;
open INFO2, "> $fileout" or die $!;
----------------------------^^^^^^^^^^
Check if file was successfully opened
> @lines = <INFO>;
my @lines = <INFO>;
> grep(s/\n//g,@lines);
That works, but it's clearer written as:
@lines = map { tr/\n//d; $_ } @lines;
Now it's time for reflection. :)
@lines is an array, and at this point, it contains three elements. You
seem to want to concatenate the elements to a string. That can be
done like this:
my $string = join '', @lines;
> grep(s/ab/found/g,@lines);
That takes one element at a time, and replaces occurrences of the
string 'ab'. None of the elements contains that string, so nothing happens.
You can apply the s/// operator to $string instead:
$string =~ s/ab/found/g;
> print INFO2 @lines;
print INFO2 "$string\n";
> close(INFO);
> close(INFO2);
> --------------------------------
HTH
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
| |
| Gunnar Hjalmarsson 2004-03-31, 2:36 pm |
| Richard Morse wrote:
> In article <c4eupc$mqe$1@ariadne.rz.tu-clausthal.de>,
> "Jan Biel" <jan.biel@tu-clausthal.de> wrote:
>
> @lines = ( 'a\n', 'b\n', 'c\n' );
No, that wouldn't populate @lines with the same thing. This would:
@lines = ( "a\n", "b\n", "c\n" );
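A small sketch of the difference, with variable names made up for illustration: inside single quotes \n stays a backslash followed by the letter n, while inside double quotes it becomes one newline character.

```perl
use strict;
use warnings;

my $single = 'a\n';   # three characters: a, backslash, n
my $double = "a\n";   # two characters: a, newline

printf "single: %d chars, double: %d chars\n",
    length($single), length($double);

# s/\n//g only removes real newline characters, so the single-quoted
# string comes through unchanged.
(my $stripped = $single) =~ s/\n//g;
print "single-quoted after s/\\n//g still has ", length($stripped), " chars\n";
```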
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
| |
| Brian McCauley 2004-03-31, 2:36 pm |
| Gunnar Hjalmarsson <noreply@gunnar.cc> writes:
>
> That works, but it's clearer written as:
>
> @lines = map { tr/\n//d; $_ } @lines;
That works, but it's not clearer. Using tr/// rather than using s///
adds efficiency not clarity. Using "map" where you really want "for"
instead of using "grep" where you really want "for" is a neutral change.
Adding a redundant assignment is just obfuscation.
Clearer would be something like
tr/\n//d for @lines;
Or
s/\n//g for @lines;
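As a rough sketch of why the "for" form needs no assignment, assuming @lines holds what was read from a.txt ($removed is a made-up counter):

```perl
use strict;
use warnings;

my @lines = ("a\n", "b\n", "c\n");   # assumed contents of a.txt

# The statement modifier aliases $_ to each element in turn, so tr///
# edits @lines in place; its return value is the number of characters
# it deleted from that element.
my $removed = 0;
$removed += tr/\n//d for @lines;

print "removed $removed newlines: @lines\n";
```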
--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\
| |
| Gunnar Hjalmarsson 2004-03-31, 2:36 pm |
| Brian McCauley wrote:
> Gunnar Hjalmarsson <noreply@gunnar.cc> writes:
>
>
> That works, but it's not clearer.
Well, I could argue, but I won't. Let's just agree that I would have
done better to use 'for'. :)
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
| |
| Richard Morse 2004-03-31, 3:35 pm |
| In article <c4f1j1$2hkaud$1@ID-184292.news.uni-berlin.de>,
Gunnar Hjalmarsson <noreply@gunnar.cc> wrote:
> Richard Morse wrote:
>
> No, that wouldn't populate @lines with the same thing. This would:
>
> @lines = ( "a\n", "b\n", "c\n" );
Right. I knew that.
/me starts flipping through documentation
My bad,
Ricky
--
Pukku
| |
| Jan Biel 2004-03-31, 3:35 pm |
| Richard Morse wrote:
> Try this:
[...]
Thanks a lot.
I really appreciate the comments.
Always glad to learn decent style along with learning the basics of a
language.
Thanks,
Janbiel
| |
| Jan Biel 2004-03-31, 3:35 pm |
| Gunnar Hjalmarsson wrote:
> my $filein = 'a.txt';
> my $fileout = 'b.txt';
> ----^^
> Declare variables with my()
Is that a style hint or is it crucial to the script?
I'm always eager to learn coding good style, I was just wondering which it
is ;)
>
> That works, but it's clearer written as:
>
> @lines = map { tr/\n//d; $_ } @lines;
I guess I'll need to read some tutorials to get a better hang of the
language. Trial and error won't cut it with perl I guess. I wasted too many
hours trying to find out why my code doesn't work. RTFM would have helped I
guess.
Thanks a lot for your input. I appreciate it.
Janbiel
| |
| David K. Wall 2004-03-31, 3:35 pm |
| Gunnar Hjalmarsson <noreply@gunnar.cc> wrote:
> Brian McCauley wrote:
>
> Well, I could argue, but I won't. Let's just agree that I should
> better have used 'for'. :)
Since the OP was just stripping newlines from lines read from a file,
why not use this:
chomp( my @lines = <INFO> );
Did I miss something somewhere? Why use map() when chomp() is made for
this purpose?
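A self-contained sketch of that idiom. It reads from an in-memory filehandle (an assumption made here so that no a.txt is needed) and uses a lexical handle instead of INFO:

```perl
use strict;
use warnings;

# In-memory stand-in for a.txt so the sketch runs without any files.
my $content = "a\nb\nc\n";
open(my $in, '<', \$content) or die "Can't open in-memory file: $!";

# Read every line and strip the trailing newline from each in one step.
chomp(my @lines = <$in>);
close($in);

print scalar(@lines), " lines: ", join('', @lines), "\n";
```

Note that chomp only removes a trailing input record separator, whereas tr/\n//d deletes every newline in the string. For lines read this way the result is the same.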
| |
| Gunnar Hjalmarsson 2004-03-31, 5:37 pm |
| Jan Biel wrote:
> Gunnar Hjalmarsson wrote:
>
> Is that a style hint or is it crucial to the script?
Running the script with strictures enabled:
use strict;
is a very important style hint. Doing so requires that all variables
are declared, and normally you declare the variables lexically using my().
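A rough sketch of what strictures buy you. The misspelled name $filien is a deliberate, made-up typo, and the snippet is compiled through a string eval only so that the error can be caught and printed:

```perl
use strict;
use warnings;

# Compile a small snippet at run time; under strict the undeclared
# variable makes compilation fail and eval returns undef.
my $ok = eval q{
    use strict;
    my $filein = 'a.txt';
    $filien = 'b.txt';   # typo: never declared with my
    1;
};
my $err = $@;

print defined $ok ? "compiled\n" : "strict refused: $err";
```

Without "use strict", the typo would silently create a second, unrelated global variable.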
> I guess I'll need to read some tutorials to get a better hang of
> the language. Trial and error won't cut it with perl I guess. I
> wasted too many hours trying to find out why my code doesn't work.
> RTFM would have helped I guess.
Good thoughts. :)
> Thanks a lot for your input. I appreciate it.
You are welcome.
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
| |
| Gunnar Hjalmarsson 2004-03-31, 5:37 pm |
| David K. Wall wrote:
> Gunnar Hjalmarsson <noreply@gunnar.cc> wrote:
>
> Since the OP was just stripping newlines from lines read from a file,
> why not use this:
>
> chomp( my @lines = <INFO> );
>
> Did I miss something somewhere? Why use map() when chomp() is made for
> this purpose?
No, I think it was I who missed something (twice). (And Brian once.)
Thanks.
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
| |
| Randal L. Schwartz 2004-03-31, 5:37 pm |
| Gunnar> That works, but it's clearer written as:
Gunnar> @lines = map { tr/\n//d; $_ } @lines;
I don't consider that clearer. The loop first modifies @lines (via
the side effect of having changed $_ in the map block), then gathers
all those results together to create a new list, then assigns the
entire new list over the top of the identically updated list.
Weird. Definitely not clearer, and more dangerous too. Consider
the obvious cut-and-paste mangling:
@newlines = map { tr/\n//d; $_ } @lines;
Your copy of @lines and @newlines are identical, even though you might
expect @lines to remain unaffected!
Definitely bad. Definitely don't do this. Not without the required
BIG HONKIN COMMENT to the right describing how wasteful you are.
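A short sketch of the side effect, with a copy-first variant for comparison (@orig, @copy and $t are names invented for this example):

```perl
use strict;
use warnings;

my @lines    = ("a\n", "b\n");
my @newlines = map { tr/\n//d; $_ } @lines;

# $_ is an alias for each element, so @lines itself lost its newlines.
print "lines changed too\n" if $lines[0] eq 'a';

# Copying $_ before editing leaves the source array untouched.
my @orig = ("a\n", "b\n");
my @copy = map { (my $t = $_) =~ tr/\n//d; $t } @orig;

print "orig kept its newline\n" if $orig[0] eq "a\n";
print "copy: @copy\n";
```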
print "Just another Perl hacker,"; # yeah, the guy who invented this phrase
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
| |
| Tad McClellan 2004-03-31, 6:40 pm |
| Jan Biel <jan.biel@tu-clausthal.de> wrote:
> Gunnar Hjalmarsson wrote:
>
>
> Is that a style hint or is it crucial to the script?
It is a style hint (a really strong one).
It _may_ be crucial to _a_ script, but I don't think it is
crucial to the code quoted above. (dodged a bullet)
It may be crucial to having folks here decide between looking
at the code and moving on to the next post. :-)
> I'm always eager to learn coding good style, I was just wondering which it
> is ;)
Controlling the scope of variables is a general CS type topic, much
of it is the same regardless of the programming language being used.
Learn about scoping in Perl:
"Coping with Scoping":
http://perl.plover.com/FAQs/Namespaces.html
Without my(), it is a Global Variable, and everybody knows that the
indiscriminate use of Global Variables is bad design.
I like to encourage these preliminary "rules" regarding variable scope:
For a Perl Beginner:
Always prefer lexical (my) over package (our) variables,
except for the built-in variables.
For a Perl Intermediate:
Always prefer lexical (my) over package (our) variables,
except when your program is so big as to require breaking
it up into several files.
For Everyone Else:
Always prefer lexical (my) over package (our) variables,
except when you can't. (you'll know when you can't by this point)
:-)
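A minimal sketch of the lexical versus package distinction, using made-up names ($count, bump): the sub sees the package variable, while the my variable exists only inside its block.

```perl
use strict;
use warnings;

our $count = 0;           # package variable, visible throughout the package

sub bump { $count++ }     # refers to the package variable

{
    my $count = 100;      # lexical: hides the package variable in this block only
    bump();               # still increments the package variable
    print "inside block: $count\n";
}

print "package variable: $count\n";
```

The lexical is untouched by bump(), and it disappears at the closing brace.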
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas