Home > Archive > PERL Beginners > February 2006 > bug or am I not understanding?
bug or am I not understanding?
| Bruce Bowen 2006-02-20, 9:55 pm |
| I have posted questions related to this before and have received many suggestions.
I have a file with a varying number of records of varying lengths, delimited by commas.
022,D,092,000,004,034,000,001,000,000
023,D,031,000,000,000,000,002,000,000
024,@D ,025,000,001,900,900,093,093,900,255,065
,000,279,001,028,130,161,037,106,016,410
,017,410,255,255,255,255
026,>,255,255,255,255,255,255,255,255
027,E,034,093,106,396,396,396,396,007
According to the hex editor each line ends in a hex 0D 0A.
My question to the group is why would this line recognize that carriage return.
open STATE, "STATEFILE.txt" or die;
@lines = split /\n/, <STATE>;
and this not recognize the same character;
open STATE, "STATEFILE.txt" or die;
$state = <STATE>;
$a = index($state, "/\n");
Thanks
Bruce Bowen
| |
| Tom Phoenix 2006-02-21, 3:55 am |
| On 2/20/06, Bowen, Bruce <Bowenb@diebold.com> wrote:
> My question to the group is why would this line recognize that carriage return.
> @lines = split /\n/, <STATE>;
> and this not recognize the same character;
> $a = index($state, "/\n");
The second parameter you're passing to index() is a two-character
string. Is that your bug? (Are you also setting $/ to a non-default
value? Else I don't understand your intent.) Hope this helps!
--Tom Phoenix
Stonehenge Perl Training
| |
| Bruce Bowen 2006-02-21, 3:55 am |
|
-----Original Message-----
From: tom.phoenix@gmail.com [mailto:tom.phoenix@gmail.com]On Behalf Of
Tom Phoenix
Sent: Monday, February 20, 2006 10:50 PM
To: Bowen, Bruce
Cc: beginners@perl.org
Subject: Re: bug or am I not understanding?
On 2/20/06, Bowen, Bruce <Bowenb@diebold.com> wrote:
> My question to the group is why would this line recognize that carriage return.
> @lines = split /\n/, <STATE>;
> and this not recognize the same character;
> $a = index($state, "/\n");
The second parameter you're passing to index() is a two-character
string. Is that your bug? (Are you also setting $/ to a non-default
value? Else I don't understand your intent.) Hope this helps!
Would $a = index($state, /\n); then be the correct format to find the
end of the line or a carriage return? Because that gives errors. I'm
getting the impression you can use \n with some commands and not others.
Just that it doesn't seem to be documented anywhere.
TX,
Bruce
--Tom Phoenix
Stonehenge Perl Training
| |
| Chas Owens 2006-02-21, 3:55 am |
| On 2/20/06, Bowen, Bruce <Bowenb@diebold.com> wrote:
snip
> Would $a = index($state, /\n); then be the correct format to find the end of the line
> or a carriage return? Because that gives errors. I'm getting the impression you can
> use \n with some commands and not others. Just that it doesn't seem to be
> documented anywhere.
snip
The correct syntax is
$a = index($state, "\n");
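A minimal sketch of using that call to find the end of the first line and cut the record out with substr(). The contents of $state are a hypothetical sample modeled on the data in the original post, and the names $end and $first are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical sample: two records as they would look once read in text mode.
my $state = "022,D,092,000\n023,D,031,000\n";

# index() takes a plain string, so "\n" in double quotes is one newline.
my $end = index($state, "\n");
die "no newline found\n" if $end < 0;

# Everything before the newline is the first record.
my $first = substr($state, 0, $end);

print "newline at offset $end\n";   # 13
print "first record: $first\n";     # 022,D,092,000
```

index() returns -1 when the substring is absent, which is why the sketch checks for a negative result before calling substr().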
| |
| Tom Phoenix 2006-02-21, 3:55 am |
| On 2/20/06, Bowen, Bruce <Bowenb@diebold.com> wrote:
> Would $a = index($state, /\n); then be the correct format to find the end of
> the line or a carriage return? Because that gives errors. I'm getting the
> impression you can use \n with some commands and not others. Just
> that it doesn't seem to be documented anywhere.
Could you be conflating the syntax of the m// operator with general
quoting? See the "Quote and Quote-like Operators" section in the
perlop manpage for information about when you can use \n to mean a
newline character. (Mostly, that's anywhere quoted, except by single
quotes or qw{}.) The same manpage has information on m//. Cheers!
--Tom Phoenix
Stonehenge Perl Training
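The quoting rules mentioned above can be checked with a short sketch. The variable names here are hypothetical; the only behavior relied on is the documented difference between double quotes, single quotes, qw{}, and m//.

```perl
use strict;
use warnings;

my $double = "\n";     # double quotes: \n becomes one newline character
my $single = '\n';     # single quotes: a backslash followed by the letter n
my @words  = qw{\n};   # qw{} quotes like single quotes, so also two characters

printf "double: length %d\n", length($double);     # 1
printf "single: length %d\n", length($single);     # 2
printf "qw:     length %d\n", length($words[0]);   # 2

# Inside m//, \n is also a newline; the slashes only delimit the pattern.
print "regex matches the newline\n" if $double =~ /\n/;
```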
| |
| David Luke 2006-02-22, 6:56 pm |
| Hi Bruce,
Hex 0D 0A (Carriage Return Line Feed pair) is commonly used in Windows
to signify the end of a line of text. "\n" is supposed to generate an
end of line appropriate for the machine on which Perl is run. On
Windows, 0D 0A is ideal, but 0A often works the same. Hex 0A is also the
standard end of line marker for Unix systems, so a Unix "\n" can appear
to work with Windows data in many cases.
Depending upon the OS your Perl script is running under,
$a = index($state, "/\n");
will match Hex 2F 0A or Hex 2F 0D 0A.
David Luke, Systems Project Analyst
DMS Enterprise Information Technology Services
Building 4030, Suite 115
4050 Esplanade Way
Tallahassee, Florida 32399-0950
(850) 922-7587
-----Original Message-----
From: Bowen, Bruce [mailto:Bowenb@diebold.com]
Sent: Monday, February 20, 2006 10:38 PM
To: beginners@perl.org
Subject: bug or am I not understanding?
I have posted questions related to this before and have received many
suggestions.
I have a file with a varying number of records of varying lengths, delimited
by commas.
022,D,092,000,004,034,000,001,000,000
023,D,031,000,000,000,000,002,000,000
024,@D
,025,000,001,900,900,093,093,900,255,065
,000,279,001,028,130,161,037,106
,016,410,017,410,255,255,255,255
026,>,255,255,255,255,255,255,255,255
027,E,034,093,106,396,396,396,396,007
According to the hex editor each line ends in a hex 0D 0A.
My question to the group is why would this line recognize that carriage
return.
open STATE, "STATEFILE.txt" or die;
@lines = split /\n/, <STATE>;
and this not recognize the same character;
open STATE, "STATEFILE.txt" or die;
$state = <STATE>;
$a = index($state, "/\n");
Thanks
Bruce Bowen
| |
| Xavier Noria 2006-02-22, 6:56 pm |
| On Feb 22, 2006, at 18:52, Luke, David wrote:
> Depending upon the OS your Perl script is running under,
>
> $a = index($state, "/\n");
>
> will match Hex 2F 0A or Hex 2F 0D 0A.
Since the meaning of that "match" is a bit ambiguous to me, let me be
maybe a bit redundant to be sure the OP gets this right and clear:
length "\n" is == 1 always. Everywhere. No matter the OS newline
convention. "\n" is eq "\012" in all supported platforms except in
MacPerl (pre-OSX), where it is eq "\015".
In text mode the I/O subsystem is responsible for transforming native
newlines up to "\n" while reading, and "\n" down to native newlines
while writing. That is transparent to the programmer but is the key
to understanding problems when line-oriented programs deal with text
that does not follow the conventions of the runtime platform.
-- fxn
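A rough sketch of that translation, assuming a Perl built with PerlIO layers. It writes one hypothetical CRLF-terminated record to a temporary file, then reads it back twice: once with the :raw layer to see the bytes on disk, and once with the :crlf layer, which is used here explicitly to stand in for Windows text mode so the result is the same on any platform.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write one hypothetical record with a CRLF ending, byte for byte.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "022,D,092\r\n";
close $out or die "close failed: $!";

# :raw shows what is on disk.
open my $raw, '<:raw', $path or die "open failed: $!";
my $raw_line = <$raw>;
close $raw;

# :crlf translates CRLF to "\n" while reading, as Windows text mode does.
open my $text, '<:crlf', $path or die "open failed: $!";
my $text_line = <$text>;
close $text;

printf "raw length:     %d\n", length($raw_line);    # 11, CR and LF both present
printf "text length:    %d\n", length($text_line);   # 10, CRLF became "\n"
printf "length of \\n:   %d\n", length("\n");         # 1
```

With the translation in place, the script only ever sees a single "\n" at the end of each line, which is the string to search for.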
| |
| Xavier Noria 2006-02-22, 6:56 pm |
| On Feb 22, 2006, at 21:08, Bowen, Bruce wrote:
> Thanks, I'm running under Windows XP. It's strange that with the
> exact same file where the system can't find the index /\n or "/\n"
> it has no problem finding that when used with the split command.
> Many people have indicated that the index should work, but all I
> get when I print out the $a is -1.
>
> This is more a question of is this the way it should be or should
> this be considered a bug. In fact after careful review of my code,
> using the split method gave much better results with way shorter code.
I have not followed the thread in detail but the point is:
"\n" in Windows XP is exactly "\012". When a text file has the
conventions of Windows and your line-oriented script is running in
Windows (not in Cygwin) you don't see CRLF pairs in the script, you
see just "\012"s.
Now in the examples you sent the split was on the regexp /\n/,
whereas the index is on "/\n". You shouldn't expect them to do the
same because the latter requires a slash before the newline, whereas
the former only splits on newlines, with no additional constraint.
You seem to assume /\n/ in the first example carries the initial
slash, but it doesn't; the regexp has only one character, which on
your machine is exactly "\012". Both slashes are delimiters; they
are not part of the regexp itself.
-- fxn
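The difference can be seen side by side in a small sketch. The string in $state is a hypothetical stand-in for file contents that have already been translated to "\n" line endings; $slash and $plain are illustrative names.

```perl
use strict;
use warnings;

# Hypothetical contents: two records, each ending in a single "\n".
my $state = "022,D,092\n023,D,031\n";

my @lines = split /\n/, $state;      # pattern is one character, the newline
my $slash = index($state, "/\n");    # string is two characters: "/" then newline
my $plain = index($state, "\n");     # string is the newline alone

print scalar(@lines), " lines\n";            # 2
print "index with slash:    $slash\n";       # -1, no "/" precedes a newline
print "index without slash: $plain\n";       # 9
```

The data contains no "/" character at all, so the two-character search can never succeed, while split and the one-character index() both find the newline.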