Author Windows paths in glob
Dmitry

2008-03-31, 8:42 am

OK, so there's a well-known difficulty with handling Windows-style paths in glob: it doesn't
like backslashes, nor does it like spaces. One solution to that is to use Unix-style paths:

glob('C:\Documents and Settings\*'); # Doesn't work
glob('C:/Documents\ and\ Settings/*'); # Works

Problem is, the rest of Perl's built-in file-handling functionality behaves the other way around.
For instance, with -d:

-d 'C:\Documents and Settings'; # Works
-d 'C:/Documents\ and\ Settings'; # Doesn't work

Question: is there any way to use the same path string with glob and with the rest of Perl,
without having to convert them back and forth?

John W. Krahn

2008-03-31, 8:42 am

Dmitry wrote:
> OK, so there's a well-known difficulty with handling Windows-style paths in glob: it doesn't
> like backslashes, nor does it like spaces. One solution to that is to use Unix-style paths:
>
> glob('C:\Documents and Settings\*'); # Doesn't work
> glob('C:/Documents\ and\ Settings/*'); # Works
>
> Problem is, the rest of Perl's built-in file-handling functionality behaves the other way around.
> For instance, with -d:
>
> -d 'C:\Documents and Settings'; # Works
> -d 'C:/Documents\ and\ Settings'; # Doesn't work
>
> Question: is there any way to use the same path string with glob and with the rest of Perl,
> without having to convert them back and forth?


perldoc File::DosGlob
perldoc File::Spec
perldoc File::Basename


John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall

Martijn Lievaart

2008-03-31, 8:42 am

On Sun, 30 Mar 2008 19:09:18 +0000, Dmitry wrote:

> OK, so there's a well-known difficulty with handling Windows-style paths
> in glob: it doesn't like backslashes, nor does it like spaces. One
> solution to that is to use Unix-style paths:
>
> glob('C:\Documents and Settings\*'); # Doesn't work
> glob('C:/Documents\ and\ Settings/*'); # Works
>
> Problem is, the rest of Perl's built-in file-handling functionality
> behaves the other way around. For instance, with -d:
>
> -d 'C:\Documents and Settings'; # Works
> -d 'C:/Documents\ and\ Settings'; # Doesn't work
>
> Question: is there any way to use the same path string with glob and
> with the rest of Perl, without having to convert them back and forth?


I don't have Windows to test here, but I recall that using either a
forward slash '/' or a backward slash -- properly escaped -- '\' works
either way in both situations.

In the examples you gave, the versions with backslashes cannot work, the
backslashes are not escaped.

M4

Gunnar Hjalmarsson

2008-03-31, 8:42 am

Dmitry wrote:
> OK, so there's a well-known difficulty with handling Windows-style paths in glob: it doesn't
> like backslashes, nor does it like spaces. One solution to that is to use Unix-style paths:
>
> glob('C:\Documents and Settings\*'); # Doesn't work
> glob('C:/Documents\ and\ Settings/*'); # Works
>
> Problem is, the rest of Perl's built-in file-handling functionality behaves the other way around.
> For instance, with -d:
>
> -d 'C:\Documents and Settings'; # Works
> -d 'C:/Documents\ and\ Settings'; # Doesn't work
>
> Question: is there any way to use the same path string with glob and with the rest of Perl,
> without having to convert them back and forth?


A long time ago I decided to use opendir() and readdir() instead of
glob(). It may not be as 'elegant', but it works flawlessly without
escaping spaces.
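
A minimal sketch of the opendir()/readdir() approach, for illustration only. The helper name list_dir is made up here, and File::Spec is used only to rebuild full paths. Because the directory name never passes through glob, spaces and backslashes in it need no escaping.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: list the entries of a directory without glob,
# so the path can be used exactly as it is passed to -d and friends.
sub list_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open '$dir': $!";
    # readdir also returns the '.' and '..' pseudo-entries; drop them.
    my @names = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    # Rebuild full paths using the platform's own conventions.
    return map { File::Spec->catfile($dir, $_) } sort @names;
}
```

On Windows this would presumably be called as list_dir('C:\Documents and Settings') or list_dir('C:/Documents and Settings'); both strings are also what -d accepts.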

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Peter J. Holzer

2008-03-31, 8:42 am

On 2008-03-30 19:27, Martijn Lievaart <m@rtij.nl.invlalid> wrote:
> On Sun, 30 Mar 2008 19:09:18 +0000, Dmitry wrote:

I didn't expect that but on second thought it makes sense.

>
> I don't have Windows to test here, but I recall that using either a
> forward slash '/' or a backward slash -- properly escaped -- '\' works
> either way in both situations.


You misunderstood the problem. The problem is that glob patterns, like
regexps, are mini-languages where some characters (or sequences of
characters) have a special meaning. Just as you cannot just use any
string as a regexp and expect it to match itself (or even be a
well-formed regexp) you cannot use any filename as a glob pattern and
expect it to expand to itself. Actually, for globs the situation is
worse: While any string can be converted to a regexp matching that
string, this is not true for globs. Spaces can be escaped with a
backslash, but I didn't find any way to escape an asterisk or question
mark.

So I guess Gunnar's advice is the best: If you need to deal with
arbitrary file and directory names, avoid glob and use opendir/readdir.
Or maybe File::Find or a similar module (which uses opendir/readdir
internally).
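
To illustrate the regexp half of that comparison: quotemeta (or \Q...\E inside a pattern) turns any string into a regexp that matches exactly that string. The file names below are invented for the example, and names containing * or ? can only exist on non-Windows file systems.

```perl
use strict;
use warnings;

# A (made-up) file name full of characters that are special to both
# glob patterns and regexps.
my $name = 'report*final?.txt';

# quotemeta backslash-escapes every non-word character, so the
# resulting pattern matches the original string literally.
my $re = quotemeta $name;

my @entries = ('report*final?.txt', 'reportXXfinalY.txt', 'report.txt');
my @exact   = grep { /\A$re\z/ } @entries;

print "$_\n" for @exact;    # prints only: report*final?.txt
```

So filtering the output of readdir with such a regexp is safe for arbitrary names, which is what makes the opendir/readdir route robust.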

hp

Ben Morrow

2008-03-31, 8:42 am


Quoth Gunnar Hjalmarsson <noreply@gunnar.cc>:
>
> A long time ago I decided to use opendir() and readdir() instead of
> glob(). It may not be as 'elegant', but it works flawlessly without
> escaping spaces.


To save Uri the trouble of pointing it out :), File::Slurp now has a
read_dir function.

Ben

Uri Guttman

2008-03-31, 8:42 am

>>>>> "BM" == Ben Morrow <ben@morrow.me.uk> writes:

BM> Quoth Gunnar Hjalmarsson <noreply@gunnar.cc>:

BM> To save Uri the trouble of pointing it out :), File::Slurp now has a
BM> read_dir function.

it has always had a read_dir function! its advantages are a simpler API
(no need for a handle, opendir, closedir calls) and it filters out . and
.. for you. a minor disadvantage (and very minor IMO) is that it can't
iterate in scalar mode to give you one dir entry at a time. that would
only matter if your dir was enormous and i mean very big.

future plans include passing in a regex or code ref to filter for
you. yeah, you can use grep on the output but it is slightly shorter
that way.

uri

--
Uri Guttman ------ uri@stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Free Perl Training --- http://perlhunter.com/college.html ---------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------

Dmitry

2008-03-31, 8:42 am

"John W. Krahn" <someone@example.com> wrote in
news:cGRHj.9264$9X3.7583@edtnps82:

> Dmitry wrote:
>
> perldoc File::DosGlob
> perldoc File::Spec
> perldoc File::Basename


I tried DosGlob, but when I passed it 'C:\Documents and Settings\*' it bugged out with an
error somewhere in the module...

Dmitry

2008-03-31, 8:42 am

Martijn Lievaart <m@rtij.nl.invlalid> wrote in news:pan.2008.03.30.19.27.16@rtij.nl.invlalid:

> On Sun, 30 Mar 2008 19:09:18 +0000, Dmitry wrote:
>
>
> I don't have Windows to test here, but I recall that using either a
> forward slash '/' or a backward slash -- properly escaped -- '\' works
> either way in both situations.
>
> In the examples you gave, the versions with backslashes cannot work, the
> backslashes are not escaped.
>
> M4


Spaces are a more serious problem than slashes. But anyway, the examples work,
because I used single quotes. BTW, current core glob seems to ignore backslashes
altogether, unless they escape something other than a backslash.

Dmitry

2008-03-31, 8:42 am

Gunnar Hjalmarsson <noreply@gunnar.cc> wrote in
news:65aa8vF2euim1U1@mid.individual.net:

> Dmitry wrote:
>
> A long time ago I decided to use opendir() and readdir() instead of
> glob(). It may not be as 'elegant', but it works flawlessly without
> escaping spaces.
>


OK, thanks. I guess if I wanted to process wildcards in the file name, I would pass them
through grep?

szr

2008-03-31, 8:42 am

Dmitry wrote:
> OK, so there's a well-known difficulty with handling Windows-style
> paths in glob: it doesn't like backslashes, nor does it like spaces.
> One solution to that is to use Unix-style paths:
>
> glob('C:\Documents and Settings\*'); # Doesn't work
> glob('C:/Documents\ and\ Settings/*'); # Works
>
> Problem is, the rest of Perl's built-in file-handling functionality
> behaves the other way around. For instance, with -d:
>
> -d 'C:\Documents and Settings'; # Works
> -d 'C:/Documents\ and\ Settings'; # Doesn't work
>
> Question: is there any way to use the same path string with glob and
> with the rest of Perl, without having to convert them back and forth?


I find that, just as in general under Win32, putting double quotes around the
path gets around problems like this:

C:\>perl -e "my @d = glob('"""C:/Documents and Settings"""/*'); print qq{\n}, join(qq{\n}, @d), qq{\n};"

C:/Documents and Settings/Administrator
C:/Documents and Settings/All Users
[...]

*** Note that """, when used in a double quoted string, under the
cmd.exe shell yields a literal ", so the glob statement is effectively:

glob('"C:/Documents and Settings"/*');

*** This is only because the command was run from the command line; in
an actual script you would of course use a normal double quote around
the path (just like in the linux examples below.)


And this works for tests like -d as well:

C:\>perl -e "print int (-d """C:/Documents and Settings""")"
1
C:\>perl -e "print int (-d """C:/123Documents and Settings""")"
0


And this form works under linux as well:

$ perl -e 'my @d = glob(q{"/mnt/samba/win_hd/Documents and Settings"/*}); print qq{\n}, join(qq{\n}, @d), qq{\n};'

/mnt/samba/win_hd/Documents and Settings/Administrator
/mnt/samba/win_hd/Documents and Settings/All Users

$ perl -e 'print int (-d "/mnt/samba/win_hd/Documents and Settings")'
1
$ perl -e 'print int (-d "/mnt/samba/win_hd/123Documents and Settings")'
0

This was tested under ActivePerl 5.6.1 and 5.8.7, and under linux using
5.10.0, 5.8.8, and 5.6.1.


So if you want to do it in a way that works on most platforms (at the
very least windows and *nix),

1) Use a forward slash, not a backslash, as a path delimiter.
E.g., C:/path to/somewhere/file.ext, and

2) Surround the path with quotes.
E.g., "C:/path to/somewhere/a long filename.ext", or
"C:/path to/somewhere"/file.ext, or
"C:/Documents and Settings/"

and you should be fine.
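
In script form, the two rules above might look like the sketch below. It builds a throwaway directory with File::Temp as a stand-in for "C:/Documents and Settings", so the layout and names are assumptions for the demo. It was written with a Unix-like system in mind and has not been checked on Windows.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Throwaway directory whose name contains spaces (a stand-in for
# "C:/Documents and Settings").
my $base = tempdir(CLEANUP => 1);
my $dir  = "$base/Documents and Settings";
mkdir $dir or die "mkdir failed: $!";
for my $user ('Administrator', 'All Users') {
    mkdir "$dir/$user" or die "mkdir failed: $!";
}

# File tests take the plain string, spaces and all.
my $is_dir = -d $dir ? 1 : 0;

# For glob, put double quotes around the directory part *inside* the
# pattern so the spaces are not treated as pattern separators.
my @entries = glob(qq{"$dir"/*});

print "$is_dir\n";
print "$_\n" for @entries;
```

The same $dir string serves both calls; only the glob pattern gets the extra quotes added at the call site.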

Hope this helps.

--
szr


Joe Smith

2008-03-31, 8:42 am

Dmitry wrote:

> OK, thanks. I guess if I wanted to process wildcards in the file name, I would pass them
> through grep?


Yes, after converting wildcard characters into regex characters, of course.
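
A rough sketch of that conversion, assuming only * and ? need to act as wildcards (character classes and braces are not handled). The helper names wildcard_to_regex and match_in_dir are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: translate a simple wildcard into an anchored
# regex. '*' becomes '.*', '?' becomes '.', and every other character
# is escaped with quotemeta so it matches literally.
sub wildcard_to_regex {
    my ($pattern) = @_;
    my $re = join '', map {
          $_ eq '*' ? '.*'
        : $_ eq '?' ? '.'
        :             quotemeta($_)
    } split //, $pattern;
    return qr/\A$re\z/s;
}

# Hypothetical helper: names in $dir matching $pattern, via readdir
# and grep rather than glob.
sub match_in_dir {
    my ($dir, $pattern) = @_;
    my $re = wildcard_to_regex($pattern);
    opendir(my $dh, $dir) or die "Cannot open '$dir': $!";
    my @hits = sort grep { $_ ne '.' && $_ ne '..' && $_ =~ $re } readdir($dh);
    closedir($dh);
    return @hits;
}
```

For example, match_in_dir('C:/Documents and Settings', '*.txt') would return the matching names without the directory ever being parsed as a glob pattern. On Windows, where file names are case-insensitive, adding the /i flag to the regex is probably wanted.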

Bart Lateur

2008-03-31, 9:11 pm

Dmitry wrote:

>Question: is there any way to use the same path string with glob and with the rest of Perl,
>without having to convert them back and forth?


Is a simple conversion acceptable?

If you put double quotes around the path *in* the string for glob, then
it'll work.

($\, $,) = ("\n", "\t");
chdir 'c:/temp';
foreach('C:/Documents and Settings', 'C:\\Documents and Settings') {
print $_, glob(qq("$_")), -d $_ || 0;
}

Result:

C:/Documents and Settings C:/Documents and Settings 1
C:\Documents and Settings C:./Documents and Settings 1

Well, ok... the response of glob to a backslash *is* weird. But at
least, it seems to work.
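
A different route, not suggested in the thread itself and offered here only as a sketch: the core File::Glob module also provides bsd_glob, which takes a single pattern and does not split it on whitespace, so no extra quotes are needed. The temporary directory below is a stand-in for the real path, and this was written with a Unix-like system in mind. How backslash separators behave on Windows is described in the File::Glob documentation and should be checked there.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Throwaway directory with spaces in its name (stand-in for
# "C:/Documents and Settings").
my $base = tempdir(CLEANUP => 1);
my $dir  = "$base/Documents and Settings";
mkdir $dir or die "mkdir failed: $!";
mkdir "$dir/All Users" or die "mkdir failed: $!";

# bsd_glob treats its argument as exactly one pattern, so whitespace
# is not a separator and the unquoted string works as-is.
my @entries = bsd_glob("$dir/*");

# The very same string is used for the file test.
my $is_dir = -d $dir ? 1 : 0;

print "$is_dir\n";
print "$_\n" for @entries;
```

Glob metacharacters such as * and ? inside the directory name would still be interpreted, so this helps with spaces but not with fully arbitrary names.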

--
Bart.

Ben Morrow

2008-03-31, 9:12 pm


Quoth Bart Lateur <bart.lateur@pandora.be>:
>
> Result:
>
> C:/Documents and Settings C:/Documents and Settings 1
> C:\Documents and Settings C:./Documents and Settings 1
>
> Well, ok... the response of glob to a backslash *is* weird. But at
> least, it seems to work.


Not just weird: wrong. Win32 has a notion of 'current directory on a
given drive'; C:./Documents and Settings is a path relative to the
current directory on drive C:.

Ben
