newbie regexp question: how to extract just the filename from a canonical file/pathna
James Calivar

2005-06-10, 3:59 am

Hello,

I've been trying to figure out how to use the regular expressions in Perl to
strip the leading path name information from a "canonical" path/filename and
just leave me with the filename. For example, if I have C:/Program
Files/Microsoft Office Tools/excel.exe, I would like to extract just
"excel.exe".

I've tried stuff like /\/(.)+(.){3}/ but that matches the *first* forward
slash, not the last one. Maybe I need to be using "revfind" (or whatever
that "reverse" find string manipulator operation is called)?

TIA

James


A. Sinan Unur

2005-06-10, 3:59 am

"James Calivar" <amheiserbush@yahoo.com.au> wrote in
news:Ja7qe.1967$VK4.739@newsread1.news.atl.earthlink.net:

> I've been trying to figure out how to use the regular expressions in
> Perl to strip the leading path name information from a "canonical"
> path/filename and just leave me with the filename. For example, if I
> have C:/Program Files/Microsoft Office Tools/excel.exe, I would like
> to extract just "excel.exe".


You have been using the wrong tool.

For the general problem of extracting a substring after the last
occurrence of a character, use substr and rindex:

perldoc -f rindex
perldoc -f substr

On the other hand, in this particular case, you should use a module that
is tailored for the particular problem you are trying to solve:
File::Basename. See:

perldoc File::Basename

Using this module will help you avoid unnecessary portability issues.

D:\Home> cat tt.pl
#! /usr/bin/perl

use strict;
use warnings;

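my $path = q{C:/Program Files/Microsoft Office Tools/excel.exe};

# General approach: take everything after the last '/'.
my $filename = substr $path, 1 + rindex $path, '/';
print "Filename = $filename\n";

# Purpose-built approach: let File::Basename deal with the path syntax.
use File::Basename;
$filename = fileparse $path;
print "Filename = $filename\n";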

__END__

D:\Home> tt
Filename = excel.exe
Filename = excel.exe
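
As a rough sketch of where the two approaches differ, the snippet below runs both on a few edge cases. The helper name tail_after_last and the extra sample paths are made up for illustration, and fileparse_set_fstype('MSWin32') is called only so that the module applies Windows path rules regardless of the OS the script runs on.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse fileparse_set_fstype);

# Hypothetical helper: return everything after the last occurrence of $sep.
# rindex returns -1 when $sep is absent; in that case the whole string is
# returned unchanged.
sub tail_after_last {
    my ($string, $sep) = @_;
    my $pos = rindex($string, $sep);
    return $pos < 0 ? $string : substr($string, $pos + length($sep));
}

# Assumption: we want Windows path rules even on a non-Windows host.
fileparse_set_fstype('MSWin32');

for my $path ('C:/Program Files/Microsoft Office Tools/excel.exe',
              'C:\\Program Files\\excel.exe',
              'C:foo.bar') {
    my $by_rindex = tail_after_last($path, '/');

    # In list context fileparse returns (name, directory, suffix); the
    # suffix pattern here is the one suggested in the module's documentation.
    my ($name, $dir, $suffix) = fileparse($path, qr/\.[^.]*/);

    print "$path\n";
    print "  rindex on '/': $by_rindex\n";
    print "  fileparse:     $name$suffix\n";
}
```

The rindex version only knows about the one separator it is given, so the backslash and drive-letter-only paths come back whole, while fileparse still finds the filename in all three.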

Sinan

--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(reverse each component and remove .invalid for email address)

comp.lang.perl.misc guidelines on the WWW:
http://mail.augustmail.com/~tadmc/c...guidelines.html

James Calivar

2005-06-10, 3:59 am

"A. Sinan Unur" <1usa@llenroc.ude.invalid> wrote in message
news:Xns9670EA474B4BDasu1cornelledu@127.0.0.1...
> "James Calivar" <amheiserbush@yahoo.com.au> wrote in
> news:Ja7qe.1967$VK4.739@newsread1.news.atl.earthlink.net:
>
>
> You have been using the wrong tool.
>
> For the general problem of extracting a substring after the last
> occurrence of a character, use substr and rindex:
>
> perldoc -f rindex
> perldoc -f substr
>
> On the other hand, in this particular case, you should use a module that
> is tailored for the particular problem you are trying to solve:
> File::Basename. See:
>
> perldoc File::Basename
>


Thanks for the quick response, folks! I didn't think to look for a
dedicated module.

James


axel@white-eagle.invalid.uk

2005-06-10, 8:56 am

James Calivar <amheiserbush@yahoo.com.au> wrote:
> I've been trying to figure out how to use the regular expressions in Perl to
> strip the leading path name information from a "canonical" path/filename and
> just leave me with the filename. For example, if I have C:/Program
> Files/Microsoft Office Tools/excel.exe, I would like to extract just
> "excel.exe".


> I've tried stuff like /\/(.)+(.){3}/ but that matches the *first* forward
> slash, not the last one.


Yes. Because that is what you are specifying in your RE.
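
To make that concrete, here is a small sketch that runs the original pattern against the sample path and reports what it actually matched. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $path = 'C:/Program Files/Microsoft Office Tools/excel.exe';

# The regex engine reports the leftmost position where a match can start.
# The first '/' is at offset 2, and (.)+ is greedy, so it runs to the end
# of the string and then backtracks just far enough for (.){3} to match.
my ($start, $matched, $last1, $last2);
if ($path =~ /\/(.)+(.){3}/) {
    $start   = $-[0];                                # offset where the match began
    $matched = substr($path, $-[0], $+[0] - $-[0]);  # the whole matched text
    ($last1, $last2) = ($1, $2);  # a quantified group keeps only its last repetition
}

print "match starts at offset $start\n";   # 2
print "matched text: $matched\n";          # /Program Files/Microsoft Office Tools/excel.exe
print "group 1: '$last1', group 2: '$last2'\n";   # '.' and 'e'
```

So the pattern does reach the end of the string, but it starts at the first slash, and the capture groups hold single characters rather than the filename.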

Essentially you are wanting to capture everything after the last '/'.
So, something like this:

my $instr = "C:/Program Files/Microsoft Office Tools/excel.exe";
$instr =~ m#([^/]*)$#;
print $1;

yields

excel.exe

(and an empty string if the file/pathname ends in '/').

However this will not work with a path/filename that has a drive letter but
no directory separators (e.g. 'C:foo.bar').
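
One possible way around that, as a sketch only: also exclude the backslash and the colon from the captured part. This assumes Windows-style names, where ':' can only follow a drive letter; on Unix ':' is a legal filename character, so the same pattern would cut such names short. The sub name below is made up for the example.

```perl
use strict;
use warnings;

# Assumption: Windows-style paths, so '/', '\' and ':' all end the
# directory part. \z anchors at the very end of the string.
sub name_after_last_separator {
    my ($path) = @_;
    my ($name) = $path =~ m{([^/\\:]*)\z};
    return $name;
}

for my $path ('C:/Program Files/Microsoft Office Tools/excel.exe',
              'C:\\Program Files\\excel.exe',
              'C:foo.bar') {
    my $name = name_after_last_separator($path);
    print "$name\n";
}
```

This prints excel.exe, excel.exe and foo.bar. For anything beyond a quick script, File::Basename is still the safer choice.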

Axel

James Calivar

2005-06-10, 3:58 pm

<axel@white-eagle.invalid.uk> wrote in message
news:p0eqe.22787$bl3.9466@fe1.news.blueyonder.co.uk...
> James Calivar <amheiserbush@yahoo.com.au> wrote:
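> > I've been trying to figure out how to use the regular expressions in Perl to
> > strip the leading path name information from a "canonical" path/filename and
> > just leave me with the filename. For example, if I have C:/Program
> > Files/Microsoft Office Tools/excel.exe, I would like to extract just
> > "excel.exe".
>
> > I've tried stuff like /\/(.)+(.){3}/ but that matches the *first* forward
> > slash, not the last one.
>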
> Yes. Because that is what you are specifying in your RE.
>
> Essentially you are wanting to capture everything after the last '/'.
> So, something like this:
>
> my $instr = "C:/Program Files/Microsoft Office Tools/excel.exe";
> $instr =~ m#([^/]*)$#;
> print $1;
>
> yields
>
> excel.exe
>
> (and an empty string if the file/pathname ends in '/').
>
> However this will not work with a path/filename that has a drive letter but
> no directory separators (e.g. 'C:foo.bar').
>


Ah, gotcha! My interpretation of the regular expression you gave is (hope
this is right): "Anything that is not a forward slash, repeated zero or more
times, rooted to the end-of-line".
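
That reading can be written straight into the pattern with the /x modifier, which lets a regex carry whitespace and comments. A minimal sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my $instr = "C:/Program Files/Microsoft Office Tools/excel.exe";

# Same pattern as above, spelled out piece by piece.
my ($filename) = $instr =~ m{
    (          # start of the capture group
      [^/]*    # any character that is not a forward slash, zero or more times
    )          # end of the capture group
    $          # anchored at the end of the string
}x;

print "$filename\n";   # excel.exe

# With a trailing slash there is nothing left to capture.
my ($empty) = "C:/Program Files/" =~ m{([^/]*)$};
print "empty capture\n" if $empty eq '';
```

One detail beyond the interpretation above: without the /m modifier, $ matches at the very end of the string or just before a final newline; \z matches only at the very end.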

Thanks

James


