| Author |
noob question: Trying to extract part of a string in a variable to another variable
|
|
| cayenne 2004-04-25, 12:39 pm |
| Hello all,
I'm a perl noob...and just can't quite figure out how to do something
that should be pretty simple.
Here's an example.
I have $mail_address = 'fred jones <fred_jones@somewhere.com>'
I want to use regular expressions to just parse out the userid here of
fred_jones
I'm trying things like this:
$mail_address =~ /\w+@/;
But, doesn't seem to work. I'm a little hazy on exactly how the =~
works...through examples I've successfully used it for substitutions
like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
expression and extract it to the variable...or even to another
variable leaving $mail_address unchanged.
I've looked in books at the substr() function, but, I don't know how
to use regular expressions to find the offset point, etc.
Can someone give me an example...or pointers to a good reference on
this type of thing?
Thanks in advance,
chilecayenne
| |
| Bob Walton 2004-04-25, 1:33 pm |
| cayenne wrote:
....
> I have $mail_address = 'fred jones <fred_jones@somewhere.com>'
>
> I want to use regular expressions to just parse out the userid here of
> fred_jones
....
> Can someone give me an example...or pointers to a good reference on
> this type of thing?
....
> chilecayenne
>
Try:
my($userid)=$mail_address=~/(\w+)@/;
References:
perldoc perlre
perldoc perlretut
perldoc perlop
The books: "Learning Perl (3rd edition)", "Programming Perl (3rd
edition)" and "Mastering Regular Expressions (2nd edition)".
Online: learn.perl.org, www.perl.com, www.perldoc.com
--
Bob Walton
Email: http://bwalton.com/cgi-bin/emailbob.pl
| |
| Jürgen Exner 2004-04-25, 1:33 pm |
| cayenne wrote:
> Here's an example.
>
> I have $mail_address = 'fred jones <fred_jones@somewhere.com>'
>
> I want to use regular expressions to just parse out the userid here of
> fred_jones
>
> I'm trying things like this:
>
> $mail_address =~ /\w+@/;
>
> But, doesn't seem to work.
Please define "doesn't seem to work". What exactly do you expect that
statement to do and what do you observe instead? Like, what do you mean by
"parse out"? Do you want to remove the userid from the string? Or do you
want to capture the userid in a different variable?
> I'm a little hazy on exactly how the =~
> works...
It is the binding operator. When it is used, the substitution or match is
applied to the variable on its left side instead of to the default $_.
> through examples I've successfully used it for substitutions
> like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
> expression and extract it to the variable...or even to another
> variable leaving $mail_address unchanged.
Well, Perl regular expressions do that automatically. Just use grouping:
my $mail_address = 'fred jones <fred_jones@somewhere.com>';
$mail_address =~ /(\w+)@/;
print $1;
For further details see "perldoc perlretut", or for the advanced part
"perldoc perlre".
However, I hope you are aware that '\w' does not even begin to cover the
full set of possible email aliases.
Please see "perldoc -q valid", third paragraph for further information.
> I've looked in books at the substr() function, but, I don't know how
> to use regular expressions to find the offset point, etc.
You don't. You would use index() to find the position of a character or
string in a text.
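A minimal sketch of the index()/substr() route, assuming the userid always sits between the first '<' and the first '@' (an assumption that holds for the sample string but not for addresses in general):

```perl
use strict;
use warnings;

my $mail_address = 'fred jones <fred_jones@somewhere.com>';

# index() returns the zero-based position of the first occurrence,
# or -1 if the substring is not present.
my $start = index($mail_address, '<');
my $at    = index($mail_address, '@');

my $userid;
if ($start >= 0 && $at > $start) {
    # Take everything between '<' and '@'.
    $userid = substr($mail_address, $start + 1, $at - $start - 1);
}

print "Userid = [$userid]\n" if defined $userid;
```

No regular expression is involved here, so $mail_address is left unchanged and nothing depends on $1.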
jue
| |
| Web Surfer 2004-04-25, 1:33 pm |
| [This followup was posted to comp.lang.perl.misc]
In article <2deb3d1.0404250759.7676bbb5@posting.google.com>,
chilecayenne@yahoo.com says...
> Hello all,
> I'm a perl noob...and just can't quite figure out how to do something
> that should be pretty simple.
>
> Here's an example.
>
> I have $mail_address = 'fred jones <fred_jones@somewhere.com>'
>
> I want to use regular expressions to just parse out the userid here of
> fred_jones
>
> I'm trying things like this:
>
> $mail_address =~ /\w+@/;
>
> But, doesn't seem to work. I'm a little hazy on exactly how the =~
> works...through examples I've successfully used it for substitutions
> like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
> expression and extract it to the variable...or even to another
> variable leaving $mail_address unchanged.
>
> I've looked in books at the substr() function, but, I don't know how
> to use regular expressions to find the offset point, etc.
>
> Can someone give me an example...or pointers to a good reference on
> this type of thing?
>
> Thanks in advance,
>
> chilecayenne
>
#!/usr/bin/perl -w
use strict;
my ( $mail_address , $userid );
$mail_address = 'fred jones <fred_jones@somewhere.com>';
$mail_address =~ /(\w+)@/;
$userid = $1;
print "Userid = [$userid]\n";
exit 0;
| |
|
| "cayenne" <chilecayenne@yahoo.com> wrote in message
news:2deb3d1.0404250759.7676bbb5@posting.google.com...
> I'm trying things like this:
>
> $mail_address =~ /\w+@/;
>
> But, doesn't seem to work.
'doesn't seem to work' does not tell us anything
except that you expected it to do something other
than what it does. many of us have negligible PSI
powers, so it helps us not a lot.
on the other hand, maybe what you want is:
my ($id)= $mail_address =~ /(\w+)@/;
>
> I've looked in books at the substr() function, but, I don't know how
> to use regular expressions to find the offset point, etc.
>
> Can someone give me an example...or pointers to a good reference on
> this type of thing?
take a look at the perl documentation:
perldoc perlop
perldoc perlre
gnari
| |
| Tad McClellan 2004-04-25, 3:32 pm |
| Jürgen Exner <jurgenex@hotmail.com> wrote:
> Just use grouping:
>
> my $mail_address = 'fred jones <fred_jones@somewhere.com>';
> $mail_address =~ /(\w+)@/;
> print $1;
But don't use it like that!
You should never use the dollar-digit variables without first ensuring
that the match *succeeded*.
if ( $mail_address =~ /(\w+)@/ ) {
print $1;
}
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
| |
| Tad McClellan 2004-04-25, 4:35 pm |
| Web Surfer <raisin@delete-this-trash.mts.net> wrote:
> $mail_address =~ /(\w+)@/;
> $userid = $1;
What is with this epidemic of teaching the WRONG way in this thread?
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
| |
| Tad McClellan 2004-04-27, 1:52 am |
| Milo Minderbinder <noMail@fmail.com> wrote:
[ snip full-quote, please don't do that]
> you have to mark the part you want to get.
>
> $mail_address =~ m/(\w+?)@/;
> $name = $1;
>
> Take brackets to mark what you want. You will find the result in $1.
^^^^
^^^^
No, you *might* find the result in $1.
If you've tested that the match *succeeded*,
_then_ you will find the result in $1.
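One way to combine the test and the capture in a single statement is sketched below. The sub name extract_userid is made up for illustration, and \w+ is kept only because it is the pattern used in this thread:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the userid, or undef if the pattern fails.
sub extract_userid {
    my ($address) = @_;

    # A list assignment tested as a condition yields the number of
    # elements the match returned, which is 0 when the match fails.
    if ( my ($userid) = $address =~ /(\w+)@/ ) {
        return $userid;
    }
    return;
}

my $userid = extract_userid('fred jones <fred_jones@somewhere.com>');
print defined $userid ? "Userid = [$userid]\n" : "No userid found\n";
```

Because the capture is assigned directly, a stale $1 left over from some earlier successful match can never leak into the result.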
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
| |
|
|
"cayenne" <chilecayenne@yahoo.com> wrote in message
news:2deb3d1.0404250759.7676bbb5@posting.google.com...
> Hello all,
> I'm a perl noob...and just can't quite figure out how to do something
> that should be pretty simple.
>
> Here's an example.
>
> I have $mail_address = 'fred jones <fred_jones@somewhere.com>'
>
> I want to use regular expressions to just parse out the userid here of
> fred_jones
>
> I'm trying things like this:
>
> $mail_address =~ /\w+@/;
>
> But, doesn't seem to work. I'm a little hazy on exactly how the =~
> works...through examples I've successfully used it for substitutions
> like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
> expression and extract it to the variable...or even to another
> variable leaving $mail_address unchanged.
>
> I've looked in books at the substr() function, but, I don't know how
> to use regular expressions to find the offset point, etc.
>
> Can someone give me an example...or pointers to a good reference on
> this type of thing?
>
> Thanks in advance,
>
> chilecayenne
Regular expressions are not the right way to find the offset unless you want
to use $1 and $2 and $3...etc, and then use index, and even then it still isn't
an optimal way to find the offset point. Just change your regular expression to
look like the other code. Man, I'm so tired.
-Robin
| |
| Sherm Pendley 2004-04-27, 1:53 am |
| Robin wrote:
> Regular expressions are not the right way to find the offset unless you
> want to use $1 and $2 and $3...etc, and then use index, it still isn't an
> optimal way to find the offset point.
Darn right it's not. If your pattern has subexpressions, then on a match the
offset of each subexpression appears in the @- array. That is, $-[0] holds the
offset of the whole match, the offset of $1 is in $-[1], $2 is in $-[2], and
so forth.
Note that offsets, no matter how they're found, are irrelevant to the
original question anyway. All he wanted was the value of the matched
substring, not its position. He was thinking he might need an offset to get
the substring, but he was barking in the wrong forest with that idea.
So tell me Robin, when are you going to stop posting nonsense answers to
questions you don't understand?
sherm--
--
Cocoa programming in Perl: http://camelbones.sourceforge.net
Hire me! My resume: http://www.dot-app.org
| |
| Anno Siegel 2004-04-27, 1:53 am |
| Jürgen Exner <jurgenex@hotmail.com> wrote in comp.lang.perl.misc:
> cayenne wrote:
[...]
>
> You don't.
Ah, but you do, though not in this case. The @- and @+ arrays are
there to support it.
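For the record, a small sketch of what those arrays hold after a successful match, using the sample string from the original post. $-[0] and $+[0] delimit the whole match, while $-[1] and $+[1] delimit $1:

```perl
use strict;
use warnings;

my $mail_address = 'fred jones <fred_jones@somewhere.com>';

my ($start, $end, $userid);
if ( $mail_address =~ /(\w+)@/ ) {
    # Start offset of group 1, and the offset just past its end.
    ($start, $end) = ($-[1], $+[1]);

    # The same text that $1 holds, recovered through substr().
    $userid = substr($mail_address, $start, $end - $start);

    printf "group 1 spans %d to %d: [%s]\n", $start, $end, $userid;
}
```

The offsets are only meaningful inside the branch where the match succeeded, for the same reason $1 is.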
Anno
| |
| Richard Morse 2004-04-27, 2:06 am |
| In article <2deb3d1.0404250759.7676bbb5@posting.google.com>,
chilecayenne@yahoo.com (cayenne) wrote:
> I have $mail_address = 'fred jones <fred_jones@somewhere.com>'
>
> I want to use regular expressions to just parse out the userid here of
> fred_jones
>
> I'm trying things like this:
>
> $mail_address =~ /\w+@/;
What you seem to be asking for is this:
my ($user_id) = ($mail_address =~ m/(\w+)@/);
However, please note that \w doesn't really have the complete set of
valid characters to prefix the '@' sign in an email address.
Just off the top of my head, I know that '.', '-', '?', '=', and more
are valid. Possibly any unicode character other than whitespace and '@'
are valid. It might even be valid to have '<' in an email address.
At the very least, you probably want
my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);
HTH,
Ricky
--
Pukku
| |
| Glenn Jackman 2004-04-27, 2:06 am |
| Richard Morse <remorse@partners.org> wrote:
> At the very least, you probably want
>
> my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);
Be careful where you use '-' inside a range:
Invalid [] range ".-+" before HERE mark in regex m/([\w.-+ << HERE =]+)@/
Put the hyphen last: [\w.+=-]
--
Glenn Jackman
NCF Sy min
glennj@ncf.ca
| |
| Tad McClellan 2004-04-27, 2:10 am |
| Glenn Jackman <xx087@freenet.carleton.ca> wrote:
> Put the hyphen last: [\w.+=-]
Or first.
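A quick sketch of the placements that compile, run against a made-up address containing '.', '-' and '+'. Escaping the hyphen is a third option:

```perl
use strict;
use warnings;

# Hypothetical address, chosen to exercise '.', '-' and '+'.
my $mail_address = 'fred jones <fred.jones-x+tag@somewhere.com>';

# In each class the hyphen is a literal '-', not a range operator.
my ($last)    = $mail_address =~ /([\w.+=-]+)@/;    # hyphen last
my ($first)   = $mail_address =~ /([-\w.+=]+)@/;    # hyphen first
my ($escaped) = $mail_address =~ /([\w.\-+=]+)@/;   # hyphen escaped

print "$_\n" for $last, $first, $escaped;
```

All three classes accept the same characters, so the three captures are identical.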
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
| |
| cayenne 2004-05-19, 2:31 pm |
| Richard Morse <remorse@partners.org> wrote in message news:<remorse-79A20D.12400626042004@plato.harvard.edu>...
> In article <2deb3d1.0404250759.7676bbb5@posting.google.com>,
> chilecayenne@yahoo.com (cayenne) wrote:
>
>
> What you seem to be asking for is this:
>
> my ($user_id) = ($mail_address =~ m/(\w+)@/);
>
> However, please note that \w doesn't really have the complete set of
> valid characters to prefix the '@' sign in an email address.
>
> Just off the top of my head, I know that '.', '-', '?', '=', and more
> are valid. Possibly any unicode character other than whitespace and '@'
> are valid. It might even be valid to have '<' in an email address.
>
> At the very least, you probably want
>
> my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);
>
> HTH,
> Ricky
Just quickly, can you explain the extensive use of parens here? I
understand the () in the regular expression, to keep those parts of the
match...but, what is the function of the () around $user_id and the
entire part after the = sign?
Thanks in advance,
CC
| |
| Richard Morse 2004-05-19, 3:32 pm |
| In article <2deb3d1.0405190945.f888fa9@posting.google.com>,
chilecayenne@yahoo.com (cayenne) wrote:
> Richard Morse <remorse@partners.org> wrote in message
> news:<remorse-79A20D.12400626042004@plato.harvard.edu>...
>
>
> Just quickly, can you explain the extensive use of parens here? I
> understand the () in the regular expression, to keep those parts of the
> match...but, what is the function of the () around $user_id and the
> entire part after the = sign?
Parens around $user_id force the match to happen in list context. A
match in scalar context returns only true or false (1 or the empty
string), while in list context it returns the captured substrings.
my $user_id = ($mail_address =~ m/.../)
would set $user_id to 1 (true, because the match succeeded) rather than
to the captured text.
The parens around the match are there because it makes it easier for me
to read it. I've never not put them there, although a quick test I just
did seems to indicate that they aren't necessary.
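A short sketch of the difference, using the sample string from the original post (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $mail_address = 'fred jones <fred_jones@somewhere.com>';

# List context: the match returns the captured substrings.
my ($in_list) = $mail_address =~ /(\w+)@/;

# Scalar context: the match returns 1 (true) on success ...
my $in_scalar = $mail_address =~ /(\w+)@/;

# ... and the empty string (false) on failure.
my $failed = 'no at sign' =~ /(\w+)@/;

print "list: [$in_list] scalar: [$in_scalar] failed: [$failed]\n";
```

Only the parens on the left-hand side change the context; the ones around the match itself are purely for readability.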
HTH,
Ricky
--
Pukku
|