Regex Matching & Grouping
Rodrick Brown

2007-04-20, 3:58 am

I'm testing a regex against this sample data:

Chicago Bulls 55 66
Miami Heat 13 44
Alpha Dogs 22 48
Hornets 84 22
Celtics 22 24

while($line = <FH> )
{
    $line =~ m/((\w+\s+)?\w+)\s+(\d+)\s+(\d+)/;
    ($team,$wins,$losses) = ($1, $3, $4);
    print $team;
}

I came up with the regex above to map team names, wins and losses into 3
variables.
I'm having a hard time understanding how regex capture groups work, i.e.
exactly which group ends up in $1.

Thanks in advance.

--
Rodrick R. Brown
http://www.rodrickbrown.com

Chas Owens

2007-04-20, 3:58 am

Capture groups are numbered by the position of their opening parenthesis,
left to right, so an outer group comes before the groups nested inside it:

$_ = "abcdefghijk";
/((.)((.)(.)))(.)/;
print "$1 -- $2 -- $3 -- $4 -- $5 -- $6\n";

prints abc -- a -- bc -- b -- c -- d
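
Applying the same rule to the regex from the question may make this
concrete. The sketch below is only an illustration: the helper name
groups_for is made up here, and the two sample lines are taken from the
data in the question.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the four
# capture groups of the regex from the question as a list.
sub groups_for {
    my ($line) = @_;
    # Opening parentheses, left to right:
    #   group 1: ((\w+\s+)?\w+)   the whole team name
    #   group 2: (\w+\s+)         optional first word plus trailing spaces
    #   group 3: (\d+)            wins
    #   group 4: (\d+)            losses
    return $line =~ m/((\w+\s+)?\w+)\s+(\d+)\s+(\d+)/;
}

for my $line ("Chicago Bulls 55 66", "Hornets 84 22") {
    my @groups = groups_for($line);
    # Group 2 is undef when the optional first word took no part in the match.
    print join(" -- ", map { defined $_ ? "[$_]" : "undef" } @groups), "\n";
}
```

For a one-word team name the inner group takes no part in the match, so
$2 is undefined while $1 still holds the full name. That is why the
original code had to skip from $1 to $3.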

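But it looks like what you really need is non-capturing grouping. This
is done by putting ?: at the start of the group: (?:pattern). The inner
group then no longer takes a number, so the team name, wins and losses
land in $1, $2 and $3. Also, you should be declaring your variables with
my, and you need to check whether your regex matched before you use the
capture variables. Testing whether $line is defined in the while
condition is optional, since Perl adds that test implicitly for
while ($line = <FH>), but writing it out does no harm.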

#!/usr/bin/perl

use strict;
use warnings;

while (defined(my $line = <DATA>)) {
    unless ($line =~ m{
        (                 #capture $1
            (?:\w+\s+)?   #optional word followed by spaces
            \w+           #required word
        )
        \s+               #space separator
        (\d+)             #capture $2 an integer
        \s+               #space separator
        (\d+)             #capture $3 an integer
    }x) {
        warn "could not match";
        next;
    }

    my ($team, $wins, $losses) = ($1, $2, $3);
    print "team $team wins $wins losses $losses\n";
}

__DATA__
Chicago Bulls 55 66
Miami Heat 13 44
Alpha Dogs 22 48
Malformed foo bar baz
Hornets 84 22
Celtics 22 24

Or the more compact version:

#!/usr/bin/perl

use strict;
use warnings;

while (defined(my $line = <DATA>)) {
    my ($team, $wins, $losses) = $line =~ /((?:\w+\s+)?\w+)\s+(\d+)\s+(\d+)/
        or do { warn "could not match"; next; };
    print "team $team wins $wins losses $losses\n";
}

__DATA__
Chicago Bulls 55 66
Miami Heat 13 44
Alpha Dogs 22 48
Malformed foo bar baz
Hornets 84 22
Celtics 22 24
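
The compact form relies on a match in list context returning its capture
groups in order. As a rough illustration of why the inner group has to be
non-capturing for the three variables to line up, the sketch below runs
both variants on one line of the sample data. The variable names are made
up for this example.

```perl
use strict;
use warnings;

my $line = "Chicago Bulls 55 66";

# Capturing inner group: the match returns four values, so assigning to
# three variables would put the inner group into the second one.
my @with_inner = $line =~ /((\w+\s+)?\w+)\s+(\d+)\s+(\d+)/;

# Non-capturing inner group: exactly team, wins, losses.
my @without_inner = $line =~ /((?:\w+\s+)?\w+)\s+(\d+)\s+(\d+)/;

print scalar(@with_inner), " values: ", join("|", @with_inner), "\n";
print scalar(@without_inner), " values: ", join("|", @without_inner), "\n";
```

With the capturing variant, $wins would receive "Chicago " instead of 55,
which is the mismatch the (?:...) form avoids.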

Copyright 2008 codecomments.com