Home > Archive > PERL Beginners > January 2006 > negative match

Author negative match
Adriano Allora

2006-01-20, 6:59 pm

Hi to all,

I cannot get a negative match to work, and I cannot understand why. Can
someone help me?

I have these four rows (for instance):

araba ADJ arabo
arabo ADJ arabo
arabo NOM arabo
arano VER:pres arare

and, with this regular expression, I would like to extract only the fourth one:

my $form1 = qw(ara\w+);
my $pos1 = qw([A-Z]+);
my $lemma1 = qw(?!arabo);
my $pattern = "^(?:$form1)[^A-Z]*($pos1)[^A-Z]*($lemma1)\n";

but it doesn't work (the script extracts all the lines).

ANOTHER QUESTION: is there a module to merge different arrays and
extract the values they have in common, or do I have to use a foreach statement?

Thanks to all,

alladr



|^|_|^|_|^| |^|_|^|_|^|
| | | |
| | | |
| |*\_/*\_/*\_/*\_/*\_/* | |
| |
| |
| |
| http://www.e-allora.net |
| |
| |
**************************************

Gerard Robin

2006-01-20, 6:59 pm

On Fri, Jan 20, 2006 at 04:22:28PM +0100, Adriano Allora wrote:
>From: Adriano Allora <all.adr@e-allora.net>


>I've got this four rows (for instance):
>
>araba ADJ arabo
>arabo ADJ arabo
>arabo NOM arabo
>arano VER:pres arare
>
>and, with this regular expression, I would extract only the fourth one:
>
>my $form1 = qw(ara\w+);
>my $pos1 = qw([A-Z]+);
>my $lemma1 = qw(?!arabo);
>my $pattern = "^(?:$form1)[^A-Z]*($pos1)[^A-Z]*($lemma1)\n";
>
>but it doesn't work (the script extracts all the lines).
>

If I understand your question correctly, maybe this can help you:

#!/usr/bin/perl
# match1.pl

use warnings;
use strict;

my @col4;

while (<DATA>) {
    if (/\w+ +\w+ +\w+ +(\w+)/ or /\w+ +\w+:\w+ +(\w+)/) {
        push @col4, $1 if $1;
    }
}
print "@col4\n";

__DATA__
araba ADJ arabo manolo
arabo ADJ arabo issan
arabo NOM arabo bonobo
arano VER:pres arare


-- 
Gérard

Shawn Corey

2006-01-20, 6:59 pm

Adriano Allora wrote:
> hi to all,
>
> I cannot use a negative match, and I cannot understand why: someone may
> help me?
>
> I've got this four rows (for instance):
>
> araba ADJ arabo
> arabo ADJ arabo
> arabo NOM arabo
> arano VER:pres arare
>
> and, with this regular expression, I would extract only the fourth one:
>
> my $form1 = qw(ara\w+);
> my $pos1 = qw([A-Z]+);
> my $lemma1 = qw(?!arabo);
> my $pattern = "^(?:$form1)[^A-Z]*($pos1)[^A-Z]*($lemma1)\n";
>
> but it doesn't work (the script extracts all the lines).
>


From `perldoc perlre`:

"(?!pattern)"
A zero-width negative look-ahead assertion.

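This means $lemma1 matches a string of zero length. By the time the
regex engine reaches it, the preceding [^A-Z]* has already consumed the
lemma, so the assertion is tested right before the newline, where
"arabo" never follows. It therefore ALWAYS succeeds.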

Your problem is that you are trying to do too much with a single
pattern. You should avoid this because it makes understanding, and
therefore maintenance, difficult. Try:

if( /^ara/ && ! /arabo\n$/ && /([A-Z]+)/ ){
    print "\$1 = $1\n";
    print "yes: $_";
}else{
    print "no : $_";
}
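
For comparison, a single pattern can also work if the look-ahead is placed before the lemma rather than after it. The following is only a sketch: it assumes the fields are separated by whitespace and that the tag may carry a suffix such as ":pres". The sub name matches_non_arabo and the @lines array are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the POS tag if the line starts with "ara..."
# and its last field (the lemma) is not "arabo"; returns undef otherwise.
sub matches_non_arabo {
    my ($line) = @_;
    # (?!arabo\b) is tested at the START of the lemma, before \w+ consumes it,
    # so it can actually reject the lines whose lemma is "arabo".
    return $line =~ /^ara\w+\s+([A-Z]+)(?::\w+)?\s+(?!arabo\b)\w+$/ ? $1 : undef;
}

# Sample data copied from the original question.
my @lines = (
    "araba ADJ arabo\n",
    "arabo ADJ arabo\n",
    "arabo NOM arabo\n",
    "arano VER:pres arare\n",
);

for my $line (@lines) {
    my $pos = matches_non_arabo($line);
    print "$pos: $line" if defined $pos;
}
```

With the four sample rows, only the "arano" line gets through. Whether this is easier to maintain than several small matches is a judgment call.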


> ANOTHER QUESTION: is there a module to merge different arrays and
> extract equal values or I have to use a foreach statement?


No, you don't need a module. Try:

my %hash = map { $_ => 1 } @array1;
my @both;    # must be declared under "use strict"
for ( @array2 ){
    push @both, $_ if $hash{$_};
}
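
The same idea as a self-contained script, in case it helps to run it. The sub name intersect and the sample arrays are invented for illustration; the technique is the hash lookup shown above, with grep doing the loop.

```perl
use strict;
use warnings;

# Returns the elements of @$second that also occur in @$first,
# in the order they appear in @$second.
sub intersect {
    my ($first, $second) = @_;
    my %seen = map { $_ => 1 } @$first;    # lookup table built from the first array
    return grep { $seen{$_} } @$second;    # keep only values present in that table
}

# Hypothetical sample data.
my @array1 = qw(araba arabo arano);
my @array2 = qw(arano arare arabo);

my @both = intersect(\@array1, \@array2);
print "@both\n";    # prints "arano arabo"
```

Note that a value repeated in the second array is returned once per repetition; deduplicate first if that matters.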


--

Just my 0.00000002 million dollars worth,
--- Shawn

"Probability is now one. Any problems that are left are your own."
SS Heart of Gold, _The Hitchhiker's Guide to the Galaxy_

* Perl tutorials at http://perlmonks.org/?node=Tutorials
* A searchable perldoc is available at http://perldoc.perl.org/

Tom Phoenix

2006-01-20, 6:59 pm

On 1/20/06, Adriano Allora <all.adr@e-allora.net> wrote:

> my $form1 = qw(ara\w+);
> my $pos1 = qw([A-Z]+);
> my $lemma1 = qw(?!arabo);
> my $pattern = "^(?:$form1)[^A-Z]*($pos1)[^A-Z]*($lemma1)\n";


You probably don't have the pattern you think you have. Have you tried
printing $pattern to see what it contains?

In general, it's difficult to assemble a pattern from strings when
metacharacters may be involved. But it helps to use qr// instead of
qw//, since the former has the same "metacharacter sense" as normal
Perl patterns. See the documentation of qr// in the perlop manpage.
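
A minimal sketch of what assembling a pattern from qr// pieces can look like. The variable names and the simplified pattern are illustrative, not the original poster's full expression. The exact text printed for the pattern depends on the Perl version, so no output is shown for that line.

```perl
use strict;
use warnings;

# Each piece is compiled as a regex, so \w and [A-Z] keep their regex meaning.
my $form = qr/ara\w+/;
my $pos  = qr/[A-Z]+/;

# Pieces interpolate into a larger qr// just like strings do.
my $pattern = qr/^$form\s+($pos)/;

# Printing the pattern shows what the regex engine will actually see.
print "pattern: $pattern\n";

my $line = "arano VER:pres arare\n";
if ( $line =~ $pattern ) {
    print "POS tag: $1\n";    # prints "POS tag: VER"
}
```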

Does that get you closer to a solution? Good luck with it!

--Tom Phoenix
Stonehenge Perl Training

Adriano Allora

2006-01-24, 6:56 pm

Hi,
I tried with qr{} (after reading the perlop manpage as Tom suggested)
and the pattern prints to stdout like this:
^(?:(?-xism:ara\w+))[^A-Z]*((?-xism:[A-Z]+))[^A-Z]*(?!arabo)
It's quite strange: the first and second elements have an extra pair of
brackets, and I don't understand what -xism means.

Any help is appreciated,

adriano allora

On 20 Jan 06, at 17:22, Tom Phoenix wrote:

> On 1/20/06, Adriano Allora <all.adr@e-allora.net> wrote:
>
>
> You probably don't have the pattern you think you have. Have you tried
> printing $pattern to see what it contains?
>
> In general, it's difficult to assemble a pattern from strings when
> metacharacters may be involved. But it helps to use qr// instead of
> qw//, since the former has the same "metacharacter sense" as normal
> Perl patterns. See the documentation of qr// in the perlop manpage.
>
> Does that get you closer to a solution? Good luck with it!
>
> --Tom Phoenix
> Stonehenge Perl Training
>
>



Tom Phoenix

2006-01-24, 6:56 pm

On 1/24/06, Adriano Allora <all.adr@e-allora.net> wrote:

> and the pattern results in stdout as this:
> ^(?:(?-xism:ara\w+))[^A-Z]*((?-xism:[A-Z]+))[^A-Z]*(?!arabo)
> It's quite strange: the first and second element have got a pair of
> brackets more and I don't understand what -xism does mean.


It's still your pattern. Perl has re-written it a little, now that
it's one big string. Since parts compiled with qr{} didn't have flags
like i (case-insensitive), and also in order to show correct
precedence, those parts are marked with one or another form of the
(?:) non-memory parentheses. So the part that looks like -xism to you
looks like a piece of (?-xism:FOO) to me, where FOO is the affected
part of the pattern. Look for the item on (?imsx-imsx:pattern) in the
perlre manpage.
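
A small sketch of why those flag markers matter: a piece compiled with /i keeps its own case-insensitivity when it is embedded in a larger pattern that has no /i. The variable names and test words are made up for illustration.

```perl
use strict;
use warnings;

my $ci_part = qr/ara/i;              # compiled WITH the i flag
my $whole   = qr/^${ci_part}bo$/;    # outer pattern compiled WITHOUT it

# "arabo" and "ARAbo" match; "ARABO" does not, because the trailing "bo"
# belongs to the outer, case-sensitive pattern.
for my $word (qw(arabo ARAbo ARABO)) {
    my $verdict = $word =~ $whole ? "matches" : "does not match";
    print "$word $verdict\n";
}
```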

In this case, your pattern would look roughly like this, I think, if
those parts weren't marked:

^ara\w+[^A-Z]*([A-Z]+)[^A-Z]*(?!arabo)

Does that look roughly like the pattern that you were aiming for? Hope
this helps!

--Tom Phoenix
Stonehenge Perl Training