Home > Archive > PERL Beginners > January 2006 > negative match
| Adriano Allora 2006-01-20, 6:59 pm |
| Hi all,
I cannot get a negative match to work, and I cannot understand why. Can
someone help me?
I've got these four rows (for instance):
araba ADJ arabo
arabo ADJ arabo
arabo NOM arabo
arano VER:pres arare
and, with this regular expression, I would like to extract only the fourth one:
my $form1 = qw(ara\w+);
my $pos1 = qw([A-Z]+);
my $lemma1 = qw(?!arabo);
my $pattern = "^(?:$form1)[^A-Z]*($pos1)[^A-Z]*($lemma1)\n";
but it doesn't work (the script extracts all the lines).
ANOTHER QUESTION: is there a module to merge different arrays and
extract equal values, or do I have to use a foreach statement?
Thanks to all,
alladr
|^|_|^|_|^| |^|_|^|_|^|
| | | |
| | | |
| |*\_/*\_/*\_/*\_/*\_/* | |
| |
| |
| |
| http://www.e-allora.net |
| |
| |
**************************************
| |
| Gerard Robin 2006-01-20, 6:59 pm |
| On Fri, Jan 20, 2006 at 04:22:28PM +0100, Adriano Allora wrote:
>From: Adriano Allora <all.adr@e-allora.net>
>I've got this four rows (for instance):
>
>araba ADJ arabo
>arabo ADJ arabo
>arabo NOM arabo
>arano VER:pres arare
>
>and, with this regular expression, I would extract only the fourth one:
>
>my $form1 = qw(ara\w+);
>my $pos1 = qw([A-Z]+);
>my $lemma1 = qw(?!arabo);
>my $pattern = "^(?:$form1)[^A-Z]*($pos1)[^A-Z]*($lemma1)\n";
>
>but it doesn't work (the script extracts all the lines).
>
If I understand your question correctly, this may help:
#!/usr/bin/perl
# match1.pl
use warnings;
use strict;
my @col4;
while (<DATA>) {
    if (/\w+ +\w+ +\w+ +(\w+)/ or /\w+ +\w+:\w+ +(\w+)/) {
        push @col4, $1 if $1;
    }
}
print "@col4\n";
__DATA__
araba ADJ arabo manolo
arabo ADJ arabo issan
arabo NOM arabo bonobo
arano VER:pres arare
--
Gérard
| |
| Shawn Corey 2006-01-20, 6:59 pm |
| Adriano Allora wrote:
> hi to all,
>
> I cannot use a negative match, and I cannot understand why: someone may
> help me?
>
> I've got this four rows (for instance):
>
> araba ADJ arabo
> arabo ADJ arabo
> arabo NOM arabo
> arano VER:pres arare
>
> and, with this regular expression, I would extract only the fourth one:
>
> my $form1 = qw(ara\w+);
> my $pos1 = qw([A-Z]+);
> my $lemma1 = qw(?!arabo);
> my $pattern = "^(?:$form1)[^A-Z]*($pos1)[^A-Z]*($lemma1)\n";
>
> but it doesn't work (the script extracts all the lines).
>
From `perldoc perlre`:
"(?!pattern)"
A zero-width negative look-ahead assertion.
This means $lemma1 will match a string of zero length, which is ALWAYS
before the newline.
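To make that concrete, here is a minimal sketch, assuming the rows are whitespace-separated and the lemma is the last column. The sub name pos_unless_arabo and the pattern are illustrative, not the original poster's code. The look-ahead sits immediately before the lemma and the lemma itself is consumed by \w+, so the assertion is tested where "arabo" would begin rather than just before the newline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the POS tag when the line starts with "ara..."
# and its lemma (last column) is not "arabo"; returns undef otherwise.
sub pos_unless_arabo {
    my ($line) = @_;
    # \S* swallows the rest of a tag such as ":pres".
    # (?!arabo\b) is checked right where the lemma starts; \w+ then consumes it.
    return undef unless $line =~ /^ara\w+\s+([A-Z]+)\S*\s+(?!arabo\b)\w+\s*$/;
    return $1;
}

my @rows = (
    "araba ADJ arabo\n",
    "arabo ADJ arabo\n",
    "arabo NOM arabo\n",
    "arano VER:pres arare\n",
);

for my $row (@rows) {
    my $pos = pos_unless_arabo($row);
    print defined $pos ? "match ($pos): $row" : "no match: $row";
}
```

With the four sample rows, only the "arano" line should be reported as a match.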
Your problem is that you are trying to do too much with a single
pattern. You should avoid this because it makes understanding, and
therefore maintenance, difficult. Try:
if( /^ara/ && ! /arabo\n$/ && /([A-Z]+)/ ){
    print "\$1 = $1\n";
    print "yes: $_";
}else{
    print "no : $_";
}
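The same idea of several small tests can also be sketched with split instead of a regex for every step. This is a different technique from the snippet above, and it assumes the columns are whitespace-separated and the lemma is always the third column; the sub name keep_row is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the form starts with "ara" and the lemma
# (assumed to be the third whitespace-separated column) is not "arabo".
sub keep_row {
    my ($line) = @_;
    # split ' ' ignores leading whitespace and the trailing newline.
    my ($form, $pos, $lemma) = split ' ', $line;
    return 0 unless defined $lemma;
    return ( $form =~ /^ara/ && $lemma ne 'arabo' ) ? 1 : 0;
}

my @rows = (
    "araba ADJ arabo\n",
    "arabo ADJ arabo\n",
    "arabo NOM arabo\n",
    "arano VER:pres arare\n",
);

for my $row (@rows) {
    print $row if keep_row($row);
}
```

Comparing the lemma with ne avoids regex metacharacters altogether, at the cost of relying on a fixed column layout.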
> ANOTHER QUESTION: is there a module to merge different arrays and
> extract equal values or I have to use a foreach statement?
No, try:
my %hash = map { $_ => 1 } @array1;
my @both;
for ( @array2 ){
    push @both, $_ if $hash{$_};
}
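As a self-contained variation on the snippet above, the following sketch wraps the hash lookup in a sub and uses grep instead of an explicit loop. The sub name common_values and the sample arrays are made up for illustration; it also drops repeated values, which the original loop does not do.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the values of @$second that also occur in
# @$first, each reported once, in the order they appear in @$second.
sub common_values {
    my ($first, $second) = @_;
    my %in_first = map { $_ => 1 } @$first;
    my %seen;
    return grep { $in_first{$_} && !$seen{$_}++ } @$second;
}

my @array1 = qw(araba arabo arano);
my @array2 = qw(arano arare arabo arabo);
my @both   = common_values(\@array1, \@array2);

print "@both\n";    # prints "arano arabo"
```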
--
Just my 0.00000002 million dollars worth,
--- Shawn
"Probability is now one. Any problems that are left are your own."
SS Heart of Gold, _The Hitchhiker's Guide to the Galaxy_
* Perl tutorials at http://perlmonks.org/?node=Tutorials
* A searchable perldoc is available at http://perldoc.perl.org/
| |
| Tom Phoenix 2006-01-20, 6:59 pm |
| On 1/20/06, Adriano Allora <all.adr@e-allora.net> wrote:
> my $form1 = qw(ara\w+);
> my $pos1 = qw([A-Z]+);
> my $lemma1 = qw(?!arabo);
> my $pattern = "^(?:$form1)[^A-Z]*($pos1)[^A-Z]*($lemma1)\n";
You probably don't have the pattern you think you have. Have you tried
printing $pattern to see what it contains?
In general, it's difficult to assemble a pattern from strings when
metacharacters may be involved. But it helps to use qr// instead of
qw//, since the former has the same "metacharacter sense" as normal
Perl patterns. See the documentation of qr// in the perlop manpage.
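A minimal sketch of that suggestion, reusing the variable names from the original post but only the first two pieces of the pattern. The sample lines are made up. The exact text printed for the pattern depends on the Perl version: older perls wrap each qr// piece as (?-xism:...), while perl 5.14 and later print (?^:...).

```perl
use strict;
use warnings;

# Compile each piece with qr// so backslashes and metacharacters keep
# their regex meaning.
my $form1 = qr/ara\w+/;
my $pos1  = qr/[A-Z]+/;

# Interpolating compiled pieces into a larger qr// keeps each piece grouped.
my $pattern = qr/^(?:$form1)[^A-Z]*($pos1)/;

# Printing the pattern shows what the regex engine will actually be given.
print "$pattern\n";

for my $line ("arano VER:pres arare\n", "casa NOM casa\n") {
    print "POS $1 in: $line" if $line =~ $pattern;
}
```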
Does that get you closer to a solution? Good luck with it!
--Tom Phoenix
Stonehenge Perl Training
| |
| Adriano Allora 2006-01-24, 6:56 pm |
| Hi,
I tried with qr{} (after reading the perlop manpage as Tom suggested)
and the pattern printed to stdout looks like this:
^(?:(?-xism:ara\w+))[^A-Z]*((?-xism:[A-Z]+))[^A-Z]*(?!arabo)
It's quite strange: the first and second elements have an extra pair of
brackets, and I don't understand what -xism means.
Any help is appreciated,
adriano allora
On 20 Jan 06, at 17:22, Tom Phoenix wrote:
> On 1/20/06, Adriano Allora <all.adr@e-allora.net> wrote:
>
>
> You probably don't have the pattern you think you have. Have you tried
> printing $pattern to see what it contains?
>
> In general, it's difficult to assemble a pattern from strings when
> metacharacters may be involved. But it helps to use qr// instead of
> qw//, since the former has the same "metacharacter sense" as normal
> Perl patterns. See the documentation of qr// in the perlop manpage.
>
> Does that get you closer to a solution? Good luck with it!
>
> --Tom Phoenix
> Stonehenge Perl Training
>
>
| |
| Tom Phoenix 2006-01-24, 6:56 pm |
| On 1/24/06, Adriano Allora <all.adr@e-allora.net> wrote:
> and the pattern results in stdout as this:
> ^(?:(?-xism:ara\w+))[^A-Z]*((?-xism:[A-Z]+))[^A-Z]*(?!arabo)
> It's quite strange: the first and second element have got a pair of
> brackets more and I don't understand what -xism does mean.
It's still your pattern. Perl has re-written it a little, now that
it's one big string. Since parts compiled with qr{} didn't have flags
like i (case-insensitive), and also in order to show correct
precedence, those parts are marked with one or another form of the
(?:) non-memory parentheses. So the part that looks like -xism to you
looks like a piece of (?-xism:FOO) to me, where FOO is the affected
part of the pattern. Look for the item on (?imsx-imsx:pattern) in the
perlre manpage.
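As a small illustration of those embedded flags, here is a sketch with a made-up pattern and sample lines: the whole match is case-insensitive, but the (?-i:...) group switches that flag off for the text inside it.

```perl
use strict;
use warnings;

# The outer match is case-insensitive (/i), but the group (?-i:ADJ)
# turns that flag off again for just the text inside the group.
my $re = qr/^arab\w+ (?-i:ADJ)/i;

for my $line ("ARABA ADJ arabo", "araba adj arabo") {
    printf "%-16s %s\n", $line, ( $line =~ $re ? "matches" : "does not match" );
}
```

The first line matches because "ARAB" is compared case-insensitively; the second does not, because "adj" is compared case-sensitively against "ADJ".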
In this case, your pattern would look roughly like this, I think, if
those parts weren't marked:
^ara\w+[^A-Z]*([A-Z]+)[^A-Z]*(?!arabo)
Does that look roughly like the pattern that you were aiming for? Hope
this helps!
--Tom Phoenix
Stonehenge Perl Training