| Author |
Perl regular expressions help
| Hi,
I am trying to learn perl regular expressions.
I have:
my $store = q(
<store>
<book>
<title>abc</title>
<author>xyz</author>
</book>
<book>
<title>pqr</title>
<author>mno</author>
</book>
</store> );
I want to extract the titles only in a loop. How can I do this?
Thanks,
Rumpa
| |
| Paul Lalli 2005-08-26, 6:58 pm |
|
Rumpa wrote:
> Hi,
> I am trying to learn perl regular expressions.
> I have:
> my $store = q(
> <store>
> <book>
> <title>abc</title>
> <author>xyz</author>
> </book>
> <book>
> <title>pqr</title>
> <author>mno</author>
> </book>
> </store> );
>
> I want to extract the titles only in a loop. How can I do this?
Regular expressions are *not* the right tool for parsing XML-like data.
Please search http://search.cpan.org for the many XML parsing tools.
If you insist on this as a learning exercise, it would be something
along the lines of:
while ($store =~ m{<title>(.*?)</title>}g) {
    print "Title: $1\n";
}
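
As a hedged aside on the loop above: in list context, a match with /g returns every capture at once, so the titles can be collected into a list instead of printed one at a time. This is only a sketch using the same $store string. The helper name extract_titles is made up for illustration, and the /s modifier is an addition (not in the loop above) so that a title spanning several lines would still match.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the original post:
# returns the body of every <title> element found in $text.
sub extract_titles {
    my ($text) = @_;
    # In list context, m//g returns all captured groups at once.
    # /s (an assumption added here) lets "." match newlines too.
    return $text =~ m{<title>(.*?)</title>}gs;
}

my $store = q(
<store>
<book>
<title>abc</title>
<author>xyz</author>
</book>
<book>
<title>pqr</title>
<author>mno</author>
</book>
</store> );

print "Title: $_\n" for extract_titles($store);
```

The same caveat applies: this treats the data as plain text, so it is suitable for practising regular expressions rather than for parsing real XML.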
For more information:
perldoc perlretut
perldoc perlre
Paul Lalli
| |
| A. Sinan Unur 2005-08-26, 6:58 pm |
| Rumpa <sgrumpa@yahoo.com> wrote in news:deo51v$26ui$1@heap.juniper.net:
> I am trying to learn perl regular expressions.
But parsing XML is best done by using an XML parser.
> I have:
> my $store = q(
> <store>
> <book>
> <title>abc</title>
> <author>xyz</author>
> </book>
> <book>
> <title>pqr</title>
> <author>mno</author>
> </book>
> </store> );
>
> I want to extract the titles only in a loop. How can I do this?
Use an XML parser. You can build on the following example:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;
my $xml = <<XML
<store>
<book>
<title>abc</title>
<author>xyz</author>
</book>
<book>
<title>pqr</title>
<author>mno</author>
</book>
</store>
XML
;
my $in_title;
my $p = XML::Parser->new(Style => 'Stream', Pkg => 'main');
$p->parse($xml);
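# Called for each opening tag; remember when we enter a <title>.
sub StartTag {
    my ($x, $el) = @_;
    return unless $el eq 'title';
    $in_title = 1;
}
# Called with the accumulated text in $_; print it only inside <title>.
sub Text {
    my ($x) = @_;
    print "$_\n" if $in_title;
}
# Called for each closing tag; note when we leave a <title>.
sub EndTag {
    my ($x, $el) = @_;
    return unless $el eq 'title';
    $in_title = 0;
}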
__END__
--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(reverse each component and remove .invalid for email address)
comp.lang.perl.misc guidelines on the WWW:
http://mail.augustmail.com/~tadmc/c...guidelines.html
| |
| Abigail 2005-08-26, 9:56 pm |
| Rumpa (sgrumpa@yahoo.com) wrote on MMMMCCCLXXVIII September MCMXCIII in
<URL:news:deo51v$26ui$1@heap.juniper.net>:
`' Hi,
`' I am trying to learn perl regular expressions.
`' I have:
`' my $store = q(
`' <store>
`' <book>
`' <title>abc</title>
`' <author>xyz</author>
`' </book>
`' <book>
`' <title>pqr</title>
`' <author>mno</author>
`' </book>
`' </store> );
`'
`' I want to extract the titles only in a loop. How can I do this?
That depends on what is a title. For the given example,
/(abc|pqr)/
will do.
/<title>([^<]+)<\/title>/
works for the given example as well, but it might give false positives.
You are probably much better off with a parser, although it is
possible to replace a lot of parsers with regular expressions.
Not that any sane person would want to do it.
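
To make the "false positives" remark concrete, here is a minimal sketch. The input is hypothetical (it is not the original poster's data): one <title> sits inside an XML comment, where a real parser would not report it as an element, yet the pattern above still matches it. The sub name regex_titles is likewise invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: the first <title> is commented out,
# so it is not an element of the document.
my $xml = <<'XML';
<store>
<!-- <title>draft</title> -->
<book>
<title>abc</title>
<author>xyz</author>
</book>
</store>
XML

# Hypothetical helper: apply the pattern from the post to a string
# and return every captured title body.
sub regex_titles {
    my ($text) = @_;
    return $text =~ m{<title>([^<]+)</title>}g;
}

# The commented-out title is reported along with the real one.
print "Matched: $_\n" for regex_titles($xml);
```

The regular expression sees only characters, not document structure, so it cannot tell that the first match is inside a comment.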
Abigail
--
#!/opt/perl/bin/perl -w
$\ = $"; $; = $$; END {$: and print $:} $SIG {TERM} = sub {$ := $_}; kill 15 =>
fork and ($; == getppid and exit or wait) foreach qw /Just another Perl Hacker/
| |
| Tad McClellan 2005-08-27, 3:56 am |
| Rumpa <sgrumpa@yahoo.com> wrote:
> I want to extract the titles only in a loop. How can I do this?
Use a module that understands XML for processing XML data.
---------------------------
#!/usr/bin/perl
use warnings;
use strict;
use XML::Simple;
my $store = q(
<store>
<book>
<title>abc</title>
<author>xyz</author>
</book>
<book>
<title>pqr</title>
<author>mno</author>
</book>
</store> );
my $content = XMLin $store;
foreach my $book ( @{ $content->{book} } ) { # perldoc perlreftut
    print "Title is '$book->{title}'\n";
}
---------------------------
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas