| Author |
Newbie question: parse XML file
|
|
|
| Hi
I have an XML file that looks like this:
<entities>
<Comp id="123">
<Description max_length="" reference_to="" type="multiline_string"/>
<Clone max_length="" reference_to="Comp" type="reference_list">
<Comp id="129" />
</Clone>
</Comp>
<Comp id="124">
<Description max_length="" reference_to="" type="multiline_string"/>
</Comp>
</entities>
I need a way to get only the Comp id numbers at the top level
(directly under entities),
not every Comp id in the XML file.
Thanks
| |
| David Squire 2006-04-20, 7:02 pm |
| EF wrote:
> I have xml file that look like that
> <entities>
> <Comp id="123">
> <Description max_length="" reference_to="" type="multiline_string"/>
> <Clone max_length="" reference_to="Comp" type="reference_list">
> <Comp id="129" />
> </Clone>
> </Comp>
> <Comp id="124">
> <Description max_length="" reference_to="" type="multiline_string"/>
> </Comp>
> </entities>
>
> I need a way to get the Comp id numbers that are only on the top level
> ( under entities )
> and not in all xml file.
I take it you are using a module such as XML::Parser?
One way to handle this is to set a flag to some OK value in the start
handler section that handles entities elements, then in the section of
the start handler that handles Comp elements, do what you need to do iff
the flag is OK, then set it to a not OK value.
DS
PS. Having an element called "entities" is likely to cause confusion in
the XML world.
| |
| Michel Rodriguez 2006-04-20, 7:02 pm |
| EF wrote:
> I have xml file that look like that
> <entities>
> <Comp id="123">
> <Description max_length="" reference_to="" type="multiline_string"/>
> <Clone max_length="" reference_to="Comp" type="reference_list">
> <Comp id="129" />
> </Clone>
> </Comp>
> <Comp id="124">
> <Description max_length="" reference_to="" type="multiline_string"/>
> </Comp>
> </entities>
>
> I need a way to get the Comp id numbers that are only on the top level
> ( under entities )
> and not in all xml file.
Hi,
Any module that offers XPath support of some sort will make it easy.
With XML::Twig you can do this:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
my @ids;
XML::Twig->new(
    twig_handlers => {
        '/entities/Comp' => sub { push @ids, $_->id },
    },
)->parse( \*DATA );
$,="\n";
print @ids;
__DATA__
<entities>
<Comp id="123">
<Description max_length="" reference_to="" type="multiline_string"/>
<Clone max_length="" reference_to="Comp" type="reference_list">
<Comp id="129" />
</Clone>
</Comp>
<Comp id="124">
<Description max_length="" reference_to="" type="multiline_string"/>
</Comp>
</entities>
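
The same idea carries over to other XPath-capable modules. As a minimal sketch, assuming XML::LibXML is installed (it is not used anywhere else in this thread), an absolute XPath expression selects the id attribute of only those Comp elements that are direct children of the root. The helper name top_level_comp_ids is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: return the id attribute of every Comp that is a
# direct child of the root <entities> element. Nested Comp elements
# (such as the one inside Clone) do not match the absolute path.
sub top_level_comp_ids {
    my ($xml_string) = @_;
    my $doc = XML::LibXML->load_xml( string => $xml_string );
    return map { $_->value } $doc->findnodes('/entities/Comp/@id');
}

my $xml = <<'XML';
<entities>
  <Comp id="123">
    <Description max_length="" reference_to="" type="multiline_string"/>
    <Clone max_length="" reference_to="Comp" type="reference_list">
      <Comp id="129" />
    </Clone>
  </Comp>
  <Comp id="124">
    <Description max_length="" reference_to="" type="multiline_string"/>
  </Comp>
</entities>
XML

# Prints 123 and 124, one per line; 129 is skipped because it is nested.
print join( "\n", top_level_comp_ids($xml) ), "\n";
```

Because the path starts at the document root, a Comp at any deeper level is never selected, so no flag or depth counter is needed.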
| |
| David Squire 2006-04-20, 7:02 pm |
| David Squire wrote:
> EF wrote:
>
....
>
> One way to handle this is to set a flag to some OK value in the start
> handler section that handles entities elements, then in the section of
> the start handler that handles Comp elements, do what you need to do iff
> the flag is OK, then set it to a not OK value.
>
Sorry, I have misread the question. For some reason I thought you wanted
only the first Comp child of entities. Please ignore.
DS
| |
| Jim Gibson 2006-04-20, 7:02 pm |
| In article <1145545157.475197.189940@i39g2000cwa.googlegroups.com>, EF
<felad@walla.com> wrote:
> Hi
>
> I have xml file that look like that
[XML snipped, see below]
>
> I need a way to get the Comp id numbers that are only on the top level
> ( under entities )
> and not in all xml file.
Use an XML parser such as XML::Simple that puts the data into a tree
structure and extract only the items you want from an explicit level in
the tree:
#!/usr/local/bin/perl
#
use strict;
use warnings;
use XML::Simple;
undef $/;
my $string = <DATA>;
my $xml = XMLin($string);
my @keys = keys %{$xml->{Comp}};
print "Comp: @keys\n";
__END__
<entities>
<Comp id="123">
<Description max_length="" reference_to="" type="multiline_string"/>
<Clone max_length="" reference_to="Comp" type="reference_list">
<Comp id="129" />
</Clone>
</Comp>
<Comp id="124">
<Description max_length="" reference_to="" type="multiline_string"/>
</Comp>
</entities>
__Output__
Comp: 124 123
| |
| Sherm Pendley 2006-04-20, 7:02 pm |
| "EF" <felad@walla.com> writes:
> I have xml file that look like that
> <entities>
> <Comp id="123">
> <Description max_length="" reference_to="" type="multiline_string"/>
> <Clone max_length="" reference_to="Comp" type="reference_list">
> <Comp id="129" />
> </Clone>
> </Comp>
> <Comp id="124">
> <Description max_length="" reference_to="" type="multiline_string"/>
> </Comp>
> </entities>
>
> I need a way to get the Comp id numbers that are only on the top level
> ( under entities )
> and not in all xml file.
What have you tried so far? What were the results, and how were they
different from the results you expected?
Have you read the posting guidelines for this group yet? It's generally
expected that you give it your best shot first, and then ask for help if
you get stuck.
sherm--
--
Cocoa programming in Perl: http://camelbones.sourceforge.net
Hire me! My resume: http://www.dot-app.org
| |
| robic0 2006-04-20, 7:02 pm |
| On 20 Apr 2006 07:59:17 -0700, "EF" <felad@walla.com> wrote:
>Hi
>
>I have xml file that look like that
><entities>
> <Comp id="123">
> <Description max_length="" reference_to="" type="multiline_string"/>
> <Clone max_length="" reference_to="Comp" type="reference_list">
> <Comp id="129" />
> </Clone>
> </Comp>
> <Comp id="124">
> <Description max_length="" reference_to="" type="multiline_string"/>
> </Comp>
></entities>
>
>I need a way to get the Comp id numbers that are only on the top level
>( under entities )
>and not in all xml file.
>
>Thanks
I'm assuming the tag names, like the XML as a whole, are stand-ins for
some other, real XML.
It's like looking at a bad variable naming convention in Perl.
'<entities>' especially: although there is an '!ENTITY' keyword,
the word itself represents a substitution of content data.
Just an observation. Perhaps you should read and understand xml before you try to
work on it. Here's a good reference:
http://www.w3.org/TR/xml11/
Not very popular reading, granted...
| |
|
|
"Jim Gibson" <jgibson@mail.arc.nasa.gov> wrote in message
news:200420060933402613%jgibson@mail.arc.nasa.gov...
> In article <1145545157.475197.189940@i39g2000cwa.googlegroups.com>, EF
> <felad@walla.com> wrote:
>
>
> [XML snipped, see below]
>
>
> Use an XML parser such as XML::Simple that puts the data into a tree
> structure and extract only the items you want from an explicit level in
> the tree:
>
> #!/usr/local/bin/perl
> #
> use strict;
> use warnings;
> use XML::Simple;
>
> undef $/;
> my $string = <DATA>;
> my $xml = XMLin($string);
> my @keys = keys %{$xml->{Comp}};
> print "Comp: @keys\n";
>
> __END__
> <entities>
> <Comp id="123">
> <Description max_length="" reference_to="" type="multiline_string"/>
> <Clone max_length="" reference_to="Comp" type="reference_list">
> <Comp id="129" />
> </Clone>
> </Comp>
> <Comp id="124">
> <Description max_length="" reference_to="" type="multiline_string"/>
> </Comp>
> </entities>
>
> __Output__
>
> Comp: 124 123
I always use XML::Simple, but I would add ForceArray=>1 and suppressempty=>1.
my ($response) = @_;   # XML string
my $xml = XML::Simple->new(ForceArray=>1, suppressempty=>1);   # create object
# wrapper tags needed since we have a string
my $data = $xml->XMLin("<ignore>" . $response . "</ignore>");
Everything is then in a hash array and can be accessed directly.
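
To make the ForceArray point concrete, here is a minimal sketch applied to the original poster's data. It assumes the string already holds one well-formed document, so no wrapper element is added, and it relies on XML::Simple's default KeyAttr folding on the id attribute. The helper name top_level_comp_ids is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical helper: list the ids of the Comp elements directly under
# the root element.
sub top_level_comp_ids {
    my ($xml_string) = @_;

    # ForceArray => 1 makes nested elements arrays even when there is only
    # one of them, so the default KeyAttr setting can fold the Comp list
    # into a hash keyed on the id attribute in every case.
    my $xs   = XML::Simple->new( ForceArray => 1, SuppressEmpty => 1 );
    my $data = $xs->XMLin($xml_string);

    # The root element is dropped by default, so Comp sits at the top.
    return sort keys %{ $data->{Comp} || {} };
}

my $xml = <<'XML';
<entities>
  <Comp id="123">
    <Description max_length="" reference_to="" type="multiline_string"/>
    <Clone max_length="" reference_to="Comp" type="reference_list">
      <Comp id="129" />
    </Clone>
  </Comp>
  <Comp id="124">
    <Description max_length="" reference_to="" type="multiline_string"/>
  </Comp>
</entities>
XML

my @ids = top_level_comp_ids($xml);
print "Comp: @ids\n";
```

Without ForceArray, a file containing a single top-level Comp would not be folded by id, and the keys would be that element's attribute and child names instead.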
Regards
John
| |
| John Bokma 2006-05-31, 7:06 pm |
| "EF" <felad@walla.com> wrote:
> Hi
>
> I have xml file that look like that
> <entities>
> <Comp id="123">
> <Description max_length="" reference_to=""
> type="multiline_string"/>
> <Clone max_length="" reference_to="Comp"
> type="reference_list">
> <Comp id="129" />
> </Clone>
> </Comp>
> <Comp id="124">
> <Description max_length="" reference_to=""
> type="multiline_string"/>
> </Comp>
> </entities>
>
> I need a way to get the Comp id numbers that are only on the top level
> ( under entities )
> and not in all xml file.
Alternative solution, using XML::Parser:
http://johnbokma.com/perl/element-i...ven-parent.html
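
For readers who cannot follow the link, here is a minimal sketch of one way to do this with XML::Parser. It is not the code from the linked page; it only assumes the documented Start handler and the context method of the underlying Expat object. The helper name top_level_comp_ids is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: collect the id of each Comp whose only open
# ancestor is the root <entities> element.
sub top_level_comp_ids {
    my ($xml_string) = @_;
    my @ids;

    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub {
                my ( $expat, $element, %attr ) = @_;

                # Inside a Start handler, context() lists the open
                # ancestors, not the element that is being started.
                my @ancestors = $expat->context;

                push @ids, $attr{id}
                    if $element eq 'Comp'
                    && @ancestors == 1
                    && $ancestors[0] eq 'entities';
            },
        },
    );
    $parser->parse($xml_string);
    return @ids;
}

my $xml = <<'XML';
<entities>
  <Comp id="123">
    <Description max_length="" reference_to="" type="multiline_string"/>
    <Clone max_length="" reference_to="Comp" type="reference_list">
      <Comp id="129" />
    </Clone>
  </Comp>
  <Comp id="124">
    <Description max_length="" reference_to="" type="multiline_string"/>
  </Comp>
</entities>
XML

# Prints 123 and 124, one per line; the nested 129 has three ancestors.
print join( "\n", top_level_comp_ids($xml) ), "\n";
```

Checking the ancestor list, rather than setting and clearing a flag, avoids the first-child-only behaviour discussed earlier in the thread.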
--
John Bokma Freelance software developer
&
Experienced Perl programmer: http://castleamber.com/
|
|
|
|