PERL Beginners, June 2005

Author xml parser
Dermot Paikkos

2005-06-09, 8:56 am

Hi,
Thanx for the replies. At the moment I am testing the water, so I am
using ActiveState Perl 5.8. All I have done with XML::Simple is below.

============= xml-parse.pl =========
#!/bin/perl
use strict;
use XML::Simple;
use Data::Dumper;

my $file = shift; # Doesn't like this.
#my $file = "section-xml"; # or this.
my $config = XMLin('c:\\windows\\desktop\\section-xml');

print Dumper($config);
===============================

I get the error:
junk after document element at line 65, column 0, byte 1993 at
C:/Perl/site/lib/XML/Parser.pm line 185.

The records are pasted at the end of this mail.

I seem to have a bit of trouble getting XML::Simple to accept the
file option; for instance, $config = XMLin("$file") doesn't seem to work
for me and I have to give it a literal path, but that is a small problem.

Line 65 is the start of the second record ("<record>") but I don't
know why it is choking on it. I have had a paw through the perldoc
for XML::Simple but I can't see an obvious option that will help me,
although I think there probably is one.

I had begun to roll my own. It does work, albeit it doesn't look so
pretty, but I would rather use a module as everyone seems to agree
this is the best way to go.

So why is XML::Simple choking on the end of the record at line 65? Do
I need to define the record separator?
Thanx.
Dp



============== records ==============
<record>
<CT.NUM>S370/0128</CT.NUM>
<CT.DAT>03-Jun-05</CT.DAT>
<CT.UPD>03-Jun-05</CT.UPD>
<CT.ORI>IN</CT.ORI>
<CT.ALN>0</CT.ALN>
<CT.UKN>0</CT.UKN>
<CT.TI1>Apollo mission </CT.TI1>
<CT.DSC>Apollo mission </CT.DSC>
<CT.C01>CREDIT: NASA</CT.C01>
<CT.C03>In December of 1972, Apollo 17 astronauts Eugene </CT.C03>
<CT.C04>Cernan and Harrison Schmitt spent about 75 hours </CT.C04>
<CT.C05>on the moon, in the Taurus-Littrow valley, while </CT.C05>
<CT.C06>colleague Ronald Evans orbited overhead. THe </CT.C06>
<CT.C07>Apollo 17 crew returned with 110 kilograms of rock </CT.C07>
<CT.C08>and soil samples, more than from any other lunar </CT.C08>
<CT.C09>landing sites. And after thirty plus years, Cernan </CT.C09>
<CT.C10>and Schmitt are still the last to walk on the </CT.C10>
<CT.C11>Moon.</CT.C11>
<CT.MBS>41</CT.MBS>
<CT.COL></CT.COL>
<CT.WPL></CT.WPL>
<CT.SU1>SQUARE</CT.SU1>
<CT.SU2>MOON LANDING</CT.SU2>
<CT.SU3>ASTRONAUTS</CT.SU3>
<CT.SU4>NASA</CT.SU4>
<CT.SU5>LUNAR EXPLORATION</CT.SU5>
<CT.SU6>SPACE PROGRAM</CT.SU6>
<CT.SU7>SPACE</CT.SU7>
<CT.SU8>MOON</CT.SU8>
<CT.SU9>APOLLO 17</CT.SU9>
<CT.SU10>RONALD EVANS</CT.SU10>
<CT.SU11>EUGENE CERNAN</CT.SU11>
<CT.SU12>EXPLORATION</CT.SU12>
<CT.SU13>HARRISON SCHMITT</CT.SU13>
<CT.SU14>MANNED</CT.SU14>
<CT.SU15>SPACEFLIGHT</CT.SU15>
<CT.SU16>SPACE</CT.SU16>
<CT.SU17>APOLLO</CT.SU17>
<CT.SU18>PROGRAM</CT.SU18>
<CT.SU19>PROGRAMME</CT.SU19>
<CT.UPD>03-Jun-05</CT.UPD>
<CT.DUP></CT.DUP>
<CT.PHO>XNS</CT.PHO>
<PH.NAM>NASA</PH.NAM>
<RS.CY2>* UK</RS.CY2>
<RS.CY3>* EIRE</RS.CY3>
<RS.CY4>* BAHRAIN</RS.CY4>
<RS.CY5>* EGYPT</RS.CY5>
<RS.CY6>* HONG KONG</RS.CY6>
<RS.CY7>* ICELAND</RS.CY7>
<RS.CY8>* MALAYSIA</RS.CY8>
<RS.CY9>* SINGAPORE</RS.CY9>
<RS.CY10>* SAUDI ARABIA</RS.CY10>
<RS.CY11>* SOUTH AFRICA</RS.CY11>
<RS.CY12>* TAIWAN</RS.CY12>
<RS.CY13>* THAILAND</RS.CY13>
<RS.CY14>* UNITED ARAB EMIRATES</RS.CY14>
<CT.PSD></CT.PSD>
<CT.PSI></CT.PSI>
<CT.REF>AV120A</CT.REF>
<CT.FOR></CT.FOR>
<CT.DFM></CT.DFM>
</record>
<record>
<CT.NUM>S375/0045</CT.NUM>
<CT.DAT>22-Oct-92</CT.DAT>
<CT.UPD>03-Jun-05</CT.UPD>
<CT.ORI>IN</CT.ORI>
<CT.ALN>10</CT.ALN>
<CT.UKN>4</CT.UKN>
<CT.TI1> </CT.TI1>
<CT.DSC>Night launch of Apollo</CT.DSC>
<CT.C01>CREDIT: NASA</CT.C01>
<CT.C03>Launch of Apollo 17. The Saturn V rocket carrying </CT.C03>
<CT.C04>Apollo 17 blasts into the night sky at Cape </CT.C04>
<CT.C05>Canaveral on 7 December 1972. This was the only </CT.C05>
<CT.C06>night launch of the Apollo Lunar programme. Apollo </CT.C06>
<CT.C07>17 carried astronauts Eugene Cernan and Harrison </CT.C07>
<CT.C08>Schmitt to the Moon, Ronald Evans remaining in the </CT.C08>
<CT.C09>command module in lunar orbit. Cernan and Schmitt </CT.C09>
<CT.C10>landed in the Taurus-Littrow region on 11 December </CT.C10>
<CT.C11>and left on 13 December. Cernan was the last man </CT.C11>
<CT.C12>to stand on the Moon's surface. Apollo 17 returned </CT.C12>
<CT.C13>to Earth on 19 December 1972 - the end of manned </CT.C13>
<CT.C14>Lunar exploration for the time being.</CT.C14>
<CT.MBS>46</CT.MBS>
<CT.COL>C</CT.COL>
<CT.WPL></CT.WPL>
<CT.SU1>APOLLO 17, LAUNCH, NIGHT</CT.SU1>
<CT.SU2>SATURN V, LAUNCH, APOLLO 17</CT.SU2>
<CT.SU3>ROCKET</CT.SU3>
<CT.SU4>MANNED SPACEFLIGHT, SPACE</CT.SU4>
<CT.SU5>APOLLO PROGRAM, PROGRAMME</CT.SU5>
<CT.UPD>03-Jun-05</CT.UPD>
<CT.DUP>13</CT.DUP>
<CT.PHO>NAS</CT.PHO>
<PH.NAM>NASA</PH.NAM>
<CT.PSD></CT.PSD>
<CT.PSI></CT.PSI>
<CT.REF>S72-55070</CT.REF>
<CT.FOR>8x10 print</CT.FOR>
<CT.DFM></CT.DFM>
</record>
<record>
<CT.NUM>S380/0286</CT.NUM>
<CT.DAT>03-Jun-05</CT.DAT>
<CT.UPD>03-Jun-05</CT.UPD>
<CT.ORI>IN</CT.ORI>
<CT.ALN>0</CT.ALN>
<CT.UKN>0</CT.UKN>
<CT.TI1>Apollo 14 lunar central station</CT.TI1>
<CT.DSC>Apollo 14 lunar central station</CT.DSC>
<CT.C01>CREDIT: NASA</CT.C01>
<CT.C03>East and north sides of the Central Station, with </CT.C03>
<CT.C04>good definition of the astronaut switches at the </CT.C04>
<CT.C05>bottom. Apollo 14 was the third mission in which </CT.C05>
<CT.C06>humans walked on the lunar surface and returned to </CT.C06>
<CT.C07>Earth. On 5 February 1971 two astronauts (Apollo </CT.C07>
<CT.C08>14 Commander Alan B. Shepard, Jr. and LM pilot </CT.C08>
<CT.C09>Edgar D. Mitchell) landed near Fra Mauro crater on </CT.C09>
<CT.C10>the Moon in the Lunar Module (LM) while the </CT.C10>
<CT.C11>Command and Service Module (CSM) (with CM pilot </CT.C11>
<CT.C12>Stuart A. Roosa) continued in lunar orbit. During </CT.C12>
<CT.C13>their stay on the Moon, the astronauts set up </CT.C13>
<CT.C14>scientific experiments, took photographs, and </CT.C14>
<CT.C15>collected lunar samples. The LM took off from the </CT.C15>
<CT.C16>Moon on 6 February and the astronauts returned to </CT.C16>
<CT.C17>Earth on 9 February.</CT.C17>
<CT.MBS>48</CT.MBS>
<CT.COL></CT.COL>
<CT.WPL></CT.WPL>
<CT.SU1>PORTRAIT</CT.SU1>
<CT.SU2>SPACE</CT.SU2>
<CT.SU3>LUNAR EXPERIMENTS</CT.SU3>
<CT.SU4>APOLLO 14</CT.SU4>
<CT.SU5>MOON</CT.SU5>
<CT.SU6>LUNAR EXPERIMENT</CT.SU6>
<CT.SU7>NASA</CT.SU7>
<CT.SU8>SPACE PROGRAM</CT.SU8>
<CT.SU9>MANNED</CT.SU9>
<CT.SU10>SPACEFLIGHT</CT.SU10>
<CT.SU11>SPACE</CT.SU11>
<CT.SU12>APOLLO</CT.SU12>
<CT.SU13>PROGRAM</CT.SU13>
<CT.SU14>PROGRAMME</CT.SU14>
<CT.UPD>03-Jun-05</CT.UPD>
<CT.DUP></CT.DUP>
<CT.PHO>XNS</CT.PHO>
<PH.NAM>NASA</PH.NAM>
<RS.CY2>* UK</RS.CY2>
<RS.CY3>* EIRE</RS.CY3>
<RS.CY4>* BAHRAIN</RS.CY4>
<RS.CY5>* EGYPT</RS.CY5>
<RS.CY6>* HONG KONG</RS.CY6>
<RS.CY7>* ICELAND</RS.CY7>
<RS.CY8>* MALAYSIA</RS.CY8>
<RS.CY9>* SINGAPORE</RS.CY9>
<RS.CY10>* SAUDI ARABIA</RS.CY10>
<RS.CY11>* SOUTH AFRICA</RS.CY11>
<RS.CY12>* TAIWAN</RS.CY12>
<RS.CY13>* THAILAND</RS.CY13>
<RS.CY14>* UNITED ARAB EMIRATES</RS.CY14>
<CT.PSD></CT.PSD>
<CT.PSI></CT.PSI>
<CT.REF>AV064A</CT.REF>
<CT.FOR></CT.FOR>
<CT.DFM></CT.DFM>
</record>





Thomas Bätzler

2005-06-09, 8:56 am

Dermot Paikkos <dermot@sciencephoto.com> asked:
> Thanx for the replies. At the moment I am testing the water
> so I am using Activestate 5.8. All I have done with
> XML::Simple is below.
>
> ============= xml-parse.pl =========
> #!/bin/perl
> use strict;
> use XML::Simple;
> use Data::Dumper;
>
> my $file = shift; # Doesn't like this.
> #my $file = "section-xml"; # or this.
> my $config = XMLin('c:\\windows\\desktop\\section-xml');


You can use forward slashes as directory separators, i.e.

my $config = XMLin('c:/windows/desktop/section-xml');
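
The file name can also come from the command line instead of being hard-coded. What follows is only a minimal sketch, assuming the script is run on Windows as "perl xml-parse.pl <file>". The helper name xml_file_from_args is hypothetical and not part of XML::Simple; the point is simply that the variable itself is what gets passed to XMLin().

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Simple): take the XML file name
# from an argument list and turn Windows backslashes into forward slashes.
sub xml_file_from_args {
    my @args = @_;
    my $file = shift @args;
    die "Usage: $0 <xml-file>\n" unless defined $file;
    $file =~ tr{\\}{/};    # assumes Windows paths, where \ only separates directories
    return $file;
}

# Only act when a file name was actually given on the command line.
if (@ARGV) {
    my $file = xml_file_from_args(@ARGV);
    print "Would parse: $file\n";
    # With XML::Simple loaded, pass the variable itself to XMLin():
    #   my $config = XMLin($file);
}
```

Giving a full path, as in the original script, avoids any question about which directory a bare file name is looked up in.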

> I get the error:
> junk after document element at line 65, column 0, byte 1993
> at C:/Perl/site/lib/XML/Parser.pm line 185.


Your XML isn't well-formed. You are missing an outermost
tag that encompasses the whole document, i.e.

<document>
<record>
....
</record>
....
<record>
....
</record>
</document>

Add the tags and it'll work.
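
An XML document may have only one top-level element, which is why the parser reports everything after the first </record> as junk. If editing the export by hand is not practical, the wrapping can be done in memory before parsing. The sketch below is one possible approach under stated assumptions, not the only way to do it. The helper wrap_records is hypothetical, the root name "document" is arbitrary, and the sample data is cut down to the CT.NUM field of two of the records above. The parse step runs only if XML::Simple is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: enclose a run of sibling <record> elements in one
# root element so that the result is a well-formed XML document.
sub wrap_records {
    my ($fragment, $root) = @_;
    $root = 'document' unless defined $root;
    $fragment =~ s/\s+\z//;    # drop trailing whitespace before the closing tag
    return "<$root>\n$fragment\n</$root>\n";
}

# Two records cut down to a single field each, for illustration only.
my $fragment = <<'END_XML';
<record>
<CT.NUM>S370/0128</CT.NUM>
</record>
<record>
<CT.NUM>S375/0045</CT.NUM>
</record>
END_XML

my $xml = wrap_records($fragment);

# Parse only if XML::Simple is installed. XMLin() accepts a string of XML
# as well as a file name; ForceArray keeps <record> an array reference
# even when the input holds just one record.
if (eval { require XML::Simple; 1 }) {
    my $config = XML::Simple::XMLin($xml, ForceArray => ['record']);
    print "$_->{'CT.NUM'}\n" for @{ $config->{record} };
}
else {
    print $xml;
}
```

Forcing "record" to an array means the loop works the same whether the file holds one record or many.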

HTH,
Thomas



Dermot Paikkos

2005-06-09, 8:56 am

Yeap that seems to have done it. Thanx Thomas.
Dp.


On 9 Jun 2005 at 12:23, Thomas Bätzler wrote:

> Dermot Paikkos <dermot@sciencephoto.com> asked:
>
> You can use forward slashes as directory separators, i.e.
>
> my $config = XMLin('c:/windows/desktop/section-xml');
>
>
> Your XML isn't well-formed. You are missing an outermost
> tag that encompasses the whole document, i.e.
>
> <document>
> <record>
> ...
> </record>
> ...
> <record>
> ...
> </record>
> </document>
>
> Add the tags and it'll work.
>
> HTH,
> Thomas
>



~~
Dermot Paikkos * dermot@sciencephoto.com
Network Administrator @ Science Photo Library
Phone: 0207 432 1100 * Fax: 0207 286 8668
