XML::Twig
OK. I am now desperate. I have written a subroutine to split up large
(~2-3MB) XML documents into separate documents. When I use
$twig->parsefile I get the following error:

"not well-formed (invalid token) at line 27072, column 1934, byte 878399
at C:/Perl/site/lib/XML/Parser.pm line 187"

When I change to $twig->safe_parsefile I can parse the document, but it
only gets a portion of the document (~38 of 83 elements).

I am the first to admit that I am not a Perl hacker by trade, so please
go easy on my code sample. I should also mention that this code
worked great on smaller files ( <300k ).

Any help/suggestions would be greatly appreciated.

Brendan


use XML::Twig;
use Time::HiRes qw(gettimeofday);    # assumed source of gettimeofday

sub splitFiles {
    my $fPath = $_[0];
    my $twig  = XML::Twig->new;
    logMessage("DEBUG - Build the Twig for " . $fPath);
    # safe_parsefile wraps the parse in an eval, so malformed input does
    # not die here: it returns false and leaves the error in $@
    $twig->safe_parsefile($fPath);    # build the twig
    logMessage("DEBUG - I can parse the file");
    my $root = $twig->root;           # root of the twig (vdf_metadata_list)
    logMessage("DEBUG - Videos: " . $root->children_count);
    my @videos = $root->children;     # put the vdf_metadata elements into an array
    if (@videos > 0) {
        logMessage("DEBUG - Number of videos is " . scalar @videos);
        my $i = 0;
        foreach my $video (@videos) {
            $i++;
            my $timeStamp = gettimeofday;
            my $tmpPath   = $tmpDir . $timeStamp . $i;    # $tmpDir is set elsewhere
            open(my $FH, '>', $tmpPath) || die("cannot open file: " . $!);
            $video->print($FH);
            close($FH);               # close the lexical handle, not the bareword FH
        }
    } else {
        logMessage("DEBUG - Skipping file " . $fPath);
    }
}

c0rk
09-25-04 08:56 PM


Re: XML::Twig

c0rk wrote:
> OK. I am now desperate. I have written a subroutine to split up large
> (~2-3MB) XML documents into separate documents. When I use
> $twig->parsefile I get the following error:
>
> "not well-formed (invalid token) at line 27072, column 1934, byte 878399
> at C:/Perl/site/lib/XML/Parser.pm line 187"

Well, in the absence of any evidence to the contrary I'd be inclined to
accept that at face value.

Do you have a reason to disbelieve it?


Brian McCauley
09-25-04 08:56 PM


Re: XML::Twig
c0rk <pam4prezNOSPAM@hotmail.com> wrote:

> When I use $twig->parsefile I get the following error:
>
> "not well-formed (invalid token) at line 27072, column 1934, byte 878399
> at C:/Perl/site/lib/XML/Parser.pm line 187"


This message means that there is something wrong with the _data_
rather than with the code.

Open the data file to the 1934th character on the 27072nd line
and see what it is that makes it invalid XML.
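
For anyone following along, here is a minimal sketch of that lookup in Perl. The helper name context_at and the file name in the example call are made up for illustration. The reported column may be a byte offset that differs by one from what an editor shows, so the sketch returns a window of text around the position rather than a single character.

```perl
use strict;
use warnings;

# Return a printable window of text around a 1-based line and column.
# Anything outside printable ASCII is shown as \xNN so it stands out.
sub context_at {
    my ($path, $line_no, $col, $radius) = @_;
    $radius = 20 unless defined $radius;

    # Read raw bytes so the offsets match what the parser counted
    open(my $fh, '<:raw', $path) or die "cannot open $path: $!";
    my $line;
    while (my $l = <$fh>) {
        if ($. == $line_no) { $line = $l; last; }
    }
    close($fh);
    return unless defined $line;          # the file has fewer lines

    my $start = $col - 1 - $radius;
    $start = 0 if $start < 0;
    return '' if $start >= length $line;  # column lies past the end of the line

    my $chunk = substr($line, $start, 2 * $radius + 1);
    $chunk =~ s/([^\x20-\x7E])/sprintf('\\x%02X', ord($1))/ge;
    return $chunk;
}

# Example call with the position from the error message
# (the file name is hypothetical):
# print context_at('videos.xml', 27072, 1934), "\n";
```

Whatever shows up as \xNN in the output is a good first suspect.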



--
Tad McClellan                          SGML consulting
tadmc@augustmail.com                   Perl programming
Fort Worth, Texas

Tad McClellan
09-25-04 08:56 PM


Re: XML::Twig
Brian McCauley <nobull@mail.com> wrote in
news:cj46h1$v2m$1@slavica.ukpost.com:

> Well, in the absence of any evidence to the contrary I'd be inclined
> to accept that at face value.
>
> Do you have a reason to disbelieve it?

Brian

You know - I have been working on this script since Thursday, trying to
determine _my_ problem. When I saw this error, I took it to mean there was
an error in my processing method (e.g. a memory problem). For whatever
reason, I just didn't read the error message for what it was. Turns out
that the XML has bad characters in it. I replaced those characters and my
script processed a 3MB file in seconds.
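
The thread does not say which characters were bad. As a rough sketch, assuming the usual suspects (control bytes that XML 1.0 forbids, or bytes that are not valid UTF-8, such as Latin-1 text in a file parsed as UTF-8), a scan like the following can list candidates before the parser ever sees the file. Both helper names are hypothetical.

```perl
use strict;
use warnings;

# Return [line, column, byte value] for every control byte that XML 1.0
# forbids: everything below 0x20 except tab, line feed and carriage return.
sub find_forbidden_controls {
    my ($path) = @_;
    my @hits;
    open(my $fh, '<:raw', $path) or die "cannot open $path: $!";
    while (my $line = <$fh>) {
        while ($line =~ /([\x00-\x08\x0B\x0C\x0E-\x1F])/g) {
            # pos() is the offset just past the match, i.e. the 1-based column
            push @hits, [ $., pos($line), ord($1) ];
        }
    }
    close($fh);
    return @hits;
}

# Return the numbers of the lines that do not decode as UTF-8.
sub find_non_utf8_lines {
    my ($path) = @_;
    my @bad;
    open(my $fh, '<:raw', $path) or die "cannot open $path: $!";
    while (my $line = <$fh>) {
        # utf8::decode returns false when the bytes are not well-formed UTF-8
        push @bad, $. unless utf8::decode($line);
    }
    close($fh);
    return @bad;
}
```

This only covers two possible causes. An unescaped & or < in the data would produce a similar parser error and would need a different check.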

Many thanks for your response!

-c

c0rk
09-26-04 08:56 PM


Re: XML::Twig
Tad McClellan <tadmc@augustmail.com> wrote in
news:slrnclb93j.qpk.tadmc@magna.augustmail.com:

> This message means that there is something wrong with the _data_
> rather than with the code.
>
> Open the data file to the 1934th character on the 27072nd line
> and see what it is that makes it invalid XML.

Tad,

Thanks for the response. You are 100% correct. I replaced the bad
characters at the specified location, and life is good!!!

Thanks,

-c

c0rk
09-26-04 08:56 PM


PERL Miscellaneous archive

All times are GMT.

 
Copyrights CodeComments.com 2004 - 2006
