Home > Archive > PERL Miscellaneous > August 2007 > Re: UTF-8 problem
| Peter J. Holzer 2007-08-25, 7:05 pm |
| On 2007-08-21 22:23, Todor Vachkov <vachkov@math.tu-berlin.de> wrote:
> Hello all,
>
> I'm trying to convert an exported XML file into a Perl data structure with the XML::LibXML module.
> Thus I got this error message:
>
>
> I thought the solution would be:
>
Don't do this. XML files contain an indication of their encoding, so you
should treat them as binary files
open(my $fh, '<:raw', '/foodir/export.xml') or die "Cannot open export.xml: $!";
and let the XML parser do the rest.
If that doesn't work, the encoding stored in the file is probably
wrong, either because the generating software was buggy or because
someone already converted the file incorrectly. You may have luck
fixing the encoding declaration (it should be in the first line, which
looks like this:
<?xml version="1.0" encoding="UTF-8" ?>
If the encoding is missing, UTF-8 is assumed).
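One way to test whether the declared encoding is wrong is to read the file as raw bytes, pull the encoding out of the XML declaration, and see whether the bytes actually decode in it. The following is only a minimal sketch using the core Encode module: the helper names declared_encoding and decodes_cleanly are made up for illustration, the sample strings stand in for the real contents of export.xml, and byte order marks and UTF-16 input are ignored.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Return the encoding named in the XML declaration, or 'UTF-8' if none is given.
# (Hypothetical helper; ignores BOMs and UTF-16 input for simplicity.)
sub declared_encoding {
    my ($bytes) = @_;
    if ($bytes =~ /\A<\?xml[^>]*?\bencoding\s*=\s*(["'])([A-Za-z][A-Za-z0-9._-]*)\1/) {
        return $2;
    }
    return 'UTF-8';
}

# True if the bytes decode without errors in the given encoding.
sub decodes_cleanly {
    my ($bytes, $encoding) = @_;
    return eval { decode($encoding, $bytes, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;
}

# Sample data standing in for the raw contents of export.xml (assumption).
# $good holds a UTF-8 encoded u-umlaut, $bad the same letter in ISO-8859-1.
my $good = qq{<?xml version="1.0" encoding="UTF-8" ?>\n<a>\xC3\xBC</a>\n};
my $bad  = qq{<?xml version="1.0" encoding="UTF-8" ?>\n<a>\xFC</a>\n};

for my $bytes ($good, $bad) {
    my $enc = declared_encoding($bytes);
    printf "%s: %s\n", $enc,
        decodes_cleanly($bytes, $enc) ? 'consistent' : 'mismatch';
}
```

A mismatch only says that the bytes and the declaration disagree. It does not say which of the two is right. A clean decode also proves little for single-byte encodings such as ISO-8859-1, where every byte sequence is accepted.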
--
_ | Peter J. Holzer | I know I'd be respectful of a pirate
|_|_) | Sy min WSR | with an emu on his shoulder.
| | | hjp@hjp.at |
__/ | http://www.hjp.at/ | -- Sam in "Freefall"
| Todor Vachkov 2007-08-25, 7:05 pm |
| Peter J. Holzer wrote:
> On 2007-08-21 22:23, Todor Vachkov <vachkov@math.tu-berlin.de> wrote:
>
> Don't do this. XML-files contain an indication of their encoding, you
> should treat them as binary files
>
> open(my $fh, "< :raw" ,'/foodir/export.xml');
>
> and let the XML parser do the rest.
>
> If that doesn't work, the encoding stored in the file is probably
> wrong, either because the generating software was buggy or because
> someone already incorrectly converted the file. You may have luck by
> fixing the encoding (it should be in the first line which looks like
> this:
>
> <?xml version="1.0" encoding="UTF-8" ?>
>
> If the encoding is missing, UTF-8 is assumed).
>
Thanks for your reply Peter!
I'm now using XML::Smart, so I no longer have the UTF-8 problem.
The file has the declaration
<?xml version="1.0" encoding="UTF-8" ?>
As I already mentioned, it contains source code from Perl scripts, and I
found out that some of them are ISO-8859-1 encoded. The German umlauts in particular caused some trouble, as you know ;)
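A file that is declared as UTF-8 but has ISO-8859-1 passages embedded in it can sometimes be repaired before parsing. The sketch below is not what was done in this thread. It assumes that every line is entirely in one of the two encodings and that any line which is not valid UTF-8 is ISO-8859-1. The helper names decode_line and normalize_to_utf8 are invented for the example, and the sample string stands in for the real file contents.

```perl
use strict;
use warnings;
use Encode qw(decode encode FB_CROAK LEAVE_SRC);

# Decode one line of bytes: try strict UTF-8 first, fall back to ISO-8859-1.
# Assumption: anything that is not valid UTF-8 is ISO-8859-1.
sub decode_line {
    my ($bytes) = @_;
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return defined $text ? $text : decode('ISO-8859-1', $bytes);
}

# Re-encode mixed input line by line as consistent UTF-8 bytes.
sub normalize_to_utf8 {
    my ($bytes) = @_;
    return join '', map { encode('UTF-8', decode_line($_)) } split /^/, $bytes;
}

# Sample input (assumption): a UTF-8 line followed by the same text in ISO-8859-1.
my $mixed = "gr\xC3\xBC\xC3\x9Fe\n" . "gr\xFC\xDFe\n";
my $clean = normalize_to_utf8($mixed);
```

The guess can go wrong: an ISO-8859-1 byte sequence that happens to be valid UTF-8 is left untouched, so the result should be checked by eye before the original file is overwritten.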
Greetings,
Todor