Code Comments

Convert utf-8 XML Document to ISO format
Hi List,

I've been trying hard for the last 2 days to solve a problem converting UTF-8 to
ISO-8859-1.

I receive a POST of a UTF-8 XML document. The declaration is okay, and the
document is sent by a Windows server.

I have tried to convert the document to Latin1 (ISO-8859-1) in every
way I can imagine, but nothing really modifies the utf flag.

When I change the text to iso-8859-1 and put it into my database (I tried both
a utf8 and a latin1 table), I get this sign " Â " before the sign I want to
save in my database!

When I print the string to the server's screen (logfile), it shows
me that the data comes in with the utf-8 flag set (the Â sign, I guess).
After transforming it, I print it with Data::Dump and the signs become
something like \x{c2}\x.... I guess the \x{c2} is the special character set
by utf. When I then transform the string using Unicode::String,
the string becomes Latin1 in the logfile but not in my database: in the
UTF-8 table the signs are good, but in the latin1 table the signs become
weird.

Maybe someone has a hint on how to convert an XML::Simple document (received
by POST) in UTF-8 with the flag set into a simple Latin1 document, so that I
can save it into my latin1 table!

Thanks for any help


Ciao Thomas




webmaster@echtwahr.com
08-04-06 12:55 PM


Re: Convert utf-8 XML Document to ISO format
On 08/04/2006 05:07 AM, webmaster@echtwahr.com wrote:
> Hi List,
> [...]

Hi Web.

> Maybe someone has a hint how to convert a XML::Simple document (by POST) in
> UTF-8 with the FLAG set on to a Simple LATIN1 document so that I can save it
> into my latin1 table!
>
> Thanks for any help
>
> Ciao Thomas

Use the Encode module to convert the string to iso-8859-1.
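
A minimal sketch of that conversion with the core Encode module, assuming the POST body arrives as raw UTF-8 bytes. The variable names and the sample text are hypothetical and only illustrate the two steps: decode the incoming bytes into a Perl character string, then encode that string as ISO-8859-1.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for the raw POST body: the UTF-8 bytes for "§ 5 für"
my $utf8_bytes = "\xC2\xA7 5 f\xC3\xBCr";

# Step 1: UTF-8 bytes -> Perl character string.
# FB_CROAK makes malformed input die; LEAVE_SRC keeps $utf8_bytes intact.
my $chars = decode('UTF-8', $utf8_bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);

# Step 2: character string -> ISO-8859-1 bytes for a latin1 column.
# FB_CROAK also dies on characters that Latin-1 cannot represent.
my $latin1_bytes = encode('ISO-8859-1', $chars, Encode::FB_CROAK | Encode::LEAVE_SRC);

# Show the resulting bytes in hex: the \xC2 and \xC3 lead bytes are gone.
print join(' ', map { sprintf '%02X', ord } split //, $latin1_bytes), "\n";
# prints: A7 20 35 20 66 FC 72
```

Characters that have no Latin-1 equivalent make the encode step die here. Whether to die, substitute, or drop such characters is a choice the thread does not settle.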



Mumia W.
08-04-06 11:55 PM


RE: Convert utf-8 XML Document to ISO format
You can try

*Encode::Unicode*
<http://search.cpan.org/author/DANKO...code/Unicode.pm>
or *Unicode::Transform*
<http://search.cpan.org/author/SADAH...34/Transform.pm>

Another way is to use the substitution operator (s///) to do the
inverse conversion.


--

Sincerely,

,_,
(O,O)   J. Alejandro Ceballos Z.      buzon@alejandro.ceballos.info
(   )
-"-"-----------------------------------------------------------------
http://alejandro.ceballos.info        mobile: (33) 3849-8936





J. Alejandro Ceballos Z. -JOAL-
08-08-06 02:55 AM


Re: Convert utf-8 XML Document to ISO format
Thomas,

I've had a similar experience and will provide my solution below. I'm
not sure it's optimal, but it works for me. I'm working with a file, not
an HTTP POST. (In addition to what is below, I would suggest looking at
how you specify the charset encoding of your POST to be sure it is what
you think it is. That part is beyond me.)

As far as I can tell, Perl works in UTF-8 internally and can mangle diacritics
given to it in other character sets. The key is that you convert TWICE:
first to decode the data on its way into Perl, then once more to encode it
right before you put it in the database. As soon as Perl does any
transformations on text, it seems to go back to UTF-8. When I leave off the
first or second conversion, I get mangled diacritics.

use strict;
use warnings;
use Encode;

my $file = "file data in iso-8859-1 or LATIN1";

# This could be a string too, i.e., what you receive from your POST, but
# then you would call decode() on it instead of opening a file, I think.

open(F, "<:encoding(iso-8859-1)", $file) or die "Cannot open $file: $!";
my $string = do { local $/; <F> };   # read the whole file as decoded text
close(F);

# This gets the data in cleanly. You do transformations on the text as
# you please, but then Perl has it in its internal (UTF-8) form again. So
# *right before* you put it in your SQL query, take your $string and put it
# into the proper encoding for your database.

$string = encode("iso-8859-1", $string);

# Probably a good idea to use a bound parameter, i.e., ? in the query, and
# provide the $string as a parameter in the execute command.

At least in my case, this solves the problem.
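
For what it's worth, here is a small sketch of what seems to happen when the first step is left off, assuming the incoming data is raw UTF-8 bytes as in Thomas's POST. The sample character and variable names are made up for illustration. Without the decode, Perl treats each UTF-8 byte as its own Latin-1 character, which would explain the stray " Â " (byte C2) in front of the real sign.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: the UTF-8 bytes for the single character "§"
my $utf8_bytes = "\xC2\xA7";

# Skipping the decode: Perl sees two Latin-1 characters, "Â" (C2) and "§" (A7),
# and writes both of them out.
my $wrong = encode('ISO-8859-1', $utf8_bytes);

# Decoding first: Perl sees the one character "§" and writes a single byte.
my $right = encode('ISO-8859-1', decode('UTF-8', $utf8_bytes));

printf "without decode: %d byte(s)\n", length $wrong;   # 2
printf "with decode:    %d byte(s)\n", length $right;   # 1
```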

-Chris Cosner


--
Chris Cosner

Systems Administrator
Stanford University Press
1450 Page Mill Road
Palo Alto, CA 94304
(650) 724-7276
ccosner@stanford.edu
http://www.sup.org

Chris Cosner
08-14-06 11:55 PM


PERL CGI Beginners archive
