Home > Archive > PERL Beginners > October 2006 > Re: Yet another unicode question: windows platform










Re: Yet another unicode question: windows platform
Author: tewilk@gmail.com
Date: 2006-10-31, 7:56 am

Will do, thanks for the guidance!

Alexei A. Frounze wrote:
> tewilk@gmail.com wrote:
>
> Well, in general, if you don't know the type of file (ASCII, UTF8,
> UTF16LE/BE, UTF32LE/BE, some non-ASCII non-Unicode 8/16-bit encoding), you
> have to check it against every type you support, and if you find that the
> file contains, say, valid UTF8, then so be it. A few hints... Unicode
> text files may begin with a so-called BOM (Byte Order Mark). Notepad usually
> (if not always) puts it at the beginning of a saved Unicode text file.
> It's a different sequence of bytes for UTF8, UTF16LE, UTF16BE, etc. If you
> find it, you can validate the rest of the file assuming the Unicode format
> the BOM indicates. The Unicode standard defines the valid "code point"
> ranges (U+0000 to U+10FFFF, excluding the surrogate range U+D800 to
> U+DFFF). If you find something outside these ranges, it's not Unicode or
> the file is corrupt. To find out whether the file is plain ASCII, just
> check that all bytes in it are in the range 0...127. If a file doesn't look
> like ASCII or Unicode, it's either some other 8-bit or 16-bit encoding or
> it's corrupt. Btw, 7-bit ASCII is a subset of UTF8.
>
> I highly suggest that you read the Unicode documentation on the Unicode
> website: http://www.unicode.org. Must-reads are the Unicode FAQ and "To the
> BMP and Beyond!" by Eric Muller, which should be somewhere on the net. I
> suggest that you start with the latter to get an overall idea of Unicode
> quickly. The ultimate source of information is the Unicode standard itself.
>
> Alex
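
The checks Alex describes (look for a BOM, validate against the format it names, test for plain ASCII, then try UTF8) can be sketched roughly as follows. This is only an illustration in Python, not code from the thread; the function name guess_encoding and the labels it returns are made up for the example, and the same steps should carry over to Perl's Encode module. It leans on the decoder's strict mode to reject invalid sequences instead of checking code point ranges by hand.

```python
import codecs

# BOM signatures, 4-byte marks first: the UTF-32LE BOM (FF FE 00 00) begins
# with the UTF-16LE BOM (FF FE), so the longer marks must be tested first.
# (A UTF-16LE file whose first character is U+0000 would be misread here.)
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def guess_encoding(data: bytes) -> str:
    """Best-guess label for the encoding of data (hypothetical helper)."""
    # 1. A BOM, if present, names the format; validate the rest against it.
    for bom, name in BOMS:
        if data.startswith(bom):
            try:
                # Strict decoding raises on invalid or truncated sequences.
                data[len(bom):].decode(name)
                return name
            except UnicodeDecodeError:
                return "corrupt " + name
    # 2. No BOM: plain ASCII means every byte is in the range 0..127.
    if all(b < 128 for b in data):
        return "ascii"
    # 3. Not ASCII: see whether the bytes form valid UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Some other 8-bit or 16-bit encoding, or a corrupt file.
        return "unknown"
```

To use it on a file, read the file in binary mode and pass the bytes in, e.g. guess_encoding(open(path, "rb").read()), where path is whatever file you want to inspect. Note that a file without a BOM that happens to be valid UTF8 may still be in some other 8-bit encoding; the result is a guess, not proof.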

