Re: Stupid question regarding encoding
| a249@mailinator.com 2007-02-19, 10:06 pm |
| On 19 Feb., 10:55, Daniel Moyne <dmo...@tiscali.fr> wrote:
> I have written a java plugin for an application that reads a file to execute
> a series of actions ; when I parse my text file I check for the existence
> of the following line "#HEADER" ; it works fine on my equipment ; on
> Windows somebody working with the text file encoded in utf-8 told me that
> there is a problem as the line containing "#HEADER" is not found !
>
> Now I am a little bit puzzled about all this :
>
> (a) when you write java code and compile it what happens to the
> string "#HEADER" because it will be used in the following test :
> if (line.equals("#HEADER")) {...}
> where line is read from the text file with encoding as is on the machine
> where the class is executed ? in other words you are comparing what to
> what ?
It is not how Java compiles the "#HEADER" literal. It is how you open
and read the file. Fix that.
| |
| Daniel Moyne 2007-02-20, 7:07 pm |
| a249@mailinator.com wrote:
> On 19 Feb., 10:55, Daniel Moyne <dmo...@tiscali.fr> wrote:
>
> It is not how Java compiles the "#HEADER" literal. It is how you open
> and read the file. Fix that.
Can you please be more specific here? For your information, I use this to
read my text file line by line:
I get fileName (File object) from JFileChooser
Then I do this :
BufferedReader entree=new BufferedReader(new FileReader(fileName));
do {
/* we parse text file line by line */
fileTextLine=entree.readLine();
.................
} while (fileTextLine != null) ;
In this there is no encoding check !
Thanks.
| |
| Gordon Beaton 2007-02-20, 7:07 pm |
| On Tue, 20 Feb 2007 15:42:24 +0100, Daniel Moyne wrote:
> Then I do this :
> BufferedReader entree=new BufferedReader(new FileReader(fileName));
Do not use FileReader. Instead, use a FileInputStream and an
InputStreamReader, specifying the correct encoding when you create the
InputStreamReader.
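A minimal sketch of that suggestion, assuming the file is known to be UTF-8. The class name ExplicitEncodingReader and the helper readLines are made up for illustration and are not from the original plugin:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original plugin.
class ExplicitEncodingReader {

    // Reads every line of the file, decoding the bytes with the named charset
    // rather than the platform default that FileReader would use.
    static List<String> readLines(File file, String charsetName) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), Charset.forName(charsetName)))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the text file; "UTF-8" is an assumption here.
        for (String line : readLines(new File(args[0]), "UTF-8")) {
            if (line.equals("#HEADER")) {
                System.out.println("found #HEADER");
            }
        }
    }
}
```

Testing for null in the loop condition also avoids processing the final null that readLine() returns at end of file.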
/gordon
--
[ don't email me support questions or followups ]
g o r d o n + n e w s @ b a l d e r 1 3 . s e
| |
| Oliver Wong 2007-02-20, 7:07 pm |
|
"Gordon Beaton" <n.o.t@for.email> wrote in message
news:45db0e7c$0$31542$8404b019@news.wineasy.se...
> On Tue, 20 Feb 2007 15:42:24 +0100, Daniel Moyne wrote:
>
> Do not use FileReader. Instead, use a FileInputStream and an
> InputStreamReader, specifying the correct encoding when you create the
> InputStreamReader.
This is assuming you know the correct encoding. To answer the OP's
earlier question:
"Daniel Moyne" <dmoyne@tiscali.fr> wrote in message
news:erbs6k$7jc$1@news.tiscali.fr...
>
> (b) when with a Java app you read a text file can you get its encoding
> format like utf-8 or ANSI or whatever to decide about some actions to be
> taken ?
No, you can't. No computer program possibly can, as it's often
ambiguous. The string "#HEADER" encoded in ASCII produces the exact same
sequence of bits as when it's encoded in UTF-8, for example. So given that
sequence of bits, it's impossible to know whether it was encoded in ASCII or
in UTF-8 -- the results are the same!
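A quick way to check this claim for yourself. The class and method names below are made up for the illustration:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class.
class SameBytesDemo {

    // True when the string encodes to identical bytes in US-ASCII and UTF-8.
    static boolean sameInAsciiAndUtf8(String s) {
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, utf8);
    }

    public static void main(String[] args) {
        // Pure ASCII text: both encodings give the same bytes.
        System.out.println(sameInAsciiAndUtf8("#HEADER"));   // prints true
        // A non-ASCII character (e with acute accent): the encodings differ.
        System.out.println(sameInAsciiAndUtf8("caf\u00e9")); // prints false
    }
}
```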
Possible solutions include: mandating a specific encoding (e.g. everyone
who uses your program must use UTF-8) or allowing the user to specify the
encoding (perhaps via command line arguments?)
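A rough sketch of the second option, assuming the encoding name arrives as an optional second command-line argument and that UTF-8 is the mandated fallback. The argument layout and the name resolveCharset are assumptions for illustration:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class.
class EncodingOption {

    // Assumed layout: args[0] = file name, args[1] = optional charset name.
    // Charset.forName throws an IllegalArgumentException subclass when the
    // name is malformed or not supported on this JVM.
    static Charset resolveCharset(String[] args) {
        if (args.length >= 2) {
            return Charset.forName(args[1]);
        }
        return StandardCharsets.UTF_8; // assumed default
    }

    public static void main(String[] args) {
        System.out.println("Reading with charset " + resolveCharset(args).name());
    }
}
```

The resulting Charset can then be handed to the InputStreamReader constructor.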
- Oliver
| |
| Oliver Wong 2007-02-20, 7:07 pm |
|
"Oliver Wong" <owong@castortech.com> wrote in message
news:xvECh.31429$Fi3.720449@wagner.videotron.net...
>
> The string "#HEADER" encoded in ASCII produces the exact same sequence of
> bits as when it's encoded in UTF-8, for example. So given that sequence of
> bits, it's impossible to know whether it was encoded in ASCII or in
> UTF-8 -- the results are the same!
Incidentally, because of this, I doubt that the problem with your
program is one of encoding. Perhaps you could get your user to zip up and
send the faulty file back to you, and you could do some sort of analysis on
the file to figure out what the problem is.
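One possible form of that analysis is to dump the first few bytes of the file in hex and see exactly what surrounds "#HEADER", for instance a byte-order mark (some editors write one at the start of UTF-8 files) or other stray bytes. This is only a sketch; the class name HexPeek and the 16-byte limit are arbitrary choices:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical diagnostic class.
class HexPeek {

    // Formats the first 'length' bytes as space-separated two-digit hex values.
    static String toHex(byte[] bytes, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    // Returns up to 'max' leading bytes of the file as hex. A single read()
    // call is used, which is normally enough for a handful of bytes.
    static String firstBytesHex(String path, int max) throws IOException {
        byte[] buf = new byte[max];
        try (InputStream in = new FileInputStream(path)) {
            int n = in.read(buf);
            return toHex(buf, Math.max(n, 0));
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the suspect file.
        System.out.println(firstBytesHex(args[0], 16));
    }
}
```

A file that starts directly with "#HEADER" in ASCII or UTF-8 begins with 23 48 45 41 44 45 52; anything in front of that would explain why equals() fails.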
The "zip up" step is important, because you don't want the user's FTP or
e-mail program to re-encode the text into some other encoding, or convert
"\r\n" to "\n", or make any of the other conversions that programs typically
apply automatically to text files when sending them between computers.
- Oliver
|