> Hello,
> the drop data is "file:///home/user/.../donn%E9es.xls\r\n"
>
> I drop the "\r\n", and i try to decode the %xx :
>
> URI uri = new URI(filename);
> String decodedPath = uri.getPath();
try
String decodedPath = URLDecoder.decode(filename, "ISO-8859-1");
instead. My guess at ISO-8859 is instinct (from seeing the %E9).
> but the decoded path is: "/home/user/.../donn�es.xls"
>
> So i guess it's a Charset problem but what can i do to solve
Unless there is a way to query the Java drag and drop stuff about the
encoding used, my suggestion is to kick that penguin out of your
computer. Drag and drop has always been a pain in Linux, and character
encoding issues are not taken too seriously there either.
Søren
mtp - 22 Aug 2006 09:50 GMT
>> Hello,
>
[quoted text clipped - 10 lines]
>
> instead. My guess at ISO-8859 is instinct (from seeing the %E9).
You were right. My only idea about where the encoding comes from is the
file.encoding system property:
String fileEncodingCharsetName = System.getProperty("file.encoding");
String decodedUrl = URLDecoder.decode(s, fileEncodingCharsetName);
URL url = new URL(decodedUrl);
File f = new File(url.getPath());
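The same idea as a self-contained sketch, with the charset passed in rather than hard-coded. The names DropPathDecoder and decodeDroppedPath are made up for illustration. It assumes the drop data holds a single file URI, and whether file.encoding really matches what the drag source used is only a guess. Note also that URLDecoder turns '+' into a space, which a real file name could trip over.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;

// Hypothetical helper, not part of any library.
final class DropPathDecoder {

    // Turns dropped data such as "file:///.../donn%E9es.xls\r\n" into a path,
    // decoding the %xx escapes with the given charset.
    static String decodeDroppedPath(String dropData, String charsetName) {
        // Drop the trailing "\r\n" (and any other surrounding whitespace).
        String trimmed = dropData.trim();
        // getRawPath() keeps the %xx escapes, unlike getPath(), which
        // always decodes them as UTF-8.
        String rawPath = URI.create(trimmed).getRawPath();
        try {
            return URLDecoder.decode(rawPath, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String dropData = "file:///home/user/.../donn%E9es.xls\r\n";
        // Assumption: the drag source used the platform default encoding.
        String guess = System.getProperty("file.encoding");
        System.out.println(decodeDroppedPath(dropData, guess));
        System.out.println(decodeDroppedPath(dropData, "ISO-8859-1"));
    }
}
```

Going through getRawPath() rather than getPath() is what avoids the problem from the first post: the escapes stay untouched until a charset is chosen explicitly.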
>> but the decoded path is: "/home/user/.../donn�es.xls"
>>
[quoted text clipped - 4 lines]
> computer. Drag and drop has always been a pain in Linux, and character
> encoding issues are not taken too seriously there either.
true, but i can't drop it ...
> Hello,
>
[quoted text clipped - 21 lines]
>
> Does anyone know how to solve this?
In your "incorrect" decoded path, what's the Unicode value of the
incorrect character? E9 is indeed the correct Unicode value for the
"lowercase Latin e with acute accent":
http://www.eki.ee/letter/chardata.cgi?ucode=00E9
- Oliver
mtp - 22 Aug 2006 10:18 GMT
> In your "incorrect" decoded path, what's the unicode value of the
> incorrect character? E9 is indeed the correct unicode value for the
> "lowercase latin e with acute accent":
> http://www.eki.ee/letter/chardata.cgi?ucode=00E9
it's also true for ISO-8859-1 and ISO-8859-15:
found in charsets: 8859-1 (E9); 8859-10 (E9); 8859-13 (E9); 8859-14
(E9); 8859-15 (E9); ...
but the page explains it:
UTF-8 (c3, a9) é
       ^^^^^^
which i checked :
bsh % print(URLDecoder.decode("donn%E9es", "UTF-8"));
donn�es
bsh % print(URLDecoder.decode("donn%E9es", "ISO-8859-1"));
données
bsh % print(URLEncoder.encode("données", "UTF-8"));
donn%C3%A9es
    ^^^^^^
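For anyone without BeanShell at hand, here are the same three checks as a plain Java program. This is a minimal sketch and the class name CharsetCheck is made up. U+FFFD is the replacement character the decoder substitutes when the bytes are not valid in the chosen charset, which is what happens when the lone byte E9 is read as UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical class name, for illustration only.
final class CharsetCheck {

    // Decodes %xx escapes using the named charset.
    static String decode(String s, String charsetName) {
        try {
            return URLDecoder.decode(s, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Escapes a string using the named charset.
    static String encode(String s, String charsetName) {
        try {
            return URLEncoder.encode(s, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String escaped = "donn%E9es";
        String asUtf8 = decode(escaped, "UTF-8");
        String asLatin1 = decode(escaped, "ISO-8859-1");
        // A lone E9 byte is not valid UTF-8, so the decoder substitutes U+FFFD.
        System.out.println("UTF-8 decode contains U+FFFD: " + (asUtf8.indexOf('\uFFFD') >= 0));
        System.out.println("ISO-8859-1 decode is correct: " + asLatin1.equals("donn\u00e9es"));
        // Encoding the correct name as UTF-8 yields two escaped bytes for the accented e.
        System.out.println(encode("donn\u00e9es", "UTF-8"));
    }
}
```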