Java Forum / General / August 2006

stat() help

rudy.martono@gmail.com - 04 Aug 2006 22:26 GMT
Hi,

I am writing a JNI function that receives a jstring filename and returns
the creation date using the stat function.

The problem arises when I have to handle a Unicode filename.
For example:
Εχ.txt ==> "\u0395\u03c7.txt"

Please correct me if I am wrong.

function header:
JNIEXPORT jlong JNICALL Java_getAccessedDate
(JNIEnv * env, jclass obj, jstring filename)

Using GetStringUTFRegion, I am able to convert the jstring
filename into UTF-8 format.

(*env)->GetStringUTFRegion(env, filename, 0, len, rtn);
where
filename is the parameter jstring
rtn is (char *)
and len is (*env)->GetStringLength(env, filename)

When I print it out, it looks like it gives the right value.
But when I double-check the value by using fopen to see whether the file
exists, it returns NULL.
Therefore, I assume stat will return 0, but it returns 724466048.

I am still not familiar with Unicode or UTF-8.
Does UTF-8 need 2 bytes per character?
If that is true, then I should use wchar_t instead of char, _wfopen (to
detect whether the file exists), and _wstat (to get the file's info).

Thank you,

Rudy
rudy.martono@gmail.com - 04 Aug 2006 23:00 GMT
I set a flag if stat returns -1.
So it looks like it is coming from the translation......

> Hi,
>
[quoted text clipped - 33 lines]
>
> Rudy
Roland de Ruiter - 04 Aug 2006 23:17 GMT
> Hi,
>
[quoted text clipped - 33 lines]
>
> Rudy

UTF-8 is a variable-length character encoding requiring 1, 2, 3 or 4
bytes per character. See <http://en.wikipedia.org/wiki/UTF-8>.

JNI however uses a so-called modified form of UTF-8, which, among other
differences, only uses 1, 2 or 3 bytes per character. See
<http://java.sun.com/j2se/1.5.0/docs/guide/jni/spec/types.html#wp16542>
<http://java.sun.com/j2se/1.5.0/docs/api/java/io/DataInput.html#modified-utf-8>

The UTF-8 bytes of the string Εχ.txt are (hexadecimal notation):
  ce 95 | cf 87 | 2e | 74 | 78 | 74
Ε \u0395: 2 bytes: ce 95
χ \u03c7: 2 bytes: cf 87
. \u002e: 1 byte:  2e
t \u0074: 1 byte:  74
x \u0078: 1 byte:  78
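
The byte values above can be reproduced with a small encoder. What follows is only a minimal sketch in plain C, not JNI code: the function names utf8_encode_bmp and utf8_encode_units are made up for illustration, and the sketch assumes every input value is a code point in the Basic Multilingual Plane (U+0000 to U+FFFF) that is not a surrogate.

```c
#include <stddef.h>

/* Hypothetical helper: encode one BMP code point (U+0000..U+FFFF) as
   standard UTF-8. Returns the number of bytes written to out (1 to 3).
   Surrogates and code points above U+FFFF are not handled. */
static size_t utf8_encode_bmp(unsigned int cp, unsigned char *out)
{
    if (cp < 0x80) {                      /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Hypothetical helper: encode n 16-bit values into out and return the
   total number of bytes written. out must have room for 3 * n bytes. */
static size_t utf8_encode_units(const unsigned short *src, size_t n,
                                unsigned char *out)
{
    size_t total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += utf8_encode_bmp(src[i], out + total);
    return total;
}
```

Feeding the six values 0x0395, 0x03C7, '.', 't', 'x', 't' to utf8_encode_units yields the eight bytes ce 95 cf 87 2e 74 78 74. For this filename the modified UTF-8 used by JNI gives the same bytes; the two encodings differ only for U+0000 and for characters outside the BMP.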

Probably stat/wstat and fopen/wfopen expect a fixed size char as
filename parameter. Which encoding do they expect?

Regards,

Roland

rudy.martono@gmail.com - 07 Aug 2006 21:59 GMT
Well,

I am not sure about the encoding. The filename can be anything.
Basically, I want to be able to retrieve the creation date from it.

Is it correct to convert the jstring filename into wide characters
every time, and use _wstat to get the creation date?

I have changed the code so that it uses GetStringChars( env,
filename, NULL )
to get the Unicode value instead of GetStringUTFChars.

const jchar* file = (*env)->GetStringChars( env, filename, NULL );

and use WideCharToMultiByte function

WideCharToMultiByte( CP_ACP, 0, (LPCWSTR)filename,
                     (*env)->GetStringLength(env, filename)*2,
                     new_filename,
                     (*env)->GetStringLength(env, filename)*2+1,
                     NULL, NULL )

I tested it with sampletest_ù.txt, and it works.

When I test it again with Εχ.txt, I get Εχ.txt

Thank you,

Rudy

> > Hi,
> >
[quoted text clipped - 52 lines]
> Probably stat/wstat and fopen/wfopen expect a fixed size char as
> filename parameter. Which encoding do they expect?
rudy.martono@gmail.com - 07 Aug 2006 22:56 GMT
I think I have found the solution.
Someone posted the same question, and the solution is to use memcpy to
copy the characters from the jchar* buffer into a wchar_t buffer.

I will do more testing and post the result.

Thank you,

Rudy

> Well,
>
[quoted text clipped - 82 lines]
> > Probably stat/wstat and fopen/wfopen expect a fixed size char as
> > filename parameter. Which encoding do they expect?
Bill Medland - 04 Aug 2006 23:52 GMT
> Hi,
>
[quoted text clipped - 33 lines]
>
> Rudy

Presumably, since you mention _wfopen and _wstat, you are talking about a
Microsoft Windows platform.  As far as I know, Windows does not normally use
UTF-8 for filenames.  Your best bet, on Windows, would probably be to use
the wide format functions and GetStringChars.

(Subtle complication: if you are not on Windows, watch out for jchar
possibly not matching wchar_t, which might well be 4 bytes wide.)
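
The size mismatch can be handled when copying the characters. Here is a minimal sketch in plain C, written so that it compiles without jni.h: jchar_t is a made-up stand-in for JNI's 16-bit jchar, and jchars_to_wcs is a hypothetical helper, not part of JNI or the Windows API. It uses memcpy only when the two types have the same size, widens element by element otherwise, and always adds the terminating NUL, since the buffer returned by GetStringChars is not guaranteed to be NUL-terminated.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Stand-in for JNI's jchar (an unsigned 16-bit UTF-16 code unit);
   an assumption made so the sketch builds without jni.h. */
typedef uint16_t jchar_t;

/* Hypothetical helper: copy len UTF-16 code units into a newly allocated,
   NUL-terminated wchar_t string. The caller frees the result.
   Returns NULL if the allocation fails. */
static wchar_t *jchars_to_wcs(const jchar_t *src, size_t len)
{
    wchar_t *dst = malloc((len + 1) * sizeof *dst);
    size_t i;

    if (dst == NULL)
        return NULL;

    if (sizeof(wchar_t) == sizeof(jchar_t)) {
        /* Same width (as on Windows): a raw copy is enough. */
        memcpy(dst, src, len * sizeof *dst);
    } else {
        /* Wider wchar_t (often 4 bytes elsewhere): widen each unit.
           Correct for BMP characters only; surrogate pairs would have
           to be combined into a single code point. */
        for (i = 0; i < len; i++)
            dst[i] = (wchar_t)src[i];
    }
    dst[len] = L'\0';
    return dst;
}
```

In a JNI function, src would presumably come from GetStringChars and len from GetStringLength; on Windows the resulting string could then be passed to _wstat, followed by ReleaseStringChars and free.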


Bill Medland


