Java Forum / General / March 2008

In need of something like RandomAccessFile.read(char[] cAr, int off, int len)

lbrtchx@gmail.com - 11 Mar 2008 22:43 GMT
~
http://java.sun.com/j2se/1.4.2/docs/api/java/io/RandomAccessFile.html
~
has:
~
public int read(byte[] b,
               int off,
               int len)
        throws IOException
~
But I need to read in and compare chars/Unicode.
~
I have tried many things, but I haven't been able to find out how.
~
What kind of carpentry do you do with I/O objects to achieve such a
thing?
~
Thanks
lbrtchx
Knute Johnson - 11 Mar 2008 23:49 GMT
> ~
>  http://java.sun.com/j2se/1.4.2/docs/api/java/io/RandomAccessFile.html
[quoted text clipped - 15 lines]
>  Thanks
>  lbrtchx

Besides RandomAccessFile.readChar() you could get a FileChannel, read in
a buffer and convert it to a CharBuffer.  But seek() and readChar()
ought to be adequate.
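
A minimal sketch of the FileChannel route, assuming the caller knows the byte offset and byte count of the region and the encoding of the file. The class and method names (`ChannelCharRead`, `readChars`) are hypothetical. Note that readChar() itself always consumes exactly two bytes and treats them as one big-endian 16-bit char, so it only fits files stored that way.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;

final class ChannelCharRead {
    /**
     * Reads up to byteCount bytes starting at byteOffset and decodes them
     * with the given charset. Fewer bytes are decoded if the file ends early.
     */
    static CharBuffer readChars(RandomAccessFile raf, long byteOffset,
                                int byteCount, Charset charset) throws IOException {
        FileChannel channel = raf.getChannel();
        ByteBuffer bytes = ByteBuffer.allocate(byteCount);
        long position = byteOffset;
        // Positional reads leave the channel's own position untouched.
        while (bytes.hasRemaining()) {
            int n = channel.read(bytes, position);
            if (n < 0) {
                break; // end of file
            }
            position += n;
        }
        bytes.flip();
        // Charset.decode replaces malformed input instead of throwing.
        return charset.decode(bytes);
    }
}
```

The returned CharBuffer can be compared directly or turned into a String with toString().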

Signature

Knute Johnson
email s/nospam/linux/

Mike Schilling - 12 Mar 2008 06:38 GMT
>> ~
>>
[quoted text clipped - 23 lines]
> readChar()
> ought to be adequate.

ISTM that there should be an InputStream subclass that reads bytes
from a RandomAccessFile starting at a given offset.  It's not
difficult to construct, but why doesn't it come standard?
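
A minimal sketch of such a subclass, to show how little is involved. The name `RandomAccessFileInputStream` is hypothetical, and the sketch assumes nothing else moves the file pointer while the stream is in use.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

final class RandomAccessFileInputStream extends InputStream {
    private final RandomAccessFile file;

    /** Positions the file at startOffset (in bytes) and streams from there. */
    RandomAccessFileInputStream(RandomAccessFile file, long startOffset) throws IOException {
        this.file = file;
        file.seek(startOffset);
    }

    @Override
    public int read() throws IOException {
        // Returns the next byte as 0..255, or -1 at end of file.
        return file.read();
    }

    @Override
    public int read(byte[] buffer, int off, int len) throws IOException {
        // Delegates to the bulk read, which shares the same file pointer.
        return file.read(buffer, off, len);
    }
}
```

The stream deliberately does not close the underlying RandomAccessFile, so the caller can seek again and wrap it a second time.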
Knute Johnson - 12 Mar 2008 23:24 GMT
>>> ~
>>>
[quoted text clipped - 26 lines]
> from a RandomAccessFile starting at a given offset.  It's not
> difficult to construct, but why doesn't it come standard?

It does.  He wanted to read chars and that is a little more complicated
because you have to read two bytes at a time.

Of course that is what he really wanted.

Signature

Knute Johnson
email s/nospam/linux/

Mike Schilling - 13 Mar 2008 00:35 GMT
>>>> ~
>>>>
[quoted text clipped - 31 lines]
> complicated
> because you have to read two bytes at a time.

If there were an InputStream, you could attach an InputStreamReader
to it and read characters in any encoding you like.  Unfortunately,
there isn't one.
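
One way to get a comparable effect from the standard library is to position the file's channel and wrap it in a Reader with java.nio.channels.Channels.newReader. The sketch below assumes the byte offset falls on a character boundary in the chosen encoding; `OffsetReader` and `readChars` are hypothetical names.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.Reader;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;

final class OffsetReader {
    /**
     * Decodes up to len chars, starting at byteOffset, into dest[off..].
     * Returns the number of chars actually read (0 if already at end of file).
     */
    static int readChars(RandomAccessFile raf, long byteOffset, String charsetName,
                         char[] dest, int off, int len) throws IOException {
        FileChannel channel = raf.getChannel();
        channel.position(byteOffset);
        // The reader buffers bytes internally, so the channel position
        // after this call is not a reliable character boundary.
        Reader reader = Channels.newReader(channel, charsetName);
        int total = 0;
        while (total < len) {
            int n = reader.read(dest, off + total, len - total);
            if (n < 0) {
                break; // end of file
            }
            total += n;
        }
        return total;
    }
}
```

Because of the internal buffering, a fresh reader should be created after every reposition rather than reused.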
lbrtchx@gmail.com - 15 Mar 2008 00:35 GMT
> ... are these files written with a Java program?
~
Not necessarily. They are mostly texts downloaded from the Internet.
~
> ... If there were an InputStream
~
option as an argument to a RandomAccessFile ctor then things would be
easier, but I think, and I may be wrong, that there is a fundamental
problem here.
~
InputStreams and RandomAccessFiles should not be mixed because when
you go:
~
RandomAccessFile.seek((long) lThere)
~
you can not be absolutely sure that:
~
1) you will land at the start of a byte sequence forming a
character,
~
2) belonging to the encoding you specified in the InputStream
~
3) at, ... where actually? lThere? the API says:
~
http://java.sun.com/j2se/1.5.0/docs/api/java/io/RandomAccessFile.html#seek(long)
~
<API>
seek: public void seek(long pos) throws IOException
~
Sets the file-pointer offset, measured from the beginning of this
file, at which the next read or write occurs. The offset may be set
beyond the end of the file. Setting the offset beyond the end of the
file does not change the file length. The file length will change only
by writing after the offset has been set beyond the end of the file.
Parameters: pos - the offset position, measured in bytes from the
beginning of the file, at which to set the file pointer.
Throws: IOException - if pos is less than 0 or if an I/O error
occurs.
</API>
~
The unclear bit is "measured from the beginning of this file": how
exactly is it measured? Why not simply say "in bytes"?
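
The Parameters line of the quoted API text does state that pos is measured in bytes, which is why seek() can land in the middle of a multi-byte character. For UTF-8 specifically, every continuation byte has the bit pattern 10xxxxxx, so the pointer can be backed up to the start of the character. A minimal sketch, assuming well-formed UTF-8; `Utf8Align` and `seekToCharStart` are hypothetical names, and other encodings need their own rule.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class Utf8Align {
    /**
     * Moves the file pointer from pos back to the first byte of the
     * UTF-8 character that contains pos, and returns that offset.
     */
    static long seekToCharStart(RandomAccessFile raf, long pos) throws IOException {
        long p = pos;
        while (p > 0) {
            raf.seek(p);
            int b = raf.read();
            // Stop at end of file or at any byte that is not 10xxxxxx.
            if (b < 0 || (b & 0xC0) != 0x80) {
                break;
            }
            p--;
        }
        raf.seek(p);
        return p;
    }
}
```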
~
I think that Java should let the programmer do something like what I
illustrate with the piece of pseudo code below:
~
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
// __
 int iChrsRd, iChrBfrSz = 4096;
 char[] cChrAr = new char[iChrBfrSz];
 RandomAccessFile RAxFl;
 String aEnc = ...;  // "UTF8", "ISO-8859-1", "" or whatever encoding your texts are written in
// __
 try{
  FileInputStream FIS = new FileInputStream(IFl);
  InputStreamReader ISRdr = new InputStreamReader(FIS, aEnc);
  RAxFl = new RandomAccessFile(ISRdr);
// . . .
  RAxFl.seek(lThere);
  iChrsRd = RAxFl.read(cChrAr, 0, iChrBfrSz); // reading iChrsRd into cChrAr provided iChrBfrSz can fully take them
// . . .
  RAxFl.close();
 }catch(FileNotFoundException FlNtFX){ FlNtFX.printStackTrace(); }
   catch(IOException IOX){ IOX.printStackTrace(); }
// __
 finally{
  if(RAxFl != null){ try{ RAxFl.close(); }catch(IOException IOXcptn){ ; }}
 }
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
~
Dealing with Java's I/O and internationalization is not exactly easy
~
lbrtchx
Roedy Green - 12 Mar 2008 08:27 GMT
> What kind of carpentry do you do with I/O objects to achieve such a
>thing?
>~

see http://mindprod.com/jgloss/nio.html
You can look at your file as if it were a giant array of char
--

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
lbrtchx@gmail.com - 12 Mar 2008 11:28 GMT
OK,
~
I am trying to do something like this.
~
. . .
  FileInputStream FIS = new FileInputStream(new File(aIFl));
  FileChannel FlChnl = FIS.getChannel();
  long lFlChnlL = FlChnl.size();
  MappedByteBuffer MptBytBfr = FlChnl.map(FileChannel.MapMode.READ_ONLY, 0, lFlChnlL);
// __
  String aChrSet = "ISO-8859-1";  // UTF8 or whatever
  Charset ChrSt = Charset.forName(aChrSet);
  CharsetDecoder ChrStDkdr = ChrSt.newDecoder();
  CharBuffer ChrBfr = ChrStDkdr.decode(MptBytBfr);
  char[] cArFl = ChrBfr.array();
. . .
~
and I know the offsets in the files where certain sequences appear
and the length of the sequences, so then I go:
~
. . .
  for(int i = 0; (i < iSeqL); ++i){ aB.append(cArFl[((int)lFfst + i)]); }
  aS = aB.toString();
. . .
~
to grab the actual sequence. However it does not seem to be working.
Can you, please, point me to a full example out there?
~
I like that you can set a CharsetDecoder to the actual file since I
may be using non "ISO-8859-1" text files, but I still wonder about:
~
1. what the speed gains really are
~
2. if "offsets" somehow change based on the CharsetDecoder
~
3. how safe the use of memory maps + CharsetDecoder is while reading
file-based data feeds
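
On question 2: a byte offset into the file and an index into the decoded char array only coincide for single-byte encodings such as ISO-8859-1. With UTF-8 or any other multi-byte encoding the two drift apart after the first multi-byte character. A minimal sketch that sidesteps the problem by mapping and decoding only the wanted region, assuming the known offset and length are in bytes; `MappedSlice` and `decodeRegion` are hypothetical names.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedSlice {
    /**
     * Maps byteLength bytes starting at byteOffset and decodes them.
     * Throws CharacterCodingException (an IOException) if the region
     * does not start and end on character boundaries.
     */
    static String decodeRegion(Path file, long byteOffset, int byteLength,
                               Charset charset) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer region =
                channel.map(FileChannel.MapMode.READ_ONLY, byteOffset, byteLength);
            // A fresh decoder reports malformed input rather than replacing it.
            return charset.newDecoder().decode(region).toString();
        }
    }
}
```

Using toString() on the decoded CharBuffer also avoids relying on array(), which exposes the whole backing array rather than just the decoded characters.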
~
I find Java's I/O a bit confusing and I also think that the
RandomAccessFile class should take care of the inner plumbing needed
to offer something like what I need, namely
RandomAccessFile.read(char[] buffer, int start, int end)
~
Since it isn't there, I suspect there may be some rather unsafe issues
behind its absence.
~
Thanks
lbtchx
Knute Johnson - 12 Mar 2008 23:32 GMT
>  OK,
> ~
[quoted text clipped - 46 lines]
>  Thanks
>  lbtchx

So are these files written with a Java program and do they use the Java
16 bit unicode characters?  Do you actually need random access or are
you just going to read parts out of the file once and go on?  If the
characters are encoded in full unicode how would you know where to look
for them?  If the files are encoded in ASCII or UTF-8 you should be able
to use a BufferedReader and select an appropriate character set.  Just
skip over the bytes you don't want to read.
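
A minimal sketch of the skip-then-read approach, assuming sequential access is enough and that the number of bytes to skip ends on a character boundary. `SkipThenRead` and `readAfterSkipping` are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

final class SkipThenRead {
    /** Skips bytesToSkip bytes, then decodes up to charCount chars. */
    static String readAfterSkipping(String path, long bytesToSkip,
                                    int charCount, Charset charset) throws IOException {
        try (InputStream in = new FileInputStream(path)) {
            // skip() may skip fewer bytes than asked, so loop until done.
            long remaining = bytesToSkip;
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped <= 0) {
                    break;
                }
                remaining -= skipped;
            }
            // Skip on the byte stream first, then attach the decoding reader.
            BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset));
            char[] buf = new char[charCount];
            int total = 0;
            while (total < charCount) {
                int n = reader.read(buf, total, charCount - total);
                if (n < 0) {
                    break; // end of file
                }
                total += n;
            }
            return new String(buf, 0, total);
        }
    }
}
```

Skipping happens before the reader is created because Reader.skip counts chars, not bytes.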

Signature

Knute Johnson
email s/nospam/linux/
