Java Forum / First Aid / January 2007

reading a large text file that is part of a very tight loop

thebad1 - 05 Jan 2007 02:14 GMT
Hi,

I have an algorithm which loops over 100 million sets of 3
values in a tight loop, e.g.:

while(1){ //do forever until terminated
BufferedReader in = new BufferedReader(
                            new FileReader(
                                    "ratings.txt"));
    for(i=0;i<numSets;i++){
        int value1 = Integer.parseInt(in.readLine());
        int value2 = Integer.parseInt(in.readLine());
        int value3 = Integer.parseInt(in.readLine());

        //do some stuff with the values here....
    }
}

I was hoping that someone might be able to suggest whether that can be
optimised in terms of speed.
I can format the values in the file in any way. I have tried
fixed-width fields, but the file gets significantly bigger due to the
padding. I tried loading the file into memory, but I don't have the
RAM (the file is 1.6GB like that).

Thanks,

T.
Paul Hamaker - 05 Jan 2007 08:49 GMT
Try DataOut/InputStream's writeInt and readInt instead.
http://java.sun.com/docs/books/tutorial/essential/io/datastreams.html
--
http://javalessons.com  Paul Hamaker, SEMM
Teaching ICT since 1987
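
A minimal sketch of this suggestion, assuming each value fits in a 4-byte int and that Java 7+ try-with-resources is available. The class name BinaryRatings, its method names, and the temporary file are hypothetical and only serve the illustration. Each set takes 12 bytes on disk, so 100 million sets come to about 1.2 GB, and reading them back needs no text parsing.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class BinaryRatings {
    // Write each set as three 4-byte big-endian ints (12 bytes per set).
    static void write(File file, int[][] sets) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            for (int[] s : sets) {
                out.writeInt(s[0]);
                out.writeInt(s[1]);
                out.writeInt(s[2]);
            }
        }
    }

    // Read the sets back and return the sum of all values
    // (a stand-in for "do some stuff with the values here").
    static long sum(File file) throws IOException {
        long total = 0;
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            while (true) {
                int value1;
                try {
                    value1 = in.readInt();
                } catch (EOFException end) {
                    break; // no more sets in the file
                }
                int value2 = in.readInt();
                int value3 = in.readInt();
                total += (long) value1 + value2 + value3;
            }
        }
        return total;
    }

    // Hypothetical helper: write to a temporary file, then read it back.
    static long roundTrip(int[][] sets) {
        try {
            File file = File.createTempFile("ratings", ".bin");
            file.deleteOnExit();
            write(file, sets);
            return sum(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new int[][] {{1, 2, 3}, {4, 5, 6}})); // prints 21
    }
}
```

The buffered wrappers matter here: without them every readInt would go to the underlying file stream for 4 bytes at a time.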
Ian Shef - 05 Jan 2007 18:51 GMT
"thebad1" <thomas.hodder@gmail.com> wrote in news:1167963279.223039.234850@
42g2000cwt.googlegroups.com:

> Hi,
>
> I have an algorithm which loops over 100 million sets of 3
> values in a tight loop, e.g.:
>
> while(1){ //do forever until terminated

OK, what is your real code?  This is not Java, it won't compile.

I want to help, but if this is wrong, then I cannot trust the rest of what
you posted.

Signature

Ian Shef     805/F6      *    These are my personal opinions    
Raytheon Company         *    and not those of my employer.
PO Box 11337             *
Tucson, AZ 85734-1337    *

thebad1 - 08 Jan 2007 10:19 GMT
> > while(1){ //do forever until terminated
>
> OK, what is your real code?  This is not Java, it won't compile.
>
> I want to help, but if this is wrong, then I cannot trust the rest of what
> you posted.

OK, while this isn't the exact code, it's the relevant section....

    static final int numRatings = 100456599;
    static final int numEpochs = 100000;

    // Map the whole file read-only; the OS pages it in on demand.
    FileChannel fc = new FileInputStream(filepath).getChannel();
    MappedByteBuffer mappedBuff = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());

    for (int epoch = 0; epoch < numEpochs; epoch++) {
        mappedBuff.rewind(); // start each sweep at the beginning of the file

        for (int k = 0; k < numRatings; k++) {
            cUser = mappedBuff.getInt();            // 4 bytes
            cMovie = (int) mappedBuff.getShort();   // 2 bytes
            cRating = (int) mappedBuff.get();       // 1 byte

            //doStuff;
        }
    }
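
For reference, a self-contained sketch of the same sweep that can be run on a small test file. The class name MappedSweep, the helper demo(), the temporary file, and the "sum the ratings" placeholder are assumptions made for illustration; the 7-byte record layout (int, short, byte) follows the loop above. Note that a single FileChannel.map call cannot map more than Integer.MAX_VALUE bytes, which is enough for roughly 100 million 7-byte records.

```java
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedSweep {
    static final int RECORD_BYTES = 4 + 2 + 1; // int, short, byte

    // Sweep the file numEpochs times; return the sum of all ratings read.
    static long sweep(File file, int numEpochs) throws IOException {
        try (FileInputStream in = new FileInputStream(file);
             FileChannel fc = in.getChannel()) {
            MappedByteBuffer buf = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            int numRatings = buf.capacity() / RECORD_BYTES;
            long ratingSum = 0;
            for (int epoch = 0; epoch < numEpochs; epoch++) {
                buf.rewind();
                for (int k = 0; k < numRatings; k++) {
                    int user = buf.getInt();    // read to advance past the field
                    int movie = buf.getShort(); // read to advance past the field
                    int rating = buf.get();
                    ratingSum += rating;        // placeholder for the real work
                }
            }
            return ratingSum;
        }
    }

    // Hypothetical helper: write three records to a temp file, sweep it twice.
    static long demo() {
        int[][] records = {{1488844, 1, 3}, {822109, 1, 5}, {885013, 1, 4}};
        try {
            File file = File.createTempFile("ratings", ".bin");
            file.deleteOnExit();
            try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
                for (int[] r : records) {
                    out.writeInt(r[0]);   // big-endian, matching ByteBuffer's default order
                    out.writeShort(r[1]);
                    out.writeByte(r[2]);
                }
            }
            return sweep(file, 2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 24
    }
}
```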
thebad1 - 08 Jan 2007 10:38 GMT
>     FileChannel fc = new FileInputStream (filepath).getChannel();
>     MappedByteBuffer mappedBuff = fc.map(FileChannel.MapMode.READ_ONLY, 0,
> fc.size());

Where filepath is a binary file packed with over 100,000,000 sets of
(int, short, byte) concatenated together, like so:
$ od -x -N 30 ratings.bin
0000000 1600 ccb7 0100 0003 8b0c 005d 0501 0d00
0000020 1581 0100 0004 7800 009e 0401 0c00

I'm now down to about 13 seconds for a complete loop over that file, I
was wondering whether there were any further improvements to such a
sweep that can be done in Java.
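
As a check on the layout, the dump above can be decoded by hand. od -x prints 16-bit words in host byte order, so assuming the dump was taken on a little-endian machine, the two bytes of each word appear swapped relative to their order in the file. The sketch below undoes that swap and reads 7-byte big-endian records; the class name DecodeDump and its methods are hypothetical.

```java
import java.nio.ByteBuffer;

public class DecodeDump {
    // The 16-bit words exactly as printed by od -x above.
    static final int[] WORDS = {
        0x1600, 0xccb7, 0x0100, 0x0003, 0x8b0c, 0x005d, 0x0501, 0x0d00,
        0x1581, 0x0100, 0x0004, 0x7800, 0x009e, 0x0401, 0x0c00
    };

    // Recover file order: the low byte of each printed word comes first.
    static byte[] fileBytes() {
        byte[] bytes = new byte[WORDS.length * 2];
        for (int i = 0; i < WORDS.length; i++) {
            bytes[2 * i] = (byte) (WORDS[i] & 0xff);
            bytes[2 * i + 1] = (byte) (WORDS[i] >>> 8);
        }
        return bytes;
    }

    // Decode every complete (int, short, byte) record as "int,short,byte".
    static String decode() {
        ByteBuffer buf = ByteBuffer.wrap(fileBytes()); // big-endian by default
        StringBuilder sb = new StringBuilder();
        while (buf.remaining() >= 7) {
            int first = buf.getInt();
            int second = buf.getShort();
            int third = buf.get();
            sb.append(first).append(',').append(second).append(',').append(third).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(decode());
        // prints:
        // 1488844,1,3
        // 822109,1,5
        // 885013,1,4
        // 30878,1,4
    }
}
```

The 30 dumped bytes hold four complete records plus the first two bytes of a fifth, which the loop leaves unread.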
Ian Shef - 08 Jan 2007 20:05 GMT
"thebad1" <thomas.hodder@gmail.com> wrote in news:1168251545.671290.310410@
51g2000cwl.googlegroups.com:

> OK, while this isn't the exact code, it's the relevant section....
>
[quoted text clipped - 17 lines]
>           }
>      }

Quite different from what was originally posted!  Perhaps you have been
experimenting and learning along the way.

At this point I will shut up on this topic, as I have no experience with
FileChannel and MappedByteBuffer.

Anyone with experience have suggestions for speeding this up?

thebad1 - 08 Jan 2007 21:53 GMT
> Quite different from what was originally posted!  Perhaps you have been
> experimenting and learning along the way.

I've changed the inner loop to avoid checking the loop variable every time:

try {
    for (;;) {
        cUser = mappedBuff.getInt();
        cMovie = mappedBuff.getShort();
        cRating = mappedBuff.get();
    }
} catch (BufferUnderflowException ex) {
    // Running off the end of the buffer is the normal loop exit.
//    System.out.println("");
}
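
A small sketch for comparing the two ways of ending the sweep, assuming 7-byte records as above. The class LoopTermination and its methods are hypothetical. It only shows that both loops visit the same number of records, including when the buffer ends in a partial record; it makes no claim about which is faster, which would have to be measured.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

public class LoopTermination {
    static final int RECORD_BYTES = 7; // int + short + byte

    // Loop forever and let the buffer signal the end with an exception.
    static int countWithException(ByteBuffer buf) {
        buf.rewind();
        int records = 0;
        try {
            for (;;) {
                buf.getInt();
                buf.getShort();
                buf.get();
                records++; // counted only once the whole record was read
            }
        } catch (BufferUnderflowException end) {
            // normal exit: fewer bytes remained than the next read needed
        }
        return records;
    }

    // Compute the record count once, then run a plain counted loop.
    static int countWithBound(ByteBuffer buf) {
        buf.rewind();
        int numRecords = buf.remaining() / RECORD_BYTES;
        for (int k = 0; k < numRecords; k++) {
            buf.getInt();
            buf.getShort();
            buf.get();
        }
        return numRecords;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(30); // 4 full records + 2 spare bytes
        System.out.println(countWithException(buf)); // prints 4
        System.out.println(countWithBound(buf));     // prints 4
    }
}
```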
Ian Shef - 09 Jan 2007 19:18 GMT
"thebad1" <thomas.hodder@gmail.com> wrote in news:1168293201.398727.188740
@v33g2000cwv.googlegroups.com:

>> Quite different from what was originally posted!  Perhaps you have been
>> experimenting and learning along the way.
[quoted text clipped - 9 lines]
> //     System.out.println("");
> }

Did this provide a measurable speedup ?  How much?


