Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / General / November 2005

Tip: Looking for answers? Try searching our database.

record size tracking

Thread view: 
MileHighCelt - 22 Nov 2005 13:28 GMT
I have a CSV file with 100 columns and an unknown number of rows that I
am going to parse and insert into a database.  I can't just fill up an
ArrayList with potentially 65K objects because performance just tanks.
At every 1000 records, I thought I would just shot that ArrayList over
to my DAO to do the insert of those row objects.

However, I am trying to think of a better way to handle that break than
comparing a counter to "1000"  like this

if (aa ==1000) {
// call DAO
aa =0;
arraylist.clear;
}

because that will obviously fail near the end if there are not exactly
1000 records left to insert.  Does anyone have any suggestions on how
to do this? I know this should be pretty basic (didn't we learn this in
school), but for the life of me I can't recall how to do it.
Rhino - 22 Nov 2005 14:20 GMT
>I have a CSV file with 100 columns and an unknown number of rows that I
> am going to parse and insert into a database.  I can't just fill up an
[quoted text clipped - 15 lines]
> to do this? I know this should be pretty basic (didn't we learn this in
> school), but for the life of me I can't recall how to do it.

If I had to do something similar in my own code, my logic would have a loop
to read all of the records in the CSV until you hit end of file. As you
retrieve each record, you store it in the array and increment the counter.
If the counter hits 1000, you call DAO, verify that the inserts worked, then
reset the counter back to zero. Once you're out of the loop, you call DAO
one last time, passing it the partially-filled array, and insert those
records. That should ensure that _all_  of the records get stored.

Rhino
Roedy Green - 22 Nov 2005 14:25 GMT
>I have a CSV file with 100 columns and an unknown number of rows that I
>am going to parse and insert into a database.

see http://mindprod.com/jgloss/csv.htm

You can use a CSV reader to read a row at a pop.  You pretty well have
to feed them a row at a time to the database don't you?  That way you
don't need that many objects at once.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Roedy Green - 22 Nov 2005 14:27 GMT
>because that will obviously fail near the end if there are not exactly
>1000 records left to insert.  Does anyone have any suggestions on how
>to do this?

that's a common problem, fixed sized chunks with one possibly empty
short one at the end.  I have longed for slick pattern to avoid
repeating any code, but in the end I have always just handled the last
chunk specially.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Robert Klemme - 22 Nov 2005 15:41 GMT
> I have a CSV file with 100 columns and an unknown number of rows that
> I am going to parse and insert into a database.  I can't just fill up
[quoted text clipped - 15 lines]
> to do this? I know this should be pretty basic (didn't we learn this
> in school), but for the life of me I can't recall how to do it.

I'd use List.size() instead of an extra counter.  This is error prone.

private static final int COUNT = 1000;

List records = new ArrayList(COUNT);

while ( more to read ) {
 records.add( read record );

 if ( records.size() == COUNT ) {
   send data(records);
   records.clear();
 }
}

if ( !records.isEmpty() ) {
 send data(records);
}

Kind regards

   robert
HalcyonWild - 22 Nov 2005 19:20 GMT
> I have a CSV file with 100 columns and an unknown number of rows that I
> am going to parse and insert into a database.  I can't just fill up an
[quoted text clipped - 15 lines]
> to do this? I know this should be pretty basic (didn't we learn this in
> school), but for the life of me I can't recall how to do it.

Instead of adding to an arrayList, is it possible to read one record
off the CSV, insert it, and then go to the 2nd row on the CSV. Keep a
list of all failed inserts instead, in the array list.
MileHighCelt - 22 Nov 2005 20:17 GMT
Thank you everyone for ideas I hadn't considered (was really thinking
counter mod something) and it gives me lots of things to think about.
I really like the idea of inserting a record as its parsed, except I
have to wonder about that performance.  I have database pooling
working, so that wouldn't be an issue, but I wonder about other hits.
I suspect that each file will have about 80,000 records so what kind of
issues should I look out for when I get an existing connection, prepare
a statement and update a table?
Jimi Hullegård - 23 Nov 2005 00:17 GMT
> Thank you everyone for ideas I hadn't considered (was really thinking
> counter mod something) and it gives me lots of things to think about.
[quoted text clipped - 4 lines]
> issues should I look out for when I get an existing connection, prepare
> a statement and update a table?

What database are you using? If all of the data in this file should be
inserted into the database, then maybe the database itself has some
functionallity to insert the data from the csv-file? Or some third party
program (search with google)....
I suspect this would be much faster then parsing the file in a javaprogram,
and inserting it into the database row by row.
Even if you want to manipulate the data this approach could still be valid.
Just insert it into a temporary table, then use sql-functions to manipulate
the data, and then move it to the real target table.

/Jimi
Roedy Green - 23 Nov 2005 01:05 GMT
On Wed, 23 Nov 2005 01:17:39 +0100, "Jimi Hullegård"
<ask_for_real_email@nothotmail.com> wrote, quoted or indirectly quoted
someone who said :

>What database are you using? If all of the data in this file should be
>inserted into the database, then maybe the database itself has some
>functionallity to insert the data from the csv-file?

Sybase has a load format like CSV, but with ' instead of ". It is very
quick.  In database load mode it does not worry about transactions.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

MileHighCelt - 23 Nov 2005 04:32 GMT
I am using mySQL 5.1 and a CSVparser from Ostemiller.  Since there is
some manipulation of data (scrubbing mostly) and the uploads only occur
every 3 months or so, I think I will continue on this path.  Besides, I
have it mostly coded now ;)


Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.