Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / Databases / April 2008

Tip: Looking for answers? Try searching our database.

large queries

Thread view: 
Terry - 22 Apr 2008 19:39 GMT
Hi,

Can anyone make a recommendation on how
to design a Java system that will make SQL
queries on a mySQL database that results in very
large data sets?  What can be done to improve
performance?  Is hybernate better than using
straight JDBC?

Thanks,

Terry
Martin Gregorie - 22 Apr 2008 22:13 GMT
> Can anyone make a recommendation on how to design a Java system that
> will make SQL queries on a mySQL database that results in very large
> data sets?  What can be done to improve performance?  Is hybernate
> better than using straight JDBC?

Please define "very large".

My application stores e-mail messages in a database. Its standard search
query uses a join over tables storing an e-mail body, correspondents and a
subject line. There's a many:many relationship between correspondents and
messages with the intersection table saying whether the correspondent is
sender or recipient of each message. Various 'where' clauses that limit
the search scope (by correspondent, title and date range) are added
depending on the search terms the user provides. All prime keys and search
terms are indexed.

- specifying no search terms gets all possible rows in the result set.
 That retrieved 83,000 rows in 14 secs.

- specifying a correspondent's name with a "like '%david%" clause
 returned 7,000 rows in 2 seconds.

I'm using Postgres 7.3, its matching JDBC driver and Java 6.05. The
machine isn't big: 512 MB RAM, 866 MHz and running Fedora 8.

The result set is used to populate the model underlying a JTable. The
query that populates the table only returns three displayable columns of
v/l text (To/from NAME, subject, date), i.e. enough to form a displayable
row in the table, plus a soft dbk. The much larger underlying structures
are only retrieved when selected from the table to display the full
message: it takes around a second to get all the data from a join on the
row containing the dbk, populate a JFrame and display it. This join is
over the four tables and typically accesses 6-10 rows in total.

I'm not claimimg this idea (store a minimal list and allow selecting from
it to pull back full details of the selected item) is the best solution
but it suits my application, hasn't caused memory problems so far and
gives acceptable performance.

Signature

martin@   | Martin Gregorie
gregorie. |
org       | Zappa fan & glider pilot

Lew - 23 Apr 2008 02:48 GMT
> I'm using Postgres 7.3, its matching JDBC driver and Java 6.05. The
> machine isn't big: 512 MB RAM, 866 MHz and running Fedora 8.

BTW, that's a rather old version of PG.  Apparently there have been
significant improvements since then.  On the Postgres boards they don't even
regard version 7.4 as sufficiently current.

Also, PG is fully capable of using much more horsepower if you make it
available.  I routinely read about RAID arrays and multi-GB database servers
for PG.  If one is truly doing large queries, they might find 512 MB a little
skinny for the need.

One more advice: if data integrity is worth something to you, either turn of
write-back caching, or use a battery-backed RAID controller.  If you suffer a
crash before things are fully written to the drive from the cache it could
seriously fubar your data.

Signature

Lew

EricF - 23 Apr 2008 03:01 GMT
>Hi,
>
[quoted text clipped - 8 lines]
>
>Terry

The best way to improve performance is to have small result sets. Do you
really need large results? Can you partition your queries?

Try to minimize things going across the network. I don't like stored
procedures, but you may need to consider them.

I don't think Hinernate will help you much for this.

Assuming large is really large, and you are talking about the query result,
not the size of the database.

Eric
Thomas Kellerer - 23 Apr 2008 07:21 GMT
Terry, 22.04.2008 20:39:

> Hi,
>
[quoted text clipped - 4 lines]
> performance?  Is hybernate better than using
> straight JDBC?

I find hibernate is lot more complicated to be used in such a situation.
Straight JDBC gives you a lot more control over what is beeing fetched.

The question is: what do you want to do with the data set?
The strategies for displaying large data sets to the end user are completely different than processing large data sets in the backend.

Check the driver documentation, maybe the driver is first caching the complete result set in memory before delivering it to the caller.

Thomas
Arne Vajhøj - 24 Apr 2008 02:50 GMT
> Check the driver documentation, maybe the driver is first caching the
> complete result set in memory before delivering it to the caller.

At least older MySQL drivers does just that. I have not looked at
the newer.

Arne


Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.