> joined tables (select * from a,b where a.ib=b.id). A full
> table scan on our hardware takes 20-30 minutes.
[quoted text clipped - 31 lines]
> Thanks
> Mike
Hi Joe
> Hi. I'd have to say that you should probably spend whatever resources
> you have to redesign this. It's not right for a servlet to select millions
> of rows of data out of the DBMS. Operate on raw data where it is, in the
> DBMS.
Point taken, though. The database will contain a list of astronomical
objects. Allowing the users to mine for rare objects is handled fairly
easily in SQL but users will still want to extract relatively large
subsets of the data to do their own complex analysis which would be
very difficult/impossible to do on the DBMS side. Allowing users to
extract millions of rows but restrict themselves to a subset of the
hundreds of parameters available for each object reduces the data size
significantly, from TB to
a few hundred MB.
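
To make the column-subset idea concrete, here is a minimal sketch of how a servlet might turn a user's parameter selection into a projected query. The table name `objects` and the column names are hypothetical placeholders, not the real catalogue schema, and the whitelist check is just one assumed way to keep user input out of the SQL text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class ProjectionQuery {
    // Hypothetical whitelist; the real catalogue has hundreds of parameters.
    static final Set<String> ALLOWED = Set.of("id", "ra_deg", "dec_deg", "mag_g", "mag_r");

    // Hypothetical table name, fixed on the server side.
    static final String TABLE = "objects";

    /** Builds "SELECT c1, c2 FROM objects" from whitelisted column names only. */
    static String build(List<String> requested) {
        List<String> cols = new ArrayList<>();
        for (String c : requested) {
            // Reject anything that is not a known column, so user input
            // never reaches the SQL text unchecked.
            if (!ALLOWED.contains(c)) {
                throw new IllegalArgumentException("unknown column: " + c);
            }
            cols.add(c);
        }
        if (cols.isEmpty()) {
            throw new IllegalArgumentException("no columns requested");
        }
        return "SELECT " + String.join(", ", cols) + " FROM " + TABLE;
    }
}
```

For example, `ProjectionQuery.build(List.of("id", "ra_deg"))` yields `SELECT id, ra_deg FROM objects`, so only the requested parameters ever leave the DBMS.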
Regards
Mike
Joe Weinstein - 12 May 2005 00:04 GMT
> Hi Joe
>
[quoted text clipped - 15 lines]
> subsets of the data to do their own complex analysis which would be
> very difficult/impossible to do on the DBMS side.
You may be correct, but maybe not. If you find (or become) a SQL
guru who can do pivots, generate SQL from macros, and stuff like
that, you might be astonished at the power of SQL.
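
As one possible reading of "pivots" and "generated SQL", the sketch below generates a conditional-aggregation pivot query from a list of values. Everything schema-related here is an assumption made for illustration: a hypothetical `measurements` table with `object_id`, `band`, and `mag` columns, pivoted so that each band becomes its own output column.

```java
import java.util.List;
import java.util.StringJoiner;

final class PivotSql {
    /**
     * Generates a pivot over the hypothetical measurements(object_id, band, mag)
     * table: one row per object, one column per requested band.
     */
    static String build(List<String> bands) {
        StringJoiner cols = new StringJoiner(", ");
        cols.add("object_id");
        for (String b : bands) {
            // Band names are spliced into the SQL text, so allow only
            // plain identifier characters.
            if (!b.matches("[A-Za-z0-9_]+")) {
                throw new IllegalArgumentException("bad band name: " + b);
            }
            cols.add("MAX(CASE WHEN band = '" + b + "' THEN mag END) AS mag_" + b);
        }
        return "SELECT " + cols + " FROM measurements GROUP BY object_id";
    }
}
```

The aggregation then runs inside the DBMS, and the servlet only sees the already-pivoted rows.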
> Allowing users to
> extract millions of rows but restrict themselves to a subset of the
[quoted text clipped - 4 lines]
> Regards
> Mike
Good luck then. Yes, it is good to pare down what users want. For bulk
downloads I might investigate the DBMS's facilities for dumping to a file
format rather than a direct servlet-to-DBMS transfer. In fact I wonder if the
raw data is non-volatile enough that the majority of it can reside in
simple OS files for rapid transfer, and have them updated periodically
from the DBMS?
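
A rough sketch of the periodically refreshed flat-file idea, assuming the rows have already been produced by some export step (shown here as a plain in-memory list rather than any particular DBMS dump tool). The class name, the tab-separated layout, and the write-then-rename approach are illustrative choices, not something prescribed above.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

final class SnapshotWriter {
    /**
     * Writes rows as tab-separated lines to a temporary file in the target's
     * directory, then moves it over the target, so downloads should not see a
     * half-written snapshot. Assumes values contain no tabs or newlines.
     */
    static void publish(Path target, List<String[]> rows) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "snapshot", ".tmp");
        try (BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
            for (String[] row : rows) {
                out.write(String.join("\t", row));
                out.write('\n');
            }
        }
        // Whether an atomic move replaces an existing target is
        // implementation-specific; on typical POSIX file systems it does.
        Files.move(tmp, target,
                StandardCopyOption.REPLACE_EXISTING,
                StandardCopyOption.ATOMIC_MOVE);
    }
}
```

A scheduled job could call `publish` after each refresh, leaving the servlet to stream the finished file without touching the DBMS for bulk downloads.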
Joe Weinstein at BEA