kronos wrote:
"Neil Padgen" translated...
[Note: OP was in Italian]
>> I would like to explain problems I have encountered, not regarding
>> programming errors but possible optimisations of my program. I have
[quoted text clipped - 18 lines]
>> the database are taking such a long time that the table
>> is filled in exactly 5 hours.
There can be many reasons for the time it takes...
Are you sure it's the database?
Besides the reads from and writes to the db, there are also the "reading"
from WordNet and the tokenizing.
One suggestion is to time a smaller sample for each step, to be sure the db
*really* is the bottleneck.
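
A minimal sketch of that kind of per-step timing, in case it helps. The class name StepTimer and the step labels are made up for illustration; nothing here comes from the original program. It only adds up wall-clock time per named step, so run it on a small sample of notes and compare the totals.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper: accumulates wall-clock time per named step
// so the slowest step stands out.
class StepTimer {
    private final Map<String, Long> nanosByStep = new LinkedHashMap<>();

    // Runs the work, adds its elapsed time to the step's total,
    // and hands back whatever the work returned.
    <T> T time(String step, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanosByStep.merge(step, System.nanoTime() - start, Long::sum);
        }
    }

    // Total nanoseconds per step, in the order the steps were first seen.
    Map<String, Long> totals() {
        return nanosByStep;
    }

    // Prints one line per step with its total in milliseconds.
    void report() {
        nanosByStep.forEach((step, nanos) ->
            System.out.printf("%-10s %8.1f ms%n", step, nanos / 1e6));
    }
}
```

Wrapping the WordNet read, the tokenizing, the select and the insert/update each in a call to time(...) and calling report() at the end should show fairly quickly where the five hours go.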
Have you tried creating the database with a larger file size than the
default? 5MB seems to be the default for One$DB. If it needs more space,
the "growing" can be an expensive factor.
>> - I'm taking a note, tokenizing it eliminating the stopwords,
The algorithm for tokenizing can be one bottleneck...
>> - for each word doing a select on the table
>> to see if it has already been inserted;
As this is done for *each* word, there are plenty of selects...
Is the column "word" indexed in some way?
If it isn't, you probably should index it, as an unindexed column would be
a huge bottleneck in the search for a specific record...
>> if not I insert it with frequency one, if yes I
>> update the frequency.
How do you make this update? By the primary key you got in the previous
select, or by using a "WHERE"-clause?
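
For what it's worth, here is one way the select could be dropped altogether. This is not the poster's code and not necessarily what he does now; it is a swapped-in technique: issue the UPDATE with a WHERE on the word first, and insert only when JDBC reports that no row was updated. The table word_freq and the columns word and frequency are hypothetical names, and the sketch assumes the word column is indexed, otherwise the UPDATE is as slow as the select it replaces.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical sketch: table and column names are assumptions.
class WordCounter implements AutoCloseable {
    static final String UPDATE_SQL =
        "UPDATE word_freq SET frequency = frequency + 1 WHERE word = ?";
    static final String INSERT_SQL =
        "INSERT INTO word_freq (word, frequency) VALUES (?, 1)";

    // Prepared once and reused for every word.
    private final PreparedStatement update;
    private final PreparedStatement insert;

    WordCounter(Connection con) throws SQLException {
        update = con.prepareStatement(UPDATE_SQL);
        insert = con.prepareStatement(INSERT_SQL);
    }

    // Bumps the frequency of a known word, or inserts a new word with
    // frequency 1. Returns true when a new row was inserted.
    boolean record(String word) throws SQLException {
        update.setString(1, word);
        if (update.executeUpdate() > 0) {
            return false; // the row existed and was updated
        }
        insert.setString(1, word);
        insert.executeUpdate();
        return true;
    }

    @Override
    public void close() throws SQLException {
        try {
            update.close();
        } finally {
            insert.close();
        }
    }
}
```

That is one statement for a word already in the table and two for a new one, instead of always two. It is only safe as written with a single writer; with concurrent loaders two of them could both try the insert.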
>> If you can tell me how to help me in this thing I will
>> send you the program code and the Wordnet interface
>> which I am using, along with the database schema.
No, I don't want the code, interface or schema.
I'm more into open discussion than doing consulting work...
// Bjorn A
On Tue, 08 Nov 2005 17:10:12 +0000, Neil Padgen
<neil.padgen@mon.bbc.co.uk> wrote, quoted or indirectly quoted someone
who said :
>> The problem consists of the fact that the operations on the database are
>> taking such a long time that the table is filled in exactly 5 hours.
>>
>> This is not acceptable, but I don't know what else to do to improve this
>> time.
Couple of thoughts.
Database loads can often be drastically improved by presorting the
data.
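
A rough sketch of what presorting could look like for this word list, assuming the tokens for a batch of notes fit in memory. The class and method names are invented. Sorting puts equal words next to each other, so each run can be collapsed into a single (word, count) pair and the table is touched once per distinct word, in key order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original program.
class Presort {
    // Sorts a copy of the tokens, then collapses each run of equal
    // words into one entry holding the word and how often it occurred.
    static Map<String, Integer> sortedCounts(List<String> tokens) {
        List<String> sorted = new ArrayList<>(tokens);
        Collections.sort(sorted);
        // LinkedHashMap keeps insertion order, which here is sorted order.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : sorted) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }
}
```

Whether the sorted order itself buys anything depends on how the database lays out its index, which I can't vouch for with One$DB. The collapsing of duplicates should help regardless.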
You are using a database called "one dollar". You get what you pay
for. You might see if another low-priced one gives better performance.
See http://mindprod.com/jgloss/sqlvendors.html
During database load it may be possible to turn off various forms of
transaction rollback or transaction replay for extra speed.
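
How to switch off rollback or replay logging is vendor-specific, and I don't know the One$DB setting for it. The nearest portable thing in plain JDBC is to turn off autocommit and send the rows in batches, so there is one commit for the whole load rather than one per row. A sketch under those assumptions, with a hypothetical table word_freq(word, frequency):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

// Hypothetical sketch: table and column names are assumptions.
class BatchLoader {
    static final String INSERT_SQL =
        "INSERT INTO word_freq (word, frequency) VALUES (?, ?)";

    // Inserts all entries in one transaction, flushing every batchSize rows.
    static void load(Connection con, Map<String, Integer> counts, int batchSize)
            throws SQLException {
        boolean oldAutoCommit = con.getAutoCommit();
        con.setAutoCommit(false); // no commit after every single row
        try (PreparedStatement ps = con.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                ps.setString(1, e.getKey());
                ps.setInt(2, e.getValue());
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // flush the last partial batch
            }
            con.commit();
        } catch (SQLException ex) {
            con.rollback(); // leave nothing half-loaded
            throw ex;
        } finally {
            con.setAutoCommit(oldAutoCommit);
        }
    }
}
```

Not every driver turns a batch into fewer round trips, so it is worth timing this against the row-at-a-time version before relying on it.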
You may also find doubling your RAM will help.

Signature
Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.
Roedy Green - 09 Nov 2005 06:32 GMT
On Wed, 09 Nov 2005 03:52:59 GMT, Roedy Green
<my_email_is_posted_on_my_website@munged.invalid> wrote, quoted or
indirectly quoted someone who said :
>Database loads can often be drastically improved by presorting the
>data.
It seems most likely the database is the bottleneck. You can easily
check that with a dummy version of your code that bypasses the SQL
calls, to see how long the remaining steps take, or by dumping the
massaged data to a file and then loading it with whatever bulk load
utility there is. You may also find that bulk load is considerably
faster than feeding records one at a time.
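
The dump-to-file half of that could look roughly like this. The tab-separated layout and the class name are assumptions; whatever bulk load utility the database ships with will dictate the real format, and words containing tabs or newlines would need escaping.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical helper: the file layout is an assumption, not a
// format required by any particular bulk loader.
class DumpFile {
    // Writes one "word<TAB>frequency" line per entry and returns
    // the number of lines written.
    static int write(Path target, Map<String, Integer> counts) throws IOException {
        int lines = 0;
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                out.write(e.getKey());
                out.write('\t');
                out.write(Integer.toString(e.getValue()));
                out.newLine();
                lines++;
            }
        }
        return lines;
    }
}
```

Timing the run that only produces this file also answers the bottleneck question: if it finishes in minutes, the remaining hours belong to the database.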
