
Java Forum / General / November 2005


Help!

kronos - 08 Nov 2005 15:09 GMT
[Translated from the Italian original]
I would like to describe the problems I have run into. They are not
programming errors but possible optimisations of the program that I am not
aware of.
I am writing a program that performs automated linguistic enrichment of an
ontology using a conceptualised linguistic resource (taxonomic and with
glosses), which in my particular case is WordNet.
The programs I use are:
-Eclipse SDK (Java platform)
-WordNet 2.0
-Protege 3.1
-OneDollarDB (database)
My main problem at the moment is building a database table with two
columns: word, frequency.
Here word is the column containing all the words that make up the WordNet
glosses, and frequency is how often each word appears in them.
The problem is that the database operations take so much time that the
table is built in exactly 5 hours.
This is not acceptable, but I don't know what else to do to improve this
time.
If you tell me you can help with this, I will send you the program code and
the WordNet interface I use, along with the schema of the database table.
I can also explain what I do in more detail, but in short: I take a gloss
and tokenize it, removing the punctuation; for each word I do a select on
the table to see whether it has already been inserted; if not, I insert it
with frequency one; if so, I do an update on the frequency.
Neil Padgen - 08 Nov 2005 17:10 GMT
<posted & mailed>

Hi kronos!

People normally write in English in comp.lang.java.programmer.  I'll try to
translate what you wrote.

Non-Italian speakers: here is my attempted, slightly paraphrased translation
of what kronos <energia05@libero.it> posted.

> Subject: Help!
>
[quoted text clipped - 30 lines]
> been inserted; if not I insert it with frequency one, if yes I update the
> frequency.    

-- Neil
Bjorn Abelli - 08 Nov 2005 17:47 GMT
kronos wrote:
"Neil Padgen" translated...

[Note: OP was in Italian]

>> I would like to explain problems I have encountered, not regarding
>> programming errors but possible optimisations of my program.  I have
[quoted text clipped - 18 lines]
>> the database are taking such a long time that the table
>> is filled in exactly 5 hours.

There can be many reasons for the time it takes...

Are you sure it's the database?

There is also the "reading" from WordNet and the tokenizing, besides the
reads from and writes to the db.

One suggestion is to time a smaller sample for each step, to be sure the db
*really* is the bottleneck.
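
A minimal sketch of that kind of per-step timing, for what it's worth. The
class name StepTimer and the step labels are made up for illustration and
are not part of the OP's program; the idea is just to wrap each phase
(WordNet read, tokenizing, db calls) and compare the totals on a small
sample.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Accumulates elapsed time per named step so the slowest phase stands out. */
class StepTimer {
    // Total nanoseconds per step, kept in first-seen order.
    private final Map<String, Long> totals = new LinkedHashMap<>();

    /** Runs the task and adds its elapsed nanoseconds to the named step. */
    void time(String step, Runnable task) {
        long start = System.nanoTime();
        try {
            task.run();
        } finally {
            totals.merge(step, System.nanoTime() - start, Long::sum);
        }
    }

    /** Total nanoseconds recorded for a step, or 0 if it was never timed. */
    long totalNanos(String step) {
        return totals.getOrDefault(step, 0L);
    }

    /** One "step: N ms" line per step. */
    String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : totals.entrySet()) {
            sb.append(e.getKey()).append(": ")
              .append(e.getValue() / 1_000_000).append(" ms\n");
        }
        return sb.toString();
    }
}
```

Wrapping each phase as timer.time("db", () -> ...) for a few hundred
glosses and printing timer.report() should show which phase dominates.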

Have you tried creating the database with a larger file size than the
default? (5 MB seems to be the default for One$DB; if it needs more space,
the "growing" can be an expensive factor.)

>> - I'm taking a note, tokenizing it eliminating the stopwords,

The algorithm for tokenizing can be one bottleneck...

>> - for each word doing a select on the table
>>   to see if it has already been inserted;

As this is done for *each* word, there are plenty of selects...

Is the column "word" indexed in some way?

If it isn't, it probably should be, as an unindexed column would be a huge
bottleneck in the search for a specific record...

>> if not I insert it with frequency one, if yes I
>> update the frequency.

How do you make this update? By the primary key you got in the previous
select, or by using a "WHERE"-clause?
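
A different approach from the select-then-update loop, sketched here only as
a possibility: count everything in memory first, so the db is touched once
per distinct word instead of once per token. The class name
GlossWordCounter, the lower-casing and the exact tokenizing rule are my
assumptions, not what the OP's code does.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Counts word frequencies in memory, with no database access at all. */
class GlossWordCounter {
    private final Map<String, Integer> frequencies = new HashMap<>();

    /**
     * Splits a gloss on any run of characters that are not letters, digits
     * or apostrophes (this drops the punctuation) and counts each word.
     * Lower-casing is an assumption; remove it if case matters.
     */
    void addGloss(String gloss) {
        String[] tokens = gloss.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}']+");
        for (String token : tokens) {
            if (!token.isEmpty()) {
                frequencies.merge(token, 1, Integer::sum);
            }
        }
    }

    /** The word -> frequency map built so far. */
    Map<String, Integer> frequencies() {
        return frequencies;
    }
}
```

Once all glosses have been fed through addGloss, each entry of the map needs
exactly one insert and no select or update.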

>> If you can tell me how to help me in this thing I will
>> send you the program code and the Wordnet interface
>> which I am using, along with the database schema.

No, I don't want the code, interface or schema.

I'm more into an open discussion than doing consulting work...

// Bjorn A
Roedy Green - 09 Nov 2005 03:52 GMT
On Tue, 08 Nov 2005 17:10:12 +0000, Neil Padgen
<neil.padgen@mon.bbc.co.uk> wrote, quoted or indirectly quoted someone
who said :

>> The problem consists of the fact that the operations on the database are
>> taking such a long time that the table is filled in exactly 5 hours.
>>
>> This is not acceptable, but I don't know what else to do to improve this
>> time.

Couple of thoughts.  

Database loads can often be drastically improved by presorting the
data.

You are using a database called "one dollar". You get what you pay
for. You might see if another low price one gives better performance.
See http://mindprod.com/jgloss/sqlvendors.html

During database load it may be possible to turn off various forms of
transaction rollback or transaction replay for extra speed.
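
Switching off logging or replay is specific to each database, so no sketch
for that. A related, portable trick is to turn off JDBC auto-commit and
send the rows in batches inside one transaction. The sketch below uses only
standard java.sql calls; the table name word_frequency, its column names and
the class name FrequencyTableLoader are hypothetical, and whether the
OneDollarDB driver gains anything from batching is untested.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

/** Loads precomputed word counts in batches inside a single transaction. */
class FrequencyTableLoader {
    // Hypothetical table and column names; adjust to the real schema.
    private static final String INSERT_SQL =
            "INSERT INTO word_frequency (word, frequency) VALUES (?, ?)";

    static void load(Connection conn, Map<String, Integer> frequencies, int batchSize)
            throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // one commit at the end instead of one per row
        try (PreparedStatement insert = conn.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (Map.Entry<String, Integer> entry : frequencies.entrySet()) {
                insert.setString(1, entry.getKey());
                insert.setInt(2, entry.getValue());
                insert.addBatch();
                if (++pending == batchSize) {
                    insert.executeBatch(); // send a full batch to the driver
                    pending = 0;
                }
            }
            if (pending > 0) {
                insert.executeBatch(); // send the final partial batch
            }
            conn.commit();
        } catch (SQLException ex) {
            conn.rollback(); // leave the table as it was before the load
            throw ex;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```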

You may also find doubling your RAM will help.

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Roedy Green - 09 Nov 2005 06:32 GMT
On Wed, 09 Nov 2005 03:52:59 GMT, Roedy Green
<my_email_is_posted_on_my_website@munged.invalid> wrote, quoted or
indirectly quoted someone who said :

>Database loads can often be drastically improved by presorting the
>data.

It seems most likely the database is the bottleneck. You can easily
check that with a dummy version of your code that bypasses the SQL
calls, to see how long everything else takes, or by dumping the
massaged data to a file and then using whatever bulk load utility there
is.  You may also find that bulk load is considerably faster than
feeding records one at a time.
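
A rough sketch of the "dump to a file" step, which also presorts the data by
word. The tab-separated layout and the class name FrequencyDump are
assumptions; the format a particular bulk load utility expects has to be
checked against that utility.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Map;
import java.util.TreeMap;

/** Writes word counts as presorted text lines for a bulk load utility. */
class FrequencyDump {
    /** Writes one "word<TAB>frequency" line per entry, sorted by word. */
    static void write(Map<String, Integer> frequencies, Writer out) throws IOException {
        // Copying into a TreeMap sorts the entries by word.
        Map<String, Integer> sorted = new TreeMap<>(frequencies);
        for (Map.Entry<String, Integer> entry : sorted.entrySet()) {
            out.write(entry.getKey());
            out.write('\t');
            out.write(Integer.toString(entry.getValue()));
            out.write('\n');
        }
        out.flush();
    }
}
```

Passing a java.io.FileWriter as the Writer gives a file to feed to the bulk
loader; the same map can come from any in-memory counting step.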



