Java Forum / General / November 2005

Question about hashing

david@david.com - 10 Nov 2005 08:49 GMT
Dear All,

   I have a text file that contains 10,000,000 records. I would like to use
hashing to look up the desired data instead of doing a linear search. Could
you please give me some hints on how to implement that?

Best regards,
David
Benji - 10 Nov 2005 09:01 GMT
> Dear All,

>     I have a text file that contains 10,000,000 records. I would like to use
> hashing to look up the desired data instead of doing a linear search. Could
> you please give me some hints on how to implement that?

You're going to have to be more specific about your requirements.  What are
the "data"?  What are your size and speed requirements?

If you have 10 million of anything that's even moderately large, you won't be
able to store it all in memory, and you'll have to index the data if you
want searching to be fast.  (Whether that's possible depends on what kind of
data it is, so be more specific.)

Signature

Of making better designs there is no end,
 and much refactoring wearies the body.

david@david.com - 10 Nov 2005 09:50 GMT
I'm glad to receive your reply. Actually, I have a frequency table, which
stores words and their frequencies. I tried reading these values into a
hashtable and then searching for the word. However, it took a long time to
finish. Thanks.

Frequency Table:
Word          Frequency
Java            45
.NET           11

The size of this table is only 5 MB.

Best regards,
David

"Benji" <bdg@cc.gatech.edu> wrote in message
news:dkv28j$j1v$1@news-int.gatech.edu...
> > Dear All,
[quoted text clipped - 11 lines]
> want searching to be fast.  (but that's only possible depending on what
> type of stuff the data is, so be more specific)
Ingo R. Homann - 10 Nov 2005 10:33 GMT
Hi,

> I'm glad to receive your reply. Actually, I have a frequency table, which
> stores words and their frequencies. I tried reading these values into a
[quoted text clipped - 7 lines]
>
> The size of this table is only 5 MB.

That shouldn't be a problem, I think. Can you post a code snippet showing
how you read the file and how you store the data in a HashMap?

What takes so long? Reading the file or looking up an entry? What do
you mean by "long"? 10 ms? 1 sec? 10 sec?

Ciao,
Ingo
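
Ingo's question (is the time spent loading or looking up?) can be answered by timing the two phases separately. The following is only a minimal sketch: the class name LoadVsLookupTiming, the helper buildTable and the synthetic keys "word0", "word1", ... are made up for illustration and stand in for reading the real frequency file.

```java
import java.util.HashMap;
import java.util.Map;

class LoadVsLookupTiming {
    // Builds a synthetic word -> frequency table (assumption: this stands in
    // for parsing the real file, which the thread never shows).
    static Map<String, Integer> buildTable(int entries) {
        Map<String, Integer> table = new HashMap<>(entries * 2);
        for (int i = 0; i < entries; i++) {
            table.put("word" + i, i);
        }
        return table;
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        Map<String, Integer> table = buildTable(1_000_000);
        long t1 = System.nanoTime();
        Integer hit = table.get("word123456");   // a single hashed lookup
        long t2 = System.nanoTime();

        // Report the two phases separately so it is clear which one is slow.
        System.out.printf("build: %d ms, lookup: %d us, value: %d%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000, hit);
    }
}
```

If the build phase dominates, the file reading or parsing is the likely place to look; a single HashMap lookup should normally be far cheaper than loading the table.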
david@david.com - 10 Nov 2005 12:47 GMT
Hi,

   I simply use StreamReader to read the file line by line with readline().
Then I put the word as the key and the frequency as the value into
the hashtable. After updating the frequencies in the hashtable, the whole
table is written back to the frequency table file.

Thanks.
"Ingo R. Homann" <ihomann_spam@web.de> wrote in message
news:437321fe$0$21950$9b4e6d93@newsread2.arcor-online.net...
> Hi,
>
[quoted text clipped - 18 lines]
> Ciao,
> Ingo
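
The read, update and write-back cycle David describes could look roughly like the sketch below. No code was posted in the thread, so everything here is an assumption: the class name FrequencyTable, the whitespace-separated "word frequency" line format (taken from the sample table earlier in the thread), UTF-8 encoding, and the use of BufferedReader rather than the StreamReader David mentions.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

class FrequencyTable {
    private final Map<String, Integer> counts = new HashMap<>();

    // Reads "word <whitespace> frequency" lines. Blank lines and lines whose
    // second column is not a number (such as a header row) are skipped.
    static FrequencyTable load(Path file) throws IOException {
        FrequencyTable table = new FrequencyTable();
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] parts = line.trim().split("\\s+");
                if (parts.length != 2) continue;
                try {
                    table.counts.put(parts[0], Integer.parseInt(parts[1]));
                } catch (NumberFormatException e) {
                    // not a data row, ignore it
                }
            }
        }
        return table;
    }

    // Returns the stored frequency, or 0 for an unknown word.
    int get(String word) {
        return counts.getOrDefault(word, 0);
    }

    // Adds 1 to the word's frequency, inserting it if absent.
    void increment(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    // Writes the whole table back, one tab-separated entry per line.
    void save(Path file) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                out.write(e.getKey() + "\t" + e.getValue());
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]);      // path to the frequency file
        FrequencyTable table = load(file);
        table.increment("Java");
        table.save(file);
        System.out.println("Java -> " + table.get("Java"));
    }
}
```

Note that the sketch loads the file once, does all updates in memory, and saves once at the end. Rewriting the whole file after every single update would be one plausible cause of the slowness, but the thread does not confirm that this is what happens.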
Chris Uppal - 10 Nov 2005 13:47 GMT
>     I simply use StreamReader to read the file line by line with readline().
> Then I put the word as the key and the frequency as the value into
> the hashtable. After updating the frequencies in the hashtable, the whole
> table is written back to the frequency table file.

You'll have to post some code.  And say why you think it's too slow.

(Yes, that /is/ just repeating the questions that Ingo has already asked, but
which you didn't answer.)

   -- chris
Roedy Green - 10 Nov 2005 11:08 GMT
>    I have a text file that contains 10,000,000 records. I would like to use
>hashing to look up the desired data instead of doing a linear search. Could
>you please give me some hints on how to implement that?

You would have to break each line into words.  You would do that with
a regex split. See http://mindprod.com/jgloss/regex.html

Then you would add each word to a HashMap, with the word as the key and
the line number or offset as the value.

see http://mindprod.com/jgloss/hashmap.html
http://mindprod.com/jgloss/hashtable.html

The default hashCode for String will do fine.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.
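
Roedy's suggestion (regex split, then a HashMap from word to line number) might be sketched as follows. This is an illustration only: the class name WordIndex, the whitespace pattern, and the choice to keep a list of line numbers per word are assumptions, not something stated in the thread. It relies on the default String hashCode, as Roedy recommends.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class WordIndex {
    // Assumption: words are separated by runs of whitespace.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Maps each word to the 1-based numbers of the lines it appears on.
    static Map<String, List<Integer>> build(List<String> lines) {
        Map<String, List<Integer>> index = new HashMap<>();
        int lineNumber = 0;
        for (String line : lines) {
            lineNumber++;
            for (String word : WHITESPACE.split(line.trim())) {
                if (word.isEmpty()) continue;   // blank line
                List<Integer> hits = index.computeIfAbsent(word, k -> new ArrayList<>());
                // Record each line only once, even if the word repeats on it.
                if (hits.isEmpty() || hits.get(hits.size() - 1) != lineNumber) {
                    hits.add(lineNumber);
                }
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Java is fun", "so is .NET", "Java again");
        Map<String, List<Integer>> index = build(lines);
        System.out.println(index.get("Java"));   // prints [1, 3]
    }
}
```

Building the index is a single pass over the file; after that, each lookup is one hashed access instead of a linear scan.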


