Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / Databases / November 2005

Tip: Looking for answers? Try searching our database.

Ultra fast db access & searching?

Thread view: 
Aerodyne - 24 Nov 2005 15:22 GMT
Hi all,

Could some one help me out with this Q?

If I designed a db (MySQL or PostgreSQL) & it contained more than 5+
million users with all unique user names... how can I speed up db
access ...etc to search a new users name if it has already been taken?

All hints & tips welcome (plus good db books on this matter & some)...

The reason I ask is to know how sites like hotmail, gmail, ebay ...etc
all do it so quickly!  But I'd prefer to use either MySQL or
PostgreSQL.

TIA
IchBin - 24 Nov 2005 16:50 GMT
> Hi all,
>
[quoted text clipped - 11 lines]
>
> TIA

Out side of a user name as the primary key or just indexed don't now.
You should look at the 100 percent java database called HSQLDB. I moved
away from MySQL to this DBMS.

Performance wise it runs rings around MySQL, PostgreSQL and most all of
the other dbms's.. Its has a small footprint. You can run it as
embedded, standalone or server. It was just ranked third as a enterprise
databases after Oracle and DB2. All that info is at http://hsqldb.org/

If you want something even faster they have a commercial version called
HXSQL. They have all of the performance information there at
http://www.hxsql.com/

Signature

Thanks in Advance...
IchBin, Pocono Lake, Pa, USA
http://weconsultants.servebeer.com/JHackerAppManager
__________________________________________________________________________

'If there is one, Knowledge is the "Fountain of Youth"'
-William E. Taylor,  Regular Guy (1952-)

Aerodyne - 25 Nov 2005 12:43 GMT
Thanks IchBin .... but I'm sure even with HSQLDB searching through
millions of unique user names can be optimize in some way or another.
What are the best solutions for this task or do most just leave the
database do it with out a good design.

Hotmail has 150+ million users ... yet they get back to you very
quickly as if the user name is already in use & several suggestions!

I'm just looking for good database design if any or tricks for fast
searching... ie instead of putting the complete city name in a field
(LosAngeles) you just use LA, that way you only have to search for two
characters rather than all 10 for equality or like wise for all other
cities ... just a 3 letter acronym.
Dyreatnews@sun.com - 25 Nov 2005 13:28 GMT
> Thanks IchBin .... but I'm sure even with HSQLDB searching through
> millions of unique user names can be optimize in some way or another.
[quoted text clipped - 9 lines]
> characters rather than all 10 for equality or like wise for all other
> cities ... just a 3 letter acronym.

You are on to something here. I believe it is called hashing :)

Signature

dt

However, experience shows that for many people and many applications a
dose of paranoia is reasonable - Bjarne Stroustrup

IchBin - 25 Nov 2005 15:48 GMT
>> Thanks IchBin .... but I'm sure even with HSQLDB searching through
>> millions of unique user names can be optimize in some way or another.
[quoted text clipped - 11 lines]
>
> You are on to something here. I believe it is called hashing :)

Yes it is called hashing but the Database does all of this thru keys and
indexes under the covers for. Defining these relationship's using PK, FK
and indexes correctly makes queries run so fast.

IMHO - It still depends on the relationship of those keys and indexes to
be defined correctly and the knowledge of correct SQL statement
definition to optimized the use of the fore mentioned keys an indexes.
Also to normalize you database the best you can.

Looking for a state can be optimized by defining it as a unique index or
primary key. Defined as a FK when if referenced by other tables. You do
not want to do any hashing. This is why you have a DBMS. You may want to
google for "Database normalization techniques" also.

Here are some good links for review..

http://en.wikipedia.org/wiki/Hash_table
http://www.datamodel.org/index.html
http://en.wikipedia.org/wiki/Database_normalization
http://www.datamodel.org/NormalizationRules.html

Signature

Thanks in Advance...
IchBin, Pocono Lake, Pa, USA
http://weconsultants.servebeer.com/JHackerAppManager
__________________________________________________________________________

'If there is one, Knowledge is the "Fountain of Youth"'
-William E. Taylor,  Regular Guy (1952-)

IchBin - 25 Nov 2005 15:56 GMT
>>> Thanks IchBin .... but I'm sure even with HSQLDB searching through
>>> millions of unique user names can be optimize in some way or another.
[quoted text clipped - 32 lines]
> http://en.wikipedia.org/wiki/Database_normalization
> http://www.datamodel.org/NormalizationRules.html

Forgot to saying something about lookup tables or maintenance tables.

Good Example would be a State table.

http://www.datamodel.org/DataModelLookup.html
http://www.datamodel.org/DataModelKeys.html
Signature


Thanks in Advance...
IchBin, Pocono Lake, Pa, USA
http://weconsultants.servebeer.com/JHackerAppManager
__________________________________________________________________________

'If there is one, Knowledge is the "Fountain of Youth"'
-William E. Taylor,  Regular Guy (1952-)

Bill Karwin - 25 Nov 2005 18:58 GMT
> Hi all,
>
[quoted text clipped - 9 lines]
> all do it so quickly!  But I'd prefer to use either MySQL or
> PostgreSQL.

I'd put an index on the username field and increase the size of the
MySQL key cache.  The key cache is a specific block of memory that is
used to store indexes.  If this is large enough, MySQL may be encouraged
to store the entire index in memory, so it won't need to access the disk
to use that index.

See also http://dev.mysql.com/doc/refman/5.0/en/myisam-key-cache.html,
and especially about index cache preloading:
http://dev.mysql.com/doc/refman/5.0/en/index-preloading.html

Anyway, that should put the index into system memory, and subsequent
lookups (B-tree lookups on a data structure cached in memory) will be
about as fast as you could hope for, given the technology choice.

I'm not sure if PostgreSQL has a similar tuning option.  I'll leave that
research to you. :)

Regards,
Bill K.


Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.