Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / General / July 2006

Tip: Looking for answers? Try searching our database.

what's better way to store a million keys in mem?

Thread view: 
John_Woo - 17 Jul 2006 15:11 GMT
Hi,

A application, needs to check whether a user already logined, by
looking at static var in memory.

I don't have good idea. what's doing is, use Hashmap <not sure
Hashtable is better interms of inserting/removing/finding> to store ID
<string> -- key and status <character> -- value.

question:
what the better way to implement this goal, it should support up to 1
million of keys, in terms of high inserting/removing/finding, and
happened more frequently.

--
Thanks
John
Toronto
dsjoblom@abo.fi - 17 Jul 2006 16:09 GMT
> Hi,
>
[quoted text clipped - 9 lines]
> million of keys, in terms of high inserting/removing/finding, and
> happened more frequently.

If you *must* keep track of users by keeping user data in memory, this
is probably one of the quickest solutions. But it will eat up a lot of
memory.

If we assume an overhead of 8 bytes per object and 4 byte references, a
HashMap will use /at least/ 28 bytes per entry and that does not
include any data you put in. So if we assume your keys and values are
say 50 bytes on average, a HashMap will use /at least/ around 80
megabytes for 1 million entries. And that can probably be safely
rounded up to about 100 MB. If you run a 64-bit machine, you are even
worse off. While this may not sound too much, all of this memory is
memory that the main part of your application can't use.

HashMaps (by design) are also notorious for thrashing the CPU cache,
because of bad locality of data, and this is made even more painful in
java, because everything in the map is *also* a reference which
augments the problem.

I'd give a real database a try for this. There is of course a bit of
overhead, but you use a lot less memory and I guess it will also be
faster. Of course, you should benchmark and compare any solutions
suggested.

Regards,
Daniel Sjöblom
Mark Space - 17 Jul 2006 16:29 GMT
>> Hi,
>>
[quoted text clipped - 13 lines]
> is probably one of the quickest solutions. But it will eat up a lot of
> memory.

And it won't be stored in memory anyway.  Every system now-a-days uses
virtual memory.  Which means that memory that isn't used much is
off-load to disk, and only loaded back when needed.  So your data will
be stored on disk anyway.

A database is a good idea.  Maybe check out file locking as an
alternative.  For example, have a list of users and passwords in a file.
 Maybe call it /etc/passwd or something.  Each time a user logs in,
lock the first byte of his/her entry in the /etc/passwd file.  This may
take more memory and time than a Java hash map but who knows, sometimes
these low level system operations are highly optimized so, on the other
hand, you also might come out ahead.

Off the top of my head, I think the other way to do this is to write a
very efficient program in C, then let Java talk to it over a socket or
something.

Final thoughts: one million users is A LOT.  I mean, REALLY A LOT.
There are only 40,000 or so user ports on a Unix system with TCP/IP.
Have you very carefully investigated the requirements of this system you
are building?  I'm starting to seem more than just "a database" or "use
C."  I think you'll need to build out some kind of load balancing array,
and that's non-trivial.  Even a couple thousand users logged in at any
given time is going to stress out a single system greatly.  Better think
this through carefully if it's important...
Mark Space - 17 Jul 2006 16:35 GMT
>>> Hi,
>>>
[quoted text clipped - 23 lines]
>  Maybe call it /etc/passwd or something.  Each time a user logs in, lock
> the first byte of his/her entry in the /etc/passwd file.  This may take

Oops, John I think I got confuzzled with your previous post.  Ignore the
stuff about locking users. Everything else about one million users still
applies I think, at least in principle.

> Off the top of my head, I think the other way to do this is to write a
> very efficient program in C, then let Java talk to it over a socket or
[quoted text clipped - 8 lines]
> time is going to stress out a single system greatly.  Better think this
> through carefully if it's important...
Vincent Cantin - 17 Jul 2006 20:12 GMT
> I'd give a real database a try for this. There is of course a bit of
> overhead, but you use a lot less memory and I guess it will also be
> faster. Of course, you should benchmark and compare any solutions
> suggested.

I also think that you should use a database. You can use a ORM
(Object-Relation Mapping) like the javax.persistence API of JEE5 or
hibernate directly, for example.
steve - 17 Jul 2006 23:02 GMT
> Hi,
>
[quoted text clipped - 14 lines]
> John
> Toronto

this will not scale with a single CPU, you will need multiple cpu's and
servers , if you tend to have a million users. and  if you use multiple
servers you cannot store the keys in memory, unless you share the memory
between processors.
so therefore you are looking at some sort of database or shared disk storage.
something like oracle , will allow you to have multiple  physical servers &
cpu accessing the same database, as if it is a single entity, you can even
partition the database into areas of say 100,000 so that the loading on  the
diskdrives is partitioned and spread evenly between the different cpu's.

Steve
John_Woo - 18 Jul 2006 01:01 GMT
> > Hi,
> >
[quoted text clipped - 26 lines]
>
> Steve

Thanks Steve, that sounds.
Currently we have multi-cpus, may use multi-server later, but still no
database designed
for holding the login-status.

Can u tell more about how to use shared disk in java, how to test it?

--
John
Thomas Hawtin - 18 Jul 2006 14:11 GMT
> A application, needs to check whether a user already logined, by
> looking at static var in memory.
[quoted text clipped - 7 lines]
> million of keys, in terms of high inserting/removing/finding, and
> happened more frequently.

If you arrange your IDs to be sequential, then you could use an
AtomicIntegerArray. That would have little overhead over the minimum
possible memory requirement. If the IDs are not sequential, the a linear
probing hash map would make sense. Either find an open source
implementation or write your own. A lot easier and faster than using C
or a database.

Tom Hawtin
Signature

Unemployed English Java programmer
http://jroller.com/page/tackline/

Chris Uppal - 18 Jul 2006 14:46 GMT
> If you arrange your IDs to be sequential, then you could use an
> AtomicIntegerArray. That would have little overhead over the minimum
> possible memory requirement. If the IDs are not sequential, the a linear
> probing hash map would make sense. Either find an open source
> implementation or write your own. A lot easier and faster than using C
> or a database.

I think the latter would be better even if the IDs are sequential -- the whole
million are unlikely to be logged in at once.

   -- chris
Thomas Hawtin - 18 Jul 2006 15:13 GMT
>> If you arrange your IDs to be sequential, then you could use an
>> AtomicIntegerArray. That would have little overhead over the minimum
[quoted text clipped - 5 lines]
> I think the latter would be better even if the IDs are sequential -- the whole
> million are unlikely to be logged in at once.

If it's only one status byte per user, then we are only talking 1 MB -
tiny. Using AtomicIntegerArray in that way is relatively easy (so long
as you know what you are doing) and removes problems related to locking.

Tom Hawtin
Signature

Unemployed English Java programmer
http://jroller.com/page/tackline/

Chris Uppal - 19 Jul 2006 09:16 GMT
[me:]
> > I think the latter would be better even if the IDs are sequential --
> > the whole million are unlikely to be logged in at once.
>
> If it's only one status byte per user, then we are only talking 1 MB -
> tiny.

Or even less if you use 1-bit per user.  That way you can run the whole system
on the CTO's mobile phone ;-)

   -- chris


Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.