Java Forum / Databases / October 2005
Opinions on application design?
Davey - 02 Oct 2005 17:38 GMT
I am planning on developing an application which will involve skills that I have very little experience of, so I would appreciate comments on my initial design thoughts.
Overview on system:
I'm not going to divulge the exact aims of the application, but its purpose is to allow multiple client applications to retrieve data from a database (on a db server) and feed this data into another Windows application using its C++ API (already provided to me). There could eventually be thousands of client apps connecting to this server, and the database could have millions of rows. Each row of data retrieved would typically have around 5-10 fields of information, and rows would only be retrieved one at a time, i.e. each transmission between server and client would be only one row of data.
Rough design plans:
I plan on having the following major components:
- The source database
- The server-side application
- The client application, which will run on Windows and will use an API to control another Windows application on the same machine
- The database-driven web application
I am planning on building the source database in MySQL due to the obvious licensing benefits. This side of things is easy for me as I specialise in databases (although primarily MS SQL Server).
The server-side application will be written in Java and will use JDBC to connect to the database. I am planning on building a cache into this app so that database reads are relatively infrequent in comparison to data requests from the client apps.
The client-side application will potentially run on thousands of client PCs. It will request data from the server-side app, which it will then feed on to the other Windows application using the API. I am planning on writing this in C++ as the API is in C++.
The web application will connect to the database and will allow users to manage the data. I will probably write this in PHP.
Issues/questions:
- What is the best method to allow the C++ client app to speak to the Java server application (e.g. request, send and receive data) - what about SOAP?
- Will a Java server-side app cope efficiently with for example 10,000 simultaneous client connections with each requesting data roughly once every 10 seconds?
- In these plans I have given the web application a direct connection to the source database. Is this a serious security risk and if it is what steps should I take to minimise this risk?
- Is it easy to encrypt the data transmission between the C++ client app and the Java server app?
- Should I consider any different approaches to the ones mentioned in my rough design plans?
Please bear in mind that I have almost no experience of writing client-server applications, so any comments, advice, tips or pointers to relevant reading material would be greatly appreciated.
TIA.
Mladen Adamovic - 02 Oct 2005 19:16 GMT
> - Will a Java server-side app cope efficiently with for example 10,000 simultaneous client connections with each requesting data roughly once every 10 seconds?
The Tomcat web server (Java) is definitely very fast, so it is a good choice.
> - In these plans I have given the web application a direct connection to the source database. Is this a serious security risk and if it is what steps should I take to minimise this risk?
If you build your web application properly, you won't have a security risk. In any case, regular backups are always a good idea.
> - Is it easy to encrypt the data transmission between the C++ client app and the Java server app?
Yes, you can use HTTPS with the Tomcat web server.
> Please bear in mind that I have almost no experience of writing client-server applications so any comments, advice, tips or pointers to relevant reading material would be greatly appreciated.
RTFM.
Wibble - 02 Oct 2005 20:02 GMT
> I am planning on developing an application which will involve skills that I have very little experience of - therefore I would appreciate comments on my
[quoted text clipped - 62 lines]
>
> TIA.
TIA, it sounds like you've bitten off more than you can chew. Your volume of connections, both to your app and to the database, would stress most technologies. How did such a large architecture task get assigned to somebody without experience?
Henry Townsend - 02 Oct 2005 20:22 GMT
> How did such a large architecture task get assigned to somebody without experience?
It's not always a matter of _assignment_. Maybe he's an entrepreneur inventing something on his own. I've been in that position and you do have to stretch yourself. You can't hire someone to do the things you aren't expert at until you at least have a proof of concept ... so you learn fast.
Davey - 02 Oct 2005 21:49 GMT
>> How did such a large architecture task get assigned to somebody without experience?
[quoted text clipped - 4 lines]
> aren't expert at till you at least have a proof of concept ... so you learn fast.
Exactly.
Davey - 02 Oct 2005 21:49 GMT
> How did such a large architecture task get assigned to somebody without experience?
Because I am the one who assigned the task to myself.
Roedy Green - 03 Oct 2005 04:55 GMT
>The client-side application will potentially run on thousands of client PCs. It will request data from the server-side app which it will then feed on to the other Windows application using the API. I am planning on writing this in C++ as the API is in C++
One area of concern is this Java-C++ boundary.
There are several ways you could implement it:
1. Write the app in Java, with a JNI library wrapping the C++ calls that manipulate the other application.
2. Have your C++ app talk to Java over a socket.
3. Use some sort of circular-buffer disk file shared between Java and C++.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Again taking new Java programming contracts.
Roedy Green - 03 Oct 2005 05:38 GMT
>- What is the best method to allow the C++ client app to speak to the Java server application (e.g. request, send and receive data) - what about SOAP?
Your choices are wide: see http://mindprod.com/jgloss/remotefileaccess.html
They are even wider if you have Java on both ends. Which way is best depends on many things.
There are three most likely approaches:
1. HTTP GET. The response can be text or binary, uncompressed or compressed. This is the garden-variety way to solve it. I would not suggest adding extra layers like XML, SOAP etc., since your app is so simple that all you would be doing is adding overhead. You don't need the flexibility.
2. Raw sockets. This assumes a user has an extended session with many requests one after the other. It also assumes you don't have all that many users potentially connected at once.
3. Datagrams. This presumes responses are short. You must handle lost packets yourself. Particularly useful if there is peer-to-peer communication.
>- Will a Java server-side app cope efficiently with for example 10,000 simultaneous client connections with each requesting data roughly once every 10 seconds?
At one request every 10 seconds for 10,000 users, you want to pump 1000 transactions per second through your system. Let's say that each transaction requires 200 bytes of payload plus another 200 bytes of overhead. That means you want to emit 400,000 bytes per second. That is 3.2 million bits per second. A T2 data connection is rated at about 6 million bits per second, so you would probably want a pair of them to leave headroom for peaks. Last time I looked I would have had to mortgage my house to get such a connection. I presume they are cheaper now.
Let's say, for example, that the database lived on disk and you had enough RAM so that the index was 100% cached, or you were doing lookup by relative record number on a flat file. You could thus do a lookup with a single disk access.
Let's presume your hard disk gets 10 ms average seek/read time. You need to do 1000 accesses per second. In other words you have 10 seconds' worth of accesses to do every second. Oops! You then need 10 SCSI disks with overlapped seek/read, or some massive caching so that 90% of transactions can be handled with ZERO disk accesses.
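The back-of-envelope arithmetic above can be checked with a few lines of Java. This is only a sketch using the figures quoted in the post (10,000 clients, one request per 10 seconds, 400 bytes per transaction, 10 ms per disk access); the class and method names are made up for illustration, and the bandwidth figure assumes 8 bits per byte with no extra allowance for framing.

```java
// Back-of-envelope capacity estimate using the figures quoted in the post above.
final class CapacityEstimate {
    // Requests per second when each client polls once every intervalSeconds.
    static double requestsPerSecond(int clients, double intervalSeconds) {
        return clients / intervalSeconds;
    }

    // Outbound bandwidth in bits per second (8 bits per byte, no framing allowance).
    static double bitsPerSecond(double requestsPerSecond, int bytesPerRequest) {
        return requestsPerSecond * bytesPerRequest * 8;
    }

    // Seconds of disk time needed per wall-clock second, assuming every cache
    // miss costs exactly one disk access of seekMs milliseconds.
    static double diskSecondsPerSecond(double requestsPerSecond, double seekMs,
                                       double cacheHitRatio) {
        return requestsPerSecond * (1.0 - cacheHitRatio) * seekMs / 1000.0;
    }

    public static void main(String[] args) {
        double rps = requestsPerSecond(10_000, 10.0);
        System.out.println("requests/s          = " + rps);
        System.out.println("bits/s              = " + bitsPerSecond(rps, 400));
        System.out.println("disk load, 0% hits  = " + diskSecondsPerSecond(rps, 10.0, 0.0));
        System.out.println("disk load, 90% hits = " + diskSecondsPerSecond(rps, 10.0, 0.9));
    }
}
```

A disk load above 1.0 means a single disk cannot keep up. With no caching the load is about 10, which is where the ten disks come from; at a 90% hit ratio it drops to roughly 1, i.e. one fully busy disk.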
You are talking airline-reservation-type volumes. Somebody more experienced should be in charge here, or at least signing off on whatever plan you come up with. You don't want to spend those kinds of bucks based on the say-so of some person on the net with no accountability for your project.
The Java server does bugger all. You might not even bother with a full-blown womb, and just write a little special-purpose server that just handles this one app. Even with a womb, the server just makes the query and enqueues the result for going out the wire. You don't even need to wrap the data in an HTML page.
>- In these plans I have given the web application a direct connection to the source database. Is this a serious security risk and if it is what steps should I take to minimise this risk?
This is a big no-no from a security point of view. You would only do it on an intranet. Further, you would only do it if your applet were very generic and could conceivably make any query and display any data. Other than that, you cook up a CGI interface to make your query and never expose JDBC to the outside. If you tried SSL or other encryption you would cut the speed in half and double your transmission bill. If you give your client SQL access, then have a direct C++ to SQL connection, with no Java server involved.
>- Is it easy to encrypt the data transmission between the C++ client app and the Java server app?
No. It is much easier if both ends are C++ or both ends are Java. Otherwise you have to find implementations identical in absolutely every detail. Also consider that encryption is expensive in terms of CPU time and transmission time. It bulks up the stream. There are several flavours of security that you could be concerned with:
1) preventing someone from getting in and hacking your database, e.g. doing a DROP TABLE on it.
2) stopping someone from pretending to be a legit client and sending you bum data just to screw you up. To stop this, look at logins and digital signing, which is not as expensive as full-blown encryption.
3) preventing someone tapping your phone lines or snooping on packets from extracting data from your packets. This requires some sort of encryption, or at least scrambling so that it is not obvious how to read the data. This would be mandatory, for example, if your packets contained credit card numbers.
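For point 2, one inexpensive way to check that a request comes from a legitimate client is a keyed message authentication code. Note that this is a swapped-in technique: an HMAC with a shared secret is not a true digital signature (both sides hold the same key), but it serves the same purpose here at low CPU cost. The sketch below uses the standard javax.crypto.Mac API; the class name, the key, and the message layout are assumptions for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: authenticates a request with HMAC-SHA256 and a shared secret.
final class RequestSigner {
    // Computes the 32-byte HMAC-SHA256 tag of the message under the given key.
    static byte[] sign(byte[] key, String message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key invalid", e);
        }
    }

    // Recomputes the tag on the server side and compares it with the one received.
    static boolean verify(byte[] key, String message, byte[] tag) {
        return MessageDigest.isEqual(sign(key, message), tag);
    }
}
```

The client would send the message together with its tag, and the server would call verify before touching the database. This authenticates the sender but does not hide the data; that still needs encryption as described in point 3.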
>- Should I consider any different approaches to the ones mentioned in my >rough design plans? If you are trying to do this on the cheap, have a look at how BitTorrent works. You might do something similar to distribute the entire database over your users and have them hand it off to each other.
see http://mindprod.com/jgloss/bittorrent.html
Another approach, if the database is slowly changing, is to maintain a local copy of it with each client. Your job then is just to keep them abreast of changes.
Have a look at the Replicator:
see http://mindprod.com/webstarts/replicator.html
Davey - 03 Oct 2005 18:16 GMT
>>- What is the best method to allow the C++ client app to speak to the Java server application (e.g. request, send and receive data) - what about
[quoted text clipped - 12 lines]
> simple that all you would be doing is adding overhead. You don't need the flexibility.
Interesting - and new to me. Just to confirm - this would work with a standard Windows C++ application connecting to a Java server, sending and receiving textual data?
If you can point me in the direction of any reading material on this I would be grateful.
> 2. raw sockets. This assumes a user has an extended session with many requests one after the other. It also assumes you don't have all that many users potentially connected at once.
This was what I initially assumed I would use. I'm not entirely sure whether you would consider this application as having "extended sessions" - e.g. a client would connect to the server at a varying frequency... sometimes once every 10 seconds, then sometimes not for another 10 minutes during typical usage. Overall though they would use it for a typical working day, i.e. 8 hours.
Is this an "extended session"?
>>- Will a Java server-side app cope efficiently with for example 10,000 simultaneous client connections with each requesting data roughly once
[quoted text clipped - 26 lines]
> bucks based on the say so of some person on the net with no accountability for your project.
OK, my solution to this problem is to reduce the number of users or build more servers. The 10,000 was just a figure I hoped I would fit onto one server...
:)
> The Java server does bugger all. You might not even bother with a full blown womb, and just write a little special purpose server that just handles this one app. Even with a womb, the server just makes the query and enqueues the result for going out the wire. You don't even need to wrap the data in an HTML page.
Sorry... what do you mean by "womb"?
>>- In these plans I have given the web application a direct connection to the
[quoted text clipped - 8 lines]
> bill. If you give your client SQL access, then have a direct C++ to SQL connection, no Java server involved.
By "web application" I just mean a PHP website which connects to my database. Is this how you interpreted it?
>>- Is it easy to encrypt the data transmission between the C++ client app and
[quoted text clipped - 4 lines]
> every detail. Also consider that encryption is expensive in terms of CPU time and transmission time. It bulks up the stream.
Yes, and bandwidth is an issue. I'm not sending highly sensitive data, so I might have to do without encryption.
> There are several flavours of security that you could be concerned with:
>
> 1) preventing someone from getting in and hacking your database, e.g. doing a DROP TABLE on it.
Yes, I'm aware of injection.
> 2) stopping someone from pretending to be a legit client and sending you bum data just to screw you up. To stop this, look at logins, and digital signing, which is not as expensive as full blown encryption.
OK.
> 3) preventing someone tapping your phone lines or snooping on packet from extracting data from your packets. This requires some sort of encryption, or at least scrambling so that it is not obvious how to read the data. This would be mandatory for example if your packets contained credit card numbers
OK.
>>- Should I consider any different approaches to the ones mentioned in my rough design plans?
[quoted text clipped - 5 lines]
>
> see http://mindprod.com/jgloss/bittorrent.html
I would like it to be suitable for something like this but unfortunately it isn't.
> Another approach if the database is slowly changing is to maintain a local copy of it with each client. Your job then is just to keep them
[quoted text clipped - 3 lines]
>
> see http://mindprod.com/webstarts/replicator.html
The data is changing constantly and will potentially involve large volumes (millions of rows).
Roedy Green - 03 Oct 2005 19:17 GMT
>Interesting - and new to me. Just to confirm - this would work with a standard Windows C++ application connecting to a Java server sending and receiving textual data?
HTTP GET and HTTP POST started out with C and CGI. See http://mindprod.com/jgloss/cgi.html
Java adopted the HTTP protocol, and invented the Servlet to make the code faster by avoiding a load for each transaction.
In Java coding HTTP is pretty easy for both ends. See http://mindprod.com/applets/fileio.html for the client side. It will generate sample code to do it both with URLConnection and with raw sockets.
See http://mindprod.com/products1.html#ECHO for a simple server side EchoServer
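As a rough illustration of how little the server side has to do, here is a minimal sketch of a special-purpose HTTP server that answers one kind of GET request with a plain-text row. It uses the JDK's built-in com.sun.net.httpserver.HttpServer (which arrived in Java 6, so later than this thread). The /row path, the id parameter, the pipe-separated row format, and the in-memory ROWS map standing in for the database cache are all assumptions made for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical single-purpose server: GET /row?id=42 returns one row as text.
final class RowServer {
    // In-memory stand-in for rows cached from the database (assumption).
    static final Map<String, String> ROWS = new ConcurrentHashMap<>();

    // Pure lookup: maps a raw query string such as "id=42" to a response body.
    static String respond(String query) {
        if (query == null || !query.startsWith("id=")) {
            return "ERROR bad request";
        }
        String row = ROWS.get(query.substring(3));
        return row == null ? "ERROR not found" : row;
    }

    // Starts the server; a production server would also map errors to 4xx codes.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/row", exchange -> {
            byte[] body = respond(exchange.getRequestURI().getRawQuery())
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/plain; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        ROWS.put("42", "42|widget|3.50");
        start(8080); // then try: http://localhost:8080/row?id=42
    }
}
```

Keeping the lookup in a separate respond method means the request handling can be exercised without opening a socket, and the server itself holds no per-client state between requests.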
In C++, HTTP is slightly more work. It is not part of the language, so you must find a third-party library.
You are actually far more familiar with this than you think. HTTP is the protocol the browser uses to either download a webpage or download a binary file. If you watch your browser in action with a packet sniffer, you will soon understand HTTP protocol.
See http://mindprod.com/jgloss/http.html and http://mindprod.com/jgloss/packetsniffer.html
If worst comes to worst, you can handle HTTP yourself at the raw socket level in C. I did this in Java when I first started out. I found that easier than understanding the high-level method documentation.
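Handling HTTP at the raw socket level is less work than it sounds. The following is a minimal Java sketch, not a complete client: it sends an HTTP/1.0 GET (so that the server closes the connection after one reply), skips the response headers, and returns the body as text. The class and method names are invented for the example, and it assumes a small text response from a well-behaved server.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical hand-rolled HTTP GET over a plain socket.
final class RawHttpGet {
    // Builds a minimal HTTP/1.0 request: request line, Host header, blank line.
    static String buildRequest(String host, String pathAndQuery) {
        return "GET " + pathAndQuery + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "\r\n";
    }

    // Extracts the numeric status from a status line such as "HTTP/1.0 200 OK".
    static int statusCode(String statusLine) {
        String[] parts = statusLine.split(" ", 3);
        return Integer.parseInt(parts[1]);
    }

    // Sends the request and returns the response body as text.
    static String fetch(String host, int port, String pathAndQuery) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, pathAndQuery).getBytes(StandardCharsets.US_ASCII));
            out.flush();

            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            String statusLine = in.readLine();
            if (statusLine == null) {
                throw new IOException("connection closed before any response");
            }
            int status = statusCode(statusLine);

            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                // Skip response headers up to the blank line that ends them.
            }
            StringBuilder body = new StringBuilder();
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
            if (status != 200) {
                throw new IOException("HTTP status " + status);
            }
            return body.toString();
        }
    }
}
```

The same three steps (write the request text, read the status line and headers, read the body until the socket closes) carry over directly to a C or C++ client using Winsock.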
Roedy Green - 03 Oct 2005 19:28 GMT
>then sometimes not for another 10 minutes during typical usage. Overall though they would use it for a typical working day i.e. 8 hours.
Sockets are for more or less continuous streams. One problem is that a socket takes up considerable resources on the server even when idling. So sockets are not the way to fly. With HTTP the client opens a connection and sends in its request, the server sends back the response (which may take many packets), and then the socket is closed. Then you forget about that user and his state (other than perhaps in the database). Clients must identify themselves afresh on each transaction.
The main problem with HTTP is the fussing about sending packets back and forth to establish and shut down the connection.
Most of the time you just live with it, but there is another option, datagrams, not often resorted to. Their drawbacks are:
1. There is no assurance they were delivered. It is up to application-level software to deal with lost packets.
2. To get the benefits, packets have to be kept quite short, on the order of 100 bytes, in both directions.
3. There is no guarantee packets will be delivered in order.
4. All the usual tools presume HTTP, not datagrams.
Their advantages are:
1. No setup overhead.
2. They work well peer to peer.
I would not normally bring that option up, but in your case, it looks as though you have very high volumes, and very simple data structure.
What you might do is code first in HTTP, then as your load grows, spin just some stable but significant part of it over to datagrams.
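To make the datagram drawbacks concrete, here is a hedged sketch of a UDP client that deals with lost and out-of-order packets itself: each request carries a sequence number, the client waits with a timeout, and it resends if no matching reply arrives. The wire layout (a 4-byte sequence number followed by a UTF-8 key, with the reply echoing the sequence number) is an invented convention for this example, as are the class and method names.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical UDP request/reply client with application-level retry.
final class DatagramClient {
    // Request layout (assumption): 4-byte big-endian sequence number, then the key.
    static byte[] encodeRequest(int sequence, String key) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + keyBytes.length).putInt(sequence).put(keyBytes).array();
    }

    // True when a reply of the given length starts with the expected sequence number.
    static boolean matches(byte[] reply, int length, int sequence) {
        return length >= 4 && ByteBuffer.wrap(reply, 0, length).getInt() == sequence;
    }

    // Sends the request, resending after each timeout or stale reply.
    // Returns the reply payload that follows the 4-byte sequence header.
    static byte[] request(InetAddress host, int port, int sequence, String key,
                          int timeoutMs, int maxAttempts) throws IOException {
        byte[] out = encodeRequest(sequence, key);
        byte[] in = new byte[512];
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(timeoutMs);
            for (int attempt = 0; attempt < maxAttempts; attempt++) {
                socket.send(new DatagramPacket(out, out.length, host, port));
                try {
                    DatagramPacket reply = new DatagramPacket(in, in.length);
                    socket.receive(reply);
                    if (matches(in, reply.getLength(), sequence)) {
                        return Arrays.copyOfRange(in, 4, reply.getLength());
                    }
                    // Stale or out-of-order reply: fall through and resend.
                } catch (SocketTimeoutException lost) {
                    // No reply in time: treat the packet as lost and retry.
                }
            }
        }
        throw new IOException("no reply after " + maxAttempts + " attempts");
    }
}
```

Because a resent request may be processed twice by the server, this pattern suits read-only lookups best; anything that changes data would need the server to recognise duplicate sequence numbers.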
Davey - 03 Oct 2005 22:26 GMT
>>then sometimes not for another 10 minutes during typical usage. Overall though they would use it for a typical working day i.e. 8
[quoted text clipped - 10 lines]
> The main problem with HTTP is the fussing about sending packets back and forth to establish and shut down the connection.
I think HTTP is the way for me.
> Most of the time you just live with it, but there is another option, datagrams, not often resorted to. Their drawbacks are:
[quoted text clipped - 11 lines]
> I would not normally bring that option up, but in your case, it looks as though you have very high volumes, and very simple data structure.
I have *potentially* high volumes, and yes I do have a very simple data structure.
> What you might do is code first in HTTP, then as your load grows, spin just some stable but significant part of it over to datagrams.
The datagrams suggestion is interesting and I will definitely read into it.
Roedy Green - 04 Oct 2005 17:57 GMT
>The main problem with HTTP is the fussing about sending packets back and forth to establish and shut down the connection.
HTTP has a large number of optional fields in the header. These are there to help the browser deal with a generic source and render it. In your case almost none of them, not even the length field, is really necessary. Everything you need to know will be in an identifying header of the message itself. Pruning those down will help speed transmission.
Roedy Green - 03 Oct 2005 19:29 GMT
>Sorry... what do you mean by "womb"?
See http://mindprod.com/jgloss/womb.html
Roedy Green - 03 Oct 2005 19:33 GMT
>OK, my solution to this problem is reduce the number of users or build more servers. The 10,000 was just a figure I hoped I would fit onto one server...
With some apps you can do that. With others you can't partition the data easily.
Presumably, then, each group of users has its own private set of data, or the global data they are accessing is slowly changing so you could replicate it on each server and keep it up to date.
A database that is difficult to partition would be, say, a massive global instant messaging system.
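When each group of users has its own private data, routing a request to the right server can be as simple as a deterministic function of the group identifier. The sketch below is one possible approach, not the only one: it hashes a hypothetical group id onto a fixed number of servers. The class name and the idea of a string group id are assumptions for illustration.

```java
// Hypothetical router: picks a server index for a group of users.
final class GroupRouter {
    // Maps a group id to one of serverCount servers.
    // Math.floorMod keeps the result non-negative even for negative hash codes.
    static int serverFor(String groupId, int serverCount) {
        return Math.floorMod(groupId.hashCode(), serverCount);
    }

    public static void main(String[] args) {
        System.out.println("group 'acme' -> server " + serverFor("acme", 4));
    }
}
```

The catch with plain hashing is that changing the number of servers moves most groups to a different server, so an explicit lookup table of group to server may be the safer choice if servers are added over time.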
Davey - 03 Oct 2005 22:31 GMT
>>OK, my solution to this problem is reduce the number of users or build more
[quoted text clipped - 5 lines]
> Presumably then the each group of users has its own private set of data,
Yes, exactly.
Roedy Green - 03 Oct 2005 19:35 GMT
>> If you are trying to do this on the cheap, have a look at how BitTorrent works. You might do something similar to distribute the
[quoted text clipped - 5 lines]
>I would like it to be suitable for something like this but unfortunately it isn't.
Just making clear that I am not suggesting BitTorrent itself, just the BitTorrent technique of getting the users to store your database for you and serve it up amongst themselves, so you can scale up without requiring a more powerful server. Users bring some storage, bandwidth and compute serving power with them.
Davey - 03 Oct 2005 22:34 GMT
>>> If you are trying to do this on the cheap, have a look at how BitTorrent works. You might do something similar to distribute the
[quoted text clipped - 12 lines]
> requiring a more powerful server. Users bring some storage, bandwidth and compute serving power with them.
The problem is that this will not just be used by users at home - it will also be used by businesses, and the nature of the project is such that bandwidth in these businesses will be stretched to its limit. This means the distributed model of BitTorrent isn't appropriate for non-home users (because they will have no spare bandwidth) - unfortunately.
Roedy Green - 03 Oct 2005 19:40 GMT
>The data is changing constantly and will potentially involve large volumes (millions of rows).
That implies high-end (read: expensive) SQL engines. It also means your servers will need a ton of RAM. As I mentioned earlier, even the cleverest SQL engine could not deal with your problem without having most of the database in RAM.
Unless you have deep pockets, this project may be somewhat before its time.
To get the sort of performance you need, you would have to store the records physically on disk sorted by likelihood of use, with a totally in-RAM index and most of the likely records cached. I don't know if you can get that in SQL.
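Keeping the most likely records in RAM can also be done in the Java server itself rather than in the SQL engine. A minimal sketch of a least-recently-used cache, built on the standard LinkedHashMap access-order mode, is shown below. The class name and capacity are assumptions for illustration, and because the data changes constantly a real cache would also need some way to invalidate or expire stale rows, which this sketch leaves out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache: evicts the least recently used row when full.
final class RowCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    RowCache(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

LinkedHashMap is not thread-safe, so a server handling many requests at once would wrap the cache with Collections.synchronizedMap or guard it with its own lock.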
Davey - 03 Oct 2005 22:36 GMT
>>The data is changing constantly and will potentially involve large volumes (millions of rows).
[quoted text clipped - 11 lines]
> in-RAM index with most of the likely records cached. I don't know if you can get that in SQL.
Hmmmm.
The capacity issues that you are emphasising to me will simply be dealt with by restricting the number of users per server. To be honest this project could be successful whether there are 10000, 1000 or 250 simultaneous users per server.
Roedy Green - 04 Oct 2005 08:39 GMT
>The capacity issues that you are emphasising to me will simply be dealt with by restricting the number of users per server. To be honest this project could be successful whether there are 10000, 1000 or 250 simultaneous users per server.
That gets you off a lot of hooks. It also lets you use cheap small servers rather than great honking expensive ones. I was looking at an ad the other day from HP for rackmount servers by the dozen. Each had a 64-bit Opteron CPU.
Monique Y. Mudama - 04 Oct 2005 22:42 GMT
["Followup-To:" header set to comp.lang.java.programmer.]
On 2005-10-03, Roedy Green penned:
> There are three most likely approaches:
>
[quoted text clipped - 11 lines]
> packets yourself. Particularly useful if there is peer to peer communication.
Why would sockets in particular have more trouble with many users than other methods?
This isn't just an idle question; we're looking at switching an applet from using HTTP to raw sockets because of the HTTP limit on the number of simultaneous connections to a single server.
 Signature monique
Ask smart questions, get good answers: http://www.catb.org/~esr/faqs/smart-questions.html
Jon Martin Solaas - 03 Oct 2005 14:18 GMT
> I am planning on developing an application which will involve skills that I have very little experience of - therefore I would appreciate comments on my
[quoted text clipped - 42 lines]
> - What is the best method to allow the C++ client app to speak to the Java server application (e.g. request, send and receive data) - what about SOAP?
SOAP would work, as would web services in general.
> - Will a Java server-side app cope efficiently with for example 10,000 simultaneous client connections with each requesting data roughly once every 10 seconds?
That depends on a lot of things, but it doesn't sound like a smart idea to let every client keep its own permanent/persistent connection; keeping state for 10,000 clients eats resources that could be spent more wisely. Keep it stateless if you can.
> - In these plans I have given the web application a direct connection to the source database. Is this a serious security risk and if it is what steps should I take to minimise this risk?
If it is possible for the end user to post a form with values that will actually be interpreted as SQL by the database, you have a serious risk. Also, if you have your web server in a demilitarized zone and your app server inside the firewall, it means that you need to open up database access from the DMZ, which you otherwise wouldn't have to.
> - Is it easy to encrypt the data transmission between the C++ client app and the Java server app?
HTTPS?
> - Should I consider any different approaches to the ones mentioned in my rough design plans?
Consider whether you really need to allow access directly from the web application to the database. Generally it sounds like a bad idea to have two ways of accessing the database, but you may have a good reason?
When it comes to the database, there are alternatives to MySQL. MySQL is running all over, so it's a proven solution, and if you encounter any problems it's likely that someone has already solved them or found a workaround. MySQL has a cache mechanism and a pretty solid JDBC driver (Java connectivity). But MySQL does not yet have all the features one normally expects from an RDBMS: triggers and stored procedures are scheduled for version 5, as far as I know, and I recently learned that it can't use more than one index per table when performing multi-table queries. I suppose it has row-level locking these days? So you should really study the feature set and see if it is sufficient for your needs. If it is, then go for it. Alternatives would be MaxDB (formerly SAP DB) or PostgreSQL.
> Please bear in mind that I have almost no experience of writing client-server applications so any comments, advice, tips or pointers to relevant reading material would be greatly appreciated.
Read up on J2EE architecture. Free books are available at www.theserverside.com, and lots of useful stuff is available from Sun too.
> TIA.
 Signature jon martin solaas
Davey - 03 Oct 2005 18:21 GMT
>> I am planning on developing an application which will involve skills that I have very little experience of - therefore I would appreciate comments
[quoted text clipped - 45 lines]
>
> Soap would work, webservices also.
Would this offer any benefits over the suggestions by Roedy of HTTP GET and sockets? Or would SOAP work in conjunction with these?
>> - Will a Java server-side app cope efficiently with for example 10,000 simultaneous client connections with each requesting data roughly once
[quoted text clipped - 4 lines]
> keeping state for 10.000 clients eats resources that could be spent wiser. Keep it stateless if you can.
OK.
>> - In these plans I have given the web application a direct connection to the source database. Is this a serious security risk and if it is what
[quoted text clipped - 5 lines]
> appserver inside the firewall it means that you need to open up for database access from the dmz, which you otherwise wouldn't have.
Yes, I'm aware of these issues.
>> - Is it easy to encrypt the data transmission between the C++ client app and the Java server app?
>
> https?
OK.
>> - Should I consider any different approaches to the ones mentioned in my rough design plans?
>
> Consider if you really need to allow access directly from the web application to the database. Generally it sounds like a bad idea to have two ways of accessing the database, but you may have a good reason?
The main reason is that the two methods of accessing the data - the client app and the website - are for two different purposes and two different types of user. The web app is more for management, whereas the client app is just for a standard user.
> When it comes to database; there are alternatives to mySQL. mySQL is running all over, so it's a proven solution and if you encounter any
[quoted text clipped - 3 lines]
> normally expect from a rdbms, triggers and stored procedures are scheduled for version 5, as far as I know,
I use SPs a lot in my normal DBMS so I would miss them.
> and I recently learned that it can't facilitate more than one index per table when performing multitable queries. I suppose it has row-level locking these days?
> So you should really study the feature set and see if it is sufficient for your needs. If it is then go for it. Alternatives would be MySQL MAX (former SAP db) or PostgreSQL.
Is PostgreSQL free and as fast as MySQL?
>> Please bear in mind that I have almost no experience of writing client-server applications so any comments, advice, tips or pointers to relevant reading material would be greatly appreciated.
>
> Read up on J2EE architecture. Free books available at www.theserverside.com, and lots of useful stuff available from Sun too.
Thank you very much (to you and everyone else who has been helpful).
Roedy Green - 03 Oct 2005 19:48 GMT
>Would this offer any benefits over the suggestions by Roedy of HTTP GET and sockets? Or would SOAP work in conjunction with these?
SOAP is a protocol that piggybacks on HTTP.
HTTP just delivers raw bytes. They could be ASCII characters, locale-encoded chars, UTF, XML, big-endian binary, little-endian binary, serialised objects, downloaded files...
For your volumes, you want to pack your messages as tight as possible and design them to require minimal processing. To me that means binary.
SOAP is a species of XML with envelopes to help you tell what kind of data you have. If you have many different sorts of messages going back and forth, SOAP provides a way of identifying them and packing data in a standard way.
The price you pay is obscene overhead. In your case you don't need that flexibility. You only have one kind of message if I understood you.
see http://mindprod.com/jgloss/xml.html
Davey - 03 Oct 2005 22:40 GMT
>>Would this offer any benefits over the suggestions by Roedy of HTTP GET and
[quoted text clipped - 16 lines]
>
> The price you pay is obscene overhead.
OK, no SOAP then.
> In your case you don't need that flexibility. You only have one kind of message if I understood you.
Well, yes, I only have one primary kind of message sent from the server to the client. However, there will be different types of messages depending on the response from the client. Overall, I would say there would be three or four different types of messages sent.
Roedy Green - 04 Oct 2005 08:46 GMT >Well, yes I only have one primary kind of message sent from the server to >the client. However there will be different types of messages depending on >the response from the client. Overall, I would say there would be three or >four different types of messages sent. Complexity happens.
So there is a good chance that the number of message types will expand, but you are still in the ballpark where you can compose those messages with DataOutputStream or LEDataOutputStream without being overwhelmed. This is the fastest way to compose and disassemble messages.
When you have scores of different messages then you need generic tools so you don't have to hand code the construction and taking apart of each message, e.g. RMI or serialised objects or CORBA.
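As a rough illustration of hand-coding a message with DataOutputStream, here is a minimal sketch. The class name RowMessage, the type code TYPE_ROW and the three fields are assumptions made up for this example; the thread does not specify the actual row layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hand-coded binary message: one type byte followed by fixed fields. */
class RowMessage {
    // Hypothetical message type code; the thread does not define any.
    static final byte TYPE_ROW = 1;

    final long rowId;
    final int quantity;
    final String label;

    RowMessage(long rowId, int quantity, String label) {
        this.rowId = rowId;
        this.quantity = quantity;
        this.label = label;
    }

    /** Compose the message. DataOutputStream always writes big-endian. */
    byte[] encode() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(TYPE_ROW);
            out.writeLong(rowId);
            out.writeInt(quantity);
            out.writeUTF(label); // 2-byte length prefix, then modified UTF-8
        } catch (IOException e) {
            // Not expected when writing to an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Take the message apart again, rejecting unknown type codes. */
    static RowMessage decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte type = in.readByte();
            if (type != TYPE_ROW) {
                throw new IllegalArgumentException("unknown message type " + type);
            }
            long rowId = in.readLong();
            int quantity = in.readInt();
            String label = in.readUTF();
            return new RowMessage(rowId, quantity, label);
        } catch (IOException e) {
            // Thrown, for example, when the data is truncated.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = new RowMessage(42L, 7, "widget").encode();
        RowMessage back = decode(wire);
        System.out.println(wire.length + " bytes, label=" + back.label);
    }
}
```

A C++ client reading this would have to follow the same layout: big-endian integers and a 2-byte length before the string bytes. For plain ASCII labels, the modified UTF-8 that writeUTF produces is identical to ordinary UTF-8.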
Messages coming back from the server are raw bytes. However messages going in have to be ASCII text. To pass awkward characters or binary in, you need to use URLEncoding. See http://mindprod.com/jgloss/urlencoded.html
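A small sketch of URL-encoding a request parameter in Java follows. The "/row" path and the "key" parameter name are hypothetical, chosen only to show the encoding step.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class QueryBuilder {
    /** Build a GET request path for a hypothetical "/row" endpoint. */
    static String rowQuery(String key) {
        try {
            // Spaces become '+', other awkward characters become %XX escapes.
            return "/row?key=" + URLEncoder.encode(key, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rowQuery("a b&c=d")); // prints /row?key=a+b%26c%3Dd
    }
}
```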
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Again taking new Java programming contracts.
Roedy Green - 03 Oct 2005 19:49 GMT >Is PostgreSQL free and as fast as MySQL? see http://mindprod.com/jgloss/postgresql.html for a comparison.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Again taking new Java programming contracts.
Mark Matthews - 04 Oct 2005 18:19 GMT >>Is PostgreSQL free and as fast as MySQL? > > see http://mindprod.com/jgloss/postgresql.html > for a comparison. Keep in mind Roedy's comparison is close to three years out-of-date ;)
Both products are quite different now, and have new features (and shortcomings), just like any other software product would.
Some examples of where Roedy's information is inaccurate:
* MySQL has had transactions since before Roedy's web page was written, and they use MVCC, just like PostgreSQL and Oracle
* MySQL has many datatypes too, and has for quite some time
* MySQL has had network access via SSL since before this page was written
* There are plenty of people who use MySQL under heavy loads without it crashing, (Yahoo, Sabre, Cox Communications, Live Journal, Wikipedia to name a few).
Personally, I believe there are very few people who can do an RDBMS comparison justice, since you really have to be an expert on all products compared, and it takes a considerable amount of time to be an expert on just a single one of these systems. There are so many differences that you can't capture them on a single page anyway.
It _will_ be easier to find folks with MySQL skillsets, as well as books and other reference material, since both resources are more widely available for MySQL compared to PostgreSQL.
-Mark
 Signature Mark Matthews MySQL AB, Software Development Manager - Connectivity www.mysql.com
Roedy Green - 04 Oct 2005 18:49 GMT >* MySQL has had transactions since before Roedy's web page was written, >and they use MVCC, just like PostgreSQL and Oracle [quoted text clipped - 6 lines] >crashing, (Yahoo, Sabre, Cox Communications, Live Journal, Wikipedia to >name a few). If it has been inaccurate since it was written, why did you wait till now to pipe up?
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Again taking new Java programming contracts.
Joe Weinstein - 04 Oct 2005 19:26 GMT >>* MySQL has had transactions since before Roedy's web page was written, >>and they use MVCC, just like PostgreSQL and Oracle [quoted text clipped - 9 lines] > If it has been inaccurate since it was written, why did you wait till > now to pipe up? Well, if we can assume the doc's copyright date means the doc was written originally in 1996, and Mark only read it yesterday, and found data that is wrong and has always been wrong since '96, that would explain it. However, Mark does not know how this document has evolved, so there is a small chance that it may not originally have had incorrect info, but only later did you add information that was wrong. In '86 I posted an alternate cosmology that involves the concept of 'flavored gravity'. No one has posted a single challenge to it yet, so it must be right! Joe
Roedy Green - 05 Oct 2005 03:01 GMT >Well, If we can assume the doc's copyright date means the doc was written >originally in 1996, The document itself was written perhaps two or three years ago. The glossary as a whole has that date. I started it circa 1994-1995 for my own use, then first published it circa 1996.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Again taking new Java programming contracts.
Mark Matthews - 04 Oct 2005 20:26 GMT >>* MySQL has had transactions since before Roedy's web page was written, >>and they use MVCC, just like PostgreSQL and Oracle [quoted text clipped - 9 lines] > If it has been inaccurate since it was written, why did you wait till > now to pipe up? Roedy,
Perhaps because I'm not omniscient? I'm too busy to spend each day googling for webpages about MySQL, and then asking them to be updated.
I only stumbled on your site because of usenet, as at least for a little while longer the signal-to-noise in this group means meaningful discussions go on :|
-Mark
 Signature Mark Matthews MySQL AB, Software Development Manager - Connectivity www.mysql.com
Monique Y. Mudama - 04 Oct 2005 21:45 GMT ["Followup-To:" header set to comp.lang.java.programmer.] On 2005-10-03, Roedy Green penned:
>>Is PostgreSQL free and as fast as MySQL? > > see http://mindprod.com/jgloss/postgresql.html for a comparison. Roedy --
I will almost swear I've used transactions recently in mySQL. Are you sure they're still not supported?
 Signature monique
Ask smart questions, get good answers: http://www.catb.org/~esr/faqs/smart-questions.html
Roedy Green - 03 Oct 2005 19:51 GMT >Thank you very much (to you and everyone else who has been helpful). Usually someone with a project this expensive would not be taking advice from newsgroup denizens.
I hope all you are doing is checking out the viability of your project should it take off. What you do for a prototype and what you do to seriously handle these sorts of volumes are quite different.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Again taking new Java programming contracts.
Davey - 03 Oct 2005 22:49 GMT >>Thank you very much (to you and everyone else who has been helpful). > [quoted text clipped - 4 lines] > should it take off. What you do for a prototype and what you do to > seriously handle these sorts of volumes are quite different. The project is entirely my own idea, and I won't put any serious cash into it unless I am confident it is up to the job. I have decent test kit available and over 100 test users already available when necessary.
Thanks again Roedy.
Monique Y. Mudama - 04 Oct 2005 21:45 GMT ["Followup-To:" header set to comp.lang.java.programmer.] On 2005-10-03, Davey penned:
> Is PostgreSQL free and as fast as MySQL? More free than mySQL. There have been various claims between the two about relative speed over the years. I believe that postgreSQL has always been more full-featured than MySQL; at various times MySQL has been faster.
PostgreSQL is also neat in that it supports OO in the DB. You can inherit a flight attendant table from a person table, etc. I've never yet found a use for it, but it may be one of those things you have to understand well before you see why you care, like OO in code.
 Signature monique
Ask smart questions, get good answers: http://www.catb.org/~esr/faqs/smart-questions.html
Branimir Maksimovic - 03 Oct 2005 23:59 GMT > I am planning on developing an application which will involve skills that I > have very little experience of - therefore I would appreciate comments on my [quoted text clipped - 24 lines] > licensing benefits. This side of things is easy to me as I specialise in > databases (although primarily MS SQL Server). MySQL has table-level locking. That is no problem when just reading, but writes will usually hang if there are a lot of reads and writes.
> The server side application will be written in Java and will use JDBC to > connect to the database. I am planning on implementing a cache into this app > so that database reads are relatively infrequent in comparison to data > requests from the client apps. No need to cache, as the DB and OS will cache it for you. The other question is whether Java and MySQL can handle thousands of simultaneous connections at all. I'm sure you can't have thousands of connections on MySQL without DoS'ing the whole system (or you'll need a very big machine). Even if you use NIO in Java, there is a lot of overhead in the interface and all those conversions. In C++, just polling over several thousand fds burns a lot of CPU cycles. Thread per connection is out of the question in this case, so you should queue pending connections, process perhaps (number of CPUs * 4) requests in parallel, and close connections as soon as possible in order to reduce the number of fds polled. IME there are rarely more than a few dozen simultaneous parallel requests out there, whether or not you have more than the maximum number of pending connections. You should measure how many requests are simultaneous and adjust the number of worker threads according to that and the number of CPUs in the system (the leader/follower pattern is useful in this case). If there are more parallel requests than the CPUs can handle (the system becomes unresponsive), the only choice is to buy more CPUs. Obviously PCs have a very limited maximum number of CPUs, so in that case you should resort to a cluster of PCs or buy an expensive multi-CPU machine (assuming, of course, that link speed is not the limitation).
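The sizing rule above (roughly number of CPUs * 4 workers, with pending requests held in a queue) can be sketched in Java with a bounded thread pool. This is a plain ThreadPoolExecutor, not an implementation of the leader/follower pattern; the class name WorkerPool, the queue limit of 1000 and the empty handle method are assumptions for illustration only.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class WorkerPool {
    /** Pool with cpus * 4 worker threads and a bounded queue of pending requests. */
    static ThreadPoolExecutor create(int cpus, int maxPending) {
        int workers = cpus * 4;
        return new ThreadPoolExecutor(
                workers, workers,                 // fixed number of worker threads
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(maxPending),
                // When the queue is full, the submitting thread runs the task
                // itself, which slows down acceptance of new requests.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Placeholder for the real per-request work (hypothetical). */
    static void handle(int requestId) {
        // Look up one row and write it back to the client here.
    }

    public static void main(String[] args) throws InterruptedException {
        int cpus = Runtime.getRuntime().availableProcessors();
        ThreadPoolExecutor pool = create(cpus, 1000);
        for (int i = 0; i < 20; i++) {
            final int requestId = i;
            pool.execute(() -> handle(requestId));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("worker threads: " + pool.getCorePoolSize());
    }
}
```

The factor of 4 and the queue size are starting points to be tuned by measuring how many requests are actually in flight at once, as the post suggests.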
> The client-side application will potentially run on thousands of client PCs. > It will request data from the server-side app which it will then feed on to > the other Windows application using the API. I am planning on writing this > in C++ as the API is in C++ Hm, you have C++ on non critical side and java on critical side?
> The web application will connect to the database and will allow users to > manage the data. I will probably write this in PHP. With PHP and a web server connecting to MySQL with table-level locking, your server will be DoS-ed before you know it with several thousand connections.
> - Is it easy to encrypt the data transmission between the C++ client app and > the Java server app? Yes, it is easy. You can use available libraries (OpenSSL is one) or write your own.
> - Should I consider any different approaches to the ones mentioned in my > rough design plans? Yes, indeed. You don't need PHP if you have a client app in Java or C++. Think about the server-side app. It can't handle thousands of requests in parallel on a cheap machine without some extreme programmer effort and knowledge.
Greetings, Bane.
Henry Townsend - 04 Oct 2005 00:42 GMT [Removed comp.lang.c++, they don't appreciate this stuff]
> I am planning on making the source database using mySQL due to the obvious > licensing benefits. Sorry for the OT but what are the licensing "benefits" of MySQL? As far as I can tell they have a dual-license policy: it's free for GPLed apps but all other users must buy a commercial license. And IIRC it's not too cheap either. See <http://www.mysql.com/company/legal/licensing/> for instance.
I assume you're not planning to GPL your SW or you wouldn't be playing your cards so close to the vest. So if I was you and I was looking for "obvious licensing benefits" I'd look at PostgreSQL which has a completely open BSD-style license. Or Cloudscape/Derby or HSQLDB for a 100% Java DB.
So are you confused about the MySQL license, or am I confused about either it or your intended use of MySQL?
HT
Davey - 04 Oct 2005 08:12 GMT > So are you confused about the MySQL license This one. :)
Looks like PostgreSQL it is then... thanks.
Stefan Schwetschke - 24 Oct 2005 10:35 GMT [...]
> - What is the best method to allow the C++ client app to speak to the Java > server application (e.g. request, send and receive data) - what about SOAP? > > - Will a Java server-side app cope efficiently with for example 10,000 > simultaneous client connections with each requesting data roughly once every > 10 seconds? [...]
> - Is it easy to encrypt the data transmission between the C++ client app and > the Java server app? [...]
The problem with SOAP is that it is quite slow. This might be no problem for the clients, but it might put some extra load on the server to process all the SOAP requests. Perhaps you might be interested in a faster communication protocol that is easier for the server to process. One possibility is to use CORBA. It is already built into Java, and there are many solutions for C++. You can get a CORBA implementation for nearly every need, free or commercial, with encryption, and very lean and fast. Just ask in an appropriate newsgroup (1).
An alternative to CORBA is ICE from ZeroC (2). It uses a proprietary protocol. An implementation is available under a free and a commercial license. ICE is faster than CORBA and easier to use.
CORBA and ICE might have problems with web proxies and some firewalls. You should test them before you use one of them. SOAP usually works fine in such situations because it uses protocols designed for the WWW.
Geggo
(1: CORBA newsgroups) comp.lang.java.corba comp.object.corba
(2: ICE) http://www.zeroc.com/ice.html
Roedy Green - 24 Oct 2005 14:15 GMT >The problem with SOAP is, that SOAP is quite slow. Another possibility is a serialised object, possibly compressed. Another is RMI. Another is custom binary-format messages on a raw socket, if you don't have too many message types. NIO makes handling raw structs from C with the wrong endianness much easier.
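To illustrate the NIO point, here is a minimal sketch that reads and writes a little-endian record with java.nio.ByteBuffer. The 16-byte layout (two 32-bit ints and a double, with no padding) is a hypothetical C struct invented for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/**
 * Mirrors a hypothetical C struct on a little-endian machine:
 *   struct { int32_t id; int32_t quantity; double price; };
 */
class LittleEndianRecord {
    static final int SIZE = 16; // 4 + 4 + 8 bytes, assuming no padding

    /** Pack the fields in little-endian byte order. */
    static byte[] pack(int id, int quantity, double price) {
        ByteBuffer buf = ByteBuffer.allocate(SIZE).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(id).putInt(quantity).putDouble(price);
        return buf.array();
    }

    /** Read the fields back; the buffer does the byte swapping. */
    static String describe(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        int id = buf.getInt();
        int quantity = buf.getInt();
        double price = buf.getDouble();
        return id + "," + quantity + "," + price;
    }

    public static void main(String[] args) {
        byte[] raw = pack(1, 2, 9.5);
        System.out.println(describe(raw)); // prints 1,2,9.5
    }
}
```

Real C structs may contain compiler-inserted padding, so the offsets should be checked against the actual struct definition before relying on a layout like this.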
See http://mindprod.com/jgloss/remotefileaccess.html for a catalog of possible connection techniques.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Again taking new Java programming contracts.
Stefan Schwetschke - 25 Oct 2005 09:15 GMT [...]
> another possibility is a serialised object, possibly compressed. > Another is RMI. I thought the OP wanted to have Java at one end and C++ at the other end. But you're right, this is not perfectly clear from his post. So Java specific IPC might indeed be an option.
> another custom binary format messages on a raw socket Designing an IPC mechanism from scratch needs some serious experience with distributed systems and their error semantics. One must be especially careful with endianness, field lengths and protocol versions, otherwise one gets some nasty bugs that are very hard to track down. It is also very hard to design a protocol that can cope with network failures. No offence, but I think the OP lacks this experience. Hence my hint to preexisting solutions that could fit his needs.
> See http://mindprod.com/jgloss/remotefileaccess.html > for a catalog of possible connection techniques. WebDAV is another interesting file transfer protocol. It can be easily implemented using servlets. There must be a modular WebDAV implementation somewhere on the Apache Jakarta site.
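To make the concerns about field lengths and protocol versions concrete, here is a minimal sketch of a framed message with an explicit header that is validated on receipt. The magic number, version value, size limit and class name FrameHeader are all assumptions for illustration. The sketch does not address network failures, timeouts or retries.

```java
import java.nio.ByteBuffer;

class FrameHeader {
    static final int MAGIC = 0x44415645;      // hypothetical protocol marker
    static final short VERSION = 1;           // hypothetical protocol version
    static final int MAX_PAYLOAD = 64 * 1024; // hypothetical upper bound
    static final int HEADER_SIZE = 10;        // 4 magic + 2 version + 4 length

    /** Prefix the payload with magic, version and length (big-endian). */
    static byte[] frame(byte[] payload) {
        if (payload.length > MAX_PAYLOAD) {
            throw new IllegalArgumentException("payload too large");
        }
        // ByteBuffer is big-endian unless told otherwise.
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE + payload.length);
        buf.putInt(MAGIC).putShort(VERSION).putInt(payload.length).put(payload);
        return buf.array();
    }

    /** Validate every header field before trusting the payload. */
    static byte[] unframe(byte[] frame) {
        if (frame.length < HEADER_SIZE) {
            throw new IllegalArgumentException("truncated header");
        }
        ByteBuffer buf = ByteBuffer.wrap(frame);
        if (buf.getInt() != MAGIC) {
            throw new IllegalArgumentException("bad magic number");
        }
        short version = buf.getShort();
        if (version != VERSION) {
            throw new IllegalArgumentException("unsupported protocol version " + version);
        }
        int length = buf.getInt();
        if (length < 0 || length > MAX_PAYLOAD || length != buf.remaining()) {
            throw new IllegalArgumentException("bad payload length " + length);
        }
        byte[] payload = new byte[length];
        buf.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        byte[] framed = frame(new byte[] {1, 2, 3});
        System.out.println("frame size: " + framed.length);
        System.out.println("payload size: " + unframe(framed).length);
    }
}
```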
Geggo
Roedy Green - 25 Oct 2005 10:17 GMT >I thought the OP wanted to have Java at one end and C++ at the other >end. But you're right, this is not perfectly clear from his post. So >Java specific IPC might indeed be an option. Even if he does, he can still do a link with C++<->C++ or Java<->Java and then bridge to the other at one end with JNI, sockets etc.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Again taking new Java programming contracts.
Roedy Green - 25 Oct 2005 10:22 GMT On Tue, 25 Oct 2005 09:17:47 GMT, Roedy Green <my_email_is_posted_on_my_website@munged.invalid> wrote, quoted or indirectly quoted someone who said :
>>I thought the OP wanted to have Java at one end and C++ at the other >>end. But you're right, this is not perfectly clear from his post. So >>Java specific IPC might indeed be an option. > >Even if he does, he can still do a link with C++<->C++ or Java<->Java >and then bridge to the other at one end with JNI, sockets etc. This clumsy-looking solution might have special appeal if your link has to be encrypted, needs custom protocols, has fancy authentication, when the C++ end is little endian, when you have a large number of custom message types...
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Again taking new Java programming contracts.
Stefan Schwetschke - 25 Oct 2005 09:15 GMT [...]
> - Will a Java server-side app cope efficiently with for example 10,000 > simultaneous client connections with each requesting data roughly once every > 10 seconds? [...]
People might flame me for posting this to a database group :)
Perhaps Prevayler (1) is an option for you. Prevayler is a framework that can be used to replace a database. Prevayler holds every object in memory, thus it is much faster than a database. It supports clustering and persistence. To some extent it can replace a database with ACID properties. It is NOT a real database, and it doesn't really support transactional isolation, but it can be fast as hell.
Geggo
(1: Prevayler) <http://www.prevayler.org/>
F'up to c.l.j.d