Java Forum / General / June 2005
NIO
technical0@gmail.com - 15 Jun 2005 13:42 GMT
Hi,
I have a little test server and client application. The client connects to the server, sends 10 bytes, then exits and the socket closes. When the server tries to read these bytes I get an error: "An existing connection was forcibly closed by the remote host"
The sockets are non-blocking, so I guess it makes sense that the connection might be closed before all the data is either sent from the client, or read on the server. However, I'd like to ensure that all the data is sent before the close takes effect.
I'm using SocketChannels to send the data, then doing a SocketChannel.close(). I think I need something like "linger" but doing SocketChannel.socket().setSoLinger(...) doesn't seem to help.
Any ideas?
Thanks.

technical0@gmail.com - 15 Jun 2005 13:53 GMT
Ha, don't worry I've fixed it now!

Remon van Vliet - 16 Jun 2005 13:05 GMT
Well, this is not how you're supposed to do it. Like you said, the socket is non-blocking, which means that it will return from your .write() right away, and after that calls the close(). The proper way to do this is this:
1) establish a connection using a non-blocking socket connect()
2) wait for finishConnect() to return true
3) At this point you can send data, do so once the OP_WRITE key is selected after a select() call
4) Write all data (in a secure way, meaning keep track of what you send and make sure you send it all)
5) Once this is done you can close the connection
Now, be sure you actually need non-blocking sockets to begin with. For a limited number of connections, blocking sockets are much easier to work with. Use non-blocking sockets only when you wish to manage multiple connections with one thread.
Remon
> Hi,
> I have a little test server and client application.
[quoted text clipped - 11 lines]
> Any ideas?
> Thanks.

Esmond Pitt - 16 Jun 2005 13:59 GMT
> Well, this is not how you're supposed to do it. Like you said, the socket is
> non-blocking, which means that it will return from your .write() right away,
> and after that calls the close(). The proper way to do this is this :
>
> 1) establish a connection using a non-blocking socket connect()
> 2) wait for finishConnect() to return true

This is generally pointless unless you want to timeout the connection and are already using select(); usually better to use blocking connect with a timeout

> 3) At this point you can send data, do so once the OP_WRITE key is selected
> after a select() call

which will happen immediately the connection is complete, not much point selecting for it really

> 4) Write all data (in a secure way, meaning keep track of what you send and
> make sure you send it all)

this part is correct, you do need to ensure no short or zero-length writes before you close

> 5) Once this is done you can close the connection

and this
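
Steps 4 and 5 can be sketched roughly as follows. This is only an illustration, not code from the thread: the class name WriteFullyThenClose and the helpers writeFully() and demo() are made up for the example, and it assumes the channel is already connected and in non-blocking mode. It tries the write first and waits on OP_WRITE only after a zero-length write, then closes once the buffer is empty.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class WriteFullyThenClose {

    /**
     * Hypothetical helper: writes every remaining byte of buf to a connected,
     * non-blocking channel. It selects for OP_WRITE only when a write made no
     * progress, and blocks in select() until the socket is writable again.
     */
    static void writeFully(SocketChannel ch, ByteBuffer buf) throws IOException {
        try (Selector selector = Selector.open()) {
            SelectionKey key = null;
            while (buf.hasRemaining()) {
                int n = ch.write(buf);          // may write fewer bytes than requested
                if (n == 0) {                   // send buffer full: wait for writability
                    if (key == null) {
                        key = ch.register(selector, SelectionKey.OP_WRITE);
                    }
                    selector.select();
                    selector.selectedKeys().clear();
                }
            }
            if (key != null) {
                key.cancel();
            }
        }
    }

    /** Loopback demo: send 10 bytes, close, and count what the server side reads. */
    static int demo() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.socket().bind(new InetSocketAddress("127.0.0.1", 0));
            int port = server.socket().getLocalPort();
            try (SocketChannel client = SocketChannel.open(new InetSocketAddress("127.0.0.1", port))) {
                client.configureBlocking(false);
                writeFully(client, ByteBuffer.wrap(new byte[10]));
            } // the client is closed here, only after everything was handed to the OS
            try (SocketChannel accepted = server.accept()) {
                ByteBuffer in = ByteBuffer.allocate(64);
                int total = 0;
                int n;
                while ((n = accepted.read(in)) != -1) {
                    total += n;
                }
                return total;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("server read " + demo() + " bytes");
    }
}
```

In a real server the OP_WRITE wait would normally go through the application's shared selector loop rather than a private Selector; the private one here only keeps the sketch self-contained.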

Remon van Vliet - 16 Jun 2005 16:49 GMT
> > Well, this is not how you're supposed to do it. Like you said, the socket is
> > non-blocking, which means that it will return from your .write() right away,
[quoted text clipped - 6 lines]
> and are already using select(); usually better to use blocking connect
> with a timeout

True, but if you want full non-blocking connects, this is how to do it. Call finishConnect as soon as OP_CONNECT has been selected.

> > 3) At this point you can send data, do so once the OP_WRITE key is selected
> > after a select() call
>
> which will happen immediately the connection is complete, not much point
> selecting for it really

It's good practice to do writing once OP_WRITE is selected. That it happens immediately is true, but that doesn't change the fact that it's bad style to assume it. But fair enough, it will work.

> > 4) Write all data (in a secure way, meaning keep track of what you send and
> > make sure you send it all)
[quoted text clipped - 5 lines]
> > and this

Bjorn Borud - 17 Jun 2005 00:50 GMT
["Remon van Vliet" <remon@exmachina.nl>]
| > which will happen immediately the connection is complete, not much point
| > selecting for it really
|
| It's good practice to do writing once OP_WRITE is selected, that it
| happens immediately is true, but doesn't change the fact that it's
| bad style to assume it. But fair enough, it will work.

no, you don't generally want to waste time doing another select to determine if the socket is writable, because it would constitute fairly odd behavior if it *wasn't* writable when connect has completed.
in fact, the way the underlying implementation on UNIX (using the poll() or select() _system calls_) decides if the socket is done connecting is to check if it is writable. OP_CONNECT seems to have been an API design tradeoff made to hide this fact because it may confuse people.
also, when you have a connected socket and you have a scenario where it is plausible that the data you have written has been sent, so the OS can accept more data on that connection, you always try to write the data first and *if* you have some data left that didn't get written, you enqueue that and let the select() loop take care of pushing it through the connection once the socket becomes writable again.
-Bjørn

Bjorn Borud - 16 Jun 2005 14:15 GMT
["Remon van Vliet" <remon@exmachina.nl>]
| 1) establish a connection using a non-blocking socket connect()
| 2) wait for finishConnect() to return true

the proper way to do a non-blocking connect is to issue the connect, register interest in OP_CONNECT with a selector, perform select and then call finishConnect() once the connection has OP_CONNECT in its ready set.
looping around finishConnect() until it returns true is pointless and consumes unnecessary CPU; it is better to connect the socket while in blocking mode and then change it to nonblocking once the connect() has completed.
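
A minimal sketch of the selector-based connect just described, with no loop spinning on finishConnect(). The class name SelectorConnect, the helper connect() and the timeout parameter are assumptions made for this illustration; a real application would register the channel with its existing event-loop selector instead of opening a private one.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class SelectorConnect {

    /**
     * Hypothetical helper: starts a non-blocking connect, waits in select()
     * for OP_CONNECT, and only then calls finishConnect().
     */
    static SocketChannel connect(SocketAddress remote, long timeoutMillis) throws IOException {
        SocketChannel ch = SocketChannel.open();
        try (Selector selector = Selector.open()) {
            ch.configureBlocking(false);
            if (ch.connect(remote)) {
                return ch;                       // connected immediately
            }
            SelectionKey key = ch.register(selector, SelectionKey.OP_CONNECT);
            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (true) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    throw new IOException("connect timed out");
                }
                selector.select(remaining);      // sleeps instead of busy-waiting
                if (selector.selectedKeys().remove(key)
                        && key.isConnectable()
                        && ch.finishConnect()) { // throws if the connect failed
                    key.cancel();
                    return ch;
                }
            }
        } catch (IOException | RuntimeException e) {
            try {
                ch.close();
            } catch (IOException ignored) {
                // the original failure is the one worth reporting
            }
            throw e;
        }
    }

    /** Loopback demo: connect to a local listener and report the result. */
    static boolean demo() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.socket().bind(new InetSocketAddress("127.0.0.1", 0));
            try (SocketChannel ch = connect(server.socket().getLocalSocketAddress(), 5000)) {
                return ch.isConnected();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("connected: " + demo());
    }
}
```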
BUT!
beware of nonblocking connect() with NIO though: the decision to implement OP_CONNECT as a separate operation category seems to have led to some errors in 1.4 implementations of NIO. this is supposedly fixed in 1.5 (I have not verified that this is so, but Sun's bug database says the issue has been addressed in Tiger).
the reason is probably that in the underlying implementation OP_CONNECT is implemented by way of OP_WRITE -- ie. an unconnected socket in connecting state that becomes writable has finished its connect so the OP_CONNECT state doesn't really "exist" (at least on platforms where select() is used). a common error mode for this bug is that this can lead to a Selector never blocking on select() again, which of course, defeats its entire purpose and leads to CPU-hogging busy-wait.
my recommendation is therefore that for code that might run on 1.4 JVMs, *don't* do asynchronous/nonblocking connects; change the mode of the connection after the connect is completed.
after all, that *is* the net effect of what you are describing -- just without the busy-wait.
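
The recommendation above (connect in blocking mode, then switch) might look like the sketch below. The class name BlockingConnectThenSwitch and the helper connect() are invented for the example, and the timeout on the blocking connect follows the suggestion made earlier in the thread; it is an assumption, not something this post specifies.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class BlockingConnectThenSwitch {

    /**
     * Hypothetical helper: performs the connect while the channel is still in
     * blocking mode, then switches it to non-blocking for use with a Selector.
     * OP_CONNECT is never registered, so the 1.4 problem discussed above is
     * sidestepped entirely.
     */
    static SocketChannel connect(SocketAddress remote, int timeoutMillis) throws IOException {
        SocketChannel ch = SocketChannel.open();            // new channels start in blocking mode
        try {
            ch.socket().connect(remote, timeoutMillis);     // blocks for at most timeoutMillis
            ch.configureBlocking(false);                    // from here on, use select()
            return ch;
        } catch (IOException | RuntimeException e) {
            try {
                ch.close();
            } catch (IOException ignored) {
                // the original failure is the one worth reporting
            }
            throw e;
        }
    }

    /** Loopback demo: true if the channel ends up connected and non-blocking. */
    static boolean demo() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.socket().bind(new InetSocketAddress("127.0.0.1", 0));
            try (SocketChannel ch = connect(server.socket().getLocalSocketAddress(), 5000)) {
                return ch.isConnected() && !ch.isBlocking();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("connected and non-blocking: " + demo());
    }
}
```

The trade-off is that the calling thread is held for the duration of the connect, which is usually acceptable for a client making a handful of outgoing connections.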
| 3) At this point you can send data, do so once the OP_WRITE key is selected
| after a select() call
[quoted text clipped - 6 lines]
| Use non-blocking sockets only when you wish to manage multiple connection
| with one thread.

I'd actually recommend first gaining some experience with what NIO is (sort of) modeled on. I have seen a lot of rather experienced Java programmers trying to wrap their heads around NIO and most of the time the problem is that they have no experience in writing applications that perform connection multiplexing in C on UNIX.
write a single-threaded program in C which handles multiple connections in parallel using select() or poll(). for instance a simple download program which can take a list of URLs and download them in parallel from N servers simultaneously.
NIO has its own set of quirks and bugs which you need to learn as well, but you won't come very far until you have a proper understanding of the underlying concepts, OS apis and techniques.
also, read the second edition of W. Richard Stevens' "Unix Network Programming". even if you are a Java programmer and you couldn't care less about UNIX this is still pretty much a "must read" for anyone wanting to understand network programming.
-Bjørn

Esmond Pitt - 17 Jun 2005 10:37 GMT
> my recommendation is therefore that for code that might run on 1.4
> JVMs, *don't* do asynchronous/nonblocking connects; change the mode of
> the connection after the connect is completed.

You don't need to be this drastic. You only need to ensure that you only have OP_CONNECT registered until it fires, then deregister it, and only have OP_WRITE registered *after* that point, and in practice only when necessary, i.e. when you have got a short write, as you have pointed out elsewhere. The problem as you say is that under the hood OP_CONNECT and OP_WRITE are the same thing. Sun's mistake was in trying to distinguish them, which only means that in practice *we* have to distinguish them too, which is silly, as all it really means is that the connection is writable in both cases.
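
The registration discipline described in this post could be sketched as below. The class InterestOpsDiscipline and the handlers onConnectable(), send() and onWritable() are hypothetical names for this illustration, and keeping a single pending ByteBuffer as the key attachment is a simplification. The sketch has not been checked against a 1.4 JVM; it only shows OP_CONNECT and OP_WRITE never being in the interest set at the same time.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class InterestOpsDiscipline {

    /** OP_CONNECT fired: finish the connect and drop OP_CONNECT from the interest set. */
    static void onConnectable(SelectionKey key) throws IOException {
        SocketChannel ch = (SocketChannel) key.channel();
        if (ch.finishConnect()) {
            // Replaces the whole interest set: OP_CONNECT goes away, OP_WRITE is not added.
            key.interestOps(SelectionKey.OP_READ);
        }
    }

    /** Try the write first; ask for OP_WRITE only if the write came up short. */
    static void send(SelectionKey key, ByteBuffer data) throws IOException {
        ((SocketChannel) key.channel()).write(data);
        if (data.hasRemaining()) {
            key.attach(data);   // simplification: one pending buffer per connection
            key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
        }
    }

    /** OP_WRITE fired: push the leftover bytes, then stop selecting for OP_WRITE. */
    static void onWritable(SelectionKey key) throws IOException {
        ByteBuffer pending = (ByteBuffer) key.attachment();
        ((SocketChannel) key.channel()).write(pending);
        if (!pending.hasRemaining()) {
            key.attach(null);
            key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
        }
    }

    /** Loopback demo: returns the interest set after connecting and sending 10 bytes. */
    static int demo() {
        try (ServerSocketChannel server = ServerSocketChannel.open();
             Selector selector = Selector.open();
             SocketChannel ch = SocketChannel.open()) {
            server.socket().bind(new InetSocketAddress("127.0.0.1", 0));
            ch.configureBlocking(false);
            boolean connected = ch.connect(server.socket().getLocalSocketAddress());
            SelectionKey key = ch.register(selector,
                    connected ? SelectionKey.OP_READ : SelectionKey.OP_CONNECT);
            long deadline = System.currentTimeMillis() + 5000;
            while (!ch.isConnected()) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    throw new IOException("connect timed out");
                }
                selector.select(remaining);
                if (selector.selectedKeys().remove(key) && key.isConnectable()) {
                    onConnectable(key);
                }
            }
            send(key, ByteBuffer.wrap(new byte[10]));   // small enough to go out in one write
            return key.interestOps();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("interest set after send: " + demo());
    }
}
```

In the demo the 10-byte write completes at once, so the key ends with only OP_READ registered; onWritable() would be called from the select loop only after a short write had added OP_WRITE.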

Bjorn Borud - 17 Jun 2005 14:35 GMT
[Esmond Pitt <esmond.nospam.pitt@nospam.bigpond.com>]
| You don't need to be this drastic. You only need to ensure that you
| only have OP_CONNECT registered until it fires, then deregister it,
| and only have OP_WRITE registered *after* that point,

it has been a few months since I ran into this problem so I don't remember if that would work, but if it does I'd still be a bit careful. I left big fat warnings all around the code where I set up the connections.
(did you test this in detail or read the underlying selection implementations to verify that this is safe? I just looked at the system call trace and figured out what was going on from that. I didn't look at the implementation.)
| and in practice only when necessary, i.e. when you have got a short
| write, as you have pointed out elsewhere. The problem as you say is
[quoted text clipped - 3 lines]
| silly, as all it really means is that the connection is writable in
| both cases.

from an API point of view I can understand why the designers of NIO chose to do so, but in retrospect it would be justified to call it a mistake because a) the NIO implementation necessarily gets more complicated, which leads to b) the implementation gets less robust and c) when they fail to implement it correctly (as they did) it makes matters even worse.
besides, it isn't *that* bad having to remember that "if it's writable, then it's connected".
-Bjørn

Esmond Pitt - 18 Jun 2005 09:58 GMT
> (did you test this in detail or read the underlying selection
> implementations to verify that this is safe?

Yes and yes.

> besides, it isn't *that* bad having to remember that "if it's
> writable, then it's connected".

I agree completely. There is no difference between the states, all they mean is that a send buffer exists with space in it.

Bjorn Borud - 18 Jun 2005 16:56 GMT
[Esmond Pitt <esmond.nospam.pitt@nospam.bigpond.com>]
| > (did you test this in detail or read the underlying selection
| > implementations to verify that this is safe?
|
| Yes and yes.

so if I've understood you right it'll only freak out if you have OP_READ and OP_CONNECT registered for the same connection at the same time during non-blocking connect()? good!
I think I could fit a workaround for that into the Reactor pattern implementation I use.
if it doesn't work I'll blame you, of course :-)
-Bjørn

Esmond Pitt - 20 Jun 2005 05:06 GMT
> so if I've understood you right it'll only freak out if you have
> OP_READ and OP_CONNECT registered for the same connection at the same
> time during non-blocking connect()? good!

not quite, OP_WRITE and OP_CONNECT

Bjorn Borud - 20 Jun 2005 08:49 GMT
[Esmond Pitt <esmond.nospam.pitt@nospam.bigpond.com>]
| > so if I've understood you right it'll only freak out if you have
| > OP_READ and OP_CONNECT registered for the same connection at the same
| > time during non-blocking connect()? good!
|
| not quite, OP_WRITE and OP_CONNECT

err, yes of course. my mistake.
-Bjørn