Java Forum / General / February 2006
Repeated read From Socket is Truncated
Trouble@Mill - 01 Feb 2006 19:19 GMT
I'm trying to read a bunch of XML from a connected socket. The 1st time I read it, I get all the data correctly. But, the 2nd, and subsequent times, the data is always truncated, and so the parse of the XML fails.
Here's the piece of code in use (edited):
package com.cd3o;

public class NetworkFunctions {
    private Socket socket = null;
    private PrintWriter output;
    private BufferedReader input;
    final int iBuffSize = 1024;
    private StringBuffer inputXML;
    private char[] cBuff;

    public NetworkFunctions() { }

    public void sendListDir(XMLFunctions xml) {
        // String reply;
        try {
            socket = new Socket("The-Tardis", 3616);
            output = new PrintWriter(socket.getOutputStream(), true);
            input = new BufferedReader(new InputStreamReader(socket.getInputStream()));
            output.println(xml.buildListDirXML());
            cBuff = new char[iBuffSize];
            inputXML = new StringBuffer(iBuffSize);
            do {
                inputXML.append(cBuff, 0, input.read(cBuff, 0, iBuffSize));
            } while (input.ready()); // Returns true/false
            System.out.println(inputXML.length());
            xml.parseV(inputXML.toString());
            socket.close();
        } catch (UnknownHostException e) {
            System.out.println("Ooops: UnknownHostException: " + e);
            e.printStackTrace();
        } catch (IOException e) {
            System.out.println("Ooops: IOException: " + e);
            e.printStackTrace();
        }
    }
}
This class is instantiated once into a private variable. The relevant method is called initially from main, and following that, is always called from a "new" Listener.
The 1st time the method is called, the length of the received XML is 51,867 bytes. Subsequent calls to the method, the XML is truncated at 11,340 bytes, or 15,120 bytes, or some other value.
It's probably some stupid newbie mistake, but I can't see what.
Cheers, Eddie
Gordon Beaton - 01 Feb 2006 19:59 GMT
> I'm trying to read a bunch of XML from a connected socket. The 1st time I read it, I get all the data correctly. But, the 2nd, and subsequent times, the data is always truncated, and so the parse of the XML fails.
Your call to ready() is not a valid test for EOF. It may return false many times before you've actually reached the end of the input.
Don't use ready(), simply call read(), which will block until there is data to read. Also note the number of characters it returns each time, which may be less than the number you requested.
/gordon
 Signature [ do not email me copies of your followups ] g o r d o n + n e w s @ b a l d e r 1 3 . s e
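A minimal sketch of the loop Gordon describes: call read() until it returns -1, and append only as many characters as each call actually delivered. The class and method names (ReadUntilEof, readAll) are made up for illustration, and the loop only terminates if the peer eventually closes the connection or shuts down its output.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ReadUntilEof {
    // Reads characters until read() reports end of stream (-1).
    static String readAll(Reader in) throws IOException {
        char[] buf = new char[1024];
        StringBuilder sb = new StringBuilder();
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            sb.append(buf, 0, n); // n may be smaller than buf.length
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the socket's reader in this demo.
        System.out.println(readAll(new StringReader("<directory/>")));
    }
}
```

Checking the return value before appending also avoids passing -1 as a length to append().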
Trouble@Mill - 01 Feb 2006 20:29 GMT
>> I'm trying to read a bunch of XML from a connected socket. The 1st time I read it, I get all the data correctly. But, the 2nd, and
[quoted text clipped - 7 lines]
> data to read. Also note the number of characters it returns each time, which may be less than the number you requested.
Maybe I'm misunderstanding something here, which may well be the case.
This is a client application, that sends a request, and reads the reply. Not a server, that is waiting for input continually.
If read() will block until there is something to read, how do I "break out" to process the data that I've already read.
Cheers, Eddie
Lothar Kimmeringer - 01 Feb 2006 20:47 GMT
> Maybe I'm misunderstanding something here, which may well be the case.
>
> This is a client application, that sends a request, and reads the reply. Not a server, that is waiting for input continually.
Then the code of the server-side might be helpful to see how you read the data.
The output stream returned by the socket might be buffered (it shouldn't be, but don't assume anything here), so you should call flush() before you start reading the response.
> If read() will block until there is something to read, how do I "break out" to process the data that I've already read.
Send the number of characters that will be sent before the actual data starts or a special character at the end of the data indicating exactly that.
BTW: inputXML.append(cBuff, 0, input.read(cBuff, 0, iBuffSize));
will lead to an exception when the end of the stream is reached, because read(...) will return -1 in that case.
BTW2: input = new BufferedReader(new InputStreamReader(socket.getInputStream()));
You should specify the encoding to be used; otherwise the system encoding is used, which may differ from platform to platform.
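A minimal sketch of passing the encoding explicitly. UTF-8 is an assumption here, since the thread never states which encoding the server uses, and the helper name utf8Reader is hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class ExplicitCharset {
    // Wraps a byte stream in a reader that always decodes UTF-8
    // (assumed encoding), regardless of the platform default.
    static BufferedReader utf8Reader(InputStream in) {
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // A byte array stands in for socket.getInputStream() in this demo.
        byte[] bytes = "<file name=\"caf\u00e9\"/>".getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8Reader(new ByteArrayInputStream(bytes)).readLine());
    }
}
```

With a real connection the call would be utf8Reader(socket.getInputStream()).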
Regards, Lothar
 Signature Lothar Kimmeringer E-Mail: spamfang@kimmeringer.de PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)
Always remember: The answer is forty-two, there can only be wrong questions!
Trouble@Mill - 01 Feb 2006 21:43 GMT
> Then the code of the server-side might be helpful to see how you read the data.
Unfortunately, I don't have access to that, other than by sniffing the data.
> Send the number of characters that will be sent before the actual data starts or a special character at the end of the data indicating exactly that.
As I said, I don't have access to the code.
> BTW: inputXML.append(cBuff, 0, input.read(cBuff, 0, iBuffSize));
>
> will lead to an exception when the end of the stream is reached, because read(...) will return -1 in that case.
No. I never get a -1. Without the test for input.ready(), the code just keeps waiting for more input.
> BTW2: input = new BufferedReader(new InputStreamReader(socket.getInputStream()));
>
> You should specify the encoding to be used, otherwise the system encoding will be used that might be different from platform to platform.
Thanks for the tip. Let me get the code working cleanly first before I start worrying about that. <G>
Cheers, Eddie
Gordon Beaton - 01 Feb 2006 20:50 GMT
> This is a client application, that sends a request, and reads the reply. Not a server, that is waiting for input continually.
>
> If read() will block until there is something to read, how do I "break out" to process the data that I've already read.
If you use the connection to send a single request and read a single reply, then simply read to EOF.
If you need to use the connection for more requests and replies, then your protocol needs to define clearly how to recognize the end of a request or a reply. When you use TCP, hoping for timing gaps in the data stream won't get you very far.
/gordon
Trouble@Mill - 01 Feb 2006 21:48 GMT
> If you use the connection to send a single request and read a single reply, then simply read to EOF.
How do I recognise EOF. If I take out the input.ready() test, then the code just keeps waiting for more input. I'm going to guess, because the server doesn't break the connection. And I don't have access to the server to change it.

do {
    gotLen = input.read(cBuff, 0, iBuffSize);
    System.out.println(gotLen);
} while (gotLen != -1);
That loop never ends, and the lengths printed out show that I received the 51,867 bytes I know that this particular data is.
> If you need to use the connection for more requests and replies, then your protocol needs to define clearly how to recognize the end of a request or a reply. When you use TCP, hoping for timing gaps in the data stream won't get you very far.
The way this is working, a single request/reply is all I need.
Thanks for "sticking" with this.
Cheers, Eddie
Gordon Beaton - 02 Feb 2006 07:53 GMT
> How do I recognise EOF. If I take out the input.ready() test, then the code just keeps waiting for more input. I'm going to guess,
[quoted text clipped - 8 lines]
> That loop never ends, and the lengths printed out show that I received the 51,867 bytes I know that this particular data is.
Your client read() will indicate EOF when the server closes() or does shutdownOutput() to the connection. If that isn't happening, it seems to indicate that the server expects to use the connection for more messages (or it's broken).
If that's the case, then you need to be able to *recognise* when you've received a complete valid message by looking at the contents of the data. If there is no clear way to do that, I'd say your protocol has been poorly designed.
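The EOF behaviour described above can be tried locally. The following is only a sketch: the throwaway loopback server, the class name EofDemo and the method countUntilEof are all invented for illustration and have nothing to do with the real server in this thread. The demo server sends a reply and then calls shutdownOutput(), which is what makes the client's read() return -1.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class EofDemo {
    // Starts a local server that sends `reply` and half-closes the connection.
    // Returns how many bytes the client read before read() returned -1.
    static int countUntilEof(byte[] reply) throws IOException, InterruptedException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    OutputStream out = peer.getOutputStream();
                    out.write(reply);
                    out.flush();
                    peer.shutdownOutput();        // this is what the client sees as EOF
                    peer.getInputStream().read(); // wait until the client closes
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();

            int total = 0;
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                InputStream in = client.getInputStream();
                byte[] buf = new byte[1024];
                int n;
                while ((n = in.read(buf)) != -1) { // -1 only after the half-close
                    total += n;
                }
            }
            serverThread.join();
            return total;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countUntilEof(new byte[51867]) + " bytes read before EOF");
    }
}
```

If the shutdownOutput() call is removed from the demo server, the client loop blocks in read() after the last byte, which matches the behaviour reported earlier in the thread.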
/gordon
Trouble@Mill - 02 Feb 2006 17:35 GMT
> Your client read() will indicate EOF when the server closes() or does shutdownOutput() to the connection. If that isn't happening, it seems to indicate that the server expects to use the connection for more messages (or it's broken).
Yeah. It's what I'd say is a "normal" client-server connection. The client connects, sends a request, waits for response, processes response, decides what to do next, either remain connected to do more work, or disconnect. It NEVER expects the server to disconnect. It's just that Java doesn't recognize when that response is complete, and to return control back to the client code.
> If that's the case, then you need to be able to *recognise* when you've received a complete valid message by looking at the contents of the data. If there is no clear way to do that, I'd say your protocol has been poorly designed.
Quite the opposite. I'd say that a protocol that relies on "data content" to know when a transmission is complete is the poorly designed one. The socket protocol, for TCP, *DOES* know when the transmission is complete, and *DOES* signal that to the client. It works perfectly in C using recv(). That continues to receive data, until the complete message has been accepted, and then it signals completion, so the code can then process what it's just received. Why can't Java do that?
Cheers,
Eric Sosman - 02 Feb 2006 18:12 GMT Trouble@Mill wrote On 02/02/06 12:35,:
>> Your client read() will indicate EOF when the server closes() or does shutdownOutput() to the connection. If that isn't happening, it seems
[quoted text clipped - 21 lines]
> completion, so the code can then process what it's just received. Why can't Java do that.
Java can't do it because C can't do it, and neither can C++ or Python or COBOL or APL or assembly language.
What you fail to realize (don't feel ashamed; many others before you have failed to realize it) is that TCP provides a byte stream, not a message stream. The only two "boundaries" in a byte stream are the beginning and the end. There is no "end of request" or "end of response" marker in the byte stream. There is nothing to divide byte N from byte N+1, or to say that they belong to different "messages."
Contrary to your assertion, TCP does *not* know when the transmission is complete. As long as the connection continues to exist, there is the possibility that either side could suddenly decide to generate some more data, and this data is in no way separated from what's gone before -- it's just the N+1st, N+2nd, ... bytes that follow the N that have already been sent. The only way TCP can know that there's no more data forthcoming is when somebody closes his end of the connection.
So: If you want to send multiple "messages" or "transactions" on a single TCP connection, you need to arrange some kind of convention to indicate where the logical divisions in the continuous byte stream occur. Some common conventions are
- All messages consist of a fixed number of bytes. The receiver keeps reading until N bytes have arrived, at which point it knows it has received a complete message.
- Messages are of variable length, but each is preceded by a fixed-length count. The sender transmits "Here comes an N-byte message" followed by N message bytes. The receiver reads the fixed-length N value and then reads N more bytes.
- Messages are of variable length, but each is followed by some kind of "sentinel" to mark the end of the message. (Obviously, the "sentinel" must be something that cannot appear as part of the message "payload.") The sender transmits the message bytes followed by the sentinel, and the receiver just keeps on reading until the sentinel is received.
There are, of course, about a bazillion variations on these themes.
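As an illustration of the second convention (a fixed-length count followed by the payload), here is a minimal sketch. It assumes you control both ends of the connection, which is not the case for the server in this thread; the 4-byte big-endian length, the UTF-8 encoding and the names LengthPrefixedFraming, writeMessage and readMessage are choices made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class LengthPrefixedFraming {
    // Sender: "here comes an N-byte message", then the N bytes.
    static void writeMessage(DataOutputStream out, String text) throws IOException {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        out.writeInt(payload.length); // fixed-length count (4 bytes)
        out.write(payload);
        out.flush();
    }

    // Receiver: read the count, then exactly that many bytes.
    static String readMessage(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("negative length: " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload); // blocks until all bytes arrived, or throws EOFException
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // An in-memory buffer stands in for the TCP connection.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        writeMessage(out, "<directory command=\"Directory\"/>");
        writeMessage(out, "<directory/>");

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readMessage(in));
        System.out.println(readMessage(in));
    }
}
```

Both messages travel back to back in one byte stream, and the receiver still separates them without the connection being closed.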
If you want a protocol that has built-in boundaries, there's no shortage: UDP is one such, and others exist. But TCP is not such a protocol, and you will never get anywhere trying to treat it as such.
 Signature Eric.Sosman@sun.com
Lothar Kimmeringer - 02 Feb 2006 18:46 GMT
> It's just that Java doesn't recognize when that response is complete, and to return control back to the client code.
As Eric already pointed out, other languages are not able to do that, either.
> Quite the opposite. I'd say that a protocol that relies on "data content" to know when a transmission is complete is the poorly designed one. The socket protocol, for TCP, *DOES* know when the transmission is complete, and *DOES* signal that to the client.
As Eric already pointed out ... There is just one point where you might be right. If the whole message fits into one TCP-packet (AFAIR something up to 55 KB) your point of view can be right.
But since the days of X.25 over ISDN I have never again seen a protocol (e.g. OFTP) that relies on that. Even these kinds of protocols don't use single packets for transferring "X.25 packets" over TCP/IP, but wrap them in a packet format of their own (e.g. OFTP over TCP/IP).
> It works perfectly in C using recv().
Maybe the size of the response is fixed (recv() reads in data up to a specified length - like inputstream.read(byte[], off, len)) so you should initialize cBuf with that size and read until the buffer is full.
> That continues to receive data, until the complete message has been accepted, and then it signals completion, so the code can then process what it's just received. Why can't Java do that.
If the fixed-size-theory is correct the code should look like this:
byte[] buf = new byte[bufferSize];
int readSum = 0;
int read;
while (readSum < buf.length
        && (read = input.read(buf, readSum, buf.length - readSum)) != -1) {
    readSum += read;
}
xml.parseV(new String(buf, 0, readSum, encoding));
You can't use a Reader here: TCP/IP is byte-based, so a fixed data size will always be specified in bytes, and a byte isn't necessarily one character in the encoding being used (e.g. UTF-8).
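A small illustration of why byte counts and character counts differ. The sample string and the helper name utf8Length are made up for the example; in UTF-8 the letter é takes two bytes, so the string below has four characters but five bytes.

```java
import java.nio.charset.StandardCharsets;

final class BytesVersusChars {
    // Number of bytes the string occupies on the wire when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String name = "caf\u00e9"; // "café"
        System.out.println(name.length() + " chars, " + utf8Length(name) + " bytes");
    }
}
```

A size given by the protocol therefore has to be counted on the raw InputStream, and the bytes decoded into a String only afterwards.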
If that doesn't work either, you should get hold of the source of the working C program you mentioned. Maybe it's possible to find out the difference between that implementation and yours.
Regards, Lothar
Trouble@Mill - 02 Feb 2006 22:07 GMT In response to Eric and Lothar:
I'm not going to dispute what you are saying, but, in my defense, I'd like to share the code I am running, in C, that talks to the same server I am trying to code the Java against. You will be able to see from this, that it *IS* able to recognize when a reply is complete without the server closing the socket. (Sorry about the length, of the post).
Here's the (trimmed) code that initiates the request, and waits for the reply. I removed the processing that might cause other requests to be sent for clarity, and added in some timings, so there can be no dispute about when the reply was processed:
sendString(tcpSocket, sendBuff);
if (debug > 0) {
    printTime();
    printf("About to receive\n");
}
do {
    memset(recvBuff, 0, sizeof(recvBuff));

    recvLength = recvString(tcpSocket, recvBuff, sizeof(recvBuff));
    if (debug > 0) {
        printTime();
        printf("Got back: [%s]\n", recvBuff);
    }
    if (debug > 1) {
        printTime();
        printf("recv %d\n", recvLength);
        printHex(recvBuff, recvLength);
    }

} while (recvLength > 0);

closesocket(tcpSocket);
OK, now here's the code for recvString():
int recvString(SOCKET outSock, char *buf, int bufLen) {

    int bytesReceived = 0, totalReceived = 0, bufOffset = 0, numFDS = 0, returnFD = 0;

    char recvBuf [128];

    fd_set fd = {0};
    struct timeval tv = {10, 0};

    FD_ZERO(&fd);

    FD_SET(outSock, &fd);
#if defined (__linux__)
    if (outSock >= numFDS) {
        numFDS = outSock + 1;
    }
#endif
    returnFD = select(numFDS, &fd, NULL, NULL, &tv);
    switch (returnFD) {
        case 0:
            printTime();
            printf("Timed out waiting for reply\n");
            break;
        default:
            do {
                bytesReceived = recvChunk(outSock, recvBuf, sizeof(recvBuf));
                printTime();
                printf("bytesReceived %d\n", bytesReceived);
                if (bytesReceived < 0) {
                    totalReceived = bytesReceived;
                    break;
                }
                if (bytesReceived > 0) {
                    recvBuf [bytesReceived] = '\0';
                    strcpy(buf + bufOffset, recvBuf);
                    bufOffset += bytesReceived;
                    totalReceived += bytesReceived;
                }

            } while (bytesReceived > 0);
    }

    return totalReceived;
}
And finally, here is the resultant output from the EXACT code shown above:
Feb 02 13:34:28.597 Sending data [<directory command="Directory"/> ] to server Data: 00000000: 3C64 6972 6563 746F 7279 2063 6F6D 6D61 '<directory comma' 00000010: 6E64 3D22 4469 7265 6374 6F72 7922 2F3E 'nd="Directory"/>' 00000020: 0A '. ' Feb 02 13:34:28.607 About to receive Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 128 Feb 02 13:34:28.617 bytesReceived 90 Feb 02 13:34:28.617 bytesReceived 0 Feb 02 13:34:28.617 Got back: [<directory><current name="/home/eddie"><file name ="." type="dir" /><file name=".." type="dir" /><file name=".qt" type="dir" /><fi le name=".kde" type="dir" /><file name=".ssh" type="dir" /><file name=".vnc" typ e="dir" /><file name="Mail" type="dir" /><file name="Test.m3u" type="file" /><fi le name=".arkc" type="dir" /><file name=".java" type="dir" /><file name=".mcop" type="dir" /><file name=".xine" type="dir" /><file name=".xmms" type="dir" /><fi le name=".firefox" type="dir" /><file name="cd3oFake" type="file" /><file name=" .nvidia-settings-rc" type="file" /><file name=".config" type="dir" /><file name= ".arkeiasb-gui" type="dir" /><file name=".mozilla" type="dir" /><file name=".mco prc" type="file" /><file name="cd3oServer.c.multibuffer" type="file" /><file nam e=".ICEauthority" type="file" /><file name="log.txt" type="file" /><file name=". 
viminfo" type="file" /><file name="Desktop" type="dir" /><file name="compserver" type="file" /><file name="compfake" type="file" /><file name="cd3oFake.c" type= "file" /><file name=".DCOPserver_The-Tardis_:0" type="file" /><file name=".DCOPs erver_The-Tardis__0" type="file" /><file name=".bash_history" type="file" /><fil e name="cd3oServer.h.multibuffer" type="file" /><file name=".thumbnails" type="d ir" /><file name=".Xauthority" type="file" /><file name="cd3oRemote" type="file" /><file name="cd3oServer" type="file" /><file name=".gnupg" type="dir" /><file name=".gxine" type="dir" /><file name=".kderc" type="file" /><file name=".local" type="dir" /><file name=".vimrc" type="file" /><file name="Streams" type="dir" /><file name="lsmod.txt" type="file" /><file name="cd3oServer.c" type="file" />< file name="cd3oServer.h" type="file" /><file name="swt-3.1.1-gtk-linux-x86.zip" type="file" /><file name="cd3oRemote.jar" type="file" /><file name=".fullcircle" type="dir" /><file name=".mailcap" type="file" /></current></directory> ] Feb 02 13:34:28.617 recv 1882 Data: 00000000: 3C64 6972 6563 746F 7279 3E3C 6375 7272 '<directory><curr' 00000010: 656E 7420 6E61 6D65 3D22 2F68 6F6D 652F 'ent name="/home/' 00000020: 6564 6469 6522 3E3C 6669 6C65 206E 616D 'eddie"><file nam' 00000030: 653D 222E 2220 7479 7065 3D22 6469 7222 'e="." type="dir"' 00000040: 202F 3E3C 6669 6C65 206E 616D 653D 222E ' /><file name=".' 00000050: 2E22 2074 7970 653D 2264 6972 2220 2F3E '." 
type="dir" />' 00000060: 3C66 696C 6520 6E61 6D65 3D22 2E71 7422 '<file name=".qt"' 00000070: 2074 7970 653D 2264 6972 2220 2F3E 3C66 ' type="dir" /><f' 00000080: 696C 6520 6E61 6D65 3D22 2E6B 6465 2220 'ile name=".kde" ' 00000090: 7479 7065 3D22 6469 7222 202F 3E3C 6669 'type="dir" /><fi' 000000A0: 6C65 206E 616D 653D 222E 7373 6822 2074 'le name=".ssh" t' 000000B0: 7970 653D 2264 6972 2220 2F3E 3C66 696C 'ype="dir" /><fil' 000000C0: 6520 6E61 6D65 3D22 2E76 6E63 2220 7479 'e name=".vnc" ty' 000000D0: 7065 3D22 6469 7222 202F 3E3C 6669 6C65 'pe="dir" /><file' 000000E0: 206E 616D 653D 224D 6169 6C22 2074 7970 ' name="Mail" typ' 000000F0: 653D 2264 6972 2220 2F3E 3C66 696C 6520 'e="dir" /><file ' 00000100: 6E61 6D65 3D22 5465 7374 2E6D 3375 2220 'name="Test.m3u" ' 00000110: 7479 7065 3D22 6669 6C65 2220 2F3E 3C66 'type="file" /><f' 00000120: 696C 6520 6E61 6D65 3D22 2E61 726B 6322 'ile name=".arkc"' 00000130: 2074 7970 653D 2264 6972 2220 2F3E 3C66 ' type="dir" /><f' 00000140: 696C 6520 6E61 6D65 3D22 2E6A 6176 6122 'ile name=".java"' 00000150: 2074 7970 653D 2264 6972 2220 2F3E 3C66 ' type="dir" /><f' 00000160: 696C 6520 6E61 6D65 3D22 2E6D 636F 7022 'ile name=".mcop"' 00000170: 2074 7970 653D 2264 6972 2220 2F3E 3C66 ' type="dir" /><f' 00000180: 696C 6520 6E61 6D65 3D22 2E78 696E 6522 'ile name=".xine"' 00000190: 2074 7970 653D 2264 6972 2220 2F3E 3C66 ' type="dir" /><f' 000001A0: 696C 6520 6E61 6D65 3D22 2E78 6D6D 7322 'ile name=".xmms"' 000001B0: 2074 7970 653D 2264 6972 2220 2F3E 3C66 ' type="dir" /><f' 000001C0: 696C 6520 6E61 6D65 3D22 2E66 6972 6566 'ile name=".firef' 000001D0: 6F78 2220 7479 7065 3D22 6469 7222 202F 'ox" type="dir" /' 000001E0: 3E3C 6669 6C65 206E 616D 653D 2263 6433 '><file name="cd3' 000001F0: 6F46 616B 6522 2074 7970 653D 2266 696C 'oFake" type="fil' 00000200: 6522 202F 3E3C 6669 6C65 206E 616D 653D 'e" /><file name=' 00000210: 222E 6E76 6964 6961 2D73 6574 7469 6E67 '".nvidia-setting' 00000220: 732D 7263 2220 7479 7065 3D22 6669 6C65 
's-rc" type="file' 00000230: 2220 2F3E 3C66 696C 6520 6E61 6D65 3D22 '" /><file name="' 00000240: 2E63 6F6E 6669 6722 2074 7970 653D 2264 '.config" type="d' 00000250: 6972 2220 2F3E 3C66 696C 6520 6E61 6D65 'ir" /><file name' 00000260: 3D22 2E61 726B 6569 6173 622D 6775 6922 '=".arkeiasb-gui"' 00000270: 2074 7970 653D 2264 6972 2220 2F3E 3C66 ' type="dir" /><f' 00000280: 696C 6520 6E61 6D65 3D22 2E6D 6F7A 696C 'ile name=".mozil' 00000290: 6C61 2220 7479 7065 3D22 6469 7222 202F 'la" type="dir" /' 000002A0: 3E3C 6669 6C65 206E 616D 653D 222E 6D63 '><file name=".mc' 000002B0: 6F70 7263 2220 7479 7065 3D22 6669 6C65 'oprc" type="file' 000002C0: 2220 2F3E 3C66 696C 6520 6E61 6D65 3D22 '" /><file name="' 000002D0: 6364 336F 5365 7276 6572 2E63 2E6D 756C 'cd3oServer.c.mul' 000002E0: 7469 6275 6666 6572 2220 7479 7065 3D22 'tibuffer" type="' 000002F0: 6669 6C65 2220 2F3E 3C66 696C 6520 6E61 'file" /><file na' 00000300: 6D65 3D22 2E49 4345 6175 7468 6F72 6974 'me=".ICEauthorit' 00000310: 7922 2074 7970 653D 2266 696C 6522 202F 'y" type="file" /' 00000320: 3E3C 6669 6C65 206E 616D 653D 226C 6F67 '><file name="log' 00000330: 2E74 7874 2220 7479 7065 3D22 6669 6C65 '.txt" type="file' 00000340: 2220 2F3E 3C66 696C 6520 6E61 6D65 3D22 '" /><file name="' 00000350: 2E76 696D 696E 666F 2220 7479 7065 3D22 '.viminfo" type="' 00000360: 6669 6C65 2220 2F3E 3C66 696C 6520 6E61 'file" /><file na' 00000370: 6D65 3D22 4465 736B 746F 7022 2074 7970 'me="Desktop" typ' 00000380: 653D 2264 6972 2220 2F3E 3C66 696C 6520 'e="dir" /><file ' 00000390: 6E61 6D65 3D22 636F 6D70 7365 7276 6572 'name="compserver' 000003A0: 2220 7479 7065 3D22 6669 6C65 2220 2F3E '" type="file" />' 000003B0: 3C66 696C 6520 6E61 6D65 3D22 636F 6D70 '<file name="comp' 000003C0: 6661 6B65 2220 7479 7065 3D22 6669 6C65 'fake" type="file' 000003D0: 2220 2F3E 3C66 696C 6520 6E61 6D65 3D22 '" /><file name="' 000003E0: 6364 336F 4661 6B65 2E63 2220 7479 7065 'cd3oFake.c" type' 000003F0: 3D22 6669 6C65 2220 2F3E 3C66 696C 
6520 '="file" /><file ' 00000400: 6E61 6D65 3D22 2E44 434F 5073 6572 7665 'name=".DCOPserve' 00000410: 725F 5468 652D 5461 7264 6973 5F3A 3022 'r_The-Tardis_:0"' 00000420: 2074 7970 653D 2266 696C 6522 202F 3E3C ' type="file" /><' 00000430: 6669 6C65 206E 616D 653D 222E 4443 4F50 'file name=".DCOP' 00000440: 7365 7276 6572 5F54 6865 2D54 6172 6469 'server_The-Tardi' 00000450: 735F 5F30 2220 7479 7065 3D22 6669 6C65 's__0" type="file' 00000460: 2220 2F3E 3C66 696C 6520 6E61 6D65 3D22 '" /><file name="' 00000470: 2E62 6173 685F 6869 7374 6F72 7922 2074 '.bash_history" t' 00000480: 7970 653D 2266 696C 6522 202F 3E3C 6669 'ype="file" /><fi' 00000490: 6C65 206E 616D 653D 2263 6433 6F53 6572 'le name="cd3oSer' 000004A0: 7665 722E 682E 6D75 6C74 6962 7566 6665 'ver.h.multibuffe' 000004B0: 7222 2074 7970 653D 2266 696C 6522 202F 'r" type="file" /' 000004C0: 3E3C 6669 6C65 206E 616D 653D 222E 7468 '><file name=".th' 000004D0: 756D 626E 6169 6C73 2220 7479 7065 3D22 'umbnails" type="' 000004E0: 6469 7222 202F 3E3C 6669 6C65 206E 616D 'dir" /><file nam' 000004F0: 653D 222E 5861 7574 686F 7269 7479 2220 'e=".Xauthority" ' 00000500: 7479 7065 3D22 6669 6C65 2220 2F3E 3C66 'type="file" /><f' 00000510: 696C 6520 6E61 6D65 3D22 6364 336F 5265 'ile name="cd3oRe' 00000520: 6D6F 7465 2220 7479 7065 3D22 6669 6C65 'mote" type="file' 00000530: 2220 2F3E 3C66 696C 6520 6E61 6D65 3D22 '" /><file name="' 00000540: 6364 336F 5365 7276 6572 2220 7479 7065 'cd3oServer" type' 00000550: 3D22 6669 6C65 2220 2F3E 3C66 696C 6520 '="file" /><file ' 00000560: 6E61 6D65 3D22 2E67 6E75 7067 2220 7479 'name=".gnupg" ty' 00000570: 7065 3D22 6469 7222 202F 3E3C 6669 6C65 'pe="dir" /><file' 00000580: 206E 616D 653D 222E 6778 696E 6522 2074 ' name=".gxine" t' 00000590: 7970 653D 2264 6972 2220 2F3E 3C66 696C 'ype="dir" /><fil' 000005A0: 6520 6E61 6D65 3D22 2E6B 6465 7263 2220 'e name=".kderc" ' 000005B0: 7479 7065 3D22 6669 6C65 2220 2F3E 3C66 'type="file" /><f' 000005C0: 696C 6520 6E61 6D65 3D22 2E6C 
6F63 616C 'ile name=".local' 000005D0: 2220 7479 7065 3D22 6469 7222 202F 3E3C '" type="dir" /><' 000005E0: 6669 6C65 206E 616D 653D 222E 7669 6D72 'file name=".vimr' 000005F0: 6322 2074 7970 653D 2266 696C 6522 202F 'c" type="file" /' 00000600: 3E3C 6669 6C65 206E 616D 653D 2253 7472 '><file name="Str' 00000610: 6561 6D73 2220 7479 7065 3D22 6469 7222 'eams" type="dir"' 00000620: 202F 3E3C 6669 6C65 206E 616D 653D 226C ' /><file name="l' 00000630: 736D 6F64 2E74 7874 2220 7479 7065 3D22 'smod.txt" type="' 00000640: 6669 6C65 2220 2F3E 3C66 696C 6520 6E61 'file" /><file na' 00000650: 6D65 3D22 6364 336F 5365 7276 6572 2E63 'me="cd3oServer.c' 00000660: 2220 7479 7065 3D22 6669 6C65 2220 2F3E '" type="file" />' 00000670: 3C66 696C 6520 6E61 6D65 3D22 6364 336F '<file name="cd3o' 00000680: 5365 7276 6572 2E68 2220 7479 7065 3D22 'Server.h" type="' 00000690: 6669 6C65 2220 2F3E 3C66 696C 6520 6E61 'file" /><file na' 000006A0: 6D65 3D22 7377 742D 332E 312E 312D 6774 'me="swt-3.1.1-gt' 000006B0: 6B2D 6C69 6E75 782D 7838 362E 7A69 7022 'k-linux-x86.zip"' 000006C0: 2074 7970 653D 2266 696C 6522 202F 3E3C ' type="file" /><' 000006D0: 6669 6C65 206E 616D 653D 2263 6433 6F52 'file name="cd3oR' 000006E0: 656D 6F74 652E 6A61 7222 2074 7970 653D 'emote.jar" type=' 000006F0: 2266 696C 6522 202F 3E3C 6669 6C65 206E '"file" /><file n' 00000700: 616D 653D 222E 6675 6C6C 6369 7263 6C65 'ame=".fullcircle' 00000710: 2220 7479 7065 3D22 6469 7222 202F 3E3C '" type="dir" /><' 00000720: 6669 6C65 206E 616D 653D 222E 6D61 696C 'file name=".mail' 00000730: 6361 7022 2074 7970 653D 2266 696C 6522 'cap" type="file"' 00000740: 202F 3E3C 2F63 7572 7265 6E74 3E3C 2F64 ' /></current></d' 00000750: 6972 6563 746F 7279 3E0A 'irectory>. ' Feb 02 13:34:38.642 Timed out waiting for reply Feb 02 13:34:38.642 Got back: [] Feb 02 13:34:38.642 recv 0 Data: 00000000: ' ' TCP Socket closed 1888
From this, you can see that the returned data was not received in a single buffer, and that the recv() DID understand that the data was complete, and posted a 0 length reply to indicate this.
This sequence is what I am trying to replicate in Java.
Cheers, Eddie
Eric Sosman - 02 Feb 2006 22:38 GMT Trouble@Mill wrote On 02/02/06 17:07,:
> In response to Eric and Lothar:
> [...]
[quoted text clipped - 12 lines]
> single buffer, and that the recv() DID understand that the data was complete, and posted a 0 length reply to indicate this.
No; recv() did not understand that the "data was complete." Rather, select() understood that there was a ten-second interval of silence; after select() timed out, recv() wasn't even called again.
Using a timeout to indicate "end of message" is a risky business. All you know is that there's been a period of silence; you do *not* know whether that's because the sender has sent everything, or because a "network storm" has disrupted things temporarily. Heck, it needn't even be network difficulties: Maybe some high-priority process is monopolizing the sender's machine and preventing the sender process from getting any CPU time; two seconds from now things will be back to normal and the sender will resume transmitting ...
The Java you showed 'way back at the start of the thread is even more sensitive to interruptions in the data stream: It read()s a bunch of data and then checks ready() immediately -- no ten-second timeout, nothing. If the read() drains all the data that's arrived so far (and there's more "in flight"), ready() returns false and you conclude that "the message" is complete. To look at it another way, the Java code is unlikely to work unless the entire response has arrived *before* you call read()! Stick a one-microsecond gap between "what's already here" and "what's forthcoming," and blooey: you start trying to parse half a document.
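For comparison, the nearest Java counterpart to the select() timeout in the C code is probably Socket.setSoTimeout(): a blocked read() then throws SocketTimeoutException once nothing has arrived for the given interval. The sketch below treats that silence as the end of the reply, so it inherits exactly the risk described above. The names ReadUntilQuiet, readUntilQuiet and demo, the loopback server and the 500 ms interval are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class ReadUntilQuiet {
    // Collects bytes until EOF, or until no data arrives for quietMillis.
    static byte[] readUntilQuiet(Socket socket, int quietMillis) throws IOException {
        socket.setSoTimeout(quietMillis);
        InputStream in = socket.getInputStream();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (SocketTimeoutException quiet) {
            // Only silence was observed; this is not proof that the reply is complete.
        }
        return out.toByteArray();
    }

    // Local demo: a throwaway server sends `size` bytes and keeps the connection open.
    static int demo(int size, int quietMillis) throws IOException, InterruptedException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    peer.getOutputStream().write(new byte[size]);
                    peer.getOutputStream().flush();
                    peer.getInputStream().read(); // never closes first; waits for the client
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();
            int received;
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                received = readUntilQuiet(client, quietMillis).length;
            }
            serverThread.join();
            return received;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo(51867, 500) + " bytes before the line went quiet");
    }
}
```

A longer interval makes a premature cut-off less likely but delays every reply by that amount, which is the same trade-off the ten-second select() makes in the C code.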
Is there some reason you cannot use the XML markup itself to indicate where the transmission ends? There's going to be a closing bracket; you'll need to do some amount of parsing to identify it, but since you intend to parse the stuff anyhow ...
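A simpler sentinel than the closing tag may also be available. The hex dumps in the C program's output show both the request and the reply ending in 0x0A. If, and this is only an assumption since the server's protocol is not documented in the thread, every reply ends with a newline and never contains one, the newline itself can serve as the end marker. A minimal sketch with hypothetical names (SentinelFraming, readUntil):

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class SentinelFraming {
    // Reads bytes up to, but not including, the sentinel byte.
    static byte[] readUntil(InputStream in, int sentinel) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == sentinel) {
                return out.toByteArray();
            }
            out.write(b);
        }
        throw new EOFException("stream ended before the sentinel arrived");
    }

    public static void main(String[] args) throws IOException {
        // Two replies back to back; a byte array stands in for the socket stream.
        byte[] wire = "<directory/>\n<directory/>\n".getBytes(StandardCharsets.UTF_8);
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(wire));
        System.out.println(new String(readUntil(in, '\n'), StandardCharsets.UTF_8));
        System.out.println(new String(readUntil(in, '\n'), StandardCharsets.UTF_8));
    }
}
```

Wrapping the socket's stream in a BufferedInputStream keeps the byte-at-a-time read() calls cheap. If the assumption about the newline turns out to be wrong, the closing tag remains the safer marker.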
Trouble@Mill - 02 Feb 2006 23:14 GMT
> Trouble@Mill wrote On 02/02/06 17:07,:
>> In response to Eric and Lothar:
[quoted text clipped - 18 lines]
> a ten-second interval of silence; after select() timed out, recv() wasn't even called again.
Sorry Eric, you missed some pieces in your reply:
>> Feb 02 13:34:28.617 bytesReceived 0
>> Feb 02 13:34:28.617 Got back: [<directory><current
[quoted text clipped - 5 lines]
>> 00000750: 6972 6563 746F 7279 3E0A 'irectory>. '
>> Feb 02 13:34:38.642 Timed out waiting for reply
From that you can see that the data was received and printed BEFORE the select() was re-entered, not after the select() timed out.
Cheers, Eddie
Rogan Dawes - 03 Feb 2006 13:00 GMT
>> Trouble@Mill wrote On 02/02/06 17:07,:
>>> In response to Eric and Lothar:
[quoted text clipped - 35 lines]
> Cheers,
> Eddie
It seems to me that recv returned immediately when there were 0 bytes available for reading, since you do a non-blocking read. You are just lucky that this occurs at the right time in your C code, as opposed to your Java code. Your assumption that a 0 byte read indicates the end of the "message" is invalid. As mentioned numerous times already, it simply indicates that there was nothing ready at that particular time.
I'd suggest doing something like:
// write your message to the server
OutputStream os = socket.getOutputStream();
os.write(message.getBytes());
os.flush();

InputStream is = socket.getInputStream();
// read the response from the server
byte[] buff = new byte[1024];
int got;
boolean complete = false;
ByteArrayOutputStream baos = new ByteArrayOutputStream();
// test "complete" first, so that we don't block in read() again
// after the end of the response has already been seen
while (!complete && (got = is.read(buff)) >= 0) {
    baos.write(buff, 0, got);
    // this is VERY inefficient, a better way would be to maintain
    // some kind of state machine, or possibly use SAX to parse
    // the XML as it comes in, so you know when you end the response.
    // trim() because the reply in the hex dump ends with a newline
    if (new String(baos.toByteArray()).trim().endsWith("</directory>"))
        complete = true;
}
// process the message
Regards,
Rogan
Trouble@Mill - 02 Feb 2006 22:44 GMT
Ah Cr@p. I forgot to post the other routine, recvChunk():
int recvChunk(SOCKET outSock, char *buf, int bufLen) {

    int bytesReceived = 0, numFDS = 0, returnFD = 0;

    fd_set fd = {0};
    struct timeval tv = {0, 0};

    FD_ZERO(&fd);

    FD_SET(outSock, &fd);
#if defined (__linux__)
    if (outSock >= numFDS) {
        numFDS = outSock + 1;
    }
#endif

    returnFD = select(numFDS, &fd, NULL, NULL, &tv);
    switch (returnFD) {
        case 0: // NOTE ** Drop through intended **
        case -1:
            ipError(returnFD, "receive select()");
            break;
        default:
            bytesReceived = recv(outSock, buf, bufLen, 0);
#if defined(_WIN32)
            if (WSAGetLastError() == WSAECONNRESET) {
#else
            if (errno == ECONNRESET) {
                errno = 0;
#endif
                printf("Client force closed connection\n");
                bytesReceived = 0;
            }
    }
    return bytesReceived;
}
Cheers, Eddie
Nigel Wade - 03 Feb 2006 11:41 GMT > From this, you can see that the returned data was not received in a > single buffer, and that the recv() DID understand that the data was > complete, and posted a 0 length reply to indicate this. If recv() returned 0, it implies that the underlying socket was closed by the other end.
> This sequence is what I am trying to replicate in Java. Java can detect the socket being closed.
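For illustration, this is roughly how end-of-stream shows up in Java: `InputStream.read` returns -1 once the peer has closed the connection (or shut down its output) and everything it sent has been consumed. The class and method names here are hypothetical, and the sketch assumes the server really does close the connection after each reply; if the socket stays open, this loop simply blocks waiting for more data.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadUntilEof {
    /** Collects every byte from the stream until read() reports EOF (-1). */
    static byte[] readAll(InputStream in) throws IOException {
        byte[] buf = new byte[1024];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n;
        // read() blocks until at least one byte is available, so a short
        // pause in the data does not end the loop; only EOF does.
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

With a real connection this would be called as `ReadUntilEof.readAll(socket.getInputStream())`; a `ByteArrayInputStream` can stand in for the socket when trying it out.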
 Signature Nigel Wade, System Administrator, Space Plasma Physics Group, University of Leicester, Leicester, LE1 7RH, UK E-mail : nmw@ion.le.ac.uk Phone : +44 (0)116 2523548, Fax : +44 (0)116 2523555
Thomas Weidenfeller - 03 Feb 2006 09:29 GMT > Quite the opposite. I'd say that a protocol that relies on "data > content" to know when a transmission is complete is the poorly [quoted text clipped - 4 lines] > completion, so the code can then process what it's just recieved. Why > can't Java do that. There are a number of misconceptions in your posting.
* Sockets is not a protocol. Sockets are an API. Sockets don't add a protocol layer.
* TCP is a stream protocol. It doesn't handle or do messages. There are simply no messages on the TCP layer. It doesn't matter how often people claim that there are messages. There aren't. There are segments of the stream, but you typically have no control over the segments. And there are probably fragments on the IP layer, of which you have even less control from the application.
* recv() on top of a TCP stream does not do what you claim it does. If you use recv() on a SOCK_STREAM it simply ignores message boundaries, because there are of course no messages in a stream. And SOCK_STREAM is the socket type which one of course uses for TCP. recv() simply boils down to a read() in that case, and it gives you just what is currently available of the TCP stream in the receiver buffer. If you program an application which uses TCP and recv() to receive TCP "messages", then your application is fundamentally flawed.
* Many protocols in the TCP/IP suite (and in many other protocol suites) rely on some kind of length information. Requiring that a message protocol which you put on top of TCP in some way contains some message length information or some EOM indication is the norm, not bad design.
* It is not wrong to have the length in what you call "data contents". It is very typical for a PDU to contain a protocol-layer specific header (with a length). So common that in protocol layer terms people simply use the formula
PDU(N) = SDU(N - 1) = header(N) + PDU(N + 1) + footer(N) = header(N) + SDU(N) + footer(N)
to explain how PDUs are made up ('N' being a particular protocol layer). Which is actually the essence of layered protocols.
To repeat what Gordon wrote: any message protocol on top of TCP which has not been designed so that it is possible to recognize individual messages as part of this protocol is poorly designed, and bound to fail.
/Thomas
 Signature The comp.lang.java.gui FAQ: ftp://ftp.cs.uu.nl/pub/NEWS.ANSWERS/computer-lang/java/gui/faq http://www.uni-giessen.de/faq/archiv/computer-lang.java.gui.faq/
Roedy Green - 03 Feb 2006 12:06 GMT On Fri, 03 Feb 2006 10:29:11 +0100, Thomas Weidenfeller <nobody@ericsson.invalid> wrote, quoted or indirectly quoted someone who said :
> It doesn't matter how often people >claim that there are messages. Underneath, of course, there are packets, but TCP/IP presents as two continuous streams of characters, one in each direction, with no hint as to where the packet boundaries were.
If you want messages, you need to invent your own way of doing them in a continuous stream. The traditional way is to start each header with two binary shorts:
1. message type 2. message length
You then can use the message type as in a case switch, an enum values()[i] or a HashMap lookup to get the code to deal with the message. You can hop over to the next message by reading length bytes.
If you have fewer than 255 types of message, you can shrink the type to a ubyte, and if your messages are bigger than 32K, you can expand the length to a ushort or an int. Leave yourself some breathing room.
You can also do it by sending serialised objects, which have their own internal length tracking.
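A header of this kind might be sketched in Java as below. Everything here is an illustrative assumption rather than anything from the OP's actual protocol: the class `FramedMessages`, the constant `TYPE_LIST_DIR_REPLY`, and the choice of an unsigned 16-bit length (which caps a payload at 65,535 bytes, just enough for the 51,867-byte reply mentioned at the start of the thread). Both ends of the connection would have to agree on the same layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class FramedMessages {
    static final int TYPE_LIST_DIR_REPLY = 1; // hypothetical message type

    static final class Message {
        final int type;
        final byte[] payload;
        Message(int type, byte[] payload) { this.type = type; this.payload = payload; }
    }

    /** Writes a 2-byte type, a 2-byte unsigned length, then the payload. */
    static void writeMessage(DataOutputStream out, int type, byte[] payload) throws IOException {
        if (payload.length > 0xFFFF) {
            throw new IllegalArgumentException("payload too long for a 16-bit length");
        }
        out.writeShort(type);
        out.writeShort(payload.length);
        out.write(payload);
        out.flush();
    }

    /** Reads exactly one message, however the bytes happen to arrive. */
    static Message readMessage(DataInputStream in) throws IOException {
        int type = in.readUnsignedShort();
        int length = in.readUnsignedShort();
        byte[] payload = new byte[length];
        in.readFully(payload); // blocks until all 'length' bytes are in
        return new Message(type, payload);
    }

    public static void main(String[] args) throws IOException {
        // An in-memory buffer stands in for the socket streams.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        writeMessage(out, TYPE_LIST_DIR_REPLY,
                "<directory></directory>".getBytes(StandardCharsets.UTF_8));

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        Message m = readMessage(in);
        System.out.println(m.type + ": " + new String(m.payload, StandardCharsets.UTF_8));
    }
}
```

Because the reader knows the length before it reads the payload, it never has to guess from timing or from `ready()` whether the message is complete, and the next message starts exactly where this one ends.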
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Thomas Weidenfeller - 03 Feb 2006 13:00 GMT > On Fri, 03 Feb 2006 10:29:11 +0100, Thomas Weidenfeller > <nobody@ericsson.invalid> wrote, quoted or indirectly quoted someone [quoted text clipped - 4 lines] > > underneath of course there are packets, There are packets, fragments, segments. And none of this makes up a message. You can try as hard as you want, but you can't squeeze messages out of the TCP protocol as such. You need a higher layer protocol.
> If you want messages, you need to invent your own way of doing them in > a continuous stream. This is what we have been trying to tell the OP for the Nth time. Invent your own way or use an existing higher layer protocol which already does what he wants.
But the OP keeps claiming that his TCP is special, and that his Unix has a very special recv() system call, which does things the way he wants it, and not like they are. And he keeps telling us that Java is broken, because it does not work with his imaginary TCP.
/Thomas
Roedy Green - 03 Feb 2006 15:47 GMT On Fri, 03 Feb 2006 14:00:12 +0100, Thomas Weidenfeller <nobody@ericsson.invalid> wrote, quoted or indirectly quoted someone who said :
>But the OP keeps claiming that his TCP is special, and that his Unix has >a very special recv() system call, which does things the way he wants >it, and not like they are. And he keeps telling us that Java is broken, >because it does not work with his imaginary TCP. He is being seduced by the timing. The stream is coming at him in bursts. If the packets were flowing slowly, with say 0.01 second between them, it would LOOK as if you had access to the individual packet boundaries in the stream. But sooner or later the packets will come in a burst and you won't see those boundaries. Just think about what happens on retransmission. The stream gets held up by the oldest unsuccessful packet. When it gets through, BRRMM: it can charge ahead, because the windowing allows it to be several packets ahead. To the OP, it would look as if the underlying packets had suddenly got huge.
This is not something you can rely on to mean anything.
Nigel Wade - 03 Feb 2006 11:33 GMT >>Your client read() will indicate EOF when the server closes() or does >>shutdownOutput() to the connection. If that isn't happening, it seems [quoted text clipped - 16 lines] > content" to know when a transmission is complete is the poorly > designed one. Really? How would you do otherwise (without re-designing TCP/IP)?
> The socket protocol, for TCP, *DOES* know when the > transmission is complete, and *DOES* signal that to the client. How? The only "signal" is EOF, when the socket is closed. There are no other "signals".
> It > works perfectly in C using recv(). You just got lucky.
> That continues to receive data, > until the complete message has been accepted, and then it signals > completion, so the code can then process what it's just recieved. How does it signal it's complete? I can see nothing in the documentation for recv() that says it will signal anything to the caller. If there is more data available than its buffer size it may discard it, if there is less it will usually return what is available. How does that fit your description of continuing to receive data until a complete message has been received and then signalling completion?
From the man page for recv(): "If a message is too long to fit in the supplied buffer, excess bytes may be discarded depending on the type of socket the message is received from"
"The receive calls normally return any data available, up to the requested amount, rather than waiting for receipt of the full amount requested."
> Why > can't Java do that. Because, as has already been stated, it simply is not possible with TCP/IP. If you know how much you need to read you can use the readFully() method of DataInputStream, but that pre-supposes that you know in advance exactly how much you are expecting to receive. If your protocol doesn't specify that, you can't use it.
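The difference between a plain `read()` and `readFully()` can be seen without a network at all. In this sketch `TrickleInputStream` is a made-up helper that hands out at most a fixed number of bytes per call, imitating data that arrives in pieces; the chunk size of 1460 and the class names are arbitrary choices for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadFullyDemo {
    /** Returns at most maxChunk bytes per read, like a socket with partial data. */
    static class TrickleInputStream extends FilterInputStream {
        private final int maxChunk;
        TrickleInputStream(InputStream in, int maxChunk) {
            super(in);
            this.maxChunk = maxChunk;
        }
        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return super.read(b, off, Math.min(len, maxChunk));
        }
    }

    /** One read() call: returns however many bytes happened to be delivered. */
    static int singleRead(byte[] data, int chunk) throws IOException {
        InputStream in = new TrickleInputStream(new ByteArrayInputStream(data), chunk);
        return in.read(new byte[data.length]);
    }

    /** readFully(): keeps reading until the whole array has been filled. */
    static byte[] fullRead(byte[] data, int chunk) throws IOException {
        DataInputStream in = new DataInputStream(
                new TrickleInputStream(new ByteArrayInputStream(data), chunk));
        byte[] all = new byte[data.length];
        in.readFully(all); // throws EOFException if the stream ends early
        return all;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[5000];
        System.out.println("single read() returned " + singleRead(data, 1460) + " bytes");
        System.out.println("readFully() filled " + fullRead(data, 1460).length + " bytes");
    }
}
```

The catch is the one already stated: `readFully()` needs the expected length up front, so it only helps when the protocol tells the receiver how many bytes are coming.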
Chris Uppal - 03 Feb 2006 11:53 GMT > From the man page for recv(): > "If a message is too long to fit in the supplied > buffer, excess bytes may be discarded depending on the type of socket the > message is received from" I have to presume that the 'excess bytes may be discarded' warning only applies to "unreliable" protocols like UDP; if it did that with TCP then -- to put it mildly -- things would get tricky ;-)
-- chris
Trouble@Mill - 03 Feb 2006 17:17 GMT OK, OK, You've all beaten me into submission. I'll crawl gracefully back into my corner and promise to do more research next time.
But, I'd just like to add a couple of random muses on some of the replies. (Should I be pulling on the nomex undies at this point <G>)
>>It seems to me that recv returned immediately when there were 0 bytes >>available for reading, since you do a non-blocking read. You are just >>lucky that this occurs at the right time in your C code Well, it was select() that was the gatekeeper, so it thought there was "something" to read.
Hey, I've been "lucky" on every single recv() that this client application has done for the past 2 years. Maybe it's time to go buy Lotto tickets. Or have I used up *all* my luck here.
>>If recv() returned 0, it implies that the underlying socket was closed >>by the other end. Except it wasn't, otherwise the next select() would have failed, which it didn't. Plus, there was more code than was shown in the snippet, which continues to use the socket.
>>You just got lucky. See above.
>>He is being seduced by the timing. Awww, is that the only seduction being offered. <Wide Grin>
But seriously. Thank you all for the various lessons learned. Maybe I should patent this:
>>very special recv() system call because, as pointed out, it's been working fine for over 2 years.
Cheers, Eddie
Trouble@Mill - 02 Feb 2006 17:56 GMT >Your call to ready() is not a valid test for EOF. It may return false >many times before you've actually reached the end of the input. In light of what else has been discussed, let's come back to this.
Can you explain why the first request I make ALWAYS completes successfully, but no others do? If it were as simple as ready() returning false before the end-of-input, I would expect that it would happen on every use, or for occasional ones (other than the 1st) to complete, but it doesn't. Plus, the point at which it fails isn't as random as I thought. The test I just ran gave:
1st - Worked 2nd - Failed at 40321 3rd and every other attempt - Failed at 15121.
Cheers, Eddie