Java Forum / General / November 2005
Problem communicating with socket application
Pep - 26 Oct 2005 23:37 GMT
I am experiencing problems when trying to communicate with a TCP socket based application that does not always append a <CR> at the end of the data, and so I cannot use readLine on a BufferedReader.
I have tried using a simple read in to a char array but have found that where the application has sent 4 records using 4 separate socket writes, my read operation has resulted in them all being read in to one array and I have no means of determining where one record ends and the next begins, unless there are <CR> appended to each record, which as I stated is not always the case :(
Using a normal unix socket read operation in C++ I would not have this problem as each read operation would result in one record.
How can I get java sockets to operate in a similar manner as unix socket reads so that one record is obtained in each read operation, regardless of whether it is appended with a <CR> or not?
TIA, Pep.
Gordon Beaton - 27 Oct 2005 08:07 GMT
> I am experiencing problems when trying to communicate with a TCP
> socket based application that does not always append a <CR> at the
[quoted text clipped - 9 lines]
> Using a normal unix socket read operation in C++ I would not have
> this problem as each read operation would result in one record.
No, you've just been lucky so far. Your C++ application is broken too, but has been working "by accident". A subtle difference in timing may make the difference.
> How can I get java sockets to operate in a similar manner as unix
> socket reads so that one record is obtained in each read operation,
> regardless of whether it is appended with a <CR> or not?
TCP is a lowly byte stream, and it does not know anything about record boundaries, nor does it make any attempts to preserve them. You may find that multiple records are occasionally combined, and single records are sometimes broken into two or more parts.
If you need delimited records you need to manage them yourself. The easiest way is to insert delimiters (special characters like CR or anything else that can't occur within a record) between the records as you send them, so the recipient can determine where one record in the stream ends and the next one begins.
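A minimal sketch of the delimiter approach on the sending side, assuming CR (0x0D) as the delimiter and US-ASCII records. The class name DelimitedWriter and the sample records are hypothetical, and a ByteArrayOutputStream stands in for the socket's output stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: frames each record with a single CR delimiter.
final class DelimitedWriter {
    static final int DELIMITER = '\r'; // assumed delimiter; must never occur inside a record

    // Writes one record followed by the delimiter, then flushes.
    static void writeRecord(OutputStream out, String record) throws IOException {
        if (record.indexOf(DELIMITER) >= 0) {
            throw new IllegalArgumentException("record contains the delimiter");
        }
        out.write(record.getBytes(StandardCharsets.US_ASCII));
        out.write(DELIMITER);
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for socket.getOutputStream().
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeRecord(out, "CCOK Q458000");
        writeRecord(out, "CCOK E458000");
        System.out.println(out.size() + " bytes written");
    }
}
```

Rejecting records that already contain the delimiter keeps the framing unambiguous; an escaping scheme would be the alternative if CR can legitimately occur in the data.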
Another way is to precede each record with a short header containing the length of the record.
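The length-header approach might look like the following sketch. It assumes a 2-byte big-endian length followed by a US-ASCII payload; that layout and the class name LengthPrefixed are illustrative choices, and a C++ peer would have to write the header in the same byte order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical framing: 2-byte big-endian length header, then the payload.
final class LengthPrefixed {
    static void writeRecord(DataOutputStream out, String record) throws IOException {
        byte[] payload = record.getBytes(StandardCharsets.US_ASCII);
        if (payload.length > 0xFFFF) {
            throw new IllegalArgumentException("record too long for a 2-byte header");
        }
        out.writeShort(payload.length); // header
        out.write(payload);             // body
        out.flush();
    }

    static String readRecord(DataInputStream in) throws IOException {
        int length = in.readUnsignedShort(); // EOFException at end of stream
        byte[] payload = new byte[length];
        in.readFully(payload);               // blocks until all 'length' bytes have arrived
        return new String(payload, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        DataOutputStream out = new DataOutputStream(wire);
        writeRecord(out, "first");
        writeRecord(out, "second record");

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readRecord(in));
        System.out.println(readRecord(in));
    }
}
```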
If you already know the length of each record in advance, simply read the correct number of bytes from the stream each time.
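For fixed-length records, DataInputStream.readFully keeps reading until the requested number of bytes has arrived, however TCP happened to split them. A sketch, assuming a hypothetical record length of 8 bytes and US-ASCII data:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for records whose length is known in advance.
final class FixedLengthReader {
    // Reads exactly 'length' bytes; throws EOFException if the stream ends early.
    static String readRecord(InputStream in, int length) throws IOException {
        byte[] buf = new byte[length];
        // DataInputStream does no buffering of its own, so wrapping per call loses no data.
        new DataInputStream(in).readFully(buf);
        return new String(buf, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        // Two 8-byte records back to back, as they might sit in the receive buffer.
        InputStream in = new ByteArrayInputStream(
                "RECORD01RECORD02".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readRecord(in, 8));
        System.out.println(readRecord(in, 8));
    }
}
```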
Finally, maybe there is some other mechanism you can use in your client to recognize the end of a record.
/gordon
 Signature [ do not email me copies of your followups ] g o r d o n + n e w s @ b a l d e r 1 3 . s e
Roedy Green - 27 Oct 2005 08:31 GMT
>Another way is to precede each record with a short header containing
>the length of the record.
[quoted text clipped - 4 lines]
>Finally, maybe there is some other mechanism you can use in your
>client to recognize the end of a record.
And yet another way is an object stream (ObjectOutputStream/ObjectInputStream), which deals with breaking the stream up into objects for you. That won't work, though, when one end is C++.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Pep - 27 Oct 2005 09:32 GMT
>>Another way is to precede each record with a short header containing
>>the length of the record.
[quoted text clipped - 8 lines]
> stream up into objects for you. That won't work though when one end is
> C++.
Yep, unfortunately this data is being provided by some windows based c++ program.
Cheers, Pep.
Pep - 27 Oct 2005 09:31 GMT
>> I am experiencing problems when trying to communicate with a TCP
>> socket based application that does not always append a <CR> at the
[quoted text clipped - 13 lines]
> but has been working "by accident". A subtle difference in timing may
> make the difference.
I'm surprised I've been "lucky so far". I'm talking about an application that services transactions from multiple clients under extreme load, and it has never missed a record yet. The records are passed using a normal socket write operation, so I know how the data is provided.
Still I won't argue with someone that knows better than me and by that I am actually trying to be sincere not rude :)
>> How can I get java sockets to operate in a similar manner as unix
>> socket reads so that one record is obtained in each read operation,
[quoted text clipped - 21 lines]
>
> /gordon
Unfortunately I do not have control over the data being sent to me, and I have now found from an Ethereal analysis that sometimes the records have a <CR> ending and other times they do not.
I'll have to try simply reading a determined number of bytes :(
Cheers, Pep
Pep - 27 Oct 2005 09:34 GMT
>>> I am experiencing problems when trying to communicate with a TCP
>>> socket based application that does not always append a <CR> at the
[quoted text clipped - 56 lines]
> Cheers,
> Pep
Sorry I forgot to add that the length of the record is variable so that without this random <CR> I'm pretty much screwed :(
Gordon Beaton - 27 Oct 2005 09:51 GMT
> Sorry I forgot to add that the length of the record is variable so
> that without this random <CR> I'm pretty much screwed :(
Probably, yes. I thought to add that exact sentiment in my previous reply, but didn't want to come across as rude!
It sounds odd to me that the existence (or lack) of CR is "random".
Have I understood correctly that your C++ clients work as expected, and you are now implementing a Java client to an existing C++ server?
Do the C++ clients not have any special logic to deal with record boundaries? Can you not change the way the server sends messages?
As I mentioned earlier, a small timing change could make a difference.
Basically if there is a short delay between calls to write(), it is often the case that they will be sent separately by TCP. As long as the recipient reads() sufficiently quickly, he will receive them separately as well. This can work ("by accident"), but you really shouldn't rely on this behaviour.
If the sender sends short messages without delay in between, then TCP may send them together. Similarly when the reader is slow, messages will accumulate in his receive buffer, and calls to read() cannot distinguish between them.
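The effect can be handled on the receiving side by buffering whatever each read() returns and cutting records out of the accumulated bytes. The sketch below assumes CR-delimited US-ASCII records; RecordAssembler is a hypothetical name, and the two chunks in main imitate one read that returns a record and a half and another that returns the rest.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical push-style assembler: feed it whatever each read() returned.
final class RecordAssembler {
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    // Returns the records completed by this chunk; a partial tail stays buffered.
    List<String> feed(byte[] chunk, int length) {
        List<String> complete = new ArrayList<>();
        for (int i = 0; i < length; i++) {
            if (chunk[i] == '\r') { // assumed record delimiter
                complete.add(new String(pending.toByteArray(), StandardCharsets.US_ASCII));
                pending.reset();
            } else {
                pending.write(chunk[i]);
            }
        }
        return complete;
    }

    public static void main(String[] args) {
        RecordAssembler assembler = new RecordAssembler();
        byte[] first = "AAA\rBB".getBytes(StandardCharsets.US_ASCII);   // one and a half records
        byte[] second = "B\rCC\r".getBytes(StandardCharsets.US_ASCII);  // the rest, plus one more
        System.out.println(assembler.feed(first, first.length));
        System.out.println(assembler.feed(second, second.length));
    }
}
```

In a real client the chunks would come from a loop around InputStream.read(byte[]), passing the returned byte count as the length argument.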
/gordon
Pep - 27 Oct 2005 11:01 GMT
>> Sorry I forgot to add that the length of the record is variable so
>> that without this random <CR> I'm pretty much screwed :(
[quoted text clipped - 3 lines]
>
> It sounds odd to me that the existence (or lack) of CR is "random".
Same here. I was told in the spec that the records would be capped with a <CR> but they are not always which is something I am arguing with the designer of the server code.
> Have I understood correctly that your C++ clients work as expected,
> and you are now implementing a Java client to an existing C++ server?
NO. The original application which I wrote consisted of a C++ server and client both of which use unix sockets with no EOR delimiter and they work fine.
Now I am having to replace this with a server that is provided and has been written using windows based C++ and I have to write the client. So I have done this using java.
> Do the C++ clients not have any special logic to deal with record
> boundaries? Can you not change the way the server sends messages?
[quoted text clipped - 6 lines]
> separately as well. This can work ("by accident"), but you really
> shouldn't rely on this behaviour.
Accepted and thanks for that knowledge.
> If the sender sends short messages without delay in between, then TCP
> may send them together. Similarly when the reader is slow, messages
> will accumulate in his receive buffer, and calls to read() cannot
> distinguish between them.
Which appears to be my problem here.
> /gordon
Okay this is the code that I am trying to run
===============================================================================
try {
    while ((running.get()) && (parentProxy.br != null) && (parentProxy.br.ready())) {
        try {
            mResponse = "";
            if (parentProxy.br != null) {
                mResponse = parentProxy.br.readLine(); // read the response from the m server
                logDebugToFile("TSreader::run processing [" + mResponse + "]");
                parentProxy.processMResult(mResponse); // send the result back to the client
            }
        } catch(SocketTimeoutException e) {
            // do nothing here
        }
    }
} catch(Throwable e) {
    logFatalToFile("TSreader::run Error (running) " + e.getMessage(), e);
    e.printStackTrace();
}
===============================================================================
and this is the output of the code
===============================================================================
27 Oct 2005 09:34:15 GMT: DEBUG - {Thread-0} {Thread-4} TSreader::run processing [CCOK Q458000:Y:a:XXXX48]
27 Oct 2005 09:34:16 GMT: DEBUG - {Thread-0} {Thread-4} TSreader::run processing []
27 Oct 2005 09:34:16 GMT: FATAL - {Thread-0} {Thread-4} TSreader::run Error (running) String index out of range: 6
java.lang.StringIndexOutOfBoundsException: String index out of range: 6
    at java.lang.String.charAt(Unknown Source)
    at TS.MP.processMResult(MP.java:486)
    at TS.TSreader.run(TSreader.java:81)
    at java.lang.Thread.run(Unknown Source)
27 Oct 2005 09:34:16 GMT: DEBUG - {Thread-0} {Thread-4} TSreader::run processing [CCOK E458000:Y:a:XXXX34]
===============================================================================
as can be seen, it can read the first record and the 3rd record but the second record is coming back empty
Based on this input on the socket (obtained using ethereal)
=============================================================================== CCOK Q458000:Y:a:XXXX48C\x9f`C\xd8@^@6^@^@^@6^@^@^@^@^N\x83\x9c\x91\xb2^@^K\xdb\x95\xc (^H^@E^@^@(V\xbf@^@@^F^@^@\xc0\xa8j\xac^]^H^P\x8f \xbc#\x8d\xf5\x95\xd4\xd3\xd1\x83^G\xfdP^P\xe4 Y^F^@^@C\x9f`C\xdbC^@<^@^@^@<^@^@^@^@^K\xdb\x95\xc (^@^N\x83\x9c\x91\xb2^H^@E^@^@)^O\xf5@^@}^F\x94\xee^]^H^P\x8f\xc0\xa8j\xac#\x8d\xbc\xd1\x83^G\xfd\xf5\x95\xd4\xd3P^X^^^[Z\x91^@^@^M^@^@^@^@^@C\x9f`C\x8en^@\x87^@^@^@\x87^@^@^@^@^N\x83\x9c\x91\xb2^@^K\xdb\x95\xc (^H^@E^@^@yV\xc6@^@@^F^@^@\xc0\xa8j\xac^]^H^P\x8f \xbc#\x8d\xf5\x95\xd4\xd3\xd1\x83^G\xfeP^X\xe4YW^@^@C:MISTER-48/5 :WXYZ :4111111111111111 :0605: 20.00:JTWB801XXXX48 ^MC\x9f`C\xab\x86^@M^@^@^@M^@^@^@^@^K\xdb\x95\xc (^@^N\x83\x9c\x91\xb2^H^@E^@^@?^P\xf5@^@}^F\x93\xd8^]^H^P\x8f\xc0\xa8j\xac#\x8d \xbc\xd1\x83^G\xfe\xf5\x95\xd5$P^X^]\xca\x88g^@^@CCOKR458000:Y:a:XXXX10C\x9f`C8\xb7^K^@\x87^@^@^@\x87^@^@^@^@^N\x83\x9c\x91\xb2^@^K\xdb\x95\xc (^H^@E^@^@yV\xcc@^@@^F^@^@\xc0\xa8j\xac^]^H^P\x8f \xbc#\x8d\xf5\x95\xd5$\xd1\x83^H^UP^X\xe4 YW^@^@C:MISTER-18/5 :WXYZ :4111111111111111 :0605: 20.00:KTWB801XXXX18 ^MC\x9f`C\xba^K^@O^@^@^@O^@^@^@^@^K\xdb\x95\xc (^@^N\x83\x9c\x91\xb2^H^@E^@^@A^R\xf5@^@}^F\x91\xd6^]^H^P\x8f\xc0\xa8j\xac#\x8d \xbc\xd1\x83^H^U\xf5\x95\xd5uP^X^]yE~^@^@^MCCOK E458000:Y:a:XXXX34^MC\x9f`C^K'^M^@6^@^@^@6^@^@^@^@^N\x83\x9c\x91\xb2^@^K\xdb\x95\xc (^H^@E^@^@(V\xd1@^@@^F^@^@\xc0\xa8j\xac^]^H^P\x8f \xbc#\x8d\xf5\x95\xd5u\xd1\x83^H.P^P\xe4Y^F^@^@C\x9f`C\xf3\xdd^M^@M^@^@^@M^@^@^@^@^K\xdb\x95\xc (^@^N\x83\x9c\x91\xb2^H^@E^@^@?^T\xf5@^@}^F\x8f\xd8^]^H^P\x8f\xc0\xa8j\xac#\x8d \xbc\xd1\x83^H.\xf5\x95\xd5uP^X^]y\x82F^@^@ ===============================================================================
and I am struggling to find out why this is happening :(
If you can see where I am going wrong then I would greatly appreciate your advice.
The really annoying thing is that I have written a simulator based on the server's spec of capping the records with a <CR>, and my client can process up to 100,000 records in a 2 hour period. So this is really starting to piss me off!
Cheers, Pep.
Roedy Green - 27 Oct 2005 11:28 GMT
>Same here. I was told in the spec that the records would be capped with a
><CR> but they are not always which is something I am arguing with the
>designer of the server code.
If you are using a BufferedOutputStream, you want to do a flush() after every record or parts of it could stay stuck in the buffer wrapping the socket until you write some more to push it out.
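A small sketch of why the flush() matters, with a ByteArrayOutputStream standing in for the socket so the effect can be observed. FlushDemo and the sample record are hypothetical; the point is only that a short record sits in the buffer until flush() is called.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class FlushDemo {
    // Returns {bytes that reached the wire before flush(), bytes after flush()}.
    static int[] writeThenFlush(String record) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();  // stands in for the socket
        BufferedOutputStream out = new BufferedOutputStream(wire); // default-sized buffer
        out.write(record.getBytes(StandardCharsets.US_ASCII));
        int before = wire.size(); // a short record is still held in the buffer here
        out.flush();
        int after = wire.size();  // now the whole record has been pushed through
        return new int[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = writeThenFlush("CCOK Q458000");
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```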
Pep - 27 Oct 2005 11:42 GMT
>>Same here. I was told in the spec that the records would be capped with a
>><CR> but they are not always which is something I am arguing with the
[quoted text clipped - 4 lines]
> wrapping the socket until you write some more to push it out.
> .
I'm using a print writer
pw = new PrintWriter(microgateSocket.getOutputStream(),true);
The really annoying thing about this is that it seems to be related to the fact that they are not placing a <CR> at the end of the records. My simulator puts a \n at the end of each record and, like I said, my client will then process over 100,000 records without dropping a single one.
Cheers, Pep.
Roedy Green - 27 Oct 2005 12:04 GMT
>pw = new PrintWriter(microgateSocket.getOutputStream(),true);
>
>The really annoying thing about this is that it seems ot be related to the
>fact that they are not placing a <CR> at the end of the records.
That is not the official duty of a PrintWriter. Its println() is supposed to put a platform specific line separator there. If you want a CR specifically, you should do a write( '\r' );
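As a sketch of that suggestion: print() the record, write the CR explicitly, and flush by hand, since PrintWriter's autoflush only applies to println, printf, and format. CrTerminatedWriter is a hypothetical name, US-ASCII is an assumed encoding, and a ByteArrayOutputStream stands in for the socket.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

final class CrTerminatedWriter {
    // Returns the bytes that would go on the wire for one record.
    static byte[] send(String record) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(wire, StandardCharsets.US_ASCII));
        pw.print(record); // print(), not println(): no platform line separator is added
        pw.write('\r');   // explicit CR terminator, identical on every platform
        pw.flush();       // required: autoflush would not fire for print()/write()
        return wire.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = send("CCOK Q458000");
        System.out.println("last byte: 0x" + Integer.toHexString(bytes[bytes.length - 1]));
    }
}
```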
Pep - 27 Oct 2005 12:25 GMT
>>pw = new PrintWriter(microgateSocket.getOutputStream(),true);
>>
[quoted text clipped - 4 lines]
> a platform specific line separator there. If you want a cr
> specifically you should do a write( '\r' );
Yeah, but they are not using a print writer. They have written their application using either Visual C++ or Visual Basic, so god knows how they are writing the data to the socket.
I use the print writer in my java client to send the transactions to their server and I append a <CR> to the data.
I have just done another massive run against their server and they seem to only be appending a <CR> to maybe around 2% of their transaction reply records. Which of course means as it is a variable length record with no delimiter I cannot even handle the protocol myself using a data stream reader.
Cheers, Pep.
Pep - 27 Oct 2005 16:02 GMT
>>pw = new PrintWriter(microgateSocket.getOutputStream(),true);
>>
[quoted text clipped - 4 lines]
> a platform specific line separator there. If you want a cr
> specifically you should do a write( '\r' );
I have now found out, with the use of ethereal at both ends of the socket, that the windows application is definitely sending a 0x0D but it is being transformed in to a 0xDC by the time it reaches my end of the socket.
Similarly my 0x0D 0x0A byte sequence is being converted in to a 0x0D.
Cheers, Pep.
Gordon Beaton - 27 Oct 2005 16:20 GMT
> I have now found out, with the use of ethereal at both ends of the
> socket, that the windows application is definitely sending a 0x0D
> but it is being transformed in to a 0xDC by the time it reaches my
> end of the socket.
I find it extremely hard to believe that the cable or a switch alone would be making such selective changes to the data stream.
Are you absolutely certain that you aren't making parts of this observation in the code itself, where some processing has already taken place? Or that your data doesn't pass through a proxy of some kind?
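One way to rule out the client code itself is to log the raw bytes exactly as read from the socket's InputStream, before any Reader or character decoding touches them. A minimal sketch; HexDump is a hypothetical helper and the sample bytes are made up for illustration.

```java
// Hypothetical logging helper: renders raw bytes as space-separated hex pairs.
final class HexDump {
    static String toHex(byte[] data, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF)); // mask to avoid sign extension
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = { 0x43, 0x0D, (byte) 0xDC };
        System.out.println(toHex(sample, sample.length)); // prints "43 0D DC"
    }
}
```

The length argument is there so the helper can be called directly with the buffer and byte count returned by InputStream.read(byte[]).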
/gordon
Pep - 27 Oct 2005 20:42 GMT
>> I have now found out, with the use of ethereal at both ends of the
>> socket, that the windows application is definitely sending a 0x0D
[quoted text clipped - 10 lines]
>
> /gordon
At this point I am not sure of anything other than that ethereal shows a 0x0D on the windows end of the socket and a 0xDC on the unix end of the socket.
Similarly, the 0x0D 0x0A on the unix end of the socket is a 0x0D when it reaches the windows end of the socket.
I make no assumptions as to what is causing the change but am relieved to find out that it is not my client written in Java or the server written in some windows based language.
Cheers, Pep.
Steve Horsley - 28 Oct 2005 21:21 GMT
> At this point I am not sure of anything other than ethereal shows a 0x0D on
> the windows end of the socket and a 0xDC on the unix end of the socket.
[quoted text clipped - 5 lines]
> find out that it is not my client written in Java or the server written in
> some windows based language.
Spooky. So what exactly connects the client and server?
As Gordon says, I imagine they must be talking via a proxy. I would be inclined to compare the traces for IP address, MAC address, IP sequence numbers, to prove there is some entity playing piggy in the middle and corrupting the data stream. That kind of change doesn't happen by accident - you have to re-write checksums, and even change sequence numbering if you're dropping bytes from the stream.
Steve
Pep - 31 Oct 2005 12:16 GMT
>> At this point I am not sure of anything other than ethereal shows a 0x0D
>> on the windows end of the socket and a 0xDC on the unix end of the
[quoted text clipped - 18 lines]
>
> Steve
I agree, very spooky.
We are about to run the client and server on the same segment as each other to see if the problem still exists.
Cheers, Pep.
Pep - 31 Oct 2005 16:43 GMT
>> At this point I am not sure of anything other than ethereal shows a 0x0D
>> on the windows end of the socket and a 0xDC on the unix end of the
[quoted text clipped - 18 lines]
>
> Steve
Having now run my client on the same network segment as the server, I have successfully processed in excess of 5,000 records without dropping any. So it looks like there is something on the network which is responsible for the conversion or dropping of bytes.
Now to find out what it is :(
Cheers, Pep.
Steve Horsley - 31 Oct 2005 21:09 GMT
> Having now run my client on the same network segment as the server, I have
> successfully processed in excess of 5,000 records without dropping any. So
> it looks like there is something on the network which is responsible for
> the conversion or dropping of bytes.
>
> Now to find out what it is :(
The thot plickens.
That might prove to be an interesting investigation.
One thing I can guarantee - when you find the kit responsible, and find whoever is responsible for that kit, they will categorically deny that their kit could possibly have anything to do with your problem.
Steve
Pep - 01 Nov 2005 09:34 GMT
>> Having now run my client on the same network segment as the server, I
>> have successfully processed in excess of 5,000 records without dropping
[quoted text clipped - 13 lines]
>
> Steve
ROFL.
I already had one sysad state "absolutely impossible" when I described the problem to him and it's not even his problem :)
Pep.
Roedy Green - 29 Oct 2005 07:58 GMT
>I have now found out, with the use of ethereal at both ends of the socket,
>that the windows application is definitely sending a 0x0D but it is being
>transformed in to a 0xDC by the time it reaches my end of the socket.
So Java has nothing to do with it.
Write a class that reads one record, scanning it byte by byte.
You might use http://mindprod.com/jgloss/readblocking.html as a model. Perhaps it should also convert it to char for you as well after it has scanned the bytes.
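A rough sketch of such a class follows. It is not the code from the page linked above; RecordReader is a hypothetical name, and it assumes a record ends at CR or LF, skips empty records (so a stray or doubled terminator does not yield an empty string), and decodes the scanned bytes as ISO-8859-1.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical reader: one record per call, scanning the stream byte by byte.
final class RecordReader {
    private final InputStream in; // wrap the socket stream in a BufferedInputStream for speed

    RecordReader(InputStream in) {
        this.in = in;
    }

    // Returns the next non-empty record, or null at end of stream.
    String readRecord() throws IOException {
        ByteArrayOutputStream record = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\r' || b == '\n') {
                if (record.size() > 0) {
                    break; // a terminator ends a non-empty record
                }
                // otherwise skip stray or doubled terminators
            } else {
                record.write(b);
            }
        }
        if (record.size() == 0) {
            return null; // nothing left before end of stream
        }
        // Convert to chars only after the bytes have been scanned.
        return new String(record.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = "CCOK Q458000\rCCOK E458000".getBytes(StandardCharsets.ISO_8859_1);
        RecordReader reader = new RecordReader(new ByteArrayInputStream(wire));
        String record;
        while ((record = reader.readRecord()) != null) {
            System.out.println("[" + record + "]");
        }
    }
}
```

Note that this still depends on a terminator being present between records; an unterminated record is only returned once the stream ends.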
Pep - 31 Oct 2005 12:15 GMT
>>I have now found out, with the use of ethereal at both ends of the socket,
>>that the windows application is definitely sending a 0x0D but it is being
>>transformed in to a 0xDC by the time it reaches my end of the socket.
>
> So Java nothing to do with it.
Thankfully, no.
> Write a class that reads one record scanning it byte by byte
>
> You might use http://mindprod.com/jgloss/readblocking.html
> as a model. Perhaps it should also convert it to char for you as well
> after it has scanned the bytes.
Just about to run my client on the same network segment as the server to see if we still have the same problem. If not, then we can work outwards from there to see where it comes in.
Cheers, Pep.
Pep - 31 Oct 2005 16:43 GMT
>>I have now found out, with the use of ethereal at both ends of the socket,
>>that the windows application is definitely sending a 0x0D but it is being
[quoted text clipped - 7 lines]
> as a model. Perhaps it should also convert it to char for you as well
> after it has scanned the bytes.
Having now run my client on the same network segment as the server, I have successfully processed in excess of 5,000 records without dropping any. So it looks like there is something on the network which is responsible for the conversion or dropping of bytes.
Now to find out what it is :(
Cheers, Pep.
Pep - 27 Oct 2005 11:46 GMT
>>Same here. I was told in the spec that the records would be capped with a
>><CR> but they are not always which is something I am arguing with the
[quoted text clipped - 4 lines]
> wrapping the socket until you write some more to push it out.
> .
Actually, as I am looking at the ethereal output, I cannot see a <CR> at the end of any of the records they are sending back to me, so I'm now wondering how the readLine function is working at all.
Cheers, Pep.
Missaka Wijekoon - 29 Oct 2005 06:01 GMT
> Actually as I am looking at the ethereal output, I cannot see a <CR> at the
> end of any of the records they are seeing back to me at all so I'm now
> wondering how the readLine function is working at all?
Per the Java API docs:
public String readLine() throws IOException
Read a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed.
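The documented behaviour can be checked with a short sketch that runs BufferedReader over in-memory strings. ReadLineDemo and the sample inputs are hypothetical; note in the second case that two terminators in a row produce an empty string, and that null is only returned at end of stream.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

final class ReadLineDemo {
    // Collects every line readLine() returns for the given input.
    static List<String> lines(String input) throws IOException {
        List<String> result = new ArrayList<>();
        BufferedReader reader = new BufferedReader(new StringReader(input));
        String line;
        while ((line = reader.readLine()) != null) { // null signals end of stream
            result.add(line);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // CR, LF, and CRLF all terminate a line.
        System.out.println(lines("one\rtwo\nthree\r\nfour")); // prints [one, two, three, four]
        // Two CRs in a row yield an empty line in between.
        System.out.println(lines("A\r\rB"));                  // prints [A, , B]
    }
}
```

Code that indexes into the returned string therefore needs to handle both null and the empty string.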
From some of the conversation, it feels as if there might be a filter that is converting the stream, like dos2unix. Is there a chance that the socket on the server end is not a true socket, but perhaps a telnet connection? For example, the telnet protocol requires that 0xFF be escaped.
> Cheers,
> Pep.
Pep - 31 Oct 2005 16:43 GMT
>> Actually as I am looking at the ethereal output, I cannot see a <CR> at
>> the end of any of the records they are seeing back to me at all so I'm
[quoted text clipped - 15 lines]
>> Cheers,
>> Pep.
Having now run my client on the same network segment as the server, I have successfully processed in excess of 5,000 records without dropping any. So it looks like there is something on the network which is responsible for the conversion or dropping of bytes.
Now to find out what it is :(
Cheers, Pep.