
Java Forum / General / November 2005

Problem communicating with socket application

Pep - 26 Oct 2005 23:37 GMT
I am experiencing problems when trying to communicate with a TCP socket
based application that does not always append a <CR> at the end of the data
and so cannot use readLine on a BufferedReader.

I have tried using a simple read in to a char array but have found that
where the application has sent 4 records using 4 separate socket writes, my
read operation has resulted in them all being read in to one array and I
have no means of determining where one record ends and the next begins,
unless there are <CR> appended to each record, which as I stated is not
always the case :(

Using a normal unix socket read operation in C++ I would not have this
problem as each read operation would result in one record.

How can I get java sockets to operate in a similar manner as unix socket
reads so that one record is obtained in each read operation, regardless of
whether it is appended with a <CR> or not?

TIA,
Pep.
Gordon Beaton - 27 Oct 2005 08:07 GMT
> I am experiencing problems when trying to communicate with a TCP
> socket based application that does not always append a <CR> at the
[quoted text clipped - 9 lines]
> Using a normal unix socket read operation in C++ I would not have
> this problem as each read operation would result in one record.

No, you've just been lucky so far. Your C++ application is broken too,
but has been working "by accident". A subtle difference in timing may
make the difference.

> How can I get java sockets to operate in a similar manner as unix
> socket reads so that one record is obtained in each read operation,
> regardless of whether it is appended with a <CR> or not?

TCP is a lowly byte stream, and it does not know anything about record
boundaries, nor does it make any attempts to preserve them. You may
find that multiple records are occasionally combined, and single
records are sometimes broken into two or more parts.

If you need delimited records you need to manage them yourself. The
easiest way is to insert delimiters (special characters like CR or
anything else that can't occur within a record) between the records as
you send them, so the recipient can determine where one record in the
stream ends and the next one begins.
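
A minimal Java sketch of the delimiter approach, assuming CR (0x0D) is the delimiter and that it never occurs inside a record. The class and method names are made up for illustration. The reader collects bytes until it sees the delimiter, however TCP happened to split or merge the writes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: frames records with a single delimiter byte.
class DelimitedRecords {
    static final int DELIMITER = '\r'; // assumption: CR never occurs inside a record

    // Sender side: write the record followed by the delimiter.
    static void writeRecord(OutputStream out, String record) throws IOException {
        out.write(record.getBytes(StandardCharsets.US_ASCII));
        out.write(DELIMITER);
        out.flush();
    }

    // Receiver side: read bytes up to the next delimiter.
    // Returns null when the stream ends before any byte of a new record arrives.
    static String readRecord(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == DELIMITER) {
                return new String(buf.toByteArray(), StandardCharsets.US_ASCII);
            }
            buf.write(b);
        }
        // End of stream: hand back a trailing unterminated record, if any.
        return buf.size() == 0 ? null : new String(buf.toByteArray(), StandardCharsets.US_ASCII);
    }
}
```

On a real socket, wrapping `socket.getInputStream()` in a `BufferedInputStream` keeps the byte-at-a-time `read()` calls cheap.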

Another way is to precede each record with a short header containing
the length of the record.
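
As a rough sketch of the length-header idea, assuming a 4-byte big-endian length in front of each record (the header size, the ASCII encoding and the class name are my assumptions, not anything from this thread):

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: each record is preceded by its length.
class LengthPrefixedRecords {
    static void writeRecord(DataOutputStream out, String record) throws IOException {
        byte[] payload = record.getBytes(StandardCharsets.US_ASCII);
        out.writeInt(payload.length); // 4-byte big-endian length header
        out.write(payload);
        out.flush();
    }

    static String readRecord(DataInputStream in) throws IOException {
        int length = in.readInt();    // throws EOFException at end of stream
        if (length < 0) {
            throw new IOException("negative record length: " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload);        // blocks until the whole record has arrived
        return new String(payload, StandardCharsets.US_ASCII);
    }
}
```

`writeInt` uses big-endian order, which matches network byte order, so a C or C++ peer could build the same header with `htonl`. Both ends have to agree on the header format for this to work.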

If you already know the length of each record in advance, simply read
the correct number of bytes from the stream each time.
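
Note that even with a known length, a single read() may return fewer bytes than requested, so the read has to loop. A sketch, with a made-up class name:

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for fixed-length records.
class FixedLengthRecords {
    // Reads exactly `length` bytes, looping because one read() may deliver only part.
    static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] record = new byte[length];
        int filled = 0;
        while (filled < length) {
            int n = in.read(record, filled, length - filled);
            if (n == -1) {
                throw new EOFException("stream ended after " + filled + " of " + length + " bytes");
            }
            filled += n;
        }
        return record;
    }
}
```

`DataInputStream.readFully` does the same job if you would rather not write the loop yourself.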

Finally, maybe there is some other mechanism you can use in your
client to recognize the end of a record.

/gordon

Signature

[  do not email me copies of your followups  ]
g o r d o n + n e w s @  b a l d e r 1 3 . s e

Roedy Green - 27 Oct 2005 08:31 GMT
>Another way is to precede each record with a short header containing
>the length of the record.
[quoted text clipped - 4 lines]
>Finally, maybe there is some other mechanism you can use in your
>client to recognize the end of a record.

and yet another way is an ObjectStream that deals with breaking the
stream up into objects for you. That won't work though when one end is
C++.

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Pep - 27 Oct 2005 09:32 GMT
>>Another way is to precede each record with a short header containing
>>the length of the record.
[quoted text clipped - 8 lines]
> stream up into objects for you. That won't work though when one end is
> C++.

Yep, unfortunately this data is being provided by some windows based c++
program.

Cheers,
Pep.
Pep - 27 Oct 2005 09:31 GMT
>> I am experiencing problems when trying to communicate with a TCP
>> socket based application that does not always append a <CR> at the
[quoted text clipped - 13 lines]
> but has been working "by accident". A subtle difference in timing may
> make the difference.

I'm surprised I've been "lucky so far".  I'm talking about an application
that services transactions from multiple clients under extreme load and it
has never missed a record yet. The records are passed using a normal socket
write operation so I know how the data is provided.

Still I won't argue with someone that knows better than me and by that I am
actually trying to be sincere not rude :)

>> How can I get java sockets to operate in a similar manner as unix
>> socket reads so that one record is obtained in each read operation,
[quoted text clipped - 21 lines]
>
> /gordon

Unfortunately I do not have control over the data being sent to me now and I
have now found from an Ethereal analysis that sometimes the records have a
<CR> ending and other times they do not.

I'll have to try simply reading a determined number of bytes :(

Cheers,
Pep
Pep - 27 Oct 2005 09:34 GMT
>>> I am experiencing problems when trying to communicate with a TCP
>>> socket based application that does not always append a <CR> at the
[quoted text clipped - 56 lines]
> Cheers,
> Pep

Sorry I forgot to add that the length of the record is variable so that
without this random <CR> I'm pretty much screwed :(
Gordon Beaton - 27 Oct 2005 09:51 GMT
> Sorry I forgot to add that the length of the record is variable so
> that without this random <CR> I'm pretty much screwed :(

Probably, yes. I thought to add that exact sentiment in my previous
reply, but didn't want to come across as rude!

It sounds odd to me that the existence (or lack) of CR is "random".

Have I understood correctly that your C++ clients work as expected,
and you are now implementing a Java client to an existing C++ server?

Do the C++ clients not have any special logic to deal with record
boundaries? Can you not change the way the server sends messages?

As I mentioned earlier, a small timing change could make a difference.

Basically if there is a short delay between calls to write(), it is
often the case that they will be sent separately by TCP. As long as
the recipient reads() sufficiently quickly, he will receive them
separately as well. This can work ("by accident"), but you really
shouldn't rely on this behaviour.

If the sender sends short messages without delay in between, then TCP
may send them together. Similarly when the reader is slow, messages
will accumulate in his receive buffer, and calls to read() cannot
distinguish between them.
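
The accumulation case can be imitated without a network. In this sketch a ByteArrayInputStream stands in for the socket's receive buffer (an assumption for illustration only; real socket timing varies). Four separate writes have already piled up, and one read() hands back all of them with no boundaries.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; no real socket is involved.
class CoalescedReadDemo {
    // Simulates sender writes that accumulated before the reader got to them,
    // then performs ONE read() and returns what it delivered.
    static String readOnce(String... records) throws IOException {
        ByteArrayOutputStream receiveBuffer = new ByteArrayOutputStream();
        for (String r : records) {
            receiveBuffer.write(r.getBytes(StandardCharsets.US_ASCII)); // one "write" per record
        }
        InputStream in = new ByteArrayInputStream(receiveBuffer.toByteArray());
        byte[] buf = new byte[1024];
        int n = in.read(buf);
        return n == -1 ? "" : new String(buf, 0, n, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        // Prints REC1REC2REC3REC4: the four records arrive as one undivided chunk.
        System.out.println(readOnce("REC1", "REC2", "REC3", "REC4"));
    }
}
```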

/gordon

Signature

[  do not email me copies of your followups  ]
g o r d o n + n e w s @  b a l d e r 1 3 . s e

Pep - 27 Oct 2005 11:01 GMT
>> Sorry I forgot to add that the length of the record is variable so
>> that without this random <CR> I'm pretty much screwed :(
[quoted text clipped - 3 lines]
>
> It sounds odd to me that the existence (or lack) of CR is "random".

Same here.  I was told in the spec that the records would be capped with a
<CR> but they are not always which is something I am arguing with the
designer of the server code.

> Have I understood correctly that your C++ clients work as expected,
> and you are now implementing a Java client to an existing C++ server?

NO.  The original application which I wrote consisted of a C++ server and
client both of which use unix sockets with no EOR delimiter and they work
fine.

Now I am having to replace this with a server that is provided and has been
written using windows based C++ and I have to write the client.  So I have
done this using java.

> Do the C++ clients not have any special logic to deal with record
> boundaries? Can you not change the way the server sends messages?
[quoted text clipped - 6 lines]
> separately as well. This can work ("by accident"), but you really
> shouldn't rely on this behaviour.

Accepted and thanks for that knowledge.

> If the sender sends short messages without delay in between, then TCP
> may send them together. Similarly when the reader is slow, messages
> will accumulate in his receive buffer, and calls to read() cannot
> distinguish between them.

Which appears to be my problem here.

> /gordon

Okay this is the code that I am trying to run

===============================================================================
try
{

       while ((running.get()) && (parentProxy.br != null) &&
              (parentProxy.br.ready()))
       {

               try
               {
                       mResponse = "";

                       if (parentProxy.br != null)
                       {
                               mResponse = parentProxy.br.readLine(); // read the response from the m server
                               logDebugToFile("TSreader::run processing [" + mResponse + "]");
                               parentProxy.processMResult(mResponse); // send the result back to the client
                       }

               }
               catch(SocketTimeoutException e)
               {
                       // do nothing here
               }

       }

}
catch(Throwable e)
{
       logFatalToFile("TSreader::run Error (running) " + e.getMessage(), e);
       e.printStackTrace();
}
===============================================================================

and this is the output of the code

===============================================================================
27 Oct 2005 09:34:15 GMT: DEBUG          - {Thread-0} {Thread-4}
TSreader::run processing [CCOK Q458000:Y:a:XXXX48]
27 Oct 2005 09:34:16 GMT: DEBUG          - {Thread-0} {Thread-4}
TSreader::run processing []
27 Oct 2005 09:34:16 GMT: FATAL          - {Thread-0} {Thread-4}
TSreader::run Error (running) String index out of range: 6
java.lang.StringIndexOutOfBoundsException: String index out of range: 6
       at java.lang.String.charAt(Unknown Source)
       at TS.MP.processMResult(MP.java:486)
       at TS.TSreader.run(TSreader.java:81)
       at java.lang.Thread.run(Unknown Source)
27 Oct 2005 09:34:16 GMT: DEBUG          - {Thread-0} {Thread-4}
TSreader::run processing [CCOK E458000:Y:a:XXXX34]
===============================================================================

as can be seen, it can read the first record and the 3rd record but the
second record is coming back empty

Based on this input on the socket (obtained using ethereal)

===============================================================================
CCOK
Q458000:Y:a:XXXX48C\x9f`C\xd8@^@6^@^@^@6^@^@^@^@^N\x83\x9c\x91\xb2^@^K\xdb\x95\xc
(^H^@E^@^@(V\xbf@^@@^F^@^@\xc0\xa8j\xac^]^H^P\x8f  
\xbc#\x8d\xf5\x95\xd4\xd3\xd1\x83^G\xfdP^P\xe4
Y^F^@^@C\x9f`C\xdbC^@<^@^@^@<^@^@^@^@^K\xdb\x95\xc
(^@^N\x83\x9c\x91\xb2^H^@E^@^@)^O\xf5@^@}^F\x94\xee^]^H^P\x8f\xc0\xa8j\xac#\x8d\xbc\xd1\x83^G\xfd\xf5\x95\xd4\xd3P^X^^^[Z\x91^@^@^M^@^@^@^@^@C\x9f`C\x8en^@\x87^@^@^@\x87^@^@^@^@^N\x83\x9c\x91\xb2^@^K\xdb\x95\xc
(^H^@E^@^@yV\xc6@^@@^F^@^@\xc0\xa8j\xac^]^H^P\x8f    
\xbc#\x8d\xf5\x95\xd4\xd3\xd1\x83^G\xfeP^X\xe4YW^@^@C:MISTER-48/5     :WXYZ  :4111111111111111      :0605:    
20.00:JTWB801XXXX48 ^MC\x9f`C\xab\x86^@M^@^@^@M^@^@^@^@^K\xdb\x95\xc
(^@^N\x83\x9c\x91\xb2^H^@E^@^@?^P\xf5@^@}^F\x93\xd8^]^H^P\x8f\xc0\xa8j\xac#\x8d
\xbc\xd1\x83^G\xfe\xf5\x95\xd5$P^X^]\xca\x88g^@^@CCOKR458000:Y:a:XXXX10C\x9f`C8\xb7^K^@\x87^@^@^@\x87^@^@^@^@^N\x83\x9c\x91\xb2^@^K\xdb\x95\xc
(^H^@E^@^@yV\xcc@^@@^F^@^@\xc0\xa8j\xac^]^H^P\x8f
\xbc#\x8d\xf5\x95\xd5$\xd1\x83^H^UP^X\xe4
YW^@^@C:MISTER-18/5     :WXYZ  :4111111111111111      :0605:    
20.00:KTWB801XXXX18 ^MC\x9f`C\xba^K^@O^@^@^@O^@^@^@^@^K\xdb\x95\xc
(^@^N\x83\x9c\x91\xb2^H^@E^@^@A^R\xf5@^@}^F\x91\xd6^]^H^P\x8f\xc0\xa8j\xac#\x8d  
\xbc\xd1\x83^H^U\xf5\x95\xd5uP^X^]yE~^@^@^MCCOK
E458000:Y:a:XXXX34^MC\x9f`C^K'^M^@6^@^@^@6^@^@^@^@^N\x83\x9c\x91\xb2^@^K\xdb\x95\xc
(^H^@E^@^@(V\xd1@^@@^F^@^@\xc0\xa8j\xac^]^H^P\x8f    
\xbc#\x8d\xf5\x95\xd5u\xd1\x83^H.P^P\xe4Y^F^@^@C\x9f`C\xf3\xdd^M^@M^@^@^@M^@^@^@^@^K\xdb\x95\xc
(^@^N\x83\x9c\x91\xb2^H^@E^@^@?^T\xf5@^@}^F\x8f\xd8^]^H^P\x8f\xc0\xa8j\xac#\x8d  
\xbc\xd1\x83^H.\xf5\x95\xd5uP^X^]y\x82F^@^@
===============================================================================

and I am struggling to find out why this is happening :(

If you can see where I am going wrong then I would greatly appreciate your
advice.
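
The trace above shows an empty string reaching processMResult, which then fails in charAt. Whatever the reason the empty line arrives, a reader loop can skip empty lines and stop when readLine() returns null. A hedged sketch: `pump` and the `Consumer<String>` handler are hypothetical stand-ins for the loop in TSreader and for parentProxy.processMResult, and blocking in readLine() instead of polling ready() is an assumption about what the client wants.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.function.Consumer;

// Hypothetical reader loop, not the actual TSreader code.
class DefensiveReaderLoop {
    // Reads lines until end of stream, skipping empty ones.
    // Returns the number of records passed to the handler.
    static int pump(BufferedReader br, Consumer<String> handler) throws IOException {
        int handled = 0;
        String line;
        while ((line = br.readLine()) != null) { // null means the stream has ended
            if (line.isEmpty()) {
                continue; // e.g. two terminators in a row; nothing to process
            }
            handler.accept(line);
            handled++;
        }
        return handled;
    }
}
```

This only stops the exception. It does not explain why a record comes back empty in the first place.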

The really annoying thing is that I have written a simulator based on the
server's spec of capping the records with a <CR> and my client can process
up to 100,000 records in a 2 hour period. So this is really starting to
piss me off!

Cheers,
Pep.
Roedy Green - 27 Oct 2005 11:28 GMT
>Same here.  I was told in the spec that the records would be capped with a
><CR> but they are not always which is something I am arguing with the
>designer of the server code.

If you are using a BufferedOutputStream, you want to do a flush()
after every record or parts of it could stay stuck in the buffer
wrapping the socket until you write some more to push it out.
.
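
A small sketch of the buffering effect, using a ByteArrayOutputStream as a stand-in for socket.getOutputStream() (that substitution and the class name are assumptions for illustration):

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of flushing after every record.
class FlushPerRecord {
    // Writes one CR-terminated record and flushes so it does not sit in the buffer.
    static void sendRecord(OutputStream out, String record) throws IOException {
        out.write(record.getBytes(StandardCharsets.US_ASCII));
        out.write('\r');
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stand-in for the socket
        BufferedOutputStream out = new BufferedOutputStream(wire);

        out.write("unflushed".getBytes(StandardCharsets.US_ASCII));
        System.out.println(wire.size()); // 0: the bytes are still held in the buffer

        sendRecord(out, "REC1");
        System.out.println(wire.size()); // 14: everything was pushed out by flush()
    }
}
```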

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Pep - 27 Oct 2005 11:42 GMT
>>Same here.  I was told in the spec that the records would be capped with a
>><CR> but they are not always which is something I am arguing with the
[quoted text clipped - 4 lines]
> wrapping the socket until you write some more to push it out.
> .

I'm using a print writer

pw   = new PrintWriter(microgateSocket.getOutputStream(),true);

The really annoying thing about this is that it seems to be related to the
fact that they are not placing a <CR> at the end of the records.  My
simulator puts a \n at the end of each record and like I said, my client
will then process over 100,000 records without dropping a single one.

Cheers,
Pep.
Roedy Green - 27 Oct 2005 12:04 GMT
>pw   = new PrintWriter(microgateSocket.getOutputStream(),true);
>
>The really annoying thing about this is that it seems to be related to the
>fact that they are not placing a <CR> at the end of the records.

That is not the official duty of a PrintWriter.  It is supposed to put
a platform specific line separator there. If you want a cr
specifically you should do a write( '\r' );
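
A sketch of sending an explicit CR rather than relying on println(). The class name is hypothetical:

```java
import java.io.PrintWriter;

// Hypothetical helper: terminate records with exactly one CR.
class ExplicitTerminator {
    static void sendRecord(PrintWriter pw, String record) {
        pw.print(record);
        pw.write('\r'); // always 0x0D, independent of the platform line separator
        pw.flush();     // autoflush only applies to println, printf and format
    }
}
```

The explicit flush() matters here: a PrintWriter created with autoflush set to true still does not flush on print() or write().
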
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Pep - 27 Oct 2005 12:25 GMT
>>pw   = new PrintWriter(microgateSocket.getOutputStream(),true);
>>
[quoted text clipped - 4 lines]
> a platform specific line separator there. If you want a cr
> specifically you should do a write( '\r' );

Yeah but they are not using a print writer.  They have written their
application using either Visual C++ or Visual Basic so maybe god knows how
they are writing the data to the socket?

I use the print writer in my java client to send the transactions to their
server and I append a <CR> to the data.

I have just done another massive run against their server and they seem to
only be appending a <CR> to maybe around 2% of their transaction reply
records.  Which of course means as it is a variable length record with no
delimiter I cannot even handle the protocol myself using a data stream
reader.

Cheers,
Pep.
Pep - 27 Oct 2005 16:02 GMT
>>pw   = new PrintWriter(microgateSocket.getOutputStream(),true);
>>
[quoted text clipped - 4 lines]
> a platform specific line separator there. If you want a cr
> specifically you should do a write( '\r' );

I have now found out, with the use of ethereal at both ends of the socket,
that the windows application is definitely sending a 0x0D but it is being
transformed in to a 0xDC by the time it reaches my end of the socket.

Similarly my 0x0D 0x0A byte sequence is being converted in to a 0x0D.

Cheers,
Pep.
Gordon Beaton - 27 Oct 2005 16:20 GMT
> I have now found out, with the use of ethereal at both ends of the
> socket, that the windows application is definitely sending a 0x0D
> but it is being transformed in to a 0xDC by the time it reaches my
> end of the socket.

I find it extremely hard to believe that the cable or a switch alone
would be making such selective changes to the data stream.

Are you absolutely certain that you aren't making parts of this
observation in the code itself, where some processing has already
taken place? Or that your data doesn't pass through a proxy of some
kind?
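
One way to rule out the client's own processing is to log the bytes straight from the socket's InputStream, before any Reader or charset decoding touches them. A sketch with hypothetical names:

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical diagnostic helper: shows received bytes exactly as they arrived.
class RawByteDump {
    // Formats bytes as two-digit hex separated by spaces, e.g. "43 43 4f 4b 0d".
    static String toHex(byte[] data, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    // Reads whatever one read() delivers and returns it as hex, or null at end of stream.
    static String dumpNextChunk(InputStream in) throws IOException {
        byte[] buf = new byte[4096];
        int n = in.read(buf);
        return n == -1 ? null : toHex(buf, n);
    }
}
```

If a 0x0D shows up here but not in the decoded strings, the change happens inside the client; if it is already missing here, it happened before the bytes reached the JVM.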

/gordon

Signature

[  do not email me copies of your followups  ]
g o r d o n + n e w s @  b a l d e r 1 3 . s e

Pep - 27 Oct 2005 20:42 GMT
>> I have now found out, with the use of ethereal at both ends of the
>> socket, that the windows application is definitely sending a 0x0D
[quoted text clipped - 10 lines]
>
> /gordon

At this point I am  not sure of anything other than ethereal shows a 0x0D on
the windows end of the socket and a 0xDC on the unix end of the socket.

Similarly, the 0x0D 0x0A on the unix end of the socket is a 0x0D when it
reaches the windows end of the socket.

I make no assumptions as to what is causing the change but am relieved to
find out that it is not my client written in Java or the server written in
some windows based language.

Cheers,
Pep.
Steve Horsley - 28 Oct 2005 21:21 GMT
> At this point I am  not sure of anything other than ethereal shows a 0x0D on
> the windows end of the socket and a 0xDC on the unix end of the socket.
[quoted text clipped - 5 lines]
> find out that it is not my client written in Java or the server written in
> some windows based language.

Spooky. So what exactly connects the client and server?

As Gordon says, I imagine they must be talking via a proxy. I
would be inclined to compare the traces for IP address, MAC
address, IP sequence numbers, to prove there is some entity
playing piggy in the middle and corrupting the data stream. That
kind of change doesn't happen by accident - you have to re-write
checksums, and even change sequence numbering if you're dropping
bytes from the stream.

Steve
Pep - 31 Oct 2005 12:16 GMT
>> At this point I am  not sure of anything other than ethereal shows a 0x0D
>> on the windows end of the socket and a 0xDC on the unix end of the
[quoted text clipped - 18 lines]
>
> Steve

I agree, very spooky.

We are about to run the client and server on the same segment as each other
to see if the problem still exists.

Cheers,
Pep.
Pep - 31 Oct 2005 16:43 GMT
>> At this point I am  not sure of anything other than ethereal shows a 0x0D
>> on the windows end of the socket and a 0xDC on the unix end of the
[quoted text clipped - 18 lines]
>
> Steve

Having now run my client on the same network segment as the server, I have
successfully processed in excess of 5,000 records without dropping any. So
it looks like there is something on the network which is responsible for
the conversion or dropping of bytes.

Now to find out what it is :(

Cheers,
Pep.
Steve Horsley - 31 Oct 2005 21:09 GMT
> Having now run my client on the same network segment as the server, I have
> successfully processed in excess of 5,000 records without dropping any. So
> it looks like there is something on the network which is responsible for
> the conversion or dropping of bytes.
>
> Now to find out what it is :(

The thot plickens.

That might prove to be an interesting investigation.

One thing I can guarantee - when you find the kit responsible,
and find whoever is responsible for that kit, they will
categorically deny that their kit could possibly have anything to
do with your problem.

Steve
Pep - 01 Nov 2005 09:34 GMT
>> Having now run my client on the same network segment as the server, I
>> have successfully processed in excess of 5,000 records without dropping
[quoted text clipped - 13 lines]
>
> Steve

ROFL.

I already had one sysad state "absolutely impossible" when I described the
problem to him and it's not even his problem :)

Pep.
Roedy Green - 29 Oct 2005 07:58 GMT
>I have now found out, with the use of ethereal at both ends of the socket,
>that the windows application is definitely sending a 0x0D but it is being
>transformed in to a 0xDC by the time it reaches my end of the socket.

So Java has nothing to do with it.

Write a class that reads one record, scanning it byte by byte.

You might use http://mindprod.com/jgloss/readblocking.html
as a model.  Perhaps it should also convert it to char for you as well
after it has scanned the bytes.

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Pep - 31 Oct 2005 12:15 GMT
>>I have now found out, with the use of ethereal at both ends of the socket,
>>that the windows application is definitely sending a 0x0D but it is being
>>transformed in to a 0xDC by the time it reaches my end of the socket.
>
> So Java has nothing to do with it.

Thankfully, no.

> Write a class that reads one record, scanning it byte by byte.
>
> You might use http://mindprod.com/jgloss/readblocking.html
> as a model.  Perhaps it should also convert it to char for you as well
> after it has scanned the bytes.

Just about to run my client on the same network segment as the server to see
if we still have the same problem.  If not then we can work outwards from
there to see where it comes in.

Cheers,
Pep.
Pep - 31 Oct 2005 16:43 GMT
>>I have now found out, with the use of ethereal at both ends of the socket,
>>that the windows application is definitely sending a 0x0D but it is being
[quoted text clipped - 7 lines]
> as a model.  Perhaps it should also convert it to char for you as well
> after it has scanned the bytes.

Having now run my client on the same network segment as the server, I have
successfully processed in excess of 5,000 records without dropping any. So
it looks like there is something on the network which is responsible for
the conversion or dropping of bytes.

Now to find out what it is :(

Cheers,
Pep.
Pep - 27 Oct 2005 11:46 GMT
>>Same here.  I was told in the spec that the records would be capped with a
>><CR> but they are not always which is something I am arguing with the
[quoted text clipped - 4 lines]
> wrapping the socket until you write some more to push it out.
> .

Actually as I am looking at the ethereal output, I cannot see a <CR> at the
end of any of the records they are sending back to me at all so I'm now
wondering how the readLine function is working at all?

Cheers,
Pep.
Missaka Wijekoon - 29 Oct 2005 06:01 GMT
> Actually as I am looking at the ethereal output, I cannot see a <CR> at the
> end of any of the records they are sending back to me at all so I'm now
> wondering how the readLine function is working at all?

Per the Java API docs:

public String readLine() throws IOException
    Read a line of text. A line is considered to be terminated by any
one of a line feed ('\n'), a carriage return ('\r'), or a carriage
return followed immediately by a linefeed.
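
A small sketch of that documented behaviour, using a StringReader in place of the socket (the class name and test strings are made up):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of which terminators readLine() accepts.
class ReadLineTerminators {
    static List<String> lines(String input) throws IOException {
        List<String> result = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(input))) {
            String line;
            while ((line = br.readLine()) != null) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(lines("A\rB\nC\r\nD")); // [A, B, C, D]
        System.out.println(lines("A\r\rB"));       // [A, , B]
    }
}
```

The second call shows that two terminators in a row produce an empty string. That is one way an empty record can appear, though nothing here establishes that it is what happened in this case.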

From some of the conversation, it feels as if there might be a filter
that is converting the stream like dos2unix, etc.  Is there a chance
that the socket on the server end is not a true socket, but perhaps a
telnet connection?  For example, the telnet protocol requires that 0xFF
be escaped.

> Cheers,
> Pep.
Pep - 31 Oct 2005 16:43 GMT
>> Actually as I am looking at the ethereal output, I cannot see a <CR> at
>> the end of any of the records they are seeing back to me at all so I'm
[quoted text clipped - 15 lines]
>> Cheers,
>> Pep.

Having now run my client on the same network segment as the server, I have
successfully processed in excess of 5,000 records without dropping any. So
it looks like there is something on the network which is responsible for
the conversion or dropping of bytes.

Now to find out what it is :(

Cheers,
Pep.


©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.