Java Forum / General / November 2006
How can I detect a carriage return using java.net
Angus - 21 Oct 2006 16:14 GMT
Hello
I have code a bit like this:
BufferedReader bis = new BufferedReader(new InputStreamReader(tserverSocket.getInputStream()));
// Here I write some data - getting a response - some lines of text returned
boolean more = true;
while (more) {
    String line = bis.readLine();
    if (line == null)
        more = false;
    else
        System.out.println(line);
}
But code never gets out of while loop.
I assumed when there was no more input it would return null, so more would be set to false and it would exit loop.
I want to test for a carriage return. (I know the text is in ASCII format, and I know the end of each line has a carriage return and line feed.)
So I tried doing this:
if (line.indexOf("\n") != -1) { break; }
But it didn't seem to find \n . How should I test for a carriage return?
Angus
java@starflag.net - 21 Oct 2006 16:44 GMT
Angus,
BufferedReader does return null when there is no more data to read. However, when dealing with a socket (based on the "tserverSocket" variable name), it's possible that, if the connection is still active, the empty string is what is being returned. The BufferedReader.readLine() method automatically strips off the line terminator (carriage return, line feed, or both), so trying to match on it will never succeed.
For example, if the stream sends "one\ntwo\nthree\n", the first invocation of readLine() would return "one", etc.
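The stripping behaviour described above can be checked without a socket. This is a minimal sketch, assuming an in-memory StringReader stands in for the socket stream; the class name ReadLineDemo and the helper readAllLines are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ReadLineDemo {
    // Hypothetical helper: collects every line readLine() returns until it returns null.
    static List<String> readAllLines(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Mixed terminators: "\r\n" and "\n" are both treated as end of line.
        for (String line : readAllLines("one\r\ntwo\nthree\r\n")) {
            // Both indexOf calls yield -1 because the terminator has already been removed.
            System.out.println("[" + line + "] CR=" + line.indexOf('\r')
                    + " LF=" + line.indexOf('\n'));
        }
    }
}
```

This is why a test such as line.indexOf("\n") never matches: the terminator is consumed by readLine() and is not part of the returned string.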
I think I need a little more information about the tserverSocket to assist more on this.
kavaj
> Hello
> [quoted text clipped - 33 lines]
>
> Angus
EJP - 22 Oct 2006 08:51 GMT
> BufferedReader does return null when there is no more data to read.
> However, when dealing with a socket (based on the "tserverSocket"
> variable name), it's possible that if the connection is still active
> that the empty string might be what is being returned.
This makes no sense. A null will be returned when the other end closes its socket. Anything else returned, including an empty string, is returned because the other end wrote it, and for no other reason.
Arne Vajhøj - 21 Oct 2006 16:44 GMT
> boolean more = true;
> while(more)
[quoted text clipped - 10 lines]
> I assumed when there was no more input it would return null, so more would
> be set to false and it would exit loop.
If you want to read one line then use:
String line = bis.readLine();
If you want to read multiple lines then your code should work even though:
String line;
while ((line = bis.readLine()) != null) {
    ...
}
is the most common way of doing it.
Note that readLine() only returns null (EOF) once the socket has been closed at the other end.
Arne
Gordon Beaton - 21 Oct 2006 16:50 GMT
> // Here I write some data - getting a response - some lines of text
> // returned
[quoted text clipped - 10 lines]
>
> But code never gets out of while loop.
That's a complicated way of writing a simple loop. This is the idiom:
String line;
while ((line = bis.readLine()) != null) {
    System.out.println(line);
}
> I assumed when there was no more input it would return null, so more
> would be set to false and it would exit loop.
It will, but "no more input" means "end of file". readLine() returns null and the loop will exit when you reach EOF, i.e. when the remote *closes* the connection. You won't reach EOF as long as the remote keeps the connection open.
> I want to test for a carriage return. (I know text is in ASCII
> format, and I know end of each line has a carriage return and line
> feed.
readLine() has already tested for newlines and removed them. Each line it returns is exactly one whole line.
/gordon
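To illustrate the point that null means the remote end closed the connection, here is a minimal sketch using two sockets on the loopback interface in a single thread. The class name EofOnCloseDemo and the method readUntilClosed are hypothetical, and the "server" is just the accepted socket in the same process.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class EofOnCloseDemo {
    // Hypothetical helper: the "server" side writes two lines and closes;
    // the client side reads with the usual idiom until readLine() returns null.
    static List<String> readUntilClosed() throws IOException {
        List<String> received = new ArrayList<>();
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, listener.getLocalPort());
             Socket serverSide = listener.accept()) {

            serverSide.getOutputStream()
                    .write("first\r\nsecond\r\n".getBytes(StandardCharsets.US_ASCII));
            // Orderly close: the client sees EOF only after this, never before.
            serverSide.close();

            BufferedReader in = new BufferedReader(
                    new InputStreamReader(client.getInputStream(), StandardCharsets.US_ASCII));
            String line;
            while ((line = in.readLine()) != null) {
                received.add(line);
            }
        }
        return received;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readUntilClosed());
    }
}
```

If the serverSide.close() call were moved after the read loop, the loop would block after the second line, which is the behaviour described in the original question.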
 Signature [ don't email me support questions or followups ] g o r d o n + n e w s @ b a l d e r 1 3 . s e
Angus - 21 Oct 2006 17:30 GMT
Ah OK.
So what I really want to know is how do I know when server has finished sending me data?
Eg my socket server sends out 1 to many lines of output. How do I know when server has finished sending data? Is it possible?
Or does my server need to somehow send some data which indicates sending has finished?
I have the source code for the server too so I can edit the way the server outputs data if necessary.
Angus
> > // Here I write some data - getting a response - some lines of text
> > // returned
[quoted text clipped - 38 lines]
> [ don't email me support questions or followups ]
> g o r d o n + n e w s @ b a l d e r 1 3 . s e
Gordon Beaton - 21 Oct 2006 17:50 GMT
> So what I really want to know is how do I know when server has
> finished sending me data?
[quoted text clipped - 4 lines]
> Or does my server need to somehow send some data which indicates
> sending has finished?
Yes it's possible, but the solution depends on what happens later.
If you don't need to keep the connection open, then the server can just close it and readLine() will return null, indicating EOF.
If you need to continue communicating with the server, then the server should send information the client can use to determine where the end of the response is. One way is to precede the response with the number of lines to expect so the client can count them. Another is to send an extra, empty line ("") after the response (assuming that can't occur within the response itself), or to mark the final line of the response differently from the others in some way.
/gordon
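A minimal sketch of the empty-line approach, assuming the server agrees never to send an empty line inside a response. The class name BlankLineTerminator and the method readResponse are hypothetical names, not part of any posted code.

```java
import java.io.BufferedReader;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class BlankLineTerminator {
    // Reads one response: every line up to, but not including, the first empty line.
    // The connection stays open, so the same reader can be reused for the next response.
    static List<String> readResponse(BufferedReader in) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {
            if (line.isEmpty()) {
                return lines; // end-of-response marker reached
            }
            lines.add(line);
        }
        // null means the peer closed before sending the marker.
        throw new EOFException("connection closed before end of response");
    }
}
```

On the client this would be called as readResponse(bis) after each request, with bis being the BufferedReader from the original post.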
Martin Gregorie - 21 Oct 2006 22:32 GMT
> If you don't need to keep the connection open, then the server can
> just close it and readLine() will return null, indicating EOF.
With all due respect, I think that the connection should always be opened and closed by the client: if the server closes the connection you have no way of telling if the network broke, the server crashed or if it was an intentional "end of dialog" closure.
> the server should send information the client can use to determine where
> the end of the response is. One way is to precede the response with
> the number of lines to expect so the client can count them. Another is
> to send an extra, empty line ("") after the response (assuming that
> can't occur within the response itself), or to mark the final line of
> the response differently from the others in some way.
A better way is to precede each message with its length, which should be either a binary byte or (better) a fixed length character string, e.g.,
0012Message data
This way the receiver starts with a fixed length read to get the message data length, translates it into an int and then reads that many bytes to get the message. If either read gets the wrong number of bytes you know there's a problem and you can take corrective action.
If both client and server send this type of message, both can use the same error checking and message decoding code, with the addition that, if the server receives a zero length message when it's expecting a message length, it knows that the client has closed the connection.
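The length-prefix scheme can be sketched in Java roughly as follows. This assumes a four-digit ASCII decimal prefix as in the "0012Message data" example and ASCII-only message bodies; the class and method names are invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class LengthPrefixedMessages {
    static final int PREFIX_LENGTH = 4;      // assumption: four ASCII digits
    static final int MAX_BODY_LENGTH = 9999; // largest value four digits can carry

    // Writes the zero-padded length followed by the message bytes.
    static void writeMessage(OutputStream out, String body) throws IOException {
        byte[] payload = body.getBytes(StandardCharsets.US_ASCII);
        if (payload.length > MAX_BODY_LENGTH) {
            throw new IllegalArgumentException("message too long: " + payload.length);
        }
        String prefix = String.format(Locale.ROOT, "%04d", payload.length);
        out.write(prefix.getBytes(StandardCharsets.US_ASCII));
        out.write(payload);
        out.flush();
    }

    // Fixed-length read for the prefix, then a read of exactly that many bytes.
    // readFully throws EOFException if the stream ends before the buffer is filled.
    static String readMessage(DataInputStream in) throws IOException {
        byte[] prefix = new byte[PREFIX_LENGTH];
        in.readFully(prefix);
        int length = 0;
        for (byte b : prefix) {
            if (b < '0' || b > '9') {
                throw new IOException("malformed length prefix");
            }
            length = length * 10 + (b - '0');
        }
        byte[] payload = new byte[length];
        in.readFully(payload);
        return new String(payload, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeMessage(wire, "Message data");
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readMessage(in));
    }
}
```

With a real connection the streams would come from socket.getOutputStream() and socket.getInputStream(). In this sketch an EOFException on the prefix read corresponds to the peer closing the connection, whether at a message boundary or part-way through a prefix.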
I usually build messages this way, but assemble them from comma separated fields. If the client always sends messages consisting of a command and an associated value a message might look like:
0024,STORE,Text to be stored
While the reply from the server might look like
0004,OK,
or
0023,ERROR,The file is full
This format has the double advantage that it's easy to decode and is human-readable when it appears in debugging messages, etc.
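A rough sketch of building and decoding such messages. It assumes, as the three examples above suggest, that the four-digit length counts everything after the prefix including the leading comma, and that the text is ASCII so characters and bytes coincide. The class name FieldMessages and its methods are hypothetical.

```java
import java.util.Locale;

class FieldMessages {
    // Builds "<4-digit length>,<command>,<value>".
    static String build(String command, String value) {
        String body = "," + command + "," + value;
        if (body.length() > 9999) {
            throw new IllegalArgumentException("message too long");
        }
        return String.format(Locale.ROOT, "%04d", body.length()) + body;
    }

    // Returns {command, value}, checking the length prefix against the actual body.
    static String[] parse(String message) {
        int declared = Integer.parseInt(message.substring(0, 4));
        String body = message.substring(4);
        if (body.length() != declared) {
            throw new IllegalArgumentException(
                    "length mismatch: declared " + declared + ", got " + body.length());
        }
        // Limit 3 keeps any commas inside the value and keeps a trailing empty value.
        String[] parts = body.split(",", 3);
        if (parts.length != 3 || !parts[0].isEmpty()) {
            throw new IllegalArgumentException("malformed message body");
        }
        return new String[] { parts[1], parts[2] };
    }

    public static void main(String[] args) {
        System.out.println(build("STORE", "Text to be stored"));
        System.out.println(build("OK", ""));
        System.out.println(build("ERROR", "The file is full"));
    }
}
```

Because the value is the last field, it may itself contain commas without confusing the split; a command containing a comma would not survive this sketch.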
 Signature martin@ | Martin Gregorie gregorie. | Essex, UK org |
EJP - 22 Oct 2006 08:51 GMT
> With all due respect, I think that the connection should always be
> opened and closed by the client: if the server closes the connection you
> have no way of telling if the network broke, the server crashed or if it
> was an intentional "end of dialog" closure.
Except that an orderly close returns a null via readLine(), or -1 via read(), or an EOFException via any other method, whereas a disorderly close throws an IOException, or a SocketException, or possibly just blocks forever.
Martin Gregorie - 22 Oct 2006 12:08 GMT
>> With all due respect, I think that the connection should always be
>> opened and closed by the client: if the server closes the connection
[quoted text clipped - 5 lines]
> close throws an IOException, or a SocketException, or possibly just
> blocks forever.
If the server crashes when the client is expecting a reply, the client will just get a null from readLine because, at the lowest level, the reader can't distinguish an intentional close from a connection that closes when the remote program crashes.
IIRC you'll only get the IOException if you try to write to a broken connection or try to read it after you've already seen the close.
EJP - 22 Oct 2006 13:34 GMT
> If the server crashes when the client is expecting a reply, the client
> will just get a null from readLine because, at the lowest level, the
> reader can't distinguish an intentional close from a connection that
> closes when the remote program crashes.
At the lowest level the TCP stack will receive either nothing or an RST from a disorderly close; it will receive a FIN from an orderly close.
Martin Gregorie - 22 Oct 2006 23:33 GMT
>> If the server crashes when the client is expecting a reply, the client
>> will just get a null from readLine because, at the lowest level, the
[quoted text clipped - 3 lines]
> At the lowest level the TCP stack will receive either nothing or an RST
> from a disorderly close; it will receive a FIN from an orderly close.
Whether that difference is visible at application code level depends on the stack implementation. On my kit, like all Unices, the difference isn't visible and the behavior is not language dependent: C and Java do exactly the same.
EJP - 24 Oct 2006 00:50 GMT
>> At the lowest level the TCP stack will receive either nothing or an
>> RST from a disorderly close; it will receive a FIN from an orderly close.
[quoted text clipped - 3 lines]
> isn't visible and the behavior is not language dependent: C and Java do
> exactly the same.
There is no truth in this. At the 'C' level a read() or recv() or recvmsg() will return 0 meaning EOF on receiving a FIN, and -1 with an errno of ECONNRESET on receiving an RST. This is true both of Berkeley Sockets and WINSOCK except for the WSA on the front of the names. At the Java level a read() will return -1 on a clean EOF from a FIN, and throw an IOException on a reset. Java's readLine(), readObject(), readXXX() are all built on read(), and their behaviour on these conditions is similarly well-defined.
Martin Gregorie - 24 Oct 2006 19:32 GMT
>>> At the lowest level the TCP stack will receive either nothing or an
>>> RST from a disorderly close; it will receive a FIN from an orderly
[quoted text clipped - 9 lines]
> errno of ECONNRESET on receiving an RST. This is true both of Berkeley
> Sockets and WINSOCK except for the WSA on the front of the names.
Maybe so for Berkeley Sockets or WINSOCK, but that's not true for the version in the Linux Fedora Core distribution. I'm well aware that errors cause -1 to be returned, but the stack is not treating the disconnection as an error.
Killing the remote process causes recv() to return zero, exactly the same as if the remote process had closed the connection. This is not speculation or derived from reading manpages: I just re-checked it by running tests.
As I've seen this behavior in UNIX SVR4 systems as well as Linux, it's obviously not uncommon.
EJP - 25 Oct 2006 01:24 GMT
> Maybe so for Berkeley Sockets or WINSOCK, but that's not true for the
> version in the Linux Fedora Core distribution. I'm well aware that
> errors cause -1 to be returned, but the stack is not treating the
> disconnection as an error.
Clearly a bug if true, and a bad one.
> Killing the remote process causes recv() to return zero, exactly the
> same as if the remote process had closed the connection. This is not
> speculation or derived from reading manpages: I just re-checked it by
> running tests.
kill -9? or kill with some signal that the process could have caught?
Martin Gregorie - 25 Oct 2006 11:03 GMT
>> Maybe so for Berkeley Sockets or WINSOCK, but that's not true for the
>> version in the Linux Fedora Core distribution. I'm well aware that
[quoted text clipped - 9 lines]
>
> kill -9? or kill with some signal that the process could have caught?
kill -9
EJP - 26 Oct 2006 00:52 GMT
>> kill -9? or kill with some signal that the process could have caught?
>
> kill -9
Erk!
I'd still find it easier to believe that the OS has properly closed the socket for the killed process than that the reader never gets {-1,ECONNRESET}. I'd want to see an incoming RST segment that is ignored before I could really believe that such a basic error exists in any production OS.
Martin Gregorie - 26 Oct 2006 12:39 GMT
>>> kill -9? or kill with some signal that the process could have caught?
>>
[quoted text clipped - 7 lines]
> before I could really believe that such a basic error exists in any
> production OS.
Or, as I said earlier, it could be a deliberate implementation decision in the upper layers of the stack.
I haven't looked into Java on the same platform in the same detail, but as there's a good chance it uses the same stack, I'm not surprised that I'm seeing closely similar behavior.
EJP - 27 Oct 2006 01:25 GMT
> Or, as I said earlier, it could be a deliberate implementation decision
> in the upper layers of the stack.
It is possible. It is extremely unlikely. It wouldn't comply with any known standard for Unix or sockets. I don't know anything about Fedora but it is certainly true that the Berkeley stacks deliver ECONNRESET, contrary to previous assertions.
An OS with this bug wouldn't be able to make any practical use of TCP/IP or the Internet.
Martin Gregorie - 28 Oct 2006 19:27 GMT
>> Or, as I said earlier, it could be a deliberate implementation
>> decision in the upper layers of the stack.
[quoted text clipped - 6 lines]
> An OS with this bug wouldn't be able to make any practical use of TCP/IP
> or the Internet.
The stack used by DEC Unix on the alphaserver did the same - and that was a Mach based implementation.
Actually, it's a pretty benign fault, if it is one. It certainly doesn't make socket connections unusable. Any time a recv() or read() returns a positive value you know you've got data. If you get <=0 the connection has been dropped and you need to clean up and recover. If you got an error you can report it, but this doesn't affect the logic unless it was EAGAIN or EINTR, which both indicate that the recv() returned without reading any valid data but the connection is still intact.
My copy of "UNIX Systems Programming for SVR4" implies that this behavior is the norm from its examples, which all use logic of the form:
while ((n = recv()) > 0)
    process the message
if (n < 0)
    decode errno and output the error message
stop.
and the Linux manpage says more or less the same. Neither explicitly mentions any error indicating that the connection has failed.
EJP - 30 Oct 2006 08:37 GMT
> The stack used by DEC Unix on the alphaserver did the same - and that
> was a Mach based implementation.
At the receiving end or the kill -9 end? We haven't resolved that one yet.
> Actually, it's a pretty benign fault, if it is one. It certainly doesn't
> make socket connections unusable.
It makes error detection impossible. It means that every transfer would appear to be complete when it isn't. I would describe that as unusable.
> My copy of "UNIX Systems Programming for SVR4" implies that this
> behavior is the norm from its examples, which all use logic of the form:
[quoted text clipped - 7 lines]
> and the Linux manpage says more or less the same. Neither explicitly
> mentions any error indicating that the connection has failed.
Apart from the n < 0 test?
I am finding all this utterly impossible to believe. Can you provide an ethereal dump showing the incoming RST? - the one allegedly being ignored by the 'implementation decision in the upper layers of the stack'?
EJP - 01 Nov 2006 09:18 GMT
... and the following program demonstrates clearly that I am stone cold motherless wrong.
The only way it prints the RST line is if the write line is enabled. It can't tell the difference between a reset and a FIN when reading.
=====================================
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
public class SocketResetTest {
    /** Creates a new instance of SocketResetTest */
    public static void main(String[] args) throws IOException {
        ServerSocket ss = new ServerSocket(0);
        Socket cs = new Socket("localhost", ss.getLocalPort());
        Socket cc = ss.accept();
        cc.getOutputStream().write("Hello".getBytes());
        cc.setSoLinger(false, 0);
        cc.close();
        ss.close();
        try {
            // This write detects the RST.
            // Without it, the read detects the EOS.
            // cs.getOutputStream().write("H".getBytes());
            int c;
            while ((c = cs.getInputStream().read()) > 0)
                System.out.print((char)c);
            System.out.println("");
            System.out.println("Detected an EOS, seemed like a FIN");
        } catch (IOException exc) {
            System.out.println("Detected an error, seemed like an RST");
        }
        cs.close();
    }
}
Martin Gregorie - 01 Nov 2006 13:11 GMT
> ... and the following program demonstrates clearly that I am stone cold
> motherless wrong.
[quoted text clipped - 40 lines]
>
> }
An interesting test. Thanks for trying it and giving the results. What host platform and stack were you using?
I doubt we'll get any more confirmation of the way the DEC UNIX Alphaserver stack handles things: I haven't had access to such a system since 2001 and there probably aren't many left these days.
A pity: DEC UNIX had its problems but the Alphaserver hardware was startlingly fast: would you believe 18 developers using PCs and Hummingbird X-term as terminals on a 150 MHz uniprocessor box and no significant delays?
EJP - 02 Nov 2006 01:04 GMT
> An interesting test. Thanks for trying it and giving the results.
> What host platform and stack were you using?
Windows XP Pro 2002 SP2, but having seen that (and having written in my own book (http://www.telekinesis.com.au/wipv3_6/FundamentalNetworkingInJava.A21) that you can only detect RSTs by trying a write), I'm prepared to believe that most or all TCPs do this.
Gordon Beaton - 22 Oct 2006 09:27 GMT
> With all due respect, I think that the connection should always be
> opened and closed by the client: if the server closes the connection
> you have no way of telling if the network broke, the server crashed
> or if it was an intentional "end of dialog" closure.
As EJP already pointed out there are other ways of determining those things.
I wonder though why you want to favour the client. Isn't information about the connection state equally useful at *both* ends of the connection?
/gordon
Martin Gregorie - 22 Oct 2006 12:37 GMT
> As EJP already pointed out there are other ways of determining those
> things.
>
> I wonder though why you want to favour the client. Isn't information
> about the connection state equally useful at *both* ends of the
> connection?
Yes, of course, but if you use the convention that only the client opens and closes connections then:
- the client can treat any disconnection or i/o problem as an error.
- the server can always just clean up and wait for another connection when it sees a disconnection. It also does the same if sending a response fails but should also log the error.
If the server is stateless that's all it ever needs to do.
A stateful server is more complex because a disconnect is only valid if it's waiting for the start of a session. A disconnect or i/o error at any other time is always an error.
I've designed complex, high performance, multi-process systems without needing to use stateful message exchanges. These have used request/response pairs with the client issuing the request. The overhead of the positive response has never been a problem and it certainly makes error recovery and process synchronization a lot easier.
Red Orchid - 21 Oct 2006 18:55 GMT
"Angus" <nospam@gmail.com> wrote or quoted in Message-ID: <ehdhvu$hhf$1$8302bc10@news.demon.co.uk>:
> Ah OK.
>
[quoted text clipped - 6 lines]
> Or does my server need to somehow send some data which indicates sending has
> finished?
As far as I know, 'null' does not necessarily indicate that the server has finished sending data. 'null' may be returned when an unexpected network error occurs (e.g. an unexpected connection close).
Therefore, the communication protocol must have an ending indicator.
For example, with NNTP, a single period (".") on a line indicates the end of sending data.
With HTTP, an empty line ("") marks the end of the headers, and "Content-Length" gives the size of the body.
With a server that always sends exactly one line to a client, "\r\n" will indicate it.
I think that you have to write the server so that it sends such an indicator.
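As a sketch of the NNTP-style ending indicator, the reader below stops at a line consisting of a single period. It also undoes dot-stuffing, the NNTP convention that a data line beginning with a period is sent with an extra leading period. The class and method names are hypothetical, and the server is assumed to apply the same convention.

```java
import java.io.BufferedReader;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class DotTerminatedReader {
    // Reads lines until a lone "." is seen; the terminator itself is not returned.
    static List<String> readBlock(BufferedReader in) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {
            if (line.equals(".")) {
                return lines; // ending indicator
            }
            // Remove the extra leading dot the sender added to lines starting with ".".
            lines.add(line.startsWith(".") ? line.substring(1) : line);
        }
        // null before the terminator: the connection ended mid-response.
        throw new EOFException("connection closed before terminating \".\" line");
    }
}
```

With an explicit terminator, a null from readLine() before the terminator can be treated as an error rather than as a normal end of data, whatever the underlying cause.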
EJP - 22 Oct 2006 08:51 GMT
> As far as I know,
> 'null' does not necessarily indicate that the server has finished
> sending data. 'null' may be returned when an unexpected
> network error occurs (e.g. an unexpected connection close).
This is 100% incorrect.
Red Orchid - 22 Oct 2006 13:00 GMT
EJP <esmond.not.pitt@not.bigpond.com> wrote or quoted in Message-ID: <exF_g.51086$rP1.23887@news-server.bigpond.net.au>:
> > As far as I know,
> > 'null' does not necessarily indicate that the server has finished
> > sending data. 'null' may be returned when an unexpected
> > network error occurs (e.g. an unexpected connection close).
>
> This is 100% incorrect.
Let's assume that you have an external modem for networking.
Execute the class 'Test'. If you turn off the modem in the middle of receiving data, this code returns 'null', not an exception.
In that case the 'null' does not indicate that the server has finished sending data, because the modem was turned off.
<code>
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
public class Test {
    public static void main(String[] args) throws Exception {
        process();
    }
    static void process() {
        String url = "http://Your Server/Your Data File";
        BufferedReader br = null;
        try {
            HttpURLConnection uc;
            URL u = new URL(url);
            uc = (HttpURLConnection) u.openConnection();
            int code = uc.getResponseCode();
            if (code != 200) {
                System.out.println("Err Code: " + code);
                return;
            }
            InputStreamReader ir;
            InputStream in;
            in = uc.getInputStream();
            ir = new InputStreamReader(in);
            br = new BufferedReader(ir);
            while (true) {
                if (br.readLine() == null) {
                    System.out.println("-< null >-");
                    break;
                }
                System.out.println("Receiving ...");
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (br != null) {
                try {
                    br.close();
                } catch (Exception e) {
                }
            }
        }
    }
}
</code>
EJP - 22 Oct 2006 13:34 GMT
> If you turn off the modem in the middle of receiving data,
> this code returns 'null', not an exception.
Under the rules of TCP/IP you should get either nothing or an IOException 'connection reset by peer'.