Java Forum / General / February 2007

'\u000a' and '\u000d'

dimakura - 19 Feb 2007 14:33 GMT
i found through a web search why i cannot use

////////////////////

char c = '\u000a'

////////////////////

but i cannot find why i cannot use

////////////////////

// char c = '\u000a'

////////////////////

is it because '\u000a' is equivalent to \n and this type of comment
is single-line?

thanks,
dimitri
Oliver Wong - 19 Feb 2007 14:52 GMT
>i found through a web search why i cannot use
>
[quoted text clipped - 14 lines]
> is it because '\u000a' is equivalent to \n and this type of comment
> is single-line?

   The conversion of Unicode escape sequences to characters happens after
the source file is read but before it is tokenized and parsed for
compilation.

   So javac will read in the file, and thus get:

////////////////////
// char c = '\u000a'
////////////////////

   Then it will convert Unicode escape sequences to their equivalent
characters and get:

////////////////////
// char c = '
'
////////////////////

   And then it will try to compile this code, and it'll fail with an error
along the lines of "unclosed character literal".

   - Oliver
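
For illustration, the translation step described in this reply can be modelled with a small sketch. The class name `UnicodeEscapeSketch` and its `translate` method are hypothetical, not part of javac or the JDK, and the model is deliberately simplified: it only handles well-formed escapes and counts backslashes in the raw input.

```java
public class UnicodeEscapeSketch {

    // True for the ASCII hex digits allowed in an escape.
    static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // Simplified model of the first translation step: every eligible
    // backslash + 'u' (one or more) + four hex digits becomes one character.
    static String translate(String src) {
        StringBuilder out = new StringBuilder();
        int backslashes = 0; // length of the backslash run just before index i
        int i = 0;
        while (i < src.length()) {
            char ch = src.charAt(i);
            // A backslash can start an escape only after an even number of backslashes.
            if (ch == '\\' && backslashes % 2 == 0) {
                int j = i + 1;
                while (j < src.length() && src.charAt(j) == 'u') {
                    j++;
                }
                boolean hasU = j > i + 1;
                if (hasU && j + 4 <= src.length()
                        && isHexDigit(src.charAt(j)) && isHexDigit(src.charAt(j + 1))
                        && isHexDigit(src.charAt(j + 2)) && isHexDigit(src.charAt(j + 3))) {
                    out.append((char) Integer.parseInt(src.substring(j, j + 4), 16));
                    i = j + 4;
                    backslashes = 0;
                    continue;
                }
            }
            backslashes = (ch == '\\') ? backslashes + 1 : 0;
            out.append(ch);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslash keeps this string literal from being translated itself.
        String line = "// char c = '\\u000a'";
        for (String l : translate(line).split("\n")) {
            System.out.println("[" + l + "]");
        }
    }
}
```

Run as written, the sketch prints the commented line as two separate lines, which is the text the tokenizer actually sees.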
Gordon Beaton - 19 Feb 2007 15:02 GMT
> is it because '\u000a' is equivalent to \n and this type of
> comment is single-line?

Yes.

Read section 3.2 of the JLS, which describes translation of the input
to the compiler. The Unicode escape sequences are translated into
their corresponding Unicode characters, *then* the resulting sequence
of characters is tokenized.

So when you escape a line feed as you've done, you are essentially
writing this (illegal) code:

 char c = '
 '

i.e. the closing quote ends up on the following line.

Similarly, commenting the line results in this invalid sequence:

 // char c = '
 '

/gordon

[ don't email me support questions or followups ]
g o r d o n  +  n e w s  @  b a l d e r 1 3 . s e
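
As a complement to the explanation above, here is a minimal sketch of ways to write the line feed and carriage return characters without placing a raw line terminator inside the literal. The class and field names are hypothetical and chosen only for this illustration.

```java
public class LineTerminatorChars {
    // Escape sequences are resolved when the literal itself is scanned,
    // after Unicode escapes have already been translated, so these are safe.
    static final char LF = '\n';
    static final char CR = '\r';

    // Equivalent spellings that also keep the line terminator out of the source text.
    static final char LF_CAST = (char) 0x0a;
    static final char CR_OCTAL = '\015';

    public static void main(String[] args) {
        System.out.println((int) LF); // prints 10
        System.out.println((int) CR); // prints 13
    }
}
```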

Knute Johnson - 20 Feb 2007 23:02 GMT
> Read section 3.2 of the JLS, which describes translation of the input
> to the compiler. The unicode escape sequences are translated into
[quoted text clipped - 15 lines]
>
> /gordon

Gordon:

char c = \u0027\u002a\u0027\u003b

Do you know why they would process the Unicode escapes before determining
whether they are part of a comment or literal?  It does provide for some
great obfuscation.  I'm really glad it wasn't me that ran across this; I
could have spent days trying to figure this one out :-).

Knute Johnson
email s/nospam/knute/
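
To make the one-liner in this post concrete, here is a small sketch that wraps it in a class. The class and method names are hypothetical. After escape translation the declaration reads as an ordinary character literal holding an asterisk, followed by a semicolon.

```java
public class EscapedTokens {
    static char star() {
        // Quotes, asterisk and semicolon are all written as Unicode escapes.
        char c = \u0027\u002a\u0027\u003b
        return c;
    }

    public static void main(String[] args) {
        System.out.println(star()); // prints *
    }
}
```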

Chris Uppal - 20 Feb 2007 23:55 GMT
> char c = \u0027\u002a\u0027\u003b
>
> Do you know why they would process the Unicode escapes before determining
> whether they are part of a comment or literal?

I presume the idea is to allow the use of Unicode characters in identifiers and
comments without making the source completely inaccessible to people using
non-Unicode editors.  Also to allow for the case where the source has to be
manipulated by non-Unicode programs (source code control, and so on).

   -- chris
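
A minimal sketch of the use case suggested above, assuming a source file that must stay pure ASCII: the identifier below is the Greek letter pi, written entirely with escapes. The class and method names are hypothetical.

```java
public class EscapedIdentifier {
    static int piRounded() {
        // The variable name is the single letter U+03C0, spelled as an escape.
        int \u03c0 = 3;
        return \u03c0;
    }

    static int piNameLength() {
        // The same escape inside a string literal yields one character.
        return "\u03c0".length();
    }

    public static void main(String[] args) {
        System.out.println(piRounded());    // prints 3
        System.out.println(piNameLength()); // prints 1
    }
}
```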
Knute Johnson - 21 Feb 2007 00:16 GMT
>> char c = \u0027\u002a\u0027\u003b
>>
[quoted text clipped - 7 lines]
>
>     -- chris

I guess you have to make the rule one way or the other and this is the
way.  It does make for some really interesting traps though.

Knute Johnson
email s/nospam/knute/

Chris Uppal - 19 Feb 2007 15:03 GMT
> but i cannot find why i cannot use
> // char c = '\u000a'
> is it because '\u000a' is equivalent to \n and this type of comment
> is single-line?

Yes, exactly right.

   -- chris
Andreas Leitgeb - 19 Feb 2007 19:10 GMT
>> but i cannot find why i cannot use
>> // char c = '\u000a'
>> is it because '\u000a' is equivalent to \n and this type of comment
>> is single-line?
> Yes, exactly right.

And to test whether you've really understood,
predict what the compiler will say to this:

// char c = '\u000a//'

// :-)
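
One way to check a prediction is to put the line into a small class and compile it. In this sketch (class and method names are hypothetical) the escape splits the comment in two, and the second half begins with `//`, so it is again a comment and the class compiles.

```java
public class QuizLine {
    static int marker() {
        int marker = 1;
        // char c = '\u000a//'
        return marker;
    }

    public static void main(String[] args) {
        System.out.println(marker()); // prints 1
    }
}
```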
dimakura - 20 Feb 2007 09:22 GMT
On Feb 19, 7:10 pm, Andreas Leitgeb <a...@gamma.logic.tuwien.ac.at>
wrote:
> >> but i can not find why i can not use
> >> // char c = '\u000a'
[quoted text clipped - 8 lines]
>
> // :-)

yes, i understand: the new line begins with a comment!
ok.

just to test myself:

this is not an error:

// \u000a

but this is an error:

// \u000a something_else

where "something_else" is anything other than whitespace or a correct
Java-style comment
Patricia Shanahan - 20 Feb 2007 09:39 GMT
> On Feb 19, 7:10 pm, Andreas Leitgeb <a...@gamma.logic.tuwien.ac.at>
> wrote:
[quoted text clipped - 25 lines]
> where "something_else" is anything other than whitespace or a correct
> Java-style comment

That is not an error in itself. It is two lines of code, and something_else is on
the second line, not part of the one-line comment. In the following
valid program, ("Hello, world"); is neither whitespace nor a Java-style comment.

public class HelloWorld{
  public static void main(String[] args){
   System.out.println // \u000a ("Hello, world");
  }
}

Patricia
Gordon Beaton - 20 Feb 2007 09:44 GMT
> but this is an error:
>
> // \u000a something_else
>
> where "something_else" is anything other than whitespace or a correct
> Java-style comment

Not just comments and whitespace. It's valid if something_else is
anything that can appear at the start of a line, including statements
or declarations, etc, in the context of the most recent non-comment
before this line, e.g.:

 public class
 // \u000a Foo {
 }

/gordon

[ don't email me support questions or followups ]
g o r d o n  +  n e w s  @  b a l d e r 1 3 . s e
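
The same rule also means that text after the escape is executed as ordinary code, which is the kind of trap mentioned earlier in the thread. A minimal sketch, with hypothetical class and method names:

```java
public class HiddenStatement {
    static int count() {
        int n = 0;
        // this looks like a comment, but the tail is real code \u000a n++;
        return n;
    }

    public static void main(String[] args) {
        System.out.println(count()); // prints 1
    }
}
```

The increment runs because, once the escape has been translated, it sits on its own line after the comment has ended.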

dimakura - 21 Feb 2007 05:08 GMT
> > but error is
>
[quoted text clipped - 17 lines]
> [ don't email me support questions or followups ]
> g o r d o n  +  n e w s  @  b a l d e r 1 3 . s e

i agree, my formulation was not precise enough.
thanks.

