Java Forum / General / July 2005
Enum enlightenment
Roedy Green - 08 Jul 2005 14:45 GMT I wrote a simple enum-using class and decompiled it. Now all sorts of things about enum make sense.
To understand this, paste the two listings into separate documents and view them side by side in your IDE.
Here is the original code, an enum to track the various flavours of Windows:
package com.mindprod.htmlmacros;

import java.util.EnumSet;
import java.util.Set;

/**
 * enum of possible Windows OSes. May be used freely for any purpose but military.
 * @author Roedy Green copyright 2005 Canadian Mind Products
 */
public enum WindowsOS {
    WIN95( "W95", "Windows 95" ),
    WIN98( "W98", "Windows 98" ),
    WINME( "Me", "Windows Me" ),
    WINNT( "NT", "Windows NT" ),
    WIN2K( "W2K", "Windows 2000" ),
    WINXP( "XP", "Windows XP" ),
    WIN2K3( "W2K3", "Windows 2003" );

    private String shortName;
    private String longName;
    private static boolean DEBUGGING = true;

    /**
     * Enum constant constructor that captures two extra facts about the enum.
     * @param shortName short name for the OS, e.g. "Me"
     * @param longName long name of the OS, e.g. "Windows XP"
     */
    WindowsOS ( String shortName, String longName ) {
        this.shortName = shortName;
        this.longName = longName;
    }

    /**
     * @return short name
     */
    public String getShortName () {
        return this.shortName;
    }

    /**
     * @return long name
     */
    public String getLongName () {
        return this.longName;
    }

    /**
     * Static method to construct a string mentioning multiple OSes,
     * separated by slashes.
     * @param choices EnumSet of just the OSes you want included
     * @return a String of the form "Windows 95/98/Me"
     */
    public static String OSes( EnumSet<WindowsOS> choices ) {
        StringBuilder sb = new StringBuilder( 40 );
        for ( WindowsOS o : choices ) {
            sb.append( '/' );
            sb.append( o.shortName );
        }
        if ( sb.length() == 0 ) {
            return "";
        } else {
            // chop lead / and prepend "Windows "
            return "Windows " + sb.toString().substring( 1 );
        }
    }

    /**
     * test harness
     *
     * @param args not used
     */
    public static void main ( String[] args ) {
        if ( DEBUGGING ) {
            // You don't use a constructor to create EnumSet objects.
            EnumSet<WindowsOS> justThese = EnumSet.of( WIN2K, WINXP, WINME );

            // prints "Windows Me/W2K/XP"
            // note they come out in proper order.
            System.out.println( WindowsOS.OSes ( justThese ) );
        }
    }
}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here is the decompiled code, showing what actually makes it into byte code.
package com.mindprod.htmlmacros;

import java.io.PrintStream;
import java.util.EnumSet;
import java.util.Iterator;

public final class WindowsOS extends Enum {

    public static final WindowsOS[] values() {
        return (WindowsOS[])$VALUES.clone();
    }

    public static WindowsOS valueOf(String s) {
        return (WindowsOS)Enum.valueOf(com/mindprod/htmlmacros/WindowsOS, s);
    }

    private WindowsOS(String s, int i, String s1, String s2) {
        super(s, i);
        shortName = s1;
        longName = s2;
    }

    public String getShortName() {
        return shortName;
    }

    public String getLongName() {
        return longName;
    }

    public static String OSes(EnumSet enumset) {
        StringBuilder stringbuilder = new StringBuilder(40);
        WindowsOS windowsos;
        for(Iterator iterator = enumset.iterator(); iterator.hasNext(); stringbuilder.append(windowsos.shortName)) {
            windowsos = (WindowsOS)iterator.next();
            stringbuilder.append('/');
        }

        if(stringbuilder.length() == 0)
            return "";
        else
            return (new StringBuilder()).append("Windows ").append(stringbuilder.toString().substring(1)).toString();
    }

    public static void main(String args[]) {
        if(DEBUGGING) {
            EnumSet enumset = EnumSet.of(WIN2K, WINXP, WINME);
            System.out.println(OSes(enumset));
        }
    }

    public static final WindowsOS WIN95;
    public static final WindowsOS WIN98;
    public static final WindowsOS WINME;
    public static final WindowsOS WINNT;
    public static final WindowsOS WIN2K;
    public static final WindowsOS WINXP;
    public static final WindowsOS WIN2K3;
    private String shortName;
    private String longName;
    private static boolean DEBUGGING = true;
    private static final WindowsOS $VALUES[];

    static {
        WIN95 = new WindowsOS("WIN95", 0, "W95", "Windows 95");
        WIN98 = new WindowsOS("WIN98", 1, "W98", "Windows 98");
        WINME = new WindowsOS("WINME", 2, "Me", "Windows Me");
        WINNT = new WindowsOS("WINNT", 3, "NT", "Windows NT");
        WIN2K = new WindowsOS("WIN2K", 4, "W2K", "Windows 2000");
        WINXP = new WindowsOS("WINXP", 5, "XP", "Windows XP");
        WIN2K3 = new WindowsOS("WIN2K3", 6, "W2K3", "Windows 2003");
        $VALUES = (new WindowsOS[] { WIN95, WIN98, WINME, WINNT, WIN2K, WINXP, WIN2K3 });
    }
}
Note how Java generates some methods for you in the same class!
It composes a values method and a valueOf method for you; this valueOf does not need a Class parameter.
It makes your constructor private.
It adds two extra hidden parameters to your constructor, the enum name and the ordinal. This means the enum constants don't have to count themselves or register themselves. That is all done at compile time.
It creates a static final for each enum constant and the code to initialise them using your constructor.
It creates a constant array of enum objects, one of each flavour, called $VALUES[], to use in the values method. It can also be used by the name method to convert
In this case no enum constant had any of its own fields or methods.
Note the true enum class is hard coded in all over the place. This is no object-type erasure crap.
The $VALUES array could have been used by methods like first, last, count, ordinalToEnum, but I have not found any trace of such methods. You can't get at $VALUES without patching byte code since that is not a legal Java identifier. So I guess every time you want that information you need to call values() to clone the array just to find out how long it is, or to index it in a read-only way to convert an ordinal back to an enum.
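To make the points above concrete, here is a minimal sketch. Flavour, CACHED, fromOrdinal and count are hypothetical names invented for this illustration; they are not part of the WindowsOS class above or of the JDK. It shows name() and ordinal() reporting the two hidden constructor arguments, values() handing back a fresh clone on every call, and one possible workaround: cloning once into your own private array.

```java
import java.util.Arrays;

// Hypothetical cut-down enum for illustration only.
enum Flavour {
    WIN95, WIN98, WINME;

    // One private copy, so later lookups do not clone the array again.
    private static final Flavour[] CACHED = values();

    /** Converts an ordinal back to its constant without another clone. */
    static Flavour fromOrdinal(int ordinal) {
        return CACHED[ordinal];
    }

    /** Number of constants, again without cloning. */
    static int count() {
        return CACHED.length;
    }
}

class GeneratedMembersDemo {
    public static void main(String[] args) {
        // name() and ordinal() report the two hidden constructor arguments.
        System.out.println(Flavour.WINME.name() + " has ordinal " + Flavour.WINME.ordinal());

        // The generated valueOf(String) needs no Class parameter.
        System.out.println(Flavour.valueOf("WIN98") == Flavour.WIN98);

        // Each values() call returns a distinct array with equal contents.
        Flavour[] a = Flavour.values();
        Flavour[] b = Flavour.values();
        System.out.println(a != b && Arrays.equals(a, b));

        System.out.println(Flavour.fromOrdinal(2) + " of " + Flavour.count());
    }
}
```

Whether the extra field is worth it depends on how often you do the lookup; for occasional use, values()[ordinal] is simpler.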
Note that the generic EnumSet handles all enums. There is no corresponding customised code generated for the EnumSet. It is not obvious from this code, but the bit masks used in EnumSet computations are not built into the enum constants. They are generated from the ordinal number as needed on the fly with shifting and masking. It is also not obvious from this code, but EnumSet.of figures out the class of the enums by looking up the class of the first parameter. There is NOT an EnumSet class generated for each Enum class.
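A rough sketch of the mask idea follows. Os, EnumSetMaskDemo and maskOf are made-up names for this example, and maskOf only imitates the shift-by-ordinal technique described above; it is not the JDK's EnumSet code.

```java
import java.util.EnumSet;

// Hypothetical stand-in for the WindowsOS constants, without the extra fields.
enum Os { WIN95, WIN98, WINME, WINNT, WIN2K, WINXP, WIN2K3 }

class EnumSetMaskDemo {
    /** Builds a bit mask on the fly from ordinals: one bit per member. */
    static long maskOf(EnumSet<Os> set) {
        long mask = 0L;
        for (Os os : set) {
            mask |= 1L << os.ordinal();
        }
        return mask;
    }

    public static void main(String[] args) {
        // No Class argument here: the element type comes from the constants themselves.
        EnumSet<Os> justThese = EnumSet.of(Os.WIN2K, Os.WINXP, Os.WINME);

        // Iteration follows ordinal order, not the order of the arguments.
        System.out.println(justThese);

        // Bits 2, 4 and 5 are set.
        System.out.println(Long.toBinaryString(maskOf(justThese)));

        // A single constant is enough to recover its enum class.
        System.out.println(Os.WINME.getDeclaringClass() == Os.class);
    }
}
```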
 Signature Bush crime family lost/embezzled $3 trillion from Pentagon. Complicit Bush-friendly media keeps mum. Rumsfeld confesses on video. http://www.infowars.com/articles/us/mckinney_grills_rumsfeld.htm
Canadian Mind Products, Roedy Green. See http://mindprod.com/iraq.html photos of Bush's war crimes
Roedy Green - 08 Jul 2005 15:28 GMT Here is what happens when you give your enum constants their own private methods and variables:
enum ....
    WIN2K( "W2K", "Windows 2000" ) {
        private int p;
        int cost () { return 200; }
    },
    WINXP( "XP", "Windows XP" ) {
        private int q;
        int cost () { return 300; }
    },
    WIN2K3( "W2K3", "Windows 2003" );
...
this generates:
WINNT = new WindowsOS("WINNT", 3, "NT", "Windows NT");
WIN2K = new WindowsOS("WIN2K", 4, "W2K", "Windows 2000") {
    int cost() { return 200; }

    private int p;
};
WINXP = new WindowsOS("WINXP", 5, "XP", "Windows XP") {
    int cost() { return 300; }

    private int q;
};
WIN2K3 = new WindowsOS("WIN2K3", 6, "W2K3", "Windows 2003");
In other words, each of those little enum constants becomes its own little anonymous inner class.
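You can observe the extra classes at run time without a decompiler. The sketch below uses a hypothetical enum Priced (not the WindowsOS code above) with a default cost() so that every constant can be called the same way. A constant with a body reports a different getClass() from the enum itself, while getDeclaringClass() still returns the enum.

```java
// Hypothetical enum for illustration; CHEAP and DEAR have their own class bodies.
enum Priced {
    CHEAP {
        @Override int cost() { return 200; }
    },
    DEAR {
        @Override int cost() { return 300; }
    },
    FREE;

    // Default used by constants that have no body of their own.
    int cost() { return 0; }
}

class ConstantBodyDemo {
    public static void main(String[] args) {
        // A constant with a body is an instance of a subclass of the enum.
        System.out.println(Priced.CHEAP.getClass() == Priced.class);                 // false
        System.out.println(Priced.CHEAP.getClass().getSuperclass() == Priced.class); // true

        // A plain constant is an instance of the enum class itself.
        System.out.println(Priced.FREE.getClass() == Priced.class);                  // true

        // getDeclaringClass() gives the enum type either way.
        System.out.println(Priced.CHEAP.getDeclaringClass() == Priced.class);        // true

        System.out.println(Priced.DEAR.cost());                                      // 300
    }
}
```

This is why getDeclaringClass(), rather than getClass(), is the safer way to ask which enum a constant belongs to.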
Martijn Mulder - 08 Jul 2005 16:14 GMT > I wrote a simple enum-using class and decompiled it. <snip> Tell me Roedy, how do you decompile a .class file? javap gives me an overview of the methods in the class, not the code within the methods. The switches I tried (-c, -h, -l, -p, -s, -v) did not give me a 'machine formatted' version of my .java files.
Roedy Green - 09 Jul 2005 11:15 GMT >Tell me Roedy, how do you decompile a .class file? javap gives me an overview of >the methods in the class, not the code within the methods. The switches I tried >(-c, -h, -l, -p, -s, -v) did not give me a 'machine formatted' version of my >.java files. see http://mindprod.com/jgloss/decompiler.html and http://mindprod.com/jgloss/disassembler.html
Paul - 08 Jul 2005 19:17 GMT > The $VALUE array could have been used by methods like > first, last, count, ordinalToEnum, but I have not found any trace of [quoted text clipped - 3 lines] > to find out how long it is, or to index it in a read only way to > convert ordinal back to enum. In Java, you can use a dollar sign as part of a legal Java identifier. I think the $VALUES array isn't created until some second pass after the compiler has validated your actual .java file but before it translates it into the implementation behind the idiom.
Maybe "legal java identifier" isn't what you meant, but that the symbol is undefined.
public enum TestDollar {
    ONE, TWO;

    private int $dollarvar;

    public int get() { return $dollarvar; }
    public void set(int d) { $dollarvar = d; }

    public void voidfunc() {
        TestDollar[] vals = $VALUES; // compiler says 'cannot find symbol' for $VALUES
    }
}
--Paul
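For anyone who wants to see the hidden array without a decompiler, reflection will list it even though source code cannot name it. This is only a sketch: TwoValues, HiddenFieldDemo and arrayFieldNames are hypothetical names, and the name of the generated field is a compiler detail (javac output such as the decompiled listing above uses $VALUES; other compilers may choose differently), so the code looks for the field by type rather than by name.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical enum for illustration.
enum TwoValues { ONE, TWO }

class HiddenFieldDemo {
    /** Names of declared fields whose type is an array of the enum itself. */
    static List<String> arrayFieldNames(Class<?> enumClass) {
        List<String> names = new ArrayList<>();
        for (Field f : enumClass.getDeclaredFields()) {
            Class<?> type = f.getType();
            if (type.isArray() && type.getComponentType() == enumClass) {
                // isSynthetic() is true for members the compiler added on its own.
                names.add(f.getName() + (f.isSynthetic() ? " (synthetic)" : ""));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // '$' is accepted at the start of an identifier.
        System.out.println(Character.isJavaIdentifierStart('$'));

        // The generated array is visible to reflection; its name depends on the compiler.
        System.out.println(arrayFieldNames(TwoValues.class));
    }
}
```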
Roedy Green - 09 Jul 2005 09:46 GMT >Maybe "legal java identifier" isn't what you meant, but that the symbol >is undefined. I scanned my text books and the web and could not get a definitive answer on just what chars are allowed in identifiers: 1. in JVM byte code. 2. in java source.
I wanted not just to know what the current compiler lets you have, but what the language standard guarantees.
I suppose it can be tested by experiment. Is eacute OK? Chinese characters? Math symbols? The \u notation is pretty ugly. I'd need a Unicode text editor to do the proper experiments.
My personal rule has been to use nothing but A-Z a-z 0-9 and _, and _ only in the middle of constant names. A similar question is just how long can an identifier be? Natural limits due to bit sizes for field lengths are 31, 255 and 32,767. I suppose that could be an implementation detail.
Tim Tyler - 09 Jul 2005 16:19 GMT Roedy Green <look-on@mindprod.com.invalid> wrote or quoted:
> A similar question is just how long can an Identifier be? Natural > limits due to bit sizes for field lengths are 31, 255 and 32,767. I > suppose that could be an implementation detail. ``The length of field and method names, field and method descriptors, and other constant string values is limited to 65535 characters by the 16-bit unsigned length item of the CONSTANT_Utf8_info structure (§4.4.7). Note that the limit is on the number of bytes in the encoding and not on the number of encoded characters. UTF-8 encodes some characters using two or three bytes. Thus, strings incorporating multibyte characters are further constrained.''
- http://java.sun.com/docs/books/vmspec/2nd-edition/html/ClassFile.doc.html#88659
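The byte-versus-character distinction in that quote can be tried out with DataOutputStream.writeUTF, which also writes a 16-bit length followed by modified UTF-8 and so has the same 65535-byte ceiling. Note that this is an analogy chosen for the sketch, not a class file writer; Utf8LimitDemo, fitsInUtf8Item and repeat are hypothetical helper names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;

class Utf8LimitDemo {
    /** True if the string's modified UTF-8 form fits behind a 16-bit length. */
    static boolean fitsInUtf8Item(String s) {
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        try {
            out.writeUTF(s);
            return true;
        } catch (UTFDataFormatException tooLong) {
            // writeUTF rejects strings whose encoding exceeds 65535 bytes.
            return false;
        } catch (IOException e) {
            // Not expected when writing to an in-memory stream.
            throw new IllegalStateException(e);
        }
    }

    /** Builds a string of n copies of c. */
    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'a' takes one byte, so 65535 characters fit and 65536 do not.
        System.out.println(fitsInUtf8Item(repeat('a', 65535)));      // true
        System.out.println(fitsInUtf8Item(repeat('a', 65536)));      // false

        // e-acute takes two bytes, so the character limit roughly halves.
        System.out.println(fitsInUtf8Item(repeat('\u00e9', 32767))); // true
        System.out.println(fitsInUtf8Item(repeat('\u00e9', 32768))); // false
    }
}
```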
 Signature __________ |im |yler http://timtyler.org/ tim@tt1lock.org Remove lock to reply.
Dale King - 16 Jul 2005 06:59 GMT > Roedy Green <look-on@mindprod.com.invalid> wrote or quoted: > [quoted text clipped - 11 lines] > > - http://java.sun.com/docs/books/vmspec/2nd-edition/html/ClassFile.doc.html#88659 Early on, 1.5 was supposed to include support for removing some of the class file size limitations (particularly only 64K for a method body), but somehow it didn't make the final cut.
It's still being worked on under JSR202.
 Signature Dale King
Raymond DeCampo - 09 Jul 2005 16:34 GMT >>Maybe "legal java identifier" isn't what you meant, but that the symbol >>is undefined. [quoted text clipped - 6 lines] > I wanted not just to know what the current compiler lets you have, but > what the language standard guarantees. Well, did you try reading it?
http://java.sun.com/docs/books/jls/second_edition/html/j.title.doc.html
> I suppose it can be tested by experiment. is eacute ok? Chinese > characters? math symbols? the \u notation is pretty ugly. I'd need a [quoted text clipped - 6 lines] > limits due to bit sizes for field lengths are 31, 255 and 32,767. I > suppose that could be an implementation detail. HTH, Ray
 Signature XML is the programmer's duct tape.
Paul Bilnoski - 09 Jul 2005 17:04 GMT > http://java.sun.com/docs/books/jls/second_edition/html/j.title.doc.html FYI, third edition is out with the appropriate changes for "Generics, annotations, asserts, autoboxing and unboxing, enum types, foreach loops, variable arity methods and static imports". http://java.sun.com/docs/books/jls/
--Paul
Roedy Green - 10 Jul 2005 05:27 GMT >Well, did you try reading it? > >http://java.sun.com/docs/books/jls/second_edition/html/j.title.doc.html The relevant section is:
http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html#40625
That's Patricia Shanahan's job. I detest reading such lawyerly documents that try their hardest to hide the plain meaning.
A straightforward reading of the standard would say you CAN put - in your identifier names, but I know you can't.
The example he gives of a legal identifier violates the first Java letter rule.
Perhaps a lawyer can make sense of what they are trying to say. For mortals, a list of acceptable and unacceptable identifiers with reasons says more than pages of BNF or explanation.
If the standard were literally true, Java foolishly refused to reserve even the Unicode mathematical operators for future use.
Raymond DeCampo - 11 Jul 2005 02:26 GMT >>Well, did you try reading it? >> [quoted text clipped - 9 lines] > A straight forward reading of the standard would say you CAN put - in > your identifier names, but I know you can't. I don't see where you would get this from the standard.
> The example he gives of a Legal identifier violates the first java > letter rule. I don't know which example you mean. They all seem fine to me.
> Perhaps a lawyer can make sense of what they are trying say. For > mortals a list of acceptable and unacceptable identifier with reason > says for than pages of BNF or explanation. > > If the standard was literally true Java foolishly refused to reserve > even the Unicode mathematical operators for future use. I don't know where you are reading that into it.
Actually, after posting the link, I went in and read the above section on my own. I was pretty disappointed that the real "specification" for what characters may be included was punted on by saying it depends on the results of java.lang.Character.isJavaIdentifierStart() and java.lang.Character.isJavaIdentifierPart().
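Since the specification defers to those two Character methods, the experiment asked for earlier in the thread can be run directly, using \u escapes so no Unicode editor is needed. This is a small sketch; IdentifierCharDemo and isIdentifierShaped are hypothetical names, and the check looks only at characters, so it does not reject keywords such as "class".

```java
class IdentifierCharDemo {
    /** Applies isJavaIdentifierStart to the first code point and isJavaIdentifierPart to the rest. */
    static boolean isIdentifierShaped(String name) {
        if (name.isEmpty()) {
            return false;
        }
        int first = name.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < name.length(); ) {
            int cp = name.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isIdentifierShaped("caf\u00e9"));    // e-acute: true
        System.out.println(isIdentifierShaped("\u4e2d\u6587")); // Chinese characters: true
        System.out.println(isIdentifierShaped("$VALUES"));      // dollar sign: true
        System.out.println(isIdentifierShaped("a-b"));          // hyphen: false
        System.out.println(isIdentifierShaped("1abc"));         // leading digit: false
        System.out.println(isIdentifierShaped("\u2211"));       // summation sign: false
    }
}
```

So a hyphen and a maths operator such as the summation sign are both rejected, while accented letters, Chinese characters and the dollar sign are accepted.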
Delving into the documentation led me on a relatively uninteresting excursion into Unicode land.
Ray
 Signature XML is the programmer's duct tape.
Dale King - 16 Jul 2005 06:49 GMT >> Perhaps a lawyer can make sense of what they are trying say. For >> mortals a list of acceptable and unacceptable identifier with reason [quoted text clipped - 13 lines] > Delving into the documentation led me on a relatively uninteresting > excursion into Unicode land. I think the reason they don't give you the definitive list is that list is not necessarily static. As characters get added to Unicode they can get added to the list of acceptable letters for Java identifiers. They don't want to update the language spec. as Unicode support expands in Java.
How would they specify it anyway? It would take pages to list all the characters.
The rules are pretty broad. Almost anything that is a letter or digit in Unicode is acceptable.
The one area that Sun fails in this regard is the support for encodings to actually use this full Unicode set. They don't support the use of byte order marks at the start of a Java source file to indicate UTF-8, UTF-16BE, UTF-32, etc. Even Windows notepad supports that, but not Sun. All they give you is the -encoding option which is not good enough.
 Signature Dale King