Java Forum / General / May 2006
If you could add anything you want
John Gagon - 12 May 2006 16:28 GMT If you could add anything you wanted to the java language, what would it be?
I'd predict some would say the non-imperative stuff, i.e. closures or the
LISP-like abilities to work almost purely functionally or do macros. Some might get smart and say "everything dot NET (dot NYET) has".
One thing I'd like is a compiler check that prevents some organizationally stupid coding and gives better information hiding. It's rather original and, as with all such things, it may not be the best idea, which is why I bounce it here:
Example:
(public*) package com.mycomp.mysoft.persistence ignores com.mycomp.mysoft.services, com.mycomp.mysoft.web;
private package com.mycomp.mysoft.business knows com.mycomp.mysoft.persistence, com.mycomp.mysoft.services;
* the default
This could have been a java swing example just as easily. Anyhow, pardon me since it's been ten years but IIRC, C++ had friend and other features that seemed to work like this but perhaps that is more a class or type-grained feature.
Perhaps the reverse could also be asked: what would you remove from the java language if you could, or what is its biggest mistake? (Originality being more interesting here IMHO.) E.g., I know that the way package and protected work in terms of visibility is often criticized.
Thank you (in advance) for any corrections to conceptual mishaps here I might have too.
John Gagon Struggling Software Engineer
VisionSet - 12 May 2006 16:29 GMT > If you could add anything you wanted to the java language, what would > it be? Extending the OO paradigm from just classes to package structures. They could be so much more than simple name spaces.
-- Mike W
John Gagon - 12 May 2006 17:08 GMT > > If you could add anything you wanted to the java language, what would > > it be? > > Extending the OO paradigm from just classes to package structures. They > could be so much more than simple name spaces. Yes, this would be very cool. I'm just thinking about the possibilities of this and eliminating duplication, abstracting with packages and doing more with various kinds of dependencies. Those dependencies could be externally defined for example or there could be some further constraint on all classes in those packages placed in there. It's boggling but I can see already it would be very useful.
Oliver Wong - 12 May 2006 16:48 GMT > If you could add anything you wanted to the java language, what would > it be? I wrote about this before: Syntactic sugar for "final from now on". I.e. instead of:
<code>
final String myFinalValue;
{
  String temp = null;
  for (Foo f : bar) {
    if (f.someCondition) {
      temp = f.toString();
      break;
    }
  }
  if (temp == null) {
    myFinalValue = "";
  } else {
    myFinalValue = temp;
  }
}

new AnonymousClass() {
  public void method() {
    System.out.println(myFinalValue);
  }
};
</code>
something like
<code>
String myFinalValue = "";
for (Foo f : bar) {
  if (f.someCondition) {
    myFinalValue = f.toString();
    break;
  }
}
finalize myFinalValue; // proposed syntax, not valid Java

new AnonymousClass() {
  public void method() {
    System.out.println(myFinalValue);
  }
};
</code>
- Oliver
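One way to get a similar effect in Java as it stands is to move the search into a small helper method, so that the variable can be declared `final` and initialized in a single statement. A minimal sketch, assuming hypothetical names (`FinalFromNowOn`, `firstNonEmpty`) and using plain strings in place of the `Foo` objects above:

```java
import java.util.Arrays;
import java.util.List;

class FinalFromNowOn {
    // Hypothetical helper: returns the first non-empty string, or "" if there is none.
    static String firstNonEmpty(List<String> candidates) {
        for (String s : candidates) {
            if (s.length() > 0) {
                return s;
            }
        }
        return "";
    }

    public static void main(String[] args) {
        List<String> bar = Arrays.asList("", "first", "second");
        // Assigned exactly once, so it can be final and captured by the inner class.
        final String myFinalValue = firstNonEmpty(bar);
        Runnable printer = new Runnable() {
            public void run() {
                System.out.println(myFinalValue);
            }
        };
        printer.run();
    }
}
```

This does not give the proposed "final from now on" semantics; it only sidesteps the temporary variable by making the assignment a single expression.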
John Gagon - 12 May 2006 17:12 GMT > > If you could add anything you wanted to the java language, what would > > it be? [quoted text clipped - 46 lines] > > - Oliver I see, so you basically prevent further assignments with a compiler check and it would eliminate potential side effects. A "single" assignment declaration (but delayed) might be useful too like a finalize_after_assignment String myvar; (but it would be some single short keyword)
John
Hendrik Maryns - 15 May 2006 10:03 GMT Oliver Wong schreef:
>> If you could add anything you wanted to the java language, what would >> it be? > > I wrote about this before: Syntactic sugar for "final from now on". > I.e. instead of: Indeed, a situation where I have thought a few times: I’d like Eiffel’s once functions here.
H. - -- Hendrik Maryns
================== http://aouw.org Ask smart questions, get good answers: http://www.catb.org/~esr/faqs/smart-questions.html
Eric Sosman - 12 May 2006 18:08 GMT John Gagon wrote On 05/12/06 11:28,:
> If you could add anything you wanted to the java language, what would > it be? An antidote for Eternal September.
 Signature Eric.Sosman@sun.com
Roedy Green - 12 May 2006 19:43 GMT > An antidote for Eternal September. http://www.answers.com/topic/eternal-september
All time since September 1993. One of the seasonal rhythms of the Usenet used to be the annual September influx of clueless newbies who, lacking any sense of netiquette, made a general nuisance of themselves. This coincided with people starting college, getting their first internet accounts, and plunging in without bothering to learn what was acceptable. These relatively small drafts of newbies could be assimilated within a few months. But in September 1993, AOL users became able to post to Usenet, nearly overwhelming the old-timers' capacity to acculturate them; to those who nostalgically recall the period before, this triggered an inexorable decline in the quality of discussions on newsgroups. Syn. eternal September. See also AOL!.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
John Gagon - 13 May 2006 07:31 GMT > > An antidote for Eternal September. > [quoted text clipped - 11 lines] > period before, this triggered an inexorable decline in the quality of > discussions on newsgroups. Syn. eternal September. See also AOL!. An obscurely guised dig at the OP, namely myself then or perhaps just a statement of affairs in general...
John Gagon
alexandre_paterson@yahoo.fr - 12 May 2006 20:24 GMT > If you could add anything you wanted to the java language, what > would it be? The one and only true Design by Contract, as defined by its inventor.
This should imply also *removing* all those unnecessary gotos from the language (ie the over-abused and misused exceptions) and leave exceptions, well, for really exceptional conditions (in Eiffel that happens only when some part breaks a contract).
I've been following closely stuff like JML, Nice, etc. and more recently Contract4J.
But real DbC integrating nicely in my IDE and having that IDE report possibly broken contracts in real-time would be gorgeous (that can be seen on demos of Microsoft's Spec# "specsharp" research language and it is *really* impressive).
Link to an overview here (works fine under OpenOffice):
http://research.microsoft.com/~leino/papers/SpecSharp-MPI-SS.ppt
For the moment I'm stuck with IntelliJ IDEA's @NotNull Java 1.5 annotation... It's already a good addition to the language (some would say it's just a fix for a language defect ;)
http://www.jetbrains.com/idea/features/newfeatures.html#nullable
I can't wait to be able to specify real contracts on my abstract data types!
Of course YMMV,
Alex
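Without language support, contracts are usually approximated by hand-written checks. A minimal sketch, assuming a hypothetical `BoundedCounter` class, with the precondition, postcondition and invariant spelled out as explicit runtime checks. This only imitates what Eiffel or Spec# verify as part of the language; it is not how those tools work:

```java
class BoundedCounter {
    private final int max;
    private int value;

    BoundedCounter(int max) {
        // Precondition on construction.
        if (max <= 0) {
            throw new IllegalArgumentException("precondition: max > 0");
        }
        this.max = max;
    }

    void increment() {
        // Precondition: the caller must not increment a full counter.
        if (value >= max) {
            throw new IllegalStateException("precondition: value < max");
        }
        int old = value;
        value++;
        // Postcondition: the value grew by exactly one.
        if (value != old + 1) {
            throw new AssertionError("postcondition: value == old + 1");
        }
        checkInvariant();
    }

    int value() {
        return value;
    }

    // Class invariant, checked after every state change.
    private void checkInvariant() {
        if (value < 0 || value > max) {
            throw new AssertionError("invariant: 0 <= value <= max");
        }
    }

    public static void main(String[] args) {
        BoundedCounter c = new BoundedCounter(1);
        c.increment();
        try {
            c.increment(); // breaks the precondition
        } catch (IllegalStateException e) {
            System.out.println("contract broken: " + e.getMessage());
        }
    }
}
```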
Kent Paul Dolan - 14 May 2006 09:04 GMT "alexandre_paterson" <alexandre_paterson@yahoo.fr> wrote:
> This should imply also *removing* all those > unnecessary gotos from the language (ie the > over-abused and misused exceptions) and leave > exceptions, well, for really exceptional > conditions (in Eiffel that happens only when some > part breaks a contract). Umm, that in itself would be one of my wishes: remove exception handling hell by allowing exceptions to propagate upward unhandled (untried, uncaught) and unremarked by redundant "throws ...", or unrelayed by rethrowing, (but warn at compile time that such is the case) and at the top level to default to an exception dump, stackdump and hard error exit.
This would cater for writing the base "added value" code _before_ doing the exception code, rather than having handling exceptions explicitly a requirement for compilation or testing. It is way maddening to have simple-in-concept code cluttered by try...catch sets, throws declarations, rethrowing, and all the messes they make out of variable scoping, before the main intellectual product code can be tested at all.
The second one would be to decouple class names from file names to reduce the proliferation of space wasting tiny files in a typical large application suite. If other languages can use symbol tables and linkage editors to cross-link compiled files, and have done so for decades, Java can probably do the trick as well.
It is even nastier to navigate a directory with a thousand individual class files in it, than to edit a single file with those thousand class definitions in it.
Probably file level encapsulation should allow up to a whole package in a single file, rather than merely a class, an interface, or an exception declaration, as now.
Third, allow separation of declaration and implementation, as Ada does. This isn't quite the same thing as declaring a Java interface and then separately implementing it one or several times, because in Ada, the implementation of a declaration is unique.
The Ada style allows the whole software suite interface to be declared and compiled, cleaned up, and debugged, before a stick of implementation is written. This allows much more robust large system architecting.
FWIW
xanthian.
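In Java as it stands, the closest approximation to this wish is to wrap checked exceptions in an unchecked one where they arise, so that intermediate layers need neither `throws` clauses nor `try`/`catch` blocks. A minimal sketch, assuming hypothetical names (`UncheckedPropagation`, `readConfig`, `load`); note that the compiler gives no warning about the unhandled exception, which is the part of the wish Java does not cover:

```java
import java.io.IOException;

class UncheckedPropagation {
    // Stand-in for real I/O: fails for the name "missing".
    static String load(String name) throws IOException {
        if ("missing".equals(name)) {
            throw new IOException("no such file: " + name);
        }
        return "contents of " + name;
    }

    // Lowest layer: converts the checked exception to an unchecked one.
    static String readConfig(String name) {
        try {
            return load(name);
        } catch (IOException e) {
            throw new RuntimeException("cannot read " + name, e);
        }
    }

    // Intermediate layer: no throws clause, no try/catch.
    static int configLength(String name) {
        return readConfig(name).length();
    }

    public static void main(String[] args) {
        System.out.println(configLength("app.cfg"));
        // Uncaught: reaches the top level, prints a stack trace, exits with an error.
        System.out.println(configLength("missing"));
    }
}
```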
Kent Paul Dolan - 14 May 2006 09:11 GMT "alexandre_paterson" <alexandre_paterson@yahoo.fr> wrote:
> This should imply also *removing* all those > unnecessary gotos from the language (ie the > over-abused and misused exceptions) and leave > exceptions, well, for really exceptional > conditions (in Eiffel that happens only when some > part breaks a contract). Umm, that in itself would be one of my wishes: remove exception handling hell by allowing exceptions to propagate upward unhandled (untried, uncaught) and unremarked by redundant "throws ...", or unrelayed by rethrowing, (but warn at compile time that such is the case) and at the top level to default to an exception dump, stackdump and hard error exit.
This would cater for writing the base "added value" code _before_ doing the exception code, rather than having handling exceptions explicitly a requirement for compilation or testing. It is way maddening to have simple-in-concept code cluttered by try...catch sets, throws declarations, rethrowing, and all the messes they make out of variable scoping, before the main intellectual product code can be tested at all.
[I'd even suggest that a "try{}" _not_ define a scope of variable visibility, but instead variables declared within it have scope the next surrounding scope. An explicit limited scope can easily be created with an extra set of brackets "try{{}}", but the current scope rules mean that to be seen in the catch(){} or after the try/catch, the variables must be declared before the try, conflicting with the best practice that variables be declared where they are first used.]
The second one would be to decouple class names from file names to reduce the proliferation of space wasting tiny files in a typical large application suite. If other languages can use symbol tables and linkage editors to cross-link compiled files, and have done so for decades, Java can probably do the trick as well.
It is even nastier to navigate a directory with a thousand individual class files in it, than to edit a single file with those thousand class definitions in it.
Probably file level encapsulation should allow up to a whole package in a single file, rather than merely a class, an interface, or an exception declaration, as now.
Third, allow separation of declaration and implementation, as Ada does. This isn't quite the same thing as declaring a Java interface and then separately implementing it one or several times, because in Ada, the implementation of a declaration is unique.
The Ada style allows the whole software suite interface to be declared and compiled, cleaned up, and debugged, before a stick of implementation is written. This allows much more robust large system architecting than the Java style does.
FWIW
xanthian.
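The scoping complaint in the bracketed paragraph can be seen in a few lines. A minimal sketch, assuming a hypothetical `parsePort` method and default value: `port` has to be declared before the `try` solely so that it is still in scope afterwards.

```java
class TryScope {
    static int parsePort(String text) {
        int port; // declared ahead of first use, only so it outlives the try block
        try {
            port = Integer.parseInt(text);
        } catch (NumberFormatException e) {
            port = 8080; // hypothetical default
        }
        // Had 'port' been declared inside try { }, this line would not compile.
        return port;
    }

    public static void main(String[] args) {
        System.out.println(parsePort("80"));
        System.out.println(parsePort("not a number"));
    }
}
```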
Hendrik Maryns - 15 May 2006 10:09 GMT Kent Paul Dolan schreef:
> "alexandre_paterson" <alexandre_paterson@yahoo.fr> > wrote: [quoted text clipped - 33 lines] > try, conflicting with the best practice that variables > be declared where they are first used.] You obviously did not understand what Alexandre was suggesting. Read up on error management in Eiffel. It really means removing the /whole/ exception system. But this of course will not happen.
> Third, allow separation of declaration and > implementation, as Ada does. This isn't quite the [quoted text clipped - 8 lines] > written. This allows much more robust large system > architecting than the Java style does. You don’t believe in the single-source principle?
H. - -- Hendrik Maryns
================== http://aouw.org Ask smart questions, get good answers: http://www.catb.org/~esr/faqs/smart-questions.html
Kent Paul Dolan - 16 May 2006 06:14 GMT > Kent Paul Dolan schreef:
>>> This should imply also *removing* all those >>> unnecessary gotos from the language (ie the >>> over-abused and misused exceptions) and leave >>> exceptions, well, for really exceptional >>> conditions (in Eiffel that happens only when >>> some part breaks a contract). Notice, for the current purposes, that the above quite specifically _does_ leave an exception system in place, just not the current one.
>> Umm, that in itself would be one of my wishes: >> remove exception handling hell by allowing [quoted text clipped - 4 lines] >> at the top level to default to an exception dump, >> stackdump and hard error exit.
>> This would cater for writing the base "added >> value" code _before_ doing the exception code, [quoted text clipped - 5 lines] >> make out of variable scoping, before the main >> intellectual product code can be tested at all.
>> [I'd even suggest that a "try{}" _not_ define a scope >> of variable visibility, but instead variables declared [quoted text clipped - 5 lines] >> try, conflicting with the best practice that variables >> be declared where they are first used.]
> You obviously did not understand what Alexandre > was suggesting. Umm, I read what he _wrote_, not what some prior agenda of mine wanted him to have written. Did you?
> Read up on error management in Eiffel. It really > means removing the /whole/ exception system. But > this of course will not happen. I'm not particularly interested in turning Java into Eiffel. Something about Eiffel, perhaps merely that it wasn't made freely available early enough, maybe its politics, perhaps its intellectual difficulty, has kept it from becoming popular; it remains more in the "minor languages" crowd than in the "languages you need to know to find employment" crowd.
Java in contrast has taken the software industry by storm, despite being proprietary. Turning Java into Eiffel would presumably destroy the popularity of Java as well.
I don't have any particular problems with an exception system. I'll grant that it can be abused into a "goto" system, in Alexandre's words, but, used with proper discipline, it provides well for a "what to do when you can't go on" system that lets you handle problems "paragraph by paragraph" rather than "phrase by phrase".
What gives me hives is an exception system that forces me to handle an exception twelve layers deep in the stack twelve times to get it to the top level of the application, in the case where I really do want just to drop dead if the case occurs, but someone else might want to cater for the exception somewhere half-way in-between. What gives me the grippe is an exception system that won't _let_ me use something that might, someday, "throw", without wrapping the invocation in tedious grunge that makes my code too ugly to comprehend. Those are the parts of Java's exceptions I would like to see fixed. I have zero problems with a compilation system that insists on warning about unhandled exceptions, so my management has some way to impose quality control on my code, I just have _huge_ problems with a compilation system that refuses to _compile_ code with unhandled propagation of exceptions _at all_.
>> Third, allow separation of declaration and >> implementation, as Ada does. This isn't quite the >> same thing as declaring a Java interface and then >> separately implementing it one or several times, >> because in Ada, the implementation of a declaration >> is unique.
>> The Ada style allows the whole software suite >> interface to be declared and compiled, cleaned up, >> and debugged, before a stick of implementation is >> written. This allows much more robust large system >> architecting than the Java style does.
> You don't believe in the single-source principle? The "single source principle" I know says "don't replicate code, refactor to make it a callable routine instead". Ada doesn't violate that principle, it merely allows the declaration and the implementation of a method to reside in separate source files, a dandy idea, allowing one to create and review an interface uncluttered by its implementation details. In terms of commercial software, this allows that implementations need not even be delivered as source code, if one is of that "closed source" school, while catering that declarations _can be and are_ delivered as source code, much like C/C++ header files are.
[Nor am I a "language theorist", chained to some family of inviolable language design principles; I'm a practicing applications programmer, since 1961, and what I'm interested to find in a language's design isn't inflexibility for the sake of principle, but stuff that helps me do that job well and efficiently, a true fan of Larry Wall and Perl.]
xanthian.
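The closest Java idiom to a separately delivered declaration is an interface in one source file and a single implementation in another; as noted earlier in the thread, nothing in Java forces that implementation to be unique. A minimal sketch with hypothetical names (`IntStack`, `ArrayIntStack`), shown in one block for brevity:

```java
import java.util.Arrays;

// Declaration: could be delivered as source, much like a header file or an Ada spec.
interface IntStack {
    void push(int x);
    int pop();
    boolean isEmpty();
}

// Implementation: could be delivered in compiled form only.
final class ArrayIntStack implements IntStack {
    private int[] items = new int[8];
    private int size;

    public void push(int x) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2); // grow when full
        }
        items[size++] = x;
    }

    public int pop() {
        if (size == 0) {
            throw new IllegalStateException("empty stack");
        }
        return items[--size];
    }

    public boolean isEmpty() {
        return size == 0;
    }
}

class IntStackDemo {
    public static void main(String[] args) {
        IntStack s = new ArrayIntStack(); // client code sees only the declaration
        s.push(1);
        s.push(2);
        System.out.println(s.pop());
    }
}
```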
Hendrik Maryns - 16 May 2006 09:14 GMT Kent Paul Dolan schreef:
>> Kent Paul Dolan schreef: > [quoted text clipped - 43 lines] > Umm, I read what he _wrote_, not what some prior > agenda of mine wanted him to have written. Did you? Hm, ok maybe you’re right. I indeed read his suggestion as abolishing all exceptions.
>> Read up on error management in Eiffel. It really >> means removing the /whole/ exception system. But [quoted text clipped - 21 lines] > you handle problems "paragraph by paragraph" rather > than "phrase by phrase". Agree with all that.
> What gives me hives is an exception system that > forces me to handle an exception twelve layers deep [quoted text clipped - 14 lines] > compilation system that refuses to _compile_ code > with unhandled propagation of exceptions _at all_. But not sure about this. I will comment no further, as I do feel that it makes code ugly often now, but I am not sure I’d find your system better.
>>> Third, allow separation of declaration and >>> implementation, as Ada does. This isn't quite the [quoted text clipped - 24 lines] > declarations _can be and are_ delivered as source > code, much like C/C++ header files are. The single source principle I know says: ‘put everything that concerns one unit/class/module into one file’. Under everything I understand documentation, interface, implementation, specification, contract, ... Then along with compilers the suitable tools should be delivered to extract the appropriate view. Which is what happens reasonably well with Javadoc, but I agree there might be a need for a tool to abstract the interface view of a class without having to define a Java interface for it. (Again, see Eiffel(Studio) for more different views.)
> [Nor am I a "language theorist", chained to some > family of inviolable language design principles; I'm [quoted text clipped - 3 lines] > but stuff that helps me do that job well and > efficiently, a true fan of Larry Wall and Perl.] Hm, we certainly differ in that :-)
H.
- -- Hendrik Maryns
================== http://aouw.org Ask smart questions, get good answers: http://www.catb.org/~esr/faqs/smart-questions.html
John Gagon - 16 May 2006 21:31 GMT > > If you could add anything you wanted to the java language, what > > would it be? [quoted text clipped - 30 lines] > > Alex I don't see where my message went so I'll summarize this time.. I do like Nice a lot. I will look at Spec Sharp. Looks good though. Well done presentation.
John Gagon
Ed - 13 May 2006 15:07 GMT > If you could add anything you wanted to the java language, what would > it be? An accessor level between package-private and public, so that a class could be visible not to the whole system, but just a group of packages: http://www.edmundkirwan.com/servlet/fractal/frac-page56.html
.ed
-- www.EdmundKirwan.com - Home of The Fractal Class Composition
John Gagon - 15 May 2006 03:28 GMT > > If you could add anything you wanted to the java language, what would > > it be? [quoted text clipped - 4 lines] > > .ed That seems like it would go hand in hand with the restricted packaging. Cool.
John Gagon
John Gagon - 15 May 2006 06:17 GMT > > If you could add anything you wanted to the java language, what would > > it be? > > An accessor level between package-private and public, so that a class > could be visible not to the whole system, but just a group of packages: > http://www.edmundkirwan.com/servlet/fractal/frac-page56.html You know, I just read through your website and downloaded the analysis tool. I find it's fairly helpful. I use a lot of metrics, LoD and PMD and CPD etc etc and this is another one I'll use as well. Other kinds of tools like execution coverage/dead code analysis/test coverage and profiling etc I tend to go crazy on this kind of stuff. I tend to really like objective statistics and code reviewing.
Does your tool, if used then guarantee certain metrics? (I could guess but I'd rather know if you intended to cover any or unintentionally resolved any)
BTW, I did notice a few missing words (I tend to do that) here and there in the article. I can provide correction if you like and I like the code examples. I was hoping to look at source for examples of how your facades and singletons looked but I noticed you have it obfuscated. I'm about ready to run it on some code now. Unfortunately, one of my modules I wrote not long ago got a 0.58 but one of my other older pieces had a 0.72. (I haven't done any other metrics on them yet). I'm somewhat perfectionist.
John Gagon
Ed Kirwan - 16 May 2006 10:05 GMT >>> If you could add anything you wanted to the java language, what would >>> it be? [quoted text clipped - 13 lines] > but I'd rather know if you intended to cover any or unintentionally > resolved any) Hi, John,
I'm not entirely sure what you mean by, "Guarantee certain metrics." Do you mean, "Are the tool's metrics guaranteed to be correct?" Well, we use them in our work, and I know of two other shops that use them; but would I sign an iron-clad, financially-punitive contract declaring that the tool is free from all bugs and so the metrics are guaranteed to be correct for all inputs? Sadly, I would not. To date, however, there have been few major complaints.
I get the feeling, however, that this is not what you meant.
> BTW, I did notice a few missing words (I tend to do that) here and > there in the article. I can provide correction if you like I would be delighted to receive your corrections. Engineering has withered my English language skills to the point where they must cart around their own bottled oxygen with them, and still they wheeze and splutter at the slightest grammatical exertion; I would appreciate any comments you have. It's rare indeed that anyone volunteers so surprisingly important a service.
> and I like the code examples. I was hoping to look at source for examples of how > your facades and singletons looked Excellent point. I should make the source examples available as downloads. Examples, in fact, are probably not enough; so as a gesture of appreciation for your offer above, I'll open-source a full application with a fractal index of a perfect 1.0. Give me a couple of weeks to cobble together a program description; I'll post notification here.
> but I noticed you have it obfuscated. I'm about ready to run it on some code now. Unfortunately, > one of my modules I wrote not long ago got a 0.58 but one of my other > older pieces had a 0.72. (I haven't done any other metrics on them > yet). I'm somewhat perfectionist. I'm fortunately unafflicted by perfectionism: I hear it can be tiresome. :)
As with all metrics and as I'm sure you're aware, metrics should be viewed with healthy caution. I'm not sure how much value can be gleaned from pouring code that was designed without the fractal class composition in mind into the Fractality code analyser because there are many different methodologies that people use to maximise the OOness of their system.
A fractal index of 0.58 does indeed suggest that a module was not, "Programmed to an interface repository," and did not, "Eliminate descendant dependencies;" but if these concepts were not used in the construction of that module, then we're viewing the code from an angle not considered by the designer: it's then perhaps no surprise that it looks a little askew, but that doesn't imply that the code is poorly designed; it's just designed in a way unfamiliar to an unbending code analyser.
If, however, a module is designed from scratch with the fractal class composition in mind, and yet still scores badly in Fractality, then we can ask some drilling questions.
> John Gagon
 Signature www.EdmundKirwan.com - Home of The Fractal Class Composition.
Download Fractality, free Java code analyzer: www.EdmundKirwan.com/servlet/fractal/frac-page130.html
John Gagon - 16 May 2006 12:17 GMT > >>> If you could add anything you wanted to the java language, what would > >>> it be? [quoted text clipped - 18 lines] > I'm not entirely sure what you mean by, "Guarantee certain metrics." Do > you mean, "Are the tool's metrics guaranteed to be correct?" I believe yes. I notice you include a lot of the standard metrics that I use in the various analysis views. I'm assuming that once one achieves the Fractal Class Composition score of 1.0, that the metrics for instability, for example would be zero (afferent/efferent couplings) and cyclomatic complexity would all be at a certain optimum or value. I'm guessing some might come in as perfect while others are "fairly close" to an ideal value (like distance and abstractness). In any case, I wonder if some metric limits are reached by achieving a score of 1.0 perhaps as a function of number of packages and classes.
> Well, we > use them in our work, and I know of two other shops that use them; but > would I sign an iron-clad, financially-punitive contract declaring that > the tool is free from all bugs and so the metrics are guaranteed to be > correct for all inputs?" Sadly, I would not. To date, however, there > have been few major complaints. It's always hard to tell that one. Amount of money to risk seems to me proportionate to perceived stability, I would think it would depend on the amount in the contract.
> I get the feeling, however, that this is not what you meant. Do you still get the feeling? (as I lost what your pronoun 'this' (above) might refer to other than generally my question about guaranteeing of certain metrics)
> > BTW, I did notice a few missing words (I tend to do that) here and > > there in the article. I can provide correction if you like [quoted text clipped - 5 lines] > comments you have. It's rare indeed that anyone volunteers so > surprisingly important a service. I have sent them to you personally in a separate email.
> and I like > > the code examples. I was hoping to look at source for examples of how [quoted text clipped - 5 lines] > application with a fractal index of a perfect 1.0. Give me a couple of > weeks to cobble together a program description; I'll post notification here. Yes, that would be *very* useful. Very good idea there. I'll search for it periodically in the future. Feel free to CC my email if you would like me to look at it too. ;-)
> but I noticed you have it > > obfuscated. I'm about ready to run it on some code now. Unfortunately, [quoted text clipped - 3 lines] > > I'm fortunately unafflicted by perfectionism: I hear it can be tiresome. :) It sure can be. Well, a somewhat perfectionist, to be pedantic, is not quite as bad as an absolute perfectionist though is it?
> As with all metrics and as I'm sure you're aware, metrics should be > viewed with healthy caution. I'm not sure how much value can be gleaned > from pouring code that was designed without the fractal class > composition in mind into the Fractality code analyser because there are > many different methodologies that people use to maximise the OOness of > their system. I use principles of keeping packages, classes and method sizes in a certain range and I try to organize dependencies and in the past, I've used a more bandaid approach using an open source tool called depfind that searched the code for dependencies and spat out megabytes of xml or html. As a maintainer of a mature codebase, this was more crucial because stability was a primary goal at that point. Every code change needed impact analysis and I would use the dependency checker run in an ant script to find out the current dependencies and find the number of other classes affected by the change. This preventative approach reduces that need quite a bit. I used to work on this at HP before offshoring and reduction of workforce occurred with the incoming CEO replacing Carly.
> A fractal index of 0.58 does indeed suggest that a module was not, > "Programmed to an interface repository," and did not, "Eliminate [quoted text clipped - 4 lines] > designed; it's just designed in a way unfamiliar to an unbending code > analyser. The code started out cleaner but then became more ratsnesty / spaghetti and even with just myself programming it, it grew out of control since I would work on it weekend to weekend since it was my own, on the side, skunkwork/moonlighting project.
> If, however, a module is designed from scratch with the fractal class > composition in mind, and yet still scores badly in Fractality, then we > can ask some drilling questions. Yes. I plan on refactoring to this standard if only for future maintenance. It will be a guiding principle for all others working on my free open version. I'm writing a tool which I will soon publish on Sourceforge and java.net and later, I will finish a commercial grade version with extra features. My tool is something more related to prototyping and quick model driven development similar to projects like trails/ruby on rails etc but with one other design goal in mind besides "do not repeat yourself". It's been a long journey but I've got about 60% completion right now. (I'm also working on a personal tracking tool like xplanner but supporting more calendar and recurring functions) John Gagon
John Gagon - 16 May 2006 12:36 GMT > > I'm not entirely sure what you mean by, "Guarantee certain metrics." Do > > you mean, "Are the tool's metrics guaranteed to be correct?" [quoted text clipped - 8 lines] > any case, I wonder if some metric limits are reached by achieving a > score of 1.0 perhaps as a function of number of packages and classes. (of course, metrics are more often type level metrics as Fractal Class Composition is more a package level metric of its own. maybe it's not so relevant per se but an additional item that is almost independent)
John Gagon
Ed Kirwan - 17 May 2006 14:15 GMT > I believe yes. I notice you include a lot of the standard metrics that > I use in the various analysis views. I'm assuming that once one [quoted text clipped - 5 lines] > any case, I wonder if some metric limits are reached by achieving a > score of 1.0 perhaps as a function of number of packages and classes. I had the same question myself, which is actually the reason for including our good friend Robert C Martin's metrics in the analysis tool. I was hoping that a system with a fractal index of 1.0 would show a very low Distance metric. It's difficult to compare the two metrics, of course, as the fractal index is system-wide, but the Distance metric is per-package (why doesn't Uncle Bob develop a system-wide variant?), but in those applications I've seen with a fractal index of 1.0, I've not seen any packages with a Distance metric of higher than 0.5.
Certainly, "Program to an interface repository, not an implementation repository," should align well with the Distance metric, but the correlation is still suspect; it's just too easy to get an accidentally high Distance metric. (And I remember reading somewhere, sometime, that someone else had made a slight alteration to the Distance metric ... must check for that again.)
On the other hand, cyclomatic complexity is certainly well managed by, "Eliminate descendant dependencies;" and indeed the only place where cycles can occur is between two peer interface repositories, which are in themselves quite rare (one interface repository is usually sufficient to serve a package branch).
>>> BTW, I did notice a few missing words (I tend to do that) here and >>> there in the article. I can provide correction if you like >> I would be delighted to receive your corrections. > > I have sent them to you personally in a separate email. Received and thank you, sir!
 Signature www.EdmundKirwan.com - Home of The Fractal Class Composition.
Download Fractality, free Java code analyzer: www.EdmundKirwan.com/servlet/fractal/frac-page130.html
Kent Paul Dolan - 15 May 2006 09:20 GMT > If you could add anything you wanted to the java language, what would > it be? Without a doubt, automation of the present mess requiring programmers to divert GUI updating code to the Event Handling Thread. This is so awful a misfeature when left to the programmer as to make Java GUI programming nearly intolerable, by seeding insidious bugs into completely normal looking code where this need hasn't been recognized as applying.
xanthian.
Chris Uppal - 17 May 2006 10:09 GMT > If you could add anything you wanted to the java language, what would > it be? Hmm... where does one start ?
> I'd predict some would say the non-imperative stuff ie: closures or the > LISP like abilities to work almost purely functionally or do macros. Without trying to change Java into a better /kind/ of language, here are a few things which (IMO) stay within the spirit of Java but would have saved me time/effort in the past.
Java would have unsigned integers (but no automatic coercion between signed/unsigned of the same width).
>>> and >> would be swapped over. Right or left shifting by an impossible constant would provoke a compile-time warning.
char would be 32-bit.
String would be an abstract type with (the option of) different concrete subclasses.
Auto-boxing would provoke a compile-time warning.
The [] notation would be available whenever the object implements some interface, Indexable perhaps. java.util.List would inherit that interface.
Operator overloading would be permitted in some disciplined manner. Again, probably a small group of interfaces -- Field, MultiplicativeGroup, AdditiveGroup, perhaps. Classes would be required to implement the whole set of related operations, not just cherry-pick. Assignment operators like ++ and *= would be translated by the compiler into x = x + MyClass.unity(), rather than being available for roll-your-own overriding. The argument types and returned value of overloaded operators would be required to follow the pattern established by the existing operators (i.e. you can't redefine << to mean System.out.print()).
Objects would allow the clone() operation by default (which would probably be renamed to copy(), leaving a protected clone() which was a JVM-implemented shallow copy). The default implementation of copy() would call a protected postCopy() method. The default implementation of postCopy() would be empty. There would be a marker interface or annotation to forbid clone(). I.e. classes would opt-out of being copyable, not be forced to opt in.
Generics would vanish.
The definition of interfaces would be changed so that a method needed to satisfy the contract implied by the interface need be no more visible than the interface itself. (E.g. package-private methods would satisfy package-private interfaces.)
There would be a means of telling the compiler: "yes I know I'm calling a method that you don't know about, but /trust me/, it'll be there by the time this code is executed". Perhaps that would be allowed wherever there's an explicit handler for NoSuchMethodException. The same for fields.
There would be a method, java.lang.System.getPlatformVersion().
In Java, references to final fields initialised to a compile-time constant, are replaced by the constant itself. That's OK, but the generated classfiles would retain a per-method reference to the field so that dependencies can be tracked.
There would be some way to define compile-time constants on the javac command-line. (Since it's possible to do this anyway with only a little hacking, there seems no valid justification for not allowing it in a disciplined form.)
There would be a kind of Classloader which understood that you can put several JARfiles in one directory. The application Classloader would be of this type.
The people at Sun would be immersed upside-down in a huge vat of sex-crazed cane toads until they agreed to change their bloody awful layout conventions.
That list is by no means complete, but I've grown bored typing it in. Probably everyone's sick of reading by now too...
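For what it's worth, the copy()/postCopy() item in the list above can be approximated in today's Java with a base class. What follows is only a minimal sketch: Copyable and Playlist are hypothetical names (neither is in the JDK), and because it works through inheritance it is opt-in, so it imitates the shape of the proposal rather than the opt-out default.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: copy() is the public entry point,
// postCopy() is the protected hook described in the list above.
abstract class Copyable implements Cloneable {
    /** Shallow-copies this object, then lets the subclass fix up the copy. */
    public final Copyable copy() {
        try {
            Copyable duplicate = (Copyable) super.clone(); // JVM-implemented shallow copy
            duplicate.postCopy();
            return duplicate;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: this class is Cloneable
        }
    }

    /** Hook for deep-copying mutable state; empty by default. */
    protected void postCopy() {
    }
}

// Hypothetical subclass with one piece of mutable state.
final class Playlist extends Copyable {
    List<String> tracks = new ArrayList<>();

    @Override
    protected void postCopy() {
        tracks = new ArrayList<>(tracks); // stop the copy sharing the original's list
    }
}
```

A subclass that holds only immutable fields would not need to override postCopy() at all, which seems to be the point of making the default empty.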
-- chris
Oliver Wong - 17 May 2006 17:14 GMT >> If you could add anything you wanted to the java language, what would >> it be? [snipped some good suggestions]
> char would be 32-bit. Conceptually, char should not have a width or size at all. Every char value should identify exactly one unicode character. The underlying implementation is free to use UTF-8, UTF-16, UTF-32 or any other encoding it likes to convert from char to bytes, but from the programmer's perspective, you can store any unicode character into a single char (i.e. none of this "surrogate pair" nonsense).
However, I'm not that familiar with the inner workings of the virtual machine, so I don't know what kind of havoc a "variable-length primitive" might cause.
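To make the "surrogate pair" complaint concrete, here is a small sketch using only standard java.lang methods. The class name SurrogatePairDemo is made up for illustration. It builds a string from a single code point outside the 16-bit range and compares the number of char units with the number of characters.

```java
final class SurrogatePairDemo {
    /** Builds a one-character string from a Unicode code point. */
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String clef = fromCodePoint(0x1D11E); // U+1D11E MUSICAL SYMBOL G CLEF

        System.out.println(clef.length());                             // 2 char units
        System.out.println(clef.codePointCount(0, clef.length()));     // 1 character
        System.out.println(Character.isHighSurrogate(clef.charAt(0))); // true
    }
}
```

So one character can occupy two char values, which is exactly the mismatch between "a character" and "a char" being objected to.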
[snipped some good suggestions]
> Generics would vanish. !!! I thought Java would be better with more generics, rather than less (or none at all).
> The definition of interfaces would be changed so that a method needed to > satisfy the contract implied by the interface need be no more visible than > the > interface itself. (E.g. package-private methods would satisfy > package-private > interfaces.) I don't understand this one.
> There would be a means of telling the compiler: "yes I know I'm calling a > method that you don't know about, but /trust me/, it'll be there by the > time > this code is executed". Perhaps that would be allowed wherever there's an > explicit handler for NoSuchMethodException. The same for fields. I've never seen a need for this. Can you elaborate?
[snipped some good suggestions]
- Oliver
Chris Uppal - 18 May 2006 12:46 GMT [me:]
> > char would be 32-bit. > > Conceptually, char should not have a width or size at all. I don't see the need for that level of abstraction. Unicode is limited to < 24 bits by the UTF-16 hack. The Unicode consortium states that code points will never be allocated outside the range representable in UTF-16. The equivalent ISO work has limited itself (as I understand it) to 31 bits, but since they and the Unicode people are committed to staying in lock-step, it's hard to see why that is more than an academic point.
So the only reason I can think of for /not/ choosing 32-bits is that you might suspect that now or in the future an implementation might want to restrict itself to 24 bits per char. I don't find that too plausible myself, but...
Of course, Strings might well use UTF-8 or UTF-16 encoded binary as their internal representation, or maybe UTF-32 for applications needing constant-time access. (The requirement I perceive for flexibility in this matter is why I would want to turn String into an abstract class). But that's /Strings/, I don't see a reason for /char/ to be anything other than an integer type with known width.
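The numbers behind this argument can be checked with the standard library. The following is a sketch only, with a hypothetical class name (CodePointWidthDemo). It shows that the largest code point representable in UTF-16 needs 21 bits, comfortably under the "< 24 bits" mentioned above, and that the encoded size of a character depends on the encoding chosen, not on the character type.

```java
import java.nio.charset.StandardCharsets;

final class CodePointWidthDemo {
    /** Number of bits needed to hold a non-negative value. */
    static int bitsNeeded(int value) {
        return 32 - Integer.numberOfLeadingZeros(value);
    }

    /** Size in bytes of one code point encoded as UTF-8. */
    static int utf8Length(int codePoint) {
        return new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8).length;
    }

    /** Size in bytes of one code point encoded as UTF-16 (big-endian, no BOM). */
    static int utf16Length(int codePoint) {
        return new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        System.out.println(bitsNeeded(Character.MAX_CODE_POINT)); // 21 (0x10FFFF)
        System.out.println(utf8Length('A'));      // 1
        System.out.println(utf16Length('A'));     // 2
        System.out.println(utf8Length(0x1D11E));  // 4
        System.out.println(utf16Length(0x1D11E)); // 4
    }
}
```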
> > Generics would vanish. > > !!! I thought Java would be better with more generics, rather than less > (or none at all). Java might be better off with a proper implementation of generics (as opposed to the mess we've had dumped on us). I don't, myself, see that there's much to be gained by such a feature, and I don't really think that it's in the "spirit of Java" -- so (at least for this discussion) I'd just drop them.
You may be thinking of a more C++-like feature which provides compile-time metaprogramming. I'd certainly agree that a language which supports metaprogramming is much to be preferred over one that does not (unless, of course, the "metaprogramming" is something as gross as C++ templates). But that wouldn't fit with my self-imposed restriction:
> Without trying to change Java into a better /kind/ of language, [...]
> > The definition of interfaces would be changed so that a method needed to > > satisfy the contract implied by the interface need be no more visible [quoted text clipped - 4 lines] > > I don't understand this one. E.g. I have a package which uses internal interfaces to give order to the structure of the private code. I want one of the public classes in that package to implement one or more of those internal interfaces. I am forced to make the relevant methods public. I.e. I have to /publish/ them, and thus commit to keeping them unchanged in future developments. Bad. An interface is a /promise/, but you have to ask who the promise is made to. In this case I want to be able to use promises internally but am impeded because the language definition assumes that the only promises I will ever want to make are to client code.
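The restriction described above can be reproduced in a single file. This is a sketch under stated assumptions: all names (InterfaceVisibilityDemo, Resettable, Counter) are hypothetical, and a private nested interface stands in for the package-private interface so that the example stays self-contained.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class InterfaceVisibilityDemo {
    /** Internal contract: invisible to client code outside this holder class. */
    private interface Resettable {
        void reset(); // implicitly public, like every abstract interface method
    }

    /** A class meant for clients that also implements the internal contract. */
    public static final class Counter implements Resettable {
        private int count;

        public void increment() {
            count++;
        }

        public int value() {
            return count;
        }

        // Removing "public" here is a compile-time error: the implementation
        // may not have weaker access than the interface method it implements.
        @Override
        public void reset() {
            count = 0;
        }
    }

    /** Reports whether Counter.reset() ended up in the public API. */
    static boolean resetIsPublic() {
        try {
            Method reset = Counter.class.getMethod("reset");
            return Modifier.isPublic(reset.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}
```

Clients cannot name Resettable, yet they can still call reset() on any Counter, which is the unwanted publication being complained about.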
> > There would be a means of telling the compiler: "yes I know I'm calling > > a method that you don't know about, but /trust me/, it'll be there by [quoted text clipped - 3 lines] > > I've never seen a need for this. Can you elaborate? Maybe a simple example would help:
... aMethod()
{
    double start = now();
    someLongishOperation();
    double end = now();
    System.out.printf("It took %f seconds%n", end - start);
}

/**
 * Return the time in seconds since an arbitrary (but fixed) start-time.
 * Resolution is dependent on the version of the Java platform.
 */
private double now()
{
    try
    {
        // getPlatformVersion() is the wished-for method from my list, not a real one
        if (System.getPlatformVersion() >= 5)
            return System.nanoTime() / 1.0e9;
    }
    catch (NoSuchMethodException e)
    {
        // log it, or something
    }
    return System.currentTimeMillis() / 1.0e3;
}
That, or something like it, should compile on pre 1.5 platforms, but it won't because the compiler is too damned fond of early-binding. Note that it is the /compiler/ that's doing this, the equivalent bytecode would run just fine (legal /and/ safe) on any JVM.
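In the absence of such a compiler feature, the usual workaround is to bind late by hand through reflection. The sketch below is an assumption about how one might do that, not part of the proposal: LateBoundClock is a hypothetical name, and the method is looked up by name at run time so the compiler never needs to know whether System.nanoTime() exists.

```java
import java.lang.reflect.Method;

final class LateBoundClock {
    /**
     * Returns the time in seconds since an arbitrary (but fixed) start time,
     * using System.nanoTime() when the running platform provides it.
     */
    static double now() {
        try {
            // Resolved at run time, so this compiles even against a class
            // library that lacks the method.
            Method nanoTime = System.class.getMethod("nanoTime");
            long nanos = (Long) nanoTime.invoke(null); // static method: no receiver
            return nanos / 1.0e9;
        } catch (ReflectiveOperationException e) {
            // Method missing or not callable: fall back to millisecond resolution.
            return System.currentTimeMillis() / 1.0e3;
        }
    }

    public static void main(String[] args) {
        double start = now();
        double end = now();
        System.out.printf("It took %f seconds%n", end - start);
    }
}
```

The cost is obvious: the call is stringly typed and wrapped in boilerplate, which is presumably why a "/trust me/" annotation to the compiler looks attractive.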
-- chris
Oliver Wong - 18 May 2006 15:54 GMT > [me:] >> > char would be 32-bit. [quoted text clipped - 29 lines] > with > known width. I like the flexibility of adding new characters. If we define a size on char, then we either have a finite number of characters we can define, or we have something like surrogate pairs (or triplets, or quadruplets, etc.) where you don't have a 1 to 1 correspondence between the "concept of a character" and the "char" data type in Java.
Potential sources for new characters (in approximate order of probability):
* More domain-specific characters. E.g. musical notation for percussive instruments, symbols for obscure operators in math, physics, etc.
* Integrating more popular, though "fictional" character sets, into Unicode e.g. Klingon.
* Invention of a new language like Esperanto.
* New discovery by archeologists of ancient writing systems.
* Contact with alien civilization which use a different character set.
[snipped more explanations from Chris where I asked for them]
Ah, makes sense. Thanks.
- Oliver
Roedy Green - 18 May 2006 19:45 GMT > * More domain-specific characters. E.g. musical notation for percussive >instruments, symbols for obscure operators in math, physics, etc. One of the "character" sets I saw on a prototype IBM colour terminal was "geography". They had a character shaped like the tip of the boot of Italy. You could put these together with solid blobs to make up a map that took far fewer bits than a full bit map. Memory was very expensive back then and bandwidth was typically 9600 baud max with many terminals sharing the "high speed" line.
Other possible places for character expansion:
1. airport symbol language expanding to a full international language to be used on signs and emergency instructions.
2. ASL symbols for the deaf, showing gestures in symbolic form.
3. symbols for choreography.
4. weather symbols (might be in there already. I did not notice them).
5. ligatures and fancy forms needed for precise typesetting even if they are inserted by rule.
6. Symbols for the visually impaired. Alphabets and symbols easy to discriminate.
7. Symbols to record the oral-only niche languages rapidly disappearing.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Kent Paul Dolan - 19 May 2006 11:12 GMT > Other possible places for character expansion:
> 1. airport symbol language expanding to a full > international language to be used on signs and > emergency instructions. Makes sense.
> 2. ASL symbols for the deaf, showing gestures in symbolic form. This won't work in general. ASL symbols are done moving in space, sometimes changing handshapes while the hands move. Video recordings work better. There _was_, by the way, a very comprehensive, very arcane written notation for ASL created well over a decade ago, but it never caught on with either the ASL community, or the research community studying them.
> 3. symbols for choreography. Such already exist, (Labanotation, for one), but IIUC, they are pretty rich symbol sets, subject to idiosyncratic extensions by each choreographer, and might not map to "alphabets" well.
> 4. weather symbols (might be in there already. I > did not notice them). That would be mostly doable, but weather symbols tend to be laid out on a two dimensional surface, not used in typesetting, so the utility of such an "alphabet" would be limited. Also, some weather symbols, like storm front "curves with triangular teeth", are extended graphical objects, with no base point from which to draw them with an alphabet.
> 5. ligatures and fancy forms needed for precise > typesetting even if they are inserted by rule. I've seen at least some of those in there, but others, that I expected to see, are instead done by overstriking, and could usefully have unique representations, which would usually be more accurate, instead. Notice that ligatures of the "ffl" type are really font choices, not alphabet choices, and so perhaps not suitable for Unicode codes, since "ffl" in one font might be a ligature, while in another it would not. Similarly, an ellipsis is a single character or three characters, depending on font support, so it is a kind of "ligature" too (and, is in Unicode already). I don't think a consistent treatment here is possible, which will give standards committees great sway to do mischief if they attempt the deed anyway.
> 6. Symbols for the visually impaired. Alphabets > and symbols easy to discriminate. Mostly this is just accomplished by use of large type, since the symbols have to be comprehensible to the population with which they interact. Also, notice that Unicode _doesn't_ include fonts or font styles, just alphabet generic glyph identifiers and ideograph generic glyph identifiers. Thus, some "visually impaired" equivalent of the optical character recognition fonts (which were for scanners of the day that were "visually impaired" compared to today's) wouldn't need codespace in the Unicode standard, they'd just be other fonts or font styles, with glyphs identified by Unicode "codes" for the generic glyphs of which they are instances.
> 7. Symbols to record the oral-only niche languages > rapidly disappearing. A nice idea, but they're going away far too fast for saving. The world has, IIRC, some 3,000 languages, very few of which have long term viability, and many of which are reduced to a handful of proficient speakers today. There simply aren't enough linguists and linguistically adept missionaries to save most of them.
And, in the rush to save what can be saved, use of the International Phonetic Alphabet (perhaps extended) as the base script would sure be a lot smarter than inventing a whole new alphabet per language.
FWIW
xanthian.
Roedy Green - 19 May 2006 20:48 GMT >. Also, >notice that Unicode _doesn't_ include fonts or font >styles, just alphabet generic glyph identifiers and >ideograph generic glyph identifiers That is the theory, but in practice you will find multiple symbols all looking suspiciously like an A.
Dale King - 19 May 2006 15:04 GMT >> * More domain-specific characters. E.g. musical notation for percussive >> instruments, symbols for obscure operators in math, physics, etc. If you see ones that are missing here you should point them out to the Unicode Consortium. They already have a fairly complete set with many obscure symbols.
> One of the "character" sets I saw on a prototype IBM colour terminal > was "geography". They had a character shaped like the tip of the boot > of Italy. You could put these together with solid blobs to make up a > map that took far fewer bits than a full bit map. Memory was very > expensive back then and bandwidth was typically 9600 baud max with > many terminals sharing the "high speed" line. That's not appropriate for Unicode. If you wanted something like that for a specific project there are always the private use areas of Unicode, which you can use for your own purposes.
> Other possible places for character expansion: > > 1. airport symbol language expanding to a full international language > to be used on signs and emergency instructions. I can't imagine any symbols here appropriate for Unicode in general. Examples?
> 2. ASL symbols for the deaf, showing gestures in symbolic form. > > 3. symbols for choreography. Not appropriate as these are motions, not symbols, but if there are symbols that are commonly used they should be proposed.
> 4. weather symbols (might be in there already. I did not notice them). There are a few in this code page:
http://www.unicode.org/charts/PDF/U2600.pdf
> 5. ligatures and fancy forms needed for precise typesetting even if > they are inserted by rule. Many of these exist for common ones.
> 6. Symbols for the visually impaired. Alphabets and symbols easy to > discriminate. Definitely not appropriate for Unicode. This is a presentation/font issue.
> 7. Symbols to record the oral-only niche languages rapidly > disappearing. Isn't a symbolic alphabet for an oral-only language an oxymoron? ;-)
If the language doesn't currently have an alphabet and one is being assigned, it would make a lot more sense to use existing alphabets than creating brand new ones.
 Signature Dale King
Oliver Wong - 19 May 2006 15:46 GMT >>> * More domain-specific characters. E.g. musical notation for >>> percussive instruments, symbols for obscure operators in math, physics, [quoted text clipped - 3 lines] > Unicode Consortium. They already have a fairly complete set with many > obscure symbols. Well, I took a look at http://www.unicode.org/charts/PDF/U1D100.pdf ("Western Musical Symbols"), and they don't seem to have a notation for indicating that the drummer should ease off the hihat pedal for the next few notes, and then dampen the sound by applying pressure again. The notation looks something like:
<asciiArt> |-- O ----> -- (o) -| | | </asciiArt>
And is drawn above the staff of five lines where the notes are usually drawn. There are others missing as well (e.g. repeat the following section, but apply this ending the first time, and that ending the second time; repeat the previous four measures; the following 3 notes should be played in the time of 2 notes; apply the wah-wah pedal when playing guitars; let the strings of the guitar ring openly; etc.) I haven't suggested this to the consortium because:
(1) I didn't realize you could (but I've since discovered http://www.unicode.org/pending/proposals.html)
(2) I don't know the terminology or official names for these musical symbols, being only an amateur musician. I figure there must be someone else out there more qualified to make these submissions than me, but perhaps the intersection of the set of all musicians and the set of all people who care about Unicode is rather small.
>> Other possible places for character expansion: >> [quoted text clipped - 3 lines] > I can't imagine any symbols here appropriate for Unicode in general. > Examples? Well, they have some symbols which are, AFAIK, internationally recognized. In http://www.unicode.org/charts/PDF/U2600.pdf there's the recycling symbol, the biohazard symbol, and the poison symbol. Perhaps you could have internationally recognized road signs as well (yield, stop, left lane merge, etc.)
- Oliver
Oliver Wong - 19 May 2006 20:26 GMT >> 2. ASL symbols for the deaf, showing gestures in symbolic form. >> >> 3. symbols for choreography. > > Not appropriate as these are motions, not symbols, but if there are > symbols that are commonly used they should be proposed. Kent Paul Dolan brought up similar arguments.
As I was replying to a post by Bent C Dalager, it occurred to me that Unicode does not concern itself with the representation of characters at all, so it is perfectly feasible that Unicode could support "animated" glyphs. My post is at http://groups.google.ca/group/comp.lang.java.programmer/msg/cd269f6cfc8392fc
There are actually quite a few "unprintable" characters in Unicode, so it wouldn't be a novelty to have characters that could not actually be displayed in a traditional text editor. In fact, the codecharts have an entire section called "Invisible Operators" (http://www.unicode.org/charts/PDF/U2000.pdf, though in actuality, some of the characters defined there are indeed "visible").
\u2029, for example, "Paragraph Separator", is invisible, and is there merely to control the flow of text. It cannot, in itself, be displayed in any form.
So it doesn't seem unreasonable to have some unicode character "\ufoo" which represents an ASL gesture, which cannot be represented via some static glyph.
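Java itself will confirm that such characters are classified as something other than visible glyphs. A small sketch follows; the class name InvisibleCharDemo is made up, while the methods and constants are the standard java.lang.Character ones. U+2061 FUNCTION APPLICATION is used here as an example of one of the invisible operators.

```java
final class InvisibleCharDemo {
    /** True for characters in the "paragraph separator" category, such as U+2029. */
    static boolean isParagraphSeparator(int codePoint) {
        return Character.getType(codePoint) == Character.PARAGRAPH_SEPARATOR;
    }

    /** True for format-control characters, such as U+2061 FUNCTION APPLICATION. */
    static boolean isFormatControl(int codePoint) {
        return Character.getType(codePoint) == Character.FORMAT;
    }

    public static void main(String[] args) {
        System.out.println(isParagraphSeparator(0x2029)); // true
        System.out.println(Character.isWhitespace(0x2029)); // true
        System.out.println(isFormatControl(0x2061));      // true
        System.out.println(isParagraphSeparator('A'));    // false
    }
}
```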
- Oliver
Kent Paul Dolan - 21 May 2006 11:50 GMT >>> 2. ASL symbols for the deaf, showing gestures in >>> symbolic form.
>>> 3. symbols for choreography.
>> Not appropriate as these are motions, not >> symbols, but if there are symbols that are >> commonly used they should be proposed.
> Kent Paul Dolan brought up similar arguments.
> As I was replying to a post by Bent C Dalager, it > occurred to me that Unicode does not concern itself [quoted text clipped - 6 lines] > to have characters that could not actually be > displayed in a traditional text editor.
> So it doesn't seem unreasonable to have some > unicode character "\ufoo" which represents an ASL > gesture, which cannot be represented via some > static glyph. That might work for the various choreography annotations, though the "alphabets" would be rather huge unless done, as are many ligatures, as a set of overstrikes of simpler motions, body part by body part, all "printed" at the same location, or strung out as a "word of motion".
But ASL "words/gestures" have context dependent meanings, among other difficulties in capturing them in brief encodings.
Their meanings sometimes also depend profoundly, not just casually, on accompanying facial expressions.
Too, a typical ASL paragraph consists of putting actors/items at various places in space in front of the "speaker", then play-acting interactions among those locations and their contents.
In many senses, ASL is a much richer language than most spoken/written languages. I'm not an expert ASL speaker, but I get by in simple conversations, and I just don't see a way short of full video recording to convey an ASL conversation with the usual ASL conventions.
That recording would need to be stereo video recording, too, to give depth perception. ASL's "location in space meaning" is dependent on four dimensions. X, Y, Z, and speed of execution of the gesture are all modifiers to that gesture's meaning.
Even how broadly the gesture is made modifies its meaning, as may all of its starting point, its ending point, and its curved path through space.
Surely attempting to convey such a conversation in a written format (and to me, even though it is a system that includes invisible codes, Unicode is a system targeted at written languages) that would let the reader reconstruct the entire ASL gesture sequence or even understand its meaning from its ASL form, would be an incredibly painful exercise.
Moreover, since a written form of ASL _has_ been created, and failed to be adopted, I'm guessing that there would be no particular call for a "Unicode version of ASL" in any case.
ASLers who want to convey their ideas in written form, at least those literate enough to be capable of any kind of reading/writing, normally write them in English. [ASL is specifically _American_ sign language, and depends on the user's knowledge of American English to convey, by spelling them out, ideas for which no accepted sign currently exists.]
Note also that there are several formal sign languages besides ASL (i.e., Mexican sign language, British sign language) usually quite incompatible among themselves, so this problem would need solving many times, and chew up many chunks of Unicode code-space. I cannot conceive of any value in making the attempt to encode ASL in a code space of Unicode, which doesn't deny that someone may make the attempt anyway.
FWIW
xanthian.
Roedy Green - 21 May 2006 21:30 GMT >In many senses, ASL is a much richer language than >most spoken/written languages. In ASL, you have the analog ability to emphasise with the grandness of gesture and exaggeration of the facial expressions. You would not need to encode that in an ASL symbolic dictionary. Humans are quite capable of supplying that on their own.
If you look at an ASL dictionary it has stylised pictures with little arrows to indicate motion. I could imagine someone inventing a notation that could be read directly or used to generate those images or "Reboot" style 3D animations, much as Chinese ideograms can be created from combining radical symbols.
Roedy Green - 20 May 2006 00:01 GMT >> 5. ligatures and fancy forms needed for precise typesetting even if >> they are inserted by rule. > >Many of these exist for common ones. I read somewhere they decided not to add any more ligatures. However, in typesetting you still need a code for them, so I suspect eventually they will be given Unicode slots.
Patricia Shanahan - 19 May 2006 15:25 GMT ...
> 7. Symbols to record the oral-only niche languages rapidly > disappearing. Why not use the International Phonetic Alphabet, which is already represented in Unicode?
Patricia
Roedy Green - 19 May 2006 20:54 GMT >Why not use the International Phonetic Alphabet, which is already >represented in Unicode? The original question was what might people do to Unicode to expand it, not what SHOULD they do.
There was a phonetic alphabet designed for native Canadian languages. It is very pretty, but it is almost as bad as Hebrew for having very similar letters that you have to look at carefully to tell apart.
Chris Uppal - 19 May 2006 09:49 GMT > I like the flexibility of adding new characters. I presume you mean that you like the flexibility the ISO and Unicode consortium have to add new characters, rather than that you would like to be free to define your own (if you do mean the latter then there is always the private-use area to play in).
> If we define a size > on char, then we either have a finite number of characters we can define, > or we have something like surrogate pairs (or triplets, or quadruplets, > etc.) where you don't have a 1 to 1 correspondence between the "concept > of a character" and the "char" data type in Java. Me, I prefer to be able to manipulate characters as integers. Which requires (for sanity) knowing how wide the integer is. Unicode isn't going to add characters which don't fit into UTF-16, so there's a definite limit to how wide the integer needs to be. Even if they /did/ scrap UTF-16 (hardly likely when it would break Windows, .NET, /and/ Java ;-) there is still an unimaginably huge amount of space available in the 31 bits that ISO limits itself to. It would need several thousand "alphabets" the size of the unified HAN stuff to exhaust that (and where are those writing systems lurking ?).
> Potential sources for new characters (in approximate order of > probability): > > * More domain-specific characters. E.g. musical notation for > percussive instruments, symbols for obscure operators in math, physics, /Plenty/ of space is already available for that.
> etc. * Integrating more popular, though "fictional" character sets, > into Unicode e.g. Klingon. Ugh! Bloody sci-fi soap opera. (I /do/ like SF, I just don't like Star Trek -- in any of its manifestations). IMO, adding that kind of thing (say Tolkien's scripts) to Unicode would be a pathetic abuse of power.
And there's plenty of space anyway.
> * Invention of a new language like Esperanto. But would any sane new language use a writing system like Chinese ? And, if it did, why would anyone want to take it seriously enough to add it to Unicode. Let's say I design a language which, by definition, uses /all/ the Unicode glyphs, in pairs, to denote a fixed but large set of words. That /can't/ fit into any possible Unicode-like scheme since it has been deliberately designed to break any finite scheme. So why should the scheme be extended to support it ?
> * New discovery by archeologists of ancient writing systems. Certainly possible, and I would even call it probable. But why should that require more space than is already available ?
> * Contact with alien civilization which use a different character set. Since Unicode is designed around /human/ writing schemes, reflecting /human/ perceptual processes and /human/ cultural history(ies), I don't think it would be legitimate (and almost certainly impossible) to use Unicode to represent another species' communication systems. Far better to adopt /their/ version of Unicode for representing their communications.
And anyway, I sort of doubt whether there are any alien civilisations -- the universe is far too big.
-- chris
Oliver Wong - 19 May 2006 15:59 GMT >> I like the flexibility of adding new characters. > [quoted text clipped - 5 lines] > to > play in). Yes, I meant that I like the idea that the consortium can add letters as needed.
>> If we define a size >> on char, then we either have a finite number of characters we can define, [quoted text clipped - 16 lines] > exhaust > that (and where are those writing systems lurking ?). I don't think it should make sense to manipulate characters as integers, just like it doesn't make sense to manipulate Strings which coincidentally have length 1 as integers.
If you're doing some sort of ASCII manipulation stuff, then you're not actually dealing with the characters themselves, but the byte-encoding of those characters in the ASCII encoding system, for example. So you'd take your characters, convert them to integers or bytes or whatever using an ASCII encoder, and then manipulate those integers, then convert them back to characters using an ASCII decoder, for example.
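The encode, manipulate, decode round trip described above might look like the following in Java. It is a minimal sketch: AsciiRoundTrip and upperCaseViaBytes are hypothetical names, and the input is assumed to be pure ASCII (the standard encoder replaces anything else with '?').

```java
import java.nio.charset.StandardCharsets;

final class AsciiRoundTrip {
    /** Upper-cases a-z by arithmetic on ASCII byte values, leaving other bytes alone. */
    static String upperCaseViaBytes(String text) {
        // Characters -> bytes, through an explicit ASCII encoder.
        byte[] encoded = text.getBytes(StandardCharsets.US_ASCII);

        // The integer manipulation happens on the encoded bytes, not on chars.
        for (int i = 0; i < encoded.length; i++) {
            if (encoded[i] >= 'a' && encoded[i] <= 'z') {
                encoded[i] -= 'a' - 'A';
            }
        }

        // Bytes -> characters, through the matching ASCII decoder.
        return new String(encoded, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(upperCaseViaBytes("hello, world")); // HELLO, WORLD
    }
}
```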
At any rate, I don't think we should impose an upper limit on the number of useful symbols or characters that we allow ourselves to define. It reminds me of that "Nobody needs more than 640KB of RAM" (mis-)quote.
>> Potential sources for new characters (in approximate order of >> probability): [snip]
>> * Contact with alien civilization which use a different character >> set. [quoted text clipped - 8 lines] > version of > Unicode for representing their communications. Is this "humans-only" requirement actually documented anywhere? I mean, if we found out that, for example, spiders encoded some communicative information within the patterns of their webs and we managed to decode it, would it be "against policy" to add symbols from this spider-language to Unicode? Or would we say "well, now since we, as humans, have decoded it, it becomes a human writing scheme, and so is apt to be used in Unicode"?
I don't know what assumptions Unicode makes, but it seems to me that if it's possible to add characters to it to support alien languages, it certainly would be worthwhile to do so upon encountering those languages.
I guess Unicode assumes that there exists a definite ordering of the character streams (e.g. right to left, top to bottom). Or maybe it's not Unicode which makes that assumption, but rather our Strings which do so. If an alien civilization's natural language resembled BeFunge, I'm not sure how well our concepts of strings could cope, though we could certainly add each symbol within that language to Unicode.
- Oliver
Bent C Dalager - 19 May 2006 17:24 GMT > Is this "humans-only" requirement actually documented anywhere? It is probably an emergent property of the system.
>I mean, >if we found out that, for example, spiders encoded some communicative >information within the patterns of their webs and we managed to decode it, >would it be "against policy" to add symbols from this spider-language to >Unicode? Or would we say "well, now since we, as humans, have decoded it, it >becomes a human writing scheme, and so is apt to be used in Unicode"? You would get into trouble if it turns out that the exact stickiness (however stickiness is measured) of the strands involved in the symbol are vital to the correct interpretation of the message.
How do you represent stickiness in Unicode?
Or perhaps pheromones add vital information to the picture.
> I don't know what assumptions Unicode makes, but it seems to me that if >it's possible to add characters to it to support alien languages, it >certainly would be worthwhile to do so upon encountering those languages. Our ability to do so presumably depends upon the aliens having the exact same concept as we do as to what a glyph is. The simplest variation (depending on 3D glyphs perhaps, or animated ones)* is likely to throw off Unicode.
* - For all I know, Unicode may support this, but you get the idea.
Cheers Bent D
 Signature Bent Dalager - bcd@pvv.org - http://www.pvv.org/~bcd powered by emacs
Oliver Wong - 19 May 2006 20:18 GMT >>if we found out that, for example, spiders encoded some communicative >>information within the patterns of their webs and we managed to decode it, [quoted text clipped - 10 lines] > > Or perhaps pheromones add vital information to the picture. I don't think Unicode says anything about the representation of such characters. There's a platonic ideal representing the concept of the first character in the lowercase alphabet, 'a'. Unicode assigns a number to that character (\u0061), but it doesn't say anything about what that character looks like, or how it should be drawn. There's another character, \u0430, which is visually indistinguishable from \u0061 in all fonts I've seen, and yet it's "obviously" a different Unicode character by virtue of having a different number.
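As a rough illustration of the point about look-alike characters, the following sketch compares the two code points mentioned above. It assumes Java 7 or later for `Character.getName`; the class name and constants are made up for the example:

```java
public class LookAlikes {
    // The two characters discussed above: Latin 'a' and its Cyrillic look-alike.
    static final char LATIN_A = '\u0061';
    static final char CYRILLIC_A = '\u0430';

    public static void main(String[] args) {
        // Different numbers, so different characters, whatever the font draws.
        System.out.println(LATIN_A == CYRILLIC_A);                    // false

        // The standard's own names for the two code points.
        System.out.println(Character.getName(LATIN_A));               // LATIN SMALL LETTER A
        System.out.println(Character.getName(CYRILLIC_A));            // CYRILLIC SMALL LETTER A

        // They also live in different blocks of the code space.
        System.out.println(Character.UnicodeBlock.of(LATIN_A));       // BASIC_LATIN
        System.out.println(Character.UnicodeBlock.of(CYRILLIC_A));    // CYRILLIC
    }
}
```

Nothing in the program refers to how either character is drawn; the distinction is carried entirely by the assigned numbers and their properties.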
So this spider-character-set would have different code points for each character. It's up to the font designers to worry about how to represent stickiness or pheromones in their fonts (if they choose to do so at all).
Note that Unicode text isn't necessarily displayed visually; it could be displayed via speech readers, or braille devices. It's almost a natural fit to represent stickiness via any tactile output device like braille. Similar devices could be constructed to accurately represent pheromones olfactorily.
>> I don't know what assumptions Unicode makes, but it seems to me that >> if [quoted text clipped - 7 lines] > > * - For all I know, Unicode may support this, but you get the idea. Again, I believe unicode isn't interested in the actual representation of these characters.
- Oliver
Roedy Green - 20 May 2006 00:01 GMT >You would get into trouble if it turns out that the exact stickiness >(however stickiness is measured) of the strands involved in the symbol >are vital to the correct interpretation of the message. More likely is decoding something of squid language which entails rapid subtle colour changes. Presumably they compute something with their outsized brains.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Chris Uppal - 21 May 2006 12:24 GMT > I don't think it should make sense to manipulate characters as > integers, just like it doesn't make sense to manipulate Strings which > coincidentally have length 1 as integers. At one level I agree with you; there's something unnatural about conflating characters and integers. In fact Smalltalk works exactly how you suggest, and my own Unicode implementation for Smalltalk (under construction) works that way too, so I have a fair bit of experience using a system which separates the two concepts.
But that's only half the story. You also need to be able to do a significant subset of arithmetical operations on character values (indexing into arrays for instance), and such operations often turn up in places where constantly casting back-and-forth between integer code points and actual characters would be painful and/or inefficient. Java doesn't really support the idea of "hybrid" values -- half arithmetical, half not -- so, barring major changes to the language, I'd stick with the current scheme, but make "char" wider.
It's perhaps worth emphasising that, in Unicode, a character has very little meaning by itself -- it is, in general, not possible to do anything very useful with a character which isn't an element of a stream or string. Pretty-much the only things you can legitimately do with a char are compare it with another char or use it as a lookup index into Unicode character property tables. A character is /not/ like a short string -- it's a different class of entity entirely.
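The "lookup index into character property tables" use can be sketched with the static methods on `java.lang.Character`, which accept an `int` code point. This is only an illustration; the class and the `describe` helper are hypothetical names:

```java
public class CharProperties {
    // Hypothetical helper: looks up a few Unicode properties for a code point.
    static String describe(int codePoint) {
        return String.format("U+%04X letter=%b digit=%b category=%d",
                codePoint,
                Character.isLetter(codePoint),
                Character.isDigit(codePoint),
                Character.getType(codePoint)); // general category as an int constant
    }

    public static void main(String[] args) {
        System.out.println(describe('a'));     // a lowercase letter
        System.out.println(describe('7'));     // a decimal digit
        System.out.println(describe(0x2062));  // U+2062 INVISIBLE TIMES
    }
}
```

The returned category is one of the `Character` constants such as `Character.LOWERCASE_LETTER` or `Character.DECIMAL_DIGIT_NUMBER`, so the character itself is used purely as a key into the tables.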
Tell you what. How about, since we're redefining Java anyway, we rename "char" to "codepoint" ? It would be more accurate...
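For context, Java 5 already exposes code points as plain `int` values next to the 16-bit `char`, which shows why a wider type would be the more accurate one. A small sketch, with a hypothetical class and helper name, using a character outside the 16-bit range (U+1D538, chosen arbitrarily):

```java
public class CodePoints {
    // Hypothetical helper: builds a String from a single code point.
    // Code points above U+FFFF need two chars (a surrogate pair).
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String s = fromCodePoint(0x1D538);

        System.out.println(s.length());                                    // 2 chars
        System.out.println(s.codePointCount(0, s.length()));               // 1 code point
        System.out.println(Integer.toHexString(s.codePointAt(0)));         // 1d538
        System.out.println(Integer.toHexString(Character.MAX_CODE_POINT)); // 10ffff
    }
}
```

With the current 16-bit `char`, one character in the Unicode sense can occupy two `char` values, so `length()` and the number of characters disagree.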
> Is this "humans-only" requirement actually documented anywhere? Not that I know of, although it wouldn't surprise me to find the human-centric design principles discussed somewhere. Unicode includes rather a lot of thoughtful and interesting meta-discussion in its documentation (if not in the standard itself).
The way that Unicode works is extremely practical and /not/ universal (see below). It introduces features only if they are used in some target orthography. Thus it has ligatures, since they are essential in many systems of writing. It also attempts to make round-tripping from other charsets, into Unicode, and back possible (no information lost), and so has a very limited number of Latin ligatures (and that's the /only/ reason it has Latin ligatures). No writing system uses colour to denote meaning (that I know of) and so Unicode doesn't touch colour. The result of this YAGNI-like focus on features that are actually needed, is that Unicode inevitably reflects the human processes which create written languages, and which determine their logical structure. One huge example is that human vision uses edge-detection heavily. As a result Unicode glyphs are /shapes/ -- shapes which can be rendered as black-on-white.
BTW, don't get misled by the odd few Unicode code points which are assigned to non-visual purposes -- the BOM being a good example, or the directionality markers. There are damned few of those, and for the most part they only exist in order to allow round-tripping or the use of Unicode in a context where insufficient meta-information is available, and their use is discouraged in other contexts. Unicode is /about/ shapes.
It's worth considering how much Unicode /doesn't/ have which it might be expected to include if the focus weren't so limited. For instance it has no way of expressing /semantic/ qualifiers on text such as italics (or, more abstractly, emphasis). It has no means of rendering prosody beyond the limited expression implied by existing punctuation schemes[*]. Yet if the text-to-speech example could be taken as a core use for Unicode -- i.e. as a true alternative rendering of Unicode, on an equal footing with printing text on paper -- then such annotations would seem to be highly desirable, perhaps even necessary.
([*] Another aside: apparently English punctuation started out -- with the Greeks, naturally -- purely as a way of expressing prosody, but at around the time fully modern English emerged, the punctuation system had its own mini-revolution: new marks were invented, old marks were reinterpreted or discarded, and the role of punctuation shifted away from expressing prosody to expressing grammar and other semantic features of text.)
> I > mean, if we found out that, for example, spiders encoded some [quoted text clipped - 3 lines] > humans, have decoded it, it becomes a human writing scheme, and so is apt > to be used in Unicode"? I don't think it's a policy thing at all. If this situation were ever to arise, then I think one of two things would happen. Either we humans (not being able to "see" the patterns properly since we lack the necessary brain circuitry) would develop an independent glyph-system for representing the patterns (and whatever other features were needed). In that case the new glyph system might get added to Unicode if enough humans wanted to represent Spiderese texts in their discussions with other humans. Note that the spiders themselves would probably not be able to "see" our human glyphs any more than we could see theirs, so this system would be solely for human use. This is roughly what has happened for musical notation[**]. Alternatively it might turn out that human/spider brains were similar enough that we could read their patterns directly (I have to say that I find this almost impossible to imagine), in that case it would come down to the practicalities. Does written Spiderese break down into a glyph system similar enough to the existing human ones for it to be expressed in the Unicode framework ? I find this even harder to imagine, but if it /did/ turn out that way then I see no reason for spider-glyphs not to be added to Unicode. To me (presupposing the existence of other intelligences at all) it seems much more likely that their communications wouldn't have a modality which was anywhere near close enough to human writing to fit into Unicode. Spiders, for instance, might be much more likely to use moving patterns of standing waves in their webs (vibrations /matter/ to spiders). Almost any species might naturally record meaning as structures in a very-high dimensional space -- smell is far more universal on Earth than vision.
([**] BTW, it seems to me that musical notation is in Unicode because people want to write /about/ music, not in order to /express/ music per se.)
Your (snipped) point about Unicode assuming sequence is well-taken. Some human written languages don't make much use of sequence. I can't remember which off-hand, but some of the old South American languages just bung a number of symbols/pictures together into a cartoon-like frame, and leave it to the reader to work out which express a meaning and which qualifies what. It's an interesting system since it allows a lot of freedom for the writer to be creative with the pictures and layout. I don't know how such systems would be mapped into Unicode. It'd be possible, I suppose, to write the symbols down in an arbitrary, or conventional, order, but I don't know if that would be any use for scholars, who might want to preserve the spatial layout. If not then they'd probably be better off using JPEGs instead of Unicode text.
I /think/ I may have worked out where we're seeing Unicode differently. There's a parallel with dictionaries, which come in two broad flavours. There are the dictionaries which attempt to /record/ what the (written or not) language is like at a given time and place (or over a range of such). The OED is the incomparable exemplar of this school of thought. And then there are the /prescriptive/ dictionaries -- ones which attempt to tell readers what the "correct" meaning and spelling of a word is. In the dictionary world the prescriptive idea has long gone out of fashion[***], and prescriptive dictionaries are only used for teaching purposes. So, if people start -- say -- confusing "convince" and "persuade", the dictionaries will simply reflect that in their next edition, whereas a school dictionary will attempt to dictate that the two words have separate meanings (with a small amount of overlap).
The parallel here is that I think you are seeing Unicode as non-prescriptive in that sense, whereas I see it as essentially prescriptive. Its purpose -- as I see it -- is not to /record/ the diversity of the world's scripts, but to /standardise/ their computerised representation. The motive is purely practical, with no scholarly side to it at all. (Although considerable scholarship goes into creating it, and it is intended to be used /by/ scholars.) The purpose is only to allow people to share written texts across different computers -- and for that a prescriptive approach is necessary. A /standard/.
([***] Since about Samuel Johnson's time, although the idea does resurface from time to time -- I believe the original Webster's Dictionary was primarily prescriptive.)
-- chris
Oliver Wong - 23 May 2006 21:22 GMT [Snipped long, but very interesting response -- thanks Chris]
I found Chris' reply very interesting and informative, and it inspired me to actually go read The Unicode Standard document. I'll copy and paste interesting block quotes later on in this post, but for the extremely impatient, here's a bullet point summary.
* The Unicode Standard does set a limit on itself at 0x10FFFF (or just over a million) characters. I don't know why.
* Unicode deals with abstract semantical concepts of a character, and not with the glyph, graphic, picture or whatever you want to call it, that is used to actually visually render that character legible.
* They specifically say that they do not wish to cover "dance notations".
* One interesting (to me anyway, and in the context of this discussion) character is U+2062. It's a character which is traditionally invisible (though I suppose fonts are free to supply a graphic for it) which represents the mathematical concept of multiplication. That is, when you want to write the concept "A times B", you'd write the character 'A', the U+20 |
|