
Java Forum / General / May 2006


If you could add anything you want

John Gagon - 12 May 2006 16:28 GMT
If you could add anything you wanted to the java language, what would
it be?

I'd predict some would say the non-imperative stuff, i.e. closures or the
LISP-like abilities to work almost purely functionally or do macros.
Some might get smart and say "everything dot NET (dot NYET) has".

One thing that would be a nice compiler check, and would prevent some
organizationally stupid coding, is better information hiding. This is
rather original, and as with all such things it may not be the best
idea, which is why I bounce it here:

Example:

(public*) package com.mycomp.mysoft.persistence
    ignores
        com.mycomp.mysoft.services,
        com.mycomp.mysoft.web;

private package com.mycomp.mysoft.business
    knows
        com.mycomp.mysoft.persistence,
        com.mycomp.mysoft.services;

* the default

This could have been a java swing example just as easily. Anyhow,
pardon me since it's been ten years but IIRC, C++ had friend and other
features that seemed to work like this but perhaps that is more a class
or type-grained feature.

Perhaps the reverse could also be asked: what would you
remove from the java language if you could, or what is its biggest
mistake? (Originality being more interesting here IMHO.) E.g., I know
that the way package and protected work in terms of visibility
is often criticized.

Thank you (in advance) for any corrections to conceptual mishaps here I
might have too.

John Gagon
Struggling Software Engineer
VisionSet - 12 May 2006 16:29 GMT
> If you could add anything you wanted to the java language, what would
> it be?

Extending the OO paradigm from just classes to package structures.  They
could be so much more than simple name spaces.

--
Mike W
John Gagon - 12 May 2006 17:08 GMT
> > If you could add anything you wanted to the java language, what would
> > it be?
>
> Extending the OO paradigm from just classes to package structures.  They
> could be so much more than simple name spaces.

Yes, this would be very cool. I'm just thinking about the possibilities
of this and eliminating duplication, abstracting with packages and
doing more with various kinds of dependencies. Those dependencies could
be externally defined for example or there could be some further
constraint on all classes in those packages placed in there. It's
boggling but I can see already it would be very useful.
Oliver Wong - 12 May 2006 16:48 GMT
> If you could add anything you wanted to the java language, what would
> it be?

   I wrote about this before: Syntactic sugar for "final from now on". I.e.
instead of:

<code>
final String myFinalValue;   // assigned exactly once below
{
 String temp = null;         // must be initialized before the null check
 for (Foo f : bar) {
   if (f.someCondition) {
     temp = f.toString();
     break;
   }
 }
 if (temp == null) {
   myFinalValue = "";
 } else {
   myFinalValue = temp;
 }
}

new AnonymousClass() {
 public void method() {
   System.out.println(myFinalValue);
 }
};
</code>

something like

<code>
String myFinalValue = "";
for (Foo f : bar) {
 if (f.someCondition) {
   myFinalValue = f.toString();
   break;
 }
}
finalize myFinalValue;   // proposed syntax: no assignments allowed after this point

new AnonymousClass() {
 public void method() {
   System.out.println(myFinalValue);
 }
};
</code>

   - Oliver
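
For comparison, a minimal sketch of one workaround available without any new syntax, assuming the search can be moved into a helper method so that the variable is assigned only once. The class name, the helper `firstLongName` and the "longer than 3 characters" condition are hypothetical stand-ins for `Foo`, `bar` and `someCondition` above.

```java
import java.util.Arrays;
import java.util.List;

class FinalFromNowOnSketch {
    // Hypothetical helper: returns the first name longer than 3 characters,
    // or "" if there is none. All the "maybe reassigned" logic lives here.
    static String firstLongName(List<String> names) {
        for (String name : names) {
            if (name.length() > 3) {
                return name;
            }
        }
        return "";
    }

    public static void main(String[] args) {
        // A single assignment, so the variable can be final from the start
        // and is usable inside the anonymous class.
        final String myFinalValue = firstLongName(Arrays.asList("ab", "abcd", "abcdef"));
        Runnable printer = new Runnable() {
            public void run() {
                System.out.println(myFinalValue);
            }
        };
        printer.run();
    }
}
```

The cost is an extra method per such variable, which is presumably the clutter the proposed keyword is meant to avoid.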
John Gagon - 12 May 2006 17:12 GMT
> > If you could add anything you wanted to the java language, what would
> > it be?
[quoted text clipped - 46 lines]
>
>     - Oliver

I see, so you basically prevent further assignments with a compiler
check and it would eliminate potential side effects. A "single"
assignment declaration (but delayed) might be useful too like a
finalize_after_assignment String myvar; (but it would be some single
short keyword)

John
Hendrik Maryns - 15 May 2006 10:03 GMT
Oliver Wong schreef:
>> If you could add anything you wanted to the java language, what would
>> it be?
>
>    I wrote about this before: Syntactic sugar for "final from now on".
> I.e. instead of:

Indeed, a situation where I have thought a few times: I’d like Eiffel’s
once functions here.

H.
--
Hendrik Maryns

==================
http://aouw.org
Ask smart questions, get good answers:
http://www.catb.org/~esr/faqs/smart-questions.html
Eric Sosman - 12 May 2006 18:08 GMT
John Gagon wrote On 05/12/06 11:28,:
> If you could add anything you wanted to the java language, what would
> it be?

   An antidote for Eternal September.

Eric.Sosman@sun.com

Roedy Green - 12 May 2006 19:43 GMT
>    An antidote for Eternal September.

http://www.answers.com/topic/eternal-september

All time since September 1993. One of the seasonal rhythms of the
Usenet used to be the annual September influx of clueless newbies who,
lacking any sense of netiquette, made a general nuisance of
themselves. This coincided with people starting college, getting their
first internet accounts, and plunging in without bothering to learn
what was acceptable. These relatively small drafts of newbies could be
assimilated within a few months. But in September 1993, AOL users
became able to post to Usenet, nearly overwhelming the old-timers'
capacity to acculturate them; to those who nostalgically recall the
period before, this triggered an inexorable decline in the quality of
discussions on newsgroups. Syn. eternal September. See also AOL!.

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

John Gagon - 13 May 2006 07:31 GMT
> >    An antidote for Eternal September.
>
[quoted text clipped - 11 lines]
> period before, this triggered an inexorable decline in the quality of
> discussions on newsgroups. Syn. eternal September. See also AOL!.

An obscurely guised dig at the OP, namely myself then or perhaps just a
statement of affairs in general...

John Gagon
alexandre_paterson@yahoo.fr - 12 May 2006 20:24 GMT
> If you could add anything you wanted to the java language, what
> would it be?

The one and only true Design by Contract, as defined by its inventor.

This should imply also *removing* all those unnecessary gotos from
the language (ie the over-abused and misused exceptions) and leave
exceptions, well, for really exceptional conditions (in Eiffel that
happens only when some part breaks a contract).

I've been following closely stuff like JML, Nice, etc. and more
recently Contract4J.

But real DbC integrating nicely in my IDE, with that IDE
reporting possibly broken contracts in real time, would be gorgeous
(that can be seen in demos of Microsoft's Spec# "specsharp"
research language and it is *really* impressive).

Link to an overview here (works fine under OpenOffice):

http://research.microsoft.com/~leino/papers/SpecSharp-MPI-SS.ppt

For the moment I'm stuck with IntelliJ IDEA's @NotNull Java 1.5
annotation...  It's already a good addition to the language (some
would say it's just a fix for a language defect ;)

http://www.jetbrains.com/idea/features/newfeatures.html#nullable

I can't wait to be able to specify real contracts on my abstract
data types!
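
To make "contracts on an abstract data type" concrete, here is a minimal hand-rolled sketch in plain Java, with preconditions and one postcondition written as explicit checks. It is only an approximation of Design by Contract, not the Eiffel mechanism or any of the tools named above, and the class `BoundedStackSketch` is a hypothetical example.

```java
class BoundedStackSketch {
    private final int[] items;
    private int size;

    BoundedStackSketch(int capacity) {
        // Precondition: capacity must be positive.
        if (capacity <= 0) {
            throw new IllegalArgumentException("precondition violated: capacity > 0");
        }
        items = new int[capacity];
    }

    void push(int value) {
        // Precondition: the stack is not full.
        if (size == items.length) {
            throw new IllegalStateException("precondition violated: stack not full");
        }
        int oldSize = size;
        items[size++] = value;
        // Postcondition: one more element, and the top is the value just pushed.
        if (size != oldSize + 1 || items[size - 1] != value) {
            throw new AssertionError("postcondition violated in push");
        }
    }

    int pop() {
        // Precondition: the stack is not empty.
        if (size == 0) {
            throw new IllegalStateException("precondition violated: stack not empty");
        }
        return items[--size];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        BoundedStackSketch stack = new BoundedStackSketch(2);
        stack.push(1);
        stack.push(2);
        System.out.println(stack.pop());
    }
}
```

Here the only exceptions thrown are contract violations, which is the spirit of the suggestion; what is missing is any static or IDE-level checking of the contracts.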

Of course YMMV,

 Alex
Kent Paul Dolan - 14 May 2006 09:04 GMT
"alexandre_paterson" <alexandre_paterson@yahoo.fr>
wrote:

> This should imply also *removing* all those
> unnecessary gotos from the language (ie the
> over-abused and misused exceptions) and leave
> exceptions, well, for really exceptional
> conditions (in Eiffel that happens only when some
> part breaks a contract).

Umm, that in itself would be one of my wishes:
remove exception handling hell by allowing
exceptions to propagate upward unhandled (untried,
uncaught) and unremarked by redundant "throws ...",
or unrelayed by rethrowing, (but warn at compile
time that such is the case) and at the top level to
default to an exception dump, stackdump and hard
error exit.

This would cater for writing the base "added value"
code _before_ doing the exception code, rather than
having explicit exception handling be a requirement
for compilation or testing. It is way maddening to
have simple-in-concept code cluttered by try...catch
sets, throws declarations, rethrowing, and all the
messes they make out of variable scoping, before the
main intellectual product code can be tested at all.

The second one would be to decouple class names from
file names to reduce the proliferation of space
wasting tiny files in a typical large application
suite. If other languages can use symbol tables and
linkage editors to cross-link compiled files, and
have done so for decades, Java can probably do the
trick as well.

It is even nastier to navigate a directory with a
thousand individual class files in it, than to edit
a single file with those thousand class definitions
in it.

Probably file level encapsulation should allow up to
a whole package in a single file, rather than merely
a class, an interface, or an exception declaration,
as now.

Third, allow separation of declaration and
implementation, as Ada does. This isn't quite the
same thing as declaring a Java interface and then
separately implementing it one or several times,
because in Ada, the implementation of a declaration
is unique.

The Ada style allows the whole software suite
interface to be declared and compiled, cleaned up,
and debugged, before a stick of implementation is
written. This allows much more robust large system
architecting.

FWIW

xanthian.
Kent Paul Dolan - 14 May 2006 09:11 GMT
"alexandre_paterson" <alexandre_paterson@yahoo.fr>
wrote:

> This should imply also *removing* all those
> unnecessary gotos from the language (ie the
> over-abused and misused exceptions) and leave
> exceptions, well, for really exceptional
> conditions (in Eiffel that happens only when some
> part breaks a contract).

Umm, that in itself would be one of my wishes:
remove exception handling hell by allowing
exceptions to propagate upward unhandled (untried,
uncaught) and unremarked by redundant "throws ...",
or unrelayed by rethrowing, (but warn at compile
time that such is the case) and at the top level to
default to an exception dump, stackdump and hard
error exit.

This would cater for writing the base "added value"
code _before_ doing the exception code, rather than
having explicit exception handling be a requirement
for compilation or testing. It is way maddening to
have simple-in-concept code cluttered by try...catch
sets, throws declarations, rethrowing, and all the
messes they make out of variable scoping, before the
main intellectual product code can be tested at all.

[I'd even suggest that a "try{}" _not_ define a scope
of variable visibility, but instead variables declared
within it have scope the next surrounding scope. An
explicit limited scope can easily be created with an
extra set of brackets "try{{}}", but the current scope
rules mean that to be seen in the catch(){} or after the
try/catch, the variables must be declared before the
try, conflicting with the best practice that variables
be declared where they are first used.]
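
To illustrate the scoping complaint, a minimal sketch of the current rules (the class and method names are hypothetical): a variable that must be visible after the try/catch has to be declared before the `try`, away from its first use.

```java
class TryScopeSketch {
    static int parseOrDefault(String text, int fallback) {
        // Declared here, not at first use, because anything declared
        // inside the try block would be out of scope after it.
        int value;
        try {
            value = Integer.parseInt(text);
        } catch (NumberFormatException e) {
            value = fallback;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("42", 0));
        System.out.println(parseOrDefault("not a number", 7));
    }
}
```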

The second one would be to decouple class names from
file names to reduce the proliferation of space
wasting tiny files in a typical large application
suite. If other languages can use symbol tables and
linkage editors to cross-link compiled files, and
have done so for decades, Java can probably do the
trick as well.

It is even nastier to navigate a directory with a
thousand individual class files in it, than to edit
a single file with those thousand class definitions
in it.

Probably file level encapsulation should allow up to
a whole package in a single file, rather than merely
a class, an interface, or an exception declaration,
as now.

Third, allow separation of declaration and
implementation, as Ada does. This isn't quite the
same thing as declaring a Java interface and then
separately implementing it one or several times,
because in Ada, the implementation of a declaration
is unique.

The Ada style allows the whole software suite
interface to be declared and compiled, cleaned up,
and debugged, before a stick of implementation is
written. This allows much more robust large system
architecting than the Java style does.

FWIW

xanthian.
Hendrik Maryns - 15 May 2006 10:09 GMT
Kent Paul Dolan schreef:
> "alexandre_paterson" <alexandre_paterson@yahoo.fr>
> wrote:
[quoted text clipped - 33 lines]
> try, conflicting with the best practice that variables
> be declared where they are first used.]

You obviously did not understand what Alexandre was suggesting.  Read up
on error management in Eiffel.  It really means removing the /whole/
exception system.  But this of course will not happen.

> Third, allow separation of declaration and
> implementation, as Ada does. This isn't quite the
[quoted text clipped - 8 lines]
> written. This allows much more robust large system
> architecting than the Java style does.

You don’t believe in the single-source principle?

H.
--
Hendrik Maryns

==================
http://aouw.org
Ask smart questions, get good answers:
http://www.catb.org/~esr/faqs/smart-questions.html
Kent Paul Dolan - 16 May 2006 06:14 GMT
> Kent Paul Dolan schreef:

>>> This should imply also *removing* all those
>>> unnecessary gotos from the language (ie the
>>> over-abused and misused exceptions) and leave
>>> exceptions, well, for really exceptional
>>> conditions (in Eiffel that happens only when
>>> some part breaks a contract).

Notice, for the current purposes, that the above
quite specifically _does_ leave an exception system
in place, just not the current one.

>> Umm, that in itself would be one of my wishes:
>> remove exception handling hell by allowing
[quoted text clipped - 4 lines]
>> at the top level to default to an exception dump,
>> stackdump and hard error exit.

>> This would cater for writing the base "added
>> value" code _before_ doing the exception code,
[quoted text clipped - 5 lines]
>> make out of variable scoping, before the main
>> intellectual product code can be tested at all.

>> [I'd even suggest that a "try{}" _not_ define a scope
>> of variable visibility, but instead variables declared
[quoted text clipped - 5 lines]
>> try, conflicting with the best practice that variables
>> be declared where they are first used.]

> You obviously did not understand what Alexandre
> was suggesting.

Umm, I read what he _wrote_, not what some prior
agenda of mine wanted him to have written. Did you?

> Read up on error management in Eiffel.  It really
> means removing the /whole/ exception system.  But
> this of course will not happen.

I'm not particularly interested in turning Java into
Eiffel. Something about Eiffel, perhaps merely that
it wasn't made freely available early enough, maybe
its politics, perhaps its intellectual difficulty,
has kept it from becoming popular; it remains more
in the "minor languages" crowd than in the
"languages you need to know to find employment"
crowd.

Java in contrast has taken the software industry by
storm, despite being proprietary. Turning Java into
Eiffel would presumably destroy the popularity of
Java as well.

I don't have any particular problems with an
exception system. I'll grant that it can be abused
into a "goto" system, in Alexandre's words, but,
used with proper discipline, it provides well for a
"what to do when you can't go on" system that lets
you handle problems "paragraph by paragraph" rather
than "phrase by phrase".

What gives me hives is an exception system that
forces me to handle an exception twelve layers deep
in the stack twelve times to get it to the top level
of the application, in the case where I really do
want just to drop dead if the case occurs, but
someone else might want to cater for the exception
somewhere half-way in-between. What gives me the
grippe is an exception system that won't _let_ me
use something that might, someday, "throw", without
wrapping the invocation in tedious grunge that makes
my code too ugly to comprehend. Those are the parts
of Java's exceptions I would like to see fixed. I
have zero problems with a compilation system that
insists on warning about unhandled exceptions, so my
management has some way to impose quality control on
my code; I just have _huge_ problems with a
compilation system that refuses to _compile_ code
with unhandled propagation of exceptions _at all_.
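
For what it is worth, the usual workaround under the current rules can be sketched as follows: convert the checked exception to an unchecked one at a single boundary, so the layers in between need neither `try`/`catch` nor `throws`. This is an illustration under assumed names (`readConfig`, `middleLayer`, `topLayer` are hypothetical), and it gives up the compile-time warning asked for above.

```java
import java.io.IOException;

class UncheckedPropagationSketch {
    // Deepest layer: an operation that can fail with a checked exception.
    static String readConfig(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("config missing");
        }
        return "ok";
    }

    // Boundary: the one place where the checked exception is wrapped.
    static String readConfigUnchecked(boolean fail) {
        try {
            return readConfig(fail);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Intermediate layers: no try/catch, no throws clause, no rethrowing.
    static String middleLayer(boolean fail) {
        return readConfigUnchecked(fail);
    }

    static String topLayer(boolean fail) {
        return middleLayer(fail);
    }

    public static void main(String[] args) {
        System.out.println(topLayer(false));
        // topLayer(true) would propagate to the top and end the program
        // with a stack trace, the "drop dead" default described above.
    }
}
```

A layer half-way up that does want to handle the failure can still catch `RuntimeException` and inspect `getCause()`.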

>> Third, allow separation of declaration and
>> implementation, as Ada does. This isn't quite the
>> same thing as declaring a Java intreface and then
>> separately implementing it one or several times,
>> because in Ada, the implementation of a declaration
>> is unique.

>> The Ada style allows the whole software suite
>> interface to be declared and compiled, cleaned up,
>> and debugged, before a stick of implementation is
>> written. This allows much more robust large system
>> architecting than the Java style does.

> You don't believe in the single-source principle?

The "single source principle" I know says "don't
replicate code, refactor to make it a callable
routine instead". Ada doesn't violate that
principle, it merely allows the declaration and the
implementation of a method to reside in separate
source files, a dandy idea, allowing one to create
and review an interface uncluttered by its
implementation details. In terms of commercial
software, this allows that implementations need not
even be delivered as source code, if one is of that
"closed source" school, while catering that
declarations _can be and are_ delivered as source
code, much like C/C++ header files are.
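
The closest plain-Java approximation is probably an interface with exactly one implementing class. A minimal sketch with hypothetical names follows; as already said, it is not quite the Ada arrangement, since nothing in Java stops a second implementation from appearing.

```java
// Declaration: can be written, compiled and reviewed on its own.
interface Counter {
    void increment();
    int value();
}

// Client code compiles against the declaration alone.
class CounterClient {
    static int countTo(Counter counter, int n) {
        for (int i = 0; i < n; i++) {
            counter.increment();
        }
        return counter.value();
    }
}

// Implementation: written later, possibly in another source file,
// and need not be shipped as source.
class SimpleCounter implements Counter {
    private int count;

    public void increment() {
        count++;
    }

    public int value() {
        return count;
    }
}

class DeclarationFirstSketch {
    public static void main(String[] args) {
        System.out.println(CounterClient.countTo(new SimpleCounter(), 3));
    }
}
```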

[Nor am I a "language theorist", chained to some
family of inviolable language design principles; I'm
a practicing applications programmer, since 1961,
and what I'm interested to find in a language's
design isn't inflexibility for the sake of principle,
but stuff that helps me do that job well and
efficiently; I'm a true fan of Larry Wall and Perl.]

xanthian.
Hendrik Maryns - 16 May 2006 09:14 GMT
Kent Paul Dolan schreef:
>> Kent Paul Dolan schreef:
>
[quoted text clipped - 43 lines]
> Umm, I read what he _wrote_, not what some prior
> agenda of mine wanted him to have written. Did you?

Hm, ok maybe you’re right.  I indeed read his suggestion as abolishing
all exceptions.

>> Read up on error management in Eiffel.  It really
>> means removing the /whole/ exception system.  But
[quoted text clipped - 21 lines]
> you handle problems "paragraph by paragraph" rather
> than "phrase by phrase".

Agree with all that.

> What gives me hives is an exception system that
> forces me to handle an exception twelve layers deep
[quoted text clipped - 14 lines]
> compilation system that refuses to _compile_ code
> with unhandled propagation of exceptions _at all_.

But I'm not sure about this.  I will comment no further: I do feel that
it often makes code ugly now, but I am not sure I’d find your system better.

>>> Third, allow separation of declaration and
>>> implementation, as Ada does. This isn't quite the
[quoted text clipped - 24 lines]
> declarations _can be and are_ delivered as source
> code, much like C/C++ header files are.

The single source principle I know says: ‘put everything that concerns
one unit/class/module into one file’.  By everything I mean
documentation, interface, implementation, specification, contract, ...
Then along with compilers the suitable tools should be delivered to
extract the appropriate view.  Which is what happens reasonably well
with Javadoc, but I agree there might be a need for a tool to abstract
the interface view of a class without having to define a Java interface
for it.  (Again, see Eiffel(Studio) for more different views.)

> [Nor am I a "language theorist", chained to some
> family of inviolable language design principles; I'm
[quoted text clipped - 3 lines]
> but stuff that helps me do that job well and
> efficiently, a true fan of Larry Wall and Perl.]

Hm, we certainly differ in that :-)

H.

--
Hendrik Maryns

==================
http://aouw.org
Ask smart questions, get good answers:
http://www.catb.org/~esr/faqs/smart-questions.html
John Gagon - 16 May 2006 21:31 GMT
> > If you could add anything you wanted to the java language, what
> > would it be?
[quoted text clipped - 30 lines]
>
>   Alex

I don't see where my message went so I'll summarize this time. I do
like Nice a lot. I will look at Spec Sharp. Looks good though. Well
done presentation.

John Gagon
Ed - 13 May 2006 15:07 GMT
> If you could add anything you wanted to the java language, what would
> it be?

An accessor level between package-private and public, so that a class
could be visible not to the whole system, but just a group of packages:
http://www.edmundkirwan.com/servlet/fractal/frac-page56.html

.ed

--
www.EdmundKirwan.com - Home of The Fractal Class Composition
John Gagon - 15 May 2006 03:28 GMT
> > If you could add anything you wanted to the java language, what would
> > it be?
[quoted text clipped - 4 lines]
>
> .ed

That seems like it would go hand in hand with the restricted packaging.
Cool.

John Gagon
John Gagon - 15 May 2006 06:17 GMT
> > If you could add anything you wanted to the java language, what would
> > it be?
>
> An accessor level between package-private and public, so that a class
> could be visible not to the whole system, but just a group of packages:
> http://www.edmundkirwan.com/servlet/fractal/frac-page56.html

You know, I just read through your website and downloaded the analysis
tool. I find it fairly helpful. I use a lot of metrics, LoD and PMD
and CPD etc., and this is another one I'll use as well. I tend to go
crazy on this kind of stuff, including other kinds of tools for
execution coverage, dead code analysis, test coverage and profiling. I
really like objective statistics and code reviewing.

Does your tool, if used then guarantee certain metrics? (I could guess
but I'd rather know if you intended to cover any or unintentionally
resolved any)

BTW, I did notice a few missing words (I tend to do that) here and
there in the article. I can provide correction if you like and I like
the code examples. I was hoping to look at source for examples of how
your facades and singletons looked but I noticed you have it
obfuscated. I'm about ready to run it on some code now. Unfortunately,
one of my modules I wrote not long ago got a 0.58 but one of my other
older pieces had a 0.72. (I haven't done any other metrics on them
yet). I'm somewhat perfectionist.

John Gagon
Ed Kirwan - 16 May 2006 10:05 GMT
>>> If you could add anything you wanted to the java language, what would
>>> it be?
[quoted text clipped - 13 lines]
> but I'd rather know if you intended to cover any or unintentionally
> resolved any)

Hi, John,

I'm not entirely sure what you mean by, "Guarantee certain metrics." Do
you mean, "Are the tool's metrics guaranteed to be correct?" Well, we
use them in our work, and I know of two other shops that use them; but
would I sign an iron-clad, financially-punitive contract declaring that
the tool is free from all bugs and so the metrics are guaranteed to be
correct for all inputs? Sadly, I would not. To date, however, there
have been few major complaints.

I get the feeling, however, that this is not what you meant.

> BTW, I did notice a few missing words (I tend to do that) here and
> there in the article. I can provide correction if you like

I would be delighted to receive your corrections. Engineering has
withered my English language skills to the point where they must cart
around their own bottled oxygen with them, and still they wheeze and
splutter at the slightest grammatical exertion; I would appreciate any
comments you have. It's rare indeed that anyone volunteers so
surprisingly important a service.

> and I like
> the code examples. I was hoping to look at source for examples of how
> your facades and singletons looked

Excellent point. I should make the source examples available as
downloads. Examples, in fact, are probably not enough; so as a gesture
of appreciation for your offer above, I'll open-source a full
application with a fractal index of a perfect 1.0. Give me a couple of
weeks to cobble together a program description; I'll post notification here.

> but I noticed you have it
> obfuscated. I'm about ready to run it on some code now. Unfortunately,
> one of my modules I wrote not long ago got a 0.58 but one of my other
> older pieces had a 0.72. (I haven't done any other metrics on them
> yet). I'm somewhat perfectionist.

I'm fortunately unafflicted by perfectionism: I hear it can be tiresome. :)

As with all metrics and as I'm sure you're aware, metrics should be
viewed with healthy caution. I'm not sure how much value can be gleaned
from pouring code that was designed without the fractal class
composition in mind into the Fractality code analyser because there are
many different methodologies that people use to maximise the OOness of
their system.

A fractal index of 0.58 does indeed suggest that a module was not,
"Programmed to an interface repository," and did not, "Eliminate
descendant dependencies;" but if these concepts were not used in the
construction of that module, then we're viewing the code from an angle
not considered by the designer: it's then perhaps no surprise that it
looks a little askew, but that doesn't imply that the code is poorly
designed; it's just designed in a way unfamiliar to an unbending code
analyser.

If, however, a module is designed from scratch with the fractal class
composition in mind, and yet still scores badly in Fractality, then we
can ask some drilling questions.

> John Gagon

www.EdmundKirwan.com - Home of The Fractal Class Composition.

Download Fractality, free Java code analyzer:
www.EdmundKirwan.com/servlet/fractal/frac-page130.html

John Gagon - 16 May 2006 12:17 GMT
> >>> If you could add anything you wanted to the java language, what would
> >>> it be?
[quoted text clipped - 18 lines]
> I'm not entirely sure what you mean by, "Guarantee certain metrics." Do
> you mean, "Are the tool's metrics guaranteed to be correct?"

I believe yes. I notice you include a lot of the standard metrics that
I use in the various analysis views. I'm assuming that once one
achieves the Fractal Class Composition score of 1.0, that the metrics
for instability, for example would be zero (afferent/efferent
couplings) and cyclomatic complexity would all be at a certain optimum
or value. I'm guessing some might come in as perfect while others are
"fairly close" to an ideal value (like distance and abstractness). In
any case, I wonder if some metric limits are reached by achieving a
score of 1.0 perhaps as a function of number of packages and classes.

> Well, we
> use them in our work, and I know of two other shops that use them; but
> would I sign an iron-clad, financially-punitive contract declaring that
> the tool is free from all bugs and so the metrics are guaranteed to be
> correct for all inputs?" Sadly, I would not. To date, however, there
> have been few major complaints.

It's always hard to tell with that one. The amount of money to risk seems to me
proportionate to perceived stability; I would think it would depend on
the amount in the contract.

> I get the feeling, however, that this is not what you meant.

Do you still get the feeling? (as I lost what your pronoun 'this'
(above) might refer to other than generally my question about
guaranteeing of certain metrics)

> > BTW, I did notice a few missing words (I tend to do that) here and
> > there in the article. I can provide correction if you like
[quoted text clipped - 5 lines]
> comments you have. It's rare indeed that anyone volunteers so
> surprisingly important a service.

I have sent them to you personally in a separate email.

>   and I like
> > the code examples. I was hoping to look at source for examples of how
[quoted text clipped - 5 lines]
> application with a fractal index of a perfect 1.0. Give me a couple of
> weeks to cobble together a program description; I'll post notification here.

Yes, that would be *very* useful. Very good idea there. I'll search for
it periodically in the future. Feel free to CC my email if you would
like me to look at it too. ;-)

>   but I noticed you have it
> > obfuscated. I'm about ready to run it on some code now. Unfortunately,
[quoted text clipped - 3 lines]
>
> I'm fortunately unafflicted by perfectionism: I hear it can be tiresome. :)

It sure can be. Well, a somewhat perfectionist, to be pedantic, is not
quite as bad as an absolute perfectionist though is it?

> As with all metrics and as I'm sure you're aware, metrics should be
> viewed with healthy caution. I'm not sure how much value can be gleaned
> from pouring code that was designed without the fractal class
> composition in mind into the Fractality code analyser because there are
> many different methodologies that people use to maximise the OOness of
> their system.

I use principles of keeping packages, classes and method sizes in a
certain range and I try to organize dependencies and in the past, I've
used a more bandaid approach using an open source tool called depfind
that searched the code for dependencies and spat out megabytes of xml
or html. As a maintainer of a mature codebase, this was more crucial
because stability was a primary goal at that point. Every code change
needed impact analysis and I would use the dependency checker run in an
ant script to find out the current dependencies and find the number of
other classes affected by the change. This preventative approach
reduces that need quite a bit. I used to work on this at HP before
offshoring and reduction of workforce occurred with the incoming CEO
replacing Carly.

> A fractal index of 0.58 does indeed suggest that a module was not,
> "Programmed to an interface repository," and did not, "Eliminate
[quoted text clipped - 4 lines]
> designed; it's just designed in a way unfamiliar to an unbending code
> analyser.

The code started out cleaner but then became more ratsnesty / spaghetti
and even with just myself programming it, it grew out of control since
I would work on it weekend to weekend since it was my own, on the side,
skunkwork/moonlighting project.

> If, however, a module is designed from scratch with the fractal class
> composition in mind, and yet still scores badly in Fractality, then we
> can ask some drilling questions.

Yes. I plan on refactoring to this standard if only for future
maintenance. It will be a guiding principle for all others working on
my free open version. I'm writing a tool which I will soon publish on
Sourceforge and java.net and later, I will finish a commercial grade
version with extra features. My tool is something more related to
prototyping and quick model driven development similar to projects like
trails/ruby on rails etc but with one other design goal in mind besides
"do not repeat yourself". It's been a long journey but I've got about
60% completion right now. (I'm also working on a personal tracking tool
like xplanner but supporting more calendar and recurring functions)

John Gagon
John Gagon - 16 May 2006 12:36 GMT
> > I'm not entirely sure what you mean by, "Guarantee certain metrics." Do
> > you mean, "Are the tool's metrics guaranteed to be correct?"
[quoted text clipped - 8 lines]
> any case, I wonder if some metric limits are reached by achieving a
> score of 1.0 perhaps as a function of number of packages and classes.

(Of course, those metrics are more often type-level metrics, whereas Fractal Class
Composition is more a package-level metric of its own. Maybe it's not
so relevant per se, but an additional item that is almost independent.)

John Gagon

> John Gagon
Ed Kirwan - 17 May 2006 14:15 GMT
> I believe yes. I notice you include a lot of the standard metrics that
> I use in the various analysis views. I'm assuming that once one
[quoted text clipped - 5 lines]
> any case, I wonder if some metric limits are reached by achieving a
> score of 1.0 perhaps as a function of number of packages and classes.

I had the same question myself, which is actually the reason for
including our good friend Robert C Martin's metrics in the analysis
tool. I was hoping that a system with a fractal index of 1.0 would show
a very low Distance metric. It's difficult to compare the two metrics,
of course, as the fractal index is system-wide, but the Distance metric
is per-package (why doesn't Uncle Bob develop a system-wide variant?),
but in those applications I've seen with a fractal index of 1.0, I've
not seen any packages with a Distance metric of higher than 0.5.
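
For reference, a minimal sketch of how the per-package Distance metric is
usually computed, assuming Robert C Martin's standard definitions:
instability I = Ce / (Ca + Ce), abstractness A = abstract types / all types,
and distance D = |A + I - 1|. The class and parameter names are illustrative
and are not taken from Fractality; treating an empty package as 0 is an
arbitrary choice made here.

```java
/** Illustrative sketch of Martin's package metrics; names are hypothetical. */
final class PackageMetrics {
    /** Instability I = Ce / (Ca + Ce): outgoing couplings over all couplings. */
    static double instability(int afferent, int efferent) {
        int total = afferent + efferent;
        return total == 0 ? 0.0 : (double) efferent / total;
    }

    /** Abstractness A = abstract classes and interfaces over all types. */
    static double abstractness(int abstractTypes, int totalTypes) {
        return totalTypes == 0 ? 0.0 : (double) abstractTypes / totalTypes;
    }

    /** Distance from the main sequence, D = |A + I - 1|, in the range 0..1. */
    static double distance(double abstractness, double instability) {
        return Math.abs(abstractness + instability - 1.0);
    }
}
```

A package that is fully concrete and fully stable (A = 0, I = 0) sits at
distance 1; one that is half abstract and half unstable sits at distance 0.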

Certainly, "Program to an interface repository, not an implementation
repository," should align well with the Distance metric, but the
correlation is still suspect; it's just too easy to get an accidentally
high Distance metric. (And I remember reading somewhere, sometime, that
someone else had made a slight alteration to the Distance metric ...
must check for that again.)

On the other hand, cyclomatic complexity is certainly well managed by,
"Eliminate descendant dependencies;" and indeed the only place where
cycles can occur is between two peer interface repositories, which are
in themselves quite rare (one interface repository is usually sufficient
to serve a package branch).

>>> BTW, I did notice a few missing words (I tend to do that) here and
>>> there in the article. I can provide correction if you like
>> I would be delighted to receive your corrections.
>
> I have sent them to you personally in a separate email.

Received and thank you, sir!

Signature

www.EdmundKirwan.com - Home of The Fractal Class Composition.

Download Fractality, free Java code analyzer:
www.EdmundKirwan.com/servlet/fractal/frac-page130.html

Kent Paul Dolan - 15 May 2006 09:20 GMT
> If you could add anything you wanted to the java language, what would
> it be?

Without a doubt, automation of the present mess
requiring programmers to divert GUI updating code
to the Event Handling Thread. This is so awful a
misfeature when left to the programmer as to make
Java GUI programming nearly intolerable, by seeding
insidious bugs into completely normal looking code
where this need hasn't been recognized as applying.
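
For readers who have not hit this yet, the manual step being complained
about looks roughly like the sketch below. SwingUtilities.isEventDispatchThread()
and SwingUtilities.invokeAndWait() are real Swing calls; the EdtRunner helper
and its name are made up for illustration.

```java
import java.lang.reflect.InvocationTargetException;
import javax.swing.SwingUtilities;

/** Hypothetical helper; the name EdtRunner is illustrative, not from the thread. */
final class EdtRunner {
    /** Runs the GUI update on the event dispatch thread and waits until it has finished. */
    static void runAndWait(Runnable guiUpdate) {
        if (SwingUtilities.isEventDispatchThread()) {
            guiUpdate.run();   // already on the EDT; invokeAndWait would throw an Error here
            return;
        }
        try {
            SwingUtilities.invokeAndWait(guiUpdate);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the EDT", e);
        } catch (InvocationTargetException e) {
            throw new IllegalStateException("GUI update failed", e.getCause());
        }
    }
}
```

A worker thread would then write something like
EdtRunner.runAndWait(() -> label.setText("done")) instead of touching the
label directly; forgetting to do so is exactly the kind of bug that looks
like normal code.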

xanthian.
Chris Uppal - 17 May 2006 10:09 GMT
> If you could add anything you wanted to the java language, what would
> it be?

Hmm... where does one start ?

> I'd predict some would say the non-imperative stuff ie: closures or the
> LISP like abilities to work almost purely functionally or do macros.

Without trying to change Java into a better /kind/ of language, here are a few
things which (IMO) stay within the spirit of Java but would have saved me
time/effort in the past.

Java would have unsigned integers (but no automatic coercion between
signed/unsigned of the same width).

The >>> and >> operators would be swapped over.

Right or left shifting by an impossible constant would provoke a compile-time
warning.
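
A small sketch of the current behaviour behind the three wishes above, using
only core language rules; the class and method names are illustrative. The
mask is the usual stand-in for an unsigned type, >> is the sign-extending
shift while >>> is the zero-filling one, and an int shift silently uses only
the low five bits of its distance.

```java
/** Illustrative helper; the class and method names are not from the thread. */
final class IntQuirks {
    /** Reads a byte as an unsigned value in 0..255, the usual workaround for the missing type. */
    static int unsignedByte(byte b) {
        return b & 0xFF;
    }

    /** Arithmetic shift: copies the sign bit in from the left. */
    static int arithmeticShiftRight(int value, int distance) {
        return value >> distance;
    }

    /** Logical shift: always shifts zeros in from the left. */
    static int logicalShiftRight(int value, int distance) {
        return value >>> distance;
    }

    /** For an int, only the low five bits of the distance are used, so 32 behaves like 0. */
    static int shiftLeft(int value, int distance) {
        return value << distance;
    }
}
```

So arithmeticShiftRight(-8, 1) is -4 while logicalShiftRight(-8, 1) is
2147483644, and shiftLeft(1, 32) quietly returns 1, which is the "impossible
constant" case that compiles without complaint today.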

char would be 32-bit.

String would be an abstract type with (the option of) different concrete
subclasses.

Auto-boxing would provoke a compile-time warning.
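
One hedged illustration of why a warning might be welcome: the sketch below
compiles cleanly, yet the return statement hides an unboxing conversion that
throws NullPointerException when the key is absent. The class and method
names are illustrative.

```java
import java.util.Map;

/** Illustrative example of a silent unboxing conversion; names are hypothetical. */
final class BoxingPitfall {
    /** Looks like a plain lookup, but unboxes the Integer (possibly null) returned by get(). */
    static int count(Map<String, Integer> counts, String key) {
        return counts.get(key);   // auto-unboxing happens here, with no syntax to show it
    }
}
```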

The [] notation would be available whenever the object implements some
interface, Indexable perhaps.  java.util.List would inherit that interface.

Operator overloading would be permitted in some disciplined manner.  Again,
probably a small group of interfaces -- Field, MultiplicativeGroup,
AdditiveGroup, perhaps.  Classes would be required to implement the whole set
of related operations, not just cherry-pick. Assignment operators like ++ and
*= would be translated by the compiler into x = x + MyClass.unity(), rather
than being available for roll-your-own overriding.  The argument types and
returned value of overloaded operators would be required to follow the pattern
established by the existing operators (i.e. you can't redefine << to mean
System.out.print()).

Objects would allow the clone() operation by default (which would probably be
renamed to copy(), leaving a protected clone() which was a JVM-implemented
shallow copy).  The default implementation of copy() would call a protected
postCopy() method.  The default implementation of postCopy() would be empty.
There would be a marker interface or annotation to forbid clone().  I.e.
classes would opt-out of being copyable, not be forced to opt in.

Generics would vanish.

The definition of interfaces would be changed so that a method needed to
satisfy the contract implied by the interface need be no more visible than the
interface itself.  (E.g. package-private methods would satisfy package-private
interfaces.)

There would be a means of telling the compiler: "yes I know I'm calling a
method that you don't know about, but /trust me/, it'll be there by the time
this code is executed".  Perhaps that would be allowed wherever there's an
explicit handler for NoSuchMethodException.  The same for fields.

There would be a method, java.lang.System.getPlatformVersion().

In Java, references to final fields initialised to a compile-time constant, are
replaced by the constant itself.  That's OK, but the generated classfiles would
retain a per-method reference to the field so that dependencies can be tracked.

There would be some way to define compile-time constants on the javac
command-line.  (Since it's possible to do this anyway with only a little
hacking, there seems no valid justification for not allowing it in a
disciplined form.)

There would be a kind of Classloader which understood that you can put several
JARfiles in one directory.  The application Classloader would be of this type.

The people at Sun would be immersed upside-down in a huge vat of sex-crazed
cane toads until they agreed to change their bloody awful layout conventions.

That list is by no means complete, but I've grown bored typing it in.
Probably everyone's sick of reading by now too...

   -- chris
Oliver Wong - 17 May 2006 17:14 GMT
>> If you could add anything you wanted to the java language, what would
>> it be?

[snipped some good suggestions]

> char would be 32-bit.

   Conceptually, char should not have a width or size at all. Every char
value should identify exactly one unicode character. The underlying
implementation is free to use UTF-8, UTF-16, UTF-32 or any other encoding it
likes to convert from char to bytes, but from the programmer's perspective,
you can store any unicode character into a single char (i.e. none of this
"surrogate pair" nonsense).

   However, I'm not that familiar with the inner workings of the virtual
machine, so I don't know what kind of havoc a "variable-length primitive"
might cause.
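
   The mismatch can be seen with any character outside the 16-bit range. A
small sketch, using U+1D11E (MUSICAL SYMBOL G CLEF) as the example;
Character.toChars and String.codePointCount have been in the platform since
Java 5, and the class name is illustrative.

```java
/** Illustrative demo of one character occupying two chars; names are hypothetical. */
final class SurrogateDemo {
    /** U+1D11E MUSICAL SYMBOL G CLEF, which lies outside the Basic Multilingual Plane. */
    static final String G_CLEF = new String(Character.toChars(0x1D11E));

    /** Number of 16-bit char values, which is what String.length() reports. */
    static int charCount(String s) {
        return s.length();
    }

    /** Number of Unicode code points, i.e. characters as a reader would count them. */
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }
}
```

   For G_CLEF, charCount gives 2 and codePointCount gives 1: a single
character stored as a surrogate pair.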

[snipped some good suggestions]

> Generics would vanish.

!!! I thought Java would be better with more generics, rather than less (or
none at all).

> The definition of interfaces would be changed so that a method needed to
> satisfy the contract implied by the interface need be no more visible than
> the
> interface itself.  (E.g. package-private methods would satisfy
> package-private
> interfaces.)

   I don't understand this one.

> There would be a means of telling the compiler: "yes I know I'm calling a
> method that you don't know about, but /trust me/, it'll be there by the
> time
> this code is executed".  Perhaps that would be allowed wherever there's an
> explicit handler for NoSuchMethodException.  The same for fields.

   I've never seen a need for this. Can you elaborate?

[snipped some good suggestions]

   - Oliver
Chris Uppal - 18 May 2006 12:46 GMT
[me:]
> > char would be 32-bit.
>
>     Conceptually, char should not have a width or size at all.

I don't see the need for that level of abstraction.  Unicode is limited to < 24
bits by the UTF-16 hack.  The Unicode consortium states that code points will
never be allocated outside the range representable in UTF-16.  The equivalent
ISO work has limited itself (as I understand it) to 31 bits, but since they and
the Unicode people are committed to staying in lock-step, it's hard to see why
that is more than an academic point.
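
The arithmetic behind that limit, as a small sketch: a surrogate pair
combines one of 1024 high surrogates with one of 1024 low surrogates, and
those pairs are laid out directly after the 16-bit range. The class and
method names are illustrative; Character.MAX_CODE_POINT is the platform's own
constant for the result.

```java
/** Illustrative arithmetic for the UTF-16 ceiling; names are hypothetical. */
final class CodePointRange {
    /** Highest code point reachable through a UTF-16 surrogate pair. */
    static int maxCodePoint() {
        // 0x10000 values fit in one 16-bit unit; 0x400 * 0x400 more need a pair.
        return 0x10000 + 0x400 * 0x400 - 1;
    }

    /** Number of bits needed to hold a non-negative value. */
    static int bitsNeeded(int value) {
        return 32 - Integer.numberOfLeadingZeros(value);
    }
}
```

maxCodePoint() comes to 0x10FFFF, which needs 21 bits, comfortably under 24.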

So the only reason I can think of for /not/ choosing 32-bits is that you might
suspect that now or in the future an implementation might want to restrict
itself to 24 bits per char.  I don't find that too plausible myself, but...

Of course, Strings might well use UTF-8 or UTF-16 encoded binary as their
internal representation, or maybe UTF-32 for applications needing constant-time
access.  (The requirement I perceive for flexibility in this matter is why I
would want to turn String into an abstract class).  But that's /Strings/, I
don't see a reason for /char/ to be anything other than an integer type with
known width.

> > Generics would vanish.
>
> !!! I thought Java would be better with more generics, rather than less
> (or none at all).

Java might be better off with a proper implementation of generics (as opposed
to the mess we've had dumped on us).  I don't, myself, see that there's much
to be gained by such a feature, and I don't really think that it's in the
"spirit of Java" -- so (at least for this discussion) I'd just drop them.

You may be thinking of a more C++-like feature which provides compile-time
metaprogramming.  I'd certainly agree that a language which supports
metaprogramming is much to be preferred over one that does not (unless, of
course, the "metaprogramming" is something as gross as C++ templates).  But
that wouldn't fit with my self-imposed restriction:

> Without trying to change Java into a better /kind/ of language, [...]

> > The definition of interfaces would be changed so that a method needed to
> > satisfy the contract implied by the interface need be no more visible
[quoted text clipped - 4 lines]
>
>     I don't understand this one.

E.g. I have a package which uses internal interfaces to give order to the
structure of the private code.  I want one of the public classes in that
package to implement one or more of those internal interfaces.  I am forced to
make the relevant methods public.  I.e. I have to /publish/ them, and thus
commit to keeping them unchanged in future developments.  Bad.  An interface is
a /promise/, but you have to ask who the promise is made to.  In this case I
want to be able to use promises internally but am impeded because the language
definition assumes that the only promises I will ever want to make are to
client code.
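
A minimal sketch of the situation, with made-up names (VisibilityDemo,
Refreshable, Widget). The interface is package-private, yet the implementing
method must still be declared public, so it leaks into the published API of
the public class.

```java
import java.lang.reflect.Method;

/** Illustrative only; every name in this sketch is hypothetical. */
final class VisibilityDemo {
    /** Internal contract: package-private, so client code cannot even name it. */
    interface Refreshable {
        void refresh();   // implicitly public, like every abstract interface method
    }

    /** Stands in for a published class that wants to honour the internal contract. */
    public static class Widget implements Refreshable {
        // Dropping 'public' here is a compile error (weaker access than the interface method).
        public void refresh() {
            // internal housekeeping that clients were never meant to call
        }
    }

    /** True if refresh() is visible among Widget's public methods. */
    static boolean widgetExposesRefresh() {
        for (Method m : Widget.class.getMethods()) {   // getMethods() lists public methods only
            if (m.getName().equals("refresh")) {
                return true;
            }
        }
        return false;
    }
}
```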

> > There would be a means of telling the compiler: "yes I know I'm calling
> > a method that you don't know about, but /trust me/, it'll be there by
[quoted text clipped - 3 lines]
>
>     I've never seen a need for this. Can you elaborate?

Maybe a simple example would help:

   ...
   void
   aMethod()
   {
       double start = now();
       someLongishOperation();
       double end = now();
       System.out.printf("It took %f seconds%n", end - start);
   }

   /**
    * return the time in seconds since an arbitrary (but fixed) start-time.
    * Resolution is dependent on the version of the Java platform
    */
   private double
   now()
   {
       try
       {
           // getPlatformVersion() is the hypothetical method proposed earlier
           if (System.getPlatformVersion() >= 5)
               return System.nanoTime() / 1.0e9;
       }
       catch (NoSuchMethodException e)
       {
           // log it, or something
       }
       return System.currentTimeMillis() / 1.0e3;
   }

That, or something like it, should compile on pre 1.5 platforms, but it won't
because the compiler is too damned fond of early-binding.  Note that it is the
/compiler/ that's doing this, the equivalent bytecode would run just fine
(legal /and/ safe) on any JVM.
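
In the meantime the usual workaround is reflection, which defers the lookup
to run time at the cost of readability. A minimal sketch, assuming only
Class.getMethod and Method.invoke; the ReflectiveClock name is illustrative,
and the varargs call style shown needs a 1.5 or later compiler.

```java
import java.lang.reflect.Method;

/** Illustrative late-bound timer; the class name is hypothetical. */
final class ReflectiveClock {
    /** Seconds since an arbitrary fixed origin; uses System.nanoTime() when the platform has it. */
    static double now() {
        try {
            // Looked up by name at run time, so nothing here binds to nanoTime at compile time.
            Method nanoTime = System.class.getMethod("nanoTime");
            long nanos = ((Long) nanoTime.invoke(null)).longValue();
            return nanos / 1.0e9;
        } catch (Exception e) {
            // NoSuchMethodException on an old platform, or a reflection failure: fall back.
        }
        return System.currentTimeMillis() / 1.0e3;
    }
}
```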

   -- chris
Oliver Wong - 18 May 2006 15:54 GMT
> [me:]
>> > char would be 32-bit.
[quoted text clipped - 29 lines]
> with
> known width.

   I like the flexibility of adding new characters. If we define a size on
char, then we either have a finite number of characters we can define, or we
have something like surrogate pairs (or triplets, or quadruplets, etc.)
where you don't have a 1 to 1 correspondence between the "concept of a
character" and the "char data type in Java".

   Potential sources for new characters (in approximate order of
probability):

   * More domain-specific characters. E.g. musical notation for percussive
instruments, symbols for obscure operators in math, physics, etc.
   * Integrating more popular, though "fictional" character sets, into
Unicode e.g. Klingon.
   * Invention of a new language like Esperanto.
   * New discovery by archeologists of ancient writing systems.
   * Contact with alien civilization which use a different character set.

[snipped more explanations from Chris where I asked for them]

   Ah, makes sense. Thanks.

   - Oliver
Roedy Green - 18 May 2006 19:45 GMT
>    * More domain-specific characters. E.g. musical notation for percussive
>instruments, symbols for obscure operators in math, physics, etc.

One of the "character" sets I saw on a prototype IBM colour terminal
was "geography".  They had a character shaped like the tip of the boot
of Italy.   You could put these together with solid blobs to make up a
map that took far fewer bits than a full bit map.  Memory was very
expensive back then and bandwidth was typically 9600 baud max with
many terminals sharing the "high speed" line.

Other possible places for character expansion:

1. airport symbol language expanding to a full international language
to be used on signs and emergency instructions.

2. ASL symbols for the deaf, showing gestures in symbolic form.

3. symbols for choreography.

4. weather symbols (might be in there already. I did not notice them).

5. ligatures and fancy forms needed for precise typesetting even if
they are inserted by rule.

6. Symbols for the visually impaired. Alphabets and symbols easy to
discriminate.

7. Symbols to record the oral-only niche languages rapidly
disappearing.

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Kent Paul Dolan - 19 May 2006 11:12 GMT
> Other possible places for character expansion:

> 1. airport symbol language expanding to a full
> international language to be used on signs and
> emergency instructions.

Makes sense.

> 2. ASL symbols for the deaf, showing gestures in symbolic form.

This won't work in general. ASL symbols are done
moving in space, sometimes changing handshapes while
the hands move. Video recordings work better. There
_was_, by the way, a very comprehensive, very arcane
written notation for ASL created well over a decade
ago, but it never caught on with either the ASL
community, or the research community studying
them.

> 3. symbols for choreography.

Such already exist (Labanotation, for one), but
IIUC, they are pretty rich symbol sets, subject to
idiosyncratic extensions by each choreographer, and
might not map to "alphabets" well.

> 4. weather symbols (might be in there already. I
> did not notice them).

That would be mostly doable, but weather symbols
tend to be laid out on a two dimensional surface,
not used in typesetting, so the utility of such an
"alphabet" would be limited. Also, some weather
symbols, like storm front "curves with triangular
teeth", are extended graphical objects, with no base
point from which to draw them with an alphabet.

> 5. ligatures and fancy forms needed for precise
> typesetting even if they are inserted by rule.

I've seen at least some of those in there, but
others, that I expected to see, are instead done by
overstriking, and could usefully have unique
representations, which would usually be more
accurate, instead. Notice that ligatures of the
"ffl" type are really font choices, not alphabet
choices, and so perhaps not suitable for Unicode
codes, since "ffl" in one font might be a ligature,
while in another it would not. Similarly, an
ellipsis is a single character or three characters,
depending on font support, so it is a kind of
"ligature" too (and, is in Unicode already). I don't
think a consistent treatment here is possible, which
will give standards committees great sway to do
mischief if they attempt the deed anyway.

> 6. Symbols for the visually impaired. Alphabets
> and symbols easy to discriminate.

Mostly this is just accomplished by use of large
type, since the symbols have to be comprehensible to
the population with which they interact. Also,
notice that Unicode _doesn't_ include fonts or font
styles, just alphabet generic glyph identifiers and
ideograph generic glyph identifiers. Thus, some
"visually impaired" equivalent of the optical
character recognition fonts (which were for scanners
of the day that were "visually impaired" compared to
today's) wouldn't need codespace in the Unicode
standard, they'd just be other fonts or font styles,
with glyphs identified by Unicode "codes" for the
generic glyphs of which they are instances.

> 7. Symbols to record the oral-only niche languages
> rapidly disappearing.

A nice idea, but they're going away far too fast for
saving. The world has, IIRC, some 3,000 languages,
very few of which have long term viability, and many
of which are reduced to a handful of proficient
speakers today. There simply aren't enough linguists
and linguistically adept missionaries to save most
of them.

And, in the rush to save what can be saved, use of
the International Phonetic Alphabet (perhaps
extended) as the base script would sure be a lot
smarter than inventing a whole new alphabet per
language.

FWIW

xanthian.
Roedy Green - 19 May 2006 20:48 GMT
>. Also,
>notice that Unicode _doesn't_ include fonts or font
>styles, just alphabet generic glyph identifiers and
>ideograph generic glyph identifiers

That is the theory, but in practice you will find multiple symbols all
looking suspiciously like an A.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Dale King - 19 May 2006 15:04 GMT
>>    * More domain-specific characters. E.g. musical notation for percussive
>> instruments, symbols for obscure operators in math, physics, etc.

If you see ones that are missing here you should point them out to the
Unicode Consortium. They already have a fairly complete set with many
obscure symbols.

> One of the "character" sets I saw on a prototype IBM colour terminal
> was "geography".  They had a character shaped like the tip of the boot
> of Italy.   You could put these together with solid blobs to make up a
> map that took far fewer bits than a full bit map.  Memory was very
> expensive back then and bandwidth was typically 9600 baud max with
> many terminals sharing the "high speed" line.

That's not appropriate for Unicode. If you wanted something like that
for a specific project there are always private use areas of Unicode
that you can use for your own private use.
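
A quick way to check whether a code point falls in one of those areas, as a
sketch; Character.getType and Character.PRIVATE_USE are standard, and the
class name is illustrative.

```java
/** Illustrative check for private-use code points; the class name is hypothetical. */
final class PrivateUse {
    /** True if the code point is in a private-use area (general category Co). */
    static boolean isPrivateUse(int codePoint) {
        return Character.getType(codePoint) == Character.PRIVATE_USE;
    }
}
```

For example U+E000 and U+F8FF (the ends of the 16-bit private-use area) and
U+100000 (in the supplementary private-use area) all qualify, while 'A' does
not.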

> Other possible places for character expansion:
>
> 1. airport symbol language expanding to a full international language
> to be used on signs and emergency instructions.

I can't imagine any symbols here appropriate for Unicode in general.
Examples?

> 2. ASL symbols for the deaf, showing gestures in symbolic form.
>
> 3. symbols for choreography.

Not appropriate as these are motions, not symbols, but if there are
symbols that are commonly used they should be proposed.

> 4. weather symbols (might be in there already. I did not notice them).

There are a few in this code page:

http://www.unicode.org/charts/PDF/U2600.pdf

> 5. ligatures and fancy forms needed for precise typesetting even if
> they are inserted by rule.

Many of these exist for common ones.

> 6. Symbols for the visually impaired. Alphabets and symbols easy to
> discriminate.

Definitely not appropriate for Unicode. This is a presentation/font issue.

> 7. Symbols to record the oral-only niche languages rapidly
> disappearing.

Isn't a symbolic alphabet for an oral-only language an oxymoron? ;-)

If the language doesn't currently have an alphabet and one is being
assigned, it would make a lot more sense to use existing alphabets than
creating brand new ones.

Signature

 Dale King

Oliver Wong - 19 May 2006 15:46 GMT
>>>    * More domain-specific characters. E.g. musical notation for
>>> percussive instruments, symbols for obscure operators in math, physics,
[quoted text clipped - 3 lines]
> Unicode Consortium. They already have a fairly complete set with many
> obscure symbols.

   Well, I took a look at http://www.unicode.org/charts/PDF/U1D100.pdf 
("Western Musical Symbols"), and they don't seem to have a notation for
indicating that the drummer should ease off the hihat pedal for the next few
notes, and then dampen the sound by applying pressure again. The notation
looks something like:

<asciiArt>
|-- O ---->                   -- (o) -|
|                                     |
</asciiArt>

   And is drawn above the staff of five lines where the notes are usually
drawn. There are others missing as well (e.g. repeat the following section,
but apply this ending the first time, and that ending the second time;
repeat the previous four measures; the following 3 notes should be played in
the time of 2 notes; apply the wah-wah pedal when playing guitars; let the
strings of the guitar ring openly; etc.) I haven't suggested this to the
consortium because:

   (1) I didn't realize you could (but I've since discovered
http://www.unicode.org/pending/proposals.html)
   (2) I don't know the terminology or official names for these musical
symbols, being only an amateur musician. I figure there must be someone else
out there more qualified to make these submissions than me, but perhaps the
intersection of the set of all musicians and the set of all people who care
about Unicode is rather small.

>> Other possible places for character expansion:
>>
[quoted text clipped - 3 lines]
> I can't imagine any symbols here appropriate for Unicode in general.
> Examples?

   Well, they have some symbols which are, AFAIK, internationally
recognized. In http://www.unicode.org/charts/PDF/U2600.pdf there's the
recycling symbol, the biohazard symbol, and the poison symbol. Perhaps you
could have internationally recognized road signs as well (yield, stop, left
lane merge, etc.)

   - Oliver
Oliver Wong - 19 May 2006 20:26 GMT
>> 2. ASL symbols for the deaf, showing gestures in symbolic form.
>>
>> 3. symbols for choreography.
>
> Not appropriate as these are motions, not symbols, but if there are
> symbols that are commonly used they should be proposed.

   Kent Paul Dolan brought up similar arguments.

   As I was replying to a post by Bent C Dalager, it occurred to me that
Unicode does not concern itself with the representation of characters at
all, so it is perfectly feasible that Unicode could support "animated"
glyphs. My post is at
http://groups.google.ca/group/comp.lang.java.programmer/msg/cd269f6cfc8392fc

   There are actually quite a few "unprintable" characters in Unicode, so
it wouldn't be a novelty to have characters that could not actually be
displayed in a traditional text editor. In fact, the codecharts have an
entire section called "Invisible Operators"
(http://www.unicode.org/charts/PDF/U2000.pdf, though in actuality, some of
the characters defined there are indeed "visible").

   \u2029, for example, "Paragraph Separator" is invisible, and is merely
to control the flow of text. It cannot, in itself, be displayed in any form.
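
   Java already exposes that classification, so the claim is easy to check.
A small sketch; Character.getType and Character.PARAGRAPH_SEPARATOR are
standard, while the class and method names are illustrative.

```java
/** Illustrative check for the invisible paragraph separator; names are hypothetical. */
final class InvisibleChars {
    /** True if the code point has general category Zp (paragraph separator). */
    static boolean isParagraphSeparator(int codePoint) {
        return Character.getType(codePoint) == Character.PARAGRAPH_SEPARATOR;
    }
}
```

   U+2029 is the only code point for which this returns true in practice;
its neighbour U+2028 is classified as a line separator instead.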

   So it doesn't seem unreasonable to have some unicode character "\ufoo"
which represents an ASL gesture, which cannot be represented via some static
glyph.

   - Oliver
Kent Paul Dolan - 21 May 2006 11:50 GMT
>>> 2. ASL symbols for the deaf, showing gestures in
>>> symbolic form.

>>> 3. symbols for choreography.

>> Not appropriate as these are motions, not
>> symbols, but if there are symbols that are
>> commonly used they should be proposed.

> Kent Paul Dolan brought up similar arguments.

> As I was replying to a post by Bent C Dalager, it
> occured to me that Unicode does not concern itself
[quoted text clipped - 6 lines]
> to have characters that could not actually be
> displayed in a traditional text editor.

> So it doesn't seem unreasonable to have some
> unicode character "\ufoo" which represents a ASL
> gesture, which cannot be represented via some
> static glyph.

That might work for the various choreography
annotations, though the "alphabets" would be rather
huge unless done, as are many ligatures, as a set of
overstrikes of simpler motions, body part by body
part, all "printed" at the same location, or strung
out as a "word of motion".

But ASL "words/gestures" have context dependent
meanings, among other difficulties in capturing them
in brief encodings.

Their meanings sometimes also depend profoundly, not
just casually, on accompanying facial expressions.

Too, a typical ASL paragraph consists of putting
actors/items at various places in space in front
of the "speaker", then play-acting interactions
among those locations and their contents.

In many senses, ASL is a much richer language than
most spoken/written languages. I'm not an expert ASL
speaker, but I get by in simple conversations, and I
just don't see a way short of full video recording
to convey an ASL conversation with the usual ASL
conventions.

That recording would need to be stereo video
recording, too, to give depth perception. ASL's
"location in space meaning" is dependent on four
dimensions. X, Y, Z, and speed of execution of the
gesture are all modifiers to that gesture's meaning.

Even how broadly the gesture is made modifies its
meaning, as may all of its starting point, its
ending point, and its curved path through space.

Surely attempting to convey such a conversation in a
written format (and to me, even though it is a
system that includes invisible codes, Unicode is a
system targeted at written languages) that would let
the reader reconstruct the entire ASL gesture
sequence or even understand its meaning from its ASL
form, would be an incredibly painful exercise.

Moreover, since a written form of ASL _has_ been
created, and failed to be adopted, I'm guessing that
there would be no particular call for a "Unicode
version of ASL" in any case.

ASLers who want to convey their ideas in written
form, at least those literate enough to be capable
of any kind of reading/writing, normally write them
in English. [ASL is specifically _American_ sign
language, and depends on the user's knowledge of
American English to convey, by spelling them out,
ideas for which no accepted sign currently exists.]

Note also that there are several formal sign
languages besides ASL (i.e., Mexican sign language,
British sign language) usually quite incompatible
among themselves, so this problem would need solving
many times, and chew up many chunks of Unicode
code-space. I cannot conceive of any value in making
the attempt to encode ASL in a code space of
Unicode, which doesn't deny that someone may make
the attempt anyway.

FWIW

xanthian.
Roedy Green - 21 May 2006 21:30 GMT
>In many senses, ASL is a much richer language than
>most spoken/written languages.

In ASL, you have the analog ability to emphasise with the grandness of
gesture and exaggeration of the facial expressions.  You would not
need to encode that in an ASL symbolic dictionary.  Humans are quite
capable of supplying that on their own.

If you look at an ASL dictionary it has stylised pictures with little
arrows to indicate motion. I could imagine someone inventing a
notation that could be read directly or used to generate those images
or "Reboot" style 3D animations, much as Chinese ideograms can be
created from combining radical symbols.

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Roedy Green - 20 May 2006 00:01 GMT
>> 5. ligatures and fancy forms needed for precise typesetting even if
>> they are inserted by rule.
>
>Many of these exist for common ones.

I read somewhere they decided not to add any more ligatures.  However,
in typesetting you still need a code for them, so I suspect eventually
they will be given Unicode slots.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Patricia Shanahan - 19 May 2006 15:25 GMT
...
> 7. Symbols to record the oral-only niche languages rapidly
> disappearing.

Why not use the International Phonetic Alphabet, which is already
represented in Unicode?

Patricia
Roedy Green - 19 May 2006 20:54 GMT
>Why not use the International Phonetic Alphabet, which is already
>represented in Unicode?

The original question was what might people do to  Unicode to expand
it, not what SHOULD they do.

There was a phonetic alphabet designed for native Canadian languages.
It is very pretty, but it is almost as bad as Hebrew for having
very similar letters you would have to look carefully at to
discriminate.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Chris Uppal - 19 May 2006 09:49 GMT
>     I like the flexibility of adding new characters.

I presume you mean that you like the flexibility the ISO and Unicode consortium
have to add new characters, rather than you would like to be free to define
your own (if you do mean the latter then there is always the private-use area to
play in).

> If we define a size
> on char, then we either have a finite number of character we can define,
> or we have something like surrogate pairs (or triplets, or quadruplets,
> etc.) where you don't have a 1 to 1 correspondence between the "concept
> of a character", the that "char data type in Java".

Me, I prefer to be able to manipulate characters as integers.  Which requires
(for sanity) knowing how wide the integer is.  Unicode isn't going to add
characters which don't fit into UTF-16, so there's a definite limit to how wide
the integer needs to be.  Even if they /did/ scrap UTF-16 (hardly likely when
it would break Windows, .NET, /and/ Java ;-) there is still an unimaginably huge
amount of space available in the 31-bits that ISO limits itself to.  It would
need several thousand "alphabets" the size of the unified HAN stuff to exhaust
that (and where are those writing systems lurking ?).

>     Potential sources for new characters (in approximate order of
> probability):
>
>     * More domain-specific characters. E.g. musical notation for
> percussive instruments, symbols for obscure operators in math, physics,

/Plenty/ of space is already available for that.

>     etc. * Integrating more popular, though "fictional" character sets,
> into  Unicode e.g. Klingon.

Ugh!  Bloody sci-fi soap opera.  (I /do/ like SF, I just don't like Star
Trek -- in any of its manifestations).  IMO, adding that kind of thing (say
Tolkien's scripts) to Unicode would be a pathetic abuse of power.

And there's plenty of space anyway.

>     * Invention of a new language like Esperanto.

But would any sane new language use a writing system like Chinese ?  And, if it
did, why would anyone want to take it seriously enough to add it to Unicode.
Let's say I design a language which, by definition, uses /all/ the Unicode
glyphs, in pairs, to denote a fixed but large set of words.  That /can't/ fit
into any possible Unicode-like scheme since it has been deliberately designed
to break any finite scheme.  So why should the scheme be extended to support
it ?

>     * New discovery by archeologists of ancient writing systems.

Certainly possible, and I would even call it probable.  But why should that
require more space than is already available ?

>     * Contact with alien civilization which use a different character set.

Since Unicode is designed around /human/ writing schemes, reflecting /human/
perceptual processes and /human/ cultural history(ies), I don't think it would
be legitimate (and almost certainly impossible) to use Unicode to represent
another species' communication systems.  Far better to adopt /their/ version of
Unicode for representing their communications.

And anyway, I sort of doubt whether there are any alien civilisations -- the
universe is far too big.

   -- chris
Oliver Wong - 19 May 2006 15:59 GMT
>>     I like the flexibility of adding new characters.
>
[quoted text clipped - 5 lines]
> to
> play in).

   Yes, I meant that I like the idea that the consortium can add letters as
needed.

>> If we define a size
>> on char, then we either have a finite number of character we can define,
[quoted text clipped - 16 lines]
> exhaust
> that (and where are those writing systems lurking ?).

   I don't think it should make sense to manipulate characters as integers,
just like it doesn't make sense to manipulate Strings which coincidentally
have length 1 as integers.

   If you're doing some sort of ASCII manipulation stuff, then you're not
actually dealing with the characters themselves, but the byte-encoding of
those characters in the ASCII encoding system, for example. So you'd take
your characters, convert them to integers or bytes or whatever using an
ASCII encoder, and then manipulate those integers, then convert them back to
characters using an ASCII decoder, for example.
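
A minimal Java sketch of that encode / manipulate / decode round trip. The class and method names are invented for illustration, and StandardCharsets assumes Java 7 or later:

```java
import java.nio.charset.StandardCharsets;

public class AsciiRoundTrip {
    // Upper-cases a-z by doing arithmetic on the ASCII bytes, not on chars.
    public static String upperViaAsciiBytes(String text) {
        // "ASCII encoder": characters -> bytes
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] >= 'a' && bytes[i] <= 'z') {
                bytes[i] -= 32; // 'a' is 97 and 'A' is 65 in ASCII
            }
        }
        // "ASCII decoder": bytes -> characters
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(upperViaAsciiBytes("hello, world")); // HELLO, WORLD
    }
}
```

Note that getBytes replaces any character ASCII cannot represent with '?', so this only round-trips cleanly for pure ASCII input.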

   At any rate, I don't think we should impose an upper limit on the number
of useful symbols or characters that we allow ourselves to define. It
reminds me of that "Nobody needs more than 640KB of RAM" (mis-)quote.

>>     Potential sources for new characters (in approximate order of
>> probability):
[snip]
>>     * Contact with alien civilization which use a different character
>> set.
[quoted text clipped - 8 lines]
> version of
> Unicode for representing their communications.

   Is this "humans-only" requirement actually documented anywhere? I mean,
if we found out that, for example, spiders encoded some communicative
information within the patterns of their webs and we managed to decode it,
would it be "against policy" to add symbols from this spider-language to
Unicode? Or would we say "well, now since we, as humans, have decoded it, it
becomes a human writing scheme, and so is apt to be used in Unicode"?

   I don't know what assumptions Unicode makes, but it seems to me that if
it's possible to add characters to it to support alien languages, it
certainly would be worthwhile to do so upon encountering those languages.

   I guess Unicode assumes that there exists a definite ordering of the
character streams (e.g. right to left, top to bottom). Or maybe it's not
Unicode which makes that assumption, but rather our Strings which do so. If
an alien civilization's natural language resembled Befunge, I'm not sure
how well our concepts of strings could cope, though we could certainly add
each symbol within that language to Unicode.

   - Oliver
Bent C Dalager - 19 May 2006 17:24 GMT
>    Is this "humans-only" requirement actually documented anywhere?

It is probably an emergent property of the system.

>I mean,
>if we found out that, for example, spiders encoded some communicative
>information within the patterns of their webs and we managed to decode it,
>would it be "against policy" to add symbols from this spider-language to
>Unicode? Or would we say "well, now since we, as humans, have decoded it, it
>becomes a human writing scheme, and so is apt to be used in Unicode"?

You would get into trouble if it turns out that the exact stickiness
(however stickiness is measured) of the strands involved in the symbol
are vital to the correct interpretation of the message.

How do you represent stickiness in Unicode?

Or perhaps pheromones add vital information to the picture.

>    I don't know what assumptions Unicode makes, but it seems to me that if
>it's possible to add characters to it to support alien languages, it
>certainly would be worthwhile to do so upon encountering those languages.

Our ability to do so presumably depends upon the aliens having the
exact same concept as we do as to what a glyph is. The simplest
variation (depending on 3D glyphs perhaps, or animated ones)* is
likely to throw off Unicode.

* - For all I know, Unicode may support this, but you get the idea.

Cheers
    Bent D

Bent Dalager - bcd@pvv.org - http://www.pvv.org/~bcd
                                   powered by emacs

Oliver Wong - 19 May 2006 20:18 GMT
>>if we found out that, for example, spiders encoded some communicative
>>information within the patterns of their webs and we managed to decode it,
[quoted text clipped - 10 lines]
>
> Or perhaps pheromones add vital information to the picture.

   I don't think Unicode says anything about the representation of such
characters. There's a platonic ideal representing the concept of the first
character in the lowercase alphabet, 'a'. Unicode assigns a number to that
character (\u0061), but it doesn't say anything about what that character
looks like, or how it should be drawn. There's another character, \u0430,
which is visually indistinguishable from \u0061 in all fonts I've seen, and
yet it's "obviously" a different Unicode character by virtue of having a
different number.
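
The two look-alike code points can be told apart programmatically. A small Java sketch, assuming Java 7 or later for Character.getName and Character.UnicodeScript; the class and method names are made up for the example:

```java
public class LookAlikes {
    // Formats a code point as "U+XXXX NAME (SCRIPT)".
    public static String describe(int codePoint) {
        return String.format("U+%04X %s (%s)",
                codePoint,
                Character.getName(codePoint),
                Character.UnicodeScript.of(codePoint));
    }

    public static void main(String[] args) {
        int latinA = 0x0061;
        int cyrillicA = 0x0430;
        System.out.println(describe(latinA));    // U+0061 LATIN SMALL LETTER A (LATIN)
        System.out.println(describe(cyrillicA)); // U+0430 CYRILLIC SMALL LETTER A (CYRILLIC)
        System.out.println(latinA == cyrillicA); // false
    }
}
```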

   So this spider-character-set would have different code points for each
character. It's up to the font designers to worry about how to represent
stickiness or pheromones in their fonts (if they chose to do so at all).

   Note that Unicode text isn't necessarily displayed visually; it could
be displayed via speech readers, or braille devices. It's almost a natural
fit to represent stickiness via any tactile output device like braille.
Similar devices could be constructed to accurately represent pheromones
olfactorily.

>>    I don't know what assumptions Unicode makes, but it seems to me that
>> if
[quoted text clipped - 7 lines]
>
> * - For all I know, Unicode may support this, but you get the idea.

   Again, I believe unicode isn't interested in the actual representation
of these characters.

   - Oliver
Roedy Green - 20 May 2006 00:01 GMT
>You would get into trouble if it turns out that the exact stickiness
>(however stickiness is measured) of the strands involved in the symbol
>are vital to the correct interpretation of the message.

More likely is decoding something of squid language which entails
rapid subtle colour changes.  Presumably they compute something with
their outsized brains.


Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Chris Uppal - 21 May 2006 12:24 GMT
>     I don't think it should make sense to manipulate characters as
> integers, just like it doesn't make sense to manipulate Strings which
> coincidentally have length 1 as integers.

At one level I agree with you; there's something unnatural about conflating
characters and integers.  In fact Smalltalk works exactly how you suggest, and
my own Unicode implementation for Smalltalk (under construction) works that way
too, so I have a fair bit of experience using a system which separates the two
concepts.

But that's only half the story.  You also need to be able to do a significant
subset of arithmetical operations on character values (indexing into arrays
for instance), and such operations often turn up in places where constantly
casting back-and-forth between integer code points and actual characters would
be painful and/or inefficient.  Java doesn't really support the idea of
"hybrid" values -- half arithmetical, half not -- so, barring major changes to
the language, I'd stick with the current scheme, but make "char" wider.
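
As an illustration of the kind of arithmetic meant here, a short Java sketch that uses a char directly as an array index. The histogram example and its names are assumptions, not something from the thread:

```java
public class LetterHistogram {
    // Counts occurrences of 'a'..'z'; the char is used as an integer offset.
    public static int[] countLetters(String text) {
        int[] counts = new int[26];
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++; // char arithmetic, no explicit cast needed
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countLetters("banana");
        System.out.println(counts['a' - 'a']); // 3
        System.out.println(counts['n' - 'a']); // 2
    }
}
```

If char and int were strictly separate types, every index expression above would need a conversion call.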

It's perhaps worth emphasising that, in Unicode, a character has very little
meaning by itself -- it is, in general, not possible to do anything very useful
with a character which isn't an element of a stream or string.  Pretty-much the
only things you can legitimately do with a char are compare it with another
char or use it as a lookup index into Unicode character property tables.   A
character is /not/ like a short string -- it's a different class of entity
entirely.
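
In Java the property-table lookup is exposed through java.lang.Character. A rough sketch, with an invented helper that maps only a handful of general categories to their Unicode abbreviations:

```java
public class CharProperties {
    // Returns the Unicode general-category abbreviation for a few categories.
    public static String category(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:     return "Lu";
            case Character.LOWERCASE_LETTER:     return "Ll";
            case Character.DECIMAL_DIGIT_NUMBER: return "Nd";
            case Character.MATH_SYMBOL:          return "Sm";
            default:                             return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(category('A'));               // Lu
        System.out.println(category(0x0430));            // Ll (Cyrillic small a)
        System.out.println(category(0x0663));            // Nd (Arabic-Indic digit three)
        System.out.println(Character.digit(0x0663, 10)); // 3
    }
}
```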

Tell you what.  How about, since we're redefining Java anyway, we rename "char"
to "codepoint" ?  It would be more accurate...
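
To make the char versus code point distinction concrete: in current Java a char is a 16-bit UTF-16 code unit, so one code point outside that range occupies two chars. A minimal sketch with made-up class and method names:

```java
public class CodeUnitsVsCodePoints {
    // Number of 16-bit chars (UTF-16 code units) in the string.
    public static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1D11E MUSICAL SYMBOL G CLEF lies outside the 16-bit range.
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(codeUnits(clef));  // 2
        System.out.println(codePoints(clef)); // 1
        System.out.println(Integer.toHexString(clef.codePointAt(0))); // 1d11e
    }
}
```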

>     Is this "humans-only" requirement actually documented anywhere?

Not that I know of, although it wouldn't surprise me to find the human-centric
design principles discussed somewhere.  Unicode includes rather a lot of
thoughtful and interesting meta-discussion in its documentation (if not in the
standard itself).

The way that Unicode works is extremely practical and /not/ universal (see
below).  It introduces features only if they are used in some target
orthography.  Thus it has ligatures, since they are essential in many systems
of writing.  It also attempts to make round-tripping from other charsets, into
Unicode, and back possible (no information lost), and so has a very limited
number of Latin ligatures (and that's the /only/ reason it has Latin
ligatures).  No writing system uses colour to denote meaning (that I know of)
and so Unicode doesn't touch colour.  The result of this YAGNI-like focus on
features that are actually needed, is that Unicode inevitably reflects the
human processes which create written languages, and which determine their
logical structure.  One huge example is that human vision uses edge-detection
heavily.  As a result Unicode glyphs are /shapes/ -- shapes which can be
rendered as black-on-white.
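
The special status of the Latin ligatures shows up in normalization: Unicode treats them as compatibility characters. A small Java sketch using java.text.Normalizer (the wrapper class is hypothetical); it only demonstrates the compatibility mapping, not the historical reason for it:

```java
import java.text.Normalizer;

public class LigatureFolding {
    // Canonical composition: leaves compatibility characters alone.
    public static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Compatibility composition: replaces them with their plain equivalents.
    public static String nfkc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String ligature = String.valueOf((char) 0xFB01); // LATIN SMALL LIGATURE FI
        System.out.println(nfc(ligature).equals(ligature)); // true
        System.out.println(nfkc(ligature));                 // fi
    }
}
```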

BTW, don't get misled by the odd few Unicode code points which are assigned to
non-visual purposes -- the BOM being a good example, or the directionality
markers.  There are damned few of those, and for the most part they only exist
in order to allow round-tripping or the use of Unicode in a context where
insufficient meta-information is available, and their use is discouraged in
other contexts.  Unicode is /about/ shapes.
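
For what it's worth, those non-visual code points can be picked out by general category. A hedged Java sketch (helper name invented) that checks for the "format" category, Cf:

```java
public class FormatCharacters {
    // True if the code point is in Unicode general category Cf (format).
    public static boolean isFormat(int codePoint) {
        return Character.getType(codePoint) == Character.FORMAT;
    }

    public static void main(String[] args) {
        System.out.println(isFormat(0xFEFF)); // true: the BOM
        System.out.println(isFormat(0x200F)); // true: RIGHT-TO-LEFT MARK
        System.out.println(isFormat('a'));    // false
    }
}
```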

It's worth considering how much Unicode /doesn't/ have which it might be
expected to include if the focus weren't so limited.  For instance it has no
way of expressing /semantic/ qualifiers on text such as italics (or, more
abstractly, emphasis).  It has no means of rendering prosody beyond the limited
expression implied by existing punctuation schemes[*].  Yet if the
text-to-speech example could be taken as a core use for Unicode -- i.e. as a
true alternative rendering of Unicode, on an equal footing with printing text
on paper -- then such annotations would seem to be highly desirable, perhaps
even necessary.

([*] Another aside: apparently English punctuation started out -- with the
Greeks, naturally -- purely as a way of expressing prosody, but at around the
time fully modern English emerged, the punctuation system had its own
mini-revolution: new marks were invented, old marks were reinterpreted or
discarded, and the role of punctuation shifted away from expressing prosody to
expressing grammar and other semantic features of text.)

> I
> mean, if we found out that, for example, spiders encoded some
[quoted text clipped - 3 lines]
> humans, have decoded it, it becomes a human writing scheme, and so is apt
> to be used in Unicode"?

I don't think it's a policy thing at all.  If this situation were ever to
arise, then I think one of two things would happen.  Either we humans (not
being able to "see" the patterns properly since we lack the necessary brain
circuitry) would develop an independent glyph-system for representing the
patterns (and whatever other features were needed). In that case the new glyph
system might get added to Unicode if enough humans wanted to represent
Spiderese texts in their discussions with other humans.  Note that the spiders
themselves would probably not be able to "see" our human glyphs any more than
we could see theirs, so this system would be solely for human use.  This is
roughly what has happened for musical notation[**].  Alternatively it might turn
out that human/spider brains were similar enough that we could read their
patterns directly (I have to say that I find this almost impossible to
imagine); in that case it would come down to the practicalities.  Does written
Spiderese break down into a glyph system similar enough to the existing human
ones for it to be expressed in the Unicode framework ?  I find this even harder
to imagine, but if it /did/ turn out that way then I see no reason for
spider-glyphs not to be added to Unicode.  To me (presupposing the existence of
other intelligences at all) it seems much more likely that their communications
wouldn't have a modality which was anywhere near close enough to human writing
to fit into Unicode.  Spiders, for instance, might be much more likely to use
moving patterns of standing waves in their webs (vibrations /matter/ to
spiders).  Almost any species might naturally record meaning as structures in a
very-high dimensional space -- smell is far more universal on Earth than
vision.

([**] BTW, it seems to me that musical notation is in Unicode because people
want to write /about/ music, not in order to /express/ music per se.)

Your (snipped) point about Unicode assuming sequence is well-taken.  Some human
written languages don't make much use of sequence.  I can't remember which
off-hand, but some of the old South American languages just bung a number of
symbols/pictures together into a cartoon-like frame, and leave it to the reader
to work out which express a meaning and which qualifies what.  It's an
interesting system since it allows a lot of freedom for the writer to be
creative with the pictures and layout.  I don't know how such systems would be
mapped into Unicode.  It'd be possible, I suppose, to write the symbols down in
an arbitrary, or conventional, order, but I don't know if that would be any use
for scholars, who might want to preserve the spatial layout.  If not then
they'd probably be better off using JPEGs instead of Unicode text.

I /think/ I may have worked out where we're seeing Unicode differently.
There's a parallel with dictionaries, which come in two broad flavours.  There
are the dictionaries which attempt to /record/ what the (written or not)
language is like at a given time and place (or over a range of such).  The OED
is the incomparable exemplar of this school of thought.  And then there are the
/prescriptive/ dictionaries -- ones which attempt to tell readers what the
"correct" meaning and spelling of a word is.   In the dictionary world the
prescriptive idea has long gone out of fashion[***], and prescriptive
dictionaries are only used for teaching purposes.  So, if people start --
say -- confusing "convince" and "persuade", the dictionaries will simply
reflect that in their next edition, whereas a school dictionary will attempt to
dictate that the two words have separate meanings (with a small amount of
overlap).

The parallel here is that I think you are seeing Unicode as non-prescriptive in
that sense, whereas I see it as essentially prescriptive.  Its purpose -- as I
see it --  is not to /record/ the diversity of the world's scripts, but to
/standardise/ their computerised representation.  The motive is purely
practical, with no scholarly side to it at all. (Although considerable
scholarship goes into creating it, and it is intended to be used /by/
scholars.)   The purpose is only to allow people to share written texts across
different computers -- and for that a prescriptive approach is necessary.  A
/standard/.

([***] Since about Samuel Johnson's time, although the idea does resurface from
time to time -- I believe the original Webster's Dictionary was primarily
prescriptive.)

   -- chris
Oliver Wong - 23 May 2006 21:22 GMT
[Snipped long, but very interesting response -- thanks Chris]

   I found Chris' reply very interesting and informative, and it inspired
me to actually go read The Unicode Standard document. I'll copy and paste
interesting block quotes later on in this post, but for the extremely
impatient, here's a bullet point summary.

   * The Unicode Standard does set a limit on itself at 0x10FFFF (or just
over a million) characters. I don't know why.
   * Unicode deals with abstract semantical concepts of a character, and
not with the glyph, graphic, picture or whatever you want to call it, that
is used to actually visually render that character legible.
   * They specifically say that they do not wish to cover "dance
notations".
   * One interesting (to me anyway, and in the context of this discussion)
character is U+2062. It's a character which is traditionally invisible
(though I suppose fonts are free to supply a graphic for it) which
represents the mathematical concept of multiplication. That is, when you
want to write the concept "A times B", you'd write the character 'A', the
U+20