Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / General / November 2006

Tip: Looking for answers? Try searching our database.

Java equivalent of C++ Spirit or Boost Graph Library ?

Thread view: 
Fab - 27 Oct 2006 21:18 GMT
Hi !

 Because of very interesting feature of Java that lacks to C++ :
memory handled by the language, I wonder if it would be eaily possible
to switch from C++ to Java ( In order to make the writing of an
interpreter for a functional language easier. )

 I see some STL features like vectors are usable in Java.

But about other C++ libraries : do we often find something equivalent
in Java ?
For example :
With Boost Spirit, it's possible to define a language grammar using
Backus Naur Form : very cooler than lex/yacc/bison.
And Boost Graph Library provide maths graph structure, than can be
visited, studied, printed...

Please, does java provide something like these ?

 Thanks for advance

Fabrice
Chris Uppal - 28 Oct 2006 11:27 GMT
>   Because of very interesting feature of Java that lacks to C++ :
> memory handled by the language, I wonder if it would be eaily possible
> to switch from C++ to Java ( In order to make the writing of an
> interpreter for a functional language easier. )

It certainly possible, but you will be learning a new language -- don't assume
that familiarity with C++ will help you learn Java.  (It will help in some way,
but it'll be a hindrance in other, and perhaps more important, ways.)

In particular, Java does not have templates, nor anything which even vaguely
resembles them.  Hence there is not, and cannot be, an equivalent to the
template-based games which underlie the approach taken in the Boost
library(ies).

Of course, Java does have libraries -- it's just that they are based on
different principles.

> With Boost Spirit, it's possible to define a language grammar using
> Backus Naur Form : very cooler than lex/yacc/bison.

Look for ANTLR.

> And Boost Graph Library provide maths graph structure, than can be
> visited, studied, printed...

I don't know of any really advanced graph handling libraries in Java, myself.
(There are some pretty sophisticated graph libraries in C++, but I don't know
whether the Boost library is one of them).  "Jung" and "JGraphT" are probably
worth a look (I haven't used either of them myself).  If the (staggering) price
of "yFiles" is any indication of its quality and completeness then it must be
excellent ;-)   No doubt you'll be able to find other packages...

The usual technique for finding stuff is to go to your favourite search engine.
For instance, type:
   "graph library" java
into Google (it might help to add:
   -chart
to the search terms, though that will eliminate some valid hits too[*]).

As a general point: you'll find that the libraries (standardised, open source,
or commercial) which are available tend to reflect what people are doing with
Java, so there is rather more support for tasks like drawing things or serving
up web pages, than for mathematical operations.

   -- chris

([*] Interesting to note that Google's advert-selection algorithm responds to
me indicating a lack of interest in charts by serving up lots of charting
packages...)
Christopher Benson-Manica - 28 Oct 2006 13:46 GMT
> In particular, Java does not have templates, nor anything which even vaguely
> resembles them.

Surely the 1.5 generics can be said to at least "vaguely resemble" C++
templates?  There are fundamental differences between how the features
are implmented in each language, but they enable similar design
decisions to be made.

> Hence there is not, and cannot be, an equivalent to the
> template-based games which underlie the approach taken in the Boost
> library(ies).

I haven't looked at the Boost code, but I could definitely believe
that the code rampantly abuses templates, as C++ allows.

> Of course, Java does have libraries -- it's just that they are based on
> different principles.

It's worth noting (for OP) that Java's standard libraries are more
full-featured in many ways than anything a C++ library could offer -
for example, a Java author finds a wealth of thread support at his
fingertips, while a C++ author must turn to the operating system (or a
third-party package intended for that operating system) for help.

Signature

C. Benson Manica           | I *should* know what I'm talking about - if I
cbmanica(at)gmail.com      | don't, I need to know.  Flames welcome.

Patrick May - 28 Oct 2006 17:09 GMT
> > In particular, Java does not have templates, nor anything which
> > even vaguely resembles them.
>
> Surely the 1.5 generics can be said to at least "vaguely resemble"
> C++ templates?

    Only syntactically.

> There are fundamental differences between how the features are
> implmented in each language, but they enable similar design
> decisions to be made.

    Actually, they don't.  Type erasure means that a generic class in
Java represents a single type, regardless of how it is parameterized.
This makes it impossible to use generics to implement techniques such
as James Coplien's Curiously Recurring Template idiom.  Despite their
name, Java's generics do not support Generic Programming.

    Generics in Java are only useful for eliminating the need to cast
in certain situations.  They're too complex for the minimal value they
provide.

Regards,

Patrick

------------------------------------------------------------------------
S P Engineering, Inc.  | Large scale, mission-critical, distributed OO
                      | systems design and implementation.
         pjm@spe.com  | (C++, Java, Common Lisp, Jini, middleware, SOA)
Christopher Benson-Manica - 28 Oct 2006 18:08 GMT
> > Surely the 1.5 generics can be said to at least "vaguely resemble"
> > C++ templates?

>      Only syntactically.

That sounds pretty vague :-)

> > There are fundamental differences between how the features are
> > implmented in each language, but they enable similar design
> > decisions to be made.

>      Actually, they don't.  Type erasure means that a generic class in
> Java represents a single type, regardless of how it is parameterized.

Well, perhaps I should have selected a more suitably vague definition
of "similar" for my comment.  I came to Java from C++ and I admit to
being disappointed to learn that many things that could be done with
C++ templates were not possible in Java.

>      Generics in Java are only useful for eliminating the need to cast
> in certain situations.  They're too complex for the minimal value they
> provide.

I disagree; even taken as mere syntactic sugar, the clarity
improvement that said sugar provides is significant.

Signature

C. Benson Manica           | I *should* know what I'm talking about - if I
cbmanica(at)gmail.com      | don't, I need to know.  Flames welcome.

Patrick May - 28 Oct 2006 21:49 GMT
> >      Generics in Java are only useful for eliminating the need to
> > cast in certain situations.  They're too complex for the minimal
> > value they provide.
>
> I disagree; even taken as mere syntactic sugar, the clarity
> improvement that said sugar provides is significant.

    Aye, but they could have been so much more.

Regards,

Patrick

------------------------------------------------------------------------
S P Engineering, Inc.  | Large scale, mission-critical, distributed OO
                      | systems design and implementation.
         pjm@spe.com  | (C++, Java, Common Lisp, Jini, middleware, SOA)
Fab - 29 Oct 2006 00:33 GMT
Very thanks for the help and for the interesting answers !

Chris warns me about the difficulty of learning Java. Good thing :
must think about this. At first glance Java seemed to have very similar
syntax than C++. The main differences I was able to notice was the
simple inheritance and the non overloaded operators. I now realize it
is more deeply different.

 About Spirit substitute, I see antlr is available as a package for my
debian. I discover it can work for Java and C++ too.

 About BGL replacement :
Here is what it can do :
http://www.boost.org/libs/graph/doc/
a lot indeed but its pretty painful interface gives me headeache.
(So I wrap it for an easier use :
http://fabrice.marchant.free.fr/INFO/wrappers/DiGraphPtr.h
)
Looking at Jgraph. (There is a free version too). It seems indeed to
give "... more support for tasks like drawing things ... than for
mathematical operations."

Christopher wrote :
"...I could definitely believe that the code rampantly abuses
templates, ..."
Yes, your absolutely right. When you launch a compile including these
big templates (not really a lib then) on an old PII 350 MHz, you can go
and fetch a beer... But the executable is fast.

 ( More, maybe you would consider this as an abuse : the Boost MPL,
template metaprogramming give you the possibility to compute absolutely
all you want by the compiler ! Submit it a chess position and it 'll
solve mate at compile time. )

( I snip experts discussion about generic programming in Java. )

Jeffrey wrote :
"C++ provides excellent memory management via its standard library.
Post in comp.lang.c++.moderated."
I did post in comp.lang.c++. Moderated forum would be better for this
question ?

"IMO, automatic garbage collection would be a really bad reason to
choose Java over C++, especially since I prefer C++-style resource
management."
 The interpreter I try to write is for a new Lisp-like language.
Mc Carthy created Lisp and Garbage Collection too ( I think ).
 I do need GC too cos the project is a functional language that causes
cyclic references and it would be very hard to explicitely unallocate
the forms that are generated at each evaluation.

 Here is an excerpt from the agressive book, The UNIX-Haters :

"C++ users, alas, are forced to pick up their garbage manually. Many
have been brainwashed into thinking that somehow this is more efficient
than  using something written by experts especially for the platform
they use.
These same people probably prefer to create disk files by asking for
platter, track, and sector numbers instead of by name.
...a study comparing performance of programmer-optimized memory
management techniques in C versus using a standard garbage collector.
C programmers get significantly worse performance by rolling their own.
OK, suppose you are one of those enlightened C++ programmers who wants
a garbage collector portable and reusable..."

 I agree with you in the way it would be very cleaner to be able to
accurately free all we have created. Relying on a GC seems at best
blurred.
However do not know how to avoid this.

"...If your interpreted language can reasonably be parsed with regular
expressions and parse trees, you might not even need a separate library
for it."

Joel de Guzman (Spirit) wrote : "True, there are tools such as
regular-expression libraries (such as boost regex) or scanners (such as
boost tokenizer), but these tools do not scale well when we need to
write more elaborate parsers. Attempting to write even a
moderately-complex parser using these tools leads to code that is hard
to understand and maintain. "

I wrote the parse part of my interpreter project using only the
Standard Template Library and the code is complicated, ugly...

Then I discovered Spirit, I was amazed about the clarity it provides.
Moreover, you program in BNF-like C++ directly ! No extra translation
step required.

 Thank you again Jeffrey for your interesting comparison between the
languages.

 Please, is there somewhere a classified list of the numerous Java
libraries (with a concise description) that we could browse by category
?

 Regards

  Fabrice
Chris Uppal - 29 Oct 2006 11:39 GMT
> Looking at Jgraph. (There is a free version too). It seems indeed to
> give "... more support for tasks like drawing things ... than for
> mathematical operations."

Small point: not "JGraph", but "JGraphT" -- a different package with a
different purpose (and licensing).

>   Please, is there somewhere a classified list of the numerous Java
> libraries (with a concise description) that we could browse by category
> ?

Not that I know of.

The Sun-supplied libraries are already extensive enough that they could do with
a better indexing system than is available.  Large external libraries (such as
the stuff produced under the Apache Jakarta umbrella) are similarly in need of
better indexing.  And there's masses of stuff out there which doesn't fall into
either class.

Sometimes you can find helpful lists on the Web by starting with the names of a
few related libraries, and then searching for pages that mention all of them.
I've just found
   http://java-source.net/
using that technique -- it may help you find out more of what's available.

   -- chris
Jeffrey Schwab - 30 Oct 2006 22:29 GMT
>  Very thanks for the help and for the interesting answers !
>
[quoted text clipped - 37 lines]
> I did post in comp.lang.c++. Moderated forum would be better for this
> question ?

Possibly.  The posters are generally far more familiar with the
language, and more accomplished in their fields.  Many of them have
strong affections for smart pointers, though, which are likely to be
their suggestion to you re. memory management; let me disagree a priori.  :)

Is this your post in comp.lang.c++?

http://groups-beta.google.com/group/comp.lang.c++/browse_thread/thread/e5356a732
3ae95bb/ec868a8be396b6df?hl=en#ec868a8be396b6df


  "I need to use a GC in order to program an
  interpreter for a functional language. Memory
  handling would be otherwise difficult too much."

Do you understand RAII and deterministic destruction?  Once you do, you
will probably be far less nervous about using C++.  Of course you  if
you go with Java, this group is certainly a great resource, and the
level of discourse in comp.lang.java.programmer seems (believe it or
not) to be generally higher than comp.lang.c++. :)

> "IMO, automatic garbage collection would be a really bad reason to
> choose Java over C++, especially since I prefer C++-style resource
> management."
>   The interpreter I try to write is for a new Lisp-like language.
> Mc Carthy created Lisp and Garbage Collection too ( I think ).

So sayeth Wikipedia.

>   I do need GC too cos the project is a functional language that causes
> cyclic references and it would be very hard to explicitely unallocate
> the forms that are generated at each evaluation.

No, that point does not follow from your requirements.  At any time,
each dynamically created object A should be owned by exactly one object
B.  You can have all the circular references you like, as long as you
avoid circular ownership.  Circular references have gained notoriety
because they don't get properly collected by reference-counted garbage
collectors, e.g. Perl's.  They're not a problem in either a
well-designed C++ program or a typical, mark & sweep Java implementation.

>   Here is an excerpt from the agressive book, The UNIX-Haters :
>
[quoted text clipped - 4 lines]
> These same people probably prefer to create disk files by asking for
> platter, track, and sector numbers instead of by name.

Well, f.ck the UNIX-Haters, too.  C++ does not force you to pick up your
garbage "manually."  It does, however, force you to think (for programs
beyond a certain size) about object ownership and other relationships.
It supports you in this endeavor with probably the best static type
system of any language in popular use.  You can use (almost) whatever
design techniques you find most natural:  procedural, object-oriented,
or even functional; low-level or abstract.  (On the down-side, some
techniques, e.g. aspect-oriented programming, do require a silly amount
of typing.)

>  ...a study comparing performance of programmer-optimized memory
> management techniques in C versus using a standard garbage collector.
> C programmers get significantly worse performance by rolling their own.

Right on.  You're unlikely to roll a GC that's better than the HotSpot
VM's.  Remember, though, that C is a very different language from C++;
C++ gives you vastly better ways to manage memory and other resources.

> OK, suppose you are one of those enlightened C++ programmers who wants
> a garbage collector portable and reusable..."
[quoted text clipped - 3 lines]
> blurred.
> However do not know how to avoid this.

Post some details (and preferably a little code) in clc++m and watch the
magic fly. :)

> "...If your interpreted language can reasonably be parsed with regular
> expressions and parse trees, you might not even need a separate library
[quoted text clipped - 4 lines]
> boost tokenizer), but these tools do not scale well when we need to
> write more elaborate parsers.

Do you need an elaborate parser?  How does your grammar look?  Can you
post some sample code from your interpreted language?

> Attempting to write even a
> moderately-complex parser using these tools leads to code that is hard
> to understand and maintain. "

That has not been my experience.  The only interpreted languages I've
written that actually got used regularly by other people have been very
domain-specific, but the parsers were at least moderately complex, and
the code was straight-forward to maintain.  My languages of preference
for parsers have been C++ (because it lets me use one language for the
whole application) and Perl.  I don't doubt that one *can* write hideous
code using regular expressions and tokenizers, but saying so is like
pointing out that only one end of a hammer is any good for whacking nails.

> I wrote the parse part of my interpreter project using only the
> Standard Template Library and the code is complicated, ugly...

Don't blame the STL!  The STL on its own is not meant to be a complete
solution to every problem, any more than Java Collections are.

> Then I discovered Spirit, I was amazed about the clarity it provides.
> Moreover, you program in BNF-like C++ directly ! No extra translation
> step required.

Cool.  I have not used Spirit.  On the other hand, I don't usually
specify languages in BNF. :)

>   Thank you again Jeffrey for your interesting comparison between the
> languages.
>
>   Please, is there somewhere a classified list of the numerous Java
> libraries (with a concise description) that we could browse by category
> ?

No, there's not CPAN-like central repository.  This group seems to be as
good a place as any to ask about specific libraries, though.  You might
start here to get some ideas:

http://directory.google.com/Top/Science/Math/Combinatorics/Software/Graph_Drawing/
Tor Iver Wilhelmsen - 28 Oct 2006 22:56 GMT
> Surely the 1.5 generics can be said to at least "vaguely resemble" C++
> templates?

Not formally - only if Sun foolishly had chosen to implement "real"
templating by synthesizing classes when you used generics, e.g. if use
of

ArrayList<String>

was turned into a class called

ArrayList_Generic$String

or some such nonsense.

> There are fundamental differences between how the features
> are implmented in each language, but they enable similar design
> decisions to be made.

Yes, but the Java solution prevents foot-shooting.
Chris Uppal - 29 Oct 2006 11:04 GMT
[me:]
> > In particular, Java does not have templates, nor anything which even
> > vaguely resembles them.
>
> Surely the 1.5 generics can be said to at least "vaguely resemble" C++
> templates?

Not really, at least not as I see it.  The two language features are completely
different (apart from superficial syntax); the "similarity" is only that there
is an overlap in what they are used /for/.

C++ templates are a way of doing real macros (or "template metaprogramming|" as
they like to call it).  A strange, limited, and clunky feature -- or rather, an
elegant and enormously powerful idea given a strange, limited, and clunky
expression in C++.  One of the many potential applications for that feature is
in creating type-safe collections.

Java's generics are a way of not having to write explicit casts.  A strange,
limited, and clunky feature -- or rather, a moderately useful idea given a
strange, limited, and clunky expression in Java.  One of the few potential
applications for that feature is in creating type-safe collections.

You may not agree (and especially, you may not agree with my value judgements),
but I hope you see what I mean.

Note that the OP's questions were about template applications outside the
overlap with Java's generics.

   -- chris
kevin  cline - 30 Oct 2006 23:11 GMT
> > In particular, Java does not have templates, nor anything which even vaguely
> > resembles them.
>
> Surely the 1.5 generics can be said to at least "vaguely resemble" C++
> templates?

The only thing Java generics have incommon with C++ templates is their
shared use of angle brackets.  C++ templates generate code.   Java
templates are merely syntactic sugar to allow more precise compile-time
type checking.
Jeffrey Schwab - 28 Oct 2006 19:36 GMT
>   Because of very interesting feature of Java that lacks to C++ :
> memory handled by the language, I wonder if it would be eaily possible
> to switch from C++ to Java ( In order to make the writing of an
> interpreter for a functional language easier. )

C++ provides excellent memory management via its standard library.  Post
in comp.lang.c++.moderated.  IMO, automatic garbage collection would be
a really bad reason to choose Java over C++, especially since I prefer
C++-style resource management.

>   I see some STL features like vectors are usable in Java.
>
[quoted text clipped - 5 lines]
> And Boost Graph Library provide maths graph structure, than can be
> visited, studied, printed...

Myriad libraries are available for Java.  The presence of specific
libraries would be an excellent reason for you to choose one language
over the other.  Graph visualization especially seems to have more and
better open source options in Java.

Java does not have an equivalent of Boost, but the standard library is
enormous on its own.  If your interpreted language can reasonably be
parsed with regular expressions and parse trees, you might not even need
a separate library for it.

C++ is tremendous.  I am always amazed at what can be achieved
efficiently, both in terms of runtime resources and programmer effort,
with only the core language and standard library.  It scales down as
well as it scales up, and its object model is both powerful and
graceful.  The conventional wisdom is that Java programs require fewer
lines of code than equivalent C++ programs, but in my limited
experience, the programs being compared are really not equivalent.

The Java language enables a new programmer to start writing useful
programs far more quickly than does C++.  The Java platform is enormous,
and is probably the best reason to choose Java over C++ for a specific
application.  Though the core language is IMHO inferior to C++ its
relative simplicity and native support for multithreading are compelling
advantages.  You will probably have a much easier time going from C++ to
Java than you would going the other way.
Chris Uppal - 29 Oct 2006 11:41 GMT
> C++ provides excellent memory management via its standard library.

I'm not sure that I'd call it "excellent", but however you judge it, it is not
at all suitable for the OP's purposes.  He will need proper garbage collection,
so it's a choice between using a system-supplied one, or writing your own /in/
C++.

But a word of warning: GC algorithms/implementation tend (naturally) to be
tuned to the allocation patterns of the languages for which they are intended.
The pattern of references between objects in Java may be very different from
those naturally arising in a functional language implementation -- and it might
also depend on the style of the implementation.  (I remember a paper comparing
the Boehm conservative collector for C/C++ in an implementation of Lisp and a
combinator-machine implementation of some FP language -- it didn't work well at
all for the latter application, probably because the reference networks were
much bigger and more tangled).

In this case, I'd be a bit wary in case the JVM's collector assumed that chains
of object references were short enough to be followed by explicitly recursive
function calls.  If so, then there /may/ be a risk that it would run out of
stack space trying to GC an object-network that arose from some styles of FP
implementation.  It's easy enough to test -- just create a long chain of
objects (longer than would naturally arise in the FP language implementation)
and see if the collector breaks ;-)

   class Cell { Cell next; }

   Cell head = new Cell();
   Cell tail = head;
   for (long i = 0; i < LONGEST_PLAUSIBLE_CHAIN; i++)
   {
       tail.next = new Cell();
       tail = tail.next;
   }

   head = tail = null;

   // now wait until the GC has had a chance to crash...

Please note that I'm not saying that the GC in any particular JVM /would/ be
unsuitable for an FP language implementation, only that there /might/ be an
issue (depending on many factors, including the implementation strategy of the
FP language itself).

   -- chris
Jeffrey Schwab - 30 Oct 2006 21:41 GMT
>> C++ provides excellent memory management via its standard library.
>
> I'm not sure that I'd call it "excellent", but however you judge it, it is not
> at all suitable for the OP's purposes.  He will need proper garbage collection,
> so it's a choice between using a system-supplied one, or writing your own /in/
> C++.

I really don't understand why you would even think something like that.
 I disagree entirely with your premise.  What makes you think that
garbage collection is indispensable to the authoring of an interpreted
language?  The most popular dynamic languages are written in C, so
clearly, GC is not strictly necessary.  C++, in fact, has a fantastic -
I do call it excellent - model of resource management.  It's not
"improper" garbage collection, or a partial GC implementation, or
anything like it.

Don't get me wrong; I'm not trying to advocate either C++ or Java over
the other.  The two languages have different approaches, and I'm not
sure why you think one of those approaches is the only right answer.  I
definitely feel that GC is an awful reason to choose Java over C++;
memory is only one kind of resource, and I prefer deterministic
destruction to GC and finally-blocks.  The functionality of the Java
platform, the cleanliness of the syntax, and the portability of the
compiled code are all much better reasons to prefer Java.
Chris Uppal - 31 Oct 2006 15:02 GMT
[me:]
> > I'm not sure that I'd call it "excellent", but however you judge it, it
> > is not at all suitable for the OP's purposes.  He will need proper
[quoted text clipped - 3 lines]
> I really don't understand why you would even think something like that.
>   I disagree entirely with your premise.

To be honest I cannot understand how /you/ could possible think that.  I'm not
trying to be offensive, and I don't want to come across that way, but I can't
think of any milder way to express my bafflement.

The OP stated that he was interested in creating a Lisp-like interpreted
language.  There are /very/ few interpreted languages which are not
garbage-collected (the only one /I/ can think is/was early PostScript); and he
also stated that he was interested in Java because it came with GC built-in.
So I'm assuming that he wants his language to feature garbage collection too.
I'll go further than that: he'd have to be insane to consider creating a
general-purpose interpreted language which /didn't/ have garbage collection.
(Special-purpose languages may not need or benefit from it -- I have written
ones which do and -- rather more often -- ones which don't.)

So, how can he implement GC for his language in C++ ?  He /can't/ just expose
C++'s memory management (or lack of it) at the level of his language -- by
hypothesis.  There is no way to implement what is intrinsically a global
operation ("is this memory used anywhere ?") using only local reasoning.  But
C++'s resource management is entirely local.  So he'd have to build a garbage
collector /in/ C++.  It might be that some of the C++ idioms would help in
implementing that -- for instance one might represent "pointers into the heap"
on the C++ stack as some sort of smart pointer which automatically maintained
the necessary housekeeping information so that the global GC could find out
which stack slots held references into the heap.  (Actually, I think that would
be a very bad implementation strategy, but only for performance reasons -- it
would work, but it'd be slow.)

Another option would be to use the Boehm Conservative Collector in the C++
program.  I've used that once myself (for a small language implemented in C).
That has potential problems, and I would not recommend it unless, for some
reason, you /have/ to use C or C++ as the implementation language (in which
case it becomes a reasonable option to evaluate).  Again, though, that
collector supplants the C++ resource management, replacing it with a global
operation.

The /real/ solutions (the ones that don't attempt to cut corners) would be
either to implement the target language in another language which does feature
proper memory management, or to create a /real/ GC implementation in C++.  That
is perfectly possible (note that the GC wouldn't be managing C++'s memory, only
the memory of the target language).  The problem is that it is difficult and
messy to create a decent implementation (for instance it affects /all/ the
other code).  Which is one reason why so many of the "classic" interpreted
languages (Perl, Python, etc) feature half-arsed GC implementations, and indeed
the early JVMs from Sun featured laughably inadequate attempts at GC too.

Incidentally, you said (in a different post in this thread) something like (not
an exact quote) "in any well-designed C++ program, each dynamically allocated
object will have exactly one owner".  I agree, but it's important to realise
that "well designed" here means only "feasibly implementable in C++" -- it is a
/restriction/ imposed by the language, not a feature of well-designed programs
in general.  The effect of C++'s lack of automatic /global/ resource management
is to reduce the design space to only those designs in which a clear notion of
ownership can be identified.  Not all designs have that property, and indeed, I
would say that not many /do/ have that property, and that for a designer to
have to consider /whether/ that property can be engineered into some design is
just a waste of that designer's time and effort.

It's probably obvious by now that I consider GC a necessity for "real"
general-purpose languages, and hence that I consider C++ and C only suitable
for special cases and very small programs.  Writing a very high-performance
language interpreter (such as a modern JVM) can certainly be considered one
such special-case  -- and C++ wouldn't at all be a bad choice /if/ the
implementation were to be so sophisticated that it needs to hit the metal hard.
For a less extreme implementation it seems to me that one would be better off
treating the problem as just another programming task (one where things like
abstraction, clarity, and an unrestricted design-space, are more important than
access to bare metal) and write the implementation in a real programming
language -- Lisp, Smalltalk, Haskell, or even Java...

But I've been wandering off-topic a bit.  The real point is that, since C++
doesn't have global memory management, using it to implement a language which
/does/ have that is tricky at best.

   -- chris
Jeffrey Schwab - 31 Oct 2006 15:24 GMT
> [me:]
>>> I'm not sure that I'd call it "excellent", but however you judge it, it
[quoted text clipped - 7 lines]
> trying to be offensive, and I don't want to come across that way, but I can't
> think of any milder way to express my bafflement.

Ditto completely.

> The OP stated that he was interested in creating a Lisp-like interpreted
> language.  There are /very/ few interpreted languages which are not
> garbage-collected (the only one /I/ can think is/was early PostScript);

The languages are garbage-collected, but the implementations are not.
Even Java's implementation is written in C++.  Nobody's saying his
functional language shouldn't have GC.

> and he
> also stated that he was interested in Java because it came with GC built-in.
> So I'm assuming that he wants his language to feature garbage collection too.

Right, because he thinks he needs it.  My point was that he might not,
and that he probably shouldn't choose his language based on the presence
of absence of GC.

> I'll go further than that: he'd have to be insane to consider creating a
> general-purpose interpreted language which /didn't/ have garbage collection.

I guess that depends on the definition of "general-purpose," but you're
certainly correct that the popular dynamic languages have GC.

> (Special-purpose languages may not need or benefit from it -- I have written
> ones which do and -- rather more often -- ones which don't.)
>
> So, how can he implement GC for his language in C++ ?

Using the same techniques as Ruby, Python, Java, Perl...  Do you really
foresee Groovy or BeanShell replacing LAMP?

> He /can't/ just expose
> C++'s memory management (or lack of it) at the level of his language -- by
> hypothesis.

C++ does not lack memory management.  Please stop implying otherwise.
It's getting irritating.

> There is no way to implement what is intrinsically a global
> operation ("is this memory used anywhere ?") using only local reasoning.  But
> C++'s resource management is entirely local.

No, it's as local as it needs to be.  It doesn't force you to use a big,
global collector, although the standard containers do use allocators
that share the same memory pool.

> So he'd have to build a garbage
> collector /in/ C++.  It might be that some of the C++ idioms would help in
[quoted text clipped - 4 lines]
> be a very bad implementation strategy, but only for performance reasons -- it
> would work, but it'd be slow.)

He wouldn't have to write any such thing, because the standard library
includes smart pointer implementations, and various others are freely
available.  Rather than pan the performance on theoretical grounds, you
might want to get some numbers from the real world.

FWIW, I agree that smart pointers are not the way to go.  I'd rather
have either Java-style GC, or use smart Factories.

> Another option would be to use the Boehm Conservative Collector in the C++
> program.  I've used that once myself (for a small language implemented in C).
[quoted text clipped - 3 lines]
> collector supplants the C++ resource management, replacing it with a global
> operation.

C is not C++.  Really.

All allocators of a given type have to use the same storage pool in
order to be compatible with the STL.  They typically use static pools,
so they actually are global.  The STL containers put memory back into
the pools as soon as they're through with it, which may be either faster
or slower than waiting for them to be marked & swept, depending on the
situation.

> The /real/ solutions (the ones that don't attempt to cut corners) would be
> either to implement the target language in another language which does feature
[quoted text clipped - 5 lines]
> languages (Perl, Python, etc) feature half-arsed GC implementations, and indeed
> the early JVMs from Sun featured laughably inadequate attempts at GC too.

Have you seriously had a problem with GC in Python?  They're not
"half-arsed" implementations.

> Incidentally, you said (in a different post in this thread) something like (not
> an exact quote) "in any well-designed C++ program, each dynamically allocated
> object will have exactly one owner".  I agree, but it's important to realise
> that "well designed" here means only "feasibly implementable in C++" -- it is a
> /restriction/ imposed by the language, not a feature of well-designed programs
> in general.

We disagree again.  You've effectively said that a certain type of GC is
critical, but that only Java (and I assume you acknowledge C#) provides
it.  I say that deterministic destruction is a reasonable alternative.
The fact that C++ has better support for it than other languages does
not mean the language is more restrictive.

> The effect of C++'s lack of automatic /global/ resource management
> is to reduce the design space to only those designs in which a clear notion of
> ownership can be identified.  Not all designs have that property, and indeed, I
> would say that not many /do/ have that property, and that for a designer to
> have to consider /whether/ that property can be engineered into some design is
> just a waste of that designer's time and effort.

I think we're viewing the world from very different angles.  Determining
the hierarchy ownership has proven one of the most valuable techniques
at my disposal.

> It's probably obvious by now that I consider GC a necessity for "real"
> general-purpose languages, and hence that I consider C++ and C only suitable
> for special cases and very small programs.

Why do you keep mentioning C?

It sounds like only Java has a GC you approve, ergo you consider only
Java a "real" general-purpose language.  I suppose you might make an
allowance for C#.

> Writing a very high-performance
> language interpreter (such as a modern JVM) can certainly be considered one
[quoted text clipped - 5 lines]
> access to bare metal) and write the implementation in a real programming
> language -- Lisp, Smalltalk, Haskell, or even Java...

None of which are (except Smalltalk) are canonically implemented in
languages with native GC.

> But I've been wandering off-topic a bit.  The real point is that, since C++
> doesn't have global memory management, using it to implement a language which
> /does/ have that is tricky at best.

You're wrong. :)

>     -- chris
Chris Smith - 15 Nov 2006 07:44 GMT
Pardon for jumping in...

> The languages are garbage-collected, but the implementations are not.
> Even Java's implementation is written in C++.  Nobody's saying his
> functional language shouldn't have GC.

Right.  Machine code is not garbage collected, and there exist garbage
collected interpreted languages, ergo it is possible to implement a
garbage collected language interpreter in a non-collected language.  
That doesn't mean it is easy.  In fact, I will go ahead and assert,
without explicit evidence, that it is among the most difficult things to
do well, throughout all of software development.  I feel comfortable
asserting this because I've done it, and I've done most other things,
and I can compare my experience in the two.

There are ways to make it easier, of course.  You can give up on certain
performance implementation techniques, for example by sticking with a
single kind of collector, and avoiding generational implementation.  You
can give up on correct implementation and document the deficiencies, as
happens with both conservative collectors and practically all reference
counting implementations.  But then you have to deal with the
deficiencies, and it's still not exactly easy unless you go the ref-
counting route.

> Right, because he thinks he needs it.  My point was that he might not,
> and that he probably shouldn't choose his language based on the presence
> of absence of GC.

I'd definitely think that if one can save perhaps six months of hard
development work, and get a first-class garbage collector in the
bargain, by choosing a certain implementation language, then that's a
pretty decent reason for the decision.  Perhaps there are other factors
that would outweigh that advantage, but they'd have to be pretty big
factors.

> C++ does not lack memory management.  Please stop implying otherwise.
> It's getting irritating.

C++ certainly lacks automatic memory management, resulting in
programmers doing a lot of the work that would be considered memory
management in a garbage collected language.  You can be as picky as you
like about the words; but there's something being said there whether you
like the wording or not.  Feel free to propose new wording, and I'm sure
people will be happy to consider it.

> > There is no way to implement what is intrinsically a global
> > operation ("is this memory used anywhere ?") using only local reasoning.  But
> > C++'s resource management is entirely local.
>
> No, it's as local as it needs to be.

If we're talking about memory management at the C++ level, the language
specification makes it well-nigh impossible to provide a real garbage-
collected implementation of C++.  The culprit is that there's too much
specified behavior involving memory layout and pointers.  If more of
that kind of thing resulted in undefined behavior, then it may be
possible to provide a garbage-collected implementation; but that's not
the case.

Of course, none of that applies to the target interpreted language; but
you certainly can't push memory management back onto the C++ manual heap
in the same way you can in Java or any other garbage collected language.  
A reasonable-performance garbage collector would require that the
interpreter allocate large blocks of memory for itself, probably using a
system call like sbrk rather than any C++ language concept, and then
provide its own memory management with that.  Memory that's used by the
interpreted language could not be allocated from the standard C++ heap,
whereas that could easily be done in Java.

> Rather than pan the performance on theoretical grounds, you
> might want to get some numbers from the real world.

If you're referring to the statement that reference counting would yield
poor performance, that is well-documented enough throughout the gc
literature that there's no reason to ask anyone to test it yet again
every time they want to make the claim.

> Have you seriously had a problem with GC in Python?  They're not
> "half-arsed" implementations.

I haven't used Python enough to know whether the problems are serious or
not.  From reading documentation, though, it appears Python has a
correct garbage collector with poorer than needed performance.  That may
be fine for many applications, and not for others.  It also appears to
have taken them a while to finish (they had a broken garbage collector
until version 2.0).  It doesn't appear to be buggy.

That being said, though, Python is a pretty big project with a lot of
developers.  Things that are feasible for Python are not necessarily
feasible in other contexts.  (Actually, I'm surprised that Python
doesn't have a better collector by now -- one that combines different GC
techniques according to what's best for the specific generation -- but I
suppose there may be enough overhead elsewhere that gc overhead doesn't
matter so much.

> > Incidentally, you said (in a different post in this thread) something like (not
> > an exact quote) "in any well-designed C++ program, each dynamically allocated
[quoted text clipped - 6 lines]
> critical, but that only Java (and I assume you acknowledge C#) provides
> it.  I say that deterministic destruction is a reasonable alternative.

While being careful not to speak for Chris Uppal, I'd suggest it's very
likely that when he talks about a real garbage collector, he means one
that:

  (a) doesn't sacrifice performance unnecessarily
  (b) is correct (not conservative, collects all garbages)

Garbage collectors that provide deterministic destruction are okay for
very limited applications, but they have performance disadvantages that
prevent their being used for most applications.  In most cases where
deterministic destruction is needed, the application should not be
implemented with a garbage collected heap at all.

> The fact that C++ has better support for it than other languages does
> not mean the language is more restrictive.

If you mean that a non-garbage collected language such as C++ is
sufficient for general purpose scripting, then you are somewhat outside
of the mainstream, and near-universal practice disagrees with you.

> I think we're viewing the world from very different angles.  Determining
> the hierarchy ownership has proven one of the most valuable techniques
> at my disposal.

It is provable (see work by Hans Boehm) that there are problems for
which solutions satisfying the need for an explicit free operation
require asymptotically greater running time than the equivalent
implementation in a deferred garbage collector.  It is also provable
(see Andrew Appel) that the same is not true in the other direction.  
This is not a matter of perspective.  If you find assigning ownership
helpful to your design, you may continue to do it in any language; but
not needing to do so when it makes efficient solutions impossible is
objectively beneficial.

> It sounds like only Java has a GC you approve, ergo you consider only
> Java a "real" general-purpose language.  I suppose you might make an
> allowance for C#.

Huh?  You are reading something into Chris's responses that isn't
actually there.

Signature

Chris Smith

Chris Uppal - 15 Nov 2006 12:35 GMT
> Pardon for jumping in...

Of course ;-)

> It is provable (see work by Hans Boehm) that there are problems for
> which solutions satisfying the need for an explicit free operation
> require asymptotically greater running time than the equivalent
> implementation in a deferred garbage collector.  It is also provable
> (see Andrew Appel) that the same is not true in the other direction.

Do you have references for that ?  (I'm not disputing you -- indeed, something
like it seems intuitively obvious -- but I'd like to see the original papers.)

> > It sounds like only Java has a GC you approve, ergo you consider only
> > Java a "real" general-purpose language.  I suppose you might make an
> > allowance for C#.
>
> Huh?  You are reading something into Chris's responses that isn't
> actually there.

<nods> I assumed that Jeffrey had forgotten that [I knew that?] there are
several (as I would call it) "real" languages, designed /and/ implemented for
programming in-the-large.  Java, at best[*], is only one of them.

   -- chris

[*] I may as well repeat that I don't think Java is particularly
well-designed -- for programming at any scale -- but my reservations have
nothing to do with the quality of implementation of Sun's recent JVMs.
Chris Smith - 15 Nov 2006 18:21 GMT
> > It is provable (see work by Hans Boehm) that there are problems for
> > which solutions satisfying the need for an explicit free operation
[quoted text clipped - 4 lines]
> Do you have references for that ?  (I'm not disputing you -- indeed, something
> like it seems intuitively obvious -- but I'd like to see the original papers.)

Regarding the first, I have nothing more specific that what I originally
wrote, which is that Hans Boehm is responsible for it.  I looked, but
was unable to find a reference for it.  The paper gave a specific
problem -- IIRC something to do with string processing -- that can't be
done in better than O(n^2) time if you need to free the memory
explicitly, but which uses O(n) time as part of a larger application
with a copying garbage collector.  Essentially, tracking when a piece of
memory needs to be freed is quadratic, but the core algorithm is linear.

Andrew Appel is ultimately responsible for the second, though it's been
refined over time.  The original argument is at
http://citeseer.ist.psu.edu/appel87garbage.html.  I think it's fair to
say that the consensus is that there is no generalized justification for
the "faster" statement of the original paper, which depends on the
details of the machine architecture and isn't an asymptotic statement at
all; but "asymptotically at least as fast" definitely does follow.

Signature

Chris Smith

Chris Uppal - 16 Nov 2006 16:09 GMT
> Regarding the first, I have nothing more specific that what I originally
> wrote, which is that Hans Boehm is responsible for it.  I looked, but
> was unable to find a reference for it.

OK, thanks for looking.  I'll keep an eye open in case I happen across it
anywhere.

> Andrew Appel is ultimately responsible for the second, though it's been
> refined over time.  The original argument is at
> http://citeseer.ist.psu.edu/appel87garbage.html.

Got it, thanks.  (The paper seems vaguely familiar -- I must have seen it
before somewhere, sometime...)

BTW, I liked the reference to the "Massive Memory Machine" -- all 128 MBytes of
it ;-)  How times change !

   -- chris
Simon Brooke - 16 Nov 2006 20:06 GMT
> Pardon for jumping in...

Pardon for following up to you; I don't have the parent article on my
server.

>> The languages are garbage-collected, but the implementations are not.
>> Even Java's implementation is written in C++.  Nobody's saying his
>> functional language shouldn't have GC.

This is bollocks[tm]. There are plenty of machines which run LISP down on
the metal. LISP is both interpreted and compiled, and whether interpreted
or compiled is garbage collected (indeed on most LISP systems interpreted
and compiled functions live in the same garbage collected space).

I, too, have written garbage collectors, and all the garbage collectors
I've written so far have been written in the language that they collected
for.

Yes, garbage collectors are tricky. I still have a considerable soft spot
for reference counters, simply because they don't stop the world and don't
move things about; but unless you partition your space into spaces for
different object sizes you get fragmentation, which is a downside. And
some things do get locked in, so you will ultimately get silting. What all
this means is that it works better for things like LISP (where most
objects are the same size, and only a few types of object are variable in
size) than for Java (where most types of object differ in size from one
another).

>> Right, because he thinks he needs it.  My point was that he might not,
>> and that he probably shouldn't choose his language based on the presence
>> of absence of GC.

In my opinion any language with manual memory management is obsolete,
industrial archaeology, only fit for showing students how not to do
things. More serious software errors come out of mistakes in manual memory
management than any other single cause. These mistakes are unnecessary;
they are a consequence of using obsolete technology.

>> We disagree again.  You've effectively said that a certain type of GC is
>> critical, but that only Java (and I assume you acknowledge C#) provides
[quoted text clipped - 12 lines]
> deterministic destruction is needed, the application should not be
> implemented with a garbage collected heap at all.

Signature

simon@jasmine.org.uk (Simon Brooke) http://www.jasmine.org.uk/~simon/

                               ... a mild, inoffensive sadist...



Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.