Java Forum / General / July 2005
Jon's many-language ray tracer
alex goldman - 24 Jun 2005 15:43 GMT Regarding Jon's ray tracer, I'll play the devil's advocate:
When you only have spheres, algebraic types are nice and dandy, but if you had a plethora of different objects to render, some of them special cases of others, perhaps using OOP and organizing them into a class hierarchy would be better?
Vincenzo Ciancia - 24 Jun 2005 17:53 GMT > When you only have spheres, algebraic types are nice and dandy, but if you > had a plethora of different objects to render, some of them special cases > of others, perhaps using OOP and organizing them into a class hierarchy > would be better? Maybe, but when in need of binary methods you will have to make one or more messes - and end up using the visitor pattern which I don't find particularly pleasant. What's the functional equivalent to the visitor pattern, and to the class hierarchy which is so often used as an example in OOP textbooks?
I only had to do such a thing once, and used ocaml polymorphic variants - and I don't remember exactly what my solution looked like, nor if I liked my solution or not, but I think it was the visitor pattern implemented as a match over polymorphic variants (representing the "extensible" ADT of shapes), which would be IMHO the clever brother of the usual thing :)
What should be the functional way to implement a hierarchy like
shape circle <: shape box <: shape
and the function "intersect" which is specialized in the case (box,box) and (circle,circle), but maybe also on (circle,box)?
What should be the functional way to allow extensibility and successive refinement of the intersect function?
Maybe there are elegant solutions only using ADTs, but maybe one needs functors, or polymorphic variants, or type classes, existential types etc. Is there anyone willing to propose solutions?
bye and thanks
Vincenzo
 Signature Please note that I do not read the e-mail address used in the from field but I read vincenzo_ml at yahoo dot it Attenzione: non leggo l'indirizzo di posta usato nel campo from, ma leggo vincenzo_ml at yahoo dot it
alex goldman - 24 Jun 2005 18:58 GMT >> When you only have spheres, algebraic types are nice and dandy, but if >> you had a plethora of different objects to render, some of them special [quoted text clipped - 32 lines] > > Vincenzo There are a few problems that I see in your post:
1. We are not discussing functional vs imperative, but algebraic datatypes vs inheritance
2. In the raytracer, "intersect" works on (ray, shape) pairs not (shape, shape), so the visitor pattern is irrelevant
Regardless
3. Some OOP systems don't need the visitor pattern, because they have http://en.wikipedia.org/wiki/Multiple_dispatch
And also
4. Abruptly changing "Subject:", while the subject remains the same, is not very helpful.
5. The ray tracer was actively discussed lately in both clf and cljp - Jon started the discussion there himself. I'd like to hear what both communities have to say about this. Repeating the same arguments separatly in different groups is a waste of time. Don't trim the groups list.
Vincenzo Ciancia - 24 Jun 2005 19:59 GMT First of all, I am sorry for my previous post which is off topic on cljp, of course I didn't notice the crosspost when replying (I will try to configure my newsreader a little better in order to notice crossposts in the future).
> There are a few problems that I see in your post: > > 1. We are not discussing functional vs imperative, but algebraic datatypes > vs inheritance In fact I never used the word "imperative" in my post.
I am replying to your topic, which seemed to me: "object-oriented approaches vs. ADT approaches to problems that arise when writing graphics programs". I generalized a bit but used a rather imprecise word: "functional", I meant "ADT plus newer but still ADT-related techniques, like polymorphic variants, modules, functors, existential ADTs and so on".
> 2. In the raytracer, "intersect" works on (ray, shape) pairs not (shape, > shape), so the visitor pattern is irrelevant > > Regardless I tried to interpret your intentions and talk about object-oriented hierarchies and their equivalent in functional languages. I was wrong it seems, as you only want to talk about Jon raytracer - should I be sorry for having replied? If so, you should be sorry for having asked in the first place <G>.
> 3. Some OOP systems don't need the visitor pattern, because they have > http://en.wikipedia.org/wiki/Multiple_dispatch I know, of course! I unintentionally stated that you _need_ the visitor pattern in OOP. But I guess I can state that using C#, C++, Java, Eiffel, the O of ocaml and other more or less "mainstream" OO languages, one will surely think about using the visitor pattern when in need of binary methods.
> And also > > 4. Abruptly changing "Subject:", while the subject remains the same, is > not very helpful. Seems like you felt the change of subject as a critique of the one you used...
I changed the subject on purpose, because I was introducing a rather interesting (at least, to me) subject which I thought be strictly related to your post (hence the reply) but not strictly related with the original subject (Jon's raytracer), so I changed the subject hoping that this would contribute to the signal-to-noise ratio of a possible search on usenet for the topic.
I see that the subject has much more meaning for readers of cljp than for readers of clf due to previous discussions, but pardon me I was not aware of those, so I changed the subject as I am used to do when I think it's worth - and BTW I think discussing on usenet about what the subject of a certain post should be is even less helpful than changing a subject :)
> 5. The ray tracer was actively discussed lately in both clf and cljp - Jon > started the discussion there himself. I'd like to hear what both > communities have to say about this. Repeating the same arguments > separatly in different groups is a waste of time. Don't trim the groups > list. I actually didn't, so what has this to do with the "few problems" in my post? And also, I still think that my post was perfectly in topic with your on CLF, I don't see why you're upset with that.
Again - sorry for this other off topic on both NGs - I think this post will clarify the previous one to everybody.
V.
 Signature Please note that I do not read the e-mail address used in the from field but I read vincenzo_ml at yahoo dot it Attenzione: non leggo l'indirizzo di posta usato nel campo from, ma leggo vincenzo_ml at yahoo dot it
Joachim Durchholz - 25 Jun 2005 09:04 GMT > 2. In the raytracer, "intersect" works on (ray, shape) pairs not (shape, > shape), so the visitor pattern is irrelevant Agreed.
> Regardless > > 3. Some OOP systems don't need the visitor pattern, because they have > http://en.wikipedia.org/wiki/Multiple_dispatch Multiple dispatch just shifts the problems to a different area.
I.e. if programmer A add a new ray type and programmer B adds a new shape type, they can't do that independently. It's just the same mess as with the visitor pattern.
Regards, Jo
Jesper Nordenberg - 25 Jun 2005 19:48 GMT > > 2. In the raytracer, "intersect" works on (ray, shape) pairs not (shape, > > shape), so the visitor pattern is irrelevant [quoted text clipped - 10 lines] > I.e. if programmer A add a new ray type and programmer B adds a new > shape type, they can't do that independently. What do you mean? You can force the visitor implementor to implement a default method that can be called with types of shapes.
> It's just the same mess as > with the visitor pattern. No, multiple dispatch has a number of advantages over the visitor pattern. See this page for more info: http://nice.sourceforge.net/visitor.html
/Jesper Nordenberg
Joachim Durchholz - 25 Jun 2005 20:47 GMT >>> 3. Some OOP systems don't need the visitor pattern, because they >>> have http://en.wikipedia.org/wiki/Multiple_dispatch [quoted text clipped - 6 lines] > What do you mean? You can force the visitor implementor to implement a > default method that can be called with types of shapes. That's one of the ways people try to address the problem.
The base problem is: you have N ray types and M shape types. With any kind of multiple dispatch (be it the Visitor pattern or built right into the language), that's a matrix of NxM functions (of course, in practice, that matrix is filled with relatively few values).
Now if both programmers add a new type, you have an (N+1)x(M+1) matrix. And it's the [N+1][M+1] corner case that gives us all the trouble: neither the programmer who's responsible for the N axis nor the programmer who's responsible for the M axis will feel responsible for that case - in fact neither knows that this new case exists! And if the software has an unknown case, there will be nobody who reviews whether it's handled correctly. Even worse: adding a new type is an important design change, so it's quite likely that this new combined case is nontrivially different from anything that exists.
Forcing a default just covers the problem up by making the system compile. What's actually needed is the exact opposite: a compiler message that informs whoever uses the two extensions together that there's a new, unimplemented case in his system.
Needless to say that this plays havoc with composability: adding two independent extensions to the system may break it. In other words, multiple dispatch breaks modularity (the ability to compose independent libraries without getting adverse interactions). For the Javaites among the readers: It also prevents dynamic class loading. Well, one could defer the error message until the not-provided-for type combination actually turns up and throw an exception. You'd get multiple dispatch at the price of slightly destabilising the software... which is actually feasible with the right overall architecture; see the whitepapers on system stability in Erlang.
>> It's just the same mess as with the visitor pattern. > > No, multiple dispatch has a number of advantages over the visitor > pattern. See this page for more info: > http://nice.sourceforge.net/visitor.html OK, I agree MD solves several problems of Visitor. It doesn't solve the non-modularity problem though. And I don't think that this is solvable; whatever structure you impose, it will lead to unexpected results (read: bug opportunities) in some cases.
The best way to handle the problem is to decompose the operation on NxM types into two: one Nx1, and one 1xM, where the composition will always do the right thing, without either Visitor or MD. Unfortunately, it seems that this strategy doesn't always work (but I'm not sure that this is a fundamental problem - it might be lack of inventiveness that makes us fail to see how to decompose NxM operations).
Regards, Jo
Philippa Cowderoy - 26 Jun 2005 14:25 GMT > The best way to handle the problem is to decompose the operation on NxM types > into two: one Nx1, and one 1xM, where the composition will always do the > right thing, without either Visitor or MD. Unfortunately, it seems that this > strategy doesn't always work (but I'm not sure that this is a fundamental > problem - it might be lack of inventiveness that makes us fail to see how to > decompose NxM operations). Perhaps, but it's not going to go to Nx1 and 1xM - sooner or later that amounts to assuming the problem's bilinear. Other structures no doubt exist - I'm having some fun exploring extensible parsing for a wiki, though that has the advantage of an easy fallback case ("it's not markup, so it must be plain text").
 Signature flippa@flippac.org
The task of the academic is not to scale great intellectual mountains, but to flatten them.
Joachim Durchholz - 26 Jun 2005 23:29 GMT >> The best way to handle the problem is to decompose the operation on >> NxM types into two: one Nx1, and one 1xM, where the composition will [quoted text clipped - 5 lines] > Perhaps, but it's not going to go to Nx1 and 1xM - sooner or later that > amounts to assuming the problem's bilinear. Hmm... for some definition of "bilinear", yes ;-)
> Other structures no doubt exist - I'm having some fun exploring > extensible parsing for a wiki, though that has the advantage of an > easy fallback case ("it's not markup, so it must be plain text"). Which has its own problems - I'm involved in just such a wiki, too :-)
Re "other structures": I think one can often (maybe always) restructure the problem, just as one can restructure the nonlinear Tree data structure into linear "nodes" and "links". The transformation usually is nonobvious (otherwise the software would have been structured differently), and that's where inventiveness comes in.
One example is e.g. particle physics simulation. For N particle types, one could write N*(N-1)/2 functions for defining the interaction between them. Or one could write N functions that define the interaction of a particle with the various fields :-)
Regards, Jo
Jesper Nordenberg - 26 Jun 2005 23:25 GMT > Forcing a default just covers the problem up by making the system > compile. What's actually needed is the exact opposite: a compiler > message that informs whoever uses the two extensions together that > there's a new, unimplemented case in his system. Forcing a default handler is the only solution I can think of in a language that supports dynamic class loading. You then have the option of throwing an exception in the default handler if you can't/won't handle the sub type.
> Needless to say that this plays havoc with composability: adding two > independent extensions to the system may break it. In other words, [quoted text clipped - 6 lines] > destabilising the software... which is actually feasible with the right > overall architecture; see the whitepapers on system stability in Erlang. In some cases having a default handler and special handling of a few sub types is what the developer wants. In other cases you want special handling of all sub types, but as you note this is impossible to check at compile time on platforms that support dynamic class loading (unless, like in the visitor pattern, the programmer is forced to provide a list of supported sub types for multiple dispatch). Compiler support for multiple dispatch could actually give you the option of checking that all sub types available at compile time are handled in separate methods, which would give some protection against the bugs you mention.
/Jesper Nordenberg
Joachim Durchholz - 27 Jun 2005 17:10 GMT >>Forcing a default just covers the problem up by making the system >>compile. What's actually needed is the exact opposite: a compiler [quoted text clipped - 5 lines] > of throwing an exception in the default handler if you can't/won't > handle the sub type. The problem is: who't that "you"? Programmer A, programmer B, or the system integrator C who wants to combine the extensions from A and B?
Neither A nor B know there's an interaction. C doesn't know about the internals and cannot fix any problems that might arise (in fact he'll probably never become aware of the problem - that's the worst of all scenarios: subtle bugs that go unnoticed until somebody recalculates the results by hand...)
> In some cases having a default handler and special handling of a few > sub types is what the developer wants. It's OK if the semantics is always the same but the implementation can be optimised for common combinations. The worst that will happen is that the (N+1)(M+1) case will take more time than needed - but since we're in an opimisation scenario here, integrator C is most likely the person doing the performance tests and looking for bottlenecks.
> In other cases you want special > handling of all sub types, but as you note this is impossible to check [quoted text clipped - 5 lines] > separate methods, which would give some protection against the bugs > you mention. Well, yes... but imagine a newbie programmer loading umpteen modules and extensions (still in the phase of trying everything out) and getting lots of errors that are related to modules he never even heard about!
Same can happen with experienced programmers: loading umpteen modules and extensions because they are all needed, but first thing that happens is that he'll have to write all those missing (N+1)(M+1) functions. Yet another chore before you can actually begin with productive work *sigh*....
My personal favorite is to disallow multiple dispatch except within a module boundary, where "module" is defined as "has one person or team who's responsible for it" in addition to the usual modularisation criteria. That way, programmers A and B are the same person and know that they have a new (N+1)(M+1) case. (For example, this is how built-in arithmetic is typically done: the author(s) of compiler and run-time system know about all the admissible combinations, and if a new type is introduced, they will be able to see all combinations that now need to be handled. It's not a library context, but the issues are exactly those we've been discussing - and if the arithmetic system is ever going to be a programmer-providable module, this multiple dispatch problem needs to be solved.)
Regards, Jo
Marcin 'Qrczak' Kowalczyk - 27 Jun 2005 18:41 GMT > My personal favorite is to disallow multiple dispatch except within a > module boundary, where "module" is defined as "has one person or team > who's responsible for it" in addition to the usual modularisation > criteria. It's not sufficient for equality: you really want to define your own equality for your own types, even in they come in different modules.
My personal favorite is no constraints, living with the possibility of writing incomplete systems where certain combinations of types will fail at runtime. This is because I know no system of constraints which wouldn't throw out too many sensible programs.
My second favorite is Haskell classes with common extensions (multiparameter classes, functional dependencies etc.). Availability of operations is thus checked statically. The system is limited by static typing, e.g. you can't use Haskell classes for different kinds of nodes in an abstract syntax tree - you must use some other mechanism for that. Algebraic types fit quite well, except that they are not extensible.
 Signature __("< Marcin Kowalczyk \__/ qrczak@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/
Joachim Durchholz - 27 Jun 2005 19:02 GMT >>My personal favorite is to disallow multiple dispatch except within a >>module boundary, where "module" is defined as "has one person or team [quoted text clipped - 3 lines] > It's not sufficient for equality: you really want to define your own > equality for your own types, even in they come in different modules. Equality it too complicated anyway. And it isn't even decidable if functions can be constructed at run-time (as is common in functional languages).
Besides, as soon as the data structures get large, the concept of equality becomes less and less clear.
Actually it isn't even clear for strings. If you have ASCII, multi-byte, and UTF strings, should they be "equal" if they denote the same character sequence, or should they be "equal" if they are byte-for-byte equal? The answer is: it depends. Which means: the programmer should select the appropriate equivalence relationship. BTW this isn't limited to representational variation: sometimes the appropriate equivalence relationship is case-insensitive comparison, and nobody finds it strange that it's syntactically different from standard string equality! Next is equivalence unter mutability. Say, a data structure has some expensive-to-compute field that's present if it ever was queried, absent otherwise (FPLers will recognise this as a "memoised" value). Should two data objects with a different status in this field be considered equal or not equal? Answer (again): it depends. When (say) testing a marshalling algorithm or doing other internal work, I want it included; when looking at the data structure from the outside, I want it excluded. (This is just a variant of representational equivalence.)
So my answer is: you don't need equality, you need equivalence. Which is just another binary operator with a boolean result, and doesn't introduce any problems that weren't present before.
Regards, Jo
Marcin 'Qrczak' Kowalczyk - 27 Jun 2005 22:12 GMT >> It's not sufficient for equality: you really want to define your own >> equality for your own types, even in they come in different modules. > > Equality it too complicated anyway. I encountered only two conceptual difficulties with equality:
- If many important types in a language are mutable (e.g. strings, lists), the relation which should play the role of equality of these values according to general principles (object identity) is usually less often wanted than comparing elements.
- When comparing numbers, it's often useful to make models of the same mathematical value expressed with different types compare as equal, and to have special features of floating point equality (-0.0 is equal to 0.0 even though they can be distinguished, NaN is not equal to itself). This is incompatible with using equality for the most distinguishing equivalence relation.
Of course various languages yield additional problems with associating implementation of equality with types, but these are weaknesses of particular languages, not conceptual problems.
> And it isn't even decidable if functions can be constructed at > run-time (as is common in functional languages). They should not be comparable for equality.
> Besides, as soon as the data structures get large, the concept of > equality becomes less and less clear. But for those types for which it is clear, the language should provide a mechanism for this.
When I make a dictionary indexed by pairs of strings, I don't want to be forced to manually specify how they should be compared.
> Actually it isn't even clear for strings. If you have ASCII, > multi-byte, and UTF strings, should they be "equal" if they denote > the same character sequence, or should they be "equal" if they are > byte-for-byte equal? They should be equal if they cannot be distinguished by other means, not equal if they can.
Applying a lossy conversion between strings for determining equality is asking for trouble. It's important which strings use which format.
Anyway, I see place for at most two string types in a general purpose language: sequences of Unicode code points, and sequences of bytes.
> BTW this isn't limited to representational variation: sometimes the > appropriate equivalence relationship is case-insensitive comparison, > and nobody finds it strange that it's syntactically different from > standard string equality! Of course, but this is not equality, just some equivalence relation. Structures like hash tables should not be restricted to using equality on keys, as sometimes I want the keys to be compared e.g. case insensitively. But equality should be the default because it's the most common case and shouldn't be surprising.
My favorite way of providing the relation for key comparison is not specifying a binary relation plus a hashing function, but specifying a function which transforms the keys into things which should be then compared using the default equality. Such function e.g. folds string case, or extracts a tuple of fields from a record (if the equality for this record isn't already defined to compare these fields).
This is not as general as an arbitrary binary relation in theory, but covers all cases I encountered, and it's easier to use. No need for providing a hashing function compatible with the equivalence. It's also faster because the hash table stores transformed keys instead of transforming them on the fly during searching.
> Next is equivalence unter mutability. Say, a data structure has some > expensive-to-compute field that's present if it ever was queried, > absent otherwise (FPLers will recognise this as a "memoised" value). > Should two data objects with a different status in this field be > considered equal or not equal? Does code outside the implementation of that structure need to distinguish between this field being already computed and not, or this is an internal detail and the value is always computed when asked for? In the first case the presence of the field should be taken into account, in the second case it should not.
 Signature __("< Marcin Kowalczyk \__/ qrczak@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/
Torben Ægidius Mogensen - 28 Jun 2005 09:52 GMT > My second favorite is Haskell classes with common extensions > (multiparameter classes, functional dependencies etc.). Availability > of operations is thus checked statically. The system is limited > by static typing, e.g. you can't use Haskell classes for different > kinds of nodes in an abstract syntax tree - you must use some other > mechanism for that. While type classes might not be suitable for individual nodes in a syntax tree, they works well for syntactic categories -- expressions, statements, declarations, etc. For example, you can make a "pretty print" type class and instantiate it for each category, so you don't need to invent names for each instance, and you can use a generic (overloaded) function for such as error messages that return a message string and a pretty-print of the offending sub-tree.
Torben
Jon Harrop - 24 Jun 2005 23:57 GMT > Maybe, but when in need of binary methods you will have to make one or > more messes - and end up using the visitor pattern which I don't find > particularly pleasant. What's the functional equivalent to the visitor > pattern, and to the class hierarchy which is so often used as an example > in OOP textbooks? When learning a new paradigm, you should not try to translate old solutions but, rather, you should try to reimplement old problems. So don't ask what the "visitor pattern" looks like in a functional implementation. Find a representative example which has been solved using the visitor pattern. Forget about the visitor pattern. Then try to solve the same problem in ML.
> What should be the functional way to implement a hierarchy like > > shape > circle <: shape > box <: shape type shape = Circle of vec * float | Box of vec * vec
> and the function "intersect" which is specialized in the case (box,box) > and (circle,circle), but maybe also on (circle,box)? Simply:
let intersect s1 s2 = match s1, s2 with Circle (c1, r1), Circle (c2, r2) -> ... | Circle (c, r), Box (l, u) | Box (l, u), Circle (c, r) -> ... | Circle (c1, r1), Circle (c2, r2) -> ...
> What should be the functional way to allow extensibility and successive > refinement of the intersect function? There are two different ways to extend this problem:
1. You could add a new shape.
In ML, you do this:
type shape = Circle of vec * float | Box of vec * vec | Polygon of vec list
and the compiler will then tell you exactly where you have pattern matches which don't handle Polygon.
In C++ or Java, you add a new derived class called Polygon:
class Polygon : public Shape { ... }
2. You could add a new function which acts upon shapes.
In ML, you do this:
let bound = function Circle of (c, r) -> ... | Box of (l, u) -> ...
In C++ or Java, you add a new member function to shape:
class Shape { ... virtual Bound bound() const = 0; }
and implementations in every derived class:
class Circle : public Shape { ... virtual Bound bound() const { ... } }
In general, a program has far more functions than types. So the ML approach is better because code updates are localised when adding new functions (which is more common) than when adding new types (less common). The ML also has many other advantages. For example, you have to cut and paste the types a lot in C++ and Java but ML infers all of the types.
Finally, MLs pattern matches are so much more expressive than derived classes in C++ and Java (e.g. you can pattern match over strings) that the compiler has a much better chance of optimising them. So ML programs are typically much faster too.
 Signature Dr Jon D. Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com
Philippa Cowderoy - 25 Jun 2005 11:47 GMT >> Maybe, but when in need of binary methods you will have to make one or >> more messes - and end up using the visitor pattern which I don't find [quoted text clipped - 5 lines] > but, rather, you should try to reimplement old problems. So don't ask what > the "visitor pattern" looks like in a functional implementation. Why not? It's just a plain recursive function :-)
 Signature flippa@flippac.org
'In Ankh-Morpork even the sh.t have a street to itself... Truly this is a land of opportunity.' - Detritus, Men at Arms
Lasse Reichstein Nielsen - 25 Jun 2005 12:11 GMT > When learning a new paradigm, you should not try to translate old solutions > but, rather, you should try to reimplement old problems. So don't ask what > the "visitor pattern" looks like in a functional implementation. Find a > representative example which has been solved using the visitor pattern. > Forget about the visitor pattern. Then try to solve the same problem in ML. That should be simple. The visitor pattern is just a hack for languages that doesn't have first class functions - a tuple of functions implemented as an object. :)
/L
 Signature Lasse Reichstein Nielsen - lrn@hotpop.com DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html> 'Faith without judgement merely degrades the spirit divine.'
Marcin 'Qrczak' Kowalczyk - 25 Jun 2005 13:09 GMT [Delayed but repeated Followup-To; I don't read comp.lang.java.programmer]
>> When learning a new paradigm, you should not try to translate old solutions >> but, rather, you should try to reimplement old problems. So don't ask what [quoted text clipped - 5 lines] > languages that doesn't have first class functions - a tuple of > functions implemented as an object. :) Not quite, a tuple of functions is still a translation of a solution rather than the problem.
It's a hack for extensible classes, i.e. for extending the set of functions whose implementation is dispatched at runtime on the type of one of arguments.
It can be used to emulate algebraic types for example (subclasses are used for constructors, dispatch is used instead of pattern matching). This can be quite inconvenient: 1. only one layer of constructors is dispatched at a time, 2. you can't have a default case - you have to specify what to do for all constructors separately, 3. and the body of each match must be moved to a separate method, with all local variables passed explicitly to it.
In addition 4. it requires a fixed signature of the function, so passing additional arguments and returning results is tricky (requires packing multiple arguments in "tuples", which Java lacks anyway, and casting to overcome static type system). 5. In Java it makes static exception specifications harmful: a single exception specification must be used for all functions dispatched that way. 6. It removes the other axis of extensibility, the one which was easy in plain OOP: adding new types to dispatch on.
Generic functions (like in CLOS and Dylan) solve the problem of adding dispatched functions without most of these drawbacks (only 1 and 3 remain), and as a bonus they provide dispatching on multiple arguments at once.
 Signature __("< Marcin Kowalczyk \__/ qrczak@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/
chris - 24 Jun 2005 18:07 GMT > Regarding Jon's ray tracer, I'll play the devil's advocate: > > When you only have spheres, algebraic types are nice and dandy, but if you > had a plethora of different objects to render, some of them special cases > of others, perhaps using OOP and organizing them into a class hierarchy > would be better? Not a ray tracer, but I'm working on a simple scene graph in OCaml for OpenGL. To represent visual objects I use polymorphic variants. Algebraic types (Variants) in OCaml cannot be extended easily, however polymorphic variants do the job nicely.
The only problem is look up is O(n) in the number of variants if you rely on a global rendering "dispatch" function. I assumed that was too much of a hit, so each node carries the render function it needs with it. This isn't a big space hit as it may sound, because it ends up just as aliases to the same functions.
If I knew how to hash polymorphic variants generically, a table driven approach may work and be easily extensible (just add new render functions to the table) but am leaving that until later.
Objects are used for the rendering context and serve a different purpose there.
Chris
Jon Harrop - 24 Jun 2005 23:41 GMT > Regarding Jon's ray tracer, I'll play the devil's advocate: Wow, I'm famous. ;-)
> When you only have spheres, algebraic types are nice and dandy, but if you > had a plethora of different objects to render, some of them special cases > of others, perhaps using OOP and organizing them into a class hierarchy > would be better? Definitely not in my experience. Time for a shameless plug: The latest commercial product from my company, Presenta, is a slideshow presentation program designed to display technical content including animated slideshow points, typeset mathematics and 2D and 3D graphics:
http://www.ffconsultancy.com/products/presenta
Presenta is now written entirely in OCaml. Two years ago, it was written entirely in C++. The OCaml implementation makes very little use of objects (there is a single object type, for a "scene", which was done only to circumvent having mutually recursive modules).
The internals are very complicated (much more complicated than anything I did during my PhD in computational physics, for example) and the OCaml implementation relies heavily upon variant types and pattern matching to get its jobs done. For example, variant types are used to represent:
1. A whole document. 2. Typeset mathematics. 3. Graphics. 4. The scene graph used for rendering. 5. Lines and Bezier curves. 6. Multiresolution paths. ... and many other entities.
In the case of 2D vector graphics, the program uses a particularly sophisticated approach to adaptively tesselate geometry in order to render only what is necessary for the current frame of animation whilst maximally reusing the results of previous computations and also exploiting hardware acceleration via OpenGL. This is not easy. Indeed, nobody else has managed to do this AFAIK.
Having translated this from an entirely OO C++ implementation I can definitely say that OO is very poorly suited to this application. OCaml's variants and pattern matching are not only vastly more succinct, they are also much clearer, much more robust and even much faster than the equivalent C++. Pattern matching leads to most code updates being localised where as OO leads to scattered updates which are not statically checked to the same extent by the compiler (C++ or Java).
Finally, the advantages offered by OCaml don't just lead to 1/2 or 1/3 as much code, as my ray tracer might lead you to believe. The code density in the current implementation is more than 5x that of the old C++. For larger projects, OCaml code is even smaller in proportion. I believe this is due to the extra dimension of factoring made possible by higher-order functions.
 Signature Dr Jon D. Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com
alex goldman - 25 Jun 2005 00:12 GMT > (there is a single object type, for a "scene", which was done only to > circumvent having mutually recursive modules). Aha!
In an objectless code, your Scene would be a big union of most or all renderable stuff. If it's an object instead, then everything that's renderable should be its subtype, no?
> Having translated this from an entirely OO C++ implementation I can [...]
> The code density in > the current implementation is more than 5x that of the old C++. For larger > projects, OCaml code is even smaller in proportion. My own experience was that simply rewriting an application significantly reduces its size, even when I rewrite it in the same language.
Jon Harrop - 25 Jun 2005 00:36 GMT >> (there is a single object type, for a "scene", which was done only to >> circumvent having mutually recursive modules). [quoted text clipped - 4 lines] > renderable stuff. If it's an object instead, then everything that's > renderable should be its subtype, no? No, sorry. :-)
I used an object to weaken the type system. Nothing to do with OO and/or subtyping.
>> The code density in >> the current implementation is more than 5x that of the old C++. For >> larger projects, OCaml code is even smaller in proportion. > > My own experience was that simply rewriting an application significantly > reduces its size, even when I rewrite it in the same language. I have ported several applications from OCaml to C++ and the code expanded by the same amount, despite being rewritten.
 Signature Dr Jon D. Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com
alex goldman - 25 Jun 2005 00:59 GMT > No, sorry. :-) > > I used an object to weaken the type system. Nothing to do with OO and/or > subtyping. This is telling us something (about the type system), but not enough to have a meaningful discussion. Suppose you want to render
Spheres, ellipsoids, cones, planes, parallelepipeds, prisms, etc.
There are obviously complex RELATIONS among these, and you might want to re-use the code due to these relations instead of just cutting and pasting.
What would your scene _object_ look like?
type aux = Sphere of sphere | Ell of ell | Cone of cone | Plane of ...
class scene = ?
Jon Harrop - 25 Jun 2005 01:44 GMT >> No, sorry. :-) >> [quoted text clipped - 3 lines] > This is telling us something (about the type system), but not enough to > have a meaningful discussion. The OCaml type system is deliberately restrictive. Out of 10,000 lines of code in my latest project, this restrictiveness is advantageous (by eliminating many errors) in all but one place. In this place, I have use an object to weaken the type system just enough to let me do what I want.
> Suppose you want to render > [quoted text clipped - 3 lines] > re-use the code due to these relations instead of just cutting and > pasting. Absolutely. Then you factor your code into many functions. Code reuse is then achieved by calling one function from many places. For example, if you're after brevity, you might replace all sphere code with calls to ellipse.
Your example is be best written in OCaml without using any OO at all.
> What would your scene _object_ look like? > > type aux = Sphere of sphere | Ell of ell | Cone of cone | Plane of ... > > class scene = ? My scene _object_ looks nothing like that. It looks like this:
class scene : ?fills:Smoke.Fill.t Store.t -> ?styles:Smoke.Style.geometry Store.t -> ?geometries:Smoke.ContourGeometry.t Store.t -> basic_node -> object ('a) method add_fill : Smoke.Fill.t -> Smoke.Fill.t Store.key * 'a method add_geometry : Smoke.ContourGeometry.t -> Smoke.ContourGeometry.t Store.key * 'a method add_style : Smoke.Style.geometry -> Smoke.Style.geometry Store.key * 'a method append : basic_node -> int * 'a method get_bound : Smoke.Bound.t method get_bound_of : basic_node -> Smoke.Bound.t method bound_of : iterator -> Smoke.Bound.t method get_fill : Smoke.Fill.t Store.key -> Smoke.Fill.t method get_fills : Smoke.Fill.t Store.t method get_geometry : Smoke.ContourGeometry.t Store.key -> Smoke.ContourGeometry.t method get_geometries : Smoke.ContourGeometry.t Store.t method get_root : basic_node method get_style : Smoke.Style.geometry Store.key -> Smoke.Style.geometry method get_styles : Smoke.Style.geometry Store.t method push_back : basic_node -> 'a method push_front : basic_node -> 'a method render : Smoke.RenderData.t -> unit method replace_fill : Smoke.Fill.t Store.key -> Smoke.Fill.t -> 'a method replace_geometry : Smoke.ContourGeometry.t Store.key -> Smoke.ContourGeometry.t -> 'a method replace_root : basic_node -> 'a method replace_style : Smoke.Style.geometry Store.key -> Smoke.Style.geometry -> 'a end
It has nothing to do with shapes, geometries and so forth. It is only to do with the way I chose to structure the whole library. Specifically, it pulls everything together in a way that can then be used by all of the parts of my library independently, without having to know about each other. Thus, it evades a big mutual dependency.
You want to talk about the variant type which implements a scene. I have chosen to use a variant type (much better than a class hierarchy):
type ('leaf, 'loner, 'group) generic_node = GenericLeaf of 'leaf | GenericLoner of 'loner * ('leaf, 'loner, 'group) generic_node | GenericGroup of 'group * ('leaf, 'loner, 'group) generic_node list
Note that it is both generic and extensible. The functions which act over this type are also both polymorphic and extensible. I have several levels of progressively more specialised scenes and functions, allowing the user to choose how much functionality they want to implement themselves.
This functionality can actually be achieved (albeit unsafely) in C++ and Java but the code required is so convoluted that nobody would ever think to write it. In OCaml, the best approach is often the most natural and mose concise.
 Signature Dr Jon D Harrop, Flying Frog Consultancy http://www.ffconsultancy.com
alex goldman - 25 Jun 2005 02:28 GMT >>> No, sorry. :-) >>> [quoted text clipped - 8 lines] > eliminating many errors) in all but one place. In this place, I have use > an object to weaken the type system just enough to let me do what I want. okay, but what if you were using just ML (you actually mentioned earlier you were thinking about porting your book and code to SML). Suppose your code was in SML from the beginning, nearly 10,000 lines of it. At one point, you realize that the type system will not allow you to do what you want - your type checker God tells you to get lost. This is the kind of situation I've been in with ML (the memories are too painful, and I'm supressing them, so don't ask, I don't really want to talk about that ;-)
>> Suppose you want to render >> [quoted text clipped - 5 lines] > > My scene _object_ looks nothing like that. It looks like this: [...] I guess I asked the wrong question. What I really wanted to know was what its relation to spheres, ellipsoids, cones, planes, parallelepipeds, prisms, etc. looks like.
INRIA folks have a paper about Objects vs not Objects (Modules?). IIRC, the basic idea is that objects are good when you have few function, but many data types. What I'm saying is that ray tracing seems just like this type of application (unless you limit yourself to spheres). There is a couple of functions (intersect and ?), but very many data types.
So when you use an OO design, the code differences between *Objective* Caml on the one hand, and Java or C++ on the other hand, will be rather superficial.
Torben Ægidius Mogensen - 27 Jun 2005 16:56 GMT > Regarding Jon's ray tracer, I'll play the devil's advocate: > > When you only have spheres, algebraic types are nice and dandy, but if you > had a plethora of different objects to render, some of them special cases > of others, perhaps using OOP and organizing them into a class hierarchy > would be better? If you have N kinds of things ("objects") and M different things ("methods") you want to do with them, OO allows you to have the M different "methods" close to each other for each "object", but the N different "objects" will be spaced out. Using an ADT (ML/Haskell datatype) allows you to keep the N "objects" together, but separates the M "methods". You can view it as an NxM matrix that you lay out either in row-major or column-major format.
Now, adding one "object" in OO allows the new code to be separated from the old, but adding a new "method" requires global redefinition. Using an ADT allows new "methods" to be added with separated code, whereas adding a new kind of "object" requires global rewrites. In both cases, the type system can help you figure out where you need to add code for the difficult case. So it really boils down to whether you need to add new "objects" more often than new "methods".
As for OO hierarchies, this really means that some "objects" may have more "methods" than others. OO draws the hierarchy from the methods perspective: Objects are classified by the methods they define. You can also pick a object-centric view, where methods are classified by which subset of objects they work on. Here, ADT's work fine: You can have nested sum-types where a "method" is classified by which subtree of the sum it works on. So, again, it depends on which axis you are most likely to modify/extend.
As for which fits best in a ray-tracer, this depends on whether the set of scene objects is more likely to be extended/subdivided than the set of operations. I see no clear winner here, you can define it either way: You can have a large number of different objects (polygons, spheres, cylinders, etc.) and a single operation: Find intersection with ray, or you can have a single type of objects (a Bezier-patch) and a large number of operations: Find intersection, find normal, find refraction index, find reflection index, find bounding box, etc. OO may win in the first case, but it will certainly lose in the second.
Even in compiler writing, where you are more likely to extend the language on which you work instead of the number of things you do with syntax, I personally find ADT's more to my taste than object hierarchies: I like my type-checker to be separate from my parser and code-generator, and things like optimising transformations (that has to look at more than one node in the tree at a time) is a lot easier using pattern-matching than classes.
Torben
Andreas Rossberg - 27 Jun 2005 17:46 GMT > Even in compiler writing, where you are more likely to extend the > language on which you work instead of the number of things you do with > syntax, I personally find ADT's more to my taste than object [ADT = algebraic datatype? Usually it stands for abstract datatype.]
> hierarchies: I like my type-checker to be separate from my parser and > code-generator, and things like optimising transformations (that has > to look at more than one node in the tree at a time) is a lot easier > using pattern-matching than classes. Indeed. In fact, I think your initial conjecture isn't even true: for non-toy compilers it happens much more often that you introduce new optimisations, new backends, whatever, i.e. new functions that operate on (parts of) an AST, than that you change the language. Moreover, the sheer number of auxiliary functions on ASTs that you need in a realistic compiler already makes the OO approach unsuitable. Having details of optimisation phases or multiple backends be scattered through a global class hierarchy is totally the wrong kind of modularisation.
And I second pattern matching - you are likely to need nested patterns for transformations, and those cannot be simulated by method dispatch without introducing a maintenance nightmare.
Cheers,
- Andreas
 Signature Andreas Rossberg, rossberg@ps.uni-sb.de
Let's get rid of those possible thingies! -- TB
Chris F Clark - 27 Jun 2005 23:35 GMT In the (N+1)*(M+1) discussion, Andrea wrote:
> Indeed. In fact, I think your initial conjecture isn't even true: for > non-toy compilers it happens much more often that you introduce new [quoted text clipped - 5 lines] > through a global class hierarchy is totally the wrong kind of > modularisation. Have you ever tried writing a non-trivial compiler with the OO approach? I have and maintain one on a daily basis. Now, I won't dispute many of your claims, except that the OO approach is unsuitable. You don't get too many auxillary functions scattered over too classes unless your design is poor. Both the number of classes and the number of functions in the compiler I maintain are too large for me to track in my head (or even to estimate, other than to say that both are well into the hundred(s)). However, I doubt that many of the functions have specializations over more than a handful of classes. The simple fact is that most functions if designed right are not specialized except in terms of some very generic attributes, and those attributes group into very regular hierarchies. Of course, the hierarchies are not necessarily easily disentangled trees down to the leaf level, so it is often useful to organize them as "properties" (where the properties use inheritance hierarchies).
In addition, I would like to suggest that even in an evolving compiler for a stable language, one is likely to add new AST types. That is a very convenient way of adding a new specialization. If some transformation yields a new result, it is useful to add one (or more) AST types to represent the result of that transformation, even if initially it may appear to be semanticly equivalent to some other AST type. Adding the new type allows one to be more precise in the semantics, as the transfromed node is often a special case of the semanticly equivalent AST. For example, if you run a phase that sort commutative operators so that constants only appear as the right argument, it can be useful to add sorted_add AST nodes to indicate the transformed outputs. Again, sorted may also be carried as a property rather than a subtype, especially if it may be attached to many distinct AST types. In either case, you can then specialize transformations that only are safe (or are more efficient, etc.) applied to sorted commuative operators.
I think OO inheritance gets criticized based on too many naive uses. Too many people assume that it is a way of dividing the world into distinct heirarchies, which doesn't match the shape of the world at all. It is much better, for defining things where "this works like that except for here and there". When used that way, one doesn't get functions scattered over great masses of code. You get nice locality.
The point is when adding either N+1 or M+1, you usually have only a few special cases to deal with, and you are generally adding both N+1 and M+1 at the same time, where the extra case(s) in the other dimension is to capture some distinction your previous design didn't distinguish, but you are really only adding a few cases to cover both N+1 and M+1, 1 case for the primary body that applies almost universally and some cases for the new distinguished elements, depnding on how many categories of them there are.
Of course, this method only works if you are distnguishing things in terms of characteristics (e.g. round, regular, parallel sides) rather than final types (circle, ellipse, triangle, square, star).
-Chris
Marcin 'Qrczak' Kowalczyk - 28 Jun 2005 01:32 GMT > Have you ever tried writing a non-trivial compiler with the OO > approach? I have and maintain one on a daily basis. Now, I won't > dispute many of your claims, except that the OO approach is > unsuitable. You don't get too many auxillary functions scattered over > too classes unless your design is poor. I've written a compiler using generic functions (http://sourceforge.net/cvs/?group_id=110425). This means that dispatched functions are defined separately from types they dispatch on, and their specializations are defined separately from generic functions and from the types (I tend to write them near the introductions of generic functions though).
There are several phases in the compiler. Each phase examines the result of the previous phase and prepares data for the next phase. Between most pairs of phases the module is represented using a family of types for different kinds of tree nodes. Each time it's a different family of types. Families of types usually form a two-level hiearchy, for example "lifted" code has abstract supertypes for expressions, patterns, definitions and meanings, and each of them is further split into concrete types.
Below are the counts of generic functions dispatched on nodes of each kind of intermediate code, for each module (I didn't count other generic functions, e.g. dispatched on the types of literals).
A traditional OO approach would force to put all functions dispatched on the same type together. For example pretty printing is quite independent from other processing and it's used only for debugging except the final phase, thus I defined it in separate modules, so it doesn't get in the way when I'm looking at more important code.
(An OO approach would perhaps use fewer function names, because for clarity I use different generic functions for domain subsets which are never mixed and dispatched dynamically. For example I have separate functions LiftExpr, Lift1Pat, Lift2Pat, Lift1Def, Lift2Def. An OO approach would probably spell them just Lift, Lift1 and Lift2, especially as they would be put in separate namespaces. So the real function counts in an OO approach might be smaller, but the amount of forced demodularization would remain the same.)
Source code: - SourcePrinter: 1 (pretty printing) - Expander: 17 (transforming source code to abstract code)
Abstract code: - Abstract: 2 (type definitions, auxiliary functions) - AbstractPrinter: 6 (pretty printing) - Interfaces: 1 (generating interface files) - Expander: 5 (some minor functions) - Lifter: 17 (transforming abstract code to lifted code)
Lifted code: - Lifted: 5 (type definitions, auxiliary functions) - LiftedPrinter: 4 (pretty printing) - Lifter: 3 (some minor functions) - Planner: 16 (transforming lifted code to sequential code)
Sequential code: - Sequential: 4 (type definitions, auxiliary functions) - SequentialPrinter: 4 (pretty printing) - Planner: 2 (some minor functions) - CCoder: 12 (transforming sequential code to C code)
C code: - CCode: 1 (type definitions, auxiliary functions) - CCodePrinter: 8 (pretty printing) - CCoder: 2 (some minor functions)
With a traditional OO approach the same design would be doable, at the cost of forcing to put together type definitions of a phase, auxiliary functions working on these types, pretty printing, and the main work of transforming these values to the next representation. I prefer to have freedom to arrange modules according to my taste rather than being forced to put together functions just because they are dispatched on the same family of types.
Currently the representations are independent from one another, and a module which implements a transformation phase depends only on the representation it examines and the representation it produces. The OO approach would significantly increase the depth of the dependency graph: each family of type definitions would depend on the following phases (because it includes code which produces the next phase). Technically this might not mean anything, it only hurts my taste.
In this language modules are used as units of name visibility. For example each module which implements a transformation phase exports only one name, a function transforming the whole module being compiled. If visibility boundaries were tied to classes, I would have to either make more names public, or introduce lots of "friendships" (or whatever the language provides for extending visibility). I prefer to be able to design visibility domains independently from grouping functions for the purpose of dispatching.
 Signature __("< Marcin Kowalczyk \__/ qrczak@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/
Marcin 'Qrczak' Kowalczyk - 28 Jun 2005 02:19 GMT > With a traditional OO approach the same design would be doable, at the > cost of forcing to put together type definitions of a phase, auxiliary [quoted text clipped - 3 lines] > being forced to put together functions just because they are dispatched > on the same family of types. Two more points.
Besides generic functions which would correspond to dispatched methods in OOP, there is other code. For example CCoder module (which transforms sequential code to C code) includes many constants and auxiliary functions for building parts of C code trees (something like smart constructors, compositions of constructors etc.).
Note that I've even given this module the name which corresponds to the kind of code it produces, not the kind of code it examines. This transformation is more tied to the representation of its target than to its source.
But with an OO approach these constants and functions operating exclusively on C code would naturally go to the module which defines types of C code, together with the pretty printer for the C code, but far from the implementation of transformation of sequential code into C code which actually uses them.
Well, I could write them in the module which uses them, i.e. the module which includes the mentioned transformation (all of them would be static constants or static functions), the module with sequential code. This is consistent with my current design, but I'm afraid contrary to typical OO practice which puts operations manipulating values of some type together with the type definition, no matter whether it builds them or examines them.
Another point. At some time I plan to add an internal interpreter (for macros) of the language which is currently being compiled. This means adding code which transforms some representation (probably lifted code) into a representation which is suitable for execution.
In the current design this is possible by adding new modules, without modifying any existing code. Of course they might trigger some tweaks to existing code for convenience or efficiency, but in principle none are needed.
In a design where functions dispatched on a type are defined together with the type, I would have to add methods to existing classes. The module which defines the lifted code would deal not only with transforming it into sequential code for the compiler, but also with making executable code for the interpreter from it.
I definitely prefer to have them written separately. The new module will have little in common with existing code, it just dispatches on the same family of types.
I admit that many of these issues are a matter of taste rather than technical difficulties. It's hard to objectively examine how well code is modularized and how focused are changes when it evolves.
 Signature __("< Marcin Kowalczyk \__/ qrczak@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/
Chris F Clark - 28 Jun 2005 05:09 GMT > I've written a compiler using generic functions > (http://sourceforge.net/cvs/?group_id=110425). I think we have a different definition of what OO means, since much of your description would match what I consider an OO approach, aside for your persistent instance that certain things are defined separately. And, I don't consider that to be relevant as to whether something is OO. It's the dispatch to distinct functions which makes something OO, because that's what allows the programming by differences. (That is in my book, if one is doing dynamic dispatch (especially on type of a function operand), one is doing OO. I realize that is not everyone's definition, and for other people it is the strong association of functions with type, which seems to be what your are critiquing.)
> Currently the representations are independent from one another, and > a module which implements a transformation phase depends only on the > representation it examines and the representation it produces. The OO > approach would significantly increase the depth of the dependency > graph: each family of type definitions would depend on the following > phases (because it includes code which produces the next phase). There is more likely to be some additional coupling in an OO approach. However, not nearly as severe as you propose. It's more severe if you use a "pure" OO approach where every function must be part of a type.
> I prefer to be able to design visibility domains independently from > grouping functions for the purpose of dispatching. That's not an unreasonable desire.
> This is consistent with my current design, but I'm afraid contrary > to typical OO practice which puts operations manipulating values of > some type together with the type definition, no matter whether it > builds them or examines them. Good OO practice puts the functions together which belong together. The one caveat is that functions which are dispatched upon a type must be associated with that type. However, good OO design would limit that function to the part which is actually specialized.
> In a design where functions dispatched on a type are defined together > with the type, I would have to add methods to existing classes. Yes, it is true that the code which you wish to specialize on a type must be defined "with" the type. Although in some OO languages, you can "define" the type "piecemeal" and thus make this point moot. Even in languages which require the type to be defined as one unit, generally the code body for the function need not be in the type declaration, so that one can still achieve the desired separation. The type merely serves as a point where one can go to look and see a listing of all the functions which specialize on it. The balance between whether that is a burden or a blessing depends in part on perspective.
-Chris
Jens Axel Søgaard - 28 Jun 2005 10:06 GMT >>I've written a compiler using generic functions >>(http://sourceforge.net/cvs/?group_id=110425). [quoted text clipped - 9 lines] > definition, and for other people it is the strong association of > functions with type, which seems to be what your are critiquing.) The obligatory reference to Rees's list of OO features or properties:
<http://store.yahoo.com/paulgraham/reesoo.html>
 Signature Jens Axel Søgaard
Marcin 'Qrczak' Kowalczyk - 28 Jun 2005 11:40 GMT > The obligatory reference to Rees's list of OO features or > properties: > > <http://store.yahoo.com/paulgraham/reesoo.html> For my language the only strong "no" is 9. Others are either "yes" (1, 3, 4, 5, 7), or "can be made true but the common design makes it false" (2), or "technically true but probably not in the way it was intended" (6), or "hard to tell whether a feature qualifies as this" (8). The concepts are too vague for me to be able to clearly state whether they are supported.
I understood the Andreas's critique as criticizing 9.
 Signature __("< Marcin Kowalczyk \__/ qrczak@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/
Andreas Rossberg - 28 Jun 2005 12:55 GMT >>The obligatory reference to Rees's list of OO features or >>properties: >> >> <http://store.yahoo.com/paulgraham/reesoo.html> Nice list (although I don't really see the difference between encapsulation and protection).
> I understood the Andreas's critique as criticizing 9. Right. IMO the defining characteristic of "object-oriented" programming is structuring programs around autarkic "objects" that contact each other in some way, but which all decide "by themselves" how they react to that, in a black-box sort of manner. That's what the term suggests, and it is the only characteristicum from the above list which you do /not/ find in other paradigms one way or the other. I sometimes call it "internalisation of operations", wrt a value. Technically, this is usually modeled by the "sum-of-product-of-function" idea described in the list.
I'm aware that some people are using the term OO in a wider sense, but I don't like it. First - as Rees notes correctly - it practically renders the term meaningless. Second, I believe it usually stems from either the desire to comply with the marketing hype ("see, <my favorite language> is OO too"), or the attempt to argue for the superiority of OO ("see, all modern languages are OO"), or from the lack of background regarding many of the ideas now captured as "inherently" OO ("OO is great because it gives us modularity"). I find neither an acceptable reason.
In particular, I disagree with considering "dynamic dispatch" or "late binding" OO. Or encapsulation. Or polymorphism. All are ingredients for realising the above idea, but not defining characteristics. All of them are found in any decent language. Often in much more flexible ways than in typical OO languages.
 Signature Andreas Rossberg, rossberg@ps.uni-sb.de
Let's get rid of those possible thingies! -- TB
Ketil Malde - 28 Jun 2005 13:31 GMT > Right. IMO the defining characteristic of "object-oriented" > programming is structuring programs around autarkic "objects" that > contact each other in some way, but which all decide "by themselves" > how they react to that, in a black-box sort of manner. That's what the > term suggests, and it is the only characteristicum from the above list > which you do /not/ find in other paradigms one way or the other. FWIW, I've also heard this classified as "object based", while "object oriented" is taken to include inheritance as well. Seems like a sensible distinction to me.
-k
 Signature If I haven't seen further, it is by standing in the footprints of giants
Andreas Rossberg - 28 Jun 2005 13:46 GMT > FWIW, I've also heard this classified as "object based", while "object > oriented" is taken to include inheritance as well. Seems like a > sensible distinction to me. Yes, "object-based" was used more frequently in the early nineties(?) when prototype-based languages were more en vogue, but I haven't heard it for quite a while. I rather recall the distinction being between absence and presence of classes. Of course, the latter usually includes inheritance (I wouldn't try to classify anything via inheritance alone, because of the even wider confusion about what inheritance is to start with).
 Signature Andreas Rossberg, rossberg@ps.uni-sb.de
Let's get rid of those possible thingies! -- TB
Chris F Clark - 28 Jun 2005 18:16 GMT Andreas wrote:
> I'm aware that some people are using the term OO in a wider sense, but > I don't like it. First - as Rees notes correctly - it practically > renders the term meaningless. I think the point made on the page is valid:
> Perhaps part of the confusion - and you say this in a different way > in your little memo - is that the C/C++ folks see OO as a liberation > from a world that has nothing resembling a first-class functions, > while Lisp folks see OO as a prison since it limits their use of > functions/objects to the style of (9.). I come from the liberation school, which is my view of what is OO is wider. Learning OO allowed me whole new ways of modelling things. It did not render what I had previously learned, e.g. structured programming invalid. The definition of OO, I learned, was 5 points: encapsulation, polymorphism, inheritance (and if I recall correctly abstraction, and dynamic dispatch). Encapsulation and abstraction are simply "modular programming" and not new to OO. That means for me, OO was about polymorphism, inheritance, and dynamic dispatch, as those are the things which "we did not know" prior to OO.
So, yes, I would consider pattern matching a form of dynamic dispatch and thus something OO, albeit a newer and more interesting form than OO itself (prior to FP languages) provided.
In fact, I don't consider HOF to be an FP innovation (at least not a modern one), since HOF have been known and desirable for as long as I can remember. In fact, one of the frustrations with Pascal (a pre-OO language) was in trying to write HOFs--the argument passing powers were more a hinderance than a help, which actually helped make C popular, since it could do HOFs better.
The distinguishing characteristic of modern FP is type inference, a better way of writing generic functions. This makes it possible to write HOFs in a type-safe way. That is an innovation. That distinguishes it from OO. No amount of inheritance based polymorphism is going to give one the same freedom as type inference.
Modern FP also inherits from vernerable FP (e.g. Lisp) the idea of applicative programming, i.e. expressions without side-effects. It also inehrits the idea of returning "large values" (e.g. lists) rather than only things which fit in a word. However, I would argue that both of those have been seen as desirable (with the side-effect-free part perhaps questionably so) for a long time, even by the non-FP programmers in the Pascal era.
So, if you want to argue that type inference enaables solutions that can't be expressed by inheritance, get on your soap-box and we will listen, especially when you have evidence. Similarly, if you can claim advantages from immutability, let us know. We OO people want to learn and grow. (Personally, I am most interested in monads, to see if they are really a boon. Generally, you see GC adopted by OO and tail-call-optimization will follow suit.) However, if you want to criticize us for some slavish adherence to some rule, be careful, we don't consider ourselves slaves. When I consider myself to be a convert on FP (and I'm tettering as I find certain aspects of mutable values really convenient), that doesn't mean I will stop considering myself an OO programmer (nor even a structured programmer). Knowledge grows by acretion. You don't need to forget what you've learned in order to learn more (Well, sometimes you do for a short while, but not universally and not eternally).
And, yes there are purists, people who are slavish devoted to solving everything one-way. However, much of that purity is motivated by practical considerations. For example, in a langue where everything is an object, what one is really asking for is for everything to be first-class. Something I think most FPers will appreciate. It's a real pain when "integers" and "objects" have different rules and one-or-the-other is not first-class.
So, please add closures and pattern matching to the next language I use. If you give me bignums, I won't complain either. Just don't take away inheritance if you can avoid it. It really does solve certain problems in an elegant way that does minimize code. And the penalty for using it isn't very high if one finds the right hierarchy.
-Chris
Joachim Durchholz - 28 Jun 2005 21:00 GMT > So, yes, I would consider pattern matching a form of dynamic dispatch > and thus something OO, albeit a newer and more interesting form than > OO itself (prior to FP languages) provided. Ironically, FPLs predate OO by a decade IIRC. (Lisp: sometime in the 50ies, Simula in the 60ies - I'm too lazy to check the exact dates, but I'm pretty sure somebody knows them by heart *g*)
> In fact, I don't consider HOF to be an FP innovation (at least not a > modern one), since HOF have been known and desirable for as long as I > can remember. In fact, one of the frustrations with Pascal (a pre-OO > language) was in trying to write HOFs--the argument passing powers > were more a hinderance than a help, which actually helped make C > popular, since it could do HOFs better. The essential point is that you can construct functions at run-time.
This is not possible in Pascal or C, and marginally possible in C++.
> The distinguishing characteristic of modern FP is type inference, a > better way of writing generic functions. This makes it possible to > write HOFs in a type-safe way. HOFs could always be type-safe. The point of having type inference is that it makes writing type-safe functions easier, because the type signatures of a HOF can be quite mind-boggling. E.g. in a Pascal-style syntax, the fold function would look like this: function fold (function fun (A): B; initial: A; l: list of A): B Now fold has a very simple signature; imagine a third-order function that takes a function taking functions as parameters...
No, type inference doesn't make HOFs possible, but it does a large step towards making them practical. (Parametric polymorphism is the other important feature.)
> Modern FP also inherits from vernerable FP (e.g. Lisp) the idea of > applicative programming, i.e. expressions without side-effects. Ironically, applicativity was one of the first things that Lisp lost - with the introduction of RPLACA and friends. Though the applicative style never got completely lost. Both styles seem to have coexisted through the years.
> So, if you want to argue that type inference enaables solutions that > can't be expressed by inheritance, get on your soap-box and we will > listen, especially when you have evidence. Soapboxed FPLers typically talk about parametric polymorphism (not type inference) being the FP equivalent to ad-hoc polymorphism.
Actually most will readily acknowledge that OO-style polymorphism is slightly more powerful, but they will also say that parametric polymorphism avoids a whole slew of problems that are typical for working with OO class hierarchies: dispatching asymmetries (rare but requires premature design decisions if it occurs), dissemination of logic across a class hierarchy (a common problem), no good way to modularise a class from its subclasses (roughly: private is too restrictive, protected/public exposes too many implementation details).
I have not enough experience with FPLs to say for sure that the argument holds up in practice, but I have done enough OO class hierarchy design to find the arguments plausible.
> Similarly, if you can claim advantages from immutability, let us know. Oh, that's really a FAQ, but anyway, here goes:
All problems that can be associated to pointer aliasing go away. Instantly and without a trace.
There is no need to copy data structures. Since data will never change when you're not watching, you can "copy" it by creating a new pointer to it. In practice, FPLs with no mutable data structures don't even have the concept of a "reference" - all assignment is by reference, since that's a safe thing to do under immutability, so there's no need to have by-value copying.
The absence of aliasing problems allow compilers to do a lot of aggressive high-level optimisations. Every function can be recast as a macro expansion and vice versa (try that with MIN() in C...), with no worries about breaking things. (The worries for an FPL compiler writer are the heuristics under which a function should be inlined...)
> (Personally, I am most interested in monads, to see if they are > really a boon. The topic is vastly overrated. They are just a specific Design Pattern.
Since this particular pattern is extremely abstract, many people waste far too much time trying to understand the concept. Unfortunately, nobody with enough knowledge and talent (and interest) has written a good generally understandable explanation. (The best approximation that I have come up with is "a monad is what's a pipe in Unix, only type-safe", but (a) I'm not sure that this is fully correct and (b) I'm pretty sure that this still isn't the full picture.)
> Generally, you see GC adopted by OO and > tail-call-optimization will follow suit.) However, if you want to [quoted text clipped - 3 lines] > values really convenient), that doesn't mean I will stop considering > myself an OO programmer (nor even a structured programmer). I had a somewhat sobering experience, and I don't consider myself an OO programmer anymore.
The experience was this: I was part of the basic library standardisation efforts for Eiffel. We tried to capture all relevant aspects of the STRING class' in the form of preconditions and postconditions. I learned two lessons there: 1) Doing assertions over containers without higher-order functions, by direct recursion, is a nightmare and not really worth the effort. (We had several cases where a function's body was easier to read than the assertion. The problem would have gone away if we had been able to directly write things like "forall characters in Current, this-and-that holds".) 2) One of the dreams in the Eiffel community were that, given a set of "precise enough" postconditions, the Eiffel compiler would be able to generate the routine's body. It came as a shock to me to see that such a postcondition would already be the function's body in an FPL - so why wait until Eiffel acquires an "implementation inferencer" (which would be *very* difficult to write since Eiffel doesn't have a formal semantics and other problems that would make is very difficult to have anything reliable for that). 3) Mutability makes it entirely impossible to nail down some essential aspects of the workings of such a class. For example, if you write an Eiffel loop like loop ... "foo".append(something) ... end you'll find that the STRING object will be reused, and grow with every iteration. This isn't what one would expect, but it's perfectly in line with mutable string semantics: "foo" is an instruction to the compiler to create a (mutable) STRING object, and the language designer didn't write it into the specs whether literal strings have to be recreated on every loop iteration. (He even refused to add such a notice when asked about the problem, because having to create the string object on every iteration would be "inefficient".)
Well, of course I still can program in OO languages. It's just that I don't see much value in inheritance anymore (but I gladly accept the encapsulation that OO classes bring).
> And, yes there are purists, people who are slavish devoted to solving > everything one-way. However, much of that purity is motivated by [quoted text clipped - 3 lines] > real pain when "integers" and "objects" have different rules and > one-or-the-other is not first-class. Indeed. Though I'm not sure how that translates to FPL usage - I haven't noticed anything "not quite functional" about FPLs.
> So, please add closures and pattern matching to the next language I > use. If you give me bignums, I won't complain either. Just don't > take away inheritance if you can avoid it. Inheritance makes type inference largely impossible.
If you have a call that says "foo.frobnicate", you don't know which of the many classes in your system contains the "frobnicate" function that's meant. Well, if all "frobnicates" are in descendants of the same base class, you can indeed do type inference - but what if they are in different class hierarchies? From reading the program, neither the programmer nor the compiler could infer what semantics is actually meant.
> It really does solve certain problems in an elegant way that does > minimize code. And the penalty for using it isn't very high if one > finds the right hierarchy. The same can be said for parametric polymorphism :-)
The real question is: it parametric polymorphism enough? And if it isn't and you have to circumvent the type system: are its other advantages worth the restrictions?
However, I have one indirect evidence that it's indeed enough: All OO languages that I know have some way to access RTTI, and to circumvent the type system to work around the expressivity limitations of the type system. Type casts (checked in better languages) are quite commonplace and routinely applied by application programmers. No such mechanism is available in a typical FPL. Well, some FPLs do have it, but the docs lace such functions with such heavy warnings that application programmers almost never use them. And it's very rare that a programmer calls for such a thing here in comp.lang.functional. To me, this indicates that parametric polymorphism is both powerful enough and leads to less typing problems.
Regards, Jo
Andreas Rossberg - 29 Jun 2005 13:44 GMT > No, type inference doesn't make HOFs possible, but it does a large step > towards making them practical. (Parametric polymorphism is the other > important feature.) Indeed.
> Actually most will readily acknowledge that OO-style polymorphism is > slightly more powerful No, it's not. The two are incomparable in expressiveness. But as you say, parametric polymorphism is simpler and less problematic. And more extensible.
> 3) Mutability makes it entirely impossible to nail down some essential > aspects of the workings of such a class. For example, if you write an [quoted text clipped - 12 lines] > about the problem, because having to create the string object on every > iteration would be "inefficient".) Incidentally, OCaml with its, um, debatable feature of mutable strings has exactly the same problem. :(
> Inheritance makes type inference largely impossible. No, inheritance is not a problem. Subtyping is. And it does not strictly make type inference impossible, only much less practical. You usually have to give up completeness, i.e. resort to local type inference.
Cheers,
- Andreas
 Signature Andreas Rossberg, rossberg@ps.uni-sb.de
Let's get rid of those possible thingies! -- TB
Joachim Durchholz - 29 Jun 2005 16:37 GMT >> Inheritance makes type inference largely impossible. > > No, inheritance is not a problem. Subtyping is. Agreed. Been typing faster than thinking... (actually I meant to say "interface inheritance", which is a kind of subtyping)
> And it does not strictly make type inference impossible, only much > less practical. You usually have to give up completeness, i.e. resort > to local type inference. Is there a short text that outlines the nature of these problems? I find that my mental model of what subtyping does to type inference is seriously lacking.
Regards, Jo
Andreas Rossberg - 29 Jun 2005 16:48 GMT > Is there a short text that outlines the nature of these problems? I find > that my mental model of what subtyping does to type inference is > seriously lacking. Nothing introductory I'm aware of at the moment. But the basic problem is simple: without subtyping, type inference just amounts to deriving and solving a system of equations over types. With subtyping, this becomes a system of inequations, which makes it explode combinatorially, because you cannot perform much simplification anymore.
Cheers,
- Andreas
 Signature Andreas Rossberg, rossberg@ps.uni-sb.de
Let's get rid of those possible thingies! -- TB
Tomasz Zielonka - 06 Jul 2005 00:12 GMT >> (Personally, I am most interested in monads, to see if they are >> really a boon. [I reordered Joachim's sentences a bit]
> Since this particular pattern is extremely abstract, many people waste > far too much time trying to understand the concept. Unfortunately, > nobody with enough knowledge and talent (and interest) has written a > good generally understandable explanation. I think the ones on http://haskell.org/bookshelf/#monads are quite good. The problem is that the concept seems to be really difficult to grasp. This depends on person - I bet there are people who (would) have no problem with it (I am not one of them).
The important thing is that monads are really easy to use (use existing ones, create your own ones) once you understand what's going on, and they are not only a solution for IO in Haskell - they are really useful as a programming tool.
> (The best approximation that I have come up with is "a monad is what's a > pipe in Unix, only type-safe", If pipes is all you want, using lazy lists and function composition will be sufficient (and convenient).
I think you are missing a big part of the picture. Simple function composition is only one of "computation combination strategies" possible with monads.
> but (a) I'm not sure that this is fully correct and Not even correct for any concrete monad. Unix pipes are severly constrained as a programming construct. They are only good for simple processing of a single stream of data. You cannot name and duplicate intermediate values (well, you could try to save files or fork streams, but that wouldn't be what you mean by "unix pipes", and still wouldn't suffice). What about recursion, looping, higher order functions, closures and various side-effects supported by different monads. If state is too easy, think about continuations, back-tracking, exceptions and the ability to mix these effects using monad transformers? Do I want the state changes to be rolled back when exception happens? Yes? No? No problem, just compose monad transformers in a correct order.
> (b) I'm pretty sure that this still isn't the full picture.) You are correct.
> The topic is vastly overrated. Given that your understanding of monads is incomplete, I would take your statement with a grain of salt.
> They are just a specific Design Pattern. One of the best.
Best regards Tomasz
< |
|