Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / General / April 2007

Tip: Looking for answers? Try searching our database.

Problem applying generics to my code.  Is there a better solution?

Thread view: 
Lucas - 05 Apr 2007 00:09 GMT
All,

I am working on a statistical inference library and have run into a
limiting case of using generics for type safety.  I hope someone is
able to suggest a more elegant solution to my problem that I currently
have.

I have a small class of static methods for computing closed-form
posterior distributions.  The skeleton of the classes look like this

// This class defines a distribution over a type T.  Usually Integer
or Double.
public interface Distribution<T> { ... }

// Here are some probability distributions
public class NormalDistribution implements Distribution<Double>
{ ... }
public class GammaDistribution implements Distribution<Double> { ... }
public class PoissonDistribution implements Distribution<Integer>
{ ... }

// and here are a bunch of methods to compute closed form solutions to
the posteriors
public class Posteriors {
  public static NormalDistribution
normalNormalPosterior( List<Double> data, NormalDistribution
likelihood, NormalDistribution prior ) { ... }
  public static GammaDistribution
poissonGammaPosterior( List<Integer> data, PoissonDistribution
likelihood, GammaDistribution prior ) { ... }
}

Now, I would like to create a generic dispatch method that I can pass
any data, likelihood and prior into and it will dispatch to the
appropriate method if it exists.  The prototype looks like this

public static <T, P> Distribution<P> posterior( List<T> data,
Distribution<T> likelihood, Distribution<P> prior)

So, for type safety, I want to require that the data is compatible
with the likelihood distribution and that the posterior distribution
has the same domain as the prior -- the definition of a conjugate
prior.  The problem comes in the implementation.  Currently, my method
looks like this

public static <T, P> Distribution<P> posterior( List<T> data,
Distribution<T> likelihood, Distribution<P> prior)
{
  if ( likelihood instanceof NormalDistribution && prior instanceof
NormalDistribution )
     return (Distribution<P>) normalNormalPosterior((List<Double>)
data, (NormalDistribution) likelihood, (NormalDistribution) prior );

  if ( likelihood instanceof PoissonDistribution && prior instanceof
GammaDistribution )
     return (Distribution<P>) normalNormalPosterior((List<Integer>)
data, (PoissonDistribution) likelihood, (GammaDistribution) prior );

  throw new UnsupportedConjugatePairException();
}

This works and, logically, I don't think I can run into any strange
run-time exceptions since, for example

  1. "likelihood instanceof NormalDistribution" implies that List<T>
must be a List<Double>
  2. "prior instanceof NormalDistribution" ensures that the posterior
will match the prior

The real question: Is there a way to write my posterior() method
without all the ugly casting and compiler warnings?
Daniel Pitts - 05 Apr 2007 01:25 GMT
> All,
>
[quoted text clipped - 69 lines]
> The real question: Is there a way to write my posterior() method
> without all the ugly casting and compiler warnings?

How about this:

public static <T, P> Distribution<P> posterior(List<Double> data,
NormalDistribution<T> likelihood, NormalDistribution<P> prior)
{
     return normalNormalPosterior(data, likelihood, prior );

}
public static <T, P> Distribution<P> posterior(List<Integer> data,
PoissonDistribution<T> likelihood, GammaDistribution<P> prior)
{
     return normalNormalPosterior(data, likelihood, prior);
}

Why have one method that does two things?
You might even want to make PoissonDistribution implement a method
posterior which takes a List<Integer> data, GammaDistribution<P>
prior), and likewise on the NormalDistribution.
Lucas - 05 Apr 2007 01:56 GMT
> How about this:
>
[quoted text clipped - 13 lines]
>
> Why have one method that does two things?

Oops.  That was a typo on my part.  That code path should be

public static <T, P> Distribution<P> posterior(List<Integer> data,
PoissonDistribution<T> likelihood, GammaDistribution<P> prior)
{
     return poissonGammaPosterior(data, likelihood, prior);
}

> You might even want to make PoissonDistribution implement a method
> posterior which takes a List<Integer> data, GammaDistribution<P>
> prior), and likewise on the NormalDistribution.

My motivation for the original choice was that, as a user of the
package, I want to manipulate generic Distribution<T> objects and let
the library decide if there is an efficient way of giving a
posterior.  There are always numerical methods to fall back on.  Also,
there can be multiple conjugate priors for the same likelihood.

Here's an example of how I want to interface to these methods

import static my.package.Conjugate.*;

public static void foo()
{
  Distribution<Double> distn1 = new NormalDistribution( mu, sigma );
  Distribution<Double> distn2 = new NormalDistribution( 0,  tau   );

  Distribution<Double> post = posterior( distn1, distn2 );
}

So, the underlying problem I'm trying to solve is to dispatch based on
the most specific type of the object.  Per your suggestion, I could
get around this by adding a posterior() method that would be
implemented along the lines of

public class NormalDistribution implements Distribution<Double>
{
  public Distribution<Double> posterior( List<Double> data,
Distribution<Double> prior )
  {
     if ( prior instanceof NormalDistibution ) return
normalNormalPosterior( data, this, prior );
     if ( prior instanceof GammaDistribution ) return
normalGammaPosterior( data, this, prior );
  }
}

or should I just create separate methods for each type of supported
prior distribution and force the caller to call the posterior() method
with an concrete type and not the generic Distribution<> type?

I'm going to ramble on a bit here and point out that the library have
a ConditionalDistribution class which would probably be the most
appropriate vehicle for your suggestion since a NormalDistribution
class itself does not have any unbound parameters.  To replicate the
last example

public class NormalDistributionWithUnknownMean extends
NormalDistribution implements ConditionalDistribution<Double, Double>
{
  public NormalDistribution conjugate( List<Double> data,
NormalDistribution prior )
  {
     return normalNormalPosterior( data, this, prior );
  }
}

public class NormalDistributionWithUnknownVariance extends
NormalDistribution implements ConditionalDistribution<Double, Double>
{
  public NormalDistribution conjugate( List<Double> data,
GammaDistribution prior )
  {
     return normalGammaPosterior( data, this, prior );
  }
}

Thank you for the discussion.
Daniel Pitts - 05 Apr 2007 06:15 GMT
> > How about this:
> >
[quoted text clipped - 60 lines]
>    }
> }

Anytime that you find yourself using a chain of "instanceof" checks on
classes which you control, its time to look into using polymorphism
instead...  You're particular problem may be best solved through a
Visitor pattern.
Here's an example, I'm removing the generics, as they're not necessary
for my point.

abstract class Distribution {
   Distribution posterior(List list, Distribution prior) {
        prior.visit(this);
   }
   protected Distribution posteriorList list, Normal prior) {
       throw new IllegalArgumentException("Not valid");
   }
   protected Distribution posterior(List list, Normal prior) {
       throw new IllegalArgumentException("Not valid");
   }
   public abstract Distribution visit(List list, Distribution d);
}
class Normal {
   protected Distribution posterior(List list, Normal prior) {
       return normalNormaPosterior(list, this, prior);
   }
   protected Distribution posterior(List list, Gamma prior) {
       return normalGammaPosterior(list, this, prior);
   }
   public Distribution visit(List list, Distribution d) {
      return d.posterior(list, this);
   }
}

class Gamma {
   protected Distribution posterior(List list, Normal prior) {
       return gammaNormaPosterior(list, this, prior);
   }
   protected Distribution posterior(List list, Gamma prior) {
       return gammaGammaPosterior(list, this, prior);
   }
   public Distribution visit(List list, Distribution d) {
      return d.posterior(list, this);
   }
}

> or should I just create separate methods for each type of supported
> prior distribution and force the caller to call the posterior() method
> with an concrete type and not the generic Distribution<> type?

That might not be a bad idea, but a "visitor pattern" might be more
appropriate... Alternatively, you might look for algorithms that are
distribution type agnostic.  I would probably go for the visitor
approach, google it :-)

> I'm going to ramble on a bit here and point out that the library have
> a ConditionalDistribution class which would probably be the most
[quoted text clipped - 23 lines]
>
> Thank you for the discussion.
I actually don't know much about the problem domain your solving. I'm
guessing statistics? (The one math I haven't studied much) :-)

Anyway, I hope I've been able to help.
Lucas - 05 Apr 2007 08:04 GMT
> Anytime that you find yourself using a chain of "instanceof" checks on
> classes which you control, its time to look into using polymorphism
> instead...  You're particular problem may be best solved through a
> Visitor pattern.

Yet another reason for me to get my own copy of the GoF.

> Here's an example, I'm removing the generics, as they're not necessary
> for my point.

I'll comment on your suggestion below, but I wanted to re-emphasize
the point that I really would like to use generics for the their type-
safety in this case.  Recall the function prototype I mentioned in the
OP

public static <T, P> Distribution<P> conjugate( List<T> data,
Distribution<T> likelihood, Distribution<P> prior )

Fro, the perspective of enforcing mathematical correctness, it is
important to constrain the return type to be the same as the prior
parameters and force the data to be compatible with the likelihood.
Now, on to your (slightly edited) suggestion....

> abstract class Distribution {
>     Distribution posterior(List list, Distribution prior) {
[quoted text clipped - 7 lines]
>     }
>     public abstract Distribution visit(List list, Distribution d);}

Ok, so we set up a base class (abstract in this example) where we
consider every possible distribution as a potential prior.  It will be
up to the implementing classes to override the conjugate cases
supported by their particular form.

> class Normal {
>     protected Distribution posterior(List list, Normal prior) {
[quoted text clipped - 7 lines]
>     }
> }

Now, when we invoke .posterior(data, prior) for a distribution x, it
will visit the prior, passing itself along as a parameters.  Since we
invoke .visit() on the prior object, it will call the correct method,
i.e. normal or gamma or whatever.  Now, the prior calls back to the
likelihood object, passing itself.  This will give the type
specificity to actually call one of the polymorphic posterior()
methods.  Very nice!!  This is a double-dispatch pattern, correct?

> I actually don't know much about the problem domain your solving. I'm
> guessing statistics? (The one math I haven't studied much) :-)

Correct.  The particular bit of statistics I'm implementing here is
described at

http://en.wikipedia.org/wiki/Conjugate_prior

> Anyway, I hope I've been able to help.

Very much!  I'll give your suggestion a whirl and see how it plays
out.

Many thanks.
Chris Uppal - 05 Apr 2007 09:50 GMT
> Anytime that you find yourself using a chain of "instanceof" checks on
> classes which you control, its time to look into using polymorphism
> instead...  You're particular problem may be best solved through a
> Visitor pattern.

Small point, but "Visitor pattern" is an application of the broadly useful
technique of double-dispatch (which can be called a pattern too, if you like,
though it isn't often seen that way).  In this case, I think, your suggestion
would be better seen as a direct application of double-dispatch, rather than an
application of Visitor -- and a better name could be found for the visit()
method (exposing what it /does/ rather than what feature of "Visitor" it
reflects).

A suggestion for the OP.  Design and implement this stuff without using
generics at all, then go back and put in the necessary generic declarations.
Double dispatch -- or whatever -- is part of the /semantics/ of your system,
whereas generics are just a way of telling the typechecker what rules correct
applications of those semantics will follow.  So unless generics make the
design/code easier to articulate in the first place (which I doubt in this
particular case), trying to keep the typechecker happy at the same time as
trying to think through the actual object interactions will be
counter-productive.

   -- chris
Daniel Pitts - 05 Apr 2007 22:55 GMT
On Apr 5, 1:50 am, "Chris Uppal" <chris.up...@metagnostic.REMOVE-
THIS.org> wrote:
> > Anytime that you find yourself using a chain of "instanceof" checks on
> > classes which you control, its time to look into using polymorphism
[quoted text clipped - 20 lines]
>
>     -- chris

Thanks Chris, I hadn't heard of the double dispatch pattern before,
and it is what I've demonstrated.

On Apr 5, 12:04 am, "Lucas" <lscha...@gmail.com> wrote:
> public static <T, P> Distribution<P> conjugate( List<T> data,
> Distribution<T> likelihood, Distribution<P> prior )
>
> Fro, the perspective of enforcing mathematical correctness, it is
> important to constrain the return type to be the same as the prior
> parameters and force the data to be compatible with the likelihood.
If I'm understanding you correctly

public static <T,  P extends Distribution<?>> P conjugate(List<T>
data, Distribution<T> likelihood, P prior);

Now, conjugate sounds like a verb.  Are you conjugating a likelihood
or a prior?  Does it make sense to make the conjugate method into the
likelihood or prior object instead?

Which one of these three makes more sense?
"Likelihood, conjugate using data and prior":
likelihood.conjugate(data, prior)
or
"Prior, conjugate using data and likelihood":
prior.conjugate(data, likelihood)
or (where Data isn't just a List<T>)
"Data, conjugate using likelihood and prior):
data.conjugate(likelihood, prior)?

If you find that the last one makes the most sense, you might decide
to "wrap" a List<T> into a Data<T> object which implements the method
conjugate.  You might want to wrap the List<T> into a Data class
anyway, depending on other uses.

Hope this helps even more :-)
Daniel
Lucas - 06 Apr 2007 01:47 GMT
> If I'm understanding you correctly
>
[quoted text clipped - 14 lines]
> "Data, conjugate using likelihood and prior):
> data.conjugate(likelihood, prior)?

In this context "conjugate" is a property of the likelihood and the
prior distribution.  It doesn't make a lot of sense to treat it as a
verb.

  "A prior is CONJUGATE TO a likelihood iff the posterior
distribution has the same form as the prior"

A more natural purpose of a "conjugate" method would be a
conjugateTo() method to test for the property, e.g.

  if ( prior.conjugateTo( likelihood ))
    // Use fast, close-form solution
    return prior.posterior( likelihood, data )
  else
    // Use slow, numerical simulation
    return new Posterior( likelihood, data, prior );
Lucas - 06 Apr 2007 02:49 GMT
Daniel,

Thank to you suggestion, I have a functional prototype that I've
included in this post.  One question: In order for each class to
dispatch properly, *every* class has to implement the visit() method.
If there are subclasses that inherit from a parent class, the generic
posterior() class in the abstract Distribution class is invoked which
leads to an infinite loop.

It's just syntactic sugar, but is there a way to get around this
problem?  The only solution I could come up with is that the abstract
class would have to implement a posterior() method for every class
that extends Distribution, which is not feasible since new
Distribution sub-classes will be created.

Anyway, thanks for the insight.  Couldn't have gotten to this point
without eveyone's help.

P.S> The semantics of the Distribution.posterior() method is that
"this" is the prior distribution and we are attempting to combine it
with a likelihood.  This has a nice side effect that every posterior()
method in a class returns a Distribution<T> object rather than the
other way around of treating "this" as the likelihood and have to take
and return a Distribution<?> object.

/**
* Experiment with a conjugate class hierarchy
*/

import java.util.*;

public class Conjugacy
{
 static abstract class Distribution<T> {
   public <S> Distribution<T> posterior( List<S> data,
Distribution<S> likelihood ) {
     return likelihood.dispatch( data, this );
   }

   public Distribution<T> posterior( List<Double> data,
NormalUnknownMean likelihood ) {
     System.out.println( "FAIL" );
     return null;
   }

   public Distribution<T> posterior( List<Double> data,
NormalUnknownVariance likelihood ) {
     System.out.println( "FAIL" );
     return null;
   }

   public Distribution<T> posterior( List<Integer> data,
PoissonUnknownRate likelihood ) {
     System.out.println( "FAIL" );
     return null;
   }

   public abstract <S> Distribution<S> dispatch( List<T> data,
Distribution<S> prior );
 }

 /**
  * Create a normal, gamma and poisson distributions
  */
 static class Normal extends Distribution<Double> {
   public Distribution<Double> posterior( List<Double> data,
NormalUnknownMean likelihood ) {
     System.out.println( "mu ~ Normal ~ Normal-Normal" );
     return new Normal();
   }

   public <S> Distribution<S> dispatch( List<Double> data,
Distribution<S> prior ) {
     System.out.println( "FAIL" );
     return null;
   }
 }

 static class NormalUnknownMean extends Normal {
    public <S> Distribution<S> dispatch( List<Double> data,
Distribution<S> prior ) {
      return prior.posterior( data, this );
   }
 }

 static class NormalUnknownVariance extends Normal {
    public <S> Distribution<S> dispatch( List<Double> data,
Distribution<S> prior ) {
      return prior.posterior( data, this );
   }
 }

 // Gamma distribution
 static class Gamma extends Distribution<Double> {
   public Distribution<Double> posterior( List<Double> data,
NormalUnknownVariance likelihood ) {
     System.out.println( "sigma ~ Gamma ~ Normal-Gamma" );
     return new Gamma();
   }

   public Distribution<Double> posterior( List<Integer> data,
PoissonUnknownRate likelihood ) {
     System.out.println( "lambda ~ Gamma ~ Poisson-Gamma" );
     return new Gamma();
   }

   public <S> Distribution<S> dispatch( List<Double> data,
Distribution<S> prior ) {
     System.out.println( "FAIL" );
     return null;
   }
 }

 // Poisson distribution
 static class Poisson extends Distribution<Integer> {
   public <S> Distribution<S> dispatch( List<Integer> data,
Distribution<S> prior ) {
     System.out.println( "FAIL" );
     return null;
   }
 }

 static class PoissonUnknownRate extends Poisson {
    public <S> Distribution<S> dispatch( List<Integer> data,
Distribution<S> prior ) {
      return prior.posterior( data, this );
   }
 }

 public static void main( String[] args )
 {
   Distribution<Double>  normal      = new Normal();
   Distribution<Double>  gamma       = new Gamma();
   Distribution<Integer> poisson     = new Poisson();
   Distribution<Double>  normalMu    = new NormalUnknownMean();
   Distribution<Double>  normalSig   = new NormalUnknownVariance();
   Distribution<Integer> poissonRate = new PoissonUnknownRate();

   List<Double>  data1 = new ArrayList<Double>();
   List<Integer> data2 = new ArrayList<Integer>();

   System.out.println( "Using a Normal as a prior distribution" );
   normal.posterior( data1, normal );      // Fail
   normal.posterior( data1, gamma );       // Fail
   normal.posterior( data2, poisson );     // Fail
   normal.posterior( data1, normalMu );    // Pass
   normal.posterior( data1, normalSig );   // Fail
   normal.posterior( data2, poissonRate ); // Fail

   System.out.println();
   System.out.println( "Using a Gamma as a prior distribution" );
   gamma.posterior( data1, normal );      // Fail
   gamma.posterior( data1, gamma );       // Fail
   gamma.posterior( data2, poisson );     // Fail
   gamma.posterior( data1, normalMu );    // Fail
   gamma.posterior( data1, normalSig );   // Pass
   gamma.posterior( data2, poissonRate ); // Pass
 }
}
Daniel Pitts - 06 Apr 2007 16:50 GMT
> Daniel,
>
[quoted text clipped - 13 lines]
> Anyway, thanks for the insight.  Couldn't have gotten to this point
> without eveyone's help.

I think the solution to this, or at least a way to avoid bugs, is to
call the generic method by a different name altogether, rather than
use the same name.  That way, if someone doesn't implement something
correctly, they'll get a compiler error instead of a run-time bug.


Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.