Java Forum / General / July 2007
A case for a global (i.e., public static) variable?
tam@milkyway.gsfc.nasa.gov - 26 Jul 2007 06:59 GMT
Many people seem to think that global variables are generally a bad thing, and I'd be curious as to how those who take that view would address a problem that I've addressed using a global. Are there viable alternatives or is this one place where the use is justified?
I have a program where there are many parameters that the user can specify to control the program. Currently these parameters are collected into a public static Settings object (similar to, but not quite the same as, a Java Properties). Whenever any element of the code needs to know what a parameter is (or whether it was specified at all), it examines this global object.
For example: The program resamples input images into an output image. So we create the output Image object and as we're setting that up we create a Sampler object that encapsulates the type of sampling we want to do. The user starts the code with arguments that can include sampler=xxx where xxx is a short string that indicates the resampler to be used. The resampler objects are created using a factory that takes the xxx strings and returns an appropriate object. The factory and the resampler objects normally know nothing of the Settings object, just the one string that was used to specify the sampler.
However, a few resamplers have specific settings that control their behavior (e.g., what to do when there is a NaN pixel). With a global Settings object that's fine. These classes can find the object and extract whatever special parameters they need. The great majority of Samplers, which have no special Settings, remain blissfully ignorant.
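The arrangement described above might look roughly like the following. This is only a minimal sketch: GlobalSettings, NearestSampler, InterpolatingSampler, the short names "nn" and "interp", and the "propagate" default are all hypothetical stand-ins, not the poster's actual classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical global settings holder (assumed name, not from the post).
final class GlobalSettings {
    private static final Map<String, String> VALUES = new HashMap<String, String>();

    private GlobalSettings() {}

    static void put(String key, String value) { VALUES.put(key, value); }
    static boolean has(String key) { return VALUES.containsKey(key); }
    static String get(String key) { return VALUES.get(key); }
}

interface Sampler {
    String describe();
}

// A sampler with no special settings never touches GlobalSettings.
final class NearestSampler implements Sampler {
    public String describe() { return "nearest"; }
}

// A sampler with a special setting looks it up on its own.
final class InterpolatingSampler implements Sampler {
    private final String onNaN;

    InterpolatingSampler() {
        // "propagate" is an assumed default for when OnNaN was not given.
        this.onNaN = GlobalSettings.has("OnNaN") ? GlobalSettings.get("OnNaN") : "propagate";
    }

    public String describe() { return "interpolating(OnNaN=" + onNaN + ")"; }
}

// The factory sees only the short string from sampler=xxx.
final class SamplerFactory {
    private SamplerFactory() {}

    static Sampler create(String name) {
        if ("nn".equals(name)) return new NearestSampler();
        if ("interp".equals(name)) return new InterpolatingSampler();
        throw new IllegalArgumentException("Unknown sampler: " + name);
    }
}
```

Note that neither SamplerFactory nor NearestSampler mentions GlobalSettings at all; only the one class that needs a special parameter is coupled to it.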
Without a global object, I see two alternatives. First I could pass an instance of a Settings object around to essentially every object I create. Almost all the pieces of the program are extensible and even if something doesn't need to worry about Settings now, some extension of it might need to in the future. So this drastically increases the coupling of the system. Instead of being invisibly present everywhere as a global object, it's visibly present as an argument to all the factories and constructors. E.g., previously the sampler factory knew nothing of the Settings, but in this approach it needs to get the instance and pass it to any samplers it creates.
Alternatively I could require that the string that is passed into the sampler factory is potentially a complex structure. E.g.,
    sampler=xxx/OnNaN=skip/OnInf=skip
rather than
    sampler=xxx OnNaN=skip OnInf=skip
As far as the user entry goes there's probably not much difference here, but now whenever I extract a setting I need to do a lot more processing to make sure that I find some 'primary key', and modifier keys and modifier values and .... Of course we'd use a common class to do all the parsing, but now all of the code is coupled using this common parser utility.
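For concreteness, the common parsing class mentioned above could be sketched along these lines. CompoundSpec and its field names are hypothetical, and the "/" and "=" separators are simply taken from the example string; real input would likely need more validation.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for a compound value such as "xxx/OnNaN=skip/OnInf=skip".
final class CompoundSpec {
    final String primary;                 // the 'primary key', e.g. the sampler name
    final Map<String, String> modifiers;  // modifier key -> modifier value

    private CompoundSpec(String primary, Map<String, String> modifiers) {
        this.primary = primary;
        this.modifiers = modifiers;
    }

    static CompoundSpec parse(String value) {
        String[] parts = value.split("/");
        if (parts[0].length() == 0) {
            throw new IllegalArgumentException("Missing primary key in: " + value);
        }
        Map<String, String> mods = new LinkedHashMap<String, String>();
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + parts[i]);
            }
            mods.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
        }
        return new CompoundSpec(parts[0], Collections.unmodifiableMap(mods));
    }
}
```

Every class that accepts such a string would then depend on CompoundSpec, which is the coupling the paragraph above is worried about.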
Is there some alternative which enables the decoupling that the global variable approach gives? With this approach the coupling is optional -- only if a class needs to see the Settings does it need to know about the class. With the other approaches it seems to me that virtually all classes will need to see either the Settings or parser class.
Regards, Tom McGlynn
Twisted - 26 Jul 2007 09:54 GMT
(Summary: "Settings" object either global or passed around like a red-headed stepchild)
Congratulations -- you've just rediscovered the Singleton and the Context Object patterns. In your case, yes, Singleton is to be preferred. But you can avoid having a global variable as follows:
    public class Settings {
        // The single instance; assigned exactly once by initialize().
        private static Settings instance = null;

        public final int field1;
        public final boolean field2;
        // etc.

        // Call once, after all the parameters have been read.
        public static void initialize(int field1value, boolean field2value /* , etc. */) {
            synchronized (Settings.class) {
                if (instance != null) throw new IllegalStateException();
                instance = new Settings(field1value, field2value /* , etc. */);
            }
        }

        private Settings(int field1value, boolean field2value /* , etc. */) {
            field1 = field1value;
            field2 = field2value;
            // etc.
        }

        // Returns null until initialize() has been called.
        public static Settings getInstance() {
            return instance;
        }
    }
Now your Settings class is a proper Singleton, and the only public fields are final, so there are no real global variables.
Your parameter-reading code would figure out field1value, field2value, etc. and after all the parameters were read, invoke Settings.initialize() with these local variables of main() as arguments. Elsewhere in your code you would use Settings.getInstance().field1 for instance, or cache a local reference to the object returned by Settings.getInstance() and access its fields in places where this is more efficient than constantly calling getInstance.
The only global variable (in the way of static nonfinals) is "instance", which is private to Settings. It clearly is only ever set once, so it isn't genuinely a variable, just a constant that needs initializing once. Hence "no real global variables".

Thread-safety is partial here. initialize is synchronized on the Settings.class Class object, and throws IllegalStateException if it is ever called more than once (including concurrently). getInstance, however, is unsynchronized: it returns null before initialization and a fully-initialized Settings object after, but in between it may return a partially-initialized Settings object, because the JVM might actually allocate the Settings object, assign the reference to "instance", and THEN run the constructor instead of running the constructor before assigning the reference.

Synchronizing the getInstance method only helps in multithreaded apps where threads that might access the instance are running before initialization is definitely done; if your code is single-threaded, or single-threaded until initialize() has returned, you're safe. Synchronizing the getInstance method would make it expensive and you may call it frequently, though calling it here and there and caching the return value would relieve some of the burden. Note that in multithreaded code where you could see a partially-constructed Settings, there's also the risk of seeing a null Settings. To be properly thread-safe for that scenario, it would probably be necessary to make getInstance look like this:
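    public static Settings getInstance() {
        synchronized (Settings.class) {
            // Loop rather than test once: wait() may wake up spuriously.
            while (instance == null) {
                try {
                    Settings.class.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted before Settings was initialized", e);
                }
            }
            return instance;
        }
    }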
and add "Settings.class.notifyAll();" just before the end of the synchronized block in initialize(). This would cause threads calling getInstance early to block until the singleton was initialized instead of receiving a useless and probably-crash-provoking null for their efforts.
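The same "block until initialized" behaviour can also be sketched with java.util.concurrent.CountDownLatch (available since Java 5) in place of wait/notifyAll. This is a substitution for illustration, not the code from the post; LatchedSettings and its two fields are hypothetical.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical variant of the Settings singleton that uses a latch to make
// early callers of getInstance() wait for initialize().
final class LatchedSettings {
    private static final CountDownLatch READY = new CountDownLatch(1);
    private static LatchedSettings instance; // written before READY.countDown()

    final int field1;
    final boolean field2;

    private LatchedSettings(int field1, boolean field2) {
        this.field1 = field1;
        this.field2 = field2;
    }

    static void initialize(int field1, boolean field2) {
        synchronized (LatchedSettings.class) {
            if (instance != null) throw new IllegalStateException("already initialized");
            instance = new LatchedSettings(field1, field2);
        }
        // Writes made before countDown() are visible to threads returning from await().
        READY.countDown();
    }

    static LatchedSettings getInstance() throws InterruptedException {
        READY.await(); // blocks until initialize() has completed
        return instance;
    }
}
```

Once the latch has been released, await() returns immediately, so later calls do not contend on a monitor the way a fully synchronized getInstance would.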
Probably that is overkill for your needs though and it suffices to launch no threads from the main thread and create no visible UI from the main thread (which would bring the AWT/Swing event dispatch thread into play) until the initialize() call has returned.
(There is also "double-checked locking" which is generally not thread-safe however clever it seems. There's also this option:
    public static void initialize(int field1value, boolean field2value /* , etc. */) {
        synchronized (Settings.class) {
            if (instance != null) throw new IllegalStateException();
        }
        // Construct outside the lock; the reference is published only below.
        Settings temp = new Settings(field1value, field2value /* , etc. */);
        synchronized (Settings.class) {
            if (instance != null) throw new IllegalStateException();
            instance = temp;
        }
    }
Assuming reference assignment is atomic, callers to the unsynchronized getInstance implementation far above will always see either null or a fully-constructed Settings. The price is that, if multiple threads try to initialize the singleton, you sometimes create an extra Settings object and then throw an exception instead of just throwing an exception; that is unlikely for your implementation.

A self-initializing singleton is sometimes seen, where the singleton constructor can find all the information it needs to initialize itself, e.g. the AWT default toolkit singleton. This can use that pattern if reference assignment is atomic:
    // something, somethingElse and anotherSomething are placeholder lock objects.
    public static Whatever getInstance () {
        if (instance != null) return instance;
        synchronized (something) {
            Whatever tempInstance;
            if (instance != null) return instance;
            synchronized (somethingElse) {
                tempInstance = new Whatever();
            }
            synchronized (anotherSomething) {
                if (instance != null) return instance;
                instance = tempInstance;
            }
            return instance;
        }
    }
This only ever assigns instance once, and only ever returns instance, and instance is null until after the object it will refer to is fully constructed, so this is threadsafe as long as "instance = tempInstance;" is atomic, i.e. instance == null tests true right up until instance == tempInstance does instead. This requires that "instance" be declared "volatile" and that a recent JVM be used; I don't know if it holds even then. It's not dissimilar from double-checked locking, and is really an elaboration of same. The inner synchronized blocks ensure that "tempInstance = new Whatever()" and "instance = tempInstance" cannot be reordered to run concurrently. Without the first block, some of "tempInstance = new Whatever()" (such as the constructor call) could be moved into the second block, after "instance = tempInstance"; without the second block, "instance = tempInstance" could be moved in between assigning the tempInstance reference and calling the constructor. With both blocks it can't do either, since it can't move stuff out of one synchronized block or cause the scopes of the separate synchronizations to overlap (which could cause deadlock when it has one lock and tries to acquire the second).
So far as I can tell if "volatile" works as advertised the above is thread-safe, though klunky, and fast once instance != null, with only the "volatile" penalty.)
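On the "volatile" question: as far as I know, under the revised Java memory model that shipped with Java 5 (JSR-133), the plain two-check form is safe provided the field is volatile, so the nested lock objects are not needed. A minimal sketch under that assumption, reusing the placeholder name Whatever from the code above:

```java
// Lazily self-initializing singleton using double-checked locking with a
// volatile field. Assumes a Java 5 or later JVM; on older JVMs this is unsafe.
final class Whatever {
    private static volatile Whatever instance; // volatile is essential here

    private Whatever() {}

    static Whatever getInstance() {
        Whatever local = instance;           // single volatile read on the fast path
        if (local == null) {
            synchronized (Whatever.class) {
                local = instance;            // re-check under the lock
                if (local == null) {
                    local = new Whatever();
                    instance = local;        // volatile write publishes the finished object
                }
            }
        }
        return local;
    }
}
```

This applies only to the self-initializing case; a Settings object that must be fed parsed arguments still needs an explicit initialize() step like the ones shown earlier.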
Walter Milner - 26 Jul 2007 09:58 GMT
On 26 Jul, 06:59, t...@milkyway.gsfc.nasa.gov wrote:
> Many people seem to think that global variables are generally a bad
> thing, and I'd be curious as to how those who take that view would
[quoted text clipped - 59 lines]
> Regards,
> Tom McGlynn

AFAICS - there are 2 reasons why ppl think global variables are a 'bad thing':

1) Java is a pure OO language based on the idea that everything is an object or static class. Using public static fields and public static methods enables you to write code in the style (ie using the ideas of) C or COBOL - but that is much more easily done in C or COBOL.

2) But STILL why shouldn't I? I think because of the advantages which accrue from encapsulation ie (one of) the reasons why OOP was developed. Having global variables around raises the possibility that code will inadvertently assign 'invalid' (in some sense) values to them.
So how about this. Your Settings class is either a singleton or a static class. All data fields are private. You use public accessor methods and mutator methods which validate proposed value changes.
Isn't this standard?
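A minimal sketch of that suggestion, assuming a singleton with private fields and validating mutators. ValidatedSettings, the two settings shown, and the rules for what counts as a valid value are all hypothetical.

```java
// Hypothetical settings singleton: all data private, mutators validate.
final class ValidatedSettings {
    private static final ValidatedSettings INSTANCE = new ValidatedSettings();

    private String onNaN = "propagate"; // assumed legal values: "skip", "propagate"
    private int outputWidth = 1;        // assumed rule: must be positive

    private ValidatedSettings() {}

    static ValidatedSettings getInstance() { return INSTANCE; }

    synchronized String getOnNaN() { return onNaN; }

    synchronized void setOnNaN(String value) {
        if (!"skip".equals(value) && !"propagate".equals(value)) {
            throw new IllegalArgumentException("Invalid OnNaN value: " + value);
        }
        onNaN = value;
    }

    synchronized int getOutputWidth() { return outputWidth; }

    synchronized void setOutputWidth(int width) {
        if (width <= 0) {
            throw new IllegalArgumentException("Output width must be positive: " + width);
        }
        outputWidth = width;
    }
}
```

A rejected value leaves the previous one in place, which is the protection against 'invalid' assignments described above.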
Daniel Pitts - 26 Jul 2007 15:43 GMT
> On 26 Jul, 06:59, t...@milkyway.gsfc.nasa.gov wrote:
> [quoted text clipped - 79 lines]
> > Isn't this standard?

I personally prefer inversion of control (dependency injection) over static data for most cases.
You can have a Singleton object which is a singleton only because the convention is for the framework to create it once and pass it to every object that needs to know about it. This makes classes much more flexible, because they don't have to know where to get the actual singleton object from. The Spring Framework is a good framework for IoC practices, and definitely worth looking at.
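Without any framework, the idea can be sketched as plain constructor injection. Everything named here (NaNPolicy, FixedNaNPolicy, NaNAwareSampler, InjectionDemo) is hypothetical, and averaging pixels is just a stand-in for real resampling.

```java
// What the sampler needs to know, expressed as a narrow interface.
interface NaNPolicy {
    boolean skipNaN();
}

// One possible implementation, created once by whoever wires the program.
final class FixedNaNPolicy implements NaNPolicy {
    private final boolean skip;

    FixedNaNPolicy(boolean skip) { this.skip = skip; }

    public boolean skipNaN() { return skip; }
}

// The sampler receives its dependency; it never looks anything up itself.
final class NaNAwareSampler {
    private final NaNPolicy policy;

    NaNAwareSampler(NaNPolicy policy) { this.policy = policy; }

    // Mean of the pixels, leaving out NaN values if the policy says so.
    double mean(double[] pixels) {
        double sum = 0.0;
        int count = 0;
        for (double p : pixels) {
            if (Double.isNaN(p) && policy.skipNaN()) continue;
            sum += p;
            count++;
        }
        return count == 0 ? Double.NaN : sum / count;
    }
}

final class InjectionDemo {
    public static void main(String[] args) {
        // The single policy object is created here and handed to whatever needs it.
        NaNPolicy policy = new FixedNaNPolicy(true);
        NaNAwareSampler sampler = new NaNAwareSampler(policy);
        System.out.println(sampler.mean(new double[] {1.0, Double.NaN, 3.0}));
    }
}
```

The sampler depends only on the small NaNPolicy interface, so a test can hand it a different policy without touching any static state.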
tam@milkyway.gsfc.nasa.gov - 27 Jul 2007 15:57 GMT
> > On 26 Jul, 06:59, t...@milkyway.gsfc.nasa.gov wrote: ...
> > > Is there some alternative which enables the decoupling that the global
> > > variable approach gives? With this approach the coupling is optional
[quoted text clipped - 33 lines]
> object of a singleton from. the Spring Framework is a good framework
> for IoC practices, and definitely worth looking at.

My first attempt to respond got eaten up by the newsreader so this is a little shorter than it might otherwise be...
Thanks to all who responded. Twisted's comments on how to address initialization in the multi-threaded environment will be useful as I port code to a more complex environment. I may have been a bit misleading in using the word variable in the title. In fact I'm using a dedicated Settings class with a private static HashMap that stores the settings and public static methods that access it. I think this is one approach that Walter alluded to.
Daniel's response addresses the area that concerned me most: how to decouple classes. I looked at some of the literature on IoC and dependency injection. The basic theme there seems to be that you defer instantiation of objects to some kind of factory which is able to find out what kind of object you want at run time. In fact that's what my code does. The problem is that it needs to do this in lots of different places, and the question is how it finds the base information used to make that choice in all of these different locations.
The creation and initialization of objects occurs in many different places in the code but there is a single source for the information used in that initialization. How do I get this information to all the places where it's needed without explicitly coupling my entire program to some kind of Settings object?
Regards, Tom McGlynn
Twisted - 28 Jul 2007 03:19 GMT
On Jul 27, 10:57 am, t...@milkyway.gsfc.nasa.gov wrote:
> Thanks to all who responded. Twisted's comments on how to address
> initialization in the multi-threaded environment will be useful
> as I port code to a more complex environment.

Thank you and you're welcome. :)

> Daniel's response addresses the area the concerned me most: how
> to decouple classes. I looked at some of the literature
[quoted text clipped - 5 lines]
> is how does it find the base information which is used to make
> that choice in all of these different locations.

You might want to add another layer of IOC in this case.
One workable possibility is to construct factories for the classes that will need access to the Settings. The factory builds instances of the class and passes its factory method arguments AND THE SETTINGS OBJECT to that class's constructor. Now the problem is reduced to the factory classes needing access to the Settings object. Most likely you can instantiate these directly in main(), each with a settings subset appropriate for the class it manufactures.
Then the only remaining complaint is if the factories need to be passed around a lot.
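A rough sketch of that arrangement, with each factory built in main() from only the settings its product needs. The class names, the OnNaN key, and the "propagate" default are hypothetical; a plain Map stands in for the Settings object.

```java
import java.util.HashMap;
import java.util.Map;

interface Resampler {
    String describe();
}

// The product takes its special parameter as an ordinary constructor argument.
final class ClippingResampler implements Resampler {
    private final String onNaN;

    ClippingResampler(String onNaN) { this.onNaN = onNaN; }

    public String describe() { return "clip(OnNaN=" + onNaN + ")"; }
}

// The factory extracts the relevant subset of the settings once, at construction.
final class ClippingResamplerFactory {
    private final String onNaN;

    ClippingResamplerFactory(Map<String, String> settings) {
        String value = settings.get("OnNaN");
        this.onNaN = (value != null) ? value : "propagate"; // assumed default
    }

    Resampler create() { return new ClippingResampler(onNaN); }
}

final class FactoryWiringDemo {
    public static void main(String[] args) {
        // Collect key=value arguments; only main() ever sees the full settings map.
        Map<String, String> settings = new HashMap<String, String>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq > 0) settings.put(arg.substring(0, eq), arg.substring(eq + 1));
        }
        ClippingResamplerFactory factory = new ClippingResamplerFactory(settings);
        System.out.println(factory.create().describe());
    }
}
```

Here ClippingResampler knows nothing about settings at all, and the factory holds only the one value it extracted; what remains is getting the factory object to the code that needs to create resamplers.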
OTOH, use of Singleton-esque methods is entirely appropriate for a global crosscutting concern such as an app-wide set of configuration or options parameters, or an app-wide logging facility (arguably *the* textbook example), or similarly.