Java Forum / General / September 2007
dependency-detection in java - Take 2
Andreas Leitgeb - 31 Aug 2007 08:20 GMT Andreas Leitgeb <avl@logic.at> wrote:
> Say, I've got two classes A and B, one of which (A) > contains "static final" fields (SFFs), the other (B) > references these fields. ... Obviously, I utterly failed to describe my problem before.
I've got a project tree full of classes (who hasn't?), some of these contain SFFs, some use SFFs.
Now, someone else on the team checks in com.mycompany.fubar.SomeClass, and everyone else in the project has it show up as new in their working copy and rebuilds it. (The rebuild might even be central and automatic.) It's a waste of resources to recompile the whole tree each time a single .java is checked in, so we (of course!) use ant.
And now, consider that this SomeClass that has been checked in might contain new values for existing SFFs, which are used elsewhere.
Now what? The choices seem to be these:
* Always recompile everything.
* Force each developer to find all dependent files and also check those in with just a whitespace change, so they will be picked up for compilation. This leaves the job of finding dependencies to each developer...
* Force the developer to set some repository flag when he changes a constant, and have the build process read that flag and trigger a full compile... still error-prone, since the developer might forget to set the flag.
New features (would require changes to javac, ant, or new tools):
* Have javac create dependency information for each file it compiles: a list of all referenced classes. Then ant could consult this list and add dependent files automatically to the list of files to compile, even if these weren't changed themselves.
Please don't waste your own time with saying, that your SFFs never change. You just lie to yourself, and your build-process is unreliable (or perhaps your project is small enough that a full-compile doesn't hurt, or you "just know" when to do a full compile and when not). This isn't only about SFFs, but also about classes changing their interface incompatibly (could be a typo of the developer, but without compiling the dependents, it may go unnoticed for a while!)
Michael Jung - 01 Sep 2007 10:42 GMT > I've got a project tree full of classes (who hasn't?), > some of these contain SFFs, some use SFFs. [quoted text clipped - 6 lines] > And now, think this SomeClass that has been checked in might > contain new values for existing SFFs, which are used elsewhere.
> Please don't waste your own time with saying, that your SFFs > never change. You just lie to yourself, and your build-process [quoted text clipped - 5 lines] > but without compiling the dependents, it may go unnoticed for > a while!) SFFs should only be used locally, say within a package, so when you rebuild your SFF, it is not a big hassle to rebuild the package. The problem arises when you have that constant spread throughout a bigger project. It is, in effect, an "uncontracted" interface or a global variable; it defies all encapsulation approaches and should be avoided.
Sometimes it can't be avoided fully. IDL files with generated constants are a good example; other external sources that you may not change are another. You can try to create facades, i.e. hide the generated code but provide your own interface to it, forcing all users of the IDL and its constants to use a getter instead.
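A minimal sketch of the facade idea, assuming a generated class with one constant. The names GeneratedIdlConstants, IdlFacade, DEFAULT_PORT and the value 8080 are hypothetical and only serve the illustration. Callers that read the constant directly get its value copied into their own .class file by javac, while callers that go through the getter only hold a method reference.

```java
/** Stand-in for a generated IDL class; name and value are hypothetical. */
final class GeneratedIdlConstants {
    // A compile-time constant: javac copies the value into every class that reads it.
    public static final int DEFAULT_PORT = 8080;

    private GeneratedIdlConstants() {}
}

/** Hand-written facade: the only class that touches the generated constant directly. */
final class IdlFacade {
    private IdlFacade() {}

    // Callers invoke this method instead of embedding the value, so only
    // IdlFacade itself must be recompiled when the generated constant changes.
    static int defaultPort() {
        return GeneratedIdlConstants.DEFAULT_PORT;
    }

    public static void main(String[] args) {
        System.out.println("port = " + IdlFacade.defaultPort());
    }
}
```

The cost is one extra method call per access and the discipline of never referencing the generated class outside the facade, which the compiler does not enforce.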
Even if you get ant or some other semantic analyser to solve that problem for you, you may still be stuck with the runtime problem, when someone in a distributed environment compiled against an old constant.
Michael
Andreas Leitgeb - 01 Sep 2007 20:13 GMT >> This isn't only about SFFs, but also about classes changing >> their interface incompatibly (could be a typo of the developer, >> but without compiling the dependents, it may go unnoticed for >> a while!)
> SFFs should only be used locally, I'm dreaming of reliable, non-full rebuilds. They shouldn't depend on developers following guidelines (which all have their "accepted exceptions", anyway), and shouldn't even depend on developers contributing correct code (but detect all errors, even those caused only in dependent classes.)
I'm viewing this problem from a CM point of view (although I'm more a developer than a CM person). Any change that gets into the repository might have been checked in by a monkey as well as by a senior expert. The build process shouldn't care. It should in the end say either "yes, the project has been built" or "it could not be built due to these errors". Furthermore, the build process should do this with minimal use of processor resources (on whatever machine it runs, be it a developer's workstation or a dedicated compile-server).
I'm aware, that such a build-tool-chain is either not existing now, or at least not known to anyone participating in this thread.
I'd like to discuss how this could be done. First, what is possible in principle, and where are the theoretical limits? Would dependency-management be necessarily more expensive than the unconditional full compile?
> Even if you get ant or some other semantic analyser to solve that problem for > you, you may still be stuck with the runtime problem, when someone in a > distributed environment compiled against an old constant. The goal of this discussion is a build-process (but not the full one!), which yields the same result regardless of which of the java files were most recently changed. So, in the end (almost) every developer would use that build-process (just like almost everyone already uses ant now), and given that they've checked out the same version, they'd get the same jar-file (except for files' meta-information like timestamps).
Mike Schilling - 01 Sep 2007 21:16 GMT >>> This isn't only about SFFs, but also about classes changing >>> their interface incompatibly (could be a typo of the developer, [quoted text clipped - 25 lines] > Would dependency-management be necessarily more expensive than > the unconditional full compile? I've thought about this a bit, though not to the point of creating a design, much less building prototypes. It seems to me that this approach is worth investigating:
1. The interface of each class C in the system needs to be captured and stored persistently, where "interface" means method signatures, field definitions, and constant values. Superclass name too, to cover the changes that can occur if what C inherits changes.
2. Dependencies also need to be captured and stored persistently. This will be information of the form:
. Class D depends on (some feature of) the interface of class C
3. Whenever a class is compiled successfully:
A. Its new interface is constructed.
B. Changes to the previous interface are computed, and all classes dependent upon something that changed are marked invalid (except for other classes compiled by the same invocation of javac, of course -- they're up to date and will be marked valid).
C. Its new dependencies are calculated.
D. Its stored interface and dependencies are updated.
E. It is marked valid.
4. Whenever a group of Java files is to be rebuilt, e.g. by Ant's <javac> task, any classes marked invalid in step 3 are recompiled, in addition to classes that are not up-to-date with their source files.
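Steps 1 to 4 above can be sketched as a small bookkeeping class. This is only an illustration under stated assumptions: the class name DependencyStore and its methods are hypothetical, the "interface" of a class is reduced to a set of descriptive strings, the granularity is the coarse "D depends on C", and everything lives in memory where a real tool would persist it between builds.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** In-memory sketch of steps 1-4; a real tool would persist these maps. */
final class DependencyStore {
    // Step 1: class name -> features of its interface (signatures, constants, superclass).
    private final Map<String, Set<String>> interfaces = new HashMap<>();
    // Step 2: class name -> classes it depends on.
    private final Map<String, Set<String>> dependencies = new HashMap<>();
    private final Set<String> invalid = new HashSet<>();

    /**
     * Step 3: record one successful javac invocation.
     * newInterfaces is keyed by every class compiled in that invocation.
     */
    void recordCompilation(Map<String, Set<String>> newInterfaces,
                           Map<String, Set<String>> newDependencies) {
        // 3A/3B: compare against the stored interfaces and invalidate dependents.
        for (Map.Entry<String, Set<String>> e : newInterfaces.entrySet()) {
            String compiled = e.getKey();
            Set<String> old = interfaces.get(compiled);
            if (old != null && !old.equals(e.getValue())) {
                for (Map.Entry<String, Set<String>> d : dependencies.entrySet()) {
                    if (d.getValue().contains(compiled)) {
                        invalid.add(d.getKey());
                    }
                }
            }
        }
        for (String compiled : newInterfaces.keySet()) {
            // 3C/3D: replace the stored interface and dependencies.
            interfaces.put(compiled, Set.copyOf(newInterfaces.get(compiled)));
            dependencies.put(compiled,
                    Set.copyOf(newDependencies.getOrDefault(compiled, Set.of())));
            // 3E: classes compiled in this invocation are up to date.
            invalid.remove(compiled);
        }
    }

    /** Step 4: classes to recompile in addition to those older than their sources. */
    Set<String> invalidClasses() {
        return Set.copyOf(invalid);
    }
}
```

How the interface strings and dependency sets are obtained is left open here, which is exactly outstanding problem B in the list that follows.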
Notes: "Class" is used a bit ambiguously above, sometimes to mean a .class file, sometimes to mean all of the .class files built from a single source file. Clearly if A$Inner depends upon B, it's A that's marked invalid when B changes.
The obvious outstanding problems with the above are:
A. How granular should the dependency information be? If it's simply "A depends on B", there will be a lot of unnecessary recompilations. If it's "A depends separately on the following 20 method calls it makes to B", there will be a vast amount of dependency information stored, updated, and checked. I have no intuition for where the sweet spot lies.
B. How to generate the dependency information. I presume it can be calculated from .class file analysis, but I haven't verified this in detail, nor do I know how expensive that would be. It would be awfully nice if javac would generate it for us, but it doesn't.
C. How to represent the dependency "C calls a method on an instance of D that's inherited from E". I think all of this information is required: if the method definition changes, it will change in E, but we also need to mark C invalid if D is reparented.
Inheritance adds some wrinkles. If Sub overrides a method it inherits from Super, that doesn't really change its interface. Classes which previously called Sub.meth() don't have to be recompiled. On the other hand, if Sub defines a field that hides a field defined in Super, classes that accessed Super.field should be marked invalid.
Overloads add some more wrinkles. If a new overload is added, methods that called the existing overloads and might now call the new one need to be recompiled. For practical purposes, it should be fine to recompile callers of any of the previous overloads, even if they're wholly disjoint.
When a class is reparented, it probably makes more sense to mark all of its dependents invalid, rather than to try to calculate exactly what changed. Note that changing the interfaces an abstract class implements is a kind of reparenting, since it can change the set of methods that the class defines.
I'm sure there are many more of these which further analysis would reveal. One more note: this is an ideal open source project, since it could be greatly useful to the development community and there is no money to be made by solving it.
Andreas Leitgeb - 04 Sep 2007 12:59 GMT > I've thought about this a bit, though not to the point of creating a design, > much less building prototypes. It seems to me that this approach is worth [quoted text clipped - 4 lines] > definitions, and constant values. Superclass name too, to cover the changes > that can occur if what C inherits changes. Yes, that's a good start.
> 2. Dependencies also need to be captured and stored persistently. This will > be information of the form: > . Class D depends on (some feature of) the interface of class C While I originally thought in that direction, I now have the impression that this is really too difficult. After all, there are a couple of Java features that would need to be taken care of (inheritance, compile-time resolution of fields and static methods, and the choice among overloaded methods).
As long as none of the recently changed classes changes its interface in an incompatible way(*), the current normal incremental build suffices.
If any class (well, except for private nested or anonymous ones) changes its interface such that it is possible for dependent classes to notice the difference, then a full-rebuild is worth it.
My original point of avoiding full-builds is still fulfilled, because the full rebuild wouldn't be necessary *everytime*! In the outset, the problem was, that one either needs to *always* do full-rebuilds, or risk inconsistencies. Now, it's about a reliable indicator, that still shouldn't fire too often.
> A. How granular should the dependency information be? I'd be already happy, if any .class-file's change can be reliably characterized as incompatible(*) or not. It wouldn't be a question of which java-file changed, but rather: Was any .class file changed incompatibly(*) during a normal incremental build?
> B. How to generate the dependency information. I presume it can be > calculated from .class file analysis No, not possible (unless javac was modified), because usage of static finals is not explicitly "mentioned" in the .class file.
> Inheritance adds some wrinkles. If Sub overrides a method it inherits from > Super, that doesn't really change its interface. Classes which previously > called Sub.meth() don't have to be recompiled. Unless it's a static method :-/ ... where dependent classes might continue to call Super's version, even if they refer to Sub.meth(). The same goes with fields and overloaded (even non-static) methods.
(*): Chapter 13 of the JLS-3.0 (Java Language Specification 3rd Edition) mentions (among other allowed changes): * Adding new fields, methods, or constructors to an existing class or interface. as a binary compatible change, but it isn't always compatible in our sense. Our rules for compatibility are stricter, in that they demand not only linkability, but also "equivalence in behaviour whether or not a dependent class is recompiled as well".
> I'm sure there are many more of these which further analysis would reveal. I also fear so ... but once I leave out the problem to list all dependents, I think that what remains should be possible and still useful.
> One more note: this is an ideal open source project, since it could be > greatly useful to the development community and there is no money to be made > by solving it. I wouldn't say so. Making builds reliable without resorting to always doing full rebuilds might save some costs. However, I'm a fan of not only using open-source, but also contributing to it ...
Mike Schilling - 04 Sep 2007 20:16 GMT >> B. How to generate the dependency information. I presume it can be >> calculated from .class file analysis > > No, not possible (unless javac was modified), because usage of static > finals is not explicitly "mentioned" in the .class file. I didn't quite believe this, so I created an example, got out my handy class file analyzer and found that you're completely correct. Even compiling with debugging information, there is no information put in the class file about which constants were referenced, or even which classes constants were referenced from.
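For readers who want to repeat that check, here is a minimal sketch of a constant-pool reader that lists the classes a .class file names. The class and method names are hypothetical and this is not either poster's tool. It reads only the header and the constant pool, using the tag layout from the JVM specification. Running it on a class B that reads a compile-time constant from A, and does not otherwise use A, is the experiment described above.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Lists the class names found in a .class file's constant pool. */
final class ConstantPoolClasses {
    static Set<String> referencedClasses(byte[] classFile) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort(); // minor version
            in.readUnsignedShort(); // major version
            int count = in.readUnsignedShort();
            Map<Integer, String> utf8 = new HashMap<>();
            Map<Integer, Integer> classNameIndex = new HashMap<>();
            for (int i = 1; i < count; i++) { // entries are numbered from 1
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1: utf8.put(i, in.readUTF()); break;                  // Utf8
                    case 7: classNameIndex.put(i, in.readUnsignedShort()); break; // Class
                    case 5: case 6: in.readLong(); i++; break; // Long/Double use two slots
                    case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                        in.readInt(); break;                   // 4-byte payloads
                    case 8: case 16: case 19: case 20:
                        in.readUnsignedShort(); break;         // 2-byte payloads
                    case 15: in.readUnsignedByte(); in.readUnsignedShort(); break; // MethodHandle
                    default: throw new IllegalArgumentException("unknown tag " + tag);
                }
            }
            Set<String> names = new TreeSet<>();
            for (int nameIndex : classNameIndex.values()) {
                names.add(utf8.get(nameIndex));
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Names come back in internal form (for example java/lang/Object). The bytes can be obtained with java.nio.file.Files.readAllBytes on the .class file in question.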
> Making builds reliable without resorting to always > doing full rebuilds might save some costs. As someone who was in the software tools business for years, I feel confident in predicting that you wouldn't find enough people who would pay for it to recover your development costs.
Andreas Leitgeb - 05 Sep 2007 22:41 GMT > I didn't quite believe this, so I created an example, got out my handy class > file analyzer ... I'm curious as to which tool you use for this task. The way you said that ("got out my ...") seems to me to indicate that you've also made your own...
I have written one myself (in Tcl, not in Java, so I do the parsing completely myself), because I didn't like javap hiding away private fields and methods. Unfortunately the user-interface of my script is still somewhat cryptic (not yet good enough for prime time).
PS: this is not meant as a general question about such tools. Meanwhile I know some already, but back then, when I wrote my own, all I knew then was javap.
Mike Schilling - 05 Sep 2007 22:52 GMT >> I didn't quite believe this, so I created an example, got out my handy >> class [quoted text clipped - 3 lines] > ("got out my ...") seems to me to indicate that you've also made your > own... I did. It's rudimentary so far, just parses out and prints all of the bits including the constant pool entries. When I have time I'll flesh it out to a tool that helps determine class dependencies, so that I can safely re-organize a large existing code base.
Michael Jung - 02 Sep 2007 17:39 GMT > >> This isn't only about SFFs, but also about classes changing > >> their interface incompatibly (could be a typo of the developer, > >> but without compiling the dependents, it may go unnoticed for > >> a while!)
> > Even if you get ant or some other semantic analyser to solve that problem > > for you, you may still be stuck with the runtime problem, when someone in a > > distributed environment compiled against an old constant.
> The goal of this discussion is a build-process (but not the full one!), > which yields the same result regardless which of the java files were most > recently changed. So, in the end (almost) every developer would use > that build-process (just like almost everyone already uses ant now), and > given that they've checked out the same version, they'd get the same > jar-file (except for files' meta-information like timestamp) Is this what you want: every time a file A changes, all dependant files (B) should be recompiled automatically? (You can't just mean the forward direction: as each developer compiles his stuff B, javac will automatically determine that the A class file is older than the A java file and recompile it.)
What would the end result of such an automatic build be in case it yields lots of errors? (Because the dependant javas don't compile anymore; incompatible changes needed to be introduced.)
Michael
Andreas Leitgeb - 03 Sep 2007 08:33 GMT > Is this what you want: every time a file A changes, all dependant files (B) > should be recompiled automatically? For some definition of "automatically", yes :-) I do *not* expect javac to handle reverse-dependencies (B) as it does forward-dependencies. That would be a bad thing.
What I want instead is some team-work of ant and javac. Example 1: ant passes all files (of the codebase) to javac (plus some new option) and javac will first find all the changed ones, and then all the type-"B" ones among the others and compile those as well, but not those unrelated to all regenerated ones. Example 2: javac emits extra information, from which ant could determine the type-"B" candidates, and add them to the list of files passed to javac.
As of now, ant only passes the changed files to javac, so javac doesn't even have a chance of seeing any reverse-dependent ones, and it surely shouldn't go searching for these the way it does for forward-dependencies.
> What would the end result of such an automatic build be in case it > yields lots of errors? (Because the dependant javas don't compile > anymore; incompatible changes needed to be introduced.) Practical example: A user checks in a new version of an Interface-class, in which he added a new method. At next build, all classes within my codebase (a concept known to ant) that implement that interface, and which don't already implement that new method, are supposed to throw compile-errors then.
Michael Jung - 03 Sep 2007 18:55 GMT > > Is this what you want: every time a file A changes, all dependant files (B) > > should be recompiled automatically? [quoted text clipped - 6 lines] > and then all the type-"B" ones among the others and compile those > as well, but not those unrelated to all regenerated ones. [...]
Sticking to this: This would require ant/javac to walk through all of the codebase and then through all of the imports.
Say a developer recompiles his most volatile java file ever so often. Each time some remote change might possibly affect him, the whole engine starts? I don't know, but it was one of the things that really turned me off in the default eclipse settings, that every "save" would get the thing petrified, because we had a rather large source tree, 90% of which I don't care about. But some indirect dependency forced me to wait until the thing was through with checking all of it.
In other words, when the code base gets big enough to warrant such an import-analysis over "make all", the advantages of not doing it at all also increase dramatically. YMMV.
Michael
Andreas Leitgeb - 04 Sep 2007 10:06 GMT >> > Is this what you want: every time a file A changes, all dependant files (B) >> > should be recompiled automatically? [quoted text clipped - 9 lines] > Sticking to this: This would require ant/javac to walk through all of the > codebase and then through all of the imports. "Walk through all the codebase" ... this sounds quite expensive, but actually this happens with every incremental build already: ant checks every .java-file in the codebase, whether it's newer than its .class file.
> Say a developer recompiles his most volatile java file ever so often. Each > time some remote change might possibly affect him, the whole engine starts? The advantage of javac handling the reverse-dependencies itself could be that it triggers those recompilations only if the depended-upon source has not just changed, but has also changed its interface. That would mean: a central java-class adding a new method/field or changing only implementation could even skip the reverse-dependency handling. If, however, some class changed its interface (like changing a static final, adding abstract methods, or removing non-abstract ones), then anything else than following reverse-dependencies leaves an inconsistent state among your .class-files.
Perhaps following reverse-dependencies isn't the only solution. Having a way to detect binary-incompatible interface-changes in any of the recompiled classes (during the normal incremental build, like ant already supports) would let me know when to start a full-compile. This might even be enough. Actually, now that I think of that, this could even be done without any enhancements on javac.
> I don't know, but it was one of the things that really turned me off in the > default eclipse settings, that every "save" would get the thing petrified, I never made any claims, that a build should be auto-started on file-change. It's quite comfortable for the guy with the big machine (to whom a background compile is hardly noticable). The other guys rather turn off auto-build-on-file-save (as well as auto-build-on-key-typed :-)
> In other words, when the code base gets big enough to warrant such an > import-analysis over "make all", the advantages of not doing it at all > also increase dramatically. YMMV. That's what I feared since the start of this discussion. Is dependency-analysis necessarily more (or almost as) expensive than a full compile?
Michael Jung - 04 Sep 2007 18:09 GMT > >> > Is this what you want: every time a file A changes, all dependant files (B) > >> > should be recompiled automatically? [quoted text clipped - 13 lines] > every .java-file in the codebase, whether it's newer than its .class > file. No, it only checks the file that is scheduled for compiling. That's "just" a tree out of your graph.
> > Say a developer recompiles his most volatile java file ever so often. Each > > time some remote change might possibly affect him, the whole engine starts? [quoted text clipped - 8 lines] > then anything else than following reverse-dependencies leaves an > inconsistent state among your .class-files. How would you do that? A would be the target of javac, right? Then see "walk through the whole codebase".
> Perhaps following reverse-dependencies isn't the only solution. > Having a way to detect binary-incompatible interface-changes > in any of the recompiled classes (during the normal incremental > build, like ant already supports) would let me know when to start > a full-compile. This might even be enough. That is still the same thing. You must walk the whole codebase.
> Actually, now that I think of that, this could even be done > without any enhancements on javac. I'm pretty sure it must be done without javac, because javac works only "downward". Complete "upward" walking is only possible when everything is available, generally at runtime.
> I never made any claims, that a build should be auto-started on > file-change. It's quite comfortable for the guy with the big machine > (to whom a background compile is hardly noticable). The other guys rather > turn off auto-build-on-file-save (as well as auto-build-on-key-typed :-) The guy with the big machine can spare a few moments for a rebuild of a subproject with a change in constants. Even IDL-constants shouldn't be spread about the whole project. You should know how far they carry.
> > In other words, when the code base gets big enough to warrant such an > > import-analysis over "make all", the advantages of not doing it at all > > also increase dramatically. YMMV. > That's what I feared since the start of this discussion. Is > dependency-analysis necessarily more (or almost as) expensive > than a full compile? I'm pretty sure it's of the same order with regards to code base size in general. What the factor between them is, will depend on your needs.
There is a way to circumvent that by keeping the reverse-dependencies in some database, which you update as B's are checked in and which you query when A's are checked in.
Michael
Andreas Leitgeb - 14 Sep 2007 15:30 GMT >> Is dependency-analysis necessarily more (or almost as) >> expensive than a full compile? [quoted text clipped - 3 lines] > database, which you update as B's are checked in and which you query when A's > are checked in. The current approach is a bit more reluctant. I'd just like to determine "relevant" changes between each new .class file and its previous version (stored in some database or file).
My current focus would be finding out what changes in a class are relevant, and which are not. Based on the existence of relevant changes, a full-compile of the project would be triggered.
What is "relevant"? The definition would be: any change in a class A, for which a class B exists in the project's codebase, which compiled with old and new version of A.java in place would result in different generated .class files. (not only B.class!) This definition is not yet cast in stone, since even changes in the classes generated from such B.java could also be subject to qualification as effective or non-effective.
But how do we practically determine "relevant"ness? I think we can only carefully *approach* it for now.
As a very first step, any change in the class' interface could be considered relevant, which would save full-compiles when only method-implementations or private members were changed. (Note that private nested classes are classes on their own, so adding one, or even removing one is irrelevant, and for changing one, the same rules apply on that private class itself)
For the next step, adding a new non-static method (whose name wasn't yet used in that class, not even with a different signature), or certain types of private methods or data could be considered harmless.
Also, as long as no variadic methods exist yet, even overloaded method names with a previously nonexisting arity should be harmless. (There is no way for the compiler to pick these instead of the existing ones in the course of recompiling any dependent class.)
Any other interface changes, that are provably irrelevant to recompiling depending classes?
Michael Jung - 14 Sep 2007 18:04 GMT > >> Is dependency-analysis necessarily more (or almost as) > >> expensive than a full compile? [quoted text clipped - 6 lines] > determine "relevant" changes between each new .class file and it's > previous version (stored in some database or file) I'd try to keep it simple. If you try to restrict dependency too much, you might need a compile to determine dependency and that would gain you nothing. Go through imports. Also beware of successive dependency, e.g. classes B that inherit from classes that inherit from A.
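The successive-dependency caveat amounts to taking a transitive closure over reverse edges. A minimal sketch, assuming the direct "who imports or extends whom" edges have already been collected into a map; the class name ReverseDependencies and the method affectedBy are hypothetical, and how the map is built (for example by scanning imports) is not shown.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ReverseDependencies {
    /**
     * @param changed          the class whose interface changed
     * @param directDependents class name -> classes that import or extend it directly
     * @return every class reachable from changed along those edges, excluding changed
     */
    static Set<String> affectedBy(String changed, Map<String, Set<String>> directDependents) {
        Set<String> affected = new TreeSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(changed);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String dependent : directDependents.getOrDefault(current, Set.of())) {
                // add() returns false for classes already seen, which also ends cycles
                if (!dependent.equals(changed) && affected.add(dependent)) {
                    queue.add(dependent);
                }
            }
        }
        return affected;
    }
}
```

With edges A -> {B} and B -> {C}, a change to A yields B and C, which covers the "B inherits from something that inherits from A" case.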
Michael
Andreas Leitgeb - 14 Sep 2007 19:58 GMT >> The current approach is a bit more reluctant. I'd just like to >> determine "relevant" changes between each new .class file and it's >> previous version (stored in some database or file)
> I'd try to keep it simple. That's indeed my intention.
> If you try to restrict dependency too much, you > might need a compile to determine dependency and that would gain you nothing. The point is: I've given up the task of finding reverse-dependencies altogether, since I now think it's really impossible to get right.
I just scan the tree for class-files which are newer than they were at last scan, and for each new class file I compare its current interface with the stored old one, and if even one differs, I raise a flag that suggests/triggers a full compile.
E.g.: I see a class with a static method which it didn't have before, then I know, that in principle there *could* exist some other class, which *might* be affected in some way, which is enough to say: "Hey developer! better recompile all!"
The bonus is, that if none of these relevant changes happened, the developer *knows* that the previous incremental build was safe.
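A rough sketch of the comparison step, with two loud assumptions: it uses reflection on loaded classes as a stand-in for parsing .class files and storing the result, and it treats every non-private difference as relevant (the coarsest rule). The names InterfaceFingerprint and the three Limits classes are hypothetical demo material.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class InterfaceFingerprint {
    /** Describes what a dependent class could observe, leaving out the class's own name. */
    static Set<String> of(Class<?> c) {
        Set<String> out = new TreeSet<>();
        out.add("extends " + c.getSuperclass());
        out.add("implements " + Arrays.toString(c.getInterfaces()));
        for (Field f : c.getDeclaredFields()) {
            int mod = f.getModifiers();
            if (Modifier.isPrivate(mod) || f.isSynthetic()) continue;
            String line = Modifier.toString(mod) + " " + f.getType().getName() + " " + f.getName();
            // Static finals of primitive or String type may be inlined, so record the value.
            if (Modifier.isStatic(mod) && Modifier.isFinal(mod)
                    && (f.getType().isPrimitive() || f.getType() == String.class)) {
                line += " = " + readStatic(f);
            }
            out.add(line);
        }
        for (Method m : c.getDeclaredMethods()) {
            if (Modifier.isPrivate(m.getModifiers()) || m.isSynthetic()) continue;
            out.add(Modifier.toString(m.getModifiers()) + " " + m.getReturnType().getName()
                    + " " + m.getName() + Arrays.toString(m.getParameterTypes()));
        }
        for (Constructor<?> k : c.getDeclaredConstructors()) {
            if (Modifier.isPrivate(k.getModifiers()) || k.isSynthetic()) continue;
            out.add(Modifier.toString(k.getModifiers()) + " <init>"
                    + Arrays.toString(k.getParameterTypes()));
        }
        return out;
    }

    private static Object readStatic(Field f) {
        try {
            f.setAccessible(true);
            return f.get(null);
        } catch (ReflectiveOperationException | RuntimeException e) {
            return "<unreadable>";
        }
    }
}

// Hypothetical "versions" of one class, given distinct names so they can coexist.
class LimitsV1 { static final int MAX = 10; private int cache; int size() { return 0; } }
class LimitsV1Reimplemented {
    static final int MAX = 10;
    private long cache;
    private void helper() {}
    int size() { return cache > 0 ? 1 : 0; }
}
class LimitsV2 { static final int MAX = 20; int size() { return 0; } }
```

Comparing the fingerprints of LimitsV1 and LimitsV1Reimplemented finds no difference (only private members and a method body changed), whereas LimitsV1 against LimitsV2 differs because the constant's value changed, which is the case that should raise the full-compile flag.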
PS: there are still more caveats to my approach, e.g. if a build runs into errors, some class-files might be already new, whereas others are still old, and if this mixed state gets fed into the interface-database, it turns into garbage.
Andreas Leitgeb - 02 Sep 2007 22:47 GMT > I'd like to discuss how this could be done. First, what is > principially possible to do - where are the theoretic limits? Unfortunately, I've to accept, that my ultimate goal is unreachable, which would have been to have a working incremental build, that would work with just anything checked in, even if it's in some subtle way invalid java-code (in which case it would detect and report failure).
This won't work. There are situations where, in full generality, nothing but a full compile can detect the error. This is because any two java-files (A.java & B.java) could each contain a second (non-public) class of a common name (C). Only compiling A.java and B.java in the same javac-run will detect this problem.
Also, as Mike pointed out, removing .java files is another thing, which an incremental build cannot "correctly" deal with - it can't remove orphaned .class-files.
I don't give up, yet. Even if I won't find *all* possible errors incrementally, I can still find most typical errors that way. Even interface-changes that do not cause compile errors (which also includes changes of constants) could still be made to be properly propagated to all dependent classes.
If javac were to add an attribute to the .class-file (containing "source" and perhaps even the value) for each inlined foreign constant, then third party tools (like e.g. ant) would have the necessary data available to do dependency management. It would still be not at all trivial for the third party tool, but quite likely trivial on javac's side.
Another approach could be to have javac write out .depend-files as used in C/C++-world - quite unlikely to ever happen.
javac could also take a new option that tells it not to unconditionally compile, but check compile-necessity for each given file. Advantage: it actually only checks forward-dependencies. Disadvantage: the check could be almost as expensive as a compile (don't know, though). This would mean, that javac would take over (and do better) some part of ant.
Roedy Green - 01 Sep 2007 11:03 GMT >Now what? >The choices seem to be these: I think a solution might work like this:
You have a hardware "compile server". Its job, when probed, is to fetch the latest source and recompile it. You might use a tool like the Replicator to distribute the class files of the latest successful compilation to everyone. That saves clients the work of doing a huge build on perhaps a machine too tiny to compile efficiently.
See http://mindprod.com/webstart/replicator.html
If you change a static final, the safest route is a clean recompile of everything. It would be the duty of the person checking in such code to warn the compile server, perhaps with a comment in the embedded checkin log.
A little java program would run on the compile server all the time, spawning ant tasks and fielding requests to recompile.
One of its main functions is to maintain a coherent set of class files from the last successful compile, to distribute to anyone who requests them, in case some idiot checks in code that blows the build out of the water.
Andreas Leitgeb - 01 Sep 2007 20:57 GMT > I think a solution might work like this: > You have a hardware "compile server". > That saves clients the work of doing a huge build > on perhaps a machine too tiny to compile efficiently. I think it doesn't matter if one wastes cycles for full build on each developer's PC or on a central server. I want to discuss saved effort of a new (yet to be developed) build-process, that respects reverse- dependencies to make an incremental build as reliably "correct" as a full rebuild.
Please see my reply to Michael Jung. It seems to take me a few iterations to get my point clearer (not only to others, but also to myself).
Mike Schilling - 01 Sep 2007 22:44 GMT >> I think a solution might work like this: >> You have a hardware "compile server". [quoted text clipped - 7 lines] > dependencies to make an incremental build as reliably > "correct" as a full rebuild. One more point: a full rebuild will remove class files generated from source files that have been deleted. Very few build systems do this for incremental builds, even though it's not particularly difficult.
Andreas Leitgeb - 02 Sep 2007 19:03 GMT >> I want to discuss saved effort of a new (yet to >> be developed) build-process, that respects reverse- >> dependencies to make an incremental build as reliably >> "correct" as a full rebuild.
> One more point: a full rebuild will remove class files generated from source > files that have been deleted. Very few build systems do this for > incremental builds, even though it's not particularly difficult. Thanks for pointing that out! It's indeed a fundamental limitation of incremental builds. Even if we do not delete a whole .java-file, we might have re-arranged the code such that some helper-class (with e.g. $0 appended) is no longer generated, or we might have removed non-public additional classes.
I'll have to think through it further... (maybe what I wanted is actually impossible :-( )
Mike Schilling - 02 Sep 2007 19:35 GMT >>> I want to discuss saved effort of a new (yet to >>> be developed) build-process, that respects reverse- [quoted text clipped - 11 lines] > e.g. $0 appended) is no longer generated, or we might have removed > non-public additional classes. The thing is, removing these files is easy, because it's simple to correlate class files with the source file they came from; it's just that most build systems don't bother. It's as simple as this:
Before the compilation step, find the .java file corresponding to each .class file. If
A. There isn't one, or
B. It's in the set of files to be recompiled
then remove the .class file.
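A minimal sketch of that rule, under stated assumptions: every top-level class lives in a .java file of the same name, nested and anonymous classes carry a `$` suffix, and all paths are relative to the source or class root with `/` as separator. The names `StaleClassFiles`, `sourceFor` and `toDelete` are hypothetical. The sketch only computes which class files to remove; the actual deletion is left to the caller.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper illustrating the A/B rule above.
final class StaleClassFiles {
    /** Maps "com/foo/Bar$1.class" to "com/foo/Bar.java". */
    static String sourceFor(String classFile) {
        String base = classFile.substring(0, classFile.length() - ".class".length());
        int slash = base.lastIndexOf('/');
        int dollar = base.indexOf('$', slash + 1);
        if (dollar > slash + 1) {
            base = base.substring(0, dollar); // strip the nested/anonymous class suffix
        }
        return base + ".java";
    }

    /** Class files whose source is missing (A) or is about to be recompiled (B). */
    static Set<String> toDelete(Set<String> classFiles, Set<String> sources, Set<String> recompile) {
        Set<String> stale = new TreeSet<>();
        for (String classFile : classFiles) {
            String source = sourceFor(classFile);
            if (!sources.contains(source) || recompile.contains(source)) {
                stale.add(classFile);
            }
        }
        return stale;
    }
}
```

Because the mapping relies purely on file names, it would wrongly delete the class file of any top-level class declared in a differently named source file.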
> I'll have to think through it further... > (maybe what I wanted is actually impossible :-( )
Mark Thornton - 02 Sep 2007 20:09 GMT > The thing is, removing these files is easy, because it's simple to correlate > class files with the source file they came from; it's just that most build [quoted text clipped - 7 lines] > > then remove the .class file. This task is actually a bit harder than you suggest. It is legal (if bad practice) to include top level non public classes (i.e. package level) in .java files which do not match the name of the class.
Mark Thornton
Mike Schilling - 02 Sep 2007 20:31 GMT >> The thing is, removing these files is easy, because it's simple to >> correlate class files with the source file they came from; it's just [quoted text clipped - 11 lines] > bad practice) to include top level non public classes (i.e. package > level) in .java files which do not match the name of the class. You're right; I was discounting that because I've never heard of anyone doing it. OK, let's add a flag to the build system to turn off the behavior described above, which 97% of its users will never have to worry about setting.
Andreas Leitgeb - 03 Sep 2007 08:10 GMT >> The thing is, removing these files is easy, because it's simple to correlate >> class files with the source file they came from; > > This task is actually a bit harder than you suggest. It is legal (if bad > practice) to include top level non public classes (i.e. package level) > in .java files which do not match the name of the class. Actually, the difficulty depends on whether the compiler correctly includes the "SourceFile"-attribute into the .class-file. (I think most, if not all, do, but I've seen class-files without it - perhaps the work of an obfuscator.)
It's still not exactly trivial, especially, if due to a developer's error, two .java-files contain the same package-level class.
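For illustration, a minimal sketch of how a tool could read the "SourceFile" attribute straight out of a .class file. It walks the class-file layout as laid out in the JVM specification (constant pool, interfaces, fields, methods, then the class-level attributes) and returns null when the attribute is absent. The class name `SourceFileAttribute` and its methods are hypothetical, and the sketch does no validation beyond the magic number.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical reader for the class-level SourceFile attribute.
final class SourceFileAttribute {
    /** Returns the SourceFile value, or null if the class file has none. */
    static String read(InputStream in) throws IOException {
        DataInputStream d = new DataInputStream(in);
        if (d.readInt() != 0xCAFEBABE) throw new IOException("not a class file");
        skip(d, 4); // minor_version, major_version
        int count = d.readUnsignedShort();
        String[] utf8 = new String[count];
        for (int i = 1; i < count; i++) {
            int tag = d.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = d.readUTF(); break;      // CONSTANT_Utf8
                case 5: case 6: skip(d, 8); i++; break;    // Long/Double occupy two slots
                case 15: skip(d, 3); break;                // MethodHandle
                case 7: case 8: case 16: case 19: case 20: skip(d, 2); break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    skip(d, 4); break;
                default: throw new IOException("unknown constant tag " + tag);
            }
        }
        skip(d, 6);                          // access_flags, this_class, super_class
        skip(d, 2 * d.readUnsignedShort());  // interfaces
        skipMembers(d);                      // fields
        skipMembers(d);                      // methods
        int attributes = d.readUnsignedShort();
        for (int i = 0; i < attributes; i++) {
            String name = utf8[d.readUnsignedShort()];
            int length = d.readInt();
            if ("SourceFile".equals(name)) return utf8[d.readUnsignedShort()];
            skip(d, length);
        }
        return null;
    }

    private static void skipMembers(DataInputStream d) throws IOException {
        int members = d.readUnsignedShort();
        for (int i = 0; i < members; i++) {
            skip(d, 6); // access_flags, name_index, descriptor_index
            int attributes = d.readUnsignedShort();
            for (int a = 0; a < attributes; a++) {
                skip(d, 2);           // attribute_name_index
                skip(d, d.readInt()); // attribute payload
            }
        }
    }

    private static void skip(DataInputStream d, int n) throws IOException {
        d.readFully(new byte[n]); // throws EOFException on a truncated file
    }
}
```

A dependency tool could compare the returned name with the class name and flag classes whose source file is named differently, or whose attribute is missing altogether.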
I dislike these additional classes. Their visibility should be limited to the main class of that source (and their name autogenerated like MyClass$0), or they should be treated like nested classes. Not that I really expected this to change anytime soon...
Mark Thornton - 03 Sep 2007 19:28 GMT >>> The thing is, removing these files is easy, because it's simple to correlate >>> class files with the source file they came from; [quoted text clipped - 4 lines] > Actually, the difficulty depends on whether the compiler correctly > includes the "SourceFile"-attribute into the .class-file. (I think The javac option -g:none should emit class files without this attribute. There are also tools which will strip such attributes from class files after compilation.
> I dislike these additional classes. So do I, but as they exist a dependency tool ought to detect them even if only to emit a large raspberry! :-)
Mark Thornton
Mike Schilling - 03 Sep 2007 19:45 GMT >>>> The thing is, removing these files is easy, because it's simple to >>>> correlate class files with the source file they came from; [quoted text clipped - 12 lines] > So do I, but as they exist a dependency tool ought to detect them even > if only to emit a large raspberry! :-) "I cannot work under these conditions!"
Roedy Green - 04 Sep 2007 04:29 GMT >I think it doesn't matter if one wastes cycles for full >build on each developer's PC or on a central server. >I want to discuss saved effort of a new (yet to >be developed) build-process, that respects reverse- >dependencies to make an incremental build as reliably >"correct" as a full rebuild. how about a rule like this.
Your central compile server is notified on checkin. It does a checkout. It looks for the string "public static final". If it sees it, it redoes a clean compile; if not, incremental.
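A minimal sketch of that rule as stated, assuming the server already has the text of each checked-in source file in memory. The names `BuildModeRule`, `Mode` and `choose` are hypothetical. The match is a literal substring test, so it misses variants such as "static public final" or non-public constants, and it also fires on constants whose values did not change.

```java
import java.util.Collection;

// Hypothetical decision helper for the compile server.
final class BuildModeRule {
    enum Mode { CLEAN, INCREMENTAL }

    /** Chooses a clean compile if any checked-in source mentions "public static final". */
    static Mode choose(Collection<String> checkedInSources) {
        for (String text : checkedInSources) {
            if (text.contains("public static final")) {
                return Mode.CLEAN;
            }
        }
        return Mode.INCREMENTAL;
    }
}
```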
This is different from doing the compiles at the client machines since it is triggered as soon as possible. Client compiles would not be triggered till much later, when someone does a checkout.
The server can serve the most recent coherent source/object, just the deltas. This is almost instantaneous so the extra time to clean compile is almost irrelevant. When you compile on a client machine, you have to wait. Further the time to compile is longer on the client since the client machines are not as powerful.
Andreas Leitgeb - 04 Sep 2007 10:34 GMT > On 01 Sep 2007 19:57:51 GMT, Andreas Leitgeb <avl@logic.at> wrote, >>I think it doesn't matter if one wastes cycles for full >>build on each developer's PC or on a central server. > > ... It looks for the string "public static final". It's not only the static finals that need to be taken care of. They were special only insofar as they triggered this thread in the first place, and in that they aren't "documented" in the using class's .class-file, unlike all other dependencies.
> If it sees it it redoes a clean compile, if not, incremental. The "conditional" clean compile seems to be the best one can get. However, the "condition" needs yet some fleshing out. Probably I'll start a new thread for that.
> This is different from doing the compiles at the client machines since > it is triggered as soon as possible. Anyway, the concept of central compilation and class-file distribution is entirely orthogonal to the question of which class-files are actually regenerated during a particular build-run on a particular machine.
I understand your point to be: the time saved by central full-compilation is still more than by any intelligent-incremental build on the developer machine... Our developer machines are actually almost dumb terminals (with some X11-server/emulation) to the central Sun workstation, and most of the compilation happens there *before* checkin, so we avoid checking in broken sources.