Geek in a Suit

Friday, January 2, 2009

Why I hate the java.util collection library.

<rant>

It's very simple. While it was a vast improvement over Vector and Dictionary of their day, the Collections library as of Java 1.2 did what all new Java APIs from Sun seem to do... require lots of code to use simply and allow the user to do invalid things.

I won't spend a lot of time on the former, since it could be the topic of a whole other blog post, and I should preface all of this by saying that Java is my most proficient language. So this is not an anti-Java bigot speaking... just a frustrated user who wishes Sun and the community wouldn't keep heaping bad APIs on top of bad APIs.

Sorry... back to the point.

Immutability - it's all backwards... what's with that?

The big issue I have with the Collections API are about allowing the user to do wrong things. This amounts to Sun having inverted Mutability vs. Immutability in the class heirarchy. Immutability, in any language that wants to guarantee semantics, ease concurrency and resource contention, and otherwise clean things up should have immutability as a default. Something set shouldn't be volatile or mutable unless specified as such, and the semantics should enforce this. But with the collections API, we have immutability as an optional characteristic of Collections. To use a simplistic example, you can do this:

Collection c = new ArrayList();
c.add(foo);
c.add(bar);

Collection has an add() method. This means that, by definition, Collection is mutable. "But wait!" you cry, you can obtain an immutable version of this collection by calling Collections.immutableCollection(c);. Sure. At which point you have something that conforms to the contract of Collection, but which may throw exceptions if part of that contract is relied upon. In other words, you should not have access to methods that allow you to break the contract. To allow this is to have written a bad contract with ambiguity. Now, to properly guard against the possibility of stray immutable collections, you may be forced to check immutability before executing the contract (the add() method) or you may have to guard against the exceptions with try-catch logic and exception handling. You can see some of this in the concurrent library's implementations of concurrent collections. It's not bad, but could be simpler with an immutable collection interface.

Additionally, consider how hard it is to create anonymous one-off implementations of Collection. You have to implement not only size, contains(), iterator(), but all the storage logic. If you're wrapping an existing data structure and merely wanted a read-only view on the structure, you are forced to implement all that extra API in your code purely to satisfy the optional contract provided in the Collection definition.

The point here is that an immutable Collection is a sub-set of the functionality of a MutableCollection. That should be obvious, but apparently wasn't to Sun. Consider had Sun used the model used in NeXTSTEP (and now Apple's Cocoa) APIs. Collection would have been defined as (simplified):

public interface Collection<T> extends Iterable<T> {
    public Iterator<T> getIterator();
    public int getSize();
    public boolean isEmpty();
    public boolean contains(T object);
    public boolean containsAll(Collection<T> object);
    public T[] toArray(T[]);
}

and

public interface MutableCollection<T> extends Collection<T> {
    public boolean add(T object);
    public boolean addAll(Collection<T> objects);
    public boolean remove(T object);
    public boolean removeAll(Collection<T> objects);
    public boolean retainAll(Collection<T> objects);
    public void clear();
}

This would, ultimately, mean that a Collection instance, typed as a Collection would not have any mutable methods available to invoke, let alone that would need guarding against stray immutable invocations. There would then be two strategy for guaranteeing immutability. One... cast the stupid thing as Collection, and onothing that has access to the cast can get at the MutableCollections methods (except by explicit reflection). Alternately, add a "getImmutableCopy()" method to the MutableCollection interface that creates a shallow copy that is NOT an implementor of MutableCollection... merely of Collection. Then you have a safe "snapshot" of the mutable object that can be freely passed around without worry that something else will modify it.

Ok, why is this such a big deal? It's about having code that means what it says. If I have to guess about the run-time state, or more concrete type of an instance to know if it's OK to invoke one of its methods or not, then I'm working with implicit contract, and that's murky territory. Java, by making an ImmutableCollection a special implementation of Collection, has inverted the hierarchy of contract, and exposed methods that are not truly available for all implementors. Optional interfaces are fine, but you don't expose the optional interface above where it's true. It's a basic piece of encapsulation that the Java folks just seemed to forget.

Now, this is a decade too late, this little rant. Truth is, I made it when I worked at Sun, but was quite the junior contracting peon, and had no real voice. Now, I have a blog, and am free to whine and be annoyed in public. :) But I hope to make a more general point here about contract. Optional contracts (APIs) need to be handled very carefully, and in a way such that an unfamiliar programmer can understand what you meant from how the contract reads. Look at your interfaces from the perspective that it should not offer what it (potentially) cannot satisfy. Polymorphism doesn't require "kitchen-sinkism" in an API. Just careful, thought-out layers.

The issues of testability also arise here, insofar as a class that has to implement optional APIs must, therefore, have more tests to satisfy what are, in essence, boilerplate code. If I made a quick-and-dirty implementation of Collection as a wrapper around an already existing, immutable data structure, then I have to implement all of those methods and test them, when in fact, half of them will throw an exception. This means (to me) that they probably shouldn't even exist. Code with a lot of boilerplate code (lots of extra getters and setters, or over-wrought interfaces) tend to be hard to test, and one of my big annoyances in life (these days) is testing boilerplate code. Ick.

</rant>

Labels: , , , ,

Tuesday, November 4, 2008

code coverage != test driven development

I was vaguely annoyed to see this blog article featured in JavaLobby's recent mailout. Not because Kevin Pang doesn't make some good points about the limits of code coverage, but because his title is needlessly controversial. And, because JavaLobby is engaging in some agile-baiting by publishing it without some editorial restraint.

In asking the question, "Is code coverage all that useful," he asserts at the beginning of his article that Test Driven Development (TDD) proponents "often tend to push code coverage as a useful metric for gauging how well tested an application is." This statement is true, but the remainder of the blog post takes apart code coverage as a valid "one true metric," a claim that TDD proponents don't make, except in Kevin's interpretation.

He further asserts that "100% code coverage has long been the ultimate goal of testing fanatics." This isn't true. High code coverage is a desired attribute of a well tested system, but the goal is to have a fully and sufficiently tested system. Code coverage is indicative, but not proof, of a well-tested system. How do I mean that? Any system whose authors have taken the time to sufficiently test it such that it gets > 95% code coverage is likely (in my experience) thinking through how to test their system in order to fully express its happy paths, edge cases, etc. However, the code coverage here is a symptom, not a cause, of a well-tested system. And the metric can be gamed. Actually, when imposed as a management quality criterion, it usually is gamed. Good metrics should confirm a result obtained by other means, or provide leading indicators. Few numeric measurements are subtle enough to really drive system development.

Having said that, I have used code-coverage in this way, but in context, as I'll mention later in this post.

Kevin provides example code similar to the following:

String foo(boolean condition) {
   if (condition)
       return "true";
   else
       return "false";
}

... and talks about how if the unit tests are only testing the true path, then this is only working on 50% coverage. Good so far. But then he goes on to express that "code coverage only tells us what was executed by our unit tests, not what executed correctly." He is carefully telling us that a unit test executing a line doesn't guarantee that the line is working as intended. Um... that's obvious. And if the tests didn't pass correctly, then the line should not be considered covered. It seems there are some unclear assumptions on how testing needs to work, so let me get some assertions out of the way...

  1. Code coverage is only meaningful in the context of well-written tests. It doesn't save you from crappy tests.
  2. Code coverage should only be measured on a line/branch if the covering tests are passing.
  3. Code coverage suggests insufficiency, but doesn't guarantee sufficiency.
  4. Test-driven code will likely have the symptom of nearly perfect coverage.
  5. Test-driven code will be sufficiently tested, because the author wrote all the tests that form, in full, the requirements/spec of that code.
  6. Perfectly covered code will not necessarily be sufficiently tested.

What I'm driving at is that Kevin is arguing against something entirely different than that which TDD proponents argue. He's arguing against a common misunderstanding of how TDD works. On point 1 he and I are in agreement. Many of his commentators mention #3 (and he states it in various ways himself). His description of what code coverage doesn't give you is absurd when you take #2 into account (we assume that a line of covered code is only covered if the covering test is passing). But most importantly - "TDD proponents" would, in my experience, find this whole line of explanation rather irrelevant, as it is an argument against code-coverage as a single metric for code quality, and they would attempt to achieve code quality through thoroughness of testing by driving the development through tests. TDD is a design methodology, not a testing methodology. You just get tests as side-effect artifacts of the approach. Useful in their own right? Sure, but it's only sort of the point. It isn't just writing the tests-first.

In other words - TDD implies high or perfect coverage. But the inverse is not necessarily true.

How do you achieve thoroughness by driving your development with tests? You imagine the functionality you need next (your next increment of useful change), and you write or modify your tests to "require" the new piece of functionality. They you write it, then you go green. Code coverage doesn't enter into it, because you should have near perfect coverage at all times by implication, because every new piece of functionality you develop is preceded by tests which test its main paths and error states, upper and lower bounds, etc. Code coverage in this model is a great way to notice that you screwed up and missed something, but nothing else.

So, is code-coverage useful? Heck yeah! I've used coverage to discover lots of waste in my system. I've removed whole sets of APIs that were "just in case I need them" APIs, because they become rote (lots of accessors/mutators that are not called in normal operations). Is code coverage the only way I would find them? No. If I'm dealing with a system that wasn't driven with tests, or was poorly tested in general, I may use coverage as a quick health meter, but probably not. Going from zero to 90% on legacy code is likely to be less valuable than just re-writing whole subsystems using TDD... and often more expensive.

Regardless, while Kevin is formally asking "is code coverage useful?" he's really asking (rhetorically) is it reasonable to worship code coverage as the primary metric. But if no one's asserting the positive, why is he questioning it? He may be dealing with a lot of people with misunderstandings of how TDD works. He could be dealing with metrics bigots. He could be dealing with management-imposed-metrics initiatives which often fail. It might be a pet peeve or he's annoyed with TDD and this is a great way to do some agile-baiting of his own. I don't know him, so I can't say. His comments seem reasonable, so I assume no ill intent. But the answer to his rhetorical question is "yes, but in context." Not surprising, since most rhetorically asked questions are answerable in this fashion. Hopefully it's a bit clearer where it's useful (and where/how) it's not.

Labels: , , , , ,