Monday 17 November 2008

Saying you're agile doesn't make you agile.

A lot of posts have come up in response to James Shore's article on Agile's decline and fall, and the implications of using "scrum alone, and misapplied." While I agree with the bulk of James's content, there's a much simpler issue at work, and it's one that has been hanging around since the earliest days: branding.

Agile is a brand

I can call myself anything. I can call myself "Christian" (that's my name). I can call myself Canadian. I can call myself a consultant. These are descriptive, verifiable, and helpful designations in understanding what I'm about. I can also call myself "agile." Depending on how the label is understood, that can be helpful or not. Do I mean that I'm physically agile? That I'm flexible in my thinking? That I observe a particular method of software development? That I'm iterating? At that level the term is meant to be evocative, which makes it a very easy brand to apply, but also a slightly meaningless one.

Scrum is a stronger brand: it has some licensing, and there's a "by the book" approach. Extreme Programming is also a stronger brand. You can measure an organization's application of the practices and determine, to a much larger extent, whether they're really doing XP or Scrum. Agile (as a brand) has always suffered from this weakness, in that it can be applied so generically as to be unhelpful as a description. Because of this, Agile has failed a lot in its history: anyone can do a little piece of agile, call themselves "Agile," and fail. "Agile" then becomes a convenient scapegoat for the failure. Waterfall is in a similar unenviable position, but it was received wisdom in the software industry for so long that no one really questioned whether it was one of the culprits until alternatives were seriously proposed.

So is "Agile" failing?

At one point I worked with a client that implemented Scrum. Only they didn't. They let teams determine their own processes in many ways, except that they all iterated; the infrastructure of Scrum™ wasn't there. They did not deliver working software every iteration. They did not have a clearly defined product owner. They had ambiguity about what "the team" was, in fact. There was no ScrumMaster-like role. Little in the way of acceptance criteria was provided to the teams. Few teams were allowed to prioritize defects above new functionality. Basically, I'm listing off a set of Scrum assumptions/directives that were not followed, yet this organization called what it was doing "Agile." This company did not deliver software for several years, and is now not "doing agile" as an organization. (Some of the teams are actually moving forward with stronger agile processes, but the outer shell of the organization is moving towards stronger "contract-oriented" delivery in large phases.)

So did Agile fail? Did they fail Agile? Most Scrum proponents would probably suggest (and we did) that they weren't even doing Agile in the first place. In fact, some teams were starting to adopt some of the engineering practices. The teams were learning, but the organization wasn't. They held retrospectives but didn't apply the learning. They weren't fearless in their willingness to change and adapt to what they were discovering about their process. So in effect, they started with Scrum-but, ended up doing a quarter of XP with iterations, and management did not accept what was obvious to the rank and file. I do not consider this an "agile failure" or a "failure of agile" because, frankly, Agile wasn't even given a fair shake.

I contend, based on my experience there, that even had they implemented Scrum "by the book," it might well not have "saved" them. Their engineers, however, are extremely competent (when permitted), and some of their best immediately began to implement agile-supportive engineering practices. As consultants we helped them move towards a more coherent infrastructure around continuous integration, separation of types of testing, and so on. I think they could have succeeded, because the ingredients were there. Scrum was exposing problems, insofar as they were using its elements. But without applying the learning, Scrum is as useless as any other learning system. As Mishkin Berteig opines frequently in his courses, "Scrum is very hard."

Incidentally, in case you know my clients, don't be too hard on these guys. Management tried very hard, was undermined, interfered, and meant well. For the most part, the agile implementations I've observed have been partial implementations, largely doomed to produce less value, less quickly, than was otherwise possible.

Can "Agile" fail if you are using something else.

What James talks about is not "agile" or "scrum" failing, but a wolf in sheep's clothing. I'm not copping out. As I said, most of the implementations I've seen have been half-baked, partial agile implementations which, in retrospective analyses done for my clients, have failed along exactly the fault lines left where practices were ignored or seen as too risky or costly. These methods are systems whose parts work together for good reasons. Lean analysis can help, but it is still a learning system closely aligned with Scrum in mindset (if not in brand), and it will merely uncover things that you then have to fix.

If I call myself a Buddhist, but become a materialist freak... did Buddhism fail, or am I simply unworthy of the label? If I label myself a Christian, but then fail to love my neighbour, is Christianity failing, or am I failing Christianity? If I call myself a Muslim, then violate the strictures of the Qur'an (by, say, killing innocents), is Islam violent, or am I doing an injustice to the name of the religion? I mean no disservice to any religion by calling it a "brand"; like any other label, you can think of them that way. The easier a brand is to apply, the easier it is to distort and pervert from its original intent.

If I take a bottle of Coca-Cola, replace the contents with prune juice and Perrier, and then put it in a blind taste test with Pepsi, which subsequently wins... did Pepsi beat Coca-Cola in a blind taste test? No. I subverted the brand by applying it to something else entirely.

If I've made my point clear, the problem should be obvious. Agile is a weak brand, and it can easily be misapplied. So, in a sense, given that Agile was only ever a loose community of like-minded methods and systems practitioners, maybe it's OK that Agile, as a brand, declines. The problem is that such a decline will discourage people from applying the management processes and engineering practices that encourage organizational learning and product quality.

If that happens, then the sky has fallen, because we weren't making high-quality software before Agile, and I would despair if we just gave up.

Not a fad for me

I'll keep teaching and coaching "agile" teams, encouraging them to implement the full range of agile practices (above and below the engineering line). Not because I'm a believer, but because I find these are useful practices which, when applied in concert, powerfully enable teams to deliver high-quality software. And because I enjoy working this way. And because I deliver faster when I do them. That's the bottom line. Agile might be a fad, but I don't really care. I wasn't using these techniques just because they had become popular; many of them I had used before I had ever heard the terms Scrum, XP, or Agile.

It's a bit like Object-Orientation, which I've been using since the NeXTSTEP days in the early '90s. At that point it wasn't clear that O-O had "won" over structured programming, and it was not really a "fad" until the advent of Java. That didn't stop me from using a helpful paradigm that improved my quality, integrity, and productivity, and it shouldn't stop anyone else. CIO.com will announce the new fad soon enough. Use it if it's useful. Use Agile methods if you find them useful. Was O-O hard when I first tried to do it? Sure! There are books on its pitfalls and how to screw it up. (And in fact I find little of it done well out there, despite the prevalence of O-O-supportive languages.) But I still advocate for its use, and use it myself.

The quality, integrity, and delivery of your software are in your hands. Do what you gotta do.

Tuesday 4 November 2008

code coverage != test driven development

I was vaguely annoyed to see this blog article featured in JavaLobby's recent mailout. Not because Kevin Pang doesn't make some good points about the limits of code coverage, but because his title is needlessly controversial, and because JavaLobby engaged in some agile-baiting by publishing it without editorial restraint.

In asking the question, "Is code coverage all that useful?", he asserts at the beginning of his article that Test Driven Development (TDD) proponents "often tend to push code coverage as a useful metric for gauging how well tested an application is." That statement is true, but the remainder of the blog post takes apart code coverage as a valid "one true metric," a claim that TDD proponents don't actually make, except in Kevin's interpretation.

He further asserts that "100% code coverage has long been the ultimate goal of testing fanatics." This isn't true. High code coverage is a desired attribute of a well-tested system, but the goal is to have a fully and sufficiently tested system. Code coverage is indicative, but not proof, of a well-tested system. What do I mean by that? The authors of any system tested thoroughly enough to reach >95% code coverage have likely (in my experience) thought through how to test their system so as to fully express its happy paths, edge cases, and so on. The code coverage here is a symptom, not a cause, of a well-tested system. And the metric can be gamed; in fact, when imposed as a management quality criterion, it usually is gamed. Good metrics should confirm a result obtained by other means, or provide leading indicators. Few numeric measurements are subtle enough to really drive system development.

Having said that, I have used code-coverage in this way, but in context, as I'll mention later in this post.
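To make the "gamed" case concrete, here's a small hypothetical sketch (the Discount class and the test are invented for illustration, and I'm assuming JUnit as the harness). It earns a perfect coverage number for the method while verifying nothing:

import org.junit.Test;

// A hypothetical illustration of a "gamed" coverage number: this test
// executes every line and both branches of Discount.apply(), so a coverage
// tool reports 100% for it, yet it asserts nothing about the results.
public class GamedCoverageTest {

    // Invented class, purely for illustration.
    static class Discount {
        double apply(double price, boolean member) {
            if (member)
                return price * 0.9;
            return price;
        }
    }

    @Test
    public void executesEveryLineButVerifiesNothing() {
        Discount discount = new Discount();
        discount.apply(100.0, true);   // member branch executed
        discount.apply(100.0, false);  // non-member branch executed
        // No assertions: a bug in apply() would never fail this test,
        // but the coverage report still looks perfect.
    }
}

That's all it takes to satisfy a coverage target without actually verifying anything.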

Kevin provides example code similar to the following:

String foo(boolean condition) {
   if (condition)
       return "true";
   else
       return "false";
}

... and talks about how, if the unit tests only exercise the true path, then the method is only at 50% coverage. Good so far. But then he goes on to say that "code coverage only tells us what was executed by our unit tests, not what executed correctly." He is carefully telling us that a unit test executing a line doesn't guarantee that the line is working as intended. Um... that's obvious. And if the covering tests didn't pass, then the line should not be considered covered. It seems there are some unclear assumptions about how testing needs to work, so let me get some assertions of my own out of the way (a small sketch illustrating the first two follows the list):

  1. Code coverage is only meaningful in the context of well-written tests. It doesn't save you from crappy tests.
  2. Code coverage should only be measured on a line/branch if the covering tests are passing.
  3. Code coverage suggests insufficiency, but doesn't guarantee sufficiency.
  4. Test-driven code will likely have the symptom of nearly perfect coverage.
  5. Test-driven code will be sufficiently tested, because the author wrote all the tests that form, in full, the requirements/spec of that code.
  6. Perfectly covered code will not necessarily be sufficiently tested.
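As promised, here's a minimal sketch of assertions 1 and 2 applied to Kevin's example (I've wrapped his method in a hypothetical Foo class and, again, assumed JUnit). Both branches are executed and asserted against, so the coverage only counts because these tests pass:

import org.junit.Test;
import static org.junit.Assert.assertEquals;

// A minimal sketch of what "covered" should mean for the foo() example above:
// both branches are executed *and* asserted against, so the coverage only
// counts because these tests pass (assertions 1 and 2).
public class FooTest {

    // Kevin's example method, wrapped in a hypothetical class for the test.
    static class Foo {
        String foo(boolean condition) {
            if (condition)
                return "true";
            else
                return "false";
        }
    }

    @Test
    public void returnsTrueStringWhenConditionIsTrue() {
        assertEquals("true", new Foo().foo(true));
    }

    @Test
    public void returnsFalseStringWhenConditionIsFalse() {
        assertEquals("false", new Foo().foo(false));
    }
}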

What I'm driving at is that Kevin is arguing against something entirely different from what TDD proponents argue. He's arguing against a common misunderstanding of how TDD works. On point 1 he and I are in agreement. Many of his commenters mention #3 (and he states it in various ways himself). His description of what code coverage doesn't give you is absurd when you take #2 into account (a line of code is only covered if the covering test is passing). But most importantly, "TDD proponents" would, in my experience, find this whole line of explanation rather irrelevant: it argues against code coverage as a single metric for code quality, whereas they would attempt to achieve code quality through thoroughness of testing, by driving the development through tests. TDD is a design methodology, not a testing methodology; you just get tests as side-effect artifacts of the approach. Useful in their own right? Sure, but that's only sort of the point. It isn't just writing the tests first.

In other words - TDD implies high or perfect coverage. But the inverse is not necessarily true.

How do you achieve thoroughness by driving your development with tests? You imagine the functionality you need next (your next increment of useful change), and you write or modify your tests to "require" the new piece of functionality. Then you write it, and you go green. Code coverage doesn't enter into it, because you should have near-perfect coverage at all times by implication: every new piece of functionality you develop is preceded by tests which exercise its main paths and error states, upper and lower bounds, etc. Code coverage in this model is a great way to notice that you screwed up and missed something, but nothing else.
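To illustrate that rhythm, suppose (hypothetically) the next increment of useful change for the foo() example is handling a missing condition. A sketch of the red-then-green step, assuming JUnit, might look like this; coverage of the new branch falls out as a side effect, because the test came first:

import org.junit.Test;
import static org.junit.Assert.assertEquals;

// Step 1 (red): the next increment, reporting a null condition, expressed
// as a test before any production code is touched. It fails at first.
public class FooNullConditionTest {

    @Test
    public void reportsNullWhenConditionIsMissing() {
        assertEquals("null", new Foo().foo(null));
    }
}

// Step 2 (green): the smallest change to foo() that makes the new test pass.
// The parameter is widened to Boolean so that a missing value is representable;
// the new branch is covered by construction, because the test came first.
class Foo {
    String foo(Boolean condition) {
        if (condition == null)
            return "null";
        return condition ? "true" : "false";
    }
}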

So, is code coverage useful? Heck yeah! I've used coverage to discover lots of waste in my systems. I've removed whole sets of "just in case I need them" APIs, because they turn out to be rote boilerplate (lots of accessors/mutators that are never called in normal operation). Is code coverage the only way I would find them? No. If I'm dealing with a system that wasn't driven with tests, or was poorly tested in general, I might use coverage as a quick health meter, but probably not. Going from zero to 90% coverage on legacy code is likely to be less valuable than just re-writing whole subsystems using TDD... and often more expensive.
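As a made-up example of the kind of waste I mean: a setter that a coverage report shows at zero coverage is often telling you the method has no callers at all, which makes it a candidate for deletion rather than for another test.

// Hypothetical "just in case" API. The getter is exercised by tests and by
// production code; the setter was written speculatively and is never called,
// so a coverage report flags it at 0%: a deletion candidate, not a test gap.
public class Account {

    private String ownerName;

    public Account(String ownerName) {
        this.ownerName = ownerName;
    }

    public String getOwnerName() {
        return ownerName;              // covered: called in normal operation
    }

    public void setOwnerName(String ownerName) {
        this.ownerName = ownerName;    // uncovered: no callers anywhere
    }
}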

Regardless, while Kevin is formally asking "is code coverage useful?", he's really asking (rhetorically) whether it's reasonable to worship code coverage as the primary metric. But if no one's asserting the positive, why is he questioning it? He may be dealing with a lot of people who misunderstand how TDD works. He could be dealing with metrics bigots. He could be dealing with management-imposed-metrics initiatives, which often fail. It might be a pet peeve, or he's annoyed with TDD and this is a great way to do some agile-baiting of his own. I don't know him, so I can't say. His comments seem reasonable, so I assume no ill intent. But the answer to his rhetorical question is "yes, but in context." That's not surprising, since most rhetorical questions can be answered that way. Hopefully it's now a bit clearer where it's useful, and where (and how) it's not.