Saturday, June 4, 2011

Procedural Objects

Consider one of the classic problems of object-oriented languages. Or to be precise, class-oriented languages, like Java, C++, Objective-C, C#, Smalltalk and Eiffel. Namely, you've got a use case. A use case is an algorithm: it says, with these objects we shall do certain things. You might have two or five or ten different classes, and dozens or hundreds of actual object instances, participating in this particular use case.

What object, or objects, describe the use case?

Before we continue, let's address some terminology problems. People make reference to business objects, domain objects, entities, and value objects, among other types. Unfortunately these terms mean different things to different people. This is an observable fact and it won't go away, so we may as well accept that we cannot use these terms and still keep the discussion clear.

So I'll simply refer to objects (or classes, or instances of classes, or prototypes).

Object-oriented languages - class-oriented languages specifically - early on arrived at a picture of a system where use cases were split up amongst the methods belonging to classes. This is still a primary hallmark of object-oriented design. Furthermore, it is assumed that if the partitioning is done correctly, the overall system behaviour will somehow emerge from all the object interactions.

This thinking becomes problematic quickly: a use case may involve dozens of methods in dozens of classes. This is difficult to reason about, both during the original implementation and during maintenance.

Point being, there is a lot of logic in an application that is procedural. It describes use cases, or algorithms. In the sense of classic objects - objects that have state, and whose behaviour is meant to operate only on that state - there is a great deal of logic that doesn't belong to classic objects.

A simple use case that illustrates this problem is a conventional document approval workflow. There is a Document object - its state might include a name (title), contents (a reference), security information, versioning, and its lifecycle status. There are also Actors - the initial author(s), one or more reviewers, one or more approvers, and recipients. We'll keep it simple and assume that the document is not a record. We'll also assume that external processes decide which people belong to which groups, for any given document.
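To make the Document concrete, here is a minimal Python sketch of the state listed above. The field names, the `Status` values, and the `revise` method are assumptions made for illustration, not a prescribed design. The point is only that the class holds state, plus behaviour that concerns that state, and nothing about approval.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    """Lifecycle states; these particular names are illustrative assumptions."""
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Document:
    """State only: no approval workflow logic lives here."""
    title: str
    content_ref: str                                 # a reference to the contents
    readers: set = field(default_factory=set)        # stand-in for security information
    version: int = 1
    status: Status = Status.DRAFT

    def revise(self, new_content_ref: str) -> None:
        """Behaviour that concerns only the document's own state."""
        self.content_ref = new_content_ref
        self.version += 1


# Hypothetical usage: the reference strings are made up.
doc = Document(title="Design Spec", content_ref="store://specs/42")
doc.revise("store://specs/43")
```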

A flowchart for a typical document approval workflow is moderately dense; you could diagram it on a standard letter-sized piece of paper and be able to read it at arm's length, but the diagram would certainly be busy.

So what methods in what classes are responsible for all this logic? Document? Absolutely not - the workflow will vary enormously by organization and timeframe and project, to mention just a few factors. Actors? No, because actors are external to the system by definition. Let's say that there were actor "stub" objects - the actions that these stubs would take are extremely simple. For example, if an actor stub existed for an Approver, there are aspects of a single approval task, like enforcing non-repudiation, or dealing with an unmet deadline, that are not its job. But neither are they responsibilities of the Document instance.

So in a typical C++ or Java or C# application - for that matter Python or Ruby - we end up doing a number of undesirable things:

  1. we shoehorn a bunch of state and behaviour into Document that doesn't belong there;
  2. we jam a bunch of state and behaviour into other classes in the application; any class that has the slightest relationship with document approval becomes a possible candidate for housing some of the logic;
  3. we create one or more procedure objects to handle the use case, but hold our noses while doing it.

The first two activities happen in 99 percent of OO apps. There are also lots of ways of (dubiously) justifying all this. One design smell is when people start talking about rich domain classes; this usually means that they are justifying, often unconsciously, the placement of logic in the wrong spots. In other words, they are breaking up use case logic and don't know where to put it.

There is actually nothing wrong with the first part of #3. It's the second part that is the problem - the fact that we have been conditioned to think that it is bad to have great chunks of procedural, imperative code in our OO program. To add insult to injury, methods that are too "imperative looking" are often derisively and incorrectly referred to as God methods, or the objects that contain them as God objects. The problem in fact is having dozens of small methods in inappropriate classes, accompanied by excessive coupling.
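As a rough illustration of the procedure object from item 3, here is a Python sketch in which the whole approval use case reads top to bottom in one method. Everything named here is hypothetical: the `ApprovalProcedure` class, the two-stage review-then-approve flow, and the `decisions` dictionary standing in for input from real actors. Treating a missing decision as a rejection is also just an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Minimal stand-in for a document (hypothetical fields)."""
    title: str
    status: str = "draft"


@dataclass
class ApprovalProcedure:
    """A procedure object: one use case, written as plain imperative code."""
    reviewers: list
    approvers: list
    log: list = field(default_factory=list)

    def run(self, doc: Document, decisions: dict) -> str:
        # Stage 1: every reviewer must sign off before approvers are asked.
        doc.status = "in_review"
        for reviewer in self.reviewers:
            if not decisions.get(reviewer, False):
                doc.status = "rejected"
                self.log.append(f"{reviewer} rejected in review")
                return doc.status
            self.log.append(f"{reviewer} reviewed")

        # Stage 2: every approver must approve.
        for approver in self.approvers:
            if not decisions.get(approver, False):
                doc.status = "rejected"
                self.log.append(f"{approver} declined approval")
                return doc.status
            self.log.append(f"{approver} approved")

        doc.status = "approved"
        return doc.status


# Hypothetical usage: names and decisions are made up.
spec = Document("Design Spec")
procedure = ApprovalProcedure(reviewers=["rita"], approvers=["al", "bo"])
outcome = procedure.run(spec, {"rita": True, "al": True, "bo": False})
```

Nothing about the workflow leaks into `Document`, and a reader can follow the use case without jumping between classes.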

OO design patterns largely skirt this issue. They serve a different purpose. If anything, logic which is badly fragmented due to mistaken beliefs about OO design simply makes it more difficult to properly identify opportunities for application of patterns.

Some readers probably know where I am going with this: Data, Context and Interaction, or DCI. To use Domain-Driven Design (DDD) language, the Data in DCI consists of Entities: objects that have state, and behaviour pertaining to that state, but no use-case behaviour.

A Context is an object with one or more methods, all of which relate to a use case (or several related use cases). The Context knows about Roles, and its methods both bind objects to roles and enact the use case logic.

Finally, the Interactions are what the Roles do. As dictated by the Context, data objects assume Roles for varying periods to execute use case logic.
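The three parts can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a canonical DCI implementation: roles are modelled here as thin wrapper objects around the data, which is only one of several ways to bind roles, and all class names, method names, and the "unanimous approval" rule are invented for the example.

```python
from dataclasses import dataclass


# --- Data: state, and behaviour about that state, but no use-case logic ---
@dataclass
class Document:
    title: str
    status: str = "draft"


@dataclass
class Person:
    name: str


# --- Roles and their Interactions (hypothetical names) ---
class ApprovalSubject:
    """Role a Document plays while it is under approval."""
    def __init__(self, document: Document):
        self.document = document
        self.decisions = {}

    def record_decision(self, who: str, approve: bool) -> None:
        self.decisions[who] = approve

    def settle(self) -> None:
        # Assumed rule: approval must be unanimous and non-empty.
        unanimous = bool(self.decisions) and all(self.decisions.values())
        self.document.status = "approved" if unanimous else "rejected"


class Approver:
    """Role a Person plays for the duration of one approval."""
    def __init__(self, person: Person):
        self.person = person

    def decide(self, subject: ApprovalSubject, approve: bool) -> None:
        subject.record_decision(self.person.name, approve)


# --- Context: binds data objects to roles, then enacts the use case ---
class ApproveDocument:
    def __init__(self, document: Document, people: list):
        self.subject = ApprovalSubject(document)
        self.approvers = [Approver(p) for p in people]

    def execute(self, votes: dict) -> str:
        # A missing vote counts as a rejection (an assumption of this sketch).
        for approver in self.approvers:
            approver.decide(self.subject, votes.get(approver.person.name, False))
        self.subject.settle()
        return self.subject.document.status


# Hypothetical usage: names and votes are made up.
spec = Document("Design Spec")
context = ApproveDocument(spec, [Person("al"), Person("bo")])
result = context.execute({"al": True, "bo": True})
```

`Document` and `Person` know nothing about approval; the role bindings exist only for as long as the `ApproveDocument` context does.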

From a design perspective, with DCI, it now becomes much easier to reason about what the code is doing for a given use case. The business logic is obviously still not in one spot, lexically, but all the code that comprises a use case is much easier to locate.

In subsequent posts we will examine implementations of a simple document approval workflow, using DCI, in C++, Scala and F#.
