Stories about Software



Quick Information/Overview

Pattern Type: Structural
Applicable Language/Framework: Agnostic OOP
Pattern Source: Gang of Four
Difficulty: Easy

Up Front Definitions

  1. Component: This is the abstract object that represents any and all members of the pattern.
  2. Composite: Derives from component and provides the definition for the class that contains other components.
  3. Leaf: A concrete node that encapsulates data/behavior and is not a container for other components.
  4. Tree: For our context here, tree describes the structure of all of the components.

The Problem

Let’s say that you get a call in to write a little console application that allows users to make directories, create empty files, and delete the same. The idea is that you can create the whole structure in memory and then dump it to disk in one fell swoop. Marketing thinks this will be a hot seller because it will allow users to avoid the annoyance of disk I/O overhead when planning out a folder structure (you might want to dust off your resume, by the way).

Simple enough, you say. For the first project sprint, you’re not going to worry about sub-directories – you’re just going to create something that will handle files and some parent directory.

So, you create the following class (you probably create it with actual input validation, but I’m trying to stay focused on the design pattern):

Client code is as follows:
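The original listings are not reproduced in this copy. As a stand-in, here is a minimal sketch of what the first-sprint class and its client might look like, written in Python rather than the C# the post refers to. The class name DirectoryStructure and the settable root come from the post; the method names and the in-memory list are assumptions made for illustration.

```python
from pathlib import Path


class DirectoryStructure:
    """Sprint 1 sketch: one root directory holding a flat list of file names."""

    def __init__(self):
        self.root = ""      # settable property, as described for the first sprint
        self._files = []    # file names planned under the root

    def add_file(self, name):
        self._files.append(name)

    def delete_file(self, name):
        self._files.remove(name)

    def structure_lines(self):
        # The root on the first line, each file indented beneath it.
        return [self.root] + ["  " + name for name in self._files]

    def print_structure(self):
        print("\n".join(self.structure_lines()))

    def create_on_disk(self):
        # Dump the in-memory plan to disk in one go.
        root = Path(self.root)
        root.mkdir(parents=True, exist_ok=True)
        for name in self._files:
            (root / name).touch()


# Client code (hypothetical names throughout)
structure = DirectoryStructure()
structure.root = "speculative"
structure.add_file("file1.txt")
structure.add_file("file2.txt")
structure.delete_file("file2.txt")
structure.print_structure()
```

Nothing here is wrong yet: with no nesting, a list of strings is a perfectly reasonable model.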

That’s all well and good, so you ship and start work on sprint number 2, which includes the story that users want to be able to create one level of sub-directories. You do a bit of refactoring, deciding to make root a constructor parameter instead of a settable property, and then get to work on this new story.

You wind up with the following:

and client code:
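Again, the original listings are missing here. A rough Python sketch of the second-sprint shape, assuming a dictionary that maps each sub-directory name to its list of files, could look like the following. C# overloads are mimicked with separately named methods, and every method name is an assumption.

```python
class DirectoryStructure:
    """Sprint 2 sketch: root files plus exactly one level of sub-directories."""

    def __init__(self, root):
        self._root = root             # now a constructor parameter
        self._files = []              # files directly under the root
        self._subdirectories = {}     # sub-directory name -> its file names

    def add_file(self, name):
        self._files.append(name)

    def add_file_to_directory(self, directory, name):
        self._subdirectories[directory].append(name)

    def delete_file(self, name):
        self._files.remove(name)

    def delete_file_from_directory(self, directory, name):
        self._subdirectories[directory].remove(name)

    def add_directory(self, name):
        self._subdirectories[name] = []

    def delete_directory(self, name):
        del self._subdirectories[name]

    def structure_lines(self):
        # Printing now needs a second, hand-written loop for the extra level.
        lines = [self._root]
        lines += ["  " + name for name in self._files]
        for directory, files in self._subdirectories.items():
            lines.append("  " + directory + "/")
            lines += ["    " + name for name in files]
        return lines


# Client code (hypothetical names throughout)
structure = DirectoryStructure("speculative")
structure.add_file("readme.txt")
structure.add_directory("docs")
structure.add_file_to_directory("docs", "notes.txt")
print("\n".join(structure.structure_lines()))
```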

Yikes. That’s starting to smell. You’ve had to add overloads for adding file and deleting file, add methods for add/delete directory, and append logic to print and create. Basically, you’ve had to either touch or overload every method in the class. Generally, that’s a surefire sign that you’re doin’ it wrong. But, no time for that now because here comes the third sprint. This time, the business wants two levels of nesting. So, you get started and you see just how ugly things are going to get. I won’t provide all of your code here so that the blog can keep a PG rating, but here’s the first awful thing that you had to do:
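That listing is not reproduced here either. Judging from the later remark about increasingly nested dictionaries, it was presumably a dictionary of dictionaries. A guess at its shape, in Python with hypothetical names:

```python
from typing import Dict, List

# directory name -> (sub-directory name -> file names)
# Every additional level of nesting wraps this type in yet another Dict.
NestedDirectories = Dict[str, Dict[str, List[str]]]

nested: NestedDirectories = {"docs": {"drafts": ["notes.txt"]}}


def add_file_two_levels(tree, directory, subdirectory, name):
    """Add a file exactly two levels down; other depths need other functions."""
    tree.setdefault(directory, {}).setdefault(subdirectory, []).append(name)


add_file_two_levels(nested, "docs", "drafts", "todo.txt")
add_file_two_levels(nested, "src", "lib", "util.py")
```

The depth of the tree is baked into the type itself, which is why each new sprint forces a new type and a new set of methods.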

You also had to modify or overload every method yet again, bringing the method total to 12 and making each method more complex than before. You’re pretty sure you can’t keep this up for an arbitrary number of sprints, so you send out your now-dusted-off resume and foist this stinker on the hapless person replacing you.

So, What to Do?

What went wrong here is relatively easy to trace. Let’s backtrack to the start of the second sprint when we needed to support sub-directories. As we’ve seen in some previous posts in this series, the first foray at implementing the second sprint gives off the code smell known as primitive obsession. This is a code smell wherein bunches of primitive types (string, int, etc.) are used in ad-hoc fashion to stand in for a class that ought to exist.

In this case, the smell is coming from the series of lists and dictionaries centering around strings to represent file and directory names. As the sprints go on, it’s time to recognize that there is a need for at least one class here, so let’s create it and call it “SpeculativeDirectory” (so as not to confuse it with the C# Directory class).

And, the DirectoryStructure class is now:

The main program that invokes this class is unchanged. So, now, notice the structure of SpeculativeDirectory. It contains a collection of strings, representing files, and a collection of SpeculativeDirectory objects, representing sub-directories. For things like PrintStructure() and CreateOnDisk(), notice that we’re now taking advantage of recursion.
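As a rough illustration of that intermediate design (the original listing is not reproduced here), a Python sketch might look like this. Files are still plain strings, sub-directories are objects of the same class, and printing recurses. The method names are assumptions.

```python
class SpeculativeDirectory:
    """A directory that holds file names (strings) and child directories."""

    def __init__(self, name):
        self.name = name
        self._files = []          # still plain strings at this stage
        self._directories = []    # child SpeculativeDirectory objects

    def add_file(self, name):
        self._files.append(name)

    def add_directory(self, directory):
        self._directories.append(directory)

    def delete_file(self, name):
        self._files.remove(name)

    def delete_directory(self, directory):
        self._directories.remove(directory)

    def structure_lines(self, depth=0):
        indent = "  " * depth
        lines = [indent + self.name + "/"]
        lines += [indent + "  " + name for name in self._files]
        for directory in self._directories:
            # Recursion: each child describes itself one level deeper.
            lines += directory.structure_lines(depth + 1)
        return lines


root = SpeculativeDirectory("root")
root.add_file("readme.txt")
docs = SpeculativeDirectory("docs")
docs.add_file("notes.txt")
drafts = SpeculativeDirectory("drafts")
drafts.add_file("todo.txt")
docs.add_directory(drafts)
root.add_directory(docs)
print("\n".join(root.structure_lines()))
```

Note the paired methods for files and directories; that duplication is what the rest of the refactoring goes after.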

This is extremely important because what we’ve done here is future-proofed for sprint 3 much better than before. It’s still going to be ugly and involve more and more overloads, but at least it won’t require defining increasingly nested (and insane) dictionaries in the DirectoryStructure class.

Speaking of DirectoryStructure, does this class serve a purpose anymore? Notice that the answer is “no, not really”. It basically defines a root directory and wraps its operations. So, let’s get rid of that before we do anything else.

To do that, we can just change the client code to the following and delete DirectoryStructure:

Now, we’re directly using the directory object and we’ve removed a class in addition to cleaning things up. The API still isn’t perfect, but we’re gaining some ground. So, let’s turn our attention now to cleaning up SpeculativeDirectory. Notice that we have a bunch of method pairs: GetDirectory/GetFile, Add(Directory)/Add(string), Delete(Directory)/Delete(string). This kind of duplication is a code smell — we’re begging for polymorphism here.

Notice that we are performing operations routinely on SpeculativeDirectory and performing the same operations on the string representing a file. It is worth noting that if we had a structure where file and directory inherited from a common base or implemented a common interface, we could perform operations on them just once. And, as it turns out, this is the crux of the composite pattern.

Let’s see how that looks. First, we’ll define a SpeculativeFile object:

This is pretty simple and straightforward. The file class knows how to print itself and how to create itself on disk, and it knows that it has a name. Now our task is to have a common inheritance model for file and directory. We’ll go with an abstract base class since they are going to have common implementations and file won’t have an implementation, per se, for add and delete. Here is the common base:

A few things to note here. First of all, our recursive Print() and CreateOnDisk() methods are divided into two methods each, one public, one private. This continues to allow recursive calls, but without awkwardly forcing the user to pass in zero or empty or whatever for path/depth. Notice also that common concerns for the two different types of nodes (file and directory) are now here, some stubbed as do-nothing virtuals and others implemented. The reason for this is conformance to the pattern: while files and directories share some overlap, some operations are clearly not meaningful for both (particularly adding/deleting and anything else regarding children). So, you do tend to wind up with the leaves (SpeculativeFile) ignoring inherited functionality, but this is generally a small price to pay for avoiding duplication and gaining the ability to recurse to n levels.

With this base class, we have pulled a good bit of functionality out of the file class, which is now this:
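Since the listings themselves are missing, here is a hedged Python sketch of the idea: a common base with a public entry point that hides the depth argument, do-nothing defaults for child management, and a file class that ends up nearly empty. SpeculativeComponent is a hypothetical name for the base; the post does not name it in the surviving text.

```python
from abc import ABC


class SpeculativeComponent(ABC):
    """Common base for files and directories (hypothetical name)."""

    def __init__(self, name):
        self.name = name

    # Public entry points: callers never have to pass a depth.
    def structure_lines(self):
        return self._structure_lines(0)

    def print_structure(self):
        print("\n".join(self.structure_lines()))

    # Internal recursive worker; containers would override this.
    def _structure_lines(self, depth):
        return ["  " * depth + self.name]

    # Child management: do-nothing defaults that leaves simply inherit.
    def add(self, child):
        pass

    def delete(self, child):
        pass


class SpeculativeFile(SpeculativeComponent):
    """Leaf: everything it needs is inherited from the base."""


notes = SpeculativeFile("notes.txt")
notes.add(SpeculativeFile("ignored.txt"))   # silently does nothing on a leaf
notes.print_structure()
```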

Pretty simple. With this new base class, here is the new SpeculativeDirectory class:

Wow. A lot more focused and easy to reason about, huh? And, finally, here is the new API:

Even the API has improved since our start. We’re no longer creating this unnatural “structure” object. Now, we just create a root directory and add things to it with simple API calls in kind of a fluent interface.
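Putting the pieces together, a compact Python sketch of the finished composite might look like the following. It is an approximation under stated assumptions, not the post's own code: SpeculativeComponent is a hypothetical base name, add() returning the receiver is a guess at what "fluent" meant, and the disk step writes into a temporary directory so the example is safe to run.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class SpeculativeComponent(ABC):
    """Component: common base for files and directories."""

    def __init__(self, name):
        self.name = name

    def add(self, child):
        return self   # do-nothing default; only directories hold children

    def delete(self, child):
        return self

    def lines(self, depth=0):
        return ["  " * depth + self.name]

    @abstractmethod
    def create_on_disk(self, parent):
        """Create this node beneath the given parent path."""


class SpeculativeFile(SpeculativeComponent):
    """Leaf: creates an empty file."""

    def create_on_disk(self, parent):
        (Path(parent) / self.name).touch()


class SpeculativeDirectory(SpeculativeComponent):
    """Composite: holds any mix of files and directories."""

    def __init__(self, name):
        super().__init__(name)
        self._children = []

    def add(self, child):
        self._children.append(child)
        return self   # returning self allows chained calls

    def delete(self, child):
        self._children.remove(child)
        return self

    def lines(self, depth=0):
        result = ["  " * depth + self.name + "/"]
        for child in self._children:
            result += child.lines(depth + 1)
        return result

    def create_on_disk(self, parent):
        path = Path(parent) / self.name
        path.mkdir(parents=True, exist_ok=True)
        for child in self._children:
            child.create_on_disk(path)   # file or directory, no check needed


root = SpeculativeDirectory("root")
docs = SpeculativeDirectory("docs").add(SpeculativeFile("notes.txt"))
root.add(SpeculativeFile("readme.txt")).add(docs)
print("\n".join(root.lines()))

with tempfile.TemporaryDirectory() as scratch:
    root.create_on_disk(scratch)
    created = sorted(
        p.relative_to(scratch).as_posix() for p in Path(scratch).rglob("*")
    )
```

The directory never asks what kind of child it is holding, which is why arbitrary depth comes for free.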

Now, bear in mind that this is not as robust as it could be, but that’s what you’ll do in sprint 3, since your sprint 2 implemented sub-directories for N depths of recursion and not just one. 🙂

A More Official Explanation

According to dofactory, the Composite Pattern’s purpose is to:

Compose objects into tree structures to represent part-whole hierarchies. Composite lets clients treat individual objects and compositions of objects uniformly.

What we’ve accomplished in rescuing the speculative directory creation app is to allow the main program to perform operations on nodes in a directory tree without caring whether they are files or directories (with the exception of actually creating them). This is most evident in the printing and writing to disk. Whether we created a single file or an entire directory hierarchy, we would treat it the same for creating on disk and for printing.

The elegant concept here is that we can build arbitrarily large structures with arbitrary conceptual tree structures and do things to them uniformly. This is important because it allows the encapsulation of tree node behaviors within the objects themselves. There is no master object like the original DirectoryStructure that has to walk the tree, deciding how to treat each element. Any given node in the tree knows how to treat itself and its sub-elements.

Other Quick Examples

Other places where one might use the composite pattern include:

  1. GUI composition where GUI widgets can be actual widgets or widget containers (Java Swing, WPF XAML, etc.).
  2. Complex Chain of Responsibility structures where some nodes handle events and others simply figure out who to hand them over to.
  3. A menu structure where nodes can either be actions or sub-menus.
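To make the menu case concrete, here is a small Python sketch. All class and method names are hypothetical. Actions are leaves, sub-menus are composites, and the client renders or counts either one the same way.

```python
class MenuNode:
    """Component: anything that can appear in a menu."""

    def __init__(self, label):
        self.label = label

    def render(self, depth=0):
        return ["  " * depth + self.label]

    def count_actions(self):
        return 1


class MenuAction(MenuNode):
    """Leaf: a clickable entry; the base behavior is sufficient."""


class SubMenu(MenuNode):
    """Composite: a menu entry that contains further entries."""

    def __init__(self, label, *items):
        super().__init__(label)
        self.items = list(items)

    def render(self, depth=0):
        lines = ["  " * depth + self.label + " >"]
        for item in self.items:
            lines += item.render(depth + 1)
        return lines

    def count_actions(self):
        return sum(item.count_actions() for item in self.items)


menu = SubMenu(
    "File",
    MenuAction("New"),
    SubMenu("Recent", MenuAction("a.txt"), MenuAction("b.txt")),
    MenuAction("Exit"),
)
print("\n".join(menu.render()))
```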

A Good Fit – When to Use

The pattern is useful when (1) you want to represent an object hierarchy and (2) there is some context in which you want to be able to ignore differences between the objects and treat them uniformly as a client. This is true in our example here in the context of printing the structure and writing it to disk. The client doesn’t care whether something is a file or a directory – it just wants to be able to iterate over the whole tree performing some operation.

Generally speaking, this is good to use anytime you find yourself looking at a conceptual tree-like structure and iterating over the whole thing in the context of a control flow statement (if or case). In our example, this was achieved indirectly by different method overloads, but the result in the end would have been the same if we had been looking at a single method with a bit of logic saying “if node is a file, do x, otherwise, do y”.
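As a small illustration of that equivalence, the Python sketch below (hypothetical names) walks the same tree twice: once with an explicit "if node is a file" check, and once with each node type answering for itself.

```python
# Conditional style: the walker must know about every kind of node.
def describe_conditionally(node, depth=0):
    indent = "  " * depth
    if node["kind"] == "file":
        return [indent + node["name"]]
    lines = [indent + node["name"] + "/"]
    for child in node["children"]:
        lines += describe_conditionally(child, depth + 1)
    return lines


# Composite style: the branching disappears into the node classes.
class File:
    def __init__(self, name):
        self.name = name

    def describe(self, depth=0):
        return ["  " * depth + self.name]


class Directory:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def describe(self, depth=0):
        lines = ["  " * depth + self.name + "/"]
        for child in self.children:
            lines += child.describe(depth + 1)
        return lines


as_dicts = {
    "kind": "directory",
    "name": "root",
    "children": [{"kind": "file", "name": "a.txt"}],
}
as_objects = Directory("root", [File("a.txt")])
print(describe_conditionally(as_dicts) == as_objects.describe())
```

Adding a third node kind means editing the conditional walker, whereas the composite version only needs one more class with its own describe().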

Square Peg, Round Hole – When Not to Use

There are some subtle considerations here. You don’t want to use this if all of your nodes can be the same type of object, such as the case of some simple tree structure. If you were, for example, creating a sorted tree for fast lookup, where each node had some children and a payload, this would not be an appropriate use of composite.

Another time you wouldn’t want to use this is if a tree structure is not appropriate for representing what you want to do. If our example had not had any concept of recursion and were only for representing a single directory, this would not have been appropriate.

So What? Why is this Better?

The code cleanup here speaks for itself. We were able to eliminate a bunch of method overloads (or conditional branching if we had gone that way), making the code more maintainable. In addition, it allows elimination of a structure that rots as it grows: right out of the box, the composite pattern with its tree structure allows handling of arbitrarily deep and wide tree structures. And, finally, it allows clients to walk the tree structure without concerning themselves with what kind of node they are processing or how to navigate to the next node.