Composite
Quick Information/Overview
Pattern Type | Structural |
Applicable Language/Framework | Agnostic OOP |
Pattern Source | Gang of Four |
Difficulty | Easy |
Up Front Definitions
- Component: This is the abstract object that represents any and all members of the pattern
- Composite: Derives from component and provides the definition for the class that contains other components
- Leaf: A concrete node that encapsulates data/behavior and is not a container for
other components. - Tree: For our context here, tree describes the structure of all of the components.
The Problem
Let’s say that you get a call in to write a little console application that allows users to make directories, create empty files, and delete the same. The idea is that you can create the whole structure in memory and then dump it to disk in one fell swoop. Marketing thinks this will be a hot seller because it will allow users to avoid the annoyance of disk I/O overhead when planning out a folder structure (you might want to dust off your resume, by the way).
Simple enough, you say. For the first project spring, you’re not going to worry about sub-directories – you’re just going to create something that will handle files and some parent directory.
So, you create the following class (you probably create it with actual input validation, but I’m trying to stay focused on the design pattern):
public class DirectoryStructure
{
private const string DefaultDirectory = ".";
private readonly List _filenames = new List();
private string _directory = DefaultDirectory;
public string RootDirectory { get { return _directory; } set { _directory = string.IsNullOrEmpty(value) ? DefaultDirectory : value; } }
public void AddFile(string filename)
{
_filenames.Add(filename);
_filenames.Sort();
}
public void DeleteFile(string filename)
{
if (_filenames.Contains(filename))
{
_filenames.Remove(filename);
}
}
public void PrintStructure()
{
Console.WriteLine("Starting in " + RootDirectory);
_filenames.ForEach(filename => Console.WriteLine(filename));
}
public void CreateOnDisk()
{
if (!Directory.Exists(RootDirectory))
{
Directory.CreateDirectory(RootDirectory);
}
_filenames.ForEach(filename => File.Create(filename));
}
}
Client code is as follows:
static void Main(string[] args)
{
var myStructure = new DirectoryStructure();
myStructure.AddFile("file1.txt");
myStructure.AddFile("file2.txt");
myStructure.AddFile("file3.txt");
myStructure.DeleteFile("file2.txt");
myStructure.PrintStructure();
Console.Read();
myStructure.CreateOnDisk();
}
That’s all well and good, so you ship and start work on sprint number 2, which includes the story that users want to be able to create one level of sub-directories. You do a bit of refactoring, deciding to make root a constructor parameter instead of a settable property, and then get to work on this new story.
You wind up with the following:
public class DirectoryStructure
{
private const string DefaultDirectory = ".";
private readonly List _filenames = new List();
private readonly List _directories = new List();
private readonly Dictionary> _subDirectoryFilenames = new Dictionary>();
private readonly string _root;
public DirectoryStructure(string root = null)
{
_root = string.IsNullOrEmpty(root) ? DefaultDirectory : root;
}
public void AddFile(string filename)
{
_filenames.Add(filename);
_filenames.Sort();
}
public void AddFile(string subDirectory, string filename)
{
if (!_directories.Contains(subDirectory))
{
AddDirectory(subDirectory);
}
_subDirectoryFilenames[subDirectory].Add(filename);
}
public void AddDirectory(string directoryName)
{
_directories.Add(directoryName);
_subDirectoryFilenames[directoryName] = new List();
}
public void DeleteDirectory(string directoryName)
{
if (_directories.Contains(directoryName))
{
_directories.Remove(directoryName);
_subDirectoryFilenames[directoryName] = null;
}
}
public void DeleteFile(string filename)
{
if (_filenames.Contains(filename))
{
_filenames.Remove(filename);
}
}
public void DeleteFile(string directoryName, string filename)
{
if (_directories.Contains(directoryName) && _subDirectoryFilenames[directoryName].Contains(filename))
{
_subDirectoryFilenames[directoryName].Remove(filename);
}
}
public void PrintStructure()
{
Console.WriteLine("Starting in " + _root);
foreach (var myDir in _directories)
{
Console.WriteLine(myDir);
_subDirectoryFilenames[myDir].ForEach(filename => Console.WriteLine("\t" + filename));
}
_filenames.ForEach(filename => Console.WriteLine(filename));
}
public void CreateOnDisk()
{
if (!Directory.Exists(_root))
{
Directory.CreateDirectory(_root);
}
foreach (var myDir in _directories)
{
Directory.CreateDirectory(Path.Combine(_root, myDir));
_subDirectoryFilenames[myDir].ForEach(filename => File.Create(Path.Combine(myDir, filename)));
}
_filenames.ForEach(filename => File.Create(filename));
}
}
and client code:
static void Main(string[] args)
{
var myStructure = new DirectoryStructure();
myStructure.AddFile("file1.txt");
myStructure.AddDirectory("firstdir");
myStructure.AddFile("firstdir", "hoowa");
myStructure.AddDirectory("seconddir");
myStructure.AddFile("seconddir", "hoowa");
myStructure.DeleteDirectory("seconddir");
myStructure.AddFile("file2.txt");
myStructure.AddFile("file3.txt");
myStructure.DeleteFile("file2.txt");
myStructure.PrintStructure();
Console.Read();
myStructure.CreateOnDisk();
}
Yikes. That’s starting to smell. You’ve had to add overloads for adding file and deleting file, add methods for add/delete directory, and append logic to print and create. Basically, you’ve had to either touch or overload every method in the class. Generally, that’s a surefire sign that you’re doin’ it wrong. But, no time for that now because here comes the third sprint. This time, the business wants two levels of nesting. So, you get started and you see just how ugly things are going to get. I won’t provide all of your code here so that the blog can keep a PG rating, but here’s the first awful thing that you had to do:
private readonly List _filenames = new List();
private readonly List _directories = new List();
private readonly Dictionary> _subDirectoryFilenames = new Dictionary>();
private readonly Dictionary> _subDirectoryNames;
private readonly Dictionary>> _subSubDirectoryFilenames = new Dictionary>>();
You also had to modify or overload every method yet again, bringing the method total to 12 and the complexity of each method to a larger figure. You’re pretty sure you can’t keep this up for an arbitrary number of sprints, so you send out your now-dusted off resume and foist this stinker on the hapless person replacing you.
So, What to Do?
What went wrong here is relatively easy to trace. Let’s backtrack to the start of the second sprint when we needed to support sub-directories. As we’ve seen in some previous posts in this series, the first foray at implementing the second sprint gives off the code smell, primitive obsession. This is a code smell wherein bunches of primitive types (string, int, etc) are used in ad-hoc fashion to operate as some kind of ad-hoc class.
In this case, the smell is coming from the series of lists and dictionaries centering around strings to represent file and directory names. As the sprints go on, it’s time to recognize that there is a need for at least one class here, so let’s create it and call it “SpeculativeDirectory” (so as not to confuse it with the C# Directory class).
public class SpeculativeDirectory
{
private const string DefaultDirectory = ".";
private readonly HashSet _subDirectories = new HashSet();
private readonly HashSet _files = new HashSet();
private readonly string _name = string.Empty;
public string Name { get { return _name; } }
public SpeculativeDirectory(string name)
{
_name = string.IsNullOrEmpty(name) ? DefaultDirectory : name;
}
public SpeculativeDirectory GetDirectory(string directoryName)
{
return _subDirectories.FirstOrDefault(dir => dir.Name == directoryName);
}
public string GetFile(string filename)
{
return _files.FirstOrDefault(file => file == filename);
}
public void Add(string file)
{
if(!string.IsNullOrEmpty(file))
_files.Add(file);
}
public void Add(SpeculativeDirectory directory)
{
if (directory != null && !string.IsNullOrEmpty(directory.Name))
{
_subDirectories.Add(directory);
}
}
public void Delete(string file)
{
_files.Remove(file);
}
public void Delete(SpeculativeDirectory directory)
{
_subDirectories.Remove(directory);
}
public void PrintStructure(int depth)
{
string myTabs = new string(Enumerable.Repeat('\t', depth).ToArray());
Console.WriteLine(myTabs + Name);
foreach (var myDir in _subDirectories)
{
myDir.PrintStructure(depth + 1);
}
foreach (var myFile in _files)
{
Console.WriteLine(myTabs + "\t" + myFile);
}
}
public void CreateOnDisk(string path)
{
string myPath = Path.Combine(path, Name);
if (!Directory.Exists(myPath))
{
Directory.CreateDirectory(myPath);
}
_files.ToList().ForEach(file => File.Create(Path.Combine(myPath, file)));
_subDirectories.ToList().ForEach(dir => dir.CreateOnDisk(myPath));
}
}
And, the DirectoryStructure class is now:
public class DirectoryStructure
{
private readonly SpeculativeDirectory _root;
public DirectoryStructure(string root = null)
{
_root = new SpeculativeDirectory(root);
}
public void AddFile(string filename)
{
_root.Add(filename);
}
public void AddFile(string directoryName, string filename)
{
var myDirectory = _root.GetDirectory(directoryName);
if (myDirectory != null)
{
myDirectory.Add(filename);
}
}
public void AddDirectory(string directoryName)
{
_root.Add(new SpeculativeDirectory(directoryName));
}
public void DeleteFile(string filename)
{
_root.Delete(filename);
}
public void DeleteFile(string directoryName, string filename)
{
var myDirectory = _root.GetDirectory(directoryName);
if (myDirectory != null)
{
myDirectory.Delete(filename);
}
}
public void DeleteDirectory(string directoryName)
{
_root.Delete(directoryName);
}
public void PrintStructure()
{
_root.PrintStructure(0);
}
public void CreateOnDisk()
{
_root.CreateOnDisk(string.Empty);
}
}
The main program that invokes this class is unchanged. So, now, notice the Structure of SpeculativeDirectory. It contains a collection of strings, representing files, and a collection of SpeculativeDirectory, representing sub-directories. For things like PrintStructure() and CreateOnDisk(), notice that we’re now taking advantage of recursion.
This is extremely important because what we’ve done here is future proofed for sprint 3 much better than before. It’s still going to be ugly and involve more and more overloads, but at least it won’t require defining increasingly nested (and insane) dictionaries in the DirectoryStructure class.
Speaking of DirectoryStructure, does this class serve a purpose anymore? Notice that the answer is “no, not really”. It basically defines a root directory and wraps its operations. So, let’s get rid of that before we do anything else.
To do that, we can just change the client code to the following and delete DirectoryStructure:
static void Main(string[] args)
{
var myDirectory = new SpeculativeDirectory(".");
myDirectory.Add("file1.txt");
myDirectory.Add(new SpeculativeDirectory("firstdir"));
myDirectory.GetDirectory("firstdir").Add("hoowa");
var mySecondDir = new SpeculativeDirectory("seconddir");
myDirectory.Add(mySecondDir);
myDirectory.GetDirectory("seconddir").Add("hoowa");
myDirectory.Delete(mySecondDir);
myDirectory.Add("file2.txt");
myDirectory.Add("file3.txt");
myDirectory.Delete("file2.txt");
myDirectory.PrintStructure(0);
Console.Read();
myDirectory.CreateOnDisk(".");
}
Now, we’re directly using the directory object and we’ve removed a class in addition to cleaning things up. The API still isn’t perfect, but we’re gaining some ground. So, let’s turn our attention now to cleaning up SpeculativeDirectory. Notice that we have a bunch of method pairs: GetDirectory/GetFile, Add(Directory)/Add(string), Delete(Directory)/Delete(string). This kind of duplication is a code smell — we’re begging for polymorphism here.
Notice that we are performing operations routinely on SpeculativeDirectory and performing the same operations on the string representing a file. It is worth noting that if we had a structure where file and directory inherited from a common base or implemented a common interface, we could perform operations on them just once. And, as it turns out, this is the crux of the command pattern.
Let’s see how that looks. First, we’ll define a SpeculativeFile object:
public class SpeculativeFile
{
private readonly string _name;
public string Name { get; }
public SpeculativeFile(string name)
{
_name = name ?? string.Empty;
}
public void Print(int depth)
{
string myTabs = new string(Enumerable.Repeat('\t', depth).ToArray());
Console.WriteLine(myTabs + Name);
}
public void CreateOnDisk(string path)
{
File.Create(Path.Combine(path, _name));
}
}
This is pretty simple and straightforward. The file class knows how to print itself and how to create itself on disk, and it knows that it has a name. Now our task is to have a common inheritance model for file and directory. We’ll go with an abstract base class since they are going to have common implementations and file won’t have an implementation, per se, for add and delete. Here is the common base:
public abstract class SpeculativeComponent
{
private readonly string _name;
public string Name { get { return _name; } }
private readonly HashSet _children = new HashSet();
protected HashSet Children { get { return _children; } }
public SpeculativeComponent(string name)
{
_name = name ?? string.Empty;
}
public virtual SpeculativeComponent GetChild(string name) { return null; }
public virtual void Add(SpeculativeComponent component) { }
public virtual void DeleteByName(string name) { }
public void Print()
{
Print(0);
}
public void CreateOnDisk()
{
CreateOnDisk(Name);
}
protected virtual void Print(int depth)
{
string myTabs = new string(Enumerable.Repeat('\t', depth).ToArray());
Console.WriteLine(myTabs + Name);
foreach (SpeculativeComponent myChild in _children)
{
myChild.Print(depth + 1);
}
}
protected virtual void CreateOnDisk(string path)
{
foreach (var myChild in _children)
{
myChild.CreateOnDisk(Path.Combine(path, Name));
}
}
}
A few things to note here. Fist of all, our recursive Print() and CerateOnDisk() methods are divided into two methods each, one public, one private. This is continue to allow for recursive calls but without awkwardly forcing the user to pass in zero or empty or whatever for path/depth. Notice also that common concerns for the two different types of nodes (file and directory) are now here, some stubbed as do-nothing virtuals and others implemented. The reason for this is conformance to the pattern — while files and directories share some overlap, some operations are clearly not meaningful for both (particularly adding/deleting and anything else regarding children). So, you do tend to wind up with the leaves (SpeculativeFile) ignoring inherited functionality, this is generally a small price to pay for avoiding duplication and the ability to recurse to n levels.
With this base class, we have pulled a good bit of functionality out of the file class, which is now this:
public class SpeculativeFile : SpeculativeComponent
{
public SpeculativeFile(string name) : base(name) {}
protected override void CreateOnDisk(string path)
{
File.Create(Path.Combine(path, Name));
base.CreateOnDisk(path);
}
}
Pretty simple. With this new base class, here is new SpeculativeDirectory class:
public class SpeculativeDirectory : SpeculativeComponent
{
public SpeculativeDirectory(string name) : base(name) { }
public override SpeculativeComponent GetChild(string name)
{
return Children.FirstOrDefault(child => child.Name == name);
}
public override void Add(SpeculativeComponent child)
{
if(child != null)
Children.Add(child);
}
public override void DeleteByName(string name)
{
var myMatchingChild = Children.FirstOrDefault(child => child.Name == name);
if (myMatchingChild != null)
{
Children.Remove(myMatchingChild);
}
}
protected override void CreateOnDisk(string path)
{
string myPath = Path.Combine(path, Name);
if (!Directory.Exists(myPath))
{
Directory.CreateDirectory(myPath);
}
base.CreateOnDisk(path);
}
}
Wow. A lot more focused and easy to reason about, huh? And, finally, here is the new API:
static void Main(string[] args)
{
var myDirectory = new SpeculativeDirectory(".");
myDirectory.Add(new SpeculativeFile("file1.txt"));
myDirectory.Add(new SpeculativeDirectory("firstdir"));
myDirectory.GetChild("firstdir").Add(new SpeculativeFile("hoowa"));
myDirectory.Add(new SpeculativeDirectory("seconddir"));
myDirectory.GetChild("seconddir").Add(new SpeculativeFile("hoowa"));
myDirectory.DeleteByName("seconddir");
myDirectory.Add(new SpeculativeFile("file2.txt"));
myDirectory.Add(new SpeculativeFile("file3.txt"));
myDirectory.DeleteByName("file2.txt");
myDirectory.Print();
Console.Read();
myDirectory.CreateOnDisk();
}
Even the API has improved since our start. We’re no longer creating this unnatural “structure” object. Now, we just create root directory and add things to it with simple API calls in kind of a fluent interface.
Now, bear in mind that this is not as robust as it could be, but that’s what you’ll do in sprint 3, since your sprint 2 implemented sub-directories for N depths of recursion and not just one. 🙂
A More Official Explanation
According to dofactory, the Composite Pattern’s purpose is to:
Compose objects into tree structures to represent part-whole hierarchies. Composite lets clients treat individual objects and compositions of objects uniformly.
What we’ve accomplished in rescuing the speculative directory creation app is to allow the main program to perform operations on nodes in a directory tree without caring whether they are files or directories (with the exception of actually creating them). This is most evident in the printing and writing to disk. Whether we created a single file or an entire directory hierarchy, we would treat it the same for creating on disk and for printing.
The elegant concept here is that we can build arbitrarily large structures with arbitrary conceptual tree structures and do things to them uniformly. This is important because it allows the encapsulation of tree node behaviors within the objects themselves. There is no master object like the original DirectoryStructure that has to walk the tree, deciding how to treat each element. Any given node in the tree knows how to treat itself and its sub-elements.
Other Quick Examples
Other places where one might use the composite pattern include:
- GUI composition where GUI widgets can be actual widgets or widget containers (Swing java, WPF XAML, etc).
- Complex Chain of Responsibility structures where some nodes handle events and others simply figure out who to hand them over to
- A menu structure where nodes can either be actions or sub-menus.
A Good Fit – When to Use
The pattern is useful when (1) you want to represent an object hierarchy and (2) there is some context in which you want to be able to ignore differences between the objects and treat them uniformly as a client. This is true in our example here in the context of printing the structure and writing it to disk. The client doesn’t care whether something is a file or a directory – he just wants to be able to iterate over the whole tree performing some operation.
Generally speaking, this is good to use anytime you find yourself looking at a conceptual tree-like structure and iterating over the whole thing in the context of a control flow statement (if or case). In our example, this was achieved indirectly by different method overloads, but the result in the end would have been the same if we had been looking at a single method with a bit of logic saying “if node is a file, do x, otherwise, do y”.
Square Peg, Round Hole – When Not to Use
There are some subtle considerations here. You don’t want to use this if all of your nodes can be the same type of object, such as the case of some simple tree structure. If you were, for example, creating a sorted tree for fast lookup, where each node had some children and a payload, this would not be an appropriate use of composite.
Another time you wouldn’t want to use this is if a tree structure is not appropriate for representing what you want to do. If our example had not had any concept of recursion and were only for representing a single directory, this would not have been appropriate.
So What? Why is this Better?
The code cleanup here speaks for itself. We were able to eliminate a bunch of method overloads (or conditional branching if we had gone that way), making the code more maintainable. In addition, it allows elimination of a structure that rots as it grows — right out of the box, the composite pattern with its tree structure allows handling of arbitrarily deep and wide tree structures. And, finally, it allows clients to walk the tree structure without concerning themselves with what kind of nodes its processing and how to navigate to the next nodes.