2015-03-29

Advancing Enterprise DDD - The Entity and the Aggregate Root

In this essay of the Advancing Enterprise DDD series, we will leave behind the POJO for a bit, and look at entity aggregates. After the entity, the aggregate is probably the most important building block in Domain Driven Design. Aggregates provide a way to organize your entities into small functional groups, bringing structure to what might otherwise be a knotted jumble of entity classes.

Project development and domain analysis can quickly generate a large number of entity types, and relationships between them. We need organizational tools to manage the complexities of working with so many entity types, or things can quickly get out of hand. DDD provides many tools for managing complexity, such as bounded contexts and context maps. But the first organizational principle we apply to our entities is grouping closely related ones into aggregates. One entity is selected to be the root of the aggregate.

Let’s take a look at an example to see how this works. Suppose we are developing an online storefront, where customers can go to our website, and order items that we have for sale. Our customers are important to us, and we want to be able to uniquely identify them throughout our system, so we give every customer a unique ID. We decide to call our customer ID a URI, to prevent any confusion between it and the database ID. A simple version of our customer might look like this in UML:


Next, we consider the items we have for sale. Like customers, we choose to give each retail item a URI, so we can uniquely identify them throughout the system. And we include a title, a description, and the current price:


Now we proceed to look at our orders. We keep track of some basic information about the order, such as when it was made, the address we are shipping to, the total cost of the items in the order, plus shipping costs and taxes. An order is also associated with a unique customer:



Of course, the order must contain a list of the items ordered. For each order item, we keep track of the quantity ordered, and the price. We note that because the price of the retail item may change over time, the price reflected in the order item may not be the same as the current price of the retail item:
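To make the price point concrete, here is a minimal sketch of an order item that copies the price at the moment it is created. The class and method names (RetailItem, OrderItem, getCurrentPrice, and so on) are assumptions for illustration; the essay itself only shows these entities in UML.

```java
import java.math.BigDecimal;

// Hypothetical sketch; names and fields are assumptions, not the essay's actual model.
final class RetailItem {
    private final String uri;
    private final String title;
    private BigDecimal currentPrice;

    RetailItem(String uri, String title, BigDecimal currentPrice) {
        this.uri = uri;
        this.title = title;
        this.currentPrice = currentPrice;
    }

    String getUri() { return uri; }
    String getTitle() { return title; }
    BigDecimal getCurrentPrice() { return currentPrice; }

    // The retail price may change over time.
    void setCurrentPrice(BigDecimal newPrice) { this.currentPrice = newPrice; }
}

final class OrderItem {
    private final RetailItem retailItem;
    private final int quantity;
    private final BigDecimal price; // snapshot taken when the item is ordered

    OrderItem(RetailItem retailItem, int quantity) {
        this.retailItem = retailItem;
        this.quantity = quantity;
        // Copy the price now, so later changes to the retail item do not affect the order.
        this.price = retailItem.getCurrentPrice();
    }

    RetailItem getRetailItem() { return retailItem; }
    int getQuantity() { return quantity; }
    BigDecimal getPrice() { return price; }
}
```

If the retail item is repriced after the order item is created, getPrice() on the order item still returns the price the customer agreed to.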


Each order contains one or more order items. Although it may not seem necessary, we decide that the order items will have a fixed order, so that when the user views the order at different times, or in different contexts, the order items will not get shuffled around. Putting it all together, we get something like this:


We now have four entities in our class diagram, but not all of them are equal. While a customer, an order, and a retail item can all stand on their own, an order item really does not make much sense outside of the context of an order. An order item is part of an order, and this is loosely described in various contexts as an aggregation, or composition, relationship. In contrast, the relationship between Order and Customer is an association, which is a more basic relationship that does not imply ownership. We indicate an aggregation in UML with a little diamond at the owning end of the relationship:


In Domain Driven Design, we group the Order and Order Item entities into an aggregate, and make the Order the aggregate root. Grouping entities into aggregates helps us limit complexity in several ways. For one, it is a form of encapsulation where the root acts as a single point of reference to the world outside. In our example, entities outside of the order aggregate can reference the Order, but cannot reference an Order Item. This makes sense: Why would anything other than an Order need an association with an Order Item? Limiting the number of associations between entities helps to keep the domain simple, and to prevent problematic cycles between entities.
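One way to express this encapsulation in plain Java is to let only the root create and hand out its parts. The sketch below is an assumption about how that might look, not code from the project described here: the addItem method, the URI strings, and the package-private constructor are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// All names below are hypothetical; the essay shows these entities only in UML.

// Customer is its own aggregate root, so an Order may hold an association to it.
final class Customer {
    private final String uri;

    Customer(String uri) { this.uri = uri; }

    String getUri() { return uri; }
}

// Non-root entity: only meaningful as part of an Order.
final class OrderItem {
    private final String retailItemUri;
    private final int quantity;

    // Package-private on purpose: in a real project, code outside the
    // Order's package would not be able to create an OrderItem directly.
    OrderItem(String retailItemUri, int quantity) {
        if (quantity < 1) {
            throw new IllegalArgumentException("quantity must be at least 1");
        }
        this.retailItemUri = retailItemUri;
        this.quantity = quantity;
    }

    String getRetailItemUri() { return retailItemUri; }
    int getQuantity() { return quantity; }
}

// Aggregate root: the single point of reference for the whole aggregate.
final class Order {
    private final Customer customer;
    // An ArrayList keeps the order items in a fixed, insertion-based order.
    private final List<OrderItem> orderItems = new ArrayList<>();

    Order(Customer customer) { this.customer = customer; }

    // Order items are created through the root, never on their own.
    Order addItem(String retailItemUri, int quantity) {
        orderItems.add(new OrderItem(retailItemUri, quantity));
        return this;
    }

    Customer getCustomer() { return customer; }

    // Read-only view: callers cannot add or remove items behind the root's back.
    List<OrderItem> getOrderItems() {
        return Collections.unmodifiableList(orderItems);
    }
}
```

Nothing here stops another class from storing a reference to an OrderItem it obtained from getOrderItems(), so this is a convention backed by visibility rules rather than a hard guarantee.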

Aggregate roots also serve as hubs for persistence operations. Outside of the persistence layer, we never want to issue explicit create/retrieve/update/delete (CRUD) requests for an Order Item. Instead, we issue CRUD requests for the Order, and the persistence operations for the Order Items are encapsulated within. So there is no need for an OrderItemRepository, and having one would only be confusing and redundant. Defining our aggregate boundaries simplifies our persistence layer API, and provides a clear contract to the caller. For example, when we issue an update request on an Order, we know that the Order Items will also be updated, and the associated Customer and Retail Items will not.
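As a rough illustration of this contract, the sketch below defines a repository for the Order only, backed by an in-memory map instead of a database. The OrderRepository interface, its method names, and the InMemoryOrderRepository class are all assumptions made for the example; they are not taken from JPA or from any particular framework.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified entities for this sketch.
final class OrderItem {
    private final String retailItemUri;
    private final int quantity;

    OrderItem(String retailItemUri, int quantity) {
        this.retailItemUri = retailItemUri;
        this.quantity = quantity;
    }

    String getRetailItemUri() { return retailItemUri; }
    int getQuantity() { return quantity; }
}

final class Order {
    private final String uri;
    private final List<OrderItem> orderItems;

    Order(String uri, List<OrderItem> orderItems) {
        this.uri = uri;
        this.orderItems = new ArrayList<>(orderItems);
    }

    String getUri() { return uri; }

    List<OrderItem> getOrderItems() { return new ArrayList<>(orderItems); }
}

// The only repository for this aggregate. There is no OrderItemRepository.
interface OrderRepository {
    void save(Order order);
    Optional<Order> findByUri(String uri);
    void delete(String uri);
}

// In-memory stand-in for a real persistence layer.
final class InMemoryOrderRepository implements OrderRepository {
    private final Map<String, Order> store = new HashMap<>();

    @Override
    public void save(Order order) {
        // The whole aggregate, order items included, is stored as one unit.
        store.put(order.getUri(), new Order(order.getUri(), order.getOrderItems()));
    }

    @Override
    public Optional<Order> findByUri(String uri) {
        Order stored = store.get(uri);
        if (stored == null) {
            return Optional.empty();
        }
        // Hand back a copy so callers cannot change stored state without calling save.
        return Optional.of(new Order(stored.getUri(), stored.getOrderItems()));
    }

    @Override
    public void delete(String uri) {
        // Removing the root removes its order items along with it.
        store.remove(uri);
    }
}
```

The caller never asks for an order item on its own; it retrieves, saves, or deletes an Order, and the order items come along.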

Explicit entity aggregation helps us provide a clean API from the persistence layer to the services layer. We know exactly when an operation on our domain classes might lead to expensive database operations: precisely when we cross an aggregate boundary.

But the tools we use do not help us make these kinds of distinctions between aggregate roots and other kinds of entities. Using our standard object-oriented tools, we model both associations and aggregations as JavaBean properties, with the same basic getter/setter style. For example, there is nothing to indicate the different kinds of relationships in this stripped-down sample code:

import java.util.ArrayList;
import java.util.List;

public class Order {

    private Customer customer;
    private List<OrderItem> orderItems;

    public Customer getCustomer() {
        return customer;
    }

    public List<OrderItem> getOrderItems() {
        // defensive copy, so callers cannot modify the internal list
        return new ArrayList<>(orderItems);
    }

}

This example is not as clear as it could be, because the Customer relationship is many-to-one, and the Order Item relationship is one-to-many. But if an Order had many Customers, the getCustomers() method would look just like getOrderItems().

JPA provides no concept or representation of an aggregate root. Neither my relational database (RDB) nor JPA does anything to help me prevent associations leading to a non-root entity - from a Customer to an Order Item, for example. Because JPA uses proxy objects to represent entities that have not yet been loaded from the database, it is not clear when we are crossing a persistence boundary and performing a database operation. To know this, we need to dig into the JPA configuration, and without careful repository design, these kinds of persistence concerns can easily leak into the service layer.

I have seen many JPA projects that had a repository class for every entity in the model, even for entities that were clearly not aggregate roots, such as our Order Item above. Of course, the development team can claim most of the responsibility for this. Either we weren’t really doing DDD, or we were doing a poor job at it. But just the same, it would be nice if the tools we used would assist us in our efforts to do DDD. Creating a JPA repository for a non-root entity is so easy - just stick an annotation on an empty-bodied class, and you have a repo with all the basic CRUD operations. JPA could help us here, by creating a separate concept of an aggregate root, and restricting this kind of repository auto-creation to roots.
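Absent that kind of support, we can approximate the restriction ourselves with the type system. The following is only a sketch under assumed names: AggregateRoot is a hypothetical marker interface, and InMemoryRepository is a toy stand-in for a generated repository. Neither is part of JPA.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical marker interface: only aggregate roots implement it.
interface AggregateRoot {
    String getUri();
}

// Non-root entity: deliberately does NOT implement AggregateRoot.
final class OrderItem {
    private final int quantity;

    OrderItem(int quantity) { this.quantity = quantity; }

    int getQuantity() { return quantity; }
}

// Root entity: eligible for a repository.
final class Order implements AggregateRoot {
    private final String uri;

    Order(String uri) { this.uri = uri; }

    @Override
    public String getUri() { return uri; }
}

// The type bound restricts repositories to aggregate roots at compile time.
class InMemoryRepository<T extends AggregateRoot> {
    private final Map<String, T> store = new HashMap<>();

    void save(T root) { store.put(root.getUri(), root); }

    Optional<T> findByUri(String uri) {
        return Optional.ofNullable(store.get(uri));
    }
}

// new InMemoryRepository<Order>() compiles.
// new InMemoryRepository<OrderItem>() is rejected by the compiler,
// because OrderItem is not within the bound "T extends AggregateRoot".
```

This does not make JPA aware of aggregates, but it does turn "a repository for every entity" from an easy default into a compile error.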

It is still up to us to design our own entity classes well using the tools we have. The best thing for us may be to switch toolsets. We could use longevity, a persistence framework for Scala and NoSQL. Unlike JPA, longevity was designed to work well with a DDD mindset from the start, and it clearly differentiates between aggregate roots and non-root entities.

If we are stuck with JPA, then we should only create repository classes for the aggregate roots. But we still need to work to ensure that persistence operations treat each aggregate as a unit. JPA provides us with tools to help accomplish this, such as cascades, and lazy and eager fetch strategies. But choosing the right JPA configuration for our entities and their properties does not mean that the rest is handled for us. In the next essay, we’ll look into some of the complexities of using cascades and fetch strategies to perform persistence operations across aggregates.
