Let’s say you’re building an application with a SQL database stored in the cloud. You’ve written a web API that you can call from your app to interact with the database. As you start to populate your database with test data, you realize your queries aren’t processing as fast as you’d like. You check your database metrics and find that the amount of data transferred out of the database is much higher than you expected. Not only is this slow, it’s going to get expensive! This can be a scary experience, but fear not! There may be multiple problems in play, but we can (hopefully) remedy most of them with data transfer objects (DTOs).
What’s Causing the Problem
If your outgoing data is dramatically higher than you expected, you should stop and consider what exactly your queries are returning. Imagine a database with two tables: Customers and Orders. The corresponding classes look like this (we’ll use C# today):
using System; // needed for DateTime

public class Customer
{
    public Customer() { }
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Address { get; set; }
    public string City { get; set; }
    public string State { get; set; }
    public string ZipCode { get; set; }
    public string Phone { get; set; }
    public string Email { get; set; }
}

public class Order
{
    public Order() { }
    public int ID { get; set; }
    public int CustomerID { get; set; } // foreign key to Customer.ID
    public DateTime Date { get; set; }
    public decimal Cost { get; set; }
}
The Customer class holds several pieces of information about each customer. Whenever a customer places an order, we get a new Order object that has some information about the order. Among the information stored in each Order is a CustomerID foreign key, which we can use to get the related customer information for that order.
After storing some data, we might like to query all of the Orders. Let’s say we want to display them in a table that includes a Customer Name column. Unfortunately, we don’t have the customer’s name stored directly in the Order model. In addition to querying the Orders table, we’ll also have to query the Customers table for the CustomerIDs found on our Orders.
If you’re using an object-relational mapper (ORM) like Entity Framework to execute your database actions, you’ll probably use a navigation property to get the extra Customer information. You just add a new Customer property to the Order model, like this.
public class Order
{
    public Order() { }
    public int ID { get; set; }
    public int CustomerID { get; set; }
    public Customer Customer { get; set; }
    public DateTime Date { get; set; }
    public decimal Cost { get; set; }
}
The Customer property isn’t stored in the database as a field on Order. It’s merely added to the Order object if you choose to include it as part of your query. If you’re using Entity Framework, the code in your web API might look like this.
return await _db.Orders
    .Include(order => order.Customer)
    .AsNoTracking()
    .ToListAsync();
Now if we want to get the customer’s name off of the order, we can easily get it as follows.
string customerName = order.Customer.FirstName + " " + order.Customer.LastName;
That’s all well and good. It works. It’s just really inefficient. We’re returning all sorts of address and contact information for the customer on each Order object that we don’t need. If a customer has multiple orders, we’re returning all of that excess information multiple times!
This example only has one include, but things can get much worse if you layer more includes on top of includes. It’s time to rethink this approach with DTOs.
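To get a feel for the duplication, here is a minimal sketch that serializes a few orders belonging to the same customer. It assumes the API returns JSON via System.Text.Json, which may not match your setup; the models are trimmed copies of the ones above, and the sample customer and the PayloadDemo helper are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Trimmed copies of the article's models, for illustration only.
public class Customer
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Address { get; set; }
    public string Email { get; set; }
}

public class Order
{
    public int ID { get; set; }
    public int CustomerID { get; set; }
    public Customer Customer { get; set; }
    public decimal Cost { get; set; }
}

// Hypothetical helper, not part of the original application.
public static class PayloadDemo
{
    // Builds `count` orders that all belong to one made-up customer.
    public static List<Order> BuildOrders(int count)
    {
        var customer = new Customer
        {
            ID = 1,
            FirstName = "Ada",
            LastName = "Lovelace",
            Address = "12 Example Street",
            Email = "ada@example.com"
        };
        var orders = new List<Order>();
        for (int i = 1; i <= count; i++)
        {
            orders.Add(new Order { ID = i, CustomerID = customer.ID, Customer = customer, Cost = 25m * i });
        }
        return orders;
    }

    // Serializes the orders the way a JSON API might (assumed serializer).
    public static string Serialize(List<Order> orders) => JsonSerializer.Serialize(orders);

    // Counts non-overlapping occurrences of `value` in `text`.
    public static int CountOccurrences(string text, string value)
    {
        int count = 0;
        int index = text.IndexOf(value, StringComparison.Ordinal);
        while (index >= 0)
        {
            count++;
            index = text.IndexOf(value, index + value.Length, StringComparison.Ordinal);
        }
        return count;
    }
}
```

Serializing PayloadDemo.BuildOrders(3) and counting the customer's last name in the result gives three copies, one per order, even though only one customer is involved. Every other Customer field is repeated the same way.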
What’s a DTO?
Data transfer objects, or DTOs for short, are objects that are optimized for sending information efficiently over the wire. Essentially, they cut out extra information. Your web API grabs the regular objects that you know and love out of the database. Before it sends them to your local application, the web API converts the objects to DTOs. Once your local application receives the DTOs, it converts the objects into a suitable format for display within the project’s UI.
This final UI form of the object may resemble the original model you got out of the database before the DTO conversion. It might also have some extra business logic on it. There are some people with very strong opinions about the stylebook surrounding DTO conversions, but there isn’t a definitive answer here. As long as the DTO carries an efficient amount of information and doesn’t have business logic, we’re doing just fine.
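As a rough sketch of that last step, suppose the API sends a flattened order DTO of the kind built in the next section. The OrderViewModel class, its IsLargeOrder rule, and its display properties are hypothetical names invented for this example; the point is only that display logic lives on the UI-side model while the DTO stays a plain bag of properties.

```csharp
using System;
using System.Globalization;

// Shape received from the API (assumed for this sketch): data only, no logic.
public class OrderDTO
{
    public int ID { get; set; }
    public string CustomerName { get; set; }
    public DateTime Date { get; set; }
    public decimal Cost { get; set; }
}

// Hypothetical UI-side model built from the DTO.
public class OrderViewModel
{
    public OrderViewModel(OrderDTO dto)
    {
        ID = dto.ID;
        CustomerName = dto.CustomerName;
        Date = dto.Date;
        Cost = dto.Cost;
    }

    public int ID { get; }
    public string CustomerName { get; }
    public DateTime Date { get; }
    public decimal Cost { get; }

    // Made-up business rule: highlight orders of 100 or more.
    public bool IsLargeOrder => Cost >= 100m;

    // Culture-independent date string for display.
    public string DateLabel => Date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
}
```

Keeping the conversion in the view model's constructor means the DTO itself never needs to know how the UI wants to present it.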
Implementing a DTO Solution
Let’s return to our Customer and Order models. In order to avoid returning an entire Customer object along with each Order object, let’s write an OrderDTO model.
public class OrderDTO
{
    public OrderDTO() { }
    public int ID { get; set; }
    public int CustomerID { get; set; }
    public string CustomerName { get; set; }
    public DateTime Date { get; set; }
    public decimal Cost { get; set; }

    // Flattens an Order into a DTO. Expects model.Customer to be loaded.
    public OrderDTO(Order model)
    {
        ID = model.ID;
        CustomerID = model.CustomerID;
        Date = model.Date;
        Cost = model.Cost;
        CustomerName = model.Customer.FirstName + " " + model.Customer.LastName;
    }
}
Notice that this model only has a string for CustomerName instead of an entire Customer property. We’ve also added a constructor to convert our Orders to OrderDTOs. Let’s see how that works with an Entity Framework call on the web API that returns only orders of $100 or more.
return await _db.Orders
    .Include(order => order.Customer)
    .AsNoTracking()
    .Where(order => order.Cost >= 100)
    .Select(order => new OrderDTO(order))
    .ToListAsync();
We still do the same include as last time, so our initial Order object does have an entire Customer property on it. This version adds the conversion before we return the list, however, which flattens out our Orders to more efficient OrderDTOs. When we receive the OrderDTOs, we can go to town with them however we see fit. If we have a different Order model we want to use for the UI, we just write a constructor in our UI version of the model to convert the DTOs. We can even use the DTOs directly in the UI if we’re not worried about some stylebook vigilantes raising an eyebrow!
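One variation worth considering is to write the projection as an object initializer inside Select instead of calling a constructor. Entity Framework can generally translate that form into SQL that selects only the listed columns, so less data leaves the database and the include is no longer needed, though the exact SQL depends on your EF version and provider. The sketch below is an assumption-laden illustration, not the original API code: OrderQueries and LargeOrders are hypothetical names, the models are trimmed, and it runs against any IQueryable (an in-memory list via AsQueryable works) so that it stays self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed copies of the article's models, for illustration only.
public class Customer
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
}

public class Order
{
    public int ID { get; set; }
    public int CustomerID { get; set; }
    public Customer Customer { get; set; }
    public DateTime Date { get; set; }
    public decimal Cost { get; set; }
}

public class OrderDTO
{
    public int ID { get; set; }
    public int CustomerID { get; set; }
    public string CustomerName { get; set; }
    public DateTime Date { get; set; }
    public decimal Cost { get; set; }
}

// Hypothetical helper; with Entity Framework, `orders` would be _db.Orders.
public static class OrderQueries
{
    public static IQueryable<OrderDTO> LargeOrders(IQueryable<Order> orders, decimal minimumCost)
    {
        return orders
            .Where(order => order.Cost >= minimumCost)
            // Only the members named here are needed to build the result.
            .Select(order => new OrderDTO
            {
                ID = order.ID,
                CustomerID = order.CustomerID,
                CustomerName = order.Customer.FirstName + " " + order.Customer.LastName,
                Date = order.Date,
                Cost = order.Cost
            });
    }
}
```

In the web API this would be called as something like OrderQueries.LargeOrders(_db.Orders.AsNoTracking(), 100) followed by ToListAsync(). If you try it, check the generated SQL in your own setup before relying on the savings.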
Conclusion
It’s easy to get carried away with queries that start compounding more queries and returning gigantic objects. These are the sort of situations where sending DTOs can be literally thousands of times more efficient from a data transfer standpoint than sending the original objects. I’ve primarily focused on flattening data with DTOs, but they’re also used for combining data into a single object that would otherwise require querying your API multiple times, effectively eliminating some roundtrips. Be creative. The fewer calls made and less data transferred, the better your app will perform!
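To illustrate the combining case, here is a minimal sketch of a DTO that merges customer and order data so the client needs one call instead of two. CustomerSummaryDTO and SummaryBuilder are hypothetical names for this example, the models are trimmed, and the sample works on in-memory collections rather than a real database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed copies of the article's models, for illustration only.
public class Customer
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Order
{
    public int ID { get; set; }
    public int CustomerID { get; set; }
    public decimal Cost { get; set; }
}

// Hypothetical DTO combining data that would otherwise take two requests.
public class CustomerSummaryDTO
{
    public int CustomerID { get; set; }
    public string CustomerName { get; set; }
    public int OrderCount { get; set; }
    public decimal TotalSpent { get; set; }
}

public static class SummaryBuilder
{
    // Pairs each customer with their orders and keeps only the totals.
    public static List<CustomerSummaryDTO> Build(IEnumerable<Customer> customers, IEnumerable<Order> orders)
    {
        return customers
            .GroupJoin(
                orders,
                customer => customer.ID,
                order => order.CustomerID,
                (customer, customerOrders) => new CustomerSummaryDTO
                {
                    CustomerID = customer.ID,
                    CustomerName = customer.FirstName + " " + customer.LastName,
                    OrderCount = customerOrders.Count(),
                    TotalSpent = customerOrders.Sum(order => order.Cost)
                })
            .ToList();
    }
}
```

A customer with no orders still appears in the result, with an OrderCount of 0 and a TotalSpent of 0, because GroupJoin keeps every outer element.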