Managing data in a large, layered application

The API we are building consists of many different, mostly independent (decoupled) layers: at least seven before a request reaches the database. One of the few things every layer needs is the data being retrieved from or sent to the app.

Note: this article was originally published on the Exonet Techblog and is written from that perspective.

That data comes in as a JSON object (which is converted to a PHP array), is parsed and is written to the SQL database. When retrieving data, a query fetches the requested records from the database and returns, by default, an Eloquent collection (because we’re using Laravel). So we’ve got at least two different types of data flowing through all of our app layers, each with its own way of being manipulated, checked and so on. We can of course convert the JSON to a Collection, or the Eloquent collection to a plain array, but the structure is still different: our JSON API isn’t a one-to-one mapping of our database.

Another “problem” is that our database is completely normalised, so there are a lot of relations to other tables and records. When inserting data and its relations into the database, the method executing the insert needs to know the structure of the array, build multiple queries and run the inserts. As we don’t know exactly where our data is coming from (is it sent directly to a resource, as a relationship on another resource, in a batch, or by a CLI tool?), we would need to write multiple insert methods to handle all cases. Getting the data works pretty much the same way. A “main” record may consist of nothing more than IDs referring to other tables; only all of that referenced data combined gives the data that was requested.

Our solution: DataObjects

Our solution is a pretty simple one: make sure that you know how the data is structured, regardless of whether it is retrieved from the database or sent by a POST request. Our first draft of DataObjects was straightforward; we created a class with private properties and added getters/setters to manage them:

<?php

namespace DataObjects;

class CustomerDataObject extends BaseDataObject {
    private $name;
    private $emailAddress;
    private $addressId;

    public function getName()
    {
        return $this->name;
    }

    public function setName($name)
    {
        $this->name = $name;
    }

    public function getEmailAddress()
    {
        return $this->emailAddress;
    }

    public function setEmailAddress($emailAddress)
    {
        $this->emailAddress = $emailAddress;
    }

    public function getAddressId()
    {
        return $this->addressId;
    }

    public function setAddressId(int $addressId)
    {
        $this->addressId = $addressId;
    }
}

As you can guess, this class is a representation of our database table schema. We’ve created a BaseDataObject (which is extended by all of our ‘real’ DataObjects) that contains shared methods, for example a method that accepts an array and maps the data to a DataObject. A similar method exists for Eloquent collections. Because the DataObjects follow the database table schema, mapping the data back to the database is straightforward, and we are forced to use normalised data throughout our app, even when creating a new DataObject based on a POST.
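A minimal sketch of what such an array-mapping method could look like. The method name fromArray() and the convention of turning snake_case keys into setter names are assumptions made for illustration, not the actual implementation:

```php
<?php

// Hypothetical sketch: fromArray() and the key-to-setter convention
// are assumptions, not the article's real BaseDataObject.
abstract class BaseDataObject
{
    /**
     * Create a DataObject and fill it by calling the matching setter
     * for every array key (e.g. 'email_address' => setEmailAddress()).
     */
    public static function fromArray(array $data)
    {
        $object = new static();

        foreach ($data as $key => $value) {
            // Convert a snake_case key to a StudlyCase setter name.
            $setter = 'set' . str_replace('_', '', ucwords($key, '_'));

            // Keys without a matching setter are skipped in this sketch.
            if (method_exists($object, $setter)) {
                $object->$setter($value);
            }
        }

        return $object;
    }
}

class CustomerDataObject extends BaseDataObject
{
    private $name;
    private $emailAddress;

    public function getName() { return $this->name; }
    public function setName($name) { $this->name = $name; }
    public function getEmailAddress() { return $this->emailAddress; }
    public function setEmailAddress($emailAddress) { $this->emailAddress = $emailAddress; }
}

// Usage: a decoded POST body becomes a typed object.
$customer = CustomerDataObject::fromArray([
    'name' => 'Jane Doe',
    'email_address' => 'jane@example.com',
    'unknown_field' => 'ignored',
]);
```

Whether unknown keys should be skipped, as here, or rejected with an exception is a design choice; rejecting them catches typos in incoming data earlier.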

We’ve decided to write out all getters and setters instead of using magic methods, for various reasons. One of them is that our IDE now knows which methods are available for autocompletion. When accessing the database layer, the query response is converted to an instance of the correct DataObject and returned to the method that invoked the database layer. When receiving a POST with data, we convert it to the correct DataObject as early as possible in the application flow, before passing it further into our app.

Another advantage of using DataObjects is that all our other layers and classes can now use type declarations. No more vague arrays, but a plain and clear public function myMethod(\DataObjects\CustomerDataObject $customer) {}. When writing code in myMethod, code completion still works and you don’t have to look inside the original class to know that you can use $customer->getName();. Also, if additional actions need to be performed before ‘storing’ a value, you can add them to the setter:

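<?php

// Lives inside a DataObject class; the file needs `use Carbon\Carbon;` at the top.
public function setUpdatedAt($updatedAt)
{
    // Store the value as a Carbon instance so callers get its date helpers.
    $this->updatedAt = new Carbon($updatedAt);
}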

This will store the ‘updated at’ value as a Carbon object in the DataObject. $dataObject->getUpdatedAt()->diffInDays() is now possible, because it returns the Carbon instance.

Relations

As mentioned before, data can have relations to other data, or in this case, other DataObjects. In our example, a customer has a relation to an address. Eloquent provides a nice way to automatically retrieve the customer’s address as a relation, along with its data. To make our DataObjects aware of relations, we gave every DataObject a property containing an array whose keys are the relation names (each of which corresponds to a DataObject!) and whose values default to null:

<?php

protected $relations = [
    'address' => null,
];

When retrieving the data from the database and passing it to the ‘Eloquent-to-DataObject’ method, that method checks whether the ‘address’ relation was also retrieved (the model uses the same relation name). If it was, the method creates an ‘AddressDataObject’ with the correct data and places it in $relations['address']. Our BaseDataObject has several methods such as hasRelation({name}), getRelation({name}), updateRelationValue({relationName}, {fieldName}, {value}) etc.

There are a lot of cases where you want easier access to relation data (especially with 1-1 relations). If, for example, you want to display the customer’s address, you have to do this:
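The relation helpers could be sketched roughly as follows. Only the method names hasRelation(), getRelation() and updateRelationValue() come from the text above; the bodies, the setRelation() helper and the exceptions are assumptions for illustration:

```php
<?php

// Hypothetical sketch of the relation helpers on the base class.
abstract class BaseDataObject
{
    /** Relation name => related DataObject, or null when not loaded. */
    protected $relations = [];

    // Assumed helper used by the mapping code to fill a relation.
    public function setRelation(string $name, BaseDataObject $dataObject): void
    {
        if (!array_key_exists($name, $this->relations)) {
            throw new InvalidArgumentException("Unknown relation: {$name}");
        }
        $this->relations[$name] = $dataObject;
    }

    public function hasRelation(string $name): bool
    {
        // True only when the relation is defined and has been loaded.
        return isset($this->relations[$name]);
    }

    public function getRelation(string $name): ?BaseDataObject
    {
        return $this->relations[$name] ?? null;
    }

    public function updateRelationValue(string $relationName, string $fieldName, $value): void
    {
        if (!$this->hasRelation($relationName)) {
            throw new LogicException("Relation not loaded: {$relationName}");
        }
        // 'city' becomes setCity() on the related DataObject.
        $setter = 'set' . ucfirst($fieldName);
        $this->relations[$relationName]->$setter($value);
    }
}

class AddressDataObject extends BaseDataObject
{
    private $city;

    public function getCity() { return $this->city; }
    public function setCity($city) { $this->city = $city; }
}

class CustomerDataObject extends BaseDataObject
{
    protected $relations = ['address' => null];
}

// Usage: attach an address, then update it through the customer.
$address = new AddressDataObject();
$address->setCity('Amsterdam');

$customer = new CustomerDataObject();
$customer->setRelation('address', $address);
$customer->updateRelationValue('address', 'city', 'London');
```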

<?php

echo $customerDataObject->getName().'<br>';
echo $customerDataObject->getRelation('address')->getStreet().'<br>';
echo $customerDataObject->getRelation('address')->getCity().'<br>';
// "country" is a relation in "address", which is also retrieved.
echo $customerDataObject->getRelation('address')->getRelation('country')->getName().'<br>';

// To update the city you would do something like this:
$customerDataObject->updateRelationValue('address', 'city', 'London'); // The 'city' property in AddressDataObject is now set to London.

As you can see, this is too verbose to stay nice and clean, so we’ve implemented something we like to call “MethodMapping”:

<?php

protected $methodMapping = [
    'getStreet' => 'address.getStreet',
    'getCity' => 'address.getCity',
    'setCity' => 'address.setCity',

    'getCountry' => 'address.country.getName',
];

When this code is added to CustomerDataObject, displaying the user’s address will look as follows:

<?php

echo $customerDataObject->getName().'<br>';
echo $customerDataObject->getStreet().'<br>';
echo $customerDataObject->getCity().'<br>';
echo $customerDataObject->getCountry().'<br>';

// And to update the city you would do something like this:
$customerDataObject->setCity('London'); // The 'city' property in AddressDataObject is now set to London.

Notice that the method name getCountry deviates from the others. Neither a customer nor an address has a country property. An address has a relation with country, which has a name. In the first example, we accessed getName() on the country DataObject. But the ‘CustomerDataObject’ already has a getName() method of its own, so the two would collide. By using getCountry as the array key and referencing the getName() method in the address.country relation, the CustomerDataObject maps getCountry to the correct getName() in the country DataObject, through address.
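The article does not show how the mapping is resolved. One plausible way, sketched here purely as an assumption, is a __call() fallback in the base class that walks the dotted path through the relations and forwards the call. The setRelation() helper is also hypothetical:

```php
<?php

// Hypothetical sketch: resolving $methodMapping through __call() is an
// assumption, not necessarily how the real BaseDataObject works.
abstract class BaseDataObject
{
    protected $relations = [];
    protected $methodMapping = [];

    public function getRelation(string $name): ?BaseDataObject
    {
        return $this->relations[$name] ?? null;
    }

    // Assumed helper to attach a related DataObject.
    public function setRelation(string $name, BaseDataObject $dataObject): void
    {
        $this->relations[$name] = $dataObject;
    }

    // Only reached for methods that are not explicitly defined.
    public function __call($method, array $arguments)
    {
        if (!isset($this->methodMapping[$method])) {
            throw new BadMethodCallException("Unknown method: {$method}");
        }

        // 'address.country.getName' => relations [address, country], method getName.
        $path = explode('.', $this->methodMapping[$method]);
        $targetMethod = array_pop($path);

        $target = $this;
        foreach ($path as $relationName) {
            $target = $target->getRelation($relationName);
            if ($target === null) {
                throw new LogicException("Relation not loaded: {$relationName}");
            }
        }

        return $target->$targetMethod(...$arguments);
    }
}

class CountryDataObject extends BaseDataObject
{
    private $name;

    public function getName() { return $this->name; }
    public function setName($name) { $this->name = $name; }
}

class AddressDataObject extends BaseDataObject
{
    protected $relations = ['country' => null];
    private $city;

    public function getCity() { return $this->city; }
    public function setCity($city) { $this->city = $city; }
}

class CustomerDataObject extends BaseDataObject
{
    protected $relations = ['address' => null];
    protected $methodMapping = [
        'getCity' => 'address.getCity',
        'setCity' => 'address.setCity',
        'getCountry' => 'address.country.getName',
    ];
    private $name;

    public function getName() { return $this->name; }
    public function setName($name) { $this->name = $name; }
}

$country = new CountryDataObject();
$country->setName('United Kingdom');

$address = new AddressDataObject();
$address->setCity('Leeds');
$address->setRelation('country', $country);

$customer = new CustomerDataObject();
$customer->setName('Jane Doe');
$customer->setRelation('address', $address);

// Forwarded to AddressDataObject::setCity() via the mapping.
$customer->setCity('London');
```

Because getName() is a real method on CustomerDataObject, it never reaches __call(); only the mapped names do. The trade-off of this approach is that mapped methods are invisible to the IDE unless they are documented, for example with @method annotations.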

Validation

The BaseDataObject, which is extended by all DataObjects, has two validation methods: canUpdate() and canInsert(). When one of these methods is called on a DataObject, a special ‘DataObjectValidation’ class is called (in the above example the class will be \DataObject\Validation\CustomerValidator). This class contains rules that are accepted by our main validator, and can be called at any time to validate the DataObject. We do this, for example, just before inserting the DataObject into the database. Also, when a new DataObject is created from a POST request and no other layer in the app needs to set additional data, the validation can already be done in the controller, returning errors before the data goes through the whole app. The validator (and its rules) also understands defined relationships, so when adding a customer we can decide that the ‘address’ relation is required and that address.street has to be 3 characters or more.
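To make the idea concrete, here is a heavily simplified sketch of canInsert() delegating to a validator class. The rules are hard-coded instead of being passed to a framework validator, canInsert() sits directly on the customer class instead of on a base class, and the ‘address’ relation is reduced to a plain property; all of that is an assumption for illustration:

```php
<?php

// Hypothetical sketch: class shapes, checks and messages are assumptions.
class AddressDataObject
{
    private $street;

    public function getStreet() { return $this->street; }
    public function setStreet($street) { $this->street = $street; }
}

class CustomerDataObject
{
    private $name;
    private $address; // stands in for the 'address' relation

    public function getName() { return $this->name; }
    public function setName($name) { $this->name = $name; }
    public function getAddress(): ?AddressDataObject { return $this->address; }
    public function setAddress(AddressDataObject $address) { $this->address = $address; }

    public function canInsert(): bool
    {
        // Valid when the validator reports no errors.
        return (new CustomerValidator())->errors($this) === [];
    }
}

class CustomerValidator
{
    /** Return a list of error messages; an empty list means valid. */
    public function errors(CustomerDataObject $customer): array
    {
        $errors = [];

        if ((string) $customer->getName() === '') {
            $errors[] = 'name is required';
        }

        $address = $customer->getAddress();
        if ($address === null) {
            // The relation itself is required.
            $errors[] = 'address is required';
        } elseif (strlen((string) $address->getStreet()) < 3) {
            // Rule on a field of the related DataObject.
            $errors[] = 'address.street must be at least 3 characters';
        }

        return $errors;
    }
}

// Usage: a controller could reject the request before it travels further.
$customer = new CustomerDataObject();
$customer->setName('Jane Doe');
$errors = (new CustomerValidator())->errors($customer);
```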

Closing notes…

Managing data in your app using DataObjects is simple and straightforward. DataObjects are completely independent of the rest of your app, so you can create them anytime and anywhere. You also know what to expect in your classes and which fields are available. No longer will you need to switch back and forth to see what the exact name of an array key was, or what a column in the database was called. As you’ve probably guessed, one DataObject represents one record. When retrieving multiple records, we use Collections to manage them. A collection can consist of multiple DataObjects (of the same type).
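The “same type” constraint on a collection could be enforced with something like the sketch below. DataObjectCollection is a hypothetical, framework-free stand-in written for illustration; it is not the Collection class the app actually uses:

```php
<?php

// Hypothetical sketch of a collection that only accepts one DataObject type.
final class DataObjectCollection implements IteratorAggregate, Countable
{
    private $type;
    private $items = [];

    public function __construct(string $type, array $items = [])
    {
        $this->type = $type;
        foreach ($items as $item) {
            $this->add($item);
        }
    }

    public function add($item): void
    {
        $type = $this->type;
        // Reject anything that is not an instance of the configured class.
        if (!$item instanceof $type) {
            throw new InvalidArgumentException("Expected an instance of {$type}");
        }
        $this->items[] = $item;
    }

    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->items);
    }

    public function count(): int
    {
        return count($this->items);
    }
}

// Minimal stand-ins for the real DataObjects.
class CustomerDataObject
{
    private $name;

    public function __construct($name) { $this->name = $name; }
    public function getName() { return $this->name; }
}

class AddressDataObject
{
}

$customers = new DataObjectCollection(CustomerDataObject::class, [
    new CustomerDataObject('Jane Doe'),
    new CustomerDataObject('John Doe'),
]);
```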
