My Octopress Blog

A blogging framework for hackers.

Expandable Fields in REST APIs

To include or not to include? That’s the question.

When creating a RESTful API web service, there always comes a point where you have to decide on how to deal with a related field. You can choose to return the related object nested within the parent, or to only include the reference, by which a client can retrieve it.

In this article I would like to demonstrate an approach that I have started using, that does not rule out either one of the above options, but lets the client decide based on its requirements. I call them expandable fields.

An example

Let’s look at an example first. For my latest project I have been working on an app that displays a list of upcoming meetings. A meeting has several related fields: the owner, a location, and a list of invitations. Here is a sample JSON response, using simple references for the related fields:

[
    {
        "id": 1,
        "start": 1448406000.0,
        "end": 1448407800.0,
        "title": "Drinks",
        "owner": 27,
        "location": 5,
        "invitations": [4, 32]
    },
    ...
]

And here is a sample response, with the referenced objects included:

[
    {
        "id": "1",
        "title": "Drinks",
        "start": 1448406000.0,
        "end": 1448407800.0,
        "owner": {
            "id": 27,
            "first_name": "Lukas",
            "last_name": "Batteau"
        },
        "location": {
            "id": 5,
            "name": "Coffee Bar",
            "address": "39 Street",
            "city": "Utrecht"
        },
        "invitations": [
            {
                "id": 4
                etc...
            }
        ]
    },
    ...
]

Both types of responses have advantages and disadvantages, most of which will be obvious. Including only references is lean and mean, but the client will need extra requests to retrieve the related objects, leading to a lot of chatter. Including the objects makes it easier for the client, but the request will be slower, and can lead to a lot of bandwidth going completely unused.

So what if you need nested objects only sometimes?

Why decide?

It really depends on the client’s needs which of the two options to go for. If the screens are directly tied in with certain endpoints, then the data requirements of those screens should dictate what to retrieve and how much of it. For instance, in the unlikely case that the list only shows the meeting name, all you need is the first option. In the more realistic case that a lot of meeting details are displayed in the list, like showing the number of participants, you will need option two.

In my experience an app will display the same objects in different ways over different screens, depending on the use case for that particular screen.

In any case, it would be nice if we could control the response as required by the client. If the client could specify which related fields should be retrieved and included in the response, that would give us the power to save extra requests, or minimize the response payload, as needed.

Enters the query parameter

One way to control the API response, is to add a query parameter expand to the request. By specifying fields you would like to expand, you can ask the server to retrieve the related fields and include the objects in the response.

For instance, http://api.example.com/meetings?expand=owner,location would give the following response:

[
    {
        "id": "1",
        "title": "Drinks",
        "start": 1448406000.0,
        "end": 1448407800.0,
        "owner": {
            "id": 27,
            "first_name": "Lukas",
            "last_name": "Batteau"
        },
        "location": {
            "id": 5,
            "name": "Coffee Bar",
            "address": "39 Street",
            "city": "Utrecht"
        },
        "invitations": [4, 32]
    },
    ...
]

In the same way, any other related field can be included or not, as requested. By requesting all of them, we end up with option two. By requesting none, we’re back to option one. We’re free to choose any combination in between, meeting the client’s demands, but making sure no resources or bandwidth are used in vain.

We’re not there yet

Of course, this approach has consequences that require some extra thinking.

Maximum depth

Whenever anything’s nested, there can be recursive cycles. Nested objects can include references to the parent object. It is possible that both the parent and the child are to be expanded, leading to infinite recursion. We have to make sure that a maximum depth is respected, below which only expansion of fields is disabled.

Uniform data mapping

So what if the client is using a data mapper, converting the response JSON objects into native class instances that can be used in the client’s code. We don’t want the format of the response objects to structurally differ, depending on whether fields are expanded or not. Having the same field contain an integer representing a reference one way, or a JSON object the other way, will require way too much client side customization.

So instead of returning the reference as an integer, we return an object, just like the expanded version. Only this time the object only contains one field, the ID:

[
    {
        "id": "1",
        "title": "Drinks",
        "start": 1448406000.0,
        "end": 1448407800.0,
        "owner": {
            "id": 27,
            "first_name": "Lukas",
            "last_name": "Batteau"
        },
        "location": {
                "id": 5
        },
        "invitations": [
            {
                "id": 4
            }, 
            {
                "id": 32
            }
        ]
    },
    ...
]

This way the client data mapper will have no problem converting it into a native class instance. All it needs to do is retrieve the object details when they are needed.

Nested writes

If the client can retrieve nested objects, what about creating objects or changing them? The problem with nested writes is that they can be ambiguous.

Let’s take a look at an example, where the client is creating a new meeting by doing a POST of a meeting, with the location data nested in the meeting data. There are two ways for the API to handle the location:

  • If the location does not already exist, the API should create it using the passed data, and set the reference on the meeting object.
  • If the location does already exist, all the API has to do is set the reference. But there’s a catch. What if the location data has changed in the mean time? It could be external data, Shouldn’t it be updated in the process?

Given the above, the API should mimick either a POST or a PUT of the location, depending on whether it exists. Obviously it will need an ID to determine this. So if the included data contains an ID, a PUT is mimicked. If not, a POST.

One advantage of nested writes is that the creation or update of a meeting will happen in one request, making the error handling a lot more manageable for the client.

Field names

By choosing to use a simple comma-separated list of field names, there is a chance of ambiguity, where there are different fields with the same name.

Conclusion

I’ve presented a method to give the client more control of a REST API response by introducing expandable fields. I have implemented an ExpandableFieldSerializer that works together with an ExpandableViewSetMixin for the Django Rest Framework, which I will share on GitHub shortly.