Thoughts on API design in a world of carelessness

Published in category Development. Tags: Development, PHP, API

The company I work for has been providing eCommerce solutions for many years. A few years ago we decided to give up on our own product and become an agency that works with an existing eCommerce application instead. Our own software provided a SOAP API that hadn’t changed for years and had client-side implementations in various ERP systems. When we switched over, we decided to provide a compatibility plugin for the new software that would expose the SOAP facade we had built years ago and translate all requests to the REST API (which we called internally, without going over HTTP again).

Building on the existing APIs is, in my opinion, the only sane way of interacting with another application, contrary to what other people consider to be a REST API (aka “the ultrafantastic and very secure HTTP-request-to-SQL-translator”) - at least if the underlying APIs are designed well…

API design in applications

Sadly, almost any software written today seems to be evidence of the ignorant carelessness within the IT world. Either people don’t know what they’re doing or they don’t care about what they’re doing. This is rather selfish, especially when it comes to publicly exposed APIs: it not only makes your own software worse than it could be, it also degrades the software on the other side of the connection.

If there is one guideline that applies to any API development, it is this: “You don’t break API compatibility in minor releases”. Why not? Well, because a public API is something other parties rely on. In user interface design, your application will continue to work if you rename your submit button from “Register” to “Register now”. But when you rename the field “countryId” to “country”, other companies that connect to your API will probably get into serious trouble.

So what do I consider bad API design? Let me give you an example.

Layer separation

Let’s say you have an application called “MySuperShopSoftware” that exposes a REST API for external applications to connect to and to import customers into your shop. Your application is written in PHP and you’re using an ORM to connect to your database. Your customer model looks something like this, with each class member representing a field in your database:

<?php

namespace Model;

class Customer
{
  public $id;
  public $firstName;
  public $lastName;
  public $countryId;

  public function toArray()
  {
    return get_object_vars($this);
  }

  [...]
}

You provide an API resource that looks something like this:

<?php

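namespace Api\Resource;

// Alias the model: inside this namespace, a bare "Customer" would refer to this resource class.
use Model\Customer as CustomerModel;

class Customer
{
  public function get($id)
  {
    // The model's array form is passed through to the client unchanged.
    $customer = CustomerModel::getById((int) $id)->toArray();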
    return json_encode($customer);
  }
}

I assume that this is (boiled down) what most of the APIs in PHP applications look like nowadays. So why is this bad? Well, let’s look at our DTO (which in this case is just an array marshalled to JSON):

array(
  "id" => 1,
  "firstName" => "Jon",
  "lastName" => "Snow",
  "countryId" => 4
)

As your DTO represents exactly your database layout, any change to your database layout immediately changes the API exposed to the world and to the maybe thousands of participants relying on it. And you don’t want that! And while you might be able to preserve the current behavior of the API even when you change your internal API, a future developer might not - and break something. And you’re partly responsible for that.

So how do you resolve that issue?

<?php

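namespace Api\Resource;

// Alias the model: inside this namespace, a bare "Customer" would refer to this resource class.
use Model\Customer as CustomerModel;

class Customer
{
  public function get($id)
  {
    $customer = CustomerModel::getById((int) $id);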

    $response = [
      "id" => $customer->id,
      "firstName" => $customer->firstName,
      "lastName" => $customer->lastName,
      "countryId" => $customer->countryId,
    ];

    return json_encode($response);
  }
}

Looks stupid, doesn’t it? Yeah, maybe, but now you can rename the “firstName” column to “firstname”, replace $customer->firstName with $customer->firstname, and no one outside your application will even notice. But there’s even more to that. Because the assembly of your DTO is now an explicit part of your API resource implementation, someone changing your code will recognize that a change here might cause trouble and take care of it.

Production APIs that are constantly changing are evidence of bad application design. But keeping an API sane is not that hard. In fact, there’s no need for your DTOs to match your internal data structure… so why do it if it’s harder to maintain?

Unit-testing DTOs

Additionally, something that might save your job is writing unit tests for those resource methods, because such tests are effectively a client-side reference implementation of your API… If you’re working on a new minor version and you have to change a field name in your tests to make them pass, you’re probably doing something wrong… yes… or everything… you’re actually doing everything wrong then…
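To make this concrete, here is a minimal sketch of such a contract test. It uses no test framework, and the names CustomerModel, CustomerResource, render and testCustomerDtoKeepsItsFieldNames are made up for illustration; the model is passed in directly so the sketch runs without a database.

```php
<?php

// Hypothetical stand-in for the ORM model from the examples above.
class CustomerModel
{
    public $id;
    public $firstName;
    public $lastName;
    public $countryId;
}

// The resource assembles the DTO explicitly, field by field.
class CustomerResource
{
    public function render(CustomerModel $customer)
    {
        return json_encode([
            "id" => $customer->id,
            "firstName" => $customer->firstName,
            "lastName" => $customer->lastName,
            "countryId" => $customer->countryId,
        ]);
    }
}

// Contract test: the field names listed here are the public API.
// If this list has to change in a minor release, something is wrong.
function testCustomerDtoKeepsItsFieldNames()
{
    $customer = new CustomerModel();
    $customer->id = 1;
    $customer->firstName = "Jon";
    $customer->lastName = "Snow";
    $customer->countryId = 4;

    // Decode the response the way an external client would.
    $dto = json_decode((new CustomerResource())->render($customer), true);

    $expectedKeys = ["id", "firstName", "lastName", "countryId"];
    if (array_keys($dto) !== $expectedKeys) {
        throw new RuntimeException(
            "Public DTO fields changed: " . implode(", ", array_keys($dto))
        );
    }

    return $dto;
}

testCustomerDtoKeepsItsFieldNames();
```

In a real project the same check would probably live in whatever test framework you already use; the point is only that the expected field names are written down once, from the client’s perspective.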

No one wants to know your secrets

But there’s something more to see here. What about countryId? This has actually become something I’ve really started to hate… A public API is meant to be used by other applications. So how the f*ck would any external application know what this countryId means? I want to get the country as an ISO 3166-1 alpha-2 code or whatever, because it’s more likely that I already have this information somewhere in my database - at least more likely than the ID of an external system… and no, I don’t wanna look that up using GET /countries/4. And NO, I don’t want to create a mapping in my database to additionally keep all external IDs for this or that party, as if it wasn’t hard enough to do that for my internal IDs.

No developer who’s integrating his application with yours wants to know how your data is structured internally. All the developer on the other side wants to get is the information in a format that makes sense for the domain you both are talking about.
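As a rough sketch of what that could look like when a DTO is designed from scratch: the resource resolves the internal relation itself and exposes the ISO code. CountryModel, its isoCode property and the country relation on the customer are assumptions made for this example, as is the pairing of ID 4 with “DE”.

```php
<?php

// Hypothetical models; in a real application both would come from the ORM.
class CountryModel
{
    public $id;
    public $isoCode; // ISO 3166-1 alpha-2 code, e.g. "DE"
}

class CustomerModel
{
    public $id;
    public $firstName;
    public $lastName;
    public $country; // CountryModel instance, resolved internally
}

class CustomerResource
{
    public function render(CustomerModel $customer)
    {
        return json_encode([
            "id" => $customer->id,
            "firstName" => $customer->firstName,
            "lastName" => $customer->lastName,
            // Expose the domain value, not the internal primary key.
            "country" => $customer->country->isoCode,
        ]);
    }
}

$country = new CountryModel();
$country->id = 4; // internal ID, never leaves the application
$country->isoCode = "DE";

$customer = new CustomerModel();
$customer->id = 1;
$customer->firstName = "Jon";
$customer->lastName = "Snow";
$customer->country = $country;

$json = (new CustomerResource())->render($customer);
echo $json, "\n"; // {"id":1,"firstName":"Jon","lastName":"Snow","country":"DE"}
```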

DTOs are not the only thing

There’s way more that contributes to your API’s behavior… like Exceptions. Let’s modify our Model a little:

<?php

namespace Model;

class Customer
{
  public $id;
  public $firstName;
  public $lastName;
  public $countryId;

  public function toArray()
  {
    if (null === $this->id) {
      throw new UninitializedException("Model has not been initialized");
    }

    return get_object_vars($this);
  }

  [...]
}

So let’s assume a global exception handler of our API catches all exceptions and converts them to JSON to provide the other side with a nicely formatted error message. With this implementation, you’re still unable to change the exception behavior of your private API without also changing the behavior of your public API and this, again, is something that you don’t wanna do. The only way to create a coherent and stable API is to remap anything coming from your hell of private APIs in the scope of your public API, so that you have control over what is sent to your client - within your API resource. This might be unstylish… but that’s the way it works.
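A minimal sketch of such a remapping inside the resource might look like this. The [status, body] return shape, the error codes “customer_not_found” and “internal_error”, and the choice of 404 for an uninitialized model are all assumptions made for illustration, not something the model dictates.

```php
<?php

// Stand-in for the internal exception thrown by the model above.
class UninitializedException extends Exception
{
}

class CustomerModel
{
    public $id;
    public $firstName;
    public $lastName;
    public $countryId;

    public function toArray()
    {
        if (null === $this->id) {
            throw new UninitializedException("Model has not been initialized");
        }

        return get_object_vars($this);
    }
}

class CustomerResource
{
    // Returns [HTTP status, JSON body]; this shape is an assumption of the sketch.
    public function render(CustomerModel $customer)
    {
        try {
            $data = $customer->toArray();
        } catch (UninitializedException $e) {
            // The public error code and message belong to the API, not to the model.
            return [404, json_encode([
                "error" => "customer_not_found",
                "message" => "The requested customer does not exist.",
            ])];
        } catch (Exception $e) {
            // Anything unexpected is mapped to a generic, stable error.
            return [500, json_encode([
                "error" => "internal_error",
                "message" => "The request could not be processed.",
            ])];
        }

        return [200, json_encode([
            "id" => $data["id"],
            "firstName" => $data["firstName"],
            "lastName" => $data["lastName"],
            "countryId" => $data["countryId"],
        ])];
    }
}

// An uninitialized model yields the public error, not the internal message.
list($status, $body) = (new CustomerResource())->render(new CustomerModel());
echo $status, " ", $body, "\n";
```

If the model later throws a different exception class or rewords its message, the client still sees the same status and error code.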

API versioning

Something that you might also want to implement is a versioned API, by creating multiple versions of your API resources that behave differently. By this, I mean that you expose routes like this:

GET /api/v1/customers/4

GET /api/v2/customers/4

When creating a new version of your software, you might also want to create a new version of your API. And while the old API (v1) still behaves the same old way it did for the last couple of years, the new kids on the block may use your brand new and shiny version (v2) with all the API features they always dreamed of.

But of course, this only works if your DTO is assembled completely within your resources, rather than composed from parts of internal data structures that you just forward to your external API - otherwise even versions that are already archived and should not change will change. In fact, this is something I see all the time when integrating with other applications… or you version your whole application ;-)
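Sketched in code, two resource versions that each assemble their own DTO could look like the following. The class names, the renderCustomer dispatcher and the countryIsoCode property are hypothetical; a real application would hook the versions into its router instead.

```php
<?php

// Hypothetical model; countryIsoCode is an assumed field for this sketch.
class CustomerModel
{
    public $id;
    public $firstName;
    public $lastName;
    public $countryId;
    public $countryIsoCode;
}

// v1 stays exactly as it was published.
class CustomerResourceV1
{
    public function render(CustomerModel $customer)
    {
        return json_encode([
            "id" => $customer->id,
            "firstName" => $customer->firstName,
            "lastName" => $customer->lastName,
            "countryId" => $customer->countryId,
        ]);
    }
}

// v2 is free to expose a different shape.
class CustomerResourceV2
{
    public function render(CustomerModel $customer)
    {
        return json_encode([
            "id" => $customer->id,
            "firstName" => $customer->firstName,
            "lastName" => $customer->lastName,
            "country" => $customer->countryIsoCode,
        ]);
    }
}

// Picks the resource for the version segment of a route such as /api/v1/customers/4.
function renderCustomer($version, CustomerModel $customer)
{
    $resources = [
        "v1" => new CustomerResourceV1(),
        "v2" => new CustomerResourceV2(),
    ];

    if (!isset($resources[$version])) {
        throw new InvalidArgumentException("Unknown API version: " . $version);
    }

    return $resources[$version]->render($customer);
}

$customer = new CustomerModel();
$customer->id = 4;
$customer->firstName = "Jon";
$customer->lastName = "Snow";
$customer->countryId = 4;
$customer->countryIsoCode = "DE";

echo renderCustomer("v1", $customer), "\n";
echo renderCustomer("v2", $customer), "\n";
```

Because neither version forwards the model’s own array form, adding a property to the model changes neither response.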

Enhancements vs. Changes

But even if you don’t have proper API versioning, there’s something you can do. Why don’t you just add new fields instead of replacing the old ones? So if you want to expose your country as an ISO 3166-1 code rather than just sending its ID, why don’t you just keep the field countryId where it is and enhance your DTO by adding country next to it? Existing implementations will keep working, while new ones will benefit from no longer needing another HTTP request to retrieve the country code by its ID.
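A small sketch of that additive change, with an old and a new client reading the same response. The function names and the countryIsoCode property are made up for this example.

```php
<?php

// Hypothetical model; countryIsoCode is an assumed field for this sketch.
class CustomerModel
{
    public $id;
    public $firstName;
    public $lastName;
    public $countryId;
    public $countryIsoCode;
}

function buildCustomerDto(CustomerModel $customer)
{
    return [
        "id" => $customer->id,
        "firstName" => $customer->firstName,
        "lastName" => $customer->lastName,
        "countryId" => $customer->countryId,    // existing field, left untouched
        "country" => $customer->countryIsoCode, // new field, added next to it
    ];
}

// An existing integration that only knows the old field keeps working.
function oldClientReadsCountry(array $dto)
{
    return $dto["countryId"];
}

// A new integration can use the ISO code without a second request.
function newClientReadsCountry(array $dto)
{
    return $dto["country"];
}

$customer = new CustomerModel();
$customer->id = 1;
$customer->firstName = "Jon";
$customer->lastName = "Snow";
$customer->countryId = 4;
$customer->countryIsoCode = "DE";

// Round-trip through JSON, as a client would receive it.
$dto = json_decode(json_encode(buildCustomerDto($customer)), true);

echo oldClientReadsCountry($dto), "\n"; // 4
echo newClientReadsCountry($dto), "\n"; // DE
```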

And the other APIs?

This doesn’t only affect public APIs like REST or SOAP. If your application is extensible with plugins, you don’t want other parties to hook into your heart by using or modifying your internal APIs. The only way you can retain control of what’s going on is to expose dedicated APIs that are designed specifically for the needs of plugin developers - and that don’t change either. Or do you want your users to have to change their plugins every time you change your internal API? Because if that’s the case, don’t worry about your customers - they won’t be your customers for too long…