Today a lot of the world’s information is accessible using web APIs of various types. It doesn’t matter if you use Google Maps, the Azure Translator AI, or PaperCut MF. You can create clients that request and ingest information from other systems — but they are invariably systems that you don’t manage.
A few times recently I have seen developers make assumptions about the information they would get back from a web API call. This led to trouble when the server supplied data that was slightly different than expected.
It’s something to be aware of when using any API, but I’ll use the PaperCut NG/MF health monitoring API to provide a simple example.
The problem occurs because developers often don’t have a complete understanding of what a third party system can return in all circumstances.
It’s reasonable to expect that an API’s method signatures are fully documented and accurate (method name, parameters, and types), as well as the types of the return data structures. (Is it a list? An array of lists? Integers? Strings? And so on.)
Furthermore, you should expect this API to be versioned so that any significant changes won’t break your code (as long as you remember to request the correct version). However, the API documentation does not give you enough information to make assumptions about all of the data you receive.
As an illustration, I’m calling the PaperCut NG/MF health monitoring API. For simplicity, I’ll call the API via PowerShell Core (which runs on Linux, macOS, and Windows). This blog post explains how to use our API with PowerShell.
Here I want to discover information about the PaperCut NG/MF web print server setup, specifically what file formats my server supports.
Here is the PowerShell command, run on my laptop, followed by its output:
(Invoke-RestMethod -Uri "$uri/web-print" -Method Get -Headers $headers).servers.supportedFileTypes | ConvertTo-Json
[ "image", "pdf" ]
This is a simple default web print server, so the only currently supported file types are images and PDF files. However, suppose PaperCut Software decides to add support for PostScript files (please note that this is very unlikely to happen). Then “ps” could appear in the supportedFileTypes list in the future:
[ "image", "pdf", "ps" ]
If your code can’t handle the new value (even though it’s of no relevance to your solution) then you might see it fail in interesting ways after the PaperCut server is upgraded.
Most API developers (the folks who design and develop the code that your client is calling) will assume that you will only look for the values you need, and ignore any others.
In the web API world, adding return values in this way is usually considered a non-breaking change (but there can be exceptions, and if it were a binary API then it probably would be a breaking change).
There is a nice summary in Ben Nadel’s blog post, “When Is A Change A Breaking Change For An API”.
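To illustrate a client that only looks for the values it needs, here is a minimal Python sketch (the language is an arbitrary choice; the same idea applies in PowerShell). It assumes the web print response is JSON with a `servers` entry whose items carry a `supportedFileTypes` list, which is inferred from the PowerShell command above rather than from the API documentation. The function name and the set of wanted types are hypothetical.

```python
import json

# File types this hypothetical client knows how to submit (an assumption).
WANTED_TYPES = {"image", "pdf"}


def usable_file_types(response_text):
    """Return the supported file types this client cares about, ignoring the rest."""
    payload = json.loads(response_text)
    servers = payload.get("servers", [])
    # Assumption: "servers" may be a single object or a list of objects.
    if isinstance(servers, dict):
        servers = [servers]
    found = set()
    for server in servers:
        if not isinstance(server, dict):
            continue  # Skip anything that is not shaped like a server entry.
        for file_type in server.get("supportedFileTypes", []):
            # Unknown values such as "ps" are skipped, not treated as errors.
            if file_type in WANTED_TYPES:
                found.add(file_type)
    return found
```

With this approach, a server upgrade that adds "ps" to the list leaves the result unchanged, because the client never tries to interpret a value it did not ask for.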
Note that, surprisingly enough, the opposite can also be true (you can receive less information than expected). Consider a system that provides an API with many different options, such as the PaperCut MF kiosk and standalone cash loader API.
This API allows a PaperCut MF administrator to configure many different options. However, different kiosk solutions often choose to implement only a subset of the available API options (reasonably enough).
After development is complete, PaperCut will customize the options so that customers can only configure what they can use.
However, now when the kiosk software asks the application server about the current set of configurations, PaperCut NG/MF will only supply values for the settings relevant to the specific solution and not all the other supported options the developer saw during development.
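One way to cope with settings that disappear from the response is to merge whatever the server supplies over a set of local fallback values. The following Python sketch shows the idea. The setting names and defaults are invented for illustration and are not taken from the kiosk API, and whether a local default is appropriate for an omitted setting depends on your solution.

```python
# Hypothetical setting names and fallback values, invented for illustration only.
DEFAULTS = {
    "currency": "USD",
    "max-top-up": 50,
    "allow-refunds": False,
}


def effective_settings(server_settings):
    """Merge server-supplied settings over local defaults, tolerating omissions."""
    settings = dict(DEFAULTS)
    for name in DEFAULTS:
        # Only read the settings this client uses, and only if they are present.
        if name in server_settings:
            settings[name] = server_settings[name]
    return settings
```

For example, `effective_settings({"currency": "AUD"})` keeps the local values for the two settings the server left out, and any extra settings the server sends are ignored.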
These types of changes should not be a problem if the developer has adopted a defensive programming posture when handling API return values.
So what is a defensive posture and how can it help us?
A defensive posture (in overly simplistic terms) means making no assumptions about the data received from outside our solution.
Furthermore, if a data problem relevant to the solution is detected then the system will handle the situation robustly.
Let’s summarize what that means in practice:
When storing API return values, use a flexible data structure.
For example, a map or JSON buffer will store any structured data of the required types provided by the API server.
Depending on the programming language you use, this might be hard or easy. However, many modern languages support some form of dynamic data structure or generic interface.
Only process API fields and values your solution needs. Ignore the rest.
If you are iterating over a collection of fields, make sure you provide a default case for unrecognized fields, for instance in a switch statement.
Check for the presence of the fields you need to access, before checking the fields for the values you need.
If a required field is missing, then degrade gracefully. That is, provide a helpful error message to the user or a log message, and perform a controlled shutdown if needed.
For each value that you need to process, consider how much validation and sanitization are required.
Finally, be sure to identify the 20% of cases for which this is not adequate and make sure you can still handle the problem in a sensible manner.
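The checklist above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the `status` field and its "ok" and "error" values are hypothetical and do not describe the actual shape of any PaperCut response.

```python
import json
import logging

log = logging.getLogger("api_client")


class ResponseError(Exception):
    """Raised when the response lacks something this client cannot work without."""


def read_status(response_text):
    # Flexible structure: a plain dict stores unexpected fields without complaint.
    data = json.loads(response_text)
    if not isinstance(data, dict):
        raise ResponseError("expected a JSON object at the top level")

    # Check that the field is present before looking at its value.
    if "status" not in data:
        raise ResponseError("response has no 'status' field")

    # Validate and sanitize the one value we need; ignore every other field.
    status = data["status"]
    if not isinstance(status, str):
        raise ResponseError("'status' is not a string")
    status = status.strip().lower()

    if status == "ok":
        return "healthy"
    elif status == "error":
        return "unhealthy"
    else:
        # Default case: an unrecognized value is logged, not fatal.
        log.warning("Unrecognized status value: %r", status)
        return "unknown"


def describe(response_text):
    """Degrade gracefully: log the problem and report that no status is available."""
    try:
        return read_status(response_text)
    except (json.JSONDecodeError, ResponseError) as exc:
        log.error("Could not interpret server response: %s", exc)
        return "unavailable"
```

Here `describe` is the only function the rest of the program would call, so malformed JSON, a missing field, and an unrecognized value all end in a logged message and a safe return value instead of an unhandled exception.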
There is nothing unique about these suggestions. In fact, a cautious approach to handling ALL external input data is recommended in the majority of situations.
Get in touch
Remember that if you are already using one or more of our APIs, we provide discussion forums via Google Groups that you can use to discuss this topic further.