
Stepping Up to the GraphQL Buffet

Daniel Yu
Nov 25 - 10 min read

Recently at a conference, someone said to me: “Why isn’t <company X> using GraphQL for their REST APIs? Everyone’s going to be using GraphQL soon.” I smiled, nodded, and said to myself, “what the heck is GraphQL?”

I’ve been documenting REST APIs for years now. As a technical writer and former developer, I felt like I had a pretty solid understanding of what developers want out of REST API documentation. I had gotten pretty comfortable documenting REST APIs — I’ve gone from manually documenting REST resources to generating Swagger/OpenAPI documentation — and everything felt pretty good. And then, out of nowhere, I learn that I’m missing out on something called GraphQL. Naturally, I had to find out more about this.

GraphQL is a new (to some of us) way of designing your web APIs. It isn’t a flavor of REST; really, it’s a new way of thinking about your APIs, but we’ll get to that in a moment. Suffice to say, GraphQL is very different from, say, a REST API defined using OpenAPI. If you’re a technical writer who’s worked with REST APIs in the past, GraphQL may require thinking about documenting your API in different ways than you might be used to.

But before we go into all of that, let’s first get a better understanding of GraphQL. What makes it different from the REST API approach that we’re all used to? To explain that, it helps to understand some of the shortcomings of typical REST APIs, and how GraphQL was created to address those shortcomings.

The REST We Know

REST popularized the methodology of having a simple, robust web service interface that used an industry standard protocol (HTTP).

REST is very “resource-centric.” You normally identify a particular REST API resource by its Uniform Resource Identifier (URI). As an example, if you had a REST API for your blogging service, you might have a call that gets information on a particular blog author:

/superblogger/user/{user ID}

You could make a GET request on this URI and get back user information in a JSON response similar to:

{
  "id" : "093775",
  "name" : "Sam",
  "posts" : [
    {
      "title" : "My thoughts on REST",
      "date" : "2019-7-6",
      "content" : "When I was writing this entry, I thought...."
    },
    {
      "title" : "My thoughts on emacs",
      "date" : "2018-9-12",
      "content" : "If you are looking for an excellent text editor then you should...."
    },
    ...
  ]
}

If you instead wanted to create a new user, maybe you’d issue a POST to the same URI, but without an ID. The “/superblogger/user” URI would still be the key piece of information.

When documenting REST APIs, you generally list them by resource URIs. Similarly, when using a REST API definition framework, like OpenAPI, you’d be specifying a list of REST resources, and the operations each resource supports.

Some Issues with REST

This all seems fine, right? URIs are nice and unique, and you can sometimes even guess at what they represent. No one is going to mistake the /superblogger/user resource for a resource that retrieves or uploads images to the service. Hopefully.

However, in the previous example, the /superblogger/user resource can be used to retrieve blog posts for a particular user. This isn’t obvious from the URI. The only way a developer new to your service would know to retrieve posts this way is by trial-and-error, or by careful reading of the documentation for the /superblogger/user resource.

When a REST resource is created, it is typically created with a specific use-case (or two) in mind. Whoever created /superblogger/user decided that any user request would also want the user’s posts, and that this was a common use-case. Maybe /superblogger/user was created to address a specific need from the front-end team for the superblogger website.

There’s a subtle problem with this decision process, unfortunately. The problem is that use cases can change over time, and it’s hard to predict (and handle) new use-cases. Imagine that a separate team is working on the mobile app for superblogger. Their front end is going to have separate pages for the user and the user’s posts. When their mobile app makes a request for user data, the response is going to be full of data they aren’t going to use (the list of posts). Since they’re on a mobile device with data bandwidth limitations, this is a serious issue for the team.

To address this, the service team could create a new REST resource just for the mobile team:

/superblogger/userNoPosts/{user ID}

Oh, and the mobile team will probably also need another new resource to get just the posts:

/superblogger/posts/{user ID}

Meanwhile, the website team wants to have an abridged format for the post content. So, the service team changes /superblogger/user to return abridged information for the user’s posts. Should the /superblogger/posts resource be changed the same way? Suddenly the service team is responsible for several different, slightly redundant resources, and has to understand and manage all the different use cases. This responsibility can quickly grow out of control.

GraphQL to the Rescue

GraphQL takes a completely different approach. In GraphQL, the data schema is the star of the show. Most GraphQL APIs have a single endpoint URI:

/superblogger/graphql

The client specifies the sort of call they want to make in the request body, using a strongly-typed schema-based format. For example, here’s a request body that retrieves the superblogger user info and user posts, like in our very first REST example:

{
  "query" : "{ user(id: \"093775\") { id name posts { title date content } } }"
}

which might return something very similar to what we had before:

{
  "data": {
    "user": {
      "id" : "093775",
      "name" : "Sam",
      "posts" : [
        {
          "title" : "My thoughts on REST",
          "date" : "2019-7-6",
          "content" : "When I was writing this entry, I thought...."
        },
        {
          "title" : "My thoughts on emacs",
          "date" : "2018-9-12",
          "content" : "If you are looking for an excellent text editor then you should...."
        },
        ...
      ]
    }
  }
}

And here’s a request body that just returns the user name (something the mobile team might find handy):

{
  "query" : "{ user(id: \"093775\") { name } }"
}

Which returns:

{
  "data": {
    "user": {
      "name" : "Sam"
    }
  }
}
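For readers who want to see how such a request body is put together, here is a minimal Python sketch using only the standard library. The host name, the operation name UserWithPosts, and the assumption that the user field takes an ID! argument are all made up for illustration; they aren’t part of any real Superblogger API. The sketch passes the ID as a variable, which avoids splicing values into the query text by hand.

```python
import json
import urllib.request

# Hypothetical endpoint; the host name is an assumption for illustration.
ENDPOINT = "https://example.com/superblogger/graphql"

# Assumes the schema's `user` field accepts an `id` argument of type ID!.
USER_QUERY = """
query UserWithPosts($id: ID!) {
  user(id: $id) {
    id
    name
    posts { title date content }
  }
}
"""


def build_request(query: str, variables: dict) -> urllib.request.Request:
    """Wrap a GraphQL document and its variables in an HTTP POST request."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(USER_QUERY, {"id": "093775"})
# urllib.request.urlopen(request) would send it; that call is left out here
# because the endpoint above does not exist.
payload = json.loads(request.data)
```

Whatever the client asks for, the URI and HTTP method stay the same; only the query document in the body changes.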

The decisions on what data to get (or change) are now made by the client, not the API provider. The client is free to request whatever information fits the use case they have in mind, and the service team can implement the GraphQL API without having to anticipate every way clients are going to use it.

GraphQL’s ability to let clients choose exactly what data they get is sometimes described as giving clients access to a “full buffet.” To continue with the restaurant metaphor: whereas traditional REST APIs only let you order specific dishes (via resources), GraphQL lets you go to the buffet and pick out exactly what you want (via GraphQL queries). Side (dish) benefits of this include: avoiding over-fetching (receiving data that you won’t use), reducing the number of calls and the amount of data transferred over the network, and reducing the amount of data processing after retrieving the data.
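To make the over-fetching point concrete, here is a toy Python model of field selection. It is not how a GraphQL server actually executes a query (real servers run resolver functions against a typed schema); the select function and the nested-dict selection format are inventions for this sketch. It only shows how the shape of the request determines the shape, and size, of the response.

```python
import json


def select(record, selection):
    """Return only the fields named in `selection`.

    `selection` maps a field name to None (a leaf field) or to a nested
    selection dict (for an object or a list of objects).
    """
    if isinstance(record, list):
        return [select(item, selection) for item in record]
    result = {}
    for field, sub in selection.items():
        value = record[field]
        result[field] = value if sub is None else select(value, sub)
    return result


user = {
    "id": "093775",
    "name": "Sam",
    "posts": [
        {
            "title": "My thoughts on REST",
            "date": "2019-7-6",
            "content": "When I was writing this entry, I thought....",
        },
        {
            "title": "My thoughts on emacs",
            "date": "2018-9-12",
            "content": "If you are looking for an excellent text editor then you should....",
        },
    ],
}

# The mobile team's "just the name" request.
name_only = select(user, {"name": None})
# A request for the name plus post titles, without dates or content.
titles = select(user, {"name": None, "posts": {"title": None}})

# Compare serialized sizes of the full record and the trimmed one.
print(len(json.dumps(user)), len(json.dumps(name_only)))
```

The same stored record serves both requests; neither client receives fields it didn’t ask for.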

GraphQL also lets you do more with the data than query, but I’ll save that for a different post.

Documenting GraphQL

The source of truth

GraphQL APIs are defined by a GraphQL Schema Definition Language (SDL) file. This file is similar to a Swagger/OpenAPI definition file in concept, but the GraphQL SDL file describes the schema, and what you can do with the schema.

As a writer you don’t necessarily need to be an expert with the GraphQL SDL, but it will really help you if you understand some of the basics and gotchas around this format. Chances are that your team will be using an SDL file to define their GraphQL API, so the more you can utilize it, the better.

Document the schema

As GraphQL is focused on presenting the schema, and not necessarily how clients will use the schema, your docs have to reflect this too. In the past you may have documented REST API resources by describing what the resource does. For example:

/superblogger/user

Use this resource to get information about a specific Superblogger user, identified by ID. Includes posts for the given user.

Parameters: id — the user Id
Methods: GET
Request body: None
Response body: <response body format>

For GraphQL, you want to focus on the data schema instead. You can document schema types and fields within the API definition itself (see the next section for more details on this). Wherever the text lives, your documentation needs to describe the data objects and fields, similar to how you might document a database schema:

User: A user within the Superblogger service
Id: The unique ID for a user
Name: The name of the user (not unique, not required)

Notice that we didn’t even mention here whether you can query for users, or add new users.

Resist the temptation to talk about use cases within the schema documentation. Save this for additional docs you’ll add that talk about the API as a whole and provide use case examples.

Documentation from definitions

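Similar to Swagger/OpenAPI, it’s possible to put documentation within the definition file. You do this via descriptions: strings (here, triple-quoted block strings) placed directly before the type or field they document. Ordinary # comments are ignored by GraphQL and aren’t treated as documentation. The @fake directives in this example come from the graphql-faker mocking tool:

"""
A Blogger user
"""
type User {
  """
  User name
  """
  name: String @fake(type: fullName, locale: en_US)
  """
  User ID
  """
  id: ID @fake(type: uuid)
  """
  All posts from the given user. **Includes** posts hidden by moderators.
  """
  posts: [Post!]
}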

There are various GraphQL “API explorer” tools (like GraphiQL) that will take this information and render it as in-tool documentation. They support Markdown in the documentation text as well (notice the “**” Markdown bold markup used in the posts field above). This could look something like:

[Image: GraphiQL example with documentation pane highlighted]

This is pretty great, right? Assuming you can access the SDL file for the API you’re documenting, you can review and edit the comments that will end up in documentation.
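Tools like GraphiQL get this text through GraphQL’s introspection system, and a writer can do the same. The Python sketch below shows an introspection query for the User type and renders a response as plain-text reference docs. The sample response is hand-written to match the User type above rather than captured from a real server, and render_type_docs is a hypothetical helper.

```python
# Introspection query: __type, name, description, and fields are part of
# GraphQL's built-in introspection schema.
INTROSPECTION_QUERY = """
{
  __type(name: "User") {
    name
    description
    fields { name description }
  }
}
"""

# Hand-written sample response (an assumption, not real server output).
sample_response = {
    "data": {
        "__type": {
            "name": "User",
            "description": "A Blogger user",
            "fields": [
                {"name": "name", "description": "User name"},
                {"name": "id", "description": "User ID"},
                {
                    "name": "posts",
                    "description": "All posts from the given user. "
                    "**Includes** posts hidden by moderators.",
                },
            ],
        }
    }
}


def render_type_docs(response: dict) -> str:
    """Turn an introspection response for one type into reference text."""
    type_info = response["data"]["__type"]
    lines = [f"{type_info['name']}: {type_info['description']}"]
    for field in type_info["fields"]:
        # Fields without a description come back as null (None in Python).
        description = field["description"] or "(undocumented)"
        lines.append(f"  {field['name']}: {description}")
    return "\n".join(lines)


docs = render_type_docs(sample_response)
print(docs)
```

A script along these lines can also flag fields whose description is missing, which is a handy check before a release.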

Is that it? Well, not quite. There are a couple of things that you still need to take care of. For one thing, docs from the SDL file will work as reference docs about the schema and its operations, but there isn’t room in the SDL file for things like overview information, tutorials, and other non-reference content. You’ll have to write this content yourself (no different than doc’ing any other type of API, really). You’ll probably want to include or refer to your reference doc in some way. Assuming you’re using a tool like GraphiQL, there are ways to embed a live demo client within an HTML page (see https://www.contentful.com/developers/docs/tutorials/general/graphql/ for an example of this).

Another thing to be aware of is that while GraphQL is strongly-typed, you’ll still need to document custom formats contained in strings or other types. For example, suppose the Superblogger service team adds a new data item that represents how popular a post is:

type Post {
  ...
  """
  Popularity information for a post
  """
  popularity: String
  ....
}

This might look like sufficient documentation to the untrained eye, but notice that we haven’t specified how popularity is represented. Without further docs, anyone accessing the popularity value for a given post might have no idea that the expected values are “Low”, “Medium”, and “High”.

Similarly, if there are limitations on field or parameter values, you’ll need to clearly document these, as they won’t be obvious from looking at the SDL information.
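As long as popularity is a plain String, the set of allowed values lives only in the docs, and clients have to encode it themselves. Here is a minimal sketch of what that looks like on the client side in Python. The Popularity enum and parse_popularity helper are hypothetical names, and the three values are the ones mentioned above. (GraphQL also has enum types, which would let the schema itself carry this information.)

```python
from enum import Enum


class Popularity(Enum):
    """The documented values of the popularity string."""

    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


def parse_popularity(raw: str) -> Popularity:
    """Convert the raw string from the API into a checked value."""
    try:
        return Popularity(raw)
    except ValueError:
        allowed = ", ".join(p.value for p in Popularity)
        raise ValueError(
            f"Unexpected popularity {raw!r}; documented values are: {allowed}"
        ) from None


print(parse_popularity("High"))
```

A client can only write this kind of check if the documentation spells out the values, which is exactly why the one-line description isn’t enough.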

Additional things to watch out for

The GraphQL specification doesn’t define a pagination mechanism (although conventions such as Relay-style cursor connections are widely used), so if your API clients need pagination, your team will have to choose or implement an approach, and it will need to be documented. Likewise, GraphQL doesn’t prescribe how to cache paginated results or how to batch requests, so if your API implementation does any special caching or batching, you’ll want to document this behavior so clients know what to expect.
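To give a sense of what there is to document, here is a rough Python sketch of one common style, cursor-based pagination, with a response shaped loosely like a Relay-style connection (edges, cursor, pageInfo). The base64 cursor encoding and the paginate helper are assumptions made for this sketch; a real service would pick its own cursor format and argument names.

```python
import base64


def encode_cursor(index: int) -> str:
    """Hide a list position inside an opaque-looking cursor string."""
    return base64.b64encode(f"idx:{index}".encode()).decode()


def decode_cursor(cursor: str) -> int:
    """Recover the list position from a cursor made by encode_cursor."""
    return int(base64.b64decode(cursor).decode().split(":", 1)[1])


def paginate(items, first, after=None):
    """Return up to `first` items that come after the `after` cursor."""
    start = decode_cursor(after) + 1 if after is not None else 0
    page = items[start:start + first]
    edges = [
        {"cursor": encode_cursor(start + offset), "node": node}
        for offset, node in enumerate(page)
    ]
    return {
        "edges": edges,
        "pageInfo": {
            "endCursor": edges[-1]["cursor"] if edges else None,
            "hasNextPage": start + len(page) < len(items),
        },
    }


posts = [f"Post {n}" for n in range(1, 6)]
page1 = paginate(posts, first=2)
page2 = paginate(posts, first=2, after=page1["pageInfo"]["endCursor"])
page3 = paginate(posts, first=2, after=page2["pageInfo"]["endCursor"])
```

Even in this small example there are details clients can’t guess from the schema alone: that cursors are opaque, what the maximum page size is, and what happens when a cursor is stale.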

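GraphQL supports a mechanism for dealing with out-of-date schema information. While traditional REST APIs might need to version resources when the underlying schema changes, GraphQL lets service providers mark fields as deprecated in the schema itself. In an SDL file you do this with the @deprecated directive and its reason argument. In a schema defined in code (for example, with graphql-js), you set deprecationReason on the field, as in the fragment below. Either way, clients and tools see the result through the isDeprecated and deprecationReason introspection fields: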

...
fields: {
  name: {
    type: GraphQLString,
    deprecationReason: 'Name is now split into firstname and lastname'
  },
  firstname: { type: GraphQLString },
  lastname: { type: GraphQLString }
...

Tools like GraphiQL will take information from deprecationReason and display it along with the rest of the in-line doc.
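If you maintain docs outside of GraphiQL, the same introspection data can drive a list of deprecation notes. The Python sketch below assumes a hand-written sample of what introspection might return for the fields above; deprecation_notes is a hypothetical helper. One detail worth knowing: introspection leaves deprecated fields out by default, so the query has to ask for fields(includeDeprecated: true).

```python
# Hand-written sample shaped like the `__type` part of an introspection
# response (an assumption, not real server output).
sample_type = {
    "name": "User",
    "fields": [
        {
            "name": "name",
            "isDeprecated": True,
            "deprecationReason": "Name is now split into firstname and lastname",
        },
        {"name": "firstname", "isDeprecated": False, "deprecationReason": None},
        {"name": "lastname", "isDeprecated": False, "deprecationReason": None},
    ],
}


def deprecation_notes(type_info: dict) -> list:
    """Collect one human-readable note per deprecated field."""
    notes = []
    for field in type_info["fields"]:
        if field.get("isDeprecated"):
            reason = field.get("deprecationReason") or "No reason given"
            notes.append(f"{type_info['name']}.{field['name']} (deprecated): {reason}")
    return notes


for note in deprecation_notes(sample_type):
    print(note)
```

This prints a single note for User.name and nothing for the two replacement fields.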

The GraphQL Journey Continues

I’ve only touched upon some of the basics of GraphQL, and I’m still learning about all of the neat stuff GraphQL can do, but the experience so far has really opened my eyes to how novel and different GraphQL’s approach to APIs really is. Just when I thought I knew everything I needed to know about REST, I find this whole new way of thinking about things that’s going to keep me busy for quite a while!

If you’re as intrigued by GraphQL as I am, I hope you continue exploring what GraphQL can offer. There are plenty of great resources from the GraphQL Foundation and companies like Apollo for more complete, in-depth introductions to GraphQL. Additionally, while researching GraphQL I found many tools for mocking and testing GraphQL APIs, such as graphql-faker, that writers like me can use to immediately jump into GraphQL and start experimenting. It’s easy to see what the GraphQL “buffet” has to offer!


Special thanks to Gavin Austin and Durgaprasad Guduguntla for helping me write this post!
