Overview
GraphQL is a query language for APIs that supports ad hoc queries and data manipulation. It is extremely flexible and is particularly suitable for exposing data organized as graphs and trees. Facebook developed GraphQL in 2012 and open-sourced it in 2015.
It rapidly took off and became one of the hottest technologies. Many innovative companies adopted and used GraphQL in production. In this tutorial you’ll learn:
- the principles of GraphQL
- how it compares to REST
- how to design schemas
- how to set up a GraphQL server
- how to implement queries and mutations
- and a few additional advanced topics
Where Does GraphQL Shine?
GraphQL is at its best when your data is organized in a hierarchy or a graph and the front end would like to access different subsets of this hierarchy or graph. Consider an application that exposes the NBA. You have teams, players, coaches, championships, and a lot of information about each one. Here are some sample queries:
- What are the names of the players on the current roster of the Golden State Warriors?
- What are the names, heights and ages of the starters of the Washington Wizards?
- Which active coach has the most championships?
- For which teams and in which years did the coach win his championships?
- Which player won the most MVP awards?
I could come up with hundreds of such queries. Imagine that you have to design an API to expose all these queries to the front end and be able to easily extend the API with new query types as your users or product manager come up with new exciting things to query.
This is not trivial. GraphQL was designed to address this exact problem, and with a single API endpoint it provides enormous power, as you will see soon.
GraphQL vs. REST
Before diving into the nuts and bolts of GraphQL, let’s compare it against REST, which is currently the most popular type of web API.
REST follows a resource-oriented model. If our resources are players, coaches and teams then there will probably be endpoints like:
- /players
- /players/<id>
- /coaches
- /coaches/<id>
- /teams
- /teams/<id>
Often the endpoints without an id just return a list of ids, and the endpoints with an id return the full information on one resource. You can, of course, design your API in other ways (e.g. the /players endpoint may also return the name of each player, or all the information about each player).
The problem with this approach in a dynamic environment is that you’re either under-fetching (e.g. you get just the ids and need more information) or you’re over-fetching (e.g. getting the full information on each player when you’re just interested in the name).
Those are hard problems. When under-fetching, if you fetch 100 ids, you’ll need to perform 100 separate API calls to get the information on each player. When over-fetching, you waste a lot of back-end time and network bandwidth preparing and transferring a lot of data that is not needed.
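To make the cost of under-fetching concrete, here is a minimal sketch that counts round trips. It assumes a hypothetical in-memory stand-in for the REST back end (the fakeGet function and its three players are made up for illustration), so no real HTTP is involved.

```javascript
// Hypothetical in-memory stand-in for a REST back end (no real HTTP involved).
const players = {
  '1': { id: '1', name: 'Stephen Curry' },
  '2': { id: '2', name: 'Michael Jordan' },
  '3': { id: '3', name: 'Scottie Pippen' },
};

let requestCount = 0;

// Simulates GET /players (ids only) and GET /players/<id> (full record).
function fakeGet(path) {
  requestCount += 1;
  if (path === '/players') {
    return Object.keys(players);
  }
  const id = path.split('/')[2];
  return players[id];
}

// Under-fetching: one call for the id list, then one call per player.
function fetchAllNames() {
  const ids = fakeGet('/players');
  return ids.map((id) => fakeGet(`/players/${id}`).name);
}

const names = fetchAllNames();
console.log(names, `requests: ${requestCount}`);
```

Three players already cost four requests, and the count grows by one with every additional player.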
There are ways to address it with REST. You can design a lot of bespoke endpoints, each one returning exactly the data you need. This solution is not scalable. It’s hard to keep the API consistent. It’s hard to evolve it. It’s hard to document and use it. It’s hard to maintain it when there is a lot of overlap between those bespoke endpoints.
Consider these additional endpoints:
- /players/names
- /players/names_and_championships
- /team/starters
Another approach is to keep a small number of generic endpoints, but provide a lot of query parameters. This solution avoids the many endpoints problem, but it goes against the grain of the REST model, and also it’s difficult to evolve and maintain consistently.
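As a rough illustration of the query-parameter approach, the sketch below filters a record down to the requested fields. The "fields" parameter name and the pickFields helper are assumptions made for this example, not part of any REST standard.

```javascript
// Hypothetical helper for an endpoint such as
// GET /players/1?fields=name,championshipCount
// (the "fields" parameter name is an assumption, not a REST standard).
function pickFields(record, fieldsParam) {
  if (!fieldsParam) {
    return { ...record }; // no filter requested: return everything
  }
  const wanted = fieldsParam.split(',').map((f) => f.trim()).filter(Boolean);
  const result = {};
  for (const field of wanted) {
    // Silently ignore fields the record does not have.
    if (Object.prototype.hasOwnProperty.call(record, field)) {
      result[field] = record[field];
    }
  }
  return result;
}

const player = { id: '1', name: 'Stephen Curry', championshipCount: 2, teamId: '3' };
console.log(pickFields(player, 'name,championshipCount'));
```

This works for flat fields, but selecting nested data (say, the name of the player's team) would need yet more ad hoc syntax in the parameter.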
You could say that GraphQL has taken this approach to the limit. It doesn’t think in terms of well-defined resources, but instead in terms of sub-graphs of the entire domain.
The GraphQL Type System
GraphQL models the domain using a type system that consists of types and attributes. Each attribute has a type. The attribute type can be one of the basic types that GraphQL provides like ID, String, and Boolean, or a user-defined type. The nodes of the graph are the user-defined types, and the edges are the attributes that have user-defined types.
For example, if a “Player” type has a “team” attribute with the “Team” type then it means there is an edge between each player node to a team node. All the types are defined in a schema that describes the GraphQL domain object model.
Here is a very simplified schema for the NBA domain. The player has a name, a team he is most associated with (yes, I know players sometimes move from one team to another), and the number of championships the player won.
The team has a name, an array of players, and the number of championships the team won.
    type Player {
      id: ID
      name: String!
      team: Team!
      championshipCount: Int!
    }

    type Team {
      id: ID
      name: String!
      players: [Player!]!
      championshipCount: Int!
    }
There are also predefined entry points. Those are Query, Mutation, and Subscription. The front end communicates with the back end through the entry points and customizes them for its needs.
Here is a query that simply returns all players:
type Query { allPlayers: [Player!]! }
The exclamation point means that the value can’t be null. In the case of the allPlayers query, it can return an empty list, but not null. Also, it means that there can be no null player in the list (because it contains Player!).
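The two exclamation points in [Player!]! can be read as two separate checks. The sketch below spells them out in plain JavaScript. It only illustrates the rule and is not how a GraphQL server is implemented. The function name is made up for this example.

```javascript
// Sketch of the rule behind [Player!]!: the list itself must not be null,
// and no element of the list may be null.
function satisfiesNonNullListOfNonNull(value) {
  if (value === null || value === undefined) {
    return false; // outer "!" violated: the list itself is missing
  }
  if (!Array.isArray(value)) {
    return false; // not a list at all
  }
  // inner "!": every element must be present
  return value.every((item) => item !== null && item !== undefined);
}

console.log(satisfiesNonNullListOfNonNull([]));                 // true
console.log(satisfiesNonNullListOfNonNull([{ name: 'Kobe' }])); // true
console.log(satisfiesNonNullListOfNonNull([null]));             // false
console.log(satisfiesNonNullListOfNonNull(null));               // false
```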
Setting Up a GraphQL Server
Here is a full-fledged GraphQL server based on Node.js and Express. It has an in-memory hard-coded data store. Normally, the data will be in a database or fetched from another service. The data is defined here (apologies in advance if your favorite team or player didn’t make it):
    let data = {
      "allPlayers": {
        "1": { "id": "1", "name": "Stephen Curry", "championshipCount": 2, "teamId": "3" },
        "2": { "id": "2", "name": "Michael Jordan", "championshipCount": 6, "teamId": "1" },
        "3": { "id": "3", "name": "Scottie Pippen", "championshipCount": 6, "teamId": "1" },
        "4": { "id": "4", "name": "Magic Johnson", "championshipCount": 5, "teamId": "2" },
        "5": { "id": "5", "name": "Kobe Bryant", "championshipCount": 5, "teamId": "2" },
        "6": { "id": "6", "name": "Kevin Durant", "championshipCount": 1, "teamId": "3" }
      },
      "allTeams": {
        "1": { "id": "1", "name": "Chicago Bulls", "championshipCount": 6, "players": [] },
        "2": { "id": "2", "name": "Los Angeles Lakers", "championshipCount": 16, "players": [] },
        "3": { "id": "3", "name": "Golden State Warriors", "championshipCount": 5, "players": [] }
      }
    }
The libraries I use are:
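    const express = require('express');
    // Older express-graphql releases export the middleware directly, as used here.
    // Newer releases expose it as a named export instead:
    //   const { graphqlHTTP } = require('express-graphql');
    const graphqlHTTP = require('express-graphql');
    const { buildSchema } = require('graphql');
    const _ = require('lodash/core');

    const app = express();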
This is the code to build the schema. Note that I added a couple of arguments, offset and limit, to the allPlayers root query.
    const schema = buildSchema(`
      type Player {
        id: ID
        name: String!
        championshipCount: Int!
        team: Team!
      }

      type Team {
        id: ID
        name: String!
        championshipCount: Int!
        players: [Player!]!
      }

      type Query {
        allPlayers(offset: Int = 0, limit: Int = -1): [Player!]!
      }
    `);
Here comes the key part: hooking up the queries and actually serving the data. The rootValue object may contain multiple roots.
Here, there is only allPlayers. It extracts the offset and limit from the arguments, slices the full list of players, and then sets the team on each player based on the team id. This makes each player a nested object.
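    const rootValue = {
      allPlayers: (args) => {
        const offset = args['offset'];
        const limit = args['limit'];
        // Skip the first `offset` players.
        let r = _.values(data['allPlayers']).slice(offset);
        // A limit of -1 (the default) means "no limit".
        if (limit > -1) {
          r = r.slice(0, Math.min(limit, r.length));
        }
        // Attach the full team object to each returned player.
        _.forEach(r, (x) => {
          data.allPlayers[x.id].team = data.allTeams[x.teamId];
        });
        return r;
      },
    };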
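Note that the sample data leaves each team's players array empty, and the resolver above never fills it, so a query that asks for a team's players would get back an empty list. One possible way to populate the rosters is sketched below. The playersOfTeam helper is hypothetical, and the data is a trimmed copy of the tutorial's store so the snippet runs on its own.

```javascript
// Trimmed copy of the tutorial's data, so the sketch runs on its own.
const data = {
  allPlayers: {
    '1': { id: '1', name: 'Stephen Curry', championshipCount: 2, teamId: '3' },
    '2': { id: '2', name: 'Michael Jordan', championshipCount: 6, teamId: '1' },
    '6': { id: '6', name: 'Kevin Durant', championshipCount: 1, teamId: '3' },
  },
  allTeams: {
    '1': { id: '1', name: 'Chicago Bulls', championshipCount: 6, players: [] },
    '3': { id: '3', name: 'Golden State Warriors', championshipCount: 5, players: [] },
  },
};

// Hypothetical helper: collects the players whose teamId matches the team.
function playersOfTeam(teamId) {
  return Object.values(data.allPlayers).filter((p) => p.teamId === teamId);
}

// Fill in every team's roster once, before serving queries.
for (const team of Object.values(data.allTeams)) {
  team.players = playersOfTeam(team.id);
}

console.log(data.allTeams['3'].players.map((p) => p.name));
```

Doing this once at startup keeps the resolver simple. With a real database, the roster would more likely be loaded on demand.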
Finally, here is the graphql endpoint, passing the schema and the root value object:
    app.use('/graphql', graphqlHTTP({
      schema: schema,
      rootValue: rootValue,
      graphiql: true
    }));

    app.listen(3000);

    module.exports = app;
Setting graphiql to true enables us to test the server with an awesome in-browser GraphQL IDE. I highly recommend it for experimenting with different queries.
Ad Hoc Queries With GraphQL
Everything is set. Let’s navigate to http://localhost:3000/graphql and have some fun.
We can start simple, with just a list of the player names:
    query justNames {
      allPlayers {
        name
      }
    }

Output:

    {
      "data": {
        "allPlayers": [
          { "name": "Stephen Curry" },
          { "name": "Michael Jordan" },
          { "name": "Scottie Pippen" },
          { "name": "Magic Johnson" },
          { "name": "Kobe Bryant" },
          { "name": "Kevin Durant" }
        ]
      }
    }
Alright. We got some superstars here. No doubt. Let’s go for something fancier: starting from offset 4 get 2 players. For each player, return their name and how many championships they won as well as their team name and how many championships the team won.
    query twoPlayers {
      allPlayers(offset: 4, limit: 2) {
        name
        championshipCount
        team {
          name
          championshipCount
        }
      }
    }

Output:

    {
      "data": {
        "allPlayers": [
          {
            "name": "Kobe Bryant",
            "championshipCount": 5,
            "team": {
              "name": "Los Angeles Lakers",
              "championshipCount": 16
            }
          },
          {
            "name": "Kevin Durant",
            "championshipCount": 1,
            "team": {
              "name": "Golden State Warriors",
              "championshipCount": 5
            }
          }
        ]
      }
    }
So Kobe Bryant won five championships with the Lakers, who won 16 championships overall. Kevin Durant won just one championship with the Warriors, who won five championships total.
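The same queries can also be sent from code rather than from the in-browser IDE. The sketch below builds the request for a query like twoPlayers, passing the offset and limit as GraphQL variables. The buildGraphQLRequest helper is a made-up name for this example, and the commented-out fetch call assumes the tutorial's server is running on localhost:3000.

```javascript
// Builds the options for a fetch() call against a GraphQL endpoint.
// GraphQL over HTTP conventionally sends a JSON body with "query" and "variables".
function buildGraphQLRequest(query, variables = {}) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, variables }),
  };
}

const query = `
  query twoPlayers($offset: Int, $limit: Int) {
    allPlayers(offset: $offset, limit: $limit) {
      name
      team { name }
    }
  }`;

const request = buildGraphQLRequest(query, { offset: 4, limit: 2 });
console.log(request.body);

// With the server from this tutorial running (assumed at localhost:3000):
// fetch('http://localhost:3000/graphql', request)
//   .then((res) => res.json())
//   .then((json) => console.log(json.data.allPlayers));
```

Using variables instead of string concatenation keeps the query text constant and leaves escaping of the values to the JSON encoder.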
GraphQL Mutations
Magic Johnson was a magician on the court for sure. But he couldn’t have done it without his pal Kareem Abdul-Jabbar. Let’s add Kareem to our database. We can define GraphQL mutations to perform operations like adding, updating and removing data from our graph.
First, let’s add a mutation type to the schema. It looks a little bit like a function signature:
    type Mutation {
      createPlayer(name: String, championshipCount: Int, teamId: String): Player
    }
Then, we need to implement it and add it to the root value. The implementation simply takes the parameters provided by the query and adds a new object to data['allPlayers']. It also makes sure to set the team correctly. Finally, it returns the new player.
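    createPlayer: (args) => {
      // Derive the next id from the current number of players.
      const id = (_.values(data['allPlayers']).length + 1).toString();
      args['id'] = id;
      // Attach the full team object so nested team queries work.
      args['team'] = data['allTeams'][args['teamId']];
      data['allPlayers'][id] = args;
      return data['allPlayers'][id];
    },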
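One caveat: the Player type declares team as Team!, so a teamId that matches no team would leave team undefined and the response would carry an error instead of the new player. A more defensive variant might look like the sketch below. The name createPlayerChecked and the trimmed data store are assumptions made for this example.

```javascript
// Trimmed stand-in for the tutorial's data store.
const data = {
  allPlayers: {
    '1': { id: '1', name: 'Magic Johnson', championshipCount: 5, teamId: '2' },
  },
  allTeams: {
    '2': { id: '2', name: 'Los Angeles Lakers', championshipCount: 16, players: [] },
  },
};

// Hypothetical defensive variant of createPlayer.
function createPlayerChecked({ name, championshipCount, teamId }) {
  const team = data.allTeams[teamId];
  if (!team) {
    // Fail before anything is stored, with a message that names the bad id.
    throw new Error(`Unknown teamId: ${teamId}`);
  }
  const id = String(Object.keys(data.allPlayers).length + 1);
  // Build a fresh object instead of mutating the incoming arguments.
  const player = { id, name, championshipCount, teamId, team };
  data.allPlayers[id] = player;
  return player;
}

const kareem = createPlayerChecked({
  name: 'Kareem Abdul-Jabbar',
  championshipCount: 6,
  teamId: '2',
});
console.log(kareem.id, kareem.team.name);
```

Checking the team first means a bad request never leaves a half-initialized player in the store.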
To actually add Kareem, we can invoke the mutation and query the returned player:
    mutation addKareem {
      createPlayer(name: "Kareem Abdul-Jabbar", championshipCount: 6, teamId: "2") {
        name
        championshipCount
        team {
          name
        }
      }
    }

Output:

    {
      "data": {
        "createPlayer": {
          "name": "Kareem Abdul-Jabbar",
          "championshipCount": 6,
          "team": {
            "name": "Los Angeles Lakers"
          }
        }
      }
    }
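Here is a dark little secret about mutations… they are almost exactly the same as queries. You can modify your data in a query, and you may just return data from a mutation. GraphQL is not going to peek into your code. Both queries and mutations can take arguments and return data. The one difference the specification makes is in execution order: the top-level fields of a mutation run one after another, while the fields of a query may run in parallel. Beyond that, the distinction is more like syntactic sugar that makes your schema more human-readable.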
Advanced Topics
Subscriptions
Subscriptions are another killer feature of GraphQL. With subscriptions, the client can subscribe to events that will be fired whenever the server’s state changes. Subscriptions were introduced at a later stage and are implemented by different frameworks in different ways.
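Since the wiring differs from framework to framework, the sketch below shows only the underlying publish/subscribe idea, using Node's built-in EventEmitter. It is a conceptual illustration, not a GraphQL subscription implementation. The event name and the simplified createPlayer function are made up for this example.

```javascript
const { EventEmitter } = require('events');

// Conceptual sketch only: real GraphQL subscriptions run over a transport
// such as WebSockets and are wired up by the server framework.
const events = new EventEmitter();
const PLAYER_CREATED = 'playerCreated'; // hypothetical event name

// "Client" side: subscribe and collect the pushed payloads.
const received = [];
events.on(PLAYER_CREATED, (player) => received.push(player.name));

// "Server" side: publish whenever state changes, e.g. inside a mutation.
function createPlayer(name) {
  const player = { name };
  events.emit(PLAYER_CREATED, player);
  return player;
}

createPlayer('Kareem Abdul-Jabbar');
console.log(received);
```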
Validation
GraphQL will verify every query or mutation against the schema. This is a big win when the input data has a complex shape. You don’t have to write annoying and brittle validation code. GraphQL will take care of it for you.
Schema Introspection
You can inspect and query the current schema itself. That gives you meta-powers to dynamically discover the schema. Here is a query that returns all the type names and their description:
    query q {
      __schema {
        types {
          name
          description
        }
      }
    }
Conclusion
GraphQL is an exciting new API technology that provides many benefits over REST APIs. There is a vibrant community behind it, not to mention Facebook. I predict that it will become a front-end staple in no time. Give it a try. You’ll like it.