gatsby-source-graphql
⚠️ Warning
We do not recommend using this plugin if your content source has an existing source plugin (like gatsby-source-wordpress for WordPress, gatsby-source-contentful for Contentful, etc.) This plugin has known limitations, specifically in that it does not support Incremental Builds, CMS Preview, image optimizations, and lack of full support for the GraphQL data layer. Please only use it for simple proof-of-concepts and if there is not an existing source plugin for your data source.
Description
Plugin for connecting arbitrary GraphQL APIs to Gatsby’s GraphQL. Remote schemas are stitched together by declaring an arbitrary type name that wraps the remote schema Query type (typeName
below), and putting the remote schema under a field of the Gatsby GraphQL query (fieldName
below).
Known Limitations
-
⚠️ Lack of support for Incremental Builds
- This can cause significant build speed issues, particularly for larger, content-heavy sites
- ⚠️ Lack of support for CMS Preview and real-time previews for content / API updates
- ⚠️ Lack of full support for GraphQL data layer, including image optimization / image CDN, and directive support
Install
npm install gatsby-source-graphql
How to use
If the remote GraphQL API needs authentication, you should pass environment variables to the build process, so credentials aren’t committed to source control. We recommend using dotenv
, which will then expose environment variables. Read more about dotenv and using environment variables here. Then we can use these environment variables via process.env
and configure our plugin.
// In your gatsby-config.js
module.exports = {
plugins: [
// Simple config, passing URL
{
resolve: "gatsby-source-graphql",
options: {
// Arbitrary name for the remote schema Query type
typeName: "SWAPI",
// Field under which the remote schema will be accessible. You'll use this in your Gatsby query
fieldName: "swapi",
// Url to query from
url: "https://swapi-graphql.netlify.app/.netlify/functions/index",
},
},
// Advanced config, passing parameters to apollo-link
{
resolve: "gatsby-source-graphql",
options: {
typeName: "GitHub",
fieldName: "github",
url: "https://api.github.com/graphql",
// HTTP headers
headers: {
// Learn about environment variables: https://gatsby.dev/env-vars
Authorization: `Bearer ${process.env.GITHUB_TOKEN}`,
},
// HTTP headers alternatively accepts a function (allows async)
headers: async () => {
return {
Authorization: await getAuthorizationToken(),
}
},
// Additional options to pass to node-fetch
fetchOptions: {},
},
},
// Advanced config, using a custom fetch function
{
resolve: "gatsby-source-graphql",
options: {
typeName: "GitHub",
fieldName: "github",
url: "https://api.github.com/graphql",
// A `fetch`-compatible API to use when making requests.
fetch: (uri, options = {}) =>
fetch(uri, { ...options, headers: sign(options.headers) }),
},
},
// Complex situations: creating arbitrary Apollo Link
{
resolve: "gatsby-source-graphql",
options: {
typeName: "GitHub",
fieldName: "github",
// Create Apollo Link manually. Can return a Promise.
createLink: pluginOptions => {
return createHttpLink({
uri: "https://api.github.com/graphql",
headers: {
Authorization: `Bearer ${process.env.GITHUB_TOKEN}`,
},
fetch,
})
},
},
},
],
}
How to Query
{
# This is the fieldName you've defined in the config
swapi {
allSpecies {
name
}
}
github {
viewer {
email
}
}
}
Schema definitions
By default, the schema is introspected from the remote schema. The schema is cached in the .cache
directory, and refreshing the schema requires deleting the cache (e.g. by restarting gatsby develop
).
To control schema consumption, you can alternatively construct the schema definition by passing a createSchema
callback. This way you could, for example, read schema SDL or introspection JSON. When the createSchema
callback is used, the schema isn’t cached. createSchema
can return a GraphQLSchema instance, or a Promise resolving to one.
const fs = require("fs")
const { buildSchema, buildClientSchema } = require("graphql")
module.exports = {
plugins: [
{
resolve: "gatsby-source-graphql",
options: {
typeName: "SWAPI",
fieldName: "swapi",
url: "https://api.graphcms.com/simple/v1/swapi",
createSchema: async () => {
const json = JSON.parse(
fs.readFileSync(`${__dirname}/introspection.json`)
)
return buildClientSchema(json.data)
},
},
},
{
resolve: "gatsby-source-graphql",
options: {
typeName: "SWAPI",
fieldName: "swapi",
url: "https://api.graphcms.com/simple/v1/swapi",
createSchema: async () => {
const sdl = fs.readFileSync(`${__dirname}/schema.sdl`).toString()
return buildSchema(sdl)
},
},
},
],
}
Composing Apollo Links for production network setup
Network requests can fail, return errors or take too long. Use Apollo Link to add retries, error handling, logging and more to your GraphQL requests.
Use the plugin’s createLink
option to add a custom Apollo Link to your GraphQL requests.
You can compose different types of links, depending on the functionality you’re trying to achieve. The most common links are:
@apollo/client/link/retry
for retrying queries that fail or time out@apollo/client/link/error
for error handling@apollo/client/link/http
for sending queries in http requests (used by default)
For more explanation of how Apollo Links work together, check out this Medium article: Productionizing Apollo Links.
Here’s an example of using the HTTP link with retries (using @apollo/client/link/retry):
// gatsby-config.js
const { createHttpLink, from } = require(`@apollo/client`)
const { RetryLink } = require(`@apollo/client/link/retry`)
const retryLink = new RetryLink({
delay: {
initial: 100,
max: 2000,
jitter: true,
},
attempts: {
max: 5,
retryIf: (error, operation) =>
Boolean(error) && ![500, 400].includes(error.statusCode),
},
})
module.exports = {
plugins: [
{
resolve: "gatsby-source-graphql",
options: {
typeName: "SWAPI",
fieldName: "swapi",
url: "https://api.graphcms.com/simple/v1/swapi",
// `pluginOptions`: all plugin options
// (i.e. in this example object with keys `typeName`, `fieldName`, `url`, `createLink`)
createLink: pluginOptions =>
from([retryLink, createHttpLink({ uri: pluginOptions.url })]),
},
},
],
}
Custom transform schema function (advanced)
It’s possible to modify the remote schema, via a transformSchema
option which customizes the way the default schema is transformed before it is merged on the Gatsby schema by the stitching process.
The transformSchema
function gets an object argument with the following fields:
- schema (introspected remote schema)
- link (default link)
- resolver (default resolver)
- defaultTransforms (an array with the default transforms)
- options (plugin options)
The return value is expected to be the final schema used for stitching.
Below an example configuration that uses the default implementation (equivalent to not using the transformSchema
option at all):
const { wrapSchema } = require(`@graphql-tools/wrap`)
const { linkToExecutor } = require(`@graphql-tools/links`)
module.exports = {
plugins: [
{
resolve: "gatsby-source-graphql",
options: {
typeName: "SWAPI",
fieldName: "swapi",
url: "https://api.graphcms.com/simple/v1/swapi",
transformSchema: ({
schema,
link,
resolver,
defaultTransforms,
options,
}) => {
return wrapSchema(
{
schema,
executor: linkToExecutor(link),
},
defaultTransforms
)
}
},
]
}
For details, refer to https://www.graphql-tools.com/docs/schema-wrapping.
An use case for this feature can be seen in this issue.
Refetching data
By default, gatsby-source-graphql
will only refetch the data once the server is restarted. It’s also possible to configure the plugin to periodically refetch the data. The option is called refetchInterval
and specifies the timeout in seconds.
module.exports = {
plugins: [
// Simple config, passing URL
{
resolve: "gatsby-source-graphql",
options: {
// Arbitrary name for the remote schema Query type
typeName: "SWAPI",
// Field under which the remote schema will be accessible. You'll use this in your Gatsby query
fieldName: "swapi",
// Url to query from
url: "https://api.graphcms.com/simple/v1/swapi",
// refetch interval in seconds
refetchInterval: 60,
},
},
],
}
Performance tuning
By default, gatsby-source-graphql
executes each query in a separate network request.
But the plugin also supports query batching to improve query performance.
Caveat: Batching is only possible for queries starting at approximately the same time. In other words it is bounded by the number of parallel GraphQL queries executed by Gatsby (by default it is 4).
Fortunately, we can increase the number of queries executed in parallel by setting the environment variable
GATSBY_EXPERIMENTAL_QUERY_CONCURRENCY
to a higher value and setting the batch
option of the plugin
to true
.
Example:
cross-env GATSBY_EXPERIMENTAL_QUERY_CONCURRENCY=20 gatsby develop
With plugin config:
const fs = require("fs")
const { buildSchema, buildClientSchema } = require("graphql")
module.exports = {
plugins: [
{
resolve: "gatsby-source-graphql",
options: {
typeName: "SWAPI",
fieldName: "swapi",
url: "https://api.graphcms.com/simple/v1/swapi",
batch: true,
},
},
],
}
By default, the plugin batches up to 5 queries. You can override this by passing
dataLoaderOptions
and set a maxBatchSize
:
const fs = require("fs")
const { buildSchema, buildClientSchema } = require("graphql")
module.exports = {
plugins: [
{
resolve: "gatsby-source-graphql",
options: {
typeName: "SWAPI",
fieldName: "swapi",
url: "https://api.graphcms.com/simple/v1/swapi",
batch: true,
// See https://github.com/graphql/dataloader#new-dataloaderbatchloadfn--options
// for a full list of DataLoader options
dataLoaderOptions: {
maxBatchSize: 10,
},
},
},
],
}
Having 20 parallel queries with 5 queries per batch means we are still running 4 batches in parallel.
Each project is unique so try tuning those two variables and see what works best for you. We’ve seen up to 5-10x speed-up for some setups.
How batching works
Under the hood gatsby-source-graphql
uses DataLoader
for query batching. It merges all queries from a batch to a single query that gets sent to the
server in a single network request.
Consider the following example where both of these queries are run:
{
query: `query(id: Int!) {
node(id: $id) {
foo
}
}`,
variables: { id: 1 },
}
{
query: `query(id: Int!) {
node(id: $id) {
bar
}
}`,
variables: { id: 2 },
}
They will be merged into a single query:
{
query: `
query(
$gatsby0_id: Int!
$gatsby1_id: Int!
) {
gatsby0_node: node(id: $gatsby0_id) {
foo
}
gatsby1_node: node(id: $gatsby1_id) {
bar
}
}
`,
variables: {
gatsby0_id: 1,
gatsby1_id: 2,
}
}
Then gatsby-source-graphql
splits the result of this single query into multiple results
and delivers it back to Gatsby as if it executed multiple queries:
{
data: {
gatsby0_node: { foo: `foo` },
gatsby1_node: { bar: `bar` },
},
}
is transformed back to:
[
{ data { node: { foo: `foo` } } },
{ data { node: { bar: `bar` } } },
]
Note that if any query result contains errors the whole batch will fail.
Apollo-style batching
If your server supports apollo-style query batching you can also try
HttpLinkDataLoader.
Pass it to the gatsby-source-graphql
plugin via the createLink
option.
This strategy is usually slower than query merging but provides better error reporting.