Creating a Source Plugin
In this tutorial, you’ll create your own source plugin that will gather data from an API. The plugin will source data, optimize remote images, and create foreign key relationships between data sourced by your plugin.
What is a source plugin?
Source plugins “source” data from remote or local locations into what Gatsby calls nodes. This tutorial uses a demo API so that you can see how the data works on both the frontend and backend, but the same principles apply if you would like to source data from another API.
For more background on source plugins, check out Gatsby’s source plugin documentation.
Why create a source plugin?
Source plugins convert data from any source into a format that Gatsby can process. Your Gatsby site can use several source plugins to combine data in interesting ways.
There may not be an existing plugin for your data source, so you can create your own.
NOTE: If your data is local (that is, on your file system and part of your site’s repo), you generally don’t need to create a new source plugin. Instead, use gatsby-source-filesystem, which handles reading and watching files for you. You can then use transformer plugins like gatsby-transformer-yaml to make queryable data from files.
How to create a source plugin
Overview
The plugin in this tutorial will source blog posts and authors from the demo API, link the posts and authors, and take image URLs from the posts and optimize them automatically. You’ll be able to configure your plugin in your site’s gatsby-config.js file and write GraphQL queries to access your plugin’s data.
This tutorial builds off of an existing Gatsby site and some data. If you want to follow along with this tutorial, you can find the codebase inside the examples folder of the Gatsby repository. Once you clone this code, make sure to delete the source-plugin and example-site folders. Otherwise, the tutorial steps will already be completed.
An example API request
To see the API in action, you can run it locally by navigating into the api folder, installing dependencies with npm install, and starting the server with npm start. You will then be able to navigate to a GraphQL playground running at http://localhost:4000. This is a GraphQL server running in Node.js and is separate from Gatsby. The server could be replaced with a different backend or data source, and the patterns in this tutorial would remain the same. Other possible sources include a REST API, local files, or even a database: as long as you can access the data, it can be sourced.
If you paste the following query into the left side of the window and press the play button, you should see data for posts with their IDs and descriptions returned:
This data is an example of the data you will source with your plugin.
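The query itself is not reproduced in this text. As a minimal sketch, assuming the demo schema exposes a posts field with id and description (the two fields mentioned above), the query could look like the string below. The names POSTS_QUERY and buildGraphQLRequest are illustrative and not part of the demo API.

```javascript
// Assumed query: `posts`, `id`, and `description` come from the prose above.
// The text inside the backticks is what you would paste into the playground.
const POSTS_QUERY = `
  {
    posts {
      id
      description
    }
  }
`;

// Hypothetical helper: builds the options for a GraphQL-over-HTTP POST request.
function buildGraphQLRequest(query) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  };
}

const request = buildGraphQLRequest(POSTS_QUERY);
console.log(request.body);
```

If you prefer to send the query from Node.js instead of the playground, the request object could be passed to a fetch implementation pointed at the server. The exact endpoint path depends on how the server is configured.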
You can also see a running version of the GraphQL playground at https://gatsby-source-plugin-api.glitch.me/, which runs the api folder in a Glitch project, just as npm start does on your own computer.
Plugin behavior
Your plugin will have the following behavior:
- Make an API request to the demo API.
- Convert the data in the API response to Gatsby’s node system.
- Link the nodes together so you can query for an author on each post.
- Accept plugin options to customize how your plugin works.
- Optimize images from Unsplash URLs so they can be used with gatsby-image.
Set up projects for plugin development
You’ll need to set up an example site and create a plugin inside it to begin building.
Set up an example site
Create a new Gatsby site with the gatsby new command, based on the hello world starter.
The site generated by the new command is where the plugin will be installed, giving you a place to test the code for your plugin.
Set up a source plugin
Create a new Gatsby plugin with the gatsby new command, this time based on the plugin starter.
This will create your plugin in a separate project from your example site, but you could also include it in your site’s plugins folder.
Your plugin starts with a few files from the starter, which can be seen in the snippet below:
The biggest changes will be in gatsby-node.js. This file is where Gatsby expects to find any usage of the Gatsby Node APIs. These APIs let you customize and extend the default Gatsby settings that affect the site build process. All the logic for sourcing data will live in this file.
Install your plugin in the example site
You need to install your plugin in the site to be able to test that your code is running. Gatsby only knows to run plugins that are included in its gatsby-config.js file. Open up the gatsby-config.js file in the example-site and add your plugin using require.resolve. If you decide to publish your plugin, it can be installed with npm install <plugin-name> and added to the config by name instead of with require.resolve.
You can also include the plugin by name if you are using npm link or yarn workspaces, or if you place your source-plugin folder in example-site/plugins rather than in a folder one level above the site. In those cases require.resolve is not needed.
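As a rough sketch of the require.resolve approach, assuming the plugin lives in a sibling folder named source-plugin one level above example-site, the config entry could look like this:

```javascript
// example-site/gatsby-config.js
// Assumption: the plugin folder is a sibling of example-site.
module.exports = {
  plugins: [
    // require.resolve turns the relative path into an absolute module path
    require.resolve(`../source-plugin`),
  ],
};
```

Adjust the relative path if your folders are laid out differently.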
You can now navigate into the example-site folder and run gatsby develop. You should see a line in the output in the terminal that shows your plugin loaded:
If you open the gatsby-node.js file in your source-plugin folder, you will see the console.log that produces that output in the terminal.
Source data and create nodes
Data is sourced in the gatsby-node.js file of source plugins or Gatsby sites. Specifically, it’s done by calling a Gatsby function called createNode inside of the sourceNodes API in the gatsby-node.js file.
Create nodes inside of sourceNodes with the createNode function
Open up the gatsby-node.js file in the source-plugin project and add the following code to create nodes from a hardcoded array of data:
This code creates Gatsby nodes that are queryable in a site. The following bullets break down what is happening in the code:
- You implemented Gatsby’s sourceNodes API, which Gatsby will run as part of its bootstrap process, and pulled out some Gatsby helpers (like createContentDigest and createNodeId) to facilitate creating nodes.
- You provided the required fields for the node, like creating a node ID and a content digest (which Gatsby uses to track dirty nodes, that is, nodes that have changed). The content digest should include the whole content of the item (post, in this case).
- Then you stored some data in an array and looped through it, calling createNode on each post in the array.
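The steps in the bullets above can be sketched as follows. This is a minimal illustration rather than the exact tutorial code: the POST_NODE_TYPE constant and the two sample posts are assumptions, and the function is kept as a plain constant so it can be run outside Gatsby.

```javascript
// Assumed constant name for the node type used in this sketch.
const POST_NODE_TYPE = `Post`;

// Gatsby calls sourceNodes during bootstrap and passes in these helpers.
const sourceNodes = ({ actions, createContentDigest, createNodeId }) => {
  const { createNode } = actions;

  // Hardcoded placeholder data; the values are illustrative only.
  const data = {
    posts: [
      { id: 1, description: `Hello world!` },
      { id: 2, description: `Second post!` },
    ],
  };

  data.posts.forEach(post =>
    createNode({
      ...post,
      // createNodeId produces a globally unique Gatsby ID from the string
      id: createNodeId(`${POST_NODE_TYPE}-${post.id}`),
      parent: null,
      children: [],
      internal: {
        type: POST_NODE_TYPE,
        content: JSON.stringify(post),
        // digest of the whole item, used to detect changed nodes
        contentDigest: createContentDigest(post),
      },
    })
  );
};

// In gatsby-node.js you would expose it with: exports.sourceNodes = sourceNodes
```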
If you run the example-site with gatsby develop, you can now open up http://localhost:8000/___graphql and query your posts with this query:
The problem with this data is that it is not coming from the API; it is hardcoded into an array. The declaration of the data array needs to be updated to pull data from a different location.
Querying and sourcing data from a remote location
You can query data from any location to source at build time using functions and libraries like Node.js’s built-in http.get, axios, or node-fetch. This tutorial uses a GraphQL client so that the source plugin can support GraphQL subscriptions when it fetches data from the demo API, and can proactively update your data in the site when information on the API changes.
Adding dependencies
You’ll use several modules from npm to make fetching data with GraphQL easier. Install them in the source-plugin project with:
Note: The libraries used here are specifically chosen so that the source plugin can support GraphQL subscriptions. You can fetch data the same way you would in any other Node.js app or however you are most comfortable.
Open your package.json file after installation and you’ll see the packages have been added to a dependencies section at the end of the file.
Configure an Apollo client to fetch data
Import the handful of Apollo packages that you installed to help set up an Apollo client in your plugin:
Then you can copy this code that sets up the necessary pieces of the Apollo client and paste it after your imports:
You can read about each of the packages that are working together in Apollo’s docs. The end result is a client that you can use to call methods like query to get data from the source it’s configured to work with. In this case, that is http://localhost:4000, where you should have the API running. If you can’t run the API locally, you can update the URLs for the client to use gatsby-source-plugin-api.glitch.me, where a version of the API is deployed, instead of http://localhost:4000.
Query data from the API
Now you can replace the hardcoded data in the sourceNodes function with a GraphQL query:
Now you’re creating nodes based on data coming from the API. Neat! However, only the id and description fields are coming back from the API and being saved to each node, so add the rest of the fields to the query so that the same data is available to Gatsby.
This is also a good time to add data to your query so that it also returns authors.
With the new data, you can also loop through the authors to create Gatsby nodes from them by adding another loop to sourceNodes:
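One way to picture the two loops is the sketch below. It assumes the query result has the shape { posts: [...], authors: [...] }, and the helper names toNode and createNodesFromData are hypothetical. The data argument stands in for the result of a call such as client.query inside sourceNodes.

```javascript
const POST_NODE_TYPE = `Post`;
const AUTHOR_NODE_TYPE = `Author`;

// Hypothetical helper: wraps one API record in the fields Gatsby requires.
const toNode = (record, type, { createNodeId, createContentDigest }) => ({
  ...record,
  id: createNodeId(`${type}-${record.id}`),
  parent: null,
  children: [],
  internal: {
    type,
    content: JSON.stringify(record),
    contentDigest: createContentDigest(record),
  },
});

// `data` is assumed to look like { posts: [...], authors: [...] }.
const createNodesFromData = (data, { actions, ...helpers }) => {
  // First loop: one Gatsby node per post
  data.posts.forEach(post =>
    actions.createNode(toNode(post, POST_NODE_TYPE, helpers))
  );
  // Second loop: one Gatsby node per author
  data.authors.forEach(author =>
    actions.createNode(toNode(author, AUTHOR_NODE_TYPE, helpers))
  );
};
```

Sharing one helper between both loops keeps the required node fields in a single place, which is a design choice of this sketch rather than something the tutorial prescribes.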
At this point you should be able to run gatsby develop in your example-site, open up GraphiQL at http://localhost:8000/___graphql and query both posts and authors.
Optimize remote images
Each node of post data has an imgUrl field with the URL of an image on Unsplash. You could use that URL to load images on your site, but they will be large and take a long time to load. You can optimize the images with your source plugin so that a site using your plugin already has data for gatsby-image ready to go!
You can read about how to use Gatsby Image to prevent image bloat if you are unfamiliar with it.
Create remoteFileNodes from a URL
To create optimized images from URLs, File nodes for image files need to be added to your site’s data. Then, you can install gatsby-plugin-sharp and gatsby-transformer-sharp, which will automatically find image files and add the data needed for gatsby-image.
Start by installing gatsby-source-filesystem in the source-plugin project:
Now in your plugin’s gatsby-node.js file, you can implement a new API, called onCreateNode, that gets called every time a node is created. You can check if the node created was one of your Post nodes, and if it was, create a file from the URL on the imgUrl field.
Import the createRemoteFileNode helper from gatsby-source-filesystem, which will download a file from a remote location and create a File node for you.
Then export a new function onCreateNode, and call createRemoteFileNode in it whenever a node of type Post is created:
This code is called every time a node is created, e.g. when createNode is invoked. Each time it is called in the sourceNodes step, the condition checks whether the node is a Post node. Since those are the only nodes with an image associated with them, that is the only time images need to be optimized. A remote file node is then created and, if that succeeds, the fileNode is returned. The next few lines are important:
By assigning a field called remoteImage___NODE to the ID of the File node that was created, Gatsby will be able to infer a connection between this field and the file node. This will allow fields on the file to be queried from the post node.
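Putting the pieces of this step together, an onCreateNode implementation might look like the sketch below. To keep it runnable without Gatsby installed, createRemoteFileNode is passed in through a hypothetical factory called makeOnCreateNode. In a real plugin you would import the helper from gatsby-source-filesystem and export onCreateNode directly.

```javascript
const POST_NODE_TYPE = `Post`;

// In a plugin, the helper would come from:
//   const { createRemoteFileNode } = require(`gatsby-source-filesystem`)
// Here it is injected so the sketch has no external dependencies.
const makeOnCreateNode = createRemoteFileNode => async ({
  node,
  actions: { createNode },
  createNodeId,
  getCache,
}) => {
  // Only Post nodes carry an imgUrl, so skip every other node type.
  if (node.internal.type !== POST_NODE_TYPE) return;

  // Downloads the file and creates a File node for it.
  const fileNode = await createRemoteFileNode({
    url: node.imgUrl,
    parentNodeId: node.id,
    createNode,
    createNodeId,
    getCache,
  });

  if (fileNode) {
    // The ___NODE suffix lets Gatsby infer a link to the File node by ID.
    node.remoteImage___NODE = fileNode.id;
  }
};
```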
Note: you can use schema customization APIs to create these kinds of connections between nodes as well as sturdier and more strictly typed ones.
At this point you have created local image files from the remote locations and associated them with your posts, but you still need to transform the files into optimized versions.
Transform File nodes with sharp plugins
Sharp plugins make optimization of images possible at build time.
Install gatsby-plugin-sharp and gatsby-transformer-sharp in the example-site (not the plugin):
Then include the plugins in your gatsby-config:
Because the sharp plugins are installed in the site, they run after the source plugin, transform the file nodes, and add fields for the optimized versions at childImageSharp. The transformer plugin looks for File nodes with extensions like .jpg and .png, creates optimized images from them, and creates the GraphQL fields for you.
Now when you run your site, you will also be able to query a childImageSharp field on post.remoteImage:
With data available, you can now query optimized images to use with the gatsby-image component in a site! You will need to install gatsby-image before you can use it.
Create foreign key relationships between data
To link the posts to the authors, Gatsby needs to be aware that the two are associated, and how. You have already implemented one example of this when Gatsby inferred a connection between a remoteImage and the remote file from Unsplash.
The best approach for connecting related data is through customizing the GraphQL schema. By implementing the createSchemaCustomization API, you can specify the exact shape of a node’s data. While defining that shape, you can optionally link a node to other nodes to create a relationship.
Copy this code and add it to the gatsby-node.js file in the source-plugin:
The author: Author @link(from: "author.name" by: "name") line tells Gatsby to look for the value on the Post node at post.author.name and relate it to an Author node with a matching name. This demonstrates the ability to link using more than just an ID.
The line remoteImage: File @link tells Gatsby to look for a remoteImage field on a Post node and link it to a File node with the ID there.
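A schema customization containing the two @link lines described above could be sketched like this. The two link directives are quoted from the text. The remaining fields and their non-null markers are assumptions based on the fields this tutorial mentions, and your real types may include more.

```javascript
// Gatsby calls createSchemaCustomization before nodes are sourced.
const createSchemaCustomization = ({ actions }) => {
  const { createTypes } = actions;
  createTypes(`
    type Post implements Node {
      id: ID!
      description: String!
      imgUrl: String!
      # link to the File node whose ID is stored on remoteImage
      remoteImage: File @link
      # link to the Author node whose name matches post.author.name
      author: Author @link(from: "author.name" by: "name")
    }

    type Author implements Node {
      id: ID!
      name: String!
    }
  `);
};

// In gatsby-node.js you would expose it with:
//   exports.createSchemaCustomization = createSchemaCustomization
```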
Now, instead of using inference for the remoteImage field, you can take off the ___NODE suffix. You can update the code in the onCreateNode API like this:
Now running the site will allow you to query authors and remoteImages from the post nodes!
Using data from the source plugin in a site
In the example-site, you can now query data from pages.
Add a file at example-site/src/pages/index.js and copy the following code into it:
Ensure you have gatsby-image installed in the site by running npm install gatsby-image. It provides a component that can take the optimized image data and render it.
This code uses a page query to fetch all posts and provide them to the component in the data prop at build time. The JSX code loops through the posts so they can be rendered to the DOM.
Using plugin options to customize plugin usage
You can pass options into a plugin through a gatsby-config.js file. Update the code where your plugin is installed in the example-site, changing it from a string to an object with a resolve and options key.
Now the options you designated (like previewMode: true) will be passed into each of the Gatsby Node APIs like sourceNodes, making options accessible inside of Gatsby APIs. Add an argument called pluginOptions to your sourceNodes function.
Options can be a good way of providing conditional paths to logic that you as a plugin author want to provide or limit.
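As a small sketch of how the option could be read, assuming the config entry looks like the one in the first comment, the second argument to sourceNodes carries the options object. The helper name isPreviewEnabled is hypothetical.

```javascript
// Assumed entry in example-site/gatsby-config.js:
//   { resolve: require.resolve(`../source-plugin`), options: { previewMode: true } }

// Hypothetical helper: treat preview as enabled only when explicitly set to true.
const isPreviewEnabled = (pluginOptions = {}) =>
  pluginOptions.previewMode === true;

// Gatsby passes the plugin's options as the second argument to each Node API.
const sourceNodes = (gatsbyApi, pluginOptions) => {
  if (isPreviewEnabled(pluginOptions)) {
    console.log(`previewMode is on`);
  }
  // ...source and create nodes as before...
};
```

Defaulting to an empty object means the plugin still works when a site lists it as a plain string with no options.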
Proactively updating data with subscriptions
The data sourced for your site was fetched using Apollo Client, which supports subscriptions. GraphQL subscriptions listen for changes in data and return changes to the GraphQL client. Your source plugin is able to listen (or subscribe) to the new data that is incoming. That means if a post is updated, your source plugin can listen for that change and update the data in your site without you having to restart it. Neat!
The API you connect to needs to provide support for live changes to data in order for this to be possible. You can read about other options for live data updates in the creating a source plugin guide.
You already set up your client to handle subscriptions by providing a websocket link (ws://localhost:4000 or ws://gatsby-source-plugin-api.glitch.me/). Now you need to add some logic to your sourceNodes function to handle updating and deleting nodes, rather than just creating them. The first step is touching nodes, which makes sure that Gatsby doesn’t discard the nodes that don’t get updated when sourceNodes gets called.
You can use the getNodesByType function to gather up the post and author nodes, loop through each, and call touchNode on each node.
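The touching step might be sketched as below. The helper name touchExistingNodes is hypothetical, and the touchNode call uses the older object-style signature as an assumption. Check the Gatsby version you are on, since newer versions accept the node itself.

```javascript
const POST_NODE_TYPE = `Post`;
const AUTHOR_NODE_TYPE = `Author`;

// Marks existing nodes as still in use so Gatsby does not discard them.
const touchExistingNodes = ({ actions, getNodesByType }) => {
  const { touchNode } = actions;
  const nodeTypes = [POST_NODE_TYPE, AUTHOR_NODE_TYPE];

  nodeTypes.forEach(type =>
    getNodesByType(type).forEach(node =>
      // Assumption: Gatsby v2-style call; newer versions accept touchNode(node).
      touchNode({ nodeId: node.id })
    )
  );
};
```

You would call this helper near the top of sourceNodes, passing along the same arguments Gatsby gives you.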
Then, you can use the plugin option provided in the previous section for previewMode to only turn on subscriptions when it’s set to true.
Using Apollo Client, you can create a subscription with almost all of the same fields as your query. Note this string of fields in the GraphQL subscription also includes a status. This status is returned by the subscriptions on the backend as a mechanism to allow Gatsby to know what to do with each piece of updated data.
Now you can write a function to subscribe to the data updates (like when a post is updated), and handle the new data coming in with a switch statement.
Posts that are changed on the backend while Gatsby is running will be created if they are new or updated, and deleted if they were deleted on the backend.
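The switch statement described above can be sketched as a function that handles one post from the subscription. The status strings (deleted, created, updated) and the helper name handlePostUpdate are assumptions, as is the object-style deleteNode signature, which differs between Gatsby versions.

```javascript
const POST_NODE_TYPE = `Post`;

// Handles a single post delivered by the subscription.
const handlePostUpdate = (
  post,
  { actions, createNodeId, createContentDigest, getNode }
) => {
  const { createNode, deleteNode } = actions;
  const nodeId = createNodeId(`${POST_NODE_TYPE}-${post.id}`);

  switch (post.status) {
    case `deleted`:
      // Assumption: v2-style signature; newer versions accept deleteNode(node).
      deleteNode({ node: getNode(nodeId) });
      break;
    case `created`:
    case `updated`:
    default:
      // Creating a node with an existing ID replaces the previous version.
      createNode({
        ...post,
        id: nodeId,
        parent: null,
        children: [],
        internal: {
          type: POST_NODE_TYPE,
          content: JSON.stringify(post),
          contentDigest: createContentDigest(post),
        },
      });
      break;
  }
};
```

Inside sourceNodes, you would call this for each post in the data passed to the callback you register on the observable returned by client.subscribe.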
You can test that this is working by running the site again and updating one of the posts. When you run the site this time you should see a message logged in the console: Subscribing to content updates... Now, running an updatePost or deletePost mutation on the GraphQL server will send information to the subscription because it is now listening.
Follow these steps to test it out:
- Open up your site at http://localhost:8000 after you run gatsby develop
- Open up the GraphQL playground at http://localhost:4000 (if you are running the api folder locally) or https://gatsby-source-plugin-api.glitch.me/ and first run a query for posts:
- Copy the ID from the post that you would like to update
- Inside the GraphQL playground, run an update post mutation, replacing <id> with the ID you just copied
- When you run the mutation, the data will be updated on the backend, the subscription will recognize the change, Gatsby will update the node, and your page query will render the new data.
It’s so fast that it’s a blink-and-you’ll-miss-it kind of moment, so try running another mutation, or even a deletePost mutation, to make it easier to see the changes!
Publishing a plugin
Don’t publish this particular plugin to npm or the Gatsby Plugin Library, because it’s just a sample plugin for the tutorial. However, if you’ve built a local plugin for your project, and want to share it with others, npm allows you to publish your plugins. Check out the npm docs on How to Publish & Update a Package for more info.
NOTE: Once you have published your plugin on npm, don’t forget to edit your plugin’s package.json file to include info about your plugin. If you’d like to publish a plugin to the Gatsby Plugin Library (please do!), please follow these steps.
Summary
You’ve written a Gatsby plugin that:
- can be configured with an entry in your gatsby-config.js file
- requests data from an API
- pulls the API data into Gatsby’s node system
- allows the data to be queried with GraphQL
- optimizes images from a remote location automatically
- links data types with a customized GraphQL schema
- updates new data without needing to restart your Gatsby site
Congratulations!
Additional resources
- Example repository with all of this code implemented
- Creating a first class source plugin for Gatsby Cloud