Gatsby 4 introduced a lot of new features, most notably the introduction of additional rendering options besides Static Site Generation (SSG): Deferred Static Generation (DSG) and Server-Side Rendering (SSR). If you missed this exciting release you can catch up in the blog post What’s New in Gatsby 4 after following this tutorial.
So why should you care about Deferred Static Generation? In a nutshell it delays (or defers) the generation of pages until a user actually requests the page. After that the build artifacts (HTML & JSON) are cached at the CDN level and for all visitors afterwards it behaves like a normal SSG page. In a scenario where you build thousands of pages but only a subset of them gets regular visits from users, DSG can dramatically improve your build speeds. And you still get the SSG benefits like great SEO, speed, and reliability.
In choosing what pages to defer you can improve your page on a granular and individual level — but for how you’d do this the sky is really the limit. In this tutorial I’ll explain how you can use data from your analytics tool (like Google Analytics, Plausible Analytics, Fathom, etc.) to programmatically defer pages that are not popular, only building your top pages as SSG. By the end of this tutorial, you’ll be able to create a source plugin for your analytics tool (if it doesn’t exist yet) and use that data to defer pages in your site. You can find a complete example on GitHub: gatsby-dsg-plausible-analytics.
Prerequisites
If you want to follow this tutorial step-by-step you’ll need to have a couple of accounts for the online platforms in this guide. You need to have a Gatsby Cloud account and an account on GitHub or GitLab. Since I’m personally a big fan of Plausible Analytics it’ll be used in this tutorial. You can’t create a free account there (but start a 30 day trial), so if you need something free you can use Google Analytics.
If you haven’t set up your development environment yet, you can follow Part 0 of the official Gatsby tutorial to do so.
Lastly, you should already have a Gatsby site locally or on GitHub/GitLab that you can work with and connect to Gatsby Cloud.
Conceptual Guide
Before I begin to explain the actual steps you need to take, let me take a step back and explain the concept on a high-level. The problem statement is: You can mark pages as deferred, but what should be the differentiator? And one (of probably many) solutions is to take the data of your analytics tool via their API and use that to mark the pages.
So in short:
- Pull data from analytics tool via their API.
- Use a common but unique identificator (e.g. the URL of the page) to connect the analytics data to each page node to allow a direct comparison.
- Mark a page as deferred if it’s not one of your top pages using the
defer
key increatePage
action.
Adding Analytics to Your Site
Already have your analytics tool set up on your site? Then you can skip this step.
Use gatsby-plugin-plausible to add the necessary tracking script to your site.
Add the plugin to your gatsby-config.js
and provide the necessary domain
option.
Be sure to commit and push your change so that the tracking goes live and you can gather some data. If you don’t have many visitors yet you can also click around with a couple of devices of your own to generate some data.
Not using Plausible Analytics?
If you’re not using Plausible or don’t want to, you can use different plugins to add your analytics tool to your site. For example:
Or use the plugin library to find the fitting plugin.
Deploying to Gatsby Cloud
Already use Gatsby Cloud to build & host your site? Then you can skip this step.
- Go to your Gatsby Cloud Dashboard and click the Add a site button.
- Import your project from GitHub and select the project you worked on until now.
- For now you can leave the sections about integrations and environment variables empty. Later you’ll need to fill out the environment variables for the local plugin.
- Finish the wizard by pressing Create site.
Want a more detailed walkthrough?
No problem! Part 1 of the official Gatsby tutorial has you covered. The guide Create and Deploy Your First Gatsby Site walks you through the creation of a Gatsby site itself and then the deployment to Gatsby Cloud. It uses GitHub to host the code of your site.
Creating A Source Plugin
So you’ve added analytics to your site, gathered some data, and deployed the site to Gatsby Cloud. The Plausible dashboard might look something like this:
The information from the Top Pages box is what you’ll need to access. And you can do that through Plausible’s API. The Stats API Reference documentation explains how and we’ll get back to the details there later. But first, go to plausible.io/settings and obtain an API key. Save it inside an .env
file in the root of your project with PLAUSIBLE_API_KEY=your-key
.
Not using Plausible Analytics?
While you won’t be able to copy/paste the results of each step, you can still follow them. The API endpoint or authentication might be different, or the shape of the result in a slightly different form. But the general idea is the same:
- Go to your analytics tool’s API documentation and check if you need to authenticate requests with them and how. Obtain that authentication then and save it e.g. inside an
.env
file - Request the specific API endpoint to get the top pages of your site
- Create GraphQL nodes with the result
For Google Analytics there’s even an existing plugin called gatsby-source-google-analytics-reporting-api. For reference, you can also check out the Creating a Source Plugin guide from Gatsby.
Local Plugin
Instead of placing the logic for Plausible inside your gatsby-node.js
file at the root of your repository, you should create a local plugin instead. This way your own “gatsby-source-plausible” is easily reusable across your projects.
Start by creating the folder structure /plugins/gatsby-source-plausible
and create both a package.json
and gatsby-node.js
inside it.
For the package.json
use the following contents:
Go into plugins/gatsby-source-plausible
and run npm install
to install node-fetch
.
Please Note: Be sure to only install node-fetch@cjs
/ node-fetch@^2.0.0
as v3 of node-fetch
is ESM-only. At the time of writing this guide Gatsby doesn’t support handling ESM-only dependencies yet.
Lastly, create some boilerplate code inside gatsby-node.js
for the next step:
Retrieving Data & Creating Nodes
The Plausible Stats API Reference explains the /api/v1/stats/breakdown
endpoint and already has an example for “Top Pages” further below:
Here, for a period of 6 months the top 5 pages are returned for the given site id. By default, the limit
is 100. Translating that curl
command into the shape of node-fetch
looks something like this:
The siteId
and apiKey
in the second argument of the sourceNodes
function are plugin options that get passed through when you set them inside gatsby-config.js
. The constructed API_URL
endpoint is the same as Plausible’s example except for the removal of the limit
param.
To authenticate your request to Plausible you set an Authorization
header in the request using the bearer token method (What is a Bearer Token?). As response
is only a Response stream, you need to use .json()
to read it to completion and to get data you can actually work with.
All that is left now is to actually create GraphQL nodes from the API response! Use createNode
to loop over the response and create nodes. In the next step you’ll then be able to query allPlausibleTopPage
. The complete gatsby-node.js
file:
Each node needs a unique id
and by using the helper function createNodeId
you’re ensuring that’s the case. slug
and visitors
will be queryable later, the keys on internal
help Gatsby to correctly track and garbage collect nodes.
Using GraphiQL
Congrats, you wrote a source plugin! Time to test it out and see if it all works. Before you can start the development server, you first need to add your newly created local plugin to gatsby-config.js
. When using .env
files you’ll also need to initialize dotenv
. You saved the PLAUSIBLE_API_KEY
earlier in an .env
file, and the siteId
is the same thing as the domain
for gatsby-plugin-plausible
.
Please Note: Don’t forget to also set PLAUSIBLE_API_KEY
inside Gatsby Cloud as an environment variable. Otherwise the plugin won’t be able to pull the information from Plausible’s API. Go to your Site Settings and in the Environment Variables card you can add the key.
Now, in the root of your project start the development server with gatsby develop
and go to http://localhost:8000/___graphql
and run the following query:
If everything went well you should see something like this:
Now it’s time for you to interpret the data you see. In this instance the index page and three more pages have the most views by some margin. You might see something similar: The index page & a cohort of pages with the most views, then a noticeable gap, and the rest of the pages. Depending on this metric you should now decide how many pages from that top pages you want to mark as SSG. (For the sake of this tutorial three will be chosen).
Deferring Low-Traffic Pages
You’re nearly there! The last piece of this puzzle is to use the data you get from allPlausibleTopPage
in your page creation logic inside gatsby-node.js
of the project. Stop the development server (if you haven’t already) and go to your gatsby-node.js
file at the root of your project (not the gatsby-node.js
of the plugin). You should have something similar:
Querying data (in this case markdown) and creating pages through the createPage
action. A pretty typical createPages
API call. Extend the GraphQL call with the allPlausibleTopPage
type:
And this now really shows the power of Gatsby’s GraphQL data layer and the query options you have with it:
sort
: The result is filtered byvisitors
in a descending order, so the pages with the most visitors are at the topfilter
: Filter out the index page as we don’t want to defer itlimit
: Only return 3 items
The result will be an array of objects with the slug
key. For easier comparison you should convert it to an array just containing the slugs. Lastly, you’ll be comparing the slug from allPlausibleTopPage
with the current page post.frontmatter.slug
to set the defer
key on the createPage
call.
The complete gatsby-node.js
looks something like this:
As at the time of writing this, DSG is not run in gatsby develop
you’ll need to run gatsby build
and gatsby serve
to see it in action now!
You can check if it worked by looking at the build output at the end of gatsby build
:
So as expected, the index page is SSG, and three posts are marked as DSG (and the correct ones). Great! 🎉 Time to see in action on Gatsby Cloud. Commit your changes and push it to trigger a build on Gatsby Cloud (if you work on a feature branch you’ll need to open a PR to trigger a build).
Go to some pages that are marked as deferred. The first request might take a bit longer, after that the requests to other DSG pages should be fast.
You can also see it working in Gatsby Cloud directly. For this, go to the detailed view of a build and go to the SSR logs tab. There you’ll see each invocation of a DSG/SSR page.
Next steps
Congrats, you programmatically deferred low-traffic pages using data from your analytics tool! Now it’s your task to further personalize which pages to defer and what indicators to use. Maybe use the realtime visitors of a page or filter by specific campaigns you’re running.
I hope this also sparked your imagination of what other data sources and methods you could use to programmatically use Deferred Static Generation. Did you build something cool with Deferred Static Generation? Let us know on Twitter!