Overview of the Gatsby Build Process
This is a high-level overview about the steps in the Gatsby build process. For more detailed information about specific steps, check out the Gatsby Internals section of the docs.
Gatsby has two modes for compiling a site:
- Develop - run with the
gatsby develop
command - Build - run with
gatsby build
You can start Gatsby in either mode with its respective command: gatsby develop
or gatsby build
.
Build time vs runtime
A common confusion with the generation of static assets is the difference between build time and runtime.
Processes that happen in a web browser that you click through and interact with can be referred to as a browser runtime. JavaScript code can interact with the browser and take advantage of APIs it offers, such as window.location
to get information from a page’s URL, using AJAX to submit a form, or injecting markup dynamically into the DOM. Gatsby creates a JavaScript runtime that takes over in the browser once the initial HTML has loaded.
Build time, in contrast, refers to the process of using a server process to compile the site into files that can be delivered to a web browser later, so Browser APIs like window
aren’t available at that time.
The gatsby develop
command doesn’t perform some of the production build steps that the gatsby build
command does. Instead, it starts up a development server that you can use to preview your site in the browser — like a runtime. When you run gatsby build
(build time) there won’t be a browser available, so your site needs to be capable of protecting calls to browser based APIs.
To gain a greater understanding of what happens when you run either command, it can be helpful to look at the information that Gatsby is reporting back and break it down.
Understanding gatsby develop
(runtime)
Using gatsby develop
runs a server in the background, enabling useful features like live reloading and Gatsby’s data explorer.
gatsby develop
is optimized for rapid feedback and extra debugging information. The output of running gatsby develop
in a fresh install of the Gatsby default starter looks like this:
Some of the steps may be self-explanatory, but others require context of Gatsby internals to make complete sense of (don’t worry, the rest of this guide will give more explanations).
Understanding gatsby build
(build time)
Gatsby’s build
command should be run when you’ve added the finishing touches to your site and everything looks great. gatsby build
creates a version of your site with production-ready optimizations like packaging up your site’s config, data, and code, and creating all the static HTML pages that eventually get rehydrated into a React application.
The output of running gatsby build
in a fresh install of the Gatsby default starter looks like this:
Differences between develop and build
So what’s the difference?
If you compare the outputs of the two commands (gatsby develop
vs gatsby build
), you can see that everything (with the exception of deleting HTML and CSS files) up until the line that says info bootstrap finished
are the same. However, gatsby build
runs some additional steps to prepare your site to go live after the bootstrap phase.
The following output shows the differences between the above two examples:
Note: the output of gatsby develop
and gatsby build
can vary based on plugins you’ve installed that can tap into the build lifecycle and the Gatsby reporter. The above output is from running the default Gatsby starter.
There is only one difference in the bootstrap phase where HTML and CSS is deleted to prevent problems with previous builds. In the build phase, the build command skips setting up a dev server and goes into compiling the assets.
By omitting these later steps, gatsby develop
can speed up your ability to make live edits without page reloads using features like hot module replacement. It also saves time with the more CPU intensive processes that aren’t necessary to perform in rapid development.
There is also a difference in the number of page queries that will run. gatsby develop
will run at most 3 page queries (index page, actual 404 and develop 404) initially. The rest of the queries will run when they are needed (when the browser requests them). In contrast, gatsby build
will run page queries for every page that doesn’t have cached and up to date results already.
A cache is also used to detect changes to gatsby-*.js
files (like gatsby-node.js
, or gatsby-config.js
) or dependencies. It can be cleared manually with the gatsby clean
command to work through issues caused by outdated references and a stale cache.
What happens when you run gatsby build
?
To see the code where many of these processes are happening, refer to the code and comments in the build
and bootstrap
files of the repository.
A Node.js server process powers things behind the scenes when you run the gatsby build
command. The process of converting assets and pages into HTML that can be rendered in a browser by a server-side language like Node.js is referred to as server-side rendering, or SSR. Since Gatsby builds everything ahead of time, this creates an entire site with all of the data your pages need at once. When the site is deployed, it doesn’t need to run with server-side processes because everything has been gathered up and compiled by Gatsby.
Note: because Gatsby sites run as full React applications in the browser, you can still fetch data from other sources at runtime like you would in a typical React app.
The following model demonstrates what is happening in different “layers” of Gatsby as content and data are gathered up and made available for your static assets.
const query = graphql`
query HomePageQuery {
site {
title
description
}
}
`
Building compiles your application with modern features like server-side rendering, route based code splitting (and more!) for great performance out of the box.
During the build (when you run gatsby build
or gatsby develop
), data is fetched and combined into a GraphQL schema with a static snapshot of all data your site needs.
Like the console output demonstrates in the section above, there are 2 main steps that take place when you run a build: the bootstrap
phase, and the build
phase (which can be observed finishing in the console output when you run gatsby develop
or gatsby build
).
At a high level, what happens during the whole bootstrap and build process is:
Node
objects are sourced from whatever sources you defined ingatsby-config.js
with plugins as well as in yourgatsby-node.js
file- A schema is inferred from the Node objects
- Pages are created based off JavaScript components in your site or in installed themes
- GraphQL queries are extracted and run to provide data for all pages
- Static files are created and bundled in the
public
directory
For an introduction to what happens at each step throughout the process from the output above, refer to the sections below.
Steps of the bootstrap phase
The steps of the bootstrap phase are shared between develop and build (with only one exception in the delete html and css files from previous builds
step). This includes:
open and validate gatsby-configs
The gatsby-config.js
file for the site and any installed themes are opened, ensuring that a function or object is exported for each.
load plugins
Plugins installed and included in the config of your site and your site’s themes are loaded. Gatsby uses Redux for state management internally and stores info like the version, name, and what APIs are used by each plugin.
onPreInit
Runs the onPreInit
node API if it has been implemented by your site or any installed plugins.
delete html and css files from previous builds
The only different step between develop and build, the HTML and CSS from previous builds is deleted to prevent problems with styles and pages that no longer exist.
initialize cache
Check if new dependencies have been installed in the package.json
; if the versions of installed plugins have changed; or if the gatsby-config.js
or the gatsby-node.js
files have changed. Plugins can interact with the cache.
copy gatsby files
Copies site files into the cache in the .cache
folder.
onPreBootstrap
Calls the onPreBootstrap
node API in your site or plugins where it is implemented.
source and transform nodes
Creates Node objects from your site and all plugins implementing the sourceNodes
API, and warns about plugins that aren’t creating any nodes. Nodes created by source or transformer plugins are cached.
Node objects created at this stage are considered top level nodes, meaning they don’t have a parent node that they are derived from.
Add explicit types
Adds types to the GraphQL schema for nodes that you have defined explicitly with Gatsby’s schema optimization APIs.
Add inferred types
All other nodes not already defined are inspected and have types inferred by Gatsby.
Processing types
Composes 3rd party schema types, child fields, custom resolve functions, and sets fields in the GraphQL schema. It then prints information about type definitions.
building schema
Imports the composed GraphQL schema and builds it.
createPages
Calls the createPages
API for your site and all plugins implementing it, like when you create pages programmatically in your gatsby-node.js
.
Plugins can handle the onCreatePage
event at this point for use cases like manipulating the path of pages.
createPagesStatefully
Similar to the createPages
step, but for the createPagesStatefully
API which is intended for plugins who want to manage creating and removing pages in response to changes in data not managed by Gatsby.
onPreExtractQueries
Calls the onPreExtractQueries
API for your site and all plugins implementing it.
update schema
Rebuilds the GraphQL schema, this time with SitePage
context — an internal piece of Gatsby that allows you to introspect all pages created for your site.
extract queries from components
All JavaScript files in the site are loaded and Gatsby determines if there are any GraphQL queries exported from them. If there are problematic queries they can be reported back with warnings or errors. All these queries get queued up for execution in a later step.
write out requires
An internal Gatsby utility adds the code that files need to load/require.
write out redirect data
An internal Gatsby utility adds code for redirects, like implemented with createRedirect
.
Build manifest and related icons
- (fromgatsby-plugin-manifest
)
This step is activated by gatsby-plugin-manifest
in the gatsby-default-starter
and is not a part of the built-in Gatsby functionality, demonstrating how plugins are able to tap into the lifecycle. The plugin adds a manifest.json
file with the specified configurations and icons.
onPostBootstrap
Calls the onPostBootstrap
API for your site and all plugins implementing it.
Steps of the build phase
run static queries
Static queries in non-page components that were queued up earlier from query extraction are run to provide data to the components that need it.
Generating image thumbnails — 6/6
- (fromgatsby-plugin-sharp
)
Another step that is not a part of the built-in Gatsby functionality, but is the result of installing gatsby-plugin-sharp
, which taps into the lifecycle. Sharp runs processing on images to create image assets of different sizes.
Building production JavaScript and CSS bundles
Compiles JavaScript and CSS using webpack.
Rewriting compilation hashes
Compilation hashes are used by webpack for code splitting and keeping the cache up to date, and all files with page data need to be updated with the new hashes since they’ve been recompiled.
run page queries
Page queries that were queued up earlier from query extraction are run so the data pages need can be provided to them.
Building static HTML for pages
With everything ready for the HTML pages in place, HTML is compiled and written out to files so it can be served up statically. Since HTML is being produced in a Node.js server context, references to browser APIs like window
can break the build and must be conditionally applied.
By default, Gatsby rebuilds static HTML for all pages on each build. There is an experimental feature flag GATSBY_EXPERIMENTAL_PAGE_BUILD_ON_DATA_CHANGES
which enables conditional page builds.
What do you get from a successful build?
When a Gatsby build is successfully completed, everything you need to deploy your site ends up in the public
folder at the root of the site. The build includes minified files, transformed images, JSON files with information and data for each page, static HTML for each page, and more.
The final build is just static files so it can now be deployed.
You can run gatsby serve
to test the output from compilation. If you make any new changes to the source code that you want to preview, you will have to rebuild the site and run gatsby serve
again.