Orchestrating and dockerizing a monorepo with Yarn 3 and Turborepo
After a year of working on a monorepo with Yarn Workspaces and Lerna, we have learnt that Lerna is being deprecated, so we had to go back to the drawing board and come up with an alternative development pipeline for our multi-package application.
Switching to Yarn 3
The newer version of Yarn ships with a lot of performance optimizations, but it also changes the way we reason about our dependencies, so there were a few a-ha moments.
To start using Yarn 3, create a .yarnrc.yml file in the root of your project. If you already have a .npmrc file, you will need to migrate its settings to the new configuration format.
For instance, if you are using private repositories from npm, you will need to change:
//registry.npmjs.org/:_authToken=<YOUR TOKEN>
to
npmRegistries:
  "https://registry.npmjs.org/":
    npmAuthToken: <YOUR TOKEN>
If you are using packages from other private repositories, e.g. premium FontAwesome, you will need to change:
@fortawesome:registry=https://npm.fontawesome.com/
//npm.fontawesome.com/:_authToken=<YOUR TOKEN>
to
npmScopes:
  fortawesome:
    npmAuthToken: <YOUR TOKEN>
    npmRegistryServer: "https://npm.fontawesome.com/"
Once you have done this, you can switch your yarn version to either latest or a set version (e.g. 3.2.1 at the time of writing).
yarn set version <version>
The Yarn team has worked hard to ensure consistency of dependency management across larger teams, so once you run yarn install, you will notice a new folder inside your root directory called .yarn. Intuition will tell you to add it to your .gitignore, but don’t do that. Instead, use the following ignore rules:
# Yarn
.yarn/*
!.yarn/patches
!.yarn/releases
!.yarn/plugins
!.yarn/sdks
!.yarn/versions
.pnp.*
With Yarn 3, you also have an option to use Plug’n’Play mode, which would allow you to install and build your application offline. You can read more about it here.
For a monorepo setup, you would want to use a Yarn Workspace, which allows you to define dependencies on local packages, so that you don’t have to publish your packages to a registry to use them inside other local packages.
In our setup, we split the monorepo into packages and applications. Packages are libraries that power standalone applications. You can split your monorepo in many other ways, using the workspaces config in package.json.
{
  "name": "@me/monorepo",
  "private": true,
  "workspaces": [
    "packages/**/*",
    "applications/**/*"
  ]
}
To consume a package inside our application, we would then update our dependencies with workspace ranges as such:
{
  "name": "@me/application-a",
  "dependencies": {
    "@me/package-a": "workspace:^"
  }
}
Yarn will now create a symbolic link between the package in node_modules and its source, so you can use it inside your application as you would any other node module.
Setting up Turborepo
Turborepo is a new build system developed by Vercel specifically for managing pipelines inside monorepos. Using pipeline configurations, you can define dependencies for your lifecycle commands, e.g. you can tell Turborepo to first build your package before building your application, or you can tell it to start a server before running an integration test. Besides that, Turborepo has a caching system that lets you define inputs and outputs for each command. Before re-running a command, it checks whether the inputs have changed and, if they have not, replays the cached outputs instead, which saves you time.
Let’s say we have a React component library written in TypeScript, and a Next.js application that uses it.
Ideally we would want to be able to:
- Watch for changes in our package when Next.js server is running in dev mode
- Build our package before building our Next.js production server
- Build our package before testing our Next.js application
Let’s define our commands:
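As a rough sketch, the two workspaces might define scripts along the following lines. The package names follow the earlier examples; the file locations, the script commands (tsc, next, jest) and the top-level "//" note are assumptions for illustration, not the exact setup we use.

A hypothetical packages/package-a/package.json for the component library:

```json
{
  "//": "Illustrative only: script commands are assumptions, adjust to your own tooling",
  "name": "@me/package-a",
  "scripts": {
    "build": "tsc --project tsconfig.json",
    "watch": "tsc --project tsconfig.json --watch",
    "test": "jest"
  }
}
```

And a hypothetical applications/application-a/package.json for the Next.js application:

```json
{
  "//": "Illustrative only: script commands are assumptions, adjust to your own tooling",
  "name": "@me/application-a",
  "scripts": {
    "dev": "next dev",
    "build": "next build",
    "start": "next start",
    "test": "jest"
  },
  "dependencies": {
    "@me/package-a": "workspace:^"
  }
}
```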
We can now configure Turborepo to help us work on both packages:
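A minimal turbo.json sketch that mirrors the rules described in this section could look like the following. It assumes the Turborepo 1.x format with a top-level "pipeline" key (newer releases renamed it to "tasks"), and the "outputs" globs are guesses for a TypeScript library and a Next.js application rather than values taken from our project.

```json
{
  "pipeline": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**", ".next/**"]
    },
    "test": {
      "dependsOn": [],
      "cache": false
    },
    "dev": {
      "dependsOn": ["build"],
      "cache": false
    },
    "watch": {
      "cache": false
    },
    "@me/application-a#test": {
      "dependsOn": ["^build"],
      "outputs": []
    }
  }
}
```

The "^" prefix in "dependsOn" refers to the same task in the workspaces a package depends on, while a plain task name refers to a task in the same package. The "package#task" key is how a rule is scoped to a single workspace.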
This is a contrived example, so the actual setup will vary, but it gives you an idea:
- Whenever the build command is called on any package, it will also run build on all dependencies
- Whenever the test command is called on any package, it will run without looking for cache
- Whenever the dev command is called on any package, it will run the build first and will not cache the output
- Whenever the watch command is called, it will run without cache
We then have configurations for commands specific to our application, which allows us to define other rules, e.g. we can say that we do actually want to build dependencies before we run tests inside our application (unlike our packages).
We can then update our root package.json with some helper commands:
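One possible shape for these helper commands is sketched below. It assumes the concurrently package is installed as a devDependency at the root and reuses the workspace names from the earlier examples; the script names and the exact split between the two jobs are illustrative, not a copy of our configuration.

```json
{
  "//": "Illustrative only: assumes `concurrently` is installed as a root devDependency",
  "name": "@me/monorepo",
  "private": true,
  "scripts": {
    "build": "turbo run build",
    "test": "turbo run test",
    "dev": "concurrently \"turbo run dev --filter=@me/application-a\" \"turbo run watch --filter=@me/package-a\""
  }
}
```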
Notice that we are using a concurrent job to start the Next.js server and a separate watch job. That is because Turborepo does not support jobs that do not exit: as a build tool, it needs a successful exit code (or an error code if the --continue flag is set) to continue with the pipeline. It’s a minor hindrance, but easy to work around, and it still gives you the advantage of running dependent jobs before the watch command starts inside the packages.
Yarn and Turborepo caches
Once you start using this setup, you will notice a .yarn/cache directory in the root, as well as .turbo directories inside your packages. These caches can greatly increase your productivity and reduce the time spent on builds and other jobs, but they also raise a question: what to ignore and when.
Both caches are environment specific, so you should probably ignore them whenever the code moves between environments: add them to .gitignore and .dockerignore. That said, there may be exceptions, e.g. if you develop both locally and inside a Docker container, you can benefit from a shared Yarn cache.
Your CI/CD pipeline could benefit from both caches, so make sure to take advantage of them. That said, be mindful of which commands and which command outputs are being cached, because in certain cases you may not want to skip the job (use the --force flag to re-run a task regardless of the cache, or --no-cache to stop its results from being written to the cache). It is also worth noting that if your cache storage uses LRU eviction, you may end up with inconsistencies when the Turborepo cache gets partially evicted, so do some trial-and-error testing on your CI/CD jobs.
Dockerizing the production build
Most of the credit for this solution goes to my colleague Ruben Costa, who weathered a day of stress trying to reduce the image size of our original build from 3 GB to only 250 MB.
Our monorepo consists of a Next.js application and 2 React SPAs. There are no cross-dependencies between applications, but they depend on various packages. So the challenge was to orchestrate a multi-step build that would contain only the production dependencies needed by each application.
After going through documentation for Yarn and Turborepo, we have identified two tools that helped us achieve the goal.
turbo prune command
This command allows you to aggregate a local package with all of its local dependencies in a different location, where you can run install and build, getting only the dependencies you need for that segment of your monorepo to build. See the command documentation here.
To use this command you will need turbo installed in your environment, either locally or globally, but you can always rely on good old npx to do the job.
npx turbo prune --scope @me/application-a
yarn workspaces focus plugin
This plugin replaces the old yarn install --prod, which is no longer part of the Yarn CLI. See the plugin documentation here.
Install the plugin to your workspace:
yarn plugin import workspace-tools
Use it:
yarn workspaces focus --all --production
Dockerfile for a Next.js application
Let’s see what’s happening:
If you are using Docker Buildx, you can mount a cache volume to speed up yarn installs.
--mount=type=cache,target=/app/.yarn/cache
Prune the application you want to build:
npx turbo prune --scope=@me/application-a
The command will copy the package and its dependencies into the ./out directory.
Copy everything else your application needs to build (e.g. the global tsconfig.json), then install dependencies and build the application:
cp -R .yarn .yarnrc.yml tsconfig.json postcss.config.js out/ && \
cd out && \
yarn install && \
turbo run build --filter=@me/application-a
Reinstall the dependencies skipping devDependencies:
yarn workspaces focus --all --production
Clean up useless files that are not needed in the Docker image:
rm -rf node_modules/.cache .yarn/cache applications/web-server/.next/cache
Once you have a built application, you can copy the clean production files into a separate build stage and start the server.
Building images for SPAs is quite straightforward, as you don’t need to worry about extraneous dependencies ending up in the final static build: you just copy over the dist folder and set up Nginx or another server to serve the static files.
Conclusion
Monorepos are not a silver bullet: they make development easier, but they complicate CI/CD and make dependency hell even worse. Tools like Turborepo make life a bit easier, but they do not address the fundamental problem of the JS ecosystem: the complexity of modularising dependencies across incompatible module systems (ES, CJS, etc.) that require both runtime and build-time tooling.