The deployment process of Ruby on Rails apps


We have selected, installed and configured our tools. The next logical step is to deploy the app. Before we do that, I would like to talk about the process we will be using, so that it is clear why we configure things the way we do and why we take each step.

Overview

Rails conforms to many of the 12 Factor App guidelines. However, the deployment part (which is itself one of the 12 factors) is not yet done. Our deployment tool (which, as I have repeatedly said, is a bash script) will try to follow the best principles and make the deployment as foolproof as possible.

I would reiterate that it closely resembles what Capistrano does but gives you much better control over the process, because you can make it interactive and run complicated commands that might be difficult to express in the DSL of whichever tool you use.

We will share the deployment script later but before we do so, we will explain why it is written the way it is written.

Why deploy this particular way

Before we get into the code, I believe it is important to know why the deployment is done the way it is. Moreover, I promised that I was going to explain the reasons why we are doing things this particular way. And unlike rules, promises aren’t meant to be broken.

There are certain things to consider when you are deploying. Let’s list them (and talk a bit about them):

  1. Where will our site be placed on the server? - You can very well put it under /var/www/mysite but we are rather going to put it under ~/mysite (so if the username is ubuntu then the path would be /home/ubuntu/mysite). You can move it under a subdirectory if you wish (e.g. creating a sites directory and making the full path /home/ubuntu/sites/mysite). There are a couple of points to remember about this:
    1. This directory is not where the actual code will live. It is more like a base directory (we will call it base_dir whenever referring to it) under which a bunch of things would be there. Code would be in one of the directories among them.
    2. The reason we are placing the code in our home directory is that, most of the time, the operating system will set correct permissions for new files whenever a new deployment happens. No need to worry about creating masks, setting permissions, changing owners and so on. The code is supposed to run as the deploying user anyway and others are not supposed to access it. For those reasons, there is no place like $HOME!
  2. Where will the deployment script be placed? - We will place our deployment script on the server. It will be placed in $HOME as well.
  3. Where will the code be? - Inside the base_dir directory. Multiple versions of our code will live inside the releases directory (inside the base_dir), and one of them will be symlinked as current.
  4. How will we get new versions of the code? - We will get them from an online git repository. We will basically create a copy of the git repo on our server and fetch new commits from the master branch (or whichever branch we want) and deploy them.
  5. Will we keep multiple versions of our code available on the server in case a rollback is required? - Yes. It is not uncommon for a bug to creep into a new release. When that happens, we need to go back to the previous version as soon as possible. For this reason, we will keep multiple copies of our code (the 5 most recent are enough, but we can change that) on the server. In case a bug is detected, we can revert to one of the previous releases.
  6. How difficult is the deployment of a previous version going to be? - We are going to serve our Rails app from a directory which is a symbolic link (symlink) to one of the releases in the releases directory. The script will automatically switch the symlink to the new release directory whenever we deploy the latest version of our code. When we need to roll back, we just change the symlink to point to an earlier release and restart our puma server. That normally does the trick, unless the release also changed the database schema, in which case the schema has to be reverted too. There are two ways to do that:
    1. Write a new migration that reverts those changes, merge it to master and redeploy (or put the fix on a new branch and deploy that branch instead).
    2. Go to the current release directory, run enough database rollbacks to bring the database to the state matching the release you want to roll back to, then change the symlink to the desired release directory (and restart puma).
  7. Where will I store logs? - We will have a shared directory where we will put things that need to persist across multiple releases. Logs are one of those things. For this reason, we will place our logs in the shared directory. Every time a new deployment runs, it will symlink the shared log directory inside the new release directory so that old logs are not lost.
  8. What about environment variables? - Rails has this amazing way of keeping some of the important configuration (including secret keys, credentials, important paths etc.) in the environment so that they can be isolated across development, production, staging and test environments easily. Since environment changes need to persist across releases, we will keep them in the shared directory as well.
  9. Will the server restart be part of automated deployment? - Yes. We will restart the puma server as part of the deployment. With the way we are going to set up nginx, it will not need a restart, so we are not going to restart nginx automatically. Since a lot of people use sidekiq as their background job runner, we will have a sidekiq restart as part of the automated deployment as well.
  10. Why is there a 10th point? - Because I did not like stopping at 9. :P
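
To make point 6 a little more concrete, here is a minimal sketch of what a rollback could look like, assuming the releases/current layout described above. The function name rollback_to and its arguments are hypothetical, and restarting puma (and any database rollback) is left out.

```shell
# Hypothetical helper: point the `current` symlink at an existing release.
rollback_to() {
  local base_dir="$1" version="$2"
  local target="$base_dir/releases/$version"

  if [ ! -d "$target" ]; then
    echo "Release $version not found in $base_dir/releases" >&2
    return 1
  fi

  # -s: symbolic link, -f: replace the existing link,
  # -n: do not follow `current` if it already points at a directory.
  ln -sfn "$target" "$base_dir/current"
}
```

For example, rollback_to /home/ubuntu/mysite 41 would make current point at releases/41, provided that release still exists on the server.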

Understanding the steps of deployment

Below we explain the steps that we will go through during the automated deployment process.

Step 0 - Create the directories

Before anything, we need to create the directories. Everything else depends on the correct ones being in place. We are going to assume that our base_dir will be located in /home/ubuntu/mysite.

Make the base_dir by issuing the command:

mkdir /home/ubuntu/mysite

The directory structure has to look like this for the script to work:

/home/ubuntu/mysite/
├── pids
├── releases
├── scm
├── shared
│   ├── config
│   ├── log
│   ├── public
│   │   └── assets
│   ├── tmp
│   │   └── cache
│   └── vendor
│       └── bundle
└── tmp

So the first thing you need to do is create these directories. cd to your base_dir and run the following command:

mkdir -p {pids,releases,scm,shared,shared/config,shared/log,shared/public,shared/tmp,shared/vendor/bundle,shared/tmp/cache,shared/public/assets,tmp}

Step 1 - Defining variables we will use later

We will be using a few values which stay constant between multiple runs of the deployment scripts but are better extracted away in variables so that if we need to change certain things about the deployment later, we can do so without doing a search-replace on the file. The variables we need are:

  1. base_dir: The base directory (described above)
  2. git_repo_url: The URL to the repository from where the script will fetch our latest changes to the code
  3. git_deployment_branch: The branch which we want to deploy on our server. Basically, you would be pushing changes to this branch before running the deployment tool.
  4. api_only_app: If the app is an API only app, or any app for which you can skip the asset precompilation step, define this value as true.
  5. restart_sidekiq: If the app uses sidekiq as the ActiveJob adapter, then you would need to restart sidekiq right after deployment. Set this variable to true if you want to restart sidekiq as part of the deployment process.
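
As an illustration, the top of the script might define these variables roughly as follows. All the values are placeholders (the repository URL in particular is made up), so substitute your own.

```shell
# Placeholder values; adjust them for your own server and repository.
base_dir="/home/ubuntu/mysite"
git_repo_url="git@example.com:me/mysite.git"  # hypothetical URL
git_deployment_branch="master"
api_only_app="false"      # "true" skips asset precompilation
restart_sidekiq="true"    # "true" restarts sidekiq after deployment
```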

Step 2 - Check the directories

To make sure that the rest of the steps can be done properly, the script first checks that all the required directories are in place. The first section of the script contains the commands for this check. There are other variables that we need to test for as well.
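
A check like this could be sketched as a small function that walks over the expected directories from Step 0 and reports any that are missing. The function name check_dirs is hypothetical; the directory list is the one from the tree shown earlier.

```shell
# Hypothetical helper: returns 0 only if every required directory exists.
check_dirs() {
  local base_dir="$1" dir missing=0
  for dir in pids releases scm shared/config shared/log \
             shared/public/assets shared/tmp/cache \
             shared/vendor/bundle tmp; do
    if [ ! -d "$base_dir/$dir" ]; then
      echo "Missing directory: $base_dir/$dir" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

The script could then stop early with something like check_dirs "$base_dir" || exit 1.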

Step 3 - Capturing the paths

We will need a number of file paths to work with. They are:

  1. previous_path: The path at which the previous installation was actually stored (before the deployment). From the perspective of a complete deployment, the previous path is the ‘current’ path from where your app is being served. If it’s a bit confusing, we will expand on this when we get to understanding the code in the script.
  2. build_path: Ruby is an interpreted language, so in case you are coming from the world of a compiled language like C or Java: there is no compilation step with Rails apps, and the term ‘build’ here does not mean compiling source to object or executable files. Here, ‘build_path’ points to a directory where we will dump our source files, check for things, copy files, create links to shared directories and so on, so that the app is ready to run. The build_path in our case is a directory with a random-looking name inside the /tmp directory.
  3. version: Like a good server backend, we will not completely destroy all old deployments. Instead we will keep a few of our old releases on the server. These releases will be identified by an incrementing number - the release version which we are going to call version. The version will be one more than the greatest release found in the releases directory we had created.
  4. release_path: This is the path to the directory where the new release’s files will be saved. It’s going to be a directory named version inside the releases subdirectory of our base_dir.
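
One way these four values could be computed is sketched below. It assumes release directories are named with plain integers and that current is a symlink in base_dir; the function name capture_paths and the build.XXXXXX template are made up for the example.

```shell
# Hypothetical helper: sets previous_path, build_path, version, release_path.
capture_paths() {
  local base_dir="$1" latest
  # Where `current` points right now (empty on the very first deployment).
  previous_path=$(readlink "$base_dir/current" || true)
  # Scratch directory with a random suffix inside /tmp.
  build_path=$(mktemp -d /tmp/build.XXXXXX)
  # Highest numeric release so far; sort -n so that 10 sorts after 9.
  latest=$(ls -1 "$base_dir/releases" | grep -E '^[0-9]+$' | sort -n | tail -n 1)
  version=$(( ${latest:-0} + 1 ))
  release_path="$base_dir/releases/$version"
}
```

The numeric sort matters here: a plain alphabetical sort would place release 10 before release 9 and produce the wrong next version.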

Step 4 - Fending against multiple deployments

What if we accidentally launched two deployment processes in one go? Only one should run, right? To do that, we will do two things:

  1. Check if a file named deploy.lock exists or not. If the file exists, then we assume that another deployment is in progress and exit from the deployment script right away.
  2. If no other deployment is in progress, we create a file named deploy.lock so that if another instance of the script is launched, the second instance will see it, know that a deployment is in progress and exit (thus making sure our first instance executes as planned). We create this file before beginning any real deployment work (anything which changes the folder structure from what it is right now).
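
The two steps above can be sketched as follows. Note that this is a slight variation on the description: it uses the shell's noclobber option so that checking for the file and creating it happen in one step, which avoids the small window between a separate check and create. The function names and the location of the lock file are assumptions.

```shell
# Hypothetical helpers for the deploy.lock file.
acquire_lock() {
  local lock_file="$1"
  # With noclobber set, the redirection fails if the file already exists,
  # so "check" and "create" are a single operation.
  if ( set -o noclobber; echo "$$" > "$lock_file" ) 2>/dev/null; then
    return 0
  fi
  echo "Another deployment seems to be in progress ($lock_file exists)" >&2
  return 1
}

release_lock() {
  rm -f "$1"
}
```

A script would call acquire_lock "$base_dir/deploy.lock" || exit 1 near the top, and release_lock with the same path at the very end.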

Step 5 - Enable rbenv

We check that rbenv is installed. If it’s not, we show an error and exit, because the deployment process depends on rbenv being available on the server. If rbenv is available, we run the command to initialize it. This should make the intended version of ruby available to the shell we are operating in.
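
A minimal sketch of this check, assuming rbenv lives in the common ~/.rbenv location (adjust the path if yours is elsewhere). The function name enable_rbenv is hypothetical; rbenv init - is rbenv's own initialization command.

```shell
# Hypothetical helper: make rbenv's Ruby available to the current shell.
enable_rbenv() {
  # Non-interactive shells often lack rbenv on PATH, so add it if present.
  if [ -d "$HOME/.rbenv/bin" ]; then
    PATH="$HOME/.rbenv/bin:$PATH"
  fi
  if ! command -v rbenv >/dev/null 2>&1; then
    echo "rbenv not found; cannot continue the deployment" >&2
    return 1
  fi
  # Sets up the shims so the project's Ruby version is picked up.
  eval "$(rbenv init -)"
}
```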

Step 6 - Getting code from remote git repo

We are using git to store our codebase and we will use the same repo (the path to which we have stored in a script variable) to fetch code from.

NOTE: If the repo URL is not publicly accessible, you would probably have to generate a deployment key (in github, bitbucket etc.) and add that on the server. If you are not using any of those, you would have to find a way to clone the private repo without needing any input from the user.

The first thing we do is we check if we already have the repo cloned at our expected path (which is the scm directory inside our base directory). If we don’t have the repo already cloned, we clone it. If we have the git repository cloned, we will perform the fetch operation on the repo to get the latest commits!
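
The clone-or-fetch decision could look roughly like this. The function name update_scm is hypothetical, and the sketch assumes the remote is named origin (git's default after a clone).

```shell
# Hypothetical helper: make sure scm_dir holds an up-to-date copy of the repo.
update_scm() {
  local git_repo_url="$1" scm_dir="$2"
  if [ -d "$scm_dir/.git" ]; then
    # Already cloned: just fetch the latest commits.
    git -C "$scm_dir" fetch --quiet origin
  else
    # First deployment: clone into the (empty) scm directory.
    git clone --quiet "$git_repo_url" "$scm_dir"
  fi
}
```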

Step 7 - Checking out the latest code to our build directory

Once we have the update from the remote repository, we want to check out the changes in our build directory because a git repo in itself is not useful to us. So the next step is to check out the git_deployment_branch branch which should contain the latest updates and put them in the build path.

Reminder: If you pushed your code to a different branch, the deployment will not pick up those changes.

When we check out a git repo, it includes the .git directory that git uses to keep its resources in. This directory can be huge in size and we don’t need that directory. So as a part of this step, we should also remove the directory once we have checked out the code.
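
One possible way to do this is to clone from the local scm copy into the build path and then delete the .git directory, as sketched below. This is only one of several approaches (git archive would be another); the function name checkout_build is hypothetical and the remote is assumed to be called origin.

```shell
# Hypothetical helper: put a clean copy of the branch into build_path.
checkout_build() {
  local scm_dir="$1" branch="$2" build_path="$3"
  # Point the local branch at the freshly fetched remote commit.
  git -C "$scm_dir" checkout --quiet -B "$branch" "origin/$branch" || return 1
  # Clone from the local copy (no network needed) into the build path.
  git clone --quiet --branch "$branch" "$scm_dir" "$build_path" || return 1
  # The build does not need git's own bookkeeping.
  rm -rf "$build_path/.git"
}
```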

Step 8 - Linking shared directories inside the build path

A running Rails application depends on multiple resources out of which some have to remain common. For that very reason, they should be shared across deployments. In our deployment process, we shall keep those resources within our shared directory. Here is a list of folders which we are going to link into the build path and the reasons why we keep them in shared path and link them each time:

  1. gem bundle directory: Normally when you run Rails on your development machine, you use the gems installed against your version of rails. When deploying on server, it is better to isolate the gems and put them in a separate directory. bundler allows you to do so with the --deployment flag. The place where you put the gems is stored in the .bundle/config file in the project. To make sure we are not installing gems over and over again while keeping them away from the main gems installation, we keep them in the shared directory and we link it in the project build path during each deployment. It also saves on disk space that would have been consumed by each deployment.
  2. log directory: You log events for a reason - so that you can look back at them when you need to debug or trace a problem. It would make no sense to delete your logs on each deployment. Even if you create a new log file for every deployment, it would complicate matters when you need to look into them (it is easy to get confused which file contains which event when debugging).
  3. Cache directory: When Rails runs, it stores a lot of data in its cache (not talking about the ActiveRecord caching here) which normally consists of sessions, caches, socket information etc. A lot of it contributes to a good user experience (e.g. if you delete all the sessions, your users will be logged out!). It is important that we persist this data because it has to be shared across deployments.
  4. public assets: You need your images, HTML, CSS and compiled/minified JS files to remain available across your deployments.
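
The linking itself could be sketched like this, assuming the shared directory layout from Step 0. The function name link_shared is hypothetical.

```shell
# Hypothetical helper: replace per-release directories with links to shared.
link_shared() {
  local base_dir="$1" build_path="$2" dir
  # Make sure the parent directories of the links exist in the build.
  mkdir -p "$build_path/vendor" "$build_path/tmp" "$build_path/public"
  for dir in vendor/bundle log tmp/cache public/assets; do
    # Remove whatever the checkout put there, then link the shared copy.
    rm -rf "${build_path:?}/$dir"
    ln -s "$base_dir/shared/$dir" "$build_path/$dir"
  done
}
```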

Step 9 - Installing Gems

Rails depends on a number of gems to work. But sometimes you need to add a new gem to solve a problem that you face during development; this means adding a new gem in your Gemfile. You do not want to log into the server and run bundle install every single time you add a new Gem to your Gemfile. After all, that’s what installation scripts should do, isn’t it? We will do that in this step.

NOTE: We will be using bundler in deployment mode. If you remove a gem from your Gemfile and run the installation with bundler in deployment mode, it will fail on this step. Make sure you run bundle install on your development machine (and push the code to your remote) anytime you delete a gem before attempting the deployment. This updates the Gemfile.lock file and bundler would not complain. And of course, you would have to have the Gemfile.lock file checked in your repo so that git is tracking it already.
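
As a rough sketch, the gem installation could look like the following. It assumes a reasonably recent Bundler (2.1 or later), where deployment mode is set through bundle config rather than the older --deployment flag; the function name install_gems and the choice of excluded groups are assumptions.

```shell
# Hypothetical helper: install gems into the (shared, symlinked) vendor/bundle.
install_gems() {
  local build_path="$1"
  (
    cd "$build_path" &&
    # --local stores these settings in the project's .bundle/config.
    bundle config set --local deployment true &&
    bundle config set --local path vendor/bundle &&
    bundle config set --local without "development test" &&
    bundle install
  )
}
```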

Step 10 - Copying custom files

If you have any files (such as configuration files) that you need to copy to your build process, now is the right time. Any file which is important for the bootstrap process would qualify for being copied in this space.

You should keep such files somewhere inside the shared directory so that they can be copied in every deployment run.

Step 11 - Migrating your database to the latest version

From this step on, Rails (and the related stack) comes into the picture. Your migrations need to run on every deployment, before the server starts, so that updated code which relies on your updated data model works properly. So the logical step is to run your migrations.
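
In script form this step is short. The sketch below assumes Rails 5 or later (where rails db:migrate is available) and a production environment by default; the function name run_migrations is hypothetical.

```shell
# Hypothetical helper: run pending migrations from inside the build.
run_migrations() {
  local build_path="$1" rails_env="${2:-production}"
  ( cd "$build_path" && RAILS_ENV="$rails_env" bundle exec rails db:migrate )
}
```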

Step 12 - Precompilation of assets

Rails comes with something called the asset pipeline, which precompiles your JS and CSS into (basically) minified versions. Typically, you should get this done before you start the server.

An API-only app (created with rails new <project> --api command) does not have assets and thus we won’t be running this step for API-only apps.
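
This is where the api_only_app variable from Step 1 comes in. A sketch, assuming Rails 5 or later and a production environment; the function name precompile_assets is hypothetical.

```shell
# Hypothetical helper: precompile assets unless the app is API-only.
precompile_assets() {
  local build_path="$1" api_only_app="$2"
  if [ "$api_only_app" = "true" ]; then
    echo "API-only app: skipping asset precompilation"
    return 0
  fi
  ( cd "$build_path" && RAILS_ENV=production bundle exec rails assets:precompile )
}
```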

Step 13 - Cleanup of old releases

As you deploy new versions of your app, your releases directory would start filling up with older versions. There should be a way to remove them. Our installation script checks for the number of releases directories there are and removes older ones while keeping a pre-determined number of releases on the server.
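
A cleanup along these lines could be sketched as below, assuming release directories are named with plain integers so that the smallest numbers are the oldest. The function name cleanup_releases and the default of 5 are taken from the discussion above but are otherwise assumptions.

```shell
# Hypothetical helper: keep only the newest $keep numeric release directories.
cleanup_releases() {
  local releases_dir="$1" keep="${2:-5}"
  local total excess version
  total=$(ls -1 "$releases_dir" | grep -E '^[0-9]+$' | wc -l)
  excess=$((total - keep))
  [ "$excess" -gt 0 ] || return 0
  # Oldest releases have the smallest numbers; remove the first $excess.
  ls -1 "$releases_dir" | grep -E '^[0-9]+$' | sort -n | head -n "$excess" |
    while read -r version; do
      rm -rf "${releases_dir:?}/$version"
    done
}
```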

Step 14 - Moving the build to releases directory

The installation script creates the build directory in the /tmp directory. Once the step to build the app is complete, we should move our build to the releases directory inside our base_dir so that we can call it a release and further runs of deployment script run properly.

Step 15 - Marking this release as the current release

Once we have the build in the releases directory, we need to mark this release as the current release. We do this with a symlink called current in our base_dir directory that points to the release we want to run, so we update that symlink.
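
Steps 14 and 15 together amount to a move followed by a symlink update. A minimal sketch, with a hypothetical function name and arguments:

```shell
# Hypothetical helper: turn the finished build into the current release.
activate_release() {
  local base_dir="$1" build_path="$2" version="$3"
  local release_path="$base_dir/releases/$version"
  # Step 14: the build becomes a numbered release.
  mv "$build_path" "$release_path" || return 1
  # Step 15: repoint `current` (-n avoids following the old link).
  ln -sfn "$release_path" "$base_dir/current"
}
```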

Step 16 - Restarting puma

We have marked the new release as the current one. However, puma is still running the code of the previous release. The immediate next step is to restart puma so that the code of the new release is used right away. In this (almost final) step, we will do just that!
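
How you restart puma depends on how it was started. As one possibility, if puma writes its process id to a file under the pids directory (the file name puma.pid is a guess here) and is not managed by a service manager, it can be asked to restart by sending it the USR2 signal. The function name restart_puma is hypothetical.

```shell
# Hypothetical helper: signal a running puma to restart itself.
restart_puma() {
  local pid_file="$1" pid
  if [ ! -f "$pid_file" ]; then
    echo "No pid file at $pid_file; is puma running?" >&2
    return 1
  fi
  pid=$(cat "$pid_file")
  # Puma treats SIGUSR2 as a request to restart.
  kill -USR2 "$pid"
}
```

If puma runs under systemd or a similar tool on your server, restarting the service through that tool would be the better fit.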

Step 17 - Restarting Sidekiq (Bonus step)

This is not part of a standard Rails application. However, Sidekiq remains one of the popular background job runners for Rails (an adapter to ActiveJob, actually) and is my favorite, of course. One of the applications that I wrote uses Sidekiq for multiple purposes, including sending emails, delivering notifications and a few other tasks.

One of the things that makes Sidekiq interesting is that you can use almost any part of your Rails app inside a Sidekiq job. This is possible because Sidekiq loads your entire Rails app when it starts. So if you update your app but do not restart Sidekiq, the changes you made won’t have any effect on your jobs. You need to restart Sidekiq for the jobs to behave the new way.

In this step, we will restart sidekiq.

Step 18 - Remove the deploy.lock file and report

Remember the deploy.lock file we created to stop parallel deployments from running? This is the final step - remove the lock file so that we can run another deployment!
