Hosting a Jekyll Site on Amazon S3 and Cloudfront
I recently reconfigured the deployment for my blog since I had switched to a new laptop. This site is built with Jekyll and hosted on Amazon S3, with Cloudfront as the CDN. Deployment was handled by the `s3_website` gem. I have both a staging and a production environment, which is really useful for testing any kind of change - to the blog config, the styling, or the AWS setup. I automated all of this with a simple bash script that could deploy to either environment.
```bash
#!/bin/bash --login
set -euo pipefail
IFS=$'\n\t'

# Export every variable assigned from here on (including those sourced
# from the config file) so jekyll and s3_website can see them.
set -a

# Default to empty so the usage message is printed instead of an
# "unbound variable" error (set -u) when no argument is given.
ENVIRONMENT_NAME=${1:-}
if [ -z "$ENVIRONMENT_NAME" ]; then
  echo "No environment specified!"
  echo "Usage: ./release.sh environment"
  exit 1
fi

echo "Loading environment variables for $ENVIRONMENT_NAME"
CONFIG_FILE="./$ENVIRONMENT_NAME.config"
if [ ! -f "$CONFIG_FILE" ]; then
  echo "Config file $CONFIG_FILE not found!"
  exit 1
fi

echo "Sourcing $CONFIG_FILE"
echo
source "$CONFIG_FILE"

if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ] || [ -z "${AWS_S3_BUCKET:-}" ]; then
  echo "Not all expected variables are present, stopping deployment"
  exit 1
fi

echo "Building website"
export JEKYLL_ENV="$ENVIRONMENT_NAME"
bundle exec jekyll build
echo

echo "Deploying to bucket: $AWS_S3_BUCKET"
bundle exec s3_website push

# Open the deployed site in a browser if the config provides a URL
if [ -n "${WEBSITE_URL:-}" ]; then
  open "$WEBSITE_URL"
fi
```
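For context, `s3_website` reads its settings from an `s3_website.yml` file, and since that file supports ERB it can pick up the environment variables the script exports. A minimal sketch (not my full config) might look like this:

```yaml
# s3_website.yml - minimal sketch; the real file can also set
# cloudfront_distribution_id, max_age, gzip, and other options
s3_id: <%= ENV['AWS_ACCESS_KEY_ID'] %>
s3_secret: <%= ENV['AWS_SECRET_ACCESS_KEY'] %>
s3_bucket: <%= ENV['AWS_S3_BUCKET'] %>
```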
This allowed me to run `./release.sh staging` and a moment later a browser tab would open showing me the deployed site. It required a config file with the AWS credentials (one per environment), which I was careful to exclude from git so I wouldn't accidentally leak them. I felt a bit uneasy about this part, but it worked just fine for many years.
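Each config file is just a handful of shell variable assignments for the script to source; a `staging.config` with placeholder values would look something like this:

```bash
# staging.config - placeholder values; the real file stays out of git
AWS_ACCESS_KEY_ID="AKIAEXAMPLEKEYID"
AWS_SECRET_ACCESS_KEY="example-secret-access-key"
AWS_S3_BUCKET="staging.example.com"
WEBSITE_URL="https://staging.example.com"
```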
Fast-forward to this week, when I was trying to add the config files again in order to release new changes. That meant generating new credentials in AWS IAM and figuring out what the config files should look like, since I hadn't bothered to create a template file the first time around. I also noticed that the `s3_website` gem is no longer maintained, although there is a community fork that is. That also left me pretty uneasy about my whole release strategy, but I forged ahead.
I tested my new credentials and config files by deploying to the staging environment, and once I had figured out all the necessary configuration variables everything worked. I then deployed to production and again confirmed that the site worked as expected. I didn't realize it at the time, but I had just made a mess. The first red flag was that the output from the `s3_website push` command was much longer than expected - a whole bunch of posts had been updated. I briefly looked into it and realized this was down to encoding differences in the files themselves - most likely because the Markdown files had been pushed to GitHub and then fetched again on my new laptop. So no real issues there, just noise.
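If line endings were indeed the culprit, one way to avoid this kind of noise is to pin them in a `.gitattributes` file (just a sketch of the idea):

```
# .gitattributes - normalize text files and check them out with LF line endings
* text=auto eol=lf
```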
I kept looking at the output from the push to S3 and noticed another problem - one of the posts was being deleted in S3. I realized that I had probably deployed my last post but never pushed it to GitHub. Luckily I could get the content from the Wayback Machine and, with ChatGPT's help, easily re-create the Markdown file, add it to the repo, and restore it to S3/Cloudfront.
I continued perusing the output from the push to S3 and then noticed something really odd - I had somehow uploaded the two config files, `production.config` and `staging.config`, to S3. I navigated to the config file's URL on the production site and saw my access keys displayed in plain text. Great. Luckily I had followed best practices (or at least above-average practices) when I created the access keys - they only had access to the S3 bucket and Cloudfront distribution, so the potential damage was limited. And since I realized my blunder right away, I could invalidate the access keys immediately, delete the config files from S3, and invalidate the Cloudfront distributions - before any proper damage was done.
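For reference, this kind of cleanup can also be done entirely from the AWS CLI; roughly along these lines, with placeholder user, key, bucket, and distribution IDs:

```bash
# Deactivate the exposed access key (delete it once nothing depends on it)
aws iam update-access-key --user-name blog-deploy --access-key-id AKIAEXAMPLEKEYID --status Inactive

# Remove the leaked config files from the bucket
aws s3 rm s3://example-blog-bucket/production.config
aws s3 rm s3://example-blog-bucket/staging.config

# Evict any cached copies from Cloudfront
aws cloudfront create-invalidation --distribution-id E1EXAMPLE12345 --paths "/production.config" "/staging.config"
```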
At this point I was pretty set on fixing the release process, since I was clearly doing something novel here and it had finally come back to bite me. There are quite a few options that handle all of the deployment details, e.g. Netlify, but I really like having full control over the entire process. I also wanted to keep the staging and production environments separate, so I turned to GitHub Actions.
My rough idea was:

- Every push to `main` should deploy to staging
- Deploys to production can be triggered manually
I followed the process outlined in this post from PagerTree and found it to be really straightforward. This was literally my first time writing a GitHub Actions workflow from scratch, and I was pleasantly surprised by how easy it was. I ended up writing two very similar workflows for staging and production, the key differences being the trigger and the access keys used for each environment. I considered other approaches - using different branches, or a single set of access keys - but in the end I'm pretty happy with this result.
```yaml
name: Deploy to staging on push to main

on:
  push:
    branches: [main]
  workflow_dispatch:

env:
  AWS_ACCESS_KEY_ID: ${{ secrets.STAGING_AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.STAGING_AWS_SECRET_ACCESS_KEY }}
  AWS_DEFAULT_REGION: "us-east-1"

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Ruby
        uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true
      - name: "Build Site"
        run: bundle exec jekyll build
        env:
          JEKYLL_ENV: production
      - name: "Deploy to AWS S3"
        run: aws s3 sync ./_site/ s3://${{ secrets.STAGING_AWS_S3_BUCKET }} --acl public-read --delete --cache-control max-age=604800
      - name: "Create AWS Cloudfront Invalidation"
        run: aws cloudfront create-invalidation --distribution-id ${{ secrets.STAGING_AWS_CLOUDFRONT_DISTRIBUTION_ID }} --paths "/*"
```
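The `STAGING_`-prefixed values live in the repository's Actions secrets. Assuming the bucket name and distribution ID are stored as secrets with the same prefix (as in the sync and invalidation steps above), one way to register them is with the GitHub CLI, using placeholder values here:

```bash
gh secret set STAGING_AWS_ACCESS_KEY_ID --body "AKIAEXAMPLEKEYID"
gh secret set STAGING_AWS_SECRET_ACCESS_KEY --body "example-secret-access-key"
gh secret set STAGING_AWS_S3_BUCKET --body "staging.example.com"
gh secret set STAGING_AWS_CLOUDFRONT_DISTRIBUTION_ID --body "E1EXAMPLE12345"
```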
For the production workflow I just swapped out the variable prefixes and removed the `push` trigger. The fact that you're reading this proves that it worked!