
What is this site?

kyle

This site is a blogging platform. It only shows my posts by default, but if you're signed in, you can post as well. I wrote this site from scratch in Python and TypeScript using Django and Vue. It's free software, and you can see the code on GitHub (that link is the current version of the code, see here for the version from when this post was written).

Before going further, I should note that for most use cases, I would not write a site like this from scratch. There are countless off-the-shelf solutions (even free ones) for this kind of site, which would take far less time and effort, would probably work better, and could be extended as needed.

I wrote this site because I'm between jobs and wanted to have some recent project to show off in line with my work (I'm a web developer, if you couldn't guess). It's also a fun way to experiment with different technologies that interest me. In short, I just wanted to have a website.

Testing Workflow

I'm a firm believer in test-driven development. I think that as much as possible, development should proceed by

  1. Writing an automated test that currently fails
  2. Editing code to make that failing test pass
  3. Refactoring code as needed to clean it up, keeping the tests passing
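The steps above can be sketched with a tiny example (pytest-style assertions and a hypothetical slugify helper, not code from this site):

```python
# Step 1: write a failing test for behavior that doesn't exist yet.
def test_slugify_lowercases_and_hyphenates():
    assert slugify("Hello, World!") == "hello-world"

# Step 2: write just enough code to make that test pass.
import re

def slugify(title: str) -> str:
    # Keep only alphanumeric runs, then join the words with hyphens.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

# Step 3: refactor freely -- rerunning the test guards against regressions.
test_slugify_lowercases_and_hyphenates()
```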

To enable this, I set up an e2e project in the repository that uses Playwright to drive browsers against the site and verify that features work from the user's perspective, with as little knowledge of the code as possible.
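I won't reproduce the repo's actual suite here, but a Playwright test in that spirit might look like this (Python sync API; the URL, labels, and credentials are all invented for illustration):

```python
from playwright.sync_api import sync_playwright

# Hypothetical e2e test: a signed-in user can publish a post and see it.
def test_signed_in_user_can_post():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("http://localhost:8000/")

        # Interact with the UI exactly as a user would -- no knowledge of
        # components or API routes, just visible labels and roles.
        page.get_by_role("link", name="Sign in").click()
        page.get_by_label("Username").fill("kyle")
        page.get_by_label("Password").fill("correct horse battery staple")
        page.get_by_role("button", name="Sign in").click()

        page.get_by_role("link", name="New post").click()
        page.get_by_label("Title").fill("Hello from Playwright")
        page.get_by_role("button", name="Publish").click()

        # The assertion is on what the user can actually see.
        assert page.get_by_text("Hello from Playwright").is_visible()
        browser.close()
```

Note that the assertions never touch the database or the Vue components directly; that's what keeps the test honest about the user experience.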

Once you have a failing e2e test, the question becomes: why is it failing? Is it trying to use a button that doesn't exist? Is the data not being saved to the database? Whatever the cause, once you figure it out, you can proceed down the stack and test more specifically. If a component needs to render a button, you can add a unit test that it does so, make that test pass, and then run the e2e test again to see what your next step is.

This process keeps you focused on real user requirements. Everything you do is in service of specified, user-facing features. As these tests accumulate, they form an executable specification of whether all the features you designed still work. They will tell you if you accidentally break something, and if you're not sure why some code is there or whether it's safe to remove, you can often just remove it and see which tests break.

Infrastructure

The cloud services the site uses are all provisioned via the Terraform code in the infra directory. This means that running terraform apply will provision everything needed to run the app and monitor it in production.

The site is hosted on AWS. I configured a VPC to contain a publicly accessible EC2 instance connected to a private RDS instance running Postgres. The EC2 instance is available at a static Elastic IP address, and we configure DNS records in Cloudflare to point at it. Cloudflare also provides some protection against bots and DDoS attacks.
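A heavily condensed sketch of what that Terraform might look like (resource names, sizes, and CIDRs are illustrative, not the repo's actual config):

```hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# Public subnet for the web server; the RDS instance lives in
# private subnets and is not reachable from the internet.
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.small"
  subnet_id     = aws_subnet.public.id
}

resource "aws_eip" "web" {
  instance = aws_instance.web.id
}

resource "aws_db_instance" "db" {
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  publicly_accessible = false
}

# Point DNS at the Elastic IP via Cloudflare.
resource "cloudflare_record" "site" {
  zone_id = var.cloudflare_zone_id
  name    = "@"
  type    = "A"
  value   = aws_eip.web.public_ip
  proxied = true
}
```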

Terraform also sets up an SSH key for the EC2 instance and saves it to the local filesystem for easy access.
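This is a standard Terraform pattern, roughly (names and paths are illustrative):

```hcl
resource "tls_private_key" "deploy" {
  algorithm = "ED25519"
}

resource "aws_key_pair" "deploy" {
  key_name   = "blog-deploy"
  public_key = tls_private_key.deploy.public_key_openssh
}

# Write the private key locally so we can ssh in for deploys.
resource "local_sensitive_file" "deploy_key" {
  filename        = "${path.module}/deploy_key.pem"
  content         = tls_private_key.deploy.private_key_openssh
  file_permission = "0600"
}
```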

The EC2 instance just runs a one-node Docker Swarm and deploys our production Docker images with docker stack. In production, the frontend container runs Nginx. It serves the compiled Vue app and proxies API calls to the backend container.
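The Nginx config for that pattern is roughly as follows (the paths, port, and upstream name are assumptions, not taken from the repo):

```nginx
server {
    listen 80;

    # Serve the compiled Vue app.
    root /usr/share/nginx/html;

    location / {
        # Fall back to index.html so client-side routing works.
        try_files $uri $uri/ /index.html;
    }

    # Proxy API calls to the Django backend container.
    location /api/ {
        proxy_pass http://backend:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```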

We also provision a Grafana Cloud stack. This gives us a centralised place to monitor logs and metrics from across our services.

The Docker daemon on the EC2 instance is configured to push its logs to Amazon CloudWatch, where an AWS Lambda function then pushes them to Loki in Grafana Cloud.
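I can't reproduce the actual Lambda here, but its core is a pure transformation from the CloudWatch Logs subscription payload (base64-encoded, gzipped JSON) into Loki's push format. A sketch, with the stream label as an assumption:

```python
import base64
import gzip
import json

def cloudwatch_event_to_loki_payload(event: dict) -> dict:
    """Convert a CloudWatch Logs subscription event into a Loki push body.

    CloudWatch delivers log batches as base64-encoded, gzipped JSON under
    event["awslogs"]["data"]; Loki expects {"streams": [...]} where each
    value is a [nanosecond-timestamp, line] pair.
    """
    raw = base64.b64decode(event["awslogs"]["data"])
    data = json.loads(gzip.decompress(raw))

    values = [
        # CloudWatch timestamps are milliseconds; Loki wants nanoseconds.
        [str(e["timestamp"] * 1_000_000), e["message"]]
        for e in data["logEvents"]
    ]
    return {
        "streams": [
            {
                "stream": {"log_group": data["logGroup"]},
                "values": values,
            }
        ]
    }

# The Lambda handler would then POST this payload as JSON to the
# Grafana Cloud Loki endpoint (/loki/api/v1/push) with basic auth.
```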

The code is also instrumented with OpenTelemetry, which pushes traces to Tempo in Grafana Cloud. This gives us an easy way to see how a user action looks across our whole system.

[Screenshots: a request and its corresponding trace]

CI/CD with GitHub Actions

In the spirit of automation, I wrote a couple of GitHub Actions workflows.

The first runs when code is pushed to any branch. It builds the Docker images, checks that the code lints without warnings, and runs all of the unit tests and e2e tests.
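I haven't checked the exact workflow file, but its shape is something like this (job names, commands, and the choice of linter are invented for illustration):

```yaml
name: CI
on: push

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build images
        run: docker compose build
      - name: Lint (linter is an assumption)
        run: docker compose run --rm backend ruff check .
      - name: Unit tests
        run: docker compose run --rm backend python manage.py test
      - name: e2e tests
        run: |
          docker compose up -d
          docker compose run --rm e2e npx playwright test
```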

The second runs when the first one succeeds on main. It builds production Docker images (which are smaller and lack the debugging features, linter, and unit test runners) and pushes them to DockerHub. It then uses a script to generate our updated Docker Stack config and copies it, along with a .env file, to our EC2 instance. Finally, it sshes into the EC2 instance to update the Docker services and run migrations. We run multiple replicas of each service in production so that a rolling update can avoid downtime. At this point, all updates are live on this site.
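The part that runs on the instance boils down to something like the following (stack, service, and file names are illustrative):

```shell
# Run on the EC2 instance over ssh, after the stack config and .env
# have been copied up. Swarm rolling-updates each service in place.
docker stack deploy --compose-file stack.yml blog

# Apply any new Django migrations inside a backend container.
docker exec "$(docker ps -q -f name=blog_backend | head -n1)" \
    python manage.py migrate
```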

Summary

Hopefully some of this was enlightening, whether the technology or the methodology. Feel free to comment and ask for clarification on anything.

This site remains under active development, so I don't know how well this description reflects the current state of things. The tagged git release linked at the start should be fully functional if you want to clone it and play with it, or even deploy it yourself.

  • programming
First posted 12/3/2023, 8:59:28 PM
Updated 10/15/2024, 4:23:35 PM