Running a production-ready application on EC2 gives you maximum freedom but also maximum responsibility. By production-ready, I mean:
- Highly available: no single point of failure
- Scalable: increase or decrease the number of instances based on load
- Frictionless deployment: deliver new versions of your application automatically without downtime
- Secure: patch operating systems and libraries frequently, and follow the least privilege principle in all areas
- Operations: provide tools like logging, monitoring and alerting to recognize and debug problems
The overall architecture will consist of a load balancer, forwarding requests to multiple EC2 instances, distributed among different availability zones (data centers).
The diagram was created with Cloudcraft.
AWS Velocity Series
Most of our clients use AWS to reduce time-to-market following an agile approach. But AWS is only one part of the solution. In this article series, I show you how we help our clients improve velocity: the time from idea to production.
Let’s start simple and tackle all the challenges along the way.
A single EC2 instance is a single point of failure
A single EC2 instance is a single point of failure. When you want to run a production-ready app on EC2, you need more than one EC2 instance. Luckily, AWS provides a way to manage multiple EC2 instances: the Auto Scaling Group. But if you run multiple EC2 instances to serve your application, you also need a load balancer to distribute the requests across those instances.
In the Local development environment part of this series, you created an infrastructure folder, which is still empty. It’s time to change this. You will now create a CloudFormation template that describes the infrastructure needed to run the app on EC2 instances.
Load balancer
You can follow step by step or get the full source code here: https://github.com/widdix/aws-velocity
Create a file infrastructure/ec2.yml. The first part of the file contains the load balancer. To fully describe an Application Load Balancer, you need:
- A Security Group that allows traffic on port 80
- The Application Load Balancer itself
- A Target Group, which is a fleet of EC2 instances that can receive traffic from the load balancer
- A Listener, which wires the load balancer together with the target group and defines the listening port
Watch out for comments with more detailed information in the code.
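As a rough orientation, the four building blocks listed above could be sketched like this. This is a minimal sketch, not the template from the repository: the VpcId and SubnetIds parameters, all resource names, and the health check path are assumptions made for illustration.

```yaml
# Sketch only: parameter names, resource names and the health check path are assumptions.
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  VpcId:
    Type: 'AWS::EC2::VPC::Id'
  SubnetIds: # subnets in at least two availability zones
    Type: 'List<AWS::EC2::Subnet::Id>'
Resources:
  # The load balancer accepts HTTP traffic on port 80 from anywhere
  LoadBalancerSecurityGroup:
    Type: 'AWS::EC2::SecurityGroup'
    Properties:
      GroupDescription: 'Allow HTTP traffic to the load balancer'
      VpcId: !Ref VpcId
      SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: 80
        ToPort: 80
        CidrIp: '0.0.0.0/0'
  LoadBalancer:
    Type: 'AWS::ElasticLoadBalancingV2::LoadBalancer'
    Properties:
      Scheme: 'internet-facing'
      SecurityGroups:
      - !GetAtt 'LoadBalancerSecurityGroup.GroupId'
      Subnets: !Ref SubnetIds
  # The fleet of EC2 instances that receives traffic; the app listens on port 3000
  TargetGroup:
    Type: 'AWS::ElasticLoadBalancingV2::TargetGroup'
    Properties:
      Port: 3000
      Protocol: HTTP
      VpcId: !Ref VpcId
      HealthCheckPath: '/'
  # Wires the load balancer to the target group and defines the listening port
  Listener:
    Type: 'AWS::ElasticLoadBalancingV2::Listener'
    Properties:
      LoadBalancerArn: !Ref LoadBalancer
      Port: 80
      Protocol: HTTP
      DefaultActions:
      - Type: forward
        TargetGroupArn: !Ref TargetGroup
```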
But how do you get notified if something goes wrong? Let’s add a parameter to the Parameters section to make the receiver configurable:
AdminEmail: |
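A parameter of this kind might look roughly like the following. The description text and the String type are assumptions, only the name AdminEmail comes from the article.

```yaml
# Sketch only: description and type are assumptions.
Parameters:
  AdminEmail:
    Description: 'Email address that receives alerts'
    Type: String
```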
Alerts are triggered by a CloudWatch Alarm, which can send an alert to an SNS topic. You can subscribe to this topic with an email address to receive the alerts. Let’s create an SNS topic and two alarms in the Resources section:
# An SNS topic is used to send alerts via email to the value of the AdminEmail parameter
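One way to express the topic and the two alarms is sketched below. It assumes that the template already contains an AdminEmail parameter and resources named LoadBalancer and TargetGroup; the resource names, periods, and thresholds are illustrative choices, not necessarily the values used in the repository.

```yaml
# Sketch only: resource names, periods and thresholds are assumptions.
Resources:
  Alerts:
    Type: 'AWS::SNS::Topic'
    Properties:
      Subscription:
      - Endpoint: !Ref AdminEmail
        Protocol: email
  # 5XX responses generated by the load balancer itself
  LoadBalancer5XXAlarm:
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Load balancer returns 5XX status codes'
      Namespace: 'AWS/ApplicationELB'
      MetricName: HTTPCode_ELB_5XX_Count
      Dimensions:
      - Name: LoadBalancer
        Value: !GetAtt 'LoadBalancer.LoadBalancerFullName'
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      Threshold: 0
      ComparisonOperator: GreaterThanThreshold
      TreatMissingData: notBreaching # the metric is only reported when errors occur
      AlarmActions:
      - !Ref Alerts
  # 5XX responses generated by the application behind the load balancer
  Target5XXAlarm:
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Application returns 5XX status codes'
      Namespace: 'AWS/ApplicationELB'
      MetricName: HTTPCode_Target_5XX_Count
      Dimensions:
      - Name: LoadBalancer
        Value: !GetAtt 'LoadBalancer.LoadBalancerFullName'
      - Name: TargetGroup
        Value: !GetAtt 'TargetGroup.TargetGroupFullName'
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      Threshold: 0
      ComparisonOperator: GreaterThanThreshold
      TreatMissingData: notBreaching
      AlarmActions:
      - !Ref Alerts
```

Keep in mind that SNS sends a confirmation email first; alerts only arrive after you confirm the subscription.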
Let’s recap what you implemented: A load balancer with a firewall rule that allows traffic on port 80. If the load balancer or the app returns 5XX status codes, you will receive an email. But the load balancer alone is not enough. Now it’s time to add the EC2 instances.
EC2 instances
So far, there are no EC2 instances. Let’s change that by adding a few more parameters in the Parameters section to make EC2 instances configurable:
# A bastion host increases the security of your system. In this case, we use one of our Free Templates for AWS CloudFormation (https://github.com/widdix/aws-cf-templates/tree/master/vpc).
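The parameters could be sketched along these lines. KeyName is referenced by the conditions in this article; SSHBastionSecurityGroupId, InstanceType, and ImageId are hypothetical names chosen for this sketch, and the repository may wire the bastion host differently (for example through a parent stack).

```yaml
# Sketch only: all names except KeyName are hypothetical.
Parameters:
  KeyName:
    Description: 'Optional EC2 key pair for SSH access (leave empty to disable)'
    Type: String
    Default: ''
  SSHBastionSecurityGroupId:
    Description: 'Optional Security Group of a bastion host (leave empty to allow SSH from the world)'
    Type: String
    Default: ''
  InstanceType:
    Description: 'EC2 instance type'
    Type: String
    Default: 't2.micro'
  ImageId:
    Description: 'AMI used to launch the EC2 instances'
    Type: 'AWS::EC2::Image::Id'
```

Using an empty string as the default makes a parameter optional, which is what the conditions test for.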
To make the template react differently to different parameter inputs, you need to add a few Conditions that will be used later in the template:
HasKeyName: !Not [!Equals [!Ref KeyName, '']]
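A complete Conditions section might look like the sketch below. The condition names are the ones used in this article; the SSHBastionSecurityGroupId parameter they test is a hypothetical name and has to match whatever parameter you use for the bastion host.

```yaml
# Sketch only: SSHBastionSecurityGroupId is a hypothetical parameter name.
Conditions:
  # True if a key pair name was passed in
  HasKeyName: !Not [!Equals [!Ref KeyName, '']]
  # True if a bastion host Security Group was passed in
  HasSSHBastionSecurityGroup: !Not [!Equals [!Ref SSHBastionSecurityGroupId, '']]
  # True if no bastion host is used
  HasNotSSHBastionSecurityGroup: !Equals [!Ref SSHBastionSecurityGroupId, '']
```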
Now everything is prepared to describe the EC2 instances. You need:
- A Security Group that allows
  - traffic on port 3000 from the load balancer Security Group
  - traffic on port 22 from the bastion host Security Group if the condition HasSSHBastionSecurityGroup is met
  - traffic on port 22 from the world if the condition HasNotSSHBastionSecurityGroup is met
- An Auto Scaling Group that defines how many EC2 instances should run
- A CloudWatch Logs Group to capture the logs
- An Instance Profile to reference the IAM Role
- An IAM Role that allows access to deliver logs to CloudWatch Logs
- A Launch Configuration that defines what kind of EC2 instances should be created by the Auto Scaling Group
Now create the fleet of EC2 instances in the Resources section:
# The app listens on port 3000, but only the load balancer is allowed to send traffic to that port!
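The following sketch shows one possible shape of these resources. It assumes parameters VpcId, SubnetIds, and SSHBastionSecurityGroupId, the two bastion conditions, and resources named LoadBalancerSecurityGroup, TargetGroup, and LaunchConfiguration elsewhere in the same template; all of those names, the log retention, and the grace period are illustrative.

```yaml
# Sketch only: names, retention and grace period are assumptions.
Resources:
  SecurityGroup:
    Type: 'AWS::EC2::SecurityGroup'
    Properties:
      GroupDescription: 'App instances'
      VpcId: !Ref VpcId
      SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: 3000
        ToPort: 3000
        SourceSecurityGroupId: !GetAtt 'LoadBalancerSecurityGroup.GroupId'
  SSHFromBastion: # only created if a bastion host is used
    Type: 'AWS::EC2::SecurityGroupIngress'
    Condition: HasSSHBastionSecurityGroup
    Properties:
      GroupId: !GetAtt 'SecurityGroup.GroupId'
      IpProtocol: tcp
      FromPort: 22
      ToPort: 22
      SourceSecurityGroupId: !Ref SSHBastionSecurityGroupId
  SSHFromWorld: # only created if no bastion host is used
    Type: 'AWS::EC2::SecurityGroupIngress'
    Condition: HasNotSSHBastionSecurityGroup
    Properties:
      GroupId: !GetAtt 'SecurityGroup.GroupId'
      IpProtocol: tcp
      FromPort: 22
      ToPort: 22
      CidrIp: '0.0.0.0/0'
  Logs:
    Type: 'AWS::Logs::LogGroup'
    Properties:
      RetentionInDays: 14
  Role: # least privilege: the instances may only write to their own log group
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service: 'ec2.amazonaws.com'
          Action: 'sts:AssumeRole'
      Policies:
      - PolicyName: logs
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - 'logs:CreateLogStream'
            - 'logs:PutLogEvents'
            - 'logs:DescribeLogStreams'
            Resource: !GetAtt 'Logs.Arn'
  InstanceProfile:
    Type: 'AWS::IAM::InstanceProfile'
    Properties:
      Roles:
      - !Ref Role
  AutoScalingGroup:
    Type: 'AWS::AutoScaling::AutoScalingGroup'
    Properties:
      LaunchConfigurationName: !Ref LaunchConfiguration
      MinSize: '2'
      MaxSize: '4'
      HealthCheckType: ELB # replace instances that fail the load balancer health check
      HealthCheckGracePeriod: 300
      TargetGroupARNs:
      - !Ref TargetGroup
      VPCZoneIdentifier: !Ref SubnetIds
```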
Let’s recap what you implemented: A firewall rule that allows traffic on port 3000 (the application’s port). Depending on whether you use the bastion host approach or not, an appropriate firewall rule is created to allow SSH access. You also added an Auto Scaling Group that can scale between 2 and 4 instances. So far, you have not defined what kind of EC2 instances you want to start. Let’s do this in the Resources section:
# Log files that reside on EC2 instances must be avoided because instances come and go depending on load. CloudWatch Logs provides a centralized way to store and search logs.
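A Launch Configuration along these lines is sketched below. It assumes an Amazon Linux AMI (which ships the cfn helper scripts under /opt/aws/bin and offers the awslogs package), and it assumes parameters ImageId, InstanceType, and KeyName, the condition HasKeyName, and resources named SecurityGroup, InstanceProfile, Logs, and AutoScalingGroup in the same template. Installing and starting the app itself is left out.

```yaml
# Sketch only: assumes Amazon Linux; names and the shipped log file are assumptions.
Resources:
  LaunchConfiguration:
    Type: 'AWS::AutoScaling::LaunchConfiguration'
    Metadata:
      'AWS::CloudFormation::Init': # read by cfn-init on the instance
        config:
          packages:
            yum:
              awslogs: []
          files:
            '/etc/awslogs/awscli.conf':
              content: !Sub |
                [default]
                region = ${AWS::Region}
                [plugins]
                cwlogs = cwlogs
              mode: '000644'
              owner: root
              group: root
            '/etc/awslogs/awslogs.conf':
              content: !Sub |
                [general]
                state_file = /var/lib/awslogs/agent-state
                [/var/log/messages]
                file = /var/log/messages
                log_group_name = ${Logs}
                log_stream_name = {instance_id}/var/log/messages
              mode: '000644'
              owner: root
              group: root
          services:
            sysvinit:
              awslogs:
                enabled: true
                ensureRunning: true
    Properties:
      ImageId: !Ref ImageId
      InstanceType: !Ref InstanceType
      IamInstanceProfile: !Ref InstanceProfile
      KeyName: !If [HasKeyName, !Ref KeyName, !Ref 'AWS::NoValue']
      SecurityGroups:
      - !GetAtt 'SecurityGroup.GroupId'
      UserData:
        'Fn::Base64': !Sub |
          #!/bin/bash -x
          /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource LaunchConfiguration --region ${AWS::Region}
          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource AutoScalingGroup --region ${AWS::Region}
```

The cfn-signal call only has an effect if the Auto Scaling Group waits for signals, for example through a CreationPolicy or UpdatePolicy attribute.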
Let’s recap what you implemented: The Launch Configuration defines what kind of EC2 instances the Auto Scaling Group creates. The cfn-init script reads Metadata from CloudFormation to configure a running EC2 instance dynamically. The cfn-signal script reports to CloudFormation whether the EC2 instance started successfully or not. CloudWatch Logs stores the log files that are delivered by an agent running on the EC2 instance.
Auto Scaling
So far, the number of EC2 instances is static. To scale based on the load you need to add
- Scaling Policies to define what should happen if the system should scale up/down
- CloudWatch Alarms to trigger a Scaling Policy based on a metric such as CPU utilization
to the Resources section:
# Increase the number of instances by 25% but at least by one not more often than every 10 minutes.
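Scaling Policies and the alarms that trigger them could look roughly like this. The sketch assumes a resource named AutoScalingGroup in the same template; the CPU thresholds of 60% and 30% and the resource names are illustrative assumptions, while the 25% step, the minimum of one instance, and the 10 minute cooldown follow the comment above.

```yaml
# Sketch only: resource names and CPU thresholds are assumptions.
Resources:
  ScalingUpPolicy:
    Type: 'AWS::AutoScaling::ScalingPolicy'
    Properties:
      AutoScalingGroupName: !Ref AutoScalingGroup
      AdjustmentType: PercentChangeInCapacity
      ScalingAdjustment: 25 # add 25% ...
      MinAdjustmentMagnitude: 1 # ... but at least one instance
      Cooldown: '600' # wait 10 minutes before scaling again
  CPUHighAlarm:
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      Namespace: 'AWS/EC2'
      MetricName: CPUUtilization
      Dimensions:
      - Name: AutoScalingGroupName
        Value: !Ref AutoScalingGroup
      Statistic: Average
      Period: 300
      EvaluationPeriods: 1
      Threshold: 60
      ComparisonOperator: GreaterThanThreshold
      AlarmActions:
      - !Ref ScalingUpPolicy
  ScalingDownPolicy:
    Type: 'AWS::AutoScaling::ScalingPolicy'
    Properties:
      AutoScalingGroupName: !Ref AutoScalingGroup
      AdjustmentType: PercentChangeInCapacity
      ScalingAdjustment: -25
      MinAdjustmentMagnitude: 1
      Cooldown: '600'
  CPULowAlarm:
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      Namespace: 'AWS/EC2'
      MetricName: CPUUtilization
      Dimensions:
      - Name: AutoScalingGroupName
        Value: !Ref AutoScalingGroup
      Statistic: Average
      Period: 300
      EvaluationPeriods: 1
      Threshold: 30
      ComparisonOperator: LessThanThreshold
      AlarmActions:
      - !Ref ScalingDownPolicy
```

Keep a gap between the two thresholds; otherwise the group may keep flipping between scaling up and scaling down.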
Let’s recap what you implemented: The Scaling Policy defines what happens when you want to scale, while a CloudWatch Alarm triggers the Scaling Policy based on live metrics like CPUUtilization. The Auto Scaling Group now keeps a dynamic number of EC2 instances but always ensures that no fewer than two and no more than four instances are running.
One thing is missing: Monitoring of your EC2 instances. Add
- A CloudWatch Alarm to monitor the CPU utilization
- A Log Filter that searches for the word Error in the logs and puts the result count into a CloudWatch Metric
- A CloudWatch Alarm that monitors the Log Filter output
to your Resources section:
# Sends an alert if the average CPU load of the past 5 minutes is higher than 85%
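The three monitoring resources might be sketched as follows. The sketch assumes resources named AutoScalingGroup, Logs, and Alerts in the same template; the metric name AppErrors, the use of the stack name as metric namespace, and the thresholds of the error alarm are hypothetical choices.

```yaml
# Sketch only: resource names, metric name and namespace are assumptions.
Resources:
  CPUTooHighAlarm:
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Average CPU load of the past 5 minutes is higher than 85%'
      Namespace: 'AWS/EC2'
      MetricName: CPUUtilization
      Dimensions:
      - Name: AutoScalingGroupName
        Value: !Ref AutoScalingGroup
      Statistic: Average
      Period: 300
      EvaluationPeriods: 1
      Threshold: 85
      ComparisonOperator: GreaterThanThreshold
      AlarmActions:
      - !Ref Alerts
  # Counts log events that contain the word Error
  ErrorMetricFilter:
    Type: 'AWS::Logs::MetricFilter'
    Properties:
      LogGroupName: !Ref Logs
      FilterPattern: Error
      MetricTransformations:
      - MetricNamespace: !Ref 'AWS::StackName'
        MetricName: AppErrors
        MetricValue: '1'
  ErrorAlarm:
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'The app logged at least one Error'
      Namespace: !Ref 'AWS::StackName'
      MetricName: AppErrors
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      Threshold: 0
      ComparisonOperator: GreaterThanThreshold
      TreatMissingData: notBreaching # no matching log events means no data points
      AlarmActions:
      - !Ref Alerts
```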
The infrastructure is ready now. Read the next part of the series to learn how to set up the CI/CD pipeline to deploy the EC2-based app.
Series
- Set the assembly line up
- Local development environment
- CI/CD Pipeline as Code
- Running your application
- EC2 based app
  a. Infrastructure (you are here)
  b. CI/CD Pipeline
- Containerized ECS based app
  a. Infrastructure
  b. CI/CD Pipeline
- Serverless app
- Summary
You can find the source code on GitHub.