Automate Orchard Core Deployment on AWS ECS with CloudFormation

For .NET developers looking for a Content Management System (CMS) solution, Orchard Core presents a compelling open-source option. Orchard Core is a CMS built on ASP.NET Core. When deploying Orchard Core on AWS, the Elastic Container Service (ECS) provides a hosting platform that can handle high traffic, keep costs down, and remain stable.

However, finding clear end-to-end instructions for deploying Orchard Core to ECS can be difficult. This often means more testing and troubleshooting, and can lead to a less efficient or less secure setup. The lack of a standard deployment process also complicates infrastructure management and hinders the implementation of CI/CD. This is where Infrastructure as Code (IaC) comes in.

Source Code

The complete CloudFormation template we built in this article is available on GitHub: https://github.com/gcl-team/Experiment.OrchardCore.Main/blob/main/Infrastructure.yml

CloudFormation

IaC provides a solution for automating infrastructure management. With IaC, we define the entire infrastructure that hosts Orchard Core as code. This code can then be version-controlled, tested, and deployed just like application code.

CloudFormation is the AWS service that implements IaC. From a CloudFormation template, AWS automatically provisions and configures all the necessary resources for our Orchard Core hosting, ensuring consistent and repeatable deployments across different environments.

This article is for .NET developers who know a bit about AWS concepts such as ECS or CloudFormation. We will demonstrate how CloudFormation can help to set up the infrastructure for hosting Orchard Core on AWS.

The desired infrastructure of our CloudFormation setup.

Now let’s start writing our CloudFormation template. We begin by defining some useful parameters that we will use later; some of them are discussed in the relevant sections below.

AWSTemplateFormatVersion: '2010-09-09'
Description: "Infrastructure for Orchard Core CMS"

Parameters:
  VpcCIDR:
    Type: String
    Description: "VPC CIDR Block"
    Default: 10.0.0.0/16
    AllowedPattern: '((\d{1,3})\.){3}\d{1,3}/\d{1,2}'
  ApiGatewayStageName:
    Type: String
    Default: "production"
    AllowedValues:
      - production
      - staging
      - development
  ServiceName:
    Type: String
    Default: cld-orchard-core
    Description: "The service name"
  CmsDBName:
    Type: String
    Default: orchardcorecmsdb
    Description: "The name of the database to create"
  CmsDbMasterUsername:
    Type: String
    Default: orchardcoreroot
  HostedZoneId:
    Type: String
    Default: <your Route 53 hosted zone id>
  HostedZoneName:
    Type: String
    Default: <your custom domain>
  CmsHostname:
    Type: String
    Default: orchardcms
  OrchardCoreImage:
    Type: String
    Default: <your ECR link>/orchard-core-cms:latest
  EcsAmi:
    Description: The Amazon Machine Image ID used for the cluster
    Type: AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>
    Default: /aws/service/ecs/optimized-ami/amazon-linux-2023/recommended/image_id
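Once the full template is assembled (saved as Infrastructure.yml, as in the GitHub repository), it can be deployed with the AWS CLI. A minimal sketch is shown below; the stack name is just an example, and because the template creates a named IAM role we must pass CAPABILITY_NAMED_IAM.

# Deploy (or update) the stack from the template file
aws cloudformation deploy \
  --template-file Infrastructure.yml \
  --stack-name orchard-core-cms \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides \
      HostedZoneId=<your Route 53 hosted zone id> \
      HostedZoneName=<your custom domain>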

Dockerfile

The Dockerfile is quite straightforward.

# Global Arguments
ARG DCR_URL=mcr.microsoft.com
ARG BUILD_IMAGE=${DCR_URL}/dotnet/sdk:8.0-alpine
ARG RUNTIME_IMAGE=${DCR_URL}/dotnet/aspnet:8.0-alpine

# Build Container
FROM ${BUILD_IMAGE} AS builder
WORKDIR /app

COPY . .

RUN dotnet restore
RUN dotnet publish ./OCBC.HeadlessCMS/OCBC.HeadlessCMS.csproj -c Release -o /app/src/out

# Runtime Container
FROM ${RUNTIME_IMAGE}

## Install cultures
RUN apk add --no-cache \
icu-data-full \
icu-libs

ENV ASPNETCORE_URLS http://*:5000

WORKDIR /app

COPY --from=builder /app/src/out .

EXPOSE 5000

ENTRYPOINT ["dotnet", "OCBC.HeadlessCMS.dll"]

With the Dockerfile in place, we can then build the Orchard Core image locally with the command below.

docker build --platform=linux/amd64 -t orchard-core-cms:v1 .

The --platform flag specifies the target OS and architecture for the image being built. Even though it is optional, it is particularly useful when building images on a different platform (like macOS or Windows) and deploying them to another platform (like Amazon Linux) that has a different architecture.

ARM-based Apple Silicon was announced in 2020. (Image Credit: The Verge)

I am using macOS with ARM-based Apple Silicon, whereas the Amazon Linux AMI we target uses the amd64 (x86_64) architecture. Hence, if I do not specify the platform, the image I build on my MacBook will be incompatible with the EC2 instance.

Once the image is built, we will push it to the Elastic Container Registry (ECR).
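As a rough sketch, pushing the image from the command line looks like the following; the account ID and region are placeholders, and the orchard-core-cms repository is assumed to exist in ECR already (it can be created once with aws ecr create-repository).

# Authenticate the local Docker client against our private ECR registry
aws ecr get-login-password --region <your region> | \
  docker login --username AWS --password-stdin <your account id>.dkr.ecr.<your region>.amazonaws.com

# Tag the locally built image with the ECR repository URI and push it
docker tag orchard-core-cms:v1 <your account id>.dkr.ecr.<your region>.amazonaws.com/orchard-core-cms:v1
docker push <your account id>.dkr.ecr.<your region>.amazonaws.com/orchard-core-cms:v1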

We choose ECR because it is directly integrated with ECS, which makes deploying images from ECR to ECS smooth. When ECS needs to pull an image from ECR, it automatically uses the IAM role to authenticate and authorise the request. The ECS task execution role is associated with the AmazonECSTaskExecutionRolePolicy IAM policy, which allows ECS to pull images from ECR.

ECR also comes with built-in support for image scanning, which automatically scans our images for vulnerabilities.

Image scanning in ECR helps ensure our images are secure before we deploy them.

Unit 01: IAM Role

Technically, we are able to run Orchard Core on ECS without any ECS task role. However, that is possible only if our Orchard Core app does not need to interact with AWS services. Like most modern web apps, ours needs to integrate with AWS services such as S3 and CloudWatch. Hence, the first thing we need to work on is setting up an ECS task role.

iamRole:
  Type: AWS::IAM::Role
  Properties:
    RoleName: !Sub "${AWS::StackName}-ecs"
    Path: !Sub "/${AWS::StackName}/"
    AssumeRolePolicyDocument:
      Version: 2012-10-17
      Statement:
        - Effect: Allow
          Principal:
            Service:
              - ecs-tasks.amazonaws.com
          Action:
            - sts:AssumeRole

In AWS IAM, permissions are assigned to roles, not directly to the services that need them. Thus, we cannot directly assign IAM policies to ECS tasks. Instead, we assign those policies to a role, and then the ECS task temporarily assumes that role to gain those permissions, as shown in the configuration above.

The permissions are temporary because the task receives short-lived credentials when it assumes the role. Once the ECS task stops, those temporary credentials are no longer valid, and the service loses access to the resources.

Hence, by using roles and AssumeRole, we follow the principle of least privilege. The ECS task is granted only the permissions it needs and can only use them temporarily.

Unit 02: CloudWatch Log Group

ECS tasks, by default, do not have logging enabled.

Hence, granting our ECS task permission to write to CloudWatch Logs is one of the first things we should do when setting up ECS tasks. Setting logging up early helps avoid surprises later on when our ECS tasks are running.

To set up the logging, we first need a Log Group, the place in CloudWatch Logs where log events are stored. While ECS can create the log group automatically when the ECS task starts (if it does not already exist), it is good practice to define the log group in CloudFormation so that it exists ahead of time and is managed within our IaC.

ecsLogGroup:
  Type: AWS::Logs::LogGroup
  Properties:
    LogGroupName: !Sub "/ecs/${ServiceName}-log-group"
    RetentionInDays: 3
    Tags:
      - Key: Stack
        Value: !Ref AWS::StackName

The following policy will grant the necessary permissions to write logs to CloudWatch.

ecsLoggingPolicy:
  Type: AWS::IAM::Policy
  Properties:
    PolicyName: !Sub "${AWS::StackName}-cloudwatch-logs-policy"
    Roles:
      - !Ref iamRole
    PolicyDocument:
      Version: 2012-10-17
      Statement:
        - Effect: Allow
          Action:
            - logs:CreateLogStream
            - logs:PutLogEvents
          Resource:
            - !Sub "arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/ecs/${ServiceName}-log-group/*"

By separating the logging policy into its own resource, we make it easier to manage and update policies independently of the ECS task role. After defining the policy, we attach it to the ECS task role by referencing it in the Roles section.

The logging setup helps us consolidate log events from the container into a centralised log group in CloudWatch.

Unit 03: S3 Bucket

We will be storing the files uploaded to the Orchard Core through its Media module on Amazon S3. So, we need to configure our S3 Bucket as follows.

mediaContentBucket:
  Type: AWS::S3::Bucket
  Properties:
    BucketName: !Join
      - '-'
      - - !Ref ServiceName
        - !Ref AWS::Region
        - !Ref AWS::AccountId
    OwnershipControls:
      Rules:
        - ObjectOwnership: BucketOwnerPreferred
    Tags:
      - Key: Stack
        Value: !Ref AWS::StackName

Since bucket names must be globally unique, we derive the name from the service name, the AWS Region, and the AWS Account ID.

Since Orchard Core can run as multiple ECS tasks uploading media files to a shared S3 bucket, the BucketOwnerPreferred setting ensures that even if media files are uploaded by different ECS tasks, the bucket owner can still access, delete, or modify any of those media files without needing additional permissions for each uploaded object.

The bucket owner having full control is a security necessity in many cases because it allows the owner to apply policies, access controls, and auditing in a centralised way, maintaining the security posture of the bucket.

However, even if the bucket owner has control, the principle of least privilege should still apply. For example, only the ECS task responsible for Orchard Core should be allowed to interact with the media objects.

mediaContentBucketPolicy:
  Type: AWS::IAM::Policy
  Properties:
    PolicyName: !Sub "${mediaContentBucket}-s3-policy"
    Roles:
      - !Ref iamRole
    PolicyDocument:
      Version: 2012-10-17
      Statement:
        - Effect: Allow
          Action:
            - s3:ListBucket
          Resource: !GetAtt mediaContentBucket.Arn
        - Effect: Allow
          Action:
            - s3:PutObject
            - s3:GetObject
          Resource: !Join ["/", [!GetAtt mediaContentBucket.Arn, "*"]]

The s3:ListBucket permission is necessary for the Orchard Core Media module to work properly, while s3:PutObject and s3:GetObject are used for uploading and downloading media files.

IAM Policy

Now, let’s pause a while to talk about the policies that we have added above for the log group and S3.

In AWS, we mostly deal with managed policies and inline policies depending on whether the policy needs to be reused or tightly scoped to one role.

We use AWS::IAM::ManagedPolicy when a permission set needs to be reused by multiple roles or services, which is why it is frequently used for company-wide security policies. That is not the case for our Orchard Core examples above, so we use AWS::IAM::Policy instead, because it defines a permission that is tightly coupled to a single role and will not be reused elsewhere.

In addition, since an AWS::IAM::Policy is tightly tied to its entities, it is deleted when the corresponding entities are deleted. This is a key difference from AWS::IAM::ManagedPolicy, which remains even if the entities that use it are deleted. This is also why managed policies suit company-wide policies: they provide better long-term management for permissions that may be reused across multiple roles.

We can summarise the differences between the two in the following table.

| Feature | Managed Policy | Policy |
| --- | --- | --- |
| Scope | Company-wide. | Tight coupling to a single entity. |
| Deletion Behaviour | Persists even if attached entities are deleted. | Deleted along with the associated entity. |
| Versioning Support | Supports versioning (can roll back). | No. |
| Limit per Entity | 20. | 10. |
| Best Use Case | Long-term, reusable permissions (e.g., company-wide security policies). | One-off, tightly scoped permissions (e.g., role-specific needs). |
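For comparison, if we ever wanted to share the same CloudWatch Logs permissions across several roles, a reusable managed policy could look like the sketch below; the logical ID, policy name, and wildcard log-group scope are illustrative and not part of our template.

sharedLoggingManagedPolicy:
  Type: AWS::IAM::ManagedPolicy
  Properties:
    ManagedPolicyName: shared-cloudwatch-logs-policy
    PolicyDocument:
      Version: 2012-10-17
      Statement:
        - Effect: Allow
          Action:
            - logs:CreateLogStream
            - logs:PutLogEvents
          Resource: !Sub "arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/ecs/*"
    Roles:
      - !Ref iamRole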

Unit 04: Aurora Database Cluster

Orchard Core supports relational database management systems (RDBMS). Unlike traditional CMS platforms that rely on a single database engine, Orchard Core offers flexibility by supporting multiple RDBMS options, including:

  • Microsoft SQL Server;
  • PostgreSQL;
  • MySQL;
  • SQLite.

While SQLite is lightweight and easy to use, it is not suitable for production deployments on AWS. SQLite is designed for local storage, not multi-user concurrent access. On AWS, fully managed relational databases (RDS and Aurora) are provided instead.

The database engines supported by Amazon RDS and Amazon Aurora.

While Amazon RDS is a well-known choice for relational databases, we can also consider Amazon Aurora, which was launched in 2014. Aurora storage scales automatically, and with its serverless option the compute can scale up and down as well, so we only pay for what we use.

High performance and scalability of Amazon Aurora. (Image Source: Amazon Aurora MySQL PostgreSQL Features)

In addition, Aurora is faster than standard PostgreSQL and MySQL, as shown in the screenshot above. It also offers built-in high availability with Multi-AZ replication. This is critical for a CMS like Orchard Core, which relies on fast queries and efficient data handling.

It is important to note that, while Aurora is optimised for AWS, it does not lock us in, as we retain full control over our data and schema. Hence, if we ever need to switch, we can export data and move to standard MySQL/PostgreSQL on another cloud or on-premises.

Instead of manually setting up Aurora, we will be using CloudFormation to ensure that the correct database instance, networking, security settings, and additional configurations are managed consistently.

Aurora is cluster-based rather than a standalone DB instance like traditional RDS. Thus, instead of a single instance, we deploy a DB cluster, which consists of a primary writer node and optional reader nodes for scalability and high availability.

Because of this cluster-based architecture, Aurora does not use the usual DBParameterGroup like standalone RDS instances. Instead, it requires a DBClusterParameterGroup to apply settings at the cluster level, ensuring all instances in the cluster inherit the same configuration, as shown in the following CloudFormation template.

cmsDBClusterParameterGroup:
  Type: AWS::RDS::DBClusterParameterGroup
  Properties:
    Description: "Aurora Provisioned Postgres DB Cluster Parameter Group"
    Family: aurora-postgresql16
    Parameters:
      timezone: UTC # Ensures consistent timestamps
      rds.force_ssl: 1 # Enforce SSL for security

The first parameter we configure is the timezone. We set it to UTC to ensure consistency. So when we store date-time values in the database, we should use TIMESTAMPTZ for timestamps, and store the time zone as a TEXT field. After that, when we need to display the time in a local format, we can use the AT TIME ZONE feature in PostgreSQL to convert from UTC to the desired local time zone. This is important because PostgreSQL returns all times in UTC, so storing the time zone ensures we can always retrieve and present the correct local time when needed, as shown in the query below.

SELECT event_time_utc AT TIME ZONE timezone AS event_local_time
FROM events;

After that, we enable rds.force_ssl so that all connections to our Aurora cluster are encrypted using SSL. This prevents data from being sent in plaintext. Even if our Aurora database sits behind a bastion host, enforcing SSL is still recommended because it encrypts all data in transit, adding an extra layer of security. It is also worth mentioning that enabling SSL has little performance impact while adding a significant security benefit.
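A quick way to confirm the enforcement is to connect with psql from a host inside the VPC; with rds.force_ssl set to 1, a connection that explicitly disables SSL should be rejected. The cluster endpoint below is a placeholder, while the database and user names come from our parameter defaults.

# Succeeds: the connection is encrypted in transit
psql "host=<your cluster endpoint> port=5432 dbname=orchardcorecmsdb user=orchardcoreroot sslmode=require"

# Fails when rds.force_ssl is enabled, because unencrypted connections are refused
psql "host=<your cluster endpoint> port=5432 dbname=orchardcorecmsdb user=orchardcoreroot sslmode=disable"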

Once the DBClusterParameterGroup is configured, the next step is to configure the AWS::RDS::DBCluster resource, where we will define the cluster main configuration with the parameter group defined above.

cmsDatabaseCluster:
  Type: AWS::RDS::DBCluster
  Properties:
    BackupRetentionPeriod: 7
    DatabaseName: !Ref CmsDBName
    DBClusterIdentifier: !Ref AWS::StackName
    DBClusterParameterGroupName: !Ref cmsDBClusterParameterGroup
    DeletionProtection: true
    Engine: aurora-postgresql
    EngineMode: provisioned
    EngineVersion: 16.1
    MasterUsername: !Ref CmsDbMasterUsername
    MasterUserPassword: !Sub "{{resolve:ssm-secure:/OrchardCoreCms/DbPassword:1}}"
    DBSubnetGroupName: !Ref cmsDBSubnetGroup
    VpcSecurityGroupIds:
      - !GetAtt cmsDBSecurityGroup.GroupId
    Tags:
      - Key: Stack
        Value: !Ref AWS::StackName

Let’s go through the Properties.

About BackupRetentionPeriod

The BackupRetentionPeriod parameter in the Aurora DB cluster determines how many days automated backups are retained by AWS. It can be from a minimum of 1 day to a maximum of 35 days for Aurora databases. For most business applications, 7 days of backups is often enough to handle common recovery scenarios unless we are required by law or regulation to keep backups for a certain period.

Aurora automatically performs incremental backups for our database every day, which means it does not back up the entire database each time. Instead, it only stores the changes since the previous backup. This makes the backup process very efficient, especially for databases with little or no change over time. If our CMS database remains relatively static, the backup storage cost stays very low or even free, as long as the total backup data for the retention period does not exceed the size of the cluster volume.

So the total billed usage for backups depends on how much data changes each day and whether the total backup size exceeds the volume size. If our database does not experience massive daily changes, the backup storage will likely remain within the volume size and be free.

About DBClusterIdentifier

For the DBClusterIdentifier, we set it to the stack name, which makes it unique to the specific CloudFormation stack. This can be useful for differentiating clusters.

About DeletionProtection

In production environments, data loss or downtime is costly. DeletionProtection ensures that our CMS DB cluster cannot be deleted unless the protection is explicitly disabled. There is no “shortcut” to bypass it for production resources: if DeletionProtection is enabled on the DB cluster, even CloudFormation will fail to delete it. The only way to delete the DB cluster is to disable DeletionProtection first via the AWS Console, CLI, or SDK.

About EngineMode

In Aurora, EngineMode refers to the database operational mode. There are two primary modes, i.e. provisioned and serverless. For Orchard Core, provisioned mode is typically the better choice because it ensures high availability, automatic recovery, and read scaling. If the CMS receives a consistent level of traffic, provisioned mode will handle that load predictably. Serverless is useful if our CMS workload has unpredictable traffic patterns or usage spikes.

About MasterUserPassword

Storing database passwords directly in the CloudFormation template is a security risk.

There are a few other ways to handle sensitive data like passwords in CloudFormation, for example using AWS Secrets Manager and AWS Systems Manager (SSM) Parameter Store.

AWS Secrets Manager is a more advanced solution that offers automatic password rotation, which is useful for situations where we need to regularly rotate credentials. However, it may incur additional costs.

On the other hand, SSM Parameter Store provides a simpler and cost-effective solution for securely storing and referencing secrets, including database passwords. We can store up to 10,000 parameters (standard type) without any cost.

Hence, we use SSM Parameter Store to securely store the database password and reference it in CloudFormation without exposing it directly in our template, reducing the security risks and providing an easier management path for our secrets.
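Assuming the parameter name used in the MasterUserPassword reference above, the SecureString can be created once with the AWS CLI before deploying the stack; the first creation yields version 1, which is the version the template pins.

# Store the database master password as an encrypted SecureString parameter
aws ssm put-parameter \
  --name "/OrchardCoreCms/DbPassword" \
  --type SecureString \
  --value "<your strong database password>"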

Database password is stored as a SecureString in Parameter Store.

About DBSubnetGroupName and VpcSecurityGroupIds

These two configurations about Subnet and VPC will involve networking considerations. We will discuss further when we dive into the networking setup later.

Unit 05: Aurora Database Instance

Now that we have covered the Aurora DB cluster, which is the overall container for the database, let’s move on to the DB instance.

Think of the cluster as the foundation, and the DB instances are where the actual database operations take place. The DB instances are the ones that handle the read and write operations, replication, and scaling for the workload. So, in order for our CMS to work correctly, we need to define the DB instance configuration, which runs on top of the DB cluster.

cmsDBInstance:
  Type: 'AWS::RDS::DBInstance'
  DeletionPolicy: Retain
  Properties:
    DBInstanceIdentifier: !Sub "${AWS::StackName}-db-instance"
    DBInstanceClass: db.t4g.medium
    DBClusterIdentifier: !Ref cmsDatabaseCluster
    DBSubnetGroupName: !Ref cmsDBSubnetGroup
    Engine: aurora-postgresql
    Tags:
      - Key: Stack
        Value: !Ref AWS::StackName

For our Orchard Core CMS, we do not expect very high traffic or intensive database operations. Hence, we choose db.t4g. T4g database instances are AWS Graviton2-based, so they are more cost-efficient than traditional instance types, especially for workloads like a CMS that do not require continuous high performance. However, there are a few things we may need to look into when using T instance classes, such as how their burstable CPU credits behave under sustained load.

Unit 06: Virtual Private Cloud (VPC)

Now that we have covered how the Aurora cluster and instance work, the next important thing is ensuring they are deployed in a secure and well-structured network. This is where the Virtual Private Cloud (VPC) comes in.

VPC is a virtual network in AWS where we define the infrastructure networking. It is like a private network inside AWS where we can control IP ranges, subnets, routing, and security.

The default VPC in Malaysia region.

By the way, you might have noticed that AWS automatically provides a default VPC in every region. It is a ready-to-use network setup that allows us to launch resources without configuring networking manually.

While it is convenient, it is recommended not to use the default VPC. This is because the default VPC is automatically created with predefined settings, which means we do not have full control over its configuration, such as subnet sizes, routing, security groups, etc. It also has public subnets by default which can accidentally expose internal resources to the Internet.

Since we are setting up our own VPC, one key decision we need to make is the CIDR block, i.e. the range of private IPs we allocate to our network. This is important because it determines how many subnets and IP addresses we can have within our VPC.

To future-proof our infrastructure, we will be using a /16 CIDR block, as shown in the VpcCIDR in our CloudFormation template. This gives us 65,536 IP addresses, which we can break into 64 subnets of /22 (each having 1,024 IPs). 64 subnets is usually more than enough for a well-structured VPC because most companies do not need that many subnets in a single VPC unless they run very complex workloads. If one service ever needs more IPs, we can allocate a larger subnet, for example /21 instead of /22.

In the VPC setup, we also try to avoid creating too many VPCs unnecessarily. Managing multiple VPCs means handling VPC peering, which increases operational overhead.

vpc:
  Type: AWS::EC2::VPC
  Properties:
    CidrBlock: !Ref VpcCIDR
    InstanceTenancy: default
    EnableDnsSupport: true
    EnableDnsHostnames: true
    Tags:
      - Key: Name
        Value: !Sub "${AWS::AccountId}-${AWS::Region}-vpc"

Since our ECS workloads and Orchard Core CMS are public-facing, we need EnableDnsHostnames: true so that public-facing instances get a public DNS name. We also need EnableDnsSupport: true to allow ECS tasks, internal services, and AWS resources like S3 and Aurora to resolve domain names internally.

For InstanceTenancy, which determines whether instances in our VPC run on shared (default) or dedicated hardware, it is recommended to use the default because AWS automatically places instances on shared hardware, which is cost-effective and scalable. We only need to change it if we are asked to use dedicated instances with full hardware isolation.

Now that we have defined our VPC, the next step is planning its subnet structure. We need both public and private subnets for our workloads.

Unit 07: Subnets and Subnet Groups

For our VPC with a /16 CIDR block, we will be breaking it into /24 subnets for better scalability:

  • Public Subnet 1: 10.0.0.0/24
  • Public Subnet 2: 10.0.1.0/24
  • Private Subnet 1: 10.0.2.0/24
  • Private Subnet 2: 10.0.3.0/24

We could let CloudFormation derive these CIDR blocks automatically with !Select and !Cidr; for clarity, we specify them explicitly in the subnet definitions below and show the !Cidr variant afterwards.

# Public Subnets
publicSubnet1:
  Type: AWS::EC2::Subnet
  Properties:
    VpcId: !Ref vpc
    CidrBlock: 10.0.0.0/24
    AvailabilityZone: !Select [0, !GetAZs '']
    Tags:
      - Key: Name
        Value: !Sub "${AWS::AccountId}-${AWS::Region}-public-subnet-1"

publicSubnet2:
  Type: AWS::EC2::Subnet
  Properties:
    VpcId: !Ref vpc
    CidrBlock: 10.0.1.0/24
    AvailabilityZone: !Select [1, !GetAZs '']
    Tags:
      - Key: Name
        Value: !Sub "${AWS::AccountId}-${AWS::Region}-public-subnet-2"

# Private Subnets
privateSubnet1:
  Type: AWS::EC2::Subnet
  Properties:
    VpcId: !Ref vpc
    CidrBlock: 10.0.2.0/24
    AvailabilityZone: !Select [0, !GetAZs '']
    Tags:
      - Key: Name
        Value: !Sub "${AWS::AccountId}-${AWS::Region}-private-subnet-1"

privateSubnet2:
  Type: AWS::EC2::Subnet
  Properties:
    VpcId: !Ref vpc
    CidrBlock: 10.0.3.0/24
    AvailabilityZone: !Select [1, !GetAZs '']
    Tags:
      - Key: Name
        Value: !Sub "${AWS::AccountId}-${AWS::Region}-private-subnet-2"
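If we prefer not to hardcode the CIDR blocks, the same four /24 subnets can be derived from VpcCIDR with !Select and !Cidr. A sketch for the first public subnet might look like this; the other three subnets would select indexes 1 to 3.

publicSubnet1:
  Type: AWS::EC2::Subnet
  Properties:
    VpcId: !Ref vpc
    # Carve 4 subnets with 8 host bits (/24) out of the /16 VPC CIDR and take the first one
    CidrBlock: !Select [0, !Cidr [!Ref VpcCIDR, 4, 8]]
    AvailabilityZone: !Select [0, !GetAZs '']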

For availability zones (AZs), all commercial AWS regions have at least two AZs, with most having three or more. Hence, we do not need to worry that !Select [1, !GetAZs ''] in the template above will fail.

Now that our subnets are set up, we can revisit the DBSubnetGroupName in the Aurora cluster and instance. Aurora clusters are highly available, and AWS recommends placing Aurora DB instances across multiple AZs to ensure redundancy and better fault tolerance. The Subnet Group defines the subnets where Aurora will deploy its instances, which enables the multi-AZ deployment for high availability.

cmsDBSubnetGroup:
  Type: AWS::RDS::DBSubnetGroup
  Properties:
    DBSubnetGroupDescription: "Orchard Core CMS Postgres DB Subnet Group"
    SubnetIds:
      - !Ref privateSubnet1
      - !Ref privateSubnet2
    Tags:
      - Key: Stack
        Value: !Ref AWS::StackName

Unit 08: Security Groups

Earlier, we configured the Subnet Group for Aurora, which defines which subnets the Aurora instances will reside in. Now, we need to ensure that only authorised systems or services can access our database. That is where the Security Group cmsDBSecurityGroup comes into play.

A Security Group acts like a virtual firewall that controls inbound and outbound traffic to our resources, such as our Aurora instances. It is like setting permissions to determine which IP addresses and which ports can communicate with the database.

For Aurora, we will configure the security group to only allow traffic from our private subnets, so that only trusted services within our VPC can reach the database.

cmsDBSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupName: !Sub "${CmsDBName}-security-group"
    GroupDescription: "Permits Access To CMS Aurora Database"
    VpcId: !Ref vpc
    SecurityGroupIngress:
      - CidrIp: !GetAtt privateSubnet1.CidrBlock
        IpProtocol: tcp
        FromPort: 5432
        ToPort: 5432
      - CidrIp: !GetAtt privateSubnet2.CidrBlock
        IpProtocol: tcp
        FromPort: 5432
        ToPort: 5432
    Tags:
      - Key: Name
        Value: !Sub "${CmsDBName}-security-group"
      - Key: Stack
        Value: !Ref AWS::StackName

Here we only configure ingress rules, not egress, because AWS security groups allow all outbound traffic by default.

Unit 09: Elastic Load Balancing (ELB)

Before diving into how we host Orchard Core on ECS, let’s first figure out how traffic will reach our ECS service. In modern cloud web app development and hosting, three key factors matter: reliability, scalability, and performance. And that is why a load balancer is essential.

  • Reliability – If we only have one container and it crashes, the whole app goes down. A load balancer allows us to run multiple containers so that even if one fails, the others keep running.
  • Scalability – As traffic increases, a single container will not be enough. A load balancer lets us add more containers dynamically when needed, ensuring smooth performance.
  • Performance – Handling many requests in parallel prevents slowdowns. A load balancer efficiently distributes traffic to multiple containers, improving response times.

For that, we need an Elastic Load Balancing (ELB) to distribute requests properly.

AWS originally launched ELB with only Classic Load Balancers (CLB). Later, AWS completely redesigned its load balancing services and introduced the following in ElasticLoadBalancingV2:

  • Network Load Balancer (NLB);
  • Application Load Balancer (ALB);
  • Gateway Load Balancer (GLB).
Summary of differences: ALB vs. NLB vs. GLB (Image Source: AWS)

NLB is designed for high performance, low latency, and TCP/UDP traffic. It is optimised for handling millions of requests per second and is well suited for routing traffic to ECS containers. In our setup there is also a practical reason: the API Gateway VPC Link we configure later can only target an NLB, so an NLB in front of our ECS tasks is the natural choice.

ALB is usually better suited for HTTP/HTTPS traffic and offers more advanced routing features for HTTP. Since we only need to pass general traffic through to ECS and let API Gateway handle the HTTP-level concerns, NLB is simpler and more efficient here.

GLB is designed for deploying and scaling third-party virtual appliances such as firewalls and intrusion detection systems, which does not apply to our use case here.

Configure NLB

Setting up an NLB in AWS always involves these three key components:

  • AWS::ElasticLoadBalancingV2::LoadBalancer;
  • AWS::ElasticLoadBalancingV2::TargetGroup;
  • AWS::ElasticLoadBalancingV2::Listener.

Firstly, LoadBalancer distributes traffic across multiple targets such as ECS tasks.

internalNlb:
  Type: AWS::ElasticLoadBalancingV2::LoadBalancer
  Properties:
    Name: !Sub "${ServiceName}-private-nlb"
    Scheme: internal
    Type: network
    Subnets:
      - !Ref privateSubnet1
      - !Ref privateSubnet2
    LoadBalancerAttributes:
      - Key: deletion_protection.enabled
        Value: "true"
    Tags:
      - Key: Stack
        Value: !Ref AWS::StackName

In the template above, we create an NLB (Type: network) that is not exposed to the public internet (Scheme: internal). It is deployed across two private subnets, ensuring high availability. Finally, to prevent accidental deletion, we enable deletion protection; in the future, we must disable it before we can delete the NLB.

Please note that we do not enable cross-zone load balancing here because AWS charges for the inter-AZ traffic it generates on an NLB. Since we plan for each AZ to have the same number of targets, keeping it disabled also keeps traffic local to each AZ.
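Cross-zone load balancing is disabled by default on an NLB; if we ever want to make that choice explicit, or flip it later, it is controlled through a load balancer attribute, roughly like this:

    LoadBalancerAttributes:
      - Key: deletion_protection.enabled
        Value: "true"
      - Key: load_balancing.cross_zone.enabled
        Value: "false"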

Secondly, we need to setup TargetGroup to tell the NLB to send traffic to our ECS tasks running Orchard Core CMS.

nlbTargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  DependsOn:
    - internalNlb
  Properties:
    Name: !Sub "${ServiceName}-target-group"
    Port: 80
    Protocol: TCP
    TargetType: instance
    VpcId: !Ref vpc
    HealthCheckProtocol: HTTP
    HealthCheckPort: 80
    HealthCheckPath: /health
    TargetGroupAttributes:
      - Key: deregistration_delay.timeout_seconds
        Value: 10
    Tags:
      - Key: Stack
        Value: !Ref AWS::StackName

Here, we indicate that the TargetGroup is listening on port 80 and expects TCP traffic. TargetType: instance means NLB will send traffic directly to EC2 instances that are hosting our ECS tasks. We also link it to our VPC to ensure traffic stays within our network.

Even though the NLB uses TCP at the transport layer, it performs health checks at the application layer (HTTP). This ensures that the NLB can intelligently route traffic only to instances that are responding correctly to the application-level health check endpoint. Our choice of HTTP for the health check protocol instead of TCP is because the Orchard Core running on ECS is listening on port 80 and exposing an HTTP health check endpoint /health. By using HTTP for health checks, we can ensure that the NLB can detect not only if the server is up but also if the Orchard Core is functioning correctly.
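For completeness, exposing such a /health endpoint in the Orchard Core host is a small addition in Program.cs. The following is a minimal sketch using the standard ASP.NET Core health checks APIs together with the usual Orchard Core CMS registration; the exact shape of your Program.cs may differ.

// Program.cs (sketch)
var builder = WebApplication.CreateBuilder(args);

// Register ASP.NET Core health checks alongside the Orchard Core CMS services
builder.Services.AddHealthChecks();
builder.Services.AddOrchardCms();

var app = builder.Build();

// Lightweight liveness endpoint used by the NLB target group and container health checks
app.MapHealthChecks("/health");

app.UseOrchardCore();

app.Run();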

We also set the deregistration delay to 10 seconds. Thus, when an ECS task is stopped or removed, the NLB waits 10 seconds before fully deregistering it. This helps prevent dropped connections by allowing in-progress requests to finish. We can keep it at 10 for now if the CMS does not serve long-running requests; if we start to notice dropped connections or 5xx errors during deployments, we should increase it to 30 seconds or more.

If our app did not expose an HTTP health endpoint, we could fall back to a plain TCP health check, which only verifies that the port accepts connections. Since Orchard Core exposes /health, the HTTP health check above gives the NLB a much more meaningful signal.

Thirdly, we need to configure the Listener. This Listener is responsible for handling incoming traffic on our NLB. When a request comes in, the Listener forwards the traffic to the Target Group, which then routes it to our ECS instances running Orchard Core CMS.

internalNlbListener:
  Type: AWS::ElasticLoadBalancingV2::Listener
  Properties:
    LoadBalancerArn: !Ref internalNlb
    Port: 80
    Protocol: TCP
    DefaultActions:
      - Type: forward
        TargetGroupArn: !Ref nlbTargetGroup

The Listener port is the entry point where the NLB receives traffic from. It is different from the TargetGroup port which is the port on the ECS instances where the Orchard Core app is actually running. The Listener forwards traffic from its port to the TargetGroup port. In most cases, they are the same for simplicity.

The DefaultActions section ensures that all incoming requests are automatically directed to the correct target without any additional processing. This setup allows our NLB to efficiently distribute traffic to the ECS tasks while keeping the configuration simple and scalable.

In the NLB setup above, have you noticed that we do not handle port 443 (HTTPS)? Right now, our setup only works with HTTP on port 80.

So, if users were to reach our Orchard Core over HTTPS through the NLB, the requests would stay encrypted as they pass through, because a plain TCP listener does not terminate TLS. That would mean our ECS tasks must handle HTTPS themselves; if they only listen for plain HTTP on port 80, they would receive encrypted traffic they cannot process.

So why not configure Orchard Core to accept HTTPS directly by having it listen on port 443 in Program.cs? We could, but then our ECS tasks would have to handle SSL termination themselves, and we would need to manage SSL certificates inside the containers, which adds complexity to our setup.

Hence, we need a way to properly handle HTTPS before it reaches ECS. Now, let’s see how we can solve this with API Gateway!

Unit 10: API Gateway

As we discussed earlier, it is generally best practice to offload SSL termination to API Gateway, since our NLB is a plain TCP pass-through and does not decrypt traffic. SSL termination happens automatically in API Gateway for HTTPS traffic; it is a built-in feature, so we do not have to manage SSL certificates on our backend.

In addition, API Gateway brings extra benefits, such as blocking unwanted traffic and ensuring only the right users can access our services. It can also cache frequent requests, reducing the load on our backend. Finally, it can log all requests, making troubleshooting faster.

By using API Gateway, we keep our infrastructure secure, efficient, and easy to manage.

Let’s start with a basic setup of API Gateway with NLB by setting up the following required components:

  • AWS::ApiGateway::RestApi: The root API that ties everything together. It defines the API itself before adding resources and methods.
  • AWS::ApiGateway::VpcLink: Connects API Gateway to the NLB.
  • AWS::ApiGateway::Resource: Defines the API endpoint path.
  • AWS::ApiGateway::Method: Specifies how the API handles requests (e.g. GET, POST).
  • AWS::ApiGateway::Deployment: Deploys the API configuration.
  • AWS::ApiGateway::Stage: Assigns a stage (e.g. dev, prod) to the deployment.

Setup Rest API

API Gateway is like a front door to our backend services. Before we define any resources, methods, or integrations, we need to create this front door first, i.e. the AWS::ApiGateway::RestApi resource.

apiGatewayRestApi:
  Type: AWS::ApiGateway::RestApi
  Properties:
    Name: !Sub "${ServiceName}-api-gateway"
    DisableExecuteApiEndpoint: True
    EndpointConfiguration:
      Types:
        - REGIONAL
    Policy: ''

Here we disable the default execute-api endpoint so that AWS does not expose the API at an auto-generated URL. We want to enforce access through our own custom domain, which we will set up later.

REGIONAL means the API endpoint is served directly from our AWS Region rather than through the CloudFront edge network. It is generally the recommended option for most apps, and especially for our Orchard Core CMS, because both the ECS instances and the API Gateway are in the same region, so requests are handled locally with minimal latency. In the future, if our CMS user base grows and becomes globally distributed, we may consider switching to EDGE to serve a larger global audience with better performance and lower latency across regions.

Finally, since this API mainly acts as a reverse proxy to our Orchard Core homepage on ECS, CORS is not needed. Leaving Policy: '' empty means anyone can reach the public-facing Orchard Core; security is instead handled by Orchard Core’s own authentication.

Now that we have our root API, the next step is to connect it to our VPC using VpcLink!

Setup VPC Link

The VPC Link allows API Gateway to access private resources in our VPC, such as our ECS services via the NLB. This connection ensures that requests from the API Gateway can securely reach the Orchard Core CMS hosted in ECS, even though those resources are not publicly exposed.

In simple terms, VPC Link acts as a bridge between the public-facing API Gateway and the internal resources within our VPC.

So in our template, we define the VPC Link and specify the NLB as the target, which means that all API requests coming into the Gateway will be forwarded to the NLB, which will then route them to our ECS tasks securely.

apiGatewayVpcLink:
  Type: AWS::ApiGateway::VpcLink
  Properties:
    Name: !Sub "${ServiceName}-vpc-link"
    Description: "VPC link for API Gateway of Orchard Core"
    TargetArns:
      - !Ref internalNlb

Now that we have set up the VpcLink, which connects our API Gateway to our ECS, the next step is to define how requests will actually reach our ECS. That is where the API Gateway Resource comes into play.

Setup API Gateway Resource

For the API Gateway to know what to do with the incoming requests once they cross that VPC Link bridge, we need to define specific resources, i.e. the URL paths our users will use to access the Orchard Core CMS.

In our case, we use a proxy resource to catch all requests and send them to the backend ECS service. This lets us handle dynamic requests with minimal configuration, as any path requested will be forwarded to ECS.

Using proxy resource is particularly useful for web apps like Orchard Core CMS, where the routes could be dynamic and vary widely, such as /home, /content-item/{id}, /admin/{section}. With the proxy resource, we do not need to define each individual route or API endpoint in the API Gateway. As the CMS grows and new routes are added, we also will not need to constantly update the API Gateway configuration.

apiGatewayRootProxyResource:
  Type: AWS::ApiGateway::Resource
  Properties:
    RestApiId: !Ref apiGatewayRestApi
    ParentId: !GetAtt apiGatewayRestApi.RootResourceId
    PathPart: '{proxy+}'
  DependsOn:
    - apiGatewayRestApi

After setting up the resources and establishing the VPC Link that connects API Gateway to our ECS instances, the next step is to define how requests to those resources are handled.

Setup Method

The Resource component above is used to define where the requests will go. However, just defining the path alone is not enough to handle incoming requests. We need to tell API Gateway how to handle requests that come to those paths. This is where the AWS::ApiGateway::Method component comes into play.

For a use case like hosting Orchard Core CMS, the following configuration can be a good starting point.

apiGatewayRootMethod:
  Type: AWS::ApiGateway::Method
  Properties:
    HttpMethod: ANY
    AuthorizationType: NONE
    ApiKeyRequired: False
    RestApiId: !Ref apiGatewayRestApi
    ResourceId: !GetAtt apiGatewayRestApi.RootResourceId
    Integration:
      ConnectionId: !Ref apiGatewayVpcLink
      ConnectionType: VPC_LINK
      Type: HTTP_PROXY
      IntegrationHttpMethod: ANY
      Uri: !Sub "http://${internalNlb.DNSName}"
  DependsOn:
    - apiGatewayRootProxyResource

apiGatewayRootProxyMethod:
  Type: AWS::ApiGateway::Method
  Properties:
    ApiKeyRequired: False
    RestApiId: !Ref apiGatewayRestApi
    ResourceId: !Ref apiGatewayRootProxyResource
    HttpMethod: ANY
    AuthorizationType: NONE
    RequestParameters:
      method.request.path.proxy: True
    Integration:
      ConnectionId: !Ref apiGatewayVpcLink
      ConnectionType: VPC_LINK
      Type: HTTP_PROXY
      RequestParameters:
        integration.request.path.proxy: method.request.path.proxy
      CacheKeyParameters:
        - method.request.path.proxy
      IntegrationHttpMethod: ANY
      IntegrationResponses:
        - StatusCode: 200
          SelectionPattern: 200
      Uri: !Sub "http://${internalNlb.DNSName}/{proxy}"
  DependsOn:
    - apiGatewayRootProxyResource
    - apiGatewayVpcLink

By setting up both the root method and the proxy method, the API Gateway can handle both general traffic via the root method and dynamic path-based traffic via the proxy method in a flexible way. This reduces the need for additional methods and resources to manage various paths.

Handling dynamic path-based traffic for Orchard Core via the proxy method.

Since Orchard Core is designed for browsing, creating, updating, and deleting content, we need support for multiple HTTP methods. By using ANY, we ensure all of these HTTP methods are supported without having to define separate methods for each one.

Setting AuthorizationType to NONE is a good starting point, especially in cases where we are not expecting to implement authentication directly at the API Gateway level. Instead, we are relying on Orchard Core built-in authentication module, which already provides user login, membership, and access control. Later, if needed, we can enhance security by adding authentication layers at the API Gateway level, such as AWS IAM, Cognito, or Lambda authorisers.

Similar to the authorisation, setting ApiKeyRequired to False is also a sensible starting point, especially since we are not yet exposing a public API; the setup above is primarily for routing requests to the Orchard Core CMS. We can change this later if we need to secure our CMS API endpoints, for example when third-party integrations or external apps need access to the CMS API.

Up to this point, API Gateway has a Resource and a Method, but it still does not know where to send the request. That is where Integration comes in. In our setup above, it tells API Gateway to use VPC Link to talk to the ECS. It also makes API Gateway act as a reverse proxy by setting Type to HTTP_PROXY. It will simply forward all types of HTTP requests to Orchard Core without modifying them.

Even though API Gateway enforces HTTPS for external traffic, it decrypts (aka terminates SSL), validates the request, and then forwards it over HTTP to NLB within the AWS private network. Since this internal communication happens securely inside AWS, the Uri is using HTTP.

After setting up the resources and methods in API Gateway, we are essentially defining the blueprint for our API. However, these configurations are only in a draft state so they are not yet live and accessible to our end-users. We need a step called Deployment to publish the configuration.

Setup Deployment

Without deploying, the changes we discussed above are just a blueprint. We can define them in CloudFormation, but they will not be live in API Gateway until they are deployed.

An important thing to note is that API Gateway does not automatically detect changes in our CloudFormation template. If we do not create a new deployment, our changes will not take effect in the live environment. So, we must force a new deployment by changing something in AWS::ApiGateway::Deployment.

Another thing to note is that a new AWS::ApiGateway::Deployment will not be triggered when we update our API Gateway configuration unless the logical ID of the deployment resource itself changes. This means that every time we change the API Gateway configuration, we need to change the logical ID of the AWS::ApiGateway::Deployment. The reason CloudFormation does not automatically redeploy is to avoid unnecessary changes or disruptions.

apiGatewayDeployment202501011048:
  Type: AWS::ApiGateway::Deployment
  Properties:
    RestApiId: !Ref apiGatewayRestApi
  DependsOn:
    - apiGatewayRootMethod

In the template above, we append a timestamp 202501011048 to the logical ID of the Deployment. This way, even if we make multiple deployments on the same day, each will have a unique logical ID due to the timestamp.
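If we prefer not to touch the template just to republish an unchanged configuration, an alternative is to create a deployment out-of-band with the AWS CLI; the REST API ID below is a placeholder.

# Publish the current API configuration to the production stage
aws apigateway create-deployment \
  --rest-api-id <your rest api id> \
  --stage-name production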

Deployment alone does not make our API available to the users. We still need to assign it to a specific Stage to ensure it has a versioned endpoint with all configurations applied.

Setup Stage

A Stage in API Gateway is a deployment environment that allows us to manage and control different versions of our API. It acts as a live endpoint for clients to interact with our API. Without a Stage, the API exists but is not publicly available. We can create stages like dev, test, and prod to separate development and production traffic.

apiGatewayStage:
  Type: AWS::ApiGateway::Stage
  Properties:
    StageName: !Ref ApiGatewayStageName
    RestApiId: !Ref apiGatewayRestApi
    DeploymentId: !Ref apiGatewayDeployment202501011048
    MethodSettings:
      - ResourcePath: '/*'
        HttpMethod: '*'
        ThrottlingBurstLimit: 100
        ThrottlingRateLimit: 50
    Tags:
      - Key: Stack
        Value: !Ref AWS::StackName

For now, we will use production as the default stage name to keep things simple. This will help us get everything set up and running quickly. Once we are ready for more environments, we can easily update the ApiGatewayStageName in the Parameters based on our environment setup.

MethodSettings are configurations defining how requests are handled in terms of performance, logging, and throttling. Using /* and * is perfectly fine at the start as our goal is to apply global throttling and logging settings for all our Orchard Core routes in one go. However, in the future we might want to adjust the settings as follows:

  • Content Modification (POST, PUT, DELETE): Stricter throttling and more detailed logging.
  • Content Retrieval (GET): More relaxed throttling for GET requests since they are usually read-only and have lower impact.

Having a burst and rate limit is useful for protecting our Orchard Core backend from excessive traffic. Even if we have a CMS with predictable traffic patterns, having rate limiting helps to prevent abuse and ensure fair usage.

The production stage in our API Gateway.

Unit 11: Route53 for API Gateway

Now that we have successfully set up API Gateway, it is accessible through an AWS-generated URL, i.e. something like https://xxxxxx.execute-api.ap-southeast-5.amazonaws.com/production, which is functional but not user-friendly. Hence, we set up a custom domain so that the URL is easier to remember, more professional, and consistent with our branding.

AWS provides a straightforward way to implement this using two key configurations:

  • AWS::ApiGateway::DomainName – Links our custom domain to API Gateway.
  • AWS::ApiGateway::BasePathMapping – Organises API versions and routes under the same domain.

Setup Hosted Zone and DNS

Since I have my domain on GoDaddy, I will need to migrate DNS management to AWS Route 53 by creating a Hosted Zone.

My personal hosted zone: chunlinprojects.com.

After creating a Hosted Zone in AWS, we need to manually copy the NS records to GoDaddy. Since this step is manual anyway, we will not automate this part of the setup in CloudFormation. In addition, hosted zones are sensitive resources and should be managed carefully; we do not want the hosted zone to be removed when our CloudFormation stack is deleted.
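The name servers to copy into GoDaddy can also be read from the hosted zone with the AWS CLI; the hosted zone ID below is a placeholder.

# List the name servers assigned to the hosted zone
aws route53 get-hosted-zone \
  --id <your Route 53 hosted zone id> \
  --query "DelegationSet.NameServers"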

Once the switch is done, we can go back to our CloudFormation template to setup the custom domain name for our API Gateway.

Setup Custom Domain Name for API Gateway

API Gateway requires an SSL/TLS certificate to use a custom domain.

apiGatewayCustomDomainCert:
  Type: AWS::CertificateManager::Certificate
  Properties:
    DomainName: !Sub "${CmsHostname}.${HostedZoneName}"
    ValidationMethod: 'DNS'
    DomainValidationOptions:
      - DomainName: !Sub "${CmsHostname}.${HostedZoneName}"
        HostedZoneId: !Ref HostedZoneId

Please update the domain names in the template above to use your own domain name. The HostedZoneId can be retrieved from the AWS Console under “Hosted zone details”, as shown in the screenshot above.

In the resource, DomainValidationOptions tells CloudFormation to use DNS validation. When we use the AWS::CertificateManager::Certificate resource in a CloudFormation stack, domain validation is handled automatically if all three of the following are true:

  • We are using DNS validation;
  • The certificate domain is hosted in Amazon Route 53;
  • The domain resides in our AWS account.

However, if the certificate uses email validation, or if the domain is not hosted in Route 53, then the stack will remain in the CREATE_IN_PROGRESS state. Here, we will show how we can log in to AWS Console to manually set up DNS validation.

Remember to log in to AWS Console to check for ACM Certificate Status.

After that, we need to choose the Create records in Route 53 button to create records. The Certificate status page should open with a status banner reporting Successfully created DNS records. According to the documentation, our new certificate might continue to display a status of Pending validation for up to 30 minutes.

Successfully created DNS records.
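We can also poll the validation status from the command line instead of refreshing the console; the certificate ARN below is a placeholder.

# Shows PENDING_VALIDATION until the DNS records propagate, then ISSUED
aws acm describe-certificate \
  --certificate-arn <your certificate arn> \
  --query "Certificate.Status"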

Now that the SSL certificate is ready and the DNS validation is done, we will need to link the SSL certificate to our API Gateway using a custom domain. We are using RegionalCertificateArn, which is intended for a regional API Gateway.

apiGatewayCustomDomainName:
  Type: AWS::ApiGateway::DomainName
  Properties:
    RegionalCertificateArn: !Ref apiGatewayCustomDomainCert
    DomainName: !Sub "${CmsHostname}.${HostedZoneName}"
    EndpointConfiguration:
      Types:
        - REGIONAL
    SecurityPolicy: TLS_1_2

This allows our API to be securely accessed using our custom domain. We also set up a SecurityPolicy to use the latest TLS version (TLS 1.2), ensuring that the connection is secure and follows modern standards.

Even though it is optional, it is a good practice to specify the TLS version for both security and consistency, especially for production environments. Enforcing a TLS version helps avoid any potential vulnerabilities from outdated protocols.

Setup Custom Domain Routing

Next, we need to create a base path mapping to map the custom domain to our specific API stage in API Gateway.

The BasePathMapping is the crucial bridge between our custom domain and our API Gateway because when users visit our custom domain, we need a way to tell AWS API Gateway which specific API and stage should handle the incoming requests for that domain.

apiGatewayCustomDomainBasePathMapping:
  Type: AWS::ApiGateway::BasePathMapping
  Properties:
    DomainName: !Ref apiGatewayCustomDomainName
    RestApiId: !Ref apiGatewayRestApi
    Stage: !Ref apiGatewayStage

While the BasePathMapping connects our custom domain to a specific stage inside API Gateway, we still need DNS routing outside AWS so that the custom domain actually resolves.

The RecordSet creates a DNS record (typically an A or CNAME record) that points to the API Gateway endpoint. Without this record, DNS systems outside AWS will not know where to direct traffic for our custom domain.

apiGatewayCustomDomainARecord:
  Type: AWS::Route53::RecordSet
  Properties:
    HostedZoneName: !Sub "${HostedZoneName}."
    Name: !Sub "${CmsHostname}.${HostedZoneName}"
    Type: A
    AliasTarget:
      DNSName: !GetAtt apiGatewayCustomDomainName.RegionalDomainName
      HostedZoneId: !GetAtt apiGatewayCustomDomainName.RegionalHostedZoneId

One interesting thing to note here is that when we use an AWS::Route53::RecordSet that specifies HostedZoneName, we must include a trailing dot (for example, chunlinprojects.com.) as part of the HostedZoneName. Alternatively, we can specify HostedZoneId instead, but never both.

For API Gateway with a custom domain, AWS recommends using an Alias Record (which is similar to an A record) instead of a CNAME because the endpoint for API Gateway changes based on region and the nature of the service.

Alias records are a special feature in AWS Route 53 designed for pointing domain names directly to AWS resources like API Gateway, ELB, and so on. While CNAME records are often used in DNS to point to another domain, Alias records are unique to AWS and allow us to avoid extra DNS lookup costs.

For the HostedZoneId of the AliasTarget, note that it is the Route 53 hosted zone ID of the API Gateway endpoint, not to be confused with the ID of our own hosted zone in Route 53.

Finally, please take note that when we are creating an alias resource record set, we need to omit TTL.
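After the stack is deployed and DNS has propagated, a quick sanity check confirms that the alias record resolves and the custom domain serves our CMS over HTTPS; the hostname below combines the CmsHostname default with your own domain.

# The alias record should resolve to the regional API Gateway endpoint IPs
dig +short orchardcms.<your custom domain>

# And the CMS homepage should be reachable over HTTPS through API Gateway
curl -I https://orchardcms.<your custom domain>/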

Reference 01: ECS Cluster

As we move forward with hosting Orchard Core CMS, let’s go through a few hosting options available within AWS, as listed below.

  • EC2 (Elastic Compute Cloud): A traditional option for running virtual machines. We can fully control the environment but need to manage everything, from scaling to OS patching;
  • Elastic Beanstalk: PaaS optimised for traditional .NET apps on Windows/IIS, not really suitable for Orchard Core which runs best on Linux containers with Kestrel;
  • Lightsail: A traditional VPS (Virtual Private Server), where we manage the server and applications ourselves. It is a good fit for simple, low-traffic websites but not ideal for scalable workloads like Orchard Core CMS.
  • EKS (Elastic Kubernetes Service): A managed Kubernetes offering from AWS. It allows us to run Kubernetes clusters, which are great for large-scale apps with complex micro-services. However, managing Kubernetes adds complexity.
  • ECS (Elastic Container Service): A service designed for running containerised apps. We can run containers on serverless Fargate or EC2-backed clusters.

The reason why we choose ECS is because it offers a scalable, reliable, and cost-effective way to deploy Orchard Core in a containerised environment. ECS allows us to take advantage of containerisation benefits such as isolated, consistent deployments and easy portability across environments. With built-in support for auto-scaling and seamless integration with AWS services like RDS for databases, S3 for media storage, and CloudWatch for monitoring, ECS ensures high availability and performance.

In ECS, we can choose either Fargate or EC2-backed ECS for hosting Orchard Core, depending on our specific needs and use case. For a highly customised, predictable, or resource-intensive CMS workload, EC2-backed ECS might be more appropriate because it gives fine-grained control over resources and configurations.

Official documentation with CloudFormation template on how to setup an ECS cluster.

There is official documentation on how to set up an ECS cluster, so we will not discuss the setup in depth. Instead, we will focus on some key points to take note of.

Official ECS-optimised AMIs from AWS.

While we can technically use any Linux AMI for running ECS tasks, the Amazon ECS-optimised AMI offers several benefits that make it a better choice for ECS workloads. It is designed and optimised by AWS to run ECS tasks efficiently on EC2 instances, with the ECS agent and Docker pre-installed and the configuration tuned for ECS. These AMIs look for agent configuration data in the /etc/ecs/ecs.config file when the container agent starts, which is why we can specify this configuration data at launch time with Amazon EC2 user data, as shown below.

containerInstances:
  Type: AWS::EC2::LaunchTemplate
  Properties:
    LaunchTemplateName: "asg-launch-template"
    LaunchTemplateData:
      ImageId: !Ref EcsAmi
      InstanceType: "t3.large"
      IamInstanceProfile:
        Name: !Ref ec2InstanceProfile
      SecurityGroupIds:
        - !Ref ecsContainerHostSecurityGroup
      # This injected configuration file is how the EC2 instance
      # knows which ECS cluster it should be joining
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash -xe
          echo "ECS_CLUSTER=core-cluster" >> /etc/ecs/ecs.config
      # Disable IMDSv1, and require IMDSv2
      MetadataOptions:
        HttpEndpoint: enabled
        HttpTokens: required

As shown in the above CloudFormation template, instead of hardcoding an AMI ID which will become outdated over time, we have a parameter to ensure that the cluster always provisions instances using the most recent Amazon Linux 2023 ECS-optimised AMI.

EcsAmi:
Description: The Amazon Machine Image ID used for the cluster
Type: AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>
Default: /aws/service/ecs/optimized-ami/amazon-linux-2023/recommended/image_id
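
If we are curious which AMI ID that SSM parameter currently resolves to, we can query it ourselves. The following is a minimal boto3 sketch; the parameter path is the same one used in the template, while the region is just an example.

import boto3

# Query the public SSM parameter that tracks the latest Amazon Linux 2023
# ECS-optimised AMI for the chosen region.
ssm = boto3.client("ssm", region_name="ap-southeast-1")  # example region

response = ssm.get_parameter(
    Name="/aws/service/ecs/optimized-ami/amazon-linux-2023/recommended/image_id"
)

print("Latest ECS-optimised AMI:", response["Parameter"]["Value"])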

Also, the EC2 instances need access to communicate with the ECS service endpoint. This can be through an interface VPC endpoint or through our EC2 instances having public IP addresses. In our case, we are placing our EC2 instances in private subnets, so we use a Network Address Translation (NAT) gateway to provide this access.

ecsNatGateway:
Type: AWS::EC2::NatGateway
Properties:
AllocationId: !GetAtt ecsEip.AllocationId
SubnetId: !Ref publicSubnet1

Unit 12: ECS Task Definition and Service

This ECS cluster definition is just the starting point. Next, we will define how the containers run and interact through AWS::ECS::TaskDefinition.

ecsTaskDefinition:
Type: AWS::ECS::TaskDefinition
Properties:
Family: !Ref ServiceName
TaskRoleArn: !GetAtt iamRole.Arn
ContainerDefinitions:
- Name: !Ref ServiceName
Image: !Ref OrchardCoreImage
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-group: !Sub "/ecs/${ServiceName}-log-group"
awslogs-region: !Ref AWS::Region
awslogs-stream-prefix: ecs
PortMappings:
- ContainerPort: 5000
HostPort: 80
Protocol: tcp
Cpu: 256
Memory: 1024
MemoryReservation: 512
Environment:
- Name: DatabaseEndpoint
Value: !GetAtt cmsDBInstance.Endpoint.Address
Essential: true
HealthCheck:
Command:
- CMD-SHELL
- "wget -q --spider http://localhost:5000/health || exit 1"
Interval: 30
Timeout: 5
Retries: 3
StartPeriod: 30

In the setup above, we are sending logs to CloudWatch Logs so that we can centralise logs from all ECS tasks, making it easier to monitor and troubleshoot our containers.

By default, ECS uses bridge network mode. In bridge mode, containers do not get their own network interfaces. Instead, the container port (5000) must be mapped to a port on the host EC2 instance (80). Without this mapping, Orchard Core on EC2 would not be reachable from outside. The reason we set ContainerPort: 5000 is to match the port our Orchard Core app is exposed on within the Docker container.

As CMS platforms like Orchard Core generally require more memory for smooth operations, especially in production environments with more traffic, it is reasonable to start with a CPU allocation of 256 (0.25 vCPU) and 1024 MB of memory, and adjust depending on expected load.

For MemoryReservation, which is a guaranteed amount of memory for our container, we set 512 MB. By reserving memory, we ensure that the container has enough memory to run reliably. Orchard Core, being a modular CMS, can consume more memory depending on the number of features and modules enabled. Later, if we realise Orchard Core does not need that much guaranteed memory, we can lower MemoryReservation. The key idea is to reserve enough memory to ensure stable operations without overcommitting.

Next, we set Essential to true. This property specifies whether the container is essential to the ECS task. Setting it to true means ECS treats this Orchard Core container as vital: if the container stops or fails, ECS stops the entire task. Otherwise, ECS would not automatically stop the task if the Orchard Core container fails, which could lead to issues, especially in a production environment.

Finally, we must not forget about HealthCheck. In most web apps like Orchard Core, a simple HTTP endpoint such as /health is normally used as a health check. Here, we need to understand that many minimal container images do not include curl by default, to keep them lightweight. However, wget is often available, making it a good alternative for checking whether an HTTP endpoint is reachable. Hence, in the template above, ECS uses wget to check the /health endpoint on port 5000. If it receives an error, the container is considered unhealthy.

We can test locally to check if curl or wget is available in the image.

Once the TaskDefinition is set up, it defines the container specs. However, an ECS service is needed to manage how and where the task runs within the ECS cluster. The ECS service tells ECS how to run the task, manage it, and keep it running smoothly.

ecsService:
Type: AWS::ECS::Service
DependsOn:
- iamRole
- internalNlb
- nlbTargetGroup
- internalNlbListener
Properties:
Cluster: !Ref ecsCluster
DesiredCount: 2
DeploymentConfiguration:
MaximumPercent: 200
MinimumHealthyPercent: 50
LoadBalancers:
- ContainerName: !Ref ServiceName
ContainerPort: 5000
TargetGroupArn: !Ref nlbTargetGroup
PlacementStrategies:
- Type: spread
Field: attribute:ecs.availability-zone
- Type: spread
Field: instanceId
TaskDefinition: !Ref ecsTaskDefinition
ServiceName: !Ref ServiceName
Role: !Sub "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/aws-service-role/ecs.amazonaws.com/AWSServiceRoleForECS"
HealthCheckGracePeriodSeconds: 60

The DesiredCount is the number of tasks (or containers) we want ECS to run at all times for the Orchard Core app. In this case, we set it to 2, which means ECS will try to keep exactly 2 tasks running for our service. Setting it to 2 gives us redundancy: if one task goes down, the other task can continue serving, ensuring that our CMS stays available and resilient.

Relative to the DesiredCount, we indicate that during a deployment ECS can temporarily run up to 4 tasks (MaximumPercent: 200), and at least 1 task (MinimumHealthyPercent: 50) must stay healthy during updates to ensure a smooth rollout.
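
To make the arithmetic explicit, here is a short sketch of how the two percentages translate into task-count bounds for a DesiredCount of 2 (the upper bound is rounded down, the lower bound rounded up):

import math

desired_count = 2
maximum_percent = 200         # MaximumPercent from the deployment configuration
minimum_healthy_percent = 50  # MinimumHealthyPercent from the deployment configuration

# Upper bound on tasks ECS may run at once during a rolling update (rounded down).
max_tasks_during_deploy = math.floor(desired_count * maximum_percent / 100)

# Lower bound on tasks that must stay healthy while old tasks are replaced (rounded up).
min_healthy_tasks = math.ceil(desired_count * minimum_healthy_percent / 100)

print(max_tasks_during_deploy, min_healthy_tasks)  # 4 1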

The LoadBalancers section in the ECS service definition is where we link our service to the NLB that we set up earlier, ensuring that the NLB will distribute the traffic to the correct tasks running within the ECS service. Also, since our container is configured to run on port 5000 as per our Dockerfile, this is the port we use.

Next, we have PlacementStrategies to help us control how our tasks are distributed across different instances and availability zones, making sure our CMS is resilient and well-distributed. Here, attribute:ecs.availability-zone ensures the tasks are spread evenly across different availability zones within the same region. At the same time, Field: instanceId ensures that our tasks are spread across different EC2 instances within the cluster.

Finally, it is a good practice to set a HealthCheckGracePeriodSeconds to give our containers some time to start and become healthy before ECS considers them unhealthy during scaling or deployments.

Unit 13: CloudWatch Alarm

To ensure we effectively monitor the performance of Orchard Core on our ECS service, we also need to set up CloudWatch alarms to track metrics like CPU utilisation, memory utilisation, health check, running task count, etc.

We set up the following CloudWatch alarm to monitor CPU utilisation for our ECS service. This alarm triggers when average CPU usage stays at or above 75% for five consecutive one-minute periods. By doing this, we can quickly identify when our service is under heavy load, which helps us take action to prevent performance issues.

highCpuUtilizationAlarm:
Type: AWS::CloudWatch::Alarm
Properties:
AlarmName: !Sub "${AWS::StackName}-high-cpu"
AlarmDescription: !Sub "ECS service ${AWS::StackName}: Cpu utilization above 75%"
Namespace: AWS/ECS
MetricName: CPUUtilization
Dimensions:
- Name: ClusterName
Value: !Ref ecsCluster
- Name: ServiceName
Value: !Ref ServiceName
Statistic: Average
Period: 60
EvaluationPeriods: 5
Threshold: 75
ComparisonOperator: GreaterThanOrEqualToThreshold
TreatMissingData: notBreaching
ActionsEnabled: true
AlarmActions: []
OKActions: []

Even if we leave AlarmActions and OKActions as empty arrays, the alarm state will still be visible in the AWS CloudWatch Console. We can monitor the alarm state directly on the CloudWatch dashboard.
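
If we prefer not to open the console every time, a small boto3 script can list those alarms and their current state. Below is a minimal sketch; the alarm name prefix is an assumption based on the ${AWS::StackName}- naming used in the template, so substitute your actual stack name.

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="ap-southeast-1")  # example region

# "my-orchard-stack-" is a placeholder for the actual stack name prefix
# used by the AlarmName values in the template above.
response = cloudwatch.describe_alarms(AlarmNamePrefix="my-orchard-stack-")

for alarm in response["MetricAlarms"]:
    print(alarm["AlarmName"], "=>", alarm["StateValue"])  # OK, ALARM, or INSUFFICIENT_DATA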

Similar to the CPU utilisation alarm above, we have another alarm to trigger when the count of running tasks is 0 (less than 1) for 5 consecutive periods, indicating that there have been no running tasks for a full 5 minutes.

noRunningTasksAlarm:
Type: AWS::CloudWatch::Alarm
Properties:
AlarmName: !Sub "${AWS::StackName}-no-task"
AlarmDescription: !Sub "ECS service ${AWS::StackName}: No running ECS tasks for more than 5 mins"
Namespace: AWS/ECS
MetricName: RunningTaskCount
Dimensions:
- Name: ClusterName
Value: !Ref ecsCluster
- Name: ServiceName
Value: !Ref ServiceName
Statistic: Average
Period: 60
EvaluationPeriods: 5
Threshold: 1
ComparisonOperator: LessThanThreshold
TreatMissingData: notBreaching
ActionsEnabled: true
AlarmActions: []
OKActions: []

The two alarms are available on the CloudWatch dashboard.

By monitoring these key metrics, we can proactively address any performance or availability issues, ensuring our Orchard Core CMS runs smoothly and efficiently.

Wrap-Up

Setting up Orchard Core on ECS with CloudFormation does have its complexities, especially with the different moving parts like API Gateway, load balancers, and domain configurations. However, once we have the infrastructure defined in CloudFormation, it becomes much easier to deploy, update, and manage our AWS environment. This is one of the key benefits of using CloudFormation, as it gives us consistency, repeatability, and automation in our deployments.

Orchard Core website is up and accessible via our custom domain!

The heavy lifting is done up front, and after that, it is mostly about making updates to our CloudFormation stack and redeploying without having to worry about manually reconfiguring everything.

Learning Postman Flows and Newman: A Beginner’s Journey

For API developers, Postman is a popular tool that streamlines the process of testing APIs. Postman is like having a Swiss Army knife for API development as it allows developers to interact with APIs efficiently.

In March 2023, Postman Flows was officially announced. Postman Flows is a no-code visual tool within the Postman platform designed to help us create API workflows by dragging and dropping components, making it easier to build complex sequences of API calls without writing extensive code.

Please make sure Flows is enabled in our Postman Workspace.

Recently, my teammate demonstrated how Postman Flows works, and I’d like to share what I’ve learned from him in this article.

Use Case

If there is a list of APIs that we need to call in the same sequence every time, we can make use of Postman Flows. For demonstration purposes, let's say we have to call the following three AWS APIs for a new user registration.

We will first need to define the environment variables, as shown in the following screenshot.

We keep the region variable empty because IAM is a global service.

Next, we will set up the three API calls. For example, to create a new user on AWS, we need to make an HTTP GET request to the IAM base URL with CreateUser as the action, as shown in the following screenshot.

The UserName will be getting its value from the variable userName.

To tag the user, for example assigning the user above to a team in the organisation, we can do so by setting TagUser as the action, as shown in the screenshot below. The team that the user will be assigned to is based on their employee ID, which we will discuss later in this article.

The teamName is a variable with value determined by another variable.

Finally, we will assign the user to an existing user group by setting AddUserToGroup as its action.

The groupName must be a valid, existing group name in our AWS account.

Create the Flow

As demonstrated in the previous section, calling the three APIs sequentially is straightforward. However, managing variables carefully to avoid input errors can be challenging. Postman Flows allows us to automate these API calls efficiently. By setting up the Flow, we can execute all the API requests with just a few clicks, reducing the risk of mistakes and saving time.

Firstly, we will create a new Flow called “AWS IAM New User Registration”, as shown below.

Created a new Flow.

By default, it comes with three options to get started with. We will go with "Send a request" since we will be sending an HTTP GET request to create a user in IAM. As shown in the following screenshot, the list of variables that we defined earlier will be available. We only need to make sure that we choose the correct Environment. The values of service, region, accessKey, and secretKey will then be retrieved from the selected Environment.

Choosing the environment for the block.

Since the variable userName will be used in all three API calls, let's create a variable block and assign it the string value "postman03".

Created a variable and assigned a string value to it.

Next, we simply need to tell the API calling block to assign the value of userName to the field userName.

Assigning variable to the query string in the API call.

Now if we proceed to click on the "Run" button, the call should respond with HTTP 200 and the relevant info returned from AWS, as demonstrated below.

Yes, the user “postman03” is created successfully on AWS IAM.

With the user created successfully, the next step is to call the user tagging API. This API uses two variables, i.e. userName and teamName. Assuming that the teamName is assigned based on whether the user's employeeId is an even or odd number, we can design the Flow as shown in the following screenshot.

With two different employeeId, the teamNames are different too.

As shown in the Log, when we assign an even number to the user postman06, the team name assigned to it is “Team A”. However, when we assign an odd number to another user postman07, its team name is “Team B”.

Finally, we can complete the Flow with the third API call as shown below.

The groupName variable is introduced for the third API call.

Now, we can visit AWS Console to verify that the new user postman09 is indeed assigned to the testing-user-group.

The new user is assigned to the desired user group.

The Flow above can only create one new user per execution. By using the Repeat block, which takes a numeric input N and iterates from 0 to N - 1, we can create multiple users in a single run, as shown in the following updated Flow.

This flow will create 5 users in one single execution.

Variable Limitation in Flows

Could we save the response in a variable so that its value can be reused in another Flow? We might guess this is possible with variables in Postman.

Postman supports variables at different scopes, in order from broadest to narrowest, these scopes are: global, collection, environment, data, and local.

Global variables have the broadest scope in Postman and are available throughout a workspace. So what we can do is store the response in a global variable.

To do so, we need to update our request in the Collection. One of the snippets provided is actually about setting a global variable, so we can use it. For example, we add it to the post-response script of “Create IAM User” request, as shown in the screenshot below.

Postman provides snippets to help us quickly generate boilerplate code for common tasks, including setting global variables.

Let’s change the boilerplate code to be as follows.

pm.globals.set("my_global", pm.response.code);

Now, if we send a “Create IAM User” request, we will notice that a new global variable called my_global has been created, as demonstrated below.

The response code received is 400 and thus my_global is 400.

However, now if we run our Flow, we will realise that my_global is not being updated even though the response code received is 200, as illustrated in the following screenshot.

Our global variable is not updated by Flow.

Firstly, we need to understand that nothing is wrong with environments or variables. The reason it did not work is that variables work differently in Flows from how they work in the Collection Runner or the Request tab, as explained by Saswat Das, the creator of Postman Flows.

According to Saswat, global variables and environment variables are now treated as read-only values in the Flows, and any updates made to them through script are not respected.

So the answer to the earlier question of whether we can share variables across different Flows is, by design, currently a clear no.

Do It with CLI: Newman

Is there any alternative way that we can use instead of relying on the GUI of Postman?

Another teammate of mine from the QA team recently also showed how Newman, a command-line Collection Runner for Postman, can help.

A command-line Collection Runner is basically a tool that allows users to execute API requests defined in Postman collections directly from the command line. So, could we use Newman to do what we have done above with Postman Flows as well?

Before we can use Newman, we first need to export our Collection as well as the corresponding Environment. To do so, firstly, we click on the three dots (…) next to the collection name and select Export. Then, we choose the format (usually Collection v2.1 is recommended) and click Export. Secondly, we proceed to click on the three dots (…) next to the environment name and select Export as well.

Once we have the two exported files in the same folder, for example a folder called "experiment.postman.newman", we can run the following command.

$ newman run aws_iam.postman_collection.json -e aws_iam.postman_environment.json --env-var "userName=postman14462" --env-var "teamName=teamA" --env-var "groupName=testing-user-group"

While Newman and Postman Flows can both be used to automate API requests, they are tailored for different use cases: Newman is better suited for automated testing, integration into CI/CD pipelines, and command-line execution. Postman Flows, on the other hand, is ideal for visually designing the workflows and interactions between API calls.


Processing S3 Data before Returning It with Object Lambda (version 2024)

We use Amazon S3 to store data for easy sharing among various applications. However, each application has its unique requirements and might require a different perspective on the data. To solve this problem, at times, we store additional customised datasets of the same data, ensuring that each application has its own unique dataset. This sometimes creates another set of problems because we now need to maintain additional datasets.

In March 2021, a new feature known as S3 Object Lambda was introduced. Similar to the idea of setting up a proxy layer in front of S3 to intercept and process data as it is requested, Object Lambda uses AWS Lambda functions to automatically process and transform your data as it is being retrieved from S3. With Object Lambda, we only need to change our apps to use the new S3 Object Lambda Access Point instead of the actual bucket name to retrieve data from S3.

Simplified architecture diagram showing how S3 Object Lambda works.

Example: Turning JSON to Web Page with S3 Object Lambda

I have been keeping details of my visits to medical centres as well as the treatments and medicines I received in a JSON file. So, I would like to take this opportunity to show how S3 Object Lambda can help in doing data processing.

The JSON file looks something like the following.

{
"visits": [
{
"medicalCentreName": "Tan Tock Seng Hospital",
"centreType": "hospital",
"visitStartDate": {
"year": 2024,
"month": 3,
"day": 24
},
"visitEndDate": {
"year": 2024,
"month": 4,
"day": 19
},
"purpose": "",
"treatments": [
{
"name": "Antibiotic Meixam(R) 500 Cloxacillin Sodium",
"type": "medicine",
"amount": "100ml per doese every 4 hours",
"startDate": {
"year": 2024,
"month": 3,
"day": 26
},
"endDate": {
"year": 2024,
"month": 4,
"day": 19
}
},
...
]
},
...
]
}

In this article, I will show the steps I took to set up the S3 Object Lambda architecture for this use case.

Step 1: Building the Lambda Function

Before we begin, we need to take note that the maximum duration for a Lambda function used by S3 Object Lambda is 60 seconds.
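
Once the transformation function exists, we can double-check that its configured timeout stays within that 60-second cap. The snippet below is a minimal boto3 sketch; the function name is a placeholder.

import boto3

lambda_client = boto3.client("lambda", region_name="ap-southeast-1")  # example region

# "json-to-html-transformer" is a hypothetical function name.
config = lambda_client.get_function_configuration(FunctionName="json-to-html-transformer")

if config["Timeout"] > 60:
    print(f"Configured timeout is {config['Timeout']}s, above the 60s S3 Object Lambda cap.")
else:
    print(f"Configured timeout of {config['Timeout']}s is within the 60s cap.")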

We need a Lambda Function to do the data format transformation from JSON to HTML. To keep things simple, we will be developing the Function using Python 3.12.

Object Lambda does not need any API Gateway since it should be accessed via the S3 Object Lambda Access Point.

In the beginning, we can have the code as follows. The code basically does two things: firstly, it performs some logging; secondly, it reads the JSON file from the S3 bucket.

import json
import os
import logging
import boto3
from urllib import request
from urllib.error import HTTPError
from types import SimpleNamespace

logger = logging.getLogger()
logger.addHandler(logging.StreamHandler())
logger.setLevel(getattr(logging, os.getenv('LOG_LEVEL', 'INFO')))

s3_client = boto3.client('s3')

def lambda_handler(event, context):
object_context = event["getObjectContext"]
# Get the presigned URL to fetch the requested original object from S3
s3_url = object_context["inputS3Url"]
# Extract the route and request token from the input context
request_route = object_context["outputRoute"]
request_token = object_context["outputToken"]

# Get the original S3 object using the presigned URL
req = request.Request(s3_url)
try:
response = request.urlopen(req)
responded_json = response.read().decode()
except Exception as err:
logger.error(f'Exception reading S3 content: {err}')
return {'status_code': 500}

json_object = json.loads(responded_json, object_hook=lambda d: SimpleNamespace(**d))

visits = json_object.visits

html = ''

s3_client.write_get_object_response(
Body = html,
ContentType = 'text/html',
RequestRoute = request_route,
RequestToken = request_token)

return {'status_code': 200}

Step 1.1: Getting the JSON File with Presigned URL

In the event that an Object Lambda receives, there is a property known as the getObjectContext, which contains useful information for us to figure out the inputS3Url, which is the presigned URL of the object in S3.

By default, all S3 objects are private and thus for a Lambda Function to access the S3 objects, we need to configure the Function to have S3 read permissions to retrieve the objects. However, with the presigned URL, the Function can get the object without the S3 read permissions.

In the code above, we retrieve the JSON file from S3 using its presigned URL. After that, we parse the JSON content with the json.loads() method and convert it into an object graph with SimpleNamespace. The variable visits should now hold all the visit data from the original JSON file.
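
To see what SimpleNamespace buys us, here is a small standalone illustration using a trimmed-down version of the JSON shown earlier: each JSON object becomes an attribute-accessible namespace, so we can use dot notation instead of dictionary indexing.

import json
from types import SimpleNamespace

sample = ('{"visits": [{"medicalCentreName": "Tan Tock Seng Hospital", '
          '"visitStartDate": {"year": 2024, "month": 3, "day": 24}}]}')

# Every JSON object is converted into a SimpleNamespace by the object_hook.
record = json.loads(sample, object_hook=lambda d: SimpleNamespace(**d))

first_visit = record.visits[0]
print(first_visit.medicalCentreName)    # Tan Tock Seng Hospital
print(first_visit.visitStartDate.year)  # 2024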

Step 1.2: Call WriteGetObjectResponse

Since the purpose of Object Lambda is to process and transform our data as it is being retrieved from S3, we need to pass the transformed object back to the GetObject operation via the method write_get_object_response. Without this call, Lambda will report an error complaining that it is missing.

Error: The Lambda exited without successfully calling WriteGetObjectResponse.

The method write_get_object_response requires two compulsory parameters, i.e. RequestRoute and RequestToken. Both are available from the getObjectContext property under the names outputRoute and outputToken.

Step 1.3: Get the HTML Template from S3

To make our Lambda code cleaner, we will not write the entire HTML there. Instead, we keep a template of the web page in another S3 bucket.

Now, the architecture above will be improved to include a second S3 bucket which provides the web page template and other necessary static assets.

Introducing second S3 bucket for storing HTML template and other assets.

Now, we will replace the line html = '' earlier with the Python code below.

    template_response = s3_client.get_object(
Bucket = 'lunar.medicalrecords.static',
Key = 'medical-records.html'
)

template_object_data = template_response['Body'].read()
template_content = template_object_data.decode('utf-8')

dynamic_table = f"""
<table class="table accordion">
<thead>
<tr>
<th scope="col">#</th>
<th scope="col">Medical Centre</th>
<th scope="col">From</th>
<th scope="col">To</th>
<th scope="col">Purpose</th>
</tr>
</thead>

<tbody>
...
</tbody>
</table>"""

html = template_content.replace('{{DYNAMIC_TABLE}}', dynamic_table)
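
The rows elided in the tbody above would be generated from the visits we parsed earlier. The original row-building code is not shown, but a minimal sketch of the idea, using only the fields present in the sample JSON and continuing from the visits variable in the handler, might look like this:

# A sketch only: build one table row per visit from the parsed SimpleNamespace objects.
rows = []
for index, visit in enumerate(visits, start=1):
    start = visit.visitStartDate
    end = visit.visitEndDate
    rows.append(
        f"<tr>"
        f"<td>{index}</td>"
        f"<td>{visit.medicalCentreName}</td>"
        f"<td>{start.year}-{start.month:02}-{start.day:02}</td>"
        f"<td>{end.year}-{end.month:02}-{end.day:02}</td>"
        f"<td>{visit.purpose}</td>"
        f"</tr>"
    )

# The joined rows would then be placed inside the <tbody> of dynamic_table.
table_body = "\n".join(rows)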

Step 2: Give Lambda Function Necessary Permissions

Based on the setup we have gone through above, our Lambda Function needs the following permissions (a minimal policy sketch follows the list).

  • s3-object-lambda:WriteGetObjectResponse
  • s3:GetObject
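
If we manage the function's execution role ourselves, these two permissions can be attached as a small inline policy. The boto3 sketch below is only illustrative; the role name and policy name are hypothetical, and the resources should be scoped to your own Object Lambda Access Point and bucket.

import json
import boto3

iam = boto3.client("iam")

policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3-object-lambda:WriteGetObjectResponse",
            "Resource": "*",  # placeholder; ideally scope to the Object Lambda Access Point ARN
        },
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::lunar.medicalrecords.static/*",  # static template bucket from Step 1.3
        },
    ],
}

# "medical-records-object-lambda-role" is a hypothetical execution role name.
iam.put_role_policy(
    RoleName="medical-records-object-lambda-role",
    PolicyName="object-lambda-minimal-access",
    PolicyDocument=json.dumps(policy_document),
)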

Step 3: Create S3 Access Point

Next, we need to create an S3 Access Point. It will be used to support the creation of the S3 Object Lambda Access Point later.

One of the features that S3 Access Point offers is that we can specify any name that is unique within the account and region. For example, as shown in the screenshot below, we can actually have a “lunar-medicalrecords” access point in every account and region.

Creating an access point from the navigation pane of S3.

When we are creating the access point, we need to specify the bucket that we want to use with this Access Point, and the bucket must reside in the same region. In addition, since we are not restricting its access to a specific VPC in our case, we choose "Internet" for the "Network origin" field.

After that, we keep all other defaults as is. We can directly proceed to choose the “Create access point” button.

Our S3 Access Point is successfully created.

Step 4: Create S3 Object Lambda Access Point

After getting our S3 Access Point set up, we can move on to create our S3 Object Lambda Access Point. This is the actual access point that our app will use to access the JSON file in our S3 bucket. It should then return an HTML document generated by the Object Lambda that we built in Step 1.

Creating an object lambda access point from the navigation pane of S3.

In the Object Lambda Access Point creation page, after we give it a name, we need to provide the Supporting Access Point. This access point is the Amazon Resource Name (ARN) of the S3 Access Point that we created in Step 3. Please take note that both the Object Lambda Access Point and Supporting Access Point must be in the same region.

Next, we need to set up the transformation configuration. In our case, we will be retrieving the JSON file from the S3 bucket to perform the data transformation via our Lambda Function, so we choose GetObject as the S3 API we will be using, as shown in the screenshot below.

Configuring the S3 API that will be used in the data transformation and the Lambda Function to invoke.

Once all these fields are keyed in, we can proceed to create the Object Lambda Access Point.

Now, we will access the JSON file via the Object Lambda Access Point to verify that the file is really transformed into a web page during the request. To do so, firstly, we need to select the newly created Object Lambda Access Point as shown in the following screenshot.

Locate the Object Lambda Access Point we just created in the S3 console.

Secondly, we will search for our JSON file, for example chunlin.json in my case. Then, we will click on the "Open" button to view it. The reason I name the JSON file after my username is that later I will be adding authentication and authorisation so that users can only retrieve their own JSON file based on their login user name.

This page looks very similar to the usual S3 objects listing page. So please make sure you are doing this under the “Object Lambda Access Point”.

A new tab will open showing the web page, as demonstrated in the screenshot below. As you may have noticed in the URL, it still points to the JSON file, but the returned content is an HTML web page.

The domain name is actually no longer the usual S3 domain name but it is our Object Lambda Access Point.

Using the Object Lambda Access Point from Our App

With the Object Lambda Access Point successfully set up, we will show how we can use it. To keep things simple, for the purposes of this article, I will host a serverless web app on Lambda which will serve the medical record website above.

In addition, since Lambda Functions are by default not accessible from the Internet, we will be using API Gateway so that we can have a custom REST endpoint in AWS and map this endpoint to the invocation of our Lambda Function. Technically speaking, the architecture diagram now looks as follows.

This architecture allows public to view the medical record website which is hosted as a serverless web app.

In the newly created Lambda, we will still be developing it with Python 3.12. We name this Lambda lunar-medicalrecords-frontend. We will be using the following code which will retrieve the HTML content from the Object Lambda Access Point.

import json
import os
import logging
import boto3

logger = logging.getLogger()
logger.addHandler(logging.StreamHandler())
logger.setLevel(getattr(logging, os.getenv('LOG_LEVEL', 'INFO')))

s3_client = boto3.client('s3')

def lambda_handler(event, context):
try:
bucket_name = 'ol-lunar-medicalreco-t5uumihstu69ie864td6agtnaps1a--ol-s3'
object_key = 'chunlin.json'

response = s3_client.get_object(
Bucket=bucket_name,
Key=object_key
)

object_data = response['Body'].read()
object_string = object_data.decode('utf-8')

return {
'statusCode': 200,
'body': object_string,
'headers': {
'Content-Type': 'text/html'
}
}

except Exception as err:
return {
'statusCode': 500,
'body': json.dumps(str(err))
}

As shown in the code above, we are still using the same function get_object from the S3 client to retrieve the JSON file, chunlin.json. However, instead of providing the bucket name, we will be using the Object Lambda Access Point Alias, which is located at the S3 Object Lambda Access Points listing page.

This is where we can find the Object Lambda Access Point Alias.

You can read more about the Boto3 get_object documentation to understand more about its Bucket parameter.

The Boto3 documentation highlights the use of Object Lambda Access Point in get_object.

The API Gateway for the Lambda Function is created with HTTP API through the “Add Trigger” function (which is located at the Function overview page). For the Security field, we will be choosing “Open” for now. We will add the login functionality later.

Adding API Gateway as a trigger to our Lambda.

Once this is done, we will be provided with an API Gateway endpoint, as shown in the screenshot below. Visiting the endpoint should render the same web page listing the medical records as we have seen above.

Getting the API endpoint of the API Gateway.

Finally, for the Lambda Function permission, we only need to grant it the following.

  • s3:GetObject.

To make the API Gateway endpoint look more user-friendly, we can also introduce a Custom Domain for the API Gateway, following the guide in one of our earlier posts.

Assigned medical.chunlinprojects.com to our API Gateway.

Protecting Data with Cognito

In order to ensure that only authenticated and authorised users can access their own medical records, we need to securely control access to our app with the help of Amazon Cognito. Cognito is a service that enables us to add user sign-in and access control to our apps quickly and easily. Hence it helps authenticate and authorise users before they can access the medical records.

Step 1: Setup Amazon Cognito

To set up Cognito, firstly, we need to configure the User Pool by specifying sign-in options. A user pool is a managed user directory service that provides authentication and user management capabilities for our apps. It enables us to offload the complexity of user authentication and management to AWS.

Configuring sign-in options and user name requirements.

Please take note that Cognito user pool sign-in options cannot be changed after the user pool has been created. Hence, kindly think carefully during the configuration.

Configuring password policy.

Secondly, we need to configure password policy and choose whether to enable Multi-Factor Authentication (MFA).

By default, Cognito comes with a password policy that ensures our users maintain a password with a minimum length and complexity. For password reset, it will also send the user a temporary password, which expires in 7 days by default.

MFA adds an extra layer of security to the authentication process by requiring users to provide additional verification factors to gain access to their accounts. This reduces the risk of unauthorised access due to compromised passwords.

Enabling MFA in our Cognito user pool.

As shown in the screenshot above, one of the methods is called TOTP. TOTP stands for Time-Based One-Time Password. It is a form of multi-factor authentication (MFA) where a temporary passcode is generated by the authenticator app, adding a layer of security beyond the typical username and password.

Thirdly, we will be configuring Cognito to allow user account recovery as well as new user registration. Both of these by default require email delivery. For example, when users request an account recovery code, an email with the code should be sent to the user. Also, when there is a new user signing up, there should be emails sent to verify and confirm the new account of the user. So, how do we handle the email delivery?

We can choose to send email with Cognito in our development environment.

Ideally, we should be setting up another service known as Amazon SES (Simple Email Service), an email sending service provided by AWS, to deliver the emails. However, for testing purpose, we can choose to use Cognito default email address as well. This approach is normally only suitable for development purpose because we can only use it to send up to 50 emails a day.

Finally, we will be using the hosted authentication pages for user sign-in and sign-up, as demonstrated below.

Using hosted UI so that we can have a simple frontend ready for sign-in and sign-up.

Step 2: Register Our Web App in Cognito

To integrate our app with Cognito, we still need to set up an app client. An App Client is a configuration entity that allows our app to interact with the user pool. It is essentially an application-specific configuration that defines how users will authenticate and interact with our user pool. For example, we have set up a new app client for our medical records app as shown in the following screenshot.

We customise the hosted UI with our logo and CSS.

As shown in the screenshot above, we are able to specify customisation settings for the built-in hosted UI experience. Please take note that we can only customise the look-and-feel of the default "login box", so we cannot modify the layout of the entire hosted UI web page, as demonstrated below.

The part with gray background cannot be customised with the CSS.

In the setup of the app client above, we have configured the callback URL to /authy-callback. So where does this lead to? It actually points to a new Lambda function which is in charge of the authentication.

Step 3: Retrieve Access Token from Cognito Token Endpoint

Here, Cognito uses the OAuth 2.0 authorisation code grant flow. Hence, after successful authentication, Cognito redirects the user back to the specified callback URL with an authorisation code included in the query string under the name code. Our authentication Lambda function thus needs to make a back-end request to the Cognito token endpoint, including the authorisation code, client ID, and redirect URI, to exchange the authorisation code for an access token, refresh token, and ID token.

Client ID can be found under the "App client information" section.

auth_code = event['queryStringParameters']['code']

token_url = "https://lunar-corewebsite.auth.ap-southeast-1.amazoncognito.com/oauth2/token"
client_id = "<client ID to be found in AWS Console>"
callback_url = "https://medical.chunlinprojects.com/authy-callback"

params = {
"grant_type": "authorization_code",
"client_id": client_id,
"code": auth_code,
"redirect_uri": callback_url
}

http = urllib3.PoolManager()
tokens_response = http.request_encode_body(
"POST",
token_url,
encode_multipart = False,
fields = params,
headers = {'Content-Type': 'application/x-www-form-urlencoded'})

token_data = tokens_response.data
tokens = json.loads(token_data)

As shown in the code above, the token endpoint URL for a Cognito user pool generally follows the following structure.

https://<your-domain>.auth.<region>.amazoncognito.com/oauth2/token

A successful response from the token endpoint typically is a JSON object which includes:

  • access_token: Used to access protected resources;
  • id_token: Contains identity information about the user;
  • refresh_token: Used to obtain new access tokens;
  • expires_in: Lifetime of the access token in seconds.

Hence we can retrieve the medical records if there is an access_token but return an “HTTP 401 Unauthorized” response if there is no access_token returned.

if 'access_token' not in tokens:
return {
'statusCode': 401,
'body': get_401_web_content(),
'headers': {
'Content-Type': 'text/html'
}
}

else:
access_token = tokens['access_token']

return {
'statusCode': 200,
'body': get_web_content(access_token),
'headers': {
'Content-Type': 'text/html'
}
}

The function get_401_web_content is responsible for retrieving a static web page showing the 401 error message from the S3 bucket and returning it to the frontend, as shown in the code below.

def get_401_web_content():
bucket_name = 'lunar.medicalrecords.static'
object_key = '401.html'

response = s3_client.get_object(
Bucket=bucket_name,
Key=object_key
)

object_data = response['Body'].read()
content = object_data.decode('utf-8')

return content

Step 4: Retrieve Content Based on Username

For the get_web_content function, we will be passing the access token to the Lambda that we developed earlier to retrieve the HTML content from the Object Lambda Access Point. As shown in the following code, we invoke the Lambda function synchronously and wait for the response.

def get_web_content(access_token):
useful_tokens = {
'access_token': access_token
}

lambda_response = lambda_client.invoke(
FunctionName='lunar-medicalrecords-frontend',
InvocationType='RequestResponse',
Payload=json.dumps(useful_tokens)
)

lambda_response_payload = lambda_response['Payload'].read().decode('utf-8')

web_content = (json.loads(lambda_response_payload))['body']

return web_content

In the Lambda function lunar-medicalrecords-frontend, we no longer need to hardcode the object key as chunlin.json. Instead, we can retrieve the user name from Cognito using the access token, as highlighted in the code below.

...
import boto3

cognito_idp_client = boto3.client('cognito-idp')

def lambda_handler(event, context):
if 'access_token' not in event:
return {
'statusCode': 200,
'body': get_homepage_web_content(),
'headers': {
'Content-Type': 'text/html'
}
}

else:
cognito_response = cognito_idp_client.get_user(AccessToken = event['access_token'])

username = cognito_response['Username']

try:
bucket_name = 'ol-lunar-medicalreco-t5uumihstu69ie864td6agtnaps1a--ol-s3'
object_key = f'{username}.json'

...

except Exception as err:
return {
'statusCode': 500,
'body': json.dumps(str(err))
}

The get_homepage_web_content function above basically retrieves a static homepage from the S3 bucket. It is similar to how the get_401_web_content function works.

The homepage comes with a Login button redirecting users to Hosted UI of our Cognito app client.

Step 5: Store Access Token in Cookies

We need to take note that the auth_code above in the OAuth 2.0 authorisation code grant flow can only be used once. This is because single-use auth_code prevents replay attacks where an attacker could intercept the authorisation code and try to use it multiple times to obtain tokens. Hence, our implementation above will break if we refresh our web page after logging in.

To solve this issue, we will be saving the access token in a cookie when the user first signs in. After that, as long as we detect that there is a valid access token in the cookie, we will not use the auth_code.

In order to save an access token in a cookie, there are several important considerations to ensure security and proper functionality:

  • Set the Secure attribute to ensure the cookie is only sent over HTTPS connections. This helps protect the token from being intercepted during transmission;
  • Use the HttpOnly attribute to prevent client-side scripts from accessing the cookie. This helps mitigate the risk of cross-site scripting (XSS) attacks;
  • Set an appropriate expiration time for the cookie. Since access tokens typically have a short lifespan, ensure the cookie does not outlive the token’s validity.

Thus the code at Step 3 above can be improved as follows.

def lambda_handler(event, context):
now = datetime.now(timezone.utc)

if 'cookies' in event:
for cookie in event['cookies']:
if cookie.startswith('access_token='):
access_token = cookie.replace("access_token=", "")
break

if 'access_token' in locals():

returned_html = get_web_content(access_token)

return {
'statusCode': 200,
'headers': {
'Content-Type': 'text/html'
},
'body': returned_html
}

return {
'statusCode': 401,
'body': get_401_web_content(),
'headers': {
'Content-Type': 'text/html'
}
}

else:
...
if 'access_token' not in tokens:
...

else:
access_token = tokens['access_token']

cookies_expiry = now + timedelta(seconds=tokens['expires_in'])

return {
'statusCode': 200,
'headers': {
'Content-Type': 'text/html',
'Set-Cookie': f'access_token={access_token}; path=/; secure; httponly; expires={cookies_expiry.strftime("%a, %d %b %Y %H:%M:%S")} GMT'
},
'body': get_web_content(access_token)
}

With this, now we can safely refresh our web page and there should be no case of reusing the same auth_code repeatedly.

Wrap-Up

In summary, the infrastructure we have gone through above can be summarised in the following diagram.


Setup and Access Private RDS Database via a Bastion Host

A common scenario requires cloud engineers to configure infrastructure that allows developers to connect safely and securely to an RDS or Aurora database sitting in a private subnet.

For development purposes, some developers tend to expose their databases on AWS through a public IP address as part of the setup. This makes it easy for the developers to gain access to their database, but it is not a recommended method because it introduces a serious security vulnerability that can compromise sensitive data.

Architecture Design

In order to make our database secure, the recommended approach by AWS is to place our database in a private subnet. Since a private subnet has no ability to communicate with the public Internet directly, we are able to isolate our data from the outside world.

Then, in order to enable the developers to connect remotely to our database instance, we will set up a bastion host that allows them to connect to the database via SSH tunnelling.

The following diagram describes the overall architecture that we will be setting up for this scenario.

We will be configuring this with a CloudFormation template. The reason we use CloudFormation is that it provides us with a simple way to create and manage a collection of AWS resources by provisioning and updating them in a predictable way.

Step 1: Specify Parameters

In the CloudFormation template, we will be using the following parameters.

Parameters:
ProjectName:
Type: String
Default: my-project
EC2InstanceType:
Type: String
Default: t2.micro
EC2AMI:
Type: String
Default: ami-020283e959651b381 # Amazon Linux 2023 AMI 2023.3.20240219.0 x86_64 HVM kernel-6.1
EC2KeyPairName:
Type: String
Default: my-project-ap-northeast-1-keypair
MasterUsername:
Type: String
Default: admin
MasterUserPassword:
Type: String
AllowedPattern: "[a-zA-Z0-9]+"
NoEcho: true
EngineVersion:
Type: String
Default: 8.0
MinCapacity:
Type: String
Default: 0.5
MaxCapacity:
Type: String
Default: 1

As you have noticed in the parameters for EC2, we choose to use the Amazon Linux 2023 AMI, which is shown in the following screenshot.

We can easily retrieve the AMI ID of an image in the AWS Console.

We are also using a keypair that we have already created. It is a keypair called “my-project-ap-northeast-1-keypair”.

We can locate existing key pairs in the EC2 instances page.

Step 2: Setup VPC

Amazon Virtual Private Cloud (VPC) is a foundational service for networking and compute categories. It lets us provision a logically isolated section of the AWS cloud to launch our AWS resources. VPC allows resources within a VPC to access AWS services without needing to go over the Internet.

When we use a VPC, we have control over our virtual networking environment. We can choose our own IP address range, create subnets, and configure routing and access control lists.

VPC:
Type: AWS::EC2::VPC
Properties:
CidrBlock: 38.0.0.0/16
Tags:
- Key: Name
Value: !Sub '${AWS::StackName}-vpc'
- Key: Project
Value: !Ref ProjectName

Step 3: Setup Public Subnet, IGW, and Bastion Host

A bastion host is a dedicated server that lets authorised users access a private network from an external network such as the Internet.

A bastion host, also known as a jump server, is used as a bridge between the public Internet and a private subnet in a network architecture. It acts as a gateway that allows secure access from external networks to internal resources without directly exposing those resources to the public.

This setup enhances security by providing a single point of entry that can be closely monitored and controlled, reducing the attack surface of the internal network.

In this step, we will be launching an EC2 instance, which is also our bastion host, into our public subnet, which is defined as follows.

PublicSubnet:
Type: AWS::EC2::Subnet
Properties:
AvailabilityZone: !Select [0, !GetAZs '']
VpcId: !Ref VPC
CidrBlock: 38.0.0.0/20
MapPublicIpOnLaunch: true
Tags:
- Key: Name
Value: !Sub '${AWS::StackName}-vpc-public-subnet1'
- Key: AZ
Value: !Select [0, !GetAZs '']
- Key: Project
Value: !Ref ProjectName

This public subnet will be able to receive connection requests from the public Internet. However, we should make sure that our bastion host is only accessible via SSH on port 22.

BastionSecurityGroup:
Type: AWS::EC2::SecurityGroup
Properties:
GroupName: !Sub '${AWS::StackName}-bastion-sg'
GroupDescription:
!Sub 'Security group for ${AWS::StackName} bastion host'
VpcId: !Ref VPC

BastionAllowInboundSSHFromInternet:
Type: AWS::EC2::SecurityGroupIngress
Properties:
GroupId: !Ref BastionSecurityGroup
IpProtocol: tcp
FromPort: 22
ToPort: 22
CidrIp: 0.0.0.0/0

CidrIp defines the IP address range that is permitted to send inbound traffic through the security group. 0.0.0.0/0 means the whole Internet. We can instead restrict connections to specific IP addresses, such as our home or workplace networks. Doing so reduces the risk of exposing our bastion host to unintended outside audiences.

In order to enable resources in our public subnet, which is our bastion host in this case, to connect to the Internet, we also need to add an Internet Gateway (IGW). An IGW is a VPC component that allows communication between the VPC and the Internet.

InternetGateway:
Type: AWS::EC2::InternetGateway
Properties:
Tags:
- Key: Name
Value: !Sub '${AWS::StackName}-igw'
- Key: Project
Value: !Ref ProjectName

VPCGatewayAttachment:
Type: AWS::EC2::VPCGatewayAttachment
Properties:
InternetGatewayId: !Ref InternetGateway
VpcId: !Ref VPC

For outbound traffic, a route table for the IGW is necessary. When resources within a subnet need to communicate with resources outside of the VPC, such as accessing the public Internet or other AWS services, they need a route to the IGW.

PublicRouteTable:
Type: AWS::EC2::RouteTable
Properties:
VpcId: !Ref VPC
Tags:
- Key: Name
Value: !Sub '${AWS::StackName}-route-table'
- Key: Project
Value: !Ref ProjectName

InternetRoute:
Type: AWS::EC2::Route
DependsOn: VPCGatewayAttachment
Properties:
RouteTableId: !Ref PublicRouteTable
DestinationCidrBlock: 0.0.0.0/0
GatewayId: !Ref InternetGateway

SubnetRouteTableAssociationAZ1:
Type: AWS::EC2::SubnetRouteTableAssociation
Properties:
RouteTableId: !Ref PublicRouteTable
SubnetId: !Ref PublicSubnet

A destination of 0.0.0.0/0 in the DestinationCidrBlock means that all traffic that is trying to access the Internet needs to flow through the target, i.e. the IGW.

Finally, we can define our bastion host EC2 instance with the following template.

BastionInstance:
Type: AWS::EC2::Instance
Properties:
ImageId: !Ref EC2AMI
InstanceType: !Ref EC2InstanceType
KeyName: !Ref EC2KeyPairName
SubnetId: !Ref PublicSubnet
SecurityGroupIds:
- !Ref BastionSecurityGroup
Tags:
- Key: Name
Value: !Sub '${AWS::StackName}-bastion'
- Key: Project
Value: !Ref ProjectName

Step 4: Configure Private Subnets and Subnet Group

The database instance, as shown in the diagram above, is hosted in a private subnet so that it is securely protected from direct public Internet access.

When we are creating a database instance, we need to provide something called a subnet group. A subnet group helps deploy our instances across multiple Availability Zones (AZs), providing high availability and fault tolerance. Hence, we need to create two private subnets in order to successfully set up our database cluster.

PrivateSubnet1:
Type: AWS::EC2::Subnet
Properties:
VpcId: !Ref VPC
AvailabilityZone: !Select [0, !GetAZs '']
CidrBlock: 38.0.128.0/20
Tags:
- Key: Name
Value: !Sub '${AWS::StackName}-vpc-private-subnet1'
- Key: AZ
Value: !Select [0, !GetAZs '']
- Key: Project
Value: !Ref ProjectName

PrivateSubnet2:
Type: AWS::EC2::Subnet
Properties:
VpcId: !Ref VPC
AvailabilityZone: !Select [1, !GetAZs '']
CidrBlock: 38.0.144.0/20
Tags:
- Key: Name
Value: !Sub '${AWS::StackName}-vpc-private-subnet2'
- Key: AZ
Value: !Select [1, !GetAZs '']
- Key: Project
Value: !Ref ProjectName

Even though resources in private subnets should not be directly accessible from the Internet, they still need to communicate with other resources within the VPC. Hence, a route table is necessary to define routes that enable this internal communication.

PrivateRouteTable1:
Type: AWS::EC2::RouteTable
Properties:
VpcId: !Ref VPC
Tags:
- Key: Name
Value: !Sub '${AWS::StackName}-route-table-private-1'
- Key: Project
Value: !Ref ProjectName

PrivateSubnetRouteTableAssociationAZ1:
Type: AWS::EC2::SubnetRouteTableAssociation
Properties:
RouteTableId: !Ref PrivateRouteTable1
SubnetId: !Ref PrivateSubnet1

PrivateRouteTable2:
Type: AWS::EC2::RouteTable
Properties:
VpcId: !Ref VPC
Tags:
- Key: Name
Value: !Sub '${AWS::StackName}-route-table-private-2'
- Key: Project
Value: !Ref ProjectName

PrivateSubnetRouteTableAssociationAZ2:
Type: AWS::EC2::SubnetRouteTableAssociation
Properties:
RouteTableId: !Ref PrivateRouteTable2
SubnetId: !Ref PrivateSubnet2

In this article, as shown in the diagram above, one of the private subnets is not used. The additional subnet makes it easier for us to switch to a Multi-AZ database instance deployment in the future.

After we have defined the two private subnets, we can thus proceed to configure the subnet group as follows.

DBSubnetGroup: 
Type: 'AWS::RDS::DBSubnetGroup'
Properties:
DBSubnetGroupDescription:
!Sub 'Subnet group for ${AWS::StackName}-core-db DB Cluster'
SubnetIds:
- !Ref PrivateSubnet1
- !Ref PrivateSubnet2
Tags:
- Key: Project
Value: !Ref ProjectName

Step 5: Define Database Cluster and Instance

As mentioned earlier, we will be using Amazon Aurora. So what is Aurora?

In 2014, Aurora was introduced to the public. Aurora is a fully-managed MySQL and PostgreSQL-compatible RDBMS with up to 5x the throughput of MySQL and 3x that of PostgreSQL, at one tenth the cost of commercial databases.

Five years after that, in 2019, Aurora Serverless was generally available in several regions such as US, EU, and Japan. Aurora Serverless is a flexible and cost-effective RDBMS option on AWS for apps with variable or unpredictable workloads because it offers an on-demand and auto-scaling way to run Aurora database clusters.

In 2022, Aurora Serverless v2 became generally available, with CloudFormation support.

RDSDBCluster:
Type: 'AWS::RDS::DBCluster'
Properties:
Engine: aurora-mysql
DBClusterIdentifier: !Sub '${AWS::StackName}-core-db'
DBSubnetGroupName: !Ref DBSubnetGroup
NetworkType: IPV4
VpcSecurityGroupIds:
- !Ref DatabaseSecurityGroup
AvailabilityZones:
- !Select [0, !GetAZs '']
EngineVersion: !Ref EngineVersion
MasterUsername: !Ref MasterUsername
MasterUserPassword: !Ref MasterUserPassword
ServerlessV2ScalingConfiguration:
MinCapacity: !Ref MinCapacity
MaxCapacity: !Ref MaxCapacity

RDSDBInstance:
Type: 'AWS::RDS::DBInstance'
Properties:
Engine: aurora-mysql
DBInstanceClass: db.serverless
DBClusterIdentifier: !Ref RDSDBCluster

The ServerlessV2ScalingConfiguration property is specially designed for Aurora Serverless v2 only. Here, we configure the minimum and maximum capacities for our database cluster to be 0.5 and 1 ACUs, respectively.

We choose 0.5 for the minimum because that allows our database instance to scale down the most when it is completely idle. For the maximum, we choose the lowest possible value, i.e. 1 ACU, to avoid the possibility of unexpected charges.

Step 6: Allow Connection from Bastion Host to the Database Instance

Finally, we need to allow the traffic from our bastion host to the database. Hence, our database security group template should be defined in the following manner.

DatabaseSecurityGroup:
Type: AWS::EC2::SecurityGroup
Properties:
GroupName: !Sub '${AWS::StackName}-core-database-sg'
GroupDescription:
!Sub 'Security group for ${AWS::StackName} core database'
VpcId: !Ref VPC

DatabaseAllowInboundFromBastion:
Type: AWS::EC2::SecurityGroupIngress
Properties:
GroupId: !Ref DatabaseSecurityGroup
IpProtocol: tcp
FromPort: 3306
ToPort: 3306
SourceSecurityGroupId:
Fn::GetAtt:
- BastionSecurityGroup
- GroupId

To connect to the database instance from the bastion host, we need to navigate to the folder containing the private key and perform the following.

ssh -i <private-key.pem> -f -N -L 3306:<db-instance-endpoint>:3306 ec2-user@<bastion-host-ip-address> -vvv

The -L option in the format of port:host:hostport in the command above basically specifies that connections to the given TCP port on the local host are to be forwarded to the given host and port on the remote side.
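
Once the tunnel is running, anything that speaks MySQL can connect to 127.0.0.1:3306 as if the database were local. Below is a minimal Python sketch using PyMySQL (an assumed third-party dependency; any MySQL client works), with credentials matching the MasterUsername and MasterUserPassword parameters.

import pymysql  # assumed dependency: pip install pymysql

# Connect to the local end of the SSH tunnel; the ssh -L command above
# forwards this traffic to the Aurora endpoint on port 3306.
connection = pymysql.connect(
    host="127.0.0.1",
    port=3306,
    user="admin",                  # MasterUsername parameter
    password="<master-password>",  # MasterUserPassword parameter
)

with connection.cursor() as cursor:
    cursor.execute("SELECT VERSION()")
    print(cursor.fetchone())

connection.close()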

We can get the endpoint and port of our DB instance from the AWS Console.

With the command above, we should be able to connect to our database instance via our bastion host, as shown in the screenshot below.

We can proceed to connect to our database instance after reaching this step.

Now, we are able to connect to our Aurora database on MySQL Workbench.

Connecting to our Aurora Serverless database on AWS!

Wrap-Up

That's all for configuring the infrastructure described in the following diagram so that we can connect to our RDS databases in private subnets through a bastion host.

I have also attached the complete CloudFormation template below for your reference.

# This is the complete template for our scenario discussed in this article.
---
AWSTemplateFormatVersion: '2010-09-09'
Description: 'Setup and Access Private RDS Database via a Bastion Host'

Parameters:
ProjectName:
Type: String
Default: my-project
EC2InstanceType:
Type: String
Default: t2.micro
EC2AMI:
Type: String
Default: ami-020283e959651b381 # Amazon Linux 2023 AMI 2023.3.20240219.0 x86_64 HVM kernel-6.1
EC2KeyPairName:
Type: String
Default: my-project-ap-northeast-1-keypair
MasterUsername:
Type: String
Default: admin
MasterUserPassword:
Type: String
AllowedPattern: "[a-zA-Z0-9]+"
NoEcho: true
EngineVersion:
Type: String
Default: 8.0
MinCapacity:
Type: String
Default: 0.5
MaxCapacity:
Type: String
Default: 1

Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 38.0.0.0/16
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-vpc'
        - Key: Project
          Value: !Ref ProjectName

  PublicSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone: !Select [0, !GetAZs '']
      VpcId: !Ref VPC
      CidrBlock: 38.0.0.0/20
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-vpc-public-subnet1'
        - Key: AZ
          Value: !Select [0, !GetAZs '']
        - Key: Project
          Value: !Ref ProjectName

  PrivateSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [0, !GetAZs '']
      CidrBlock: 38.0.128.0/20
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-vpc-private-subnet1'
        - Key: AZ
          Value: !Select [0, !GetAZs '']
        - Key: Project
          Value: !Ref ProjectName

  PrivateSubnet2:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [1, !GetAZs '']
      CidrBlock: 38.0.144.0/20
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-vpc-private-subnet2'
        - Key: AZ
          Value: !Select [1, !GetAZs '']
        - Key: Project
          Value: !Ref ProjectName

  InternetGateway:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-igw'
        - Key: Project
          Value: !Ref ProjectName

  VPCGatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      InternetGatewayId: !Ref InternetGateway
      VpcId: !Ref VPC

  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-route-table'
        - Key: Project
          Value: !Ref ProjectName

  InternetRoute:
    Type: AWS::EC2::Route
    DependsOn: VPCGatewayAttachment
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway

  SubnetRouteTableAssociationAZ1:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PublicRouteTable
      SubnetId: !Ref PublicSubnet

  PrivateRouteTable1:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-route-table-private-1'
        - Key: Project
          Value: !Ref ProjectName

  PrivateSubnetRouteTableAssociationAZ1:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PrivateRouteTable1
      SubnetId: !Ref PrivateSubnet1

  PrivateRouteTable2:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-route-table-private-2'
        - Key: Project
          Value: !Ref ProjectName

  PrivateSubnetRouteTableAssociationAZ2:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PrivateRouteTable2
      SubnetId: !Ref PrivateSubnet2

  BastionSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Sub '${AWS::StackName}-bastion-sg'
      GroupDescription: !Sub 'Security group for ${AWS::StackName} bastion host'
      VpcId: !Ref VPC

  BastionAllowInboundSSHFromInternet:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref BastionSecurityGroup
      IpProtocol: tcp
      FromPort: 22
      ToPort: 22
      CidrIp: 0.0.0.0/0

  BastionInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: !Ref EC2AMI
      InstanceType: !Ref EC2InstanceType
      KeyName: !Ref EC2KeyPairName
      SubnetId: !Ref PublicSubnet
      SecurityGroupIds:
        - !Ref BastionSecurityGroup
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-bastion'
        - Key: Project
          Value: !Ref ProjectName

  DatabaseSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Sub '${AWS::StackName}-core-database-sg'
      GroupDescription: !Sub 'Security group for ${AWS::StackName} core database'
      VpcId: !Ref VPC

  DatabaseAllowInboundFromBastion:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      # Allow MySQL traffic (port 3306) only from the bastion host security group
      GroupId: !Ref DatabaseSecurityGroup
      IpProtocol: tcp
      FromPort: 3306
      ToPort: 3306
      SourceSecurityGroupId:
        Fn::GetAtt:
          - BastionSecurityGroup
          - GroupId

  DBSubnetGroup:
    Type: 'AWS::RDS::DBSubnetGroup'
    Properties:
      DBSubnetGroupDescription: !Sub 'Subnet group for ${AWS::StackName}-core-db DB Cluster'
      SubnetIds:
        - !Ref PrivateSubnet1
        - !Ref PrivateSubnet2
      Tags:
        - Key: Project
          Value: !Ref ProjectName

  RDSDBCluster:
    Type: 'AWS::RDS::DBCluster'
    Properties:
      Engine: aurora-mysql
      DBClusterIdentifier: !Sub '${AWS::StackName}-core-db'
      DBSubnetGroupName: !Ref DBSubnetGroup
      NetworkType: IPV4
      VpcSecurityGroupIds:
        - !Ref DatabaseSecurityGroup
      AvailabilityZones:
        - !Select [0, !GetAZs '']
      EngineVersion: !Ref EngineVersion
      MasterUsername: !Ref MasterUsername
      MasterUserPassword: !Ref MasterUserPassword
      ServerlessV2ScalingConfiguration:
        MinCapacity: !Ref MinCapacity
        MaxCapacity: !Ref MaxCapacity

  RDSDBInstance:
    Type: 'AWS::RDS::DBInstance'
    Properties:
      Engine: aurora-mysql
      DBInstanceClass: db.serverless
      DBClusterIdentifier: !Ref RDSDBCluster
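
To try the template out end to end, we can deploy it with the AWS CLI. The sketch below assumes the template is saved as template.yml, the stack is named my-project, and an EC2 key pair matching the EC2KeyPairName parameter already exists in the target region.

# Validate the template first (optional but cheap)
aws cloudformation validate-template --template-body file://template.yml

# Create or update the stack
aws cloudformation deploy \
    --template-file template.yml \
    --stack-name my-project \
    --parameter-overrides \
        MasterUserPassword=<your-database-password> \
        EC2KeyPairName=<your-key-pair-name>

Since MasterUserPassword is declared with NoEcho, its value will not be displayed when the stack parameters are inspected later.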