Setting up Custom Domain for AWS API Gateway (version 2024)

In our earlier post about setting up AWS Lambda, we saw that a Lambda function invoked by HTTP requests normally needs API Gateway to handle the incoming requests and route them to the appropriate function. Hence, API Gateway is often used as a complementary service to manage and expose our Lambda functions to the outside world.

By default, AWS provides a default domain name in the form of api-id.execute-api.region.amazonaws.com, as shown in the screenshot below.

This Lambda is deployed in Singapore, so the region is ap-southeast-1.

Usually, we would like to expose our API Gateway through a domain name that we own, rather than using the default domain provided, in order to enhance the visibility, security, and professionalism of our API, while also providing greater flexibility and control over its configuration and branding. In this article, we will show one of the available approaches.

Domain Registrar

We must have a registered Internet domain name before we can set up custom domain names for our API Gateway.

I previously registered a domain name, chunlinprojects.com, on GoDaddy, one of the world’s largest domain registrars. Hence, I decided to create a subdomain called poker.chunlinprojects.com and use it for my API Gateway.

My personal domain, chunlinprojects.com, on GoDaddy.

ACM Certificate

Before setting up a custom domain name for an API, we must also have an SSL/TLS certificate ready in AWS Certificate Manager (ACM). Take note that for an API Gateway Regional custom domain name, we must request or import the certificate in the same Region as our API Gateway.

Requesting a public certificate in ACM.

In the certificate request page, we need to specify the domain name. In my case, it should be poker.chunlinprojects.com. Once it is done, we will need to add a CNAME record to our domain registrar. Its status will only be “Success” after we have added the CNAME record successfully, as shown in the screenshot below.

We need to add the CNAME record under the “Domain” section to our domain registrar.
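The validation record generated by ACM looks roughly like the following. The record name and value below are made up for illustration; each certificate request gets its own unique pair, which is shown on the certificate detail page.

_3c5f8a1b2d9e.poker.chunlinprojects.com.  CNAME  _7e9d4c6f1a2b.acm-validations.aws.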

Route 53

As shown in the screenshot above, it is also possible to add the CNAME record to Route 53. So what is this Route 53 about?

Route 53 is Amazon’s Domain Name System (DNS) web service. It allows us to manage the DNS records for our domain, including A records, CNAME records, TXT records, and more. This is similar to what we can do on GoDaddy, so why do we need Route 53?

Route 53 is a better option because it provides a user-friendly interface for adding, updating, and deleting these records. In addition, Route 53 supports alias records, which can be used to map our custom domain directly to the API Gateway endpoint. Alias records work similarly to CNAME records, but with the advantage that Route 53 resolves them without additional DNS lookups. This can improve the performance and reliability of our API by reducing latency and DNS resolution times.

To set up Route 53 for our custom domain, we first need to create a hosted zone in Route 53 for our domain, as shown in the screenshot below.

Creating a new hosted zone for our custom domain.

After getting the hosted zone created, we will be able to get the list of name servers that we can use, as shown in the following screenshot.

NS (Name Server) records in Route 53 are used to specify the name servers responsible for answering DNS queries for our domain.

Since we want to use Route 53 name servers for DNS resolution, we need to update the NS records in the GoDaddy DNS settings to point to the Route 53 name servers.

The name servers in GoDaddy for my domain have been updated to use Route 53 ones.

Now we can add the CNAME record from earlier to our Route 53 hosted zone too.

The CNAME record required by the ACM certificate has been added to Route 53.

Custom Domain Name in API Gateway

Once both ACM and Route 53 are fully set up, we can move on to configuring the custom domain name for our API Gateway.

The ACM certificate we created earlier will be available as one of the options.

Next, we use API mappings to connect API stages to the custom domain name. For more information about how API mapping is configured, please read the official AWS documentation.

We have mapped the custom domain to the API Gateway that we created.

As shown in the screenshot above, under the Default Endpoint section of the API, we can choose to disable it so that users are unable to access the API using the AWS-generated default endpoint.

Disabled the default endpoint so that users can only access the API Gateway via our custom domain.

Create A Record in Route 53

The last step is to add a new A record pointing to our API Gateway using an alias.

Remember to turn on the “Alias” when creating the A Record.
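Behind the scenes, this alias record corresponds to a Route 53 change batch similar to the sketch below. The hosted zone ID and the d-xxxx target domain are placeholders; the actual values for the API Gateway endpoint appear on the custom domain name page of the console.

{
  "Changes": [
    {
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "poker.chunlinprojects.com",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z00000000000000000000",
          "DNSName": "d-abcde12345.execute-api.ap-southeast-1.amazonaws.com",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}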

After creating it, AWS console will allow us to view the status of the record creation, as shown in the screenshot below.

We need to wait for the Status to change from “Pending” to “Insync”.

Wrap-Up

Now, when we visit our custom domain name together with the path, we should be able to access the Lambda function that we set up earlier in another article.

If you ever encounter an error message saying “Not Found” as shown in the screenshot below, it is possible that the API mapping is not done properly or there is a typo in the Path.

Error message: { “message”: “Not Found” }

The entire infrastructure that we have gone through in this article can be summarised in the following diagram.


Setup and Access Private RDS Database via a Bastion Host

A common scenario requires cloud engineers to configure infrastructure that allows developers to safely and securely connect to an RDS or Aurora database sitting in a private subnet.

For development purposes, some developers tend to assign a public IP address to their databases on AWS as part of the setup. This makes it easy for them to gain access to their database, but it is undoubtedly not a recommended method, because it introduces a huge security vulnerability that can compromise sensitive data.

Architecture Design

In order to make our database secure, the recommended approach by AWS is to place our database in a private subnet. Since a private subnet has no ability to communicate with the public Internet directly, we are able to isolate our data from the outside world.

Then, in order to enable developers to connect remotely to our database instance, we will set up a bastion host that allows them to connect to the database via SSH tunnelling.

The following diagram describes the overall architecture that we will be setting up for this scenario.

We will be configuring this with a CloudFormation template. We use CloudFormation because it provides a simple way to create and manage a collection of AWS resources by provisioning and updating them in a predictable way.
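Assuming the template is saved locally as template.yaml (a filename chosen here just for illustration), the stack can then be created or updated with a single AWS CLI command:

aws cloudformation deploy \
    --template-file template.yaml \
    --stack-name my-project \
    --parameter-overrides MasterUserPassword=<your-password>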

Step 1: Specify Parameters

In the CloudFormation template, we will be using the following parameters.

Parameters:
  ProjectName:
    Type: String
    Default: my-project
  EC2InstanceType:
    Type: String
    Default: t2.micro
  EC2AMI:
    Type: String
    Default: ami-020283e959651b381 # Amazon Linux 2023 AMI 2023.3.20240219.0 x86_64 HVM kernel-6.1
  EC2KeyPairName:
    Type: String
    Default: my-project-ap-northeast-1-keypair
  MasterUsername:
    Type: String
    Default: admin
  MasterUserPassword:
    Type: String
    AllowedPattern: "[a-zA-Z0-9]+"
    NoEcho: true
  EngineVersion:
    Type: String
    Default: 8.0
  MinCapacity:
    Type: String
    Default: 0.5
  MaxCapacity:
    Type: String
    Default: 1

As you may have noticed in the parameters for EC2, we choose to use the Amazon Linux 2023 AMI, which is shown in the following screenshot.

We can easily retrieve the AMI ID of an image in the AWS Console.

We are also using an existing key pair that we created earlier, called “my-project-ap-northeast-1-keypair”.

We can locate existing key pairs in the EC2 instances page.

Step 2: Setup VPC

Amazon Virtual Private Cloud (VPC) is a foundational service in the networking and compute categories. It lets us provision a logically isolated section of the AWS cloud in which to launch our AWS resources. Resources within a VPC can also reach AWS services privately, without going over the Internet.

When we use a VPC, we have control over our virtual networking environment. We can choose our own IP address range, create subnets, and configure routing and access control lists.

VPC:
  Type: AWS::EC2::VPC
  Properties:
    CidrBlock: 38.0.0.0/16
    Tags:
      - Key: Name
        Value: !Sub '${AWS::StackName}-vpc'
      - Key: Project
        Value: !Ref ProjectName

Step 3: Setup Public Subnet, IGW, and Bastion Host

A bastion host, also known as a jump server, is a dedicated server that lets authorised users access a private network from an external network such as the Internet. It acts as a gateway between the public Internet and a private subnet, allowing secure access to internal resources without directly exposing those resources to the public.

This setup enhances security by providing a single point of entry that can be closely monitored and controlled, reducing the attack surface of the internal network.

In this step, we will launch an EC2 instance, which serves as our bastion host, into our public subnet, defined as follows.

PublicSubnet:
  Type: AWS::EC2::Subnet
  Properties:
    AvailabilityZone: !Select [0, !GetAZs '']
    VpcId: !Ref VPC
    CidrBlock: 38.0.0.0/20
    MapPublicIpOnLaunch: true
    Tags:
      - Key: Name
        Value: !Sub '${AWS::StackName}-vpc-public-subnet1'
      - Key: AZ
        Value: !Select [0, !GetAZs '']
      - Key: Project
        Value: !Ref ProjectName

This public subnet will be able to receive connection requests from the Internet. However, we should make sure that our bastion host is only accessible via SSH on port 22.

BastionSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupName: !Sub '${AWS::StackName}-bastion-sg'
    GroupDescription: !Sub 'Security group for ${AWS::StackName} bastion host'
    VpcId: !Ref VPC

BastionAllowInboundSSHFromInternet:
  Type: AWS::EC2::SecurityGroupIngress
  Properties:
    GroupId: !Ref BastionSecurityGroup
    IpProtocol: tcp
    FromPort: 22
    ToPort: 22
    CidrIp: 0.0.0.0/0

CidrIp defines the IP address range that is permitted to send inbound traffic through the security group; 0.0.0.0/0 means the whole Internet. Alternatively, we can restrict connections to certain IP addresses, such as our home or workplace networks. Doing so reduces the risk of exposing our bastion host to unintended outside audiences.
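For example, if our workplace network uses the public IP address 203.0.113.25 (a documentation address used here purely as a placeholder), the ingress rule could be tightened as follows.

BastionAllowInboundSSHFromInternet:
  Type: AWS::EC2::SecurityGroupIngress
  Properties:
    GroupId: !Ref BastionSecurityGroup
    IpProtocol: tcp
    FromPort: 22
    ToPort: 22
    CidrIp: 203.0.113.25/32 # only this single address may SSH in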

In order to enable resources in our public subnet, which here means our bastion host, to connect to the Internet, we also need to add an Internet Gateway (IGW). An IGW is a VPC component that allows communication between the VPC and the Internet.

InternetGateway:
  Type: AWS::EC2::InternetGateway
  Properties:
    Tags:
      - Key: Name
        Value: !Sub '${AWS::StackName}-igw'
      - Key: Project
        Value: !Ref ProjectName

VPCGatewayAttachment:
  Type: AWS::EC2::VPCGatewayAttachment
  Properties:
    InternetGatewayId: !Ref InternetGateway
    VpcId: !Ref VPC

For outbound traffic, a route table with a route to the IGW is necessary. When resources within a subnet need to communicate with resources outside of the VPC, such as the public Internet or other AWS services, they need a route to the IGW.

PublicRouteTable:
  Type: AWS::EC2::RouteTable
  Properties:
    VpcId: !Ref VPC
    Tags:
      - Key: Name
        Value: !Sub '${AWS::StackName}-route-table'
      - Key: Project
        Value: !Ref ProjectName

InternetRoute:
  Type: AWS::EC2::Route
  DependsOn: VPCGatewayAttachment
  Properties:
    RouteTableId: !Ref PublicRouteTable
    DestinationCidrBlock: 0.0.0.0/0
    GatewayId: !Ref InternetGateway

SubnetRouteTableAssociationAZ1:
  Type: AWS::EC2::SubnetRouteTableAssociation
  Properties:
    RouteTableId: !Ref PublicRouteTable
    SubnetId: !Ref PublicSubnet

A destination of 0.0.0.0/0 in the DestinationCidrBlock means that all traffic that is trying to access the Internet needs to flow through the target, i.e. the IGW.

Finally, we can define our bastion host EC2 instance with the following template.

BastionInstance:
  Type: AWS::EC2::Instance
  Properties:
    ImageId: !Ref EC2AMI
    InstanceType: !Ref EC2InstanceType
    KeyName: !Ref EC2KeyPairName
    SubnetId: !Ref PublicSubnet
    SecurityGroupIds:
      - !Ref BastionSecurityGroup
    Tags:
      - Key: Name
        Value: !Sub '${AWS::StackName}-bastion'
      - Key: Project
        Value: !Ref ProjectName

Step 4: Configure Private Subnets and Subnet Group

The database instance, as shown in the diagram above, is hosted in a private subnet so that it is securely protected from direct public Internet access.

When creating a database instance, we need to provide something called a subnet group. A subnet group lets our instances be deployed across multiple Availability Zones (AZs), providing high availability and fault tolerance. Hence, we need to create two private subnets in order to successfully set up our database cluster.

PrivateSubnet1:
  Type: AWS::EC2::Subnet
  Properties:
    VpcId: !Ref VPC
    AvailabilityZone: !Select [0, !GetAZs '']
    CidrBlock: 38.0.128.0/20
    Tags:
      - Key: Name
        Value: !Sub '${AWS::StackName}-vpc-private-subnet1'
      - Key: AZ
        Value: !Select [0, !GetAZs '']
      - Key: Project
        Value: !Ref ProjectName

PrivateSubnet2:
  Type: AWS::EC2::Subnet
  Properties:
    VpcId: !Ref VPC
    AvailabilityZone: !Select [1, !GetAZs '']
    CidrBlock: 38.0.144.0/20
    Tags:
      - Key: Name
        Value: !Sub '${AWS::StackName}-vpc-private-subnet2'
      - Key: AZ
        Value: !Select [1, !GetAZs '']
      - Key: Project
        Value: !Ref ProjectName

Even though resources in private subnets should not be directly accessible from the Internet, they still need to communicate with other resources within the VPC. Hence, a route table is necessary to define the routes that enable this internal communication.

PrivateRouteTable1:
  Type: AWS::EC2::RouteTable
  Properties:
    VpcId: !Ref VPC
    Tags:
      - Key: Name
        Value: !Sub '${AWS::StackName}-route-table-private-1'
      - Key: Project
        Value: !Ref ProjectName

PrivateSubnetRouteTableAssociationAZ1:
  Type: AWS::EC2::SubnetRouteTableAssociation
  Properties:
    RouteTableId: !Ref PrivateRouteTable1
    SubnetId: !Ref PrivateSubnet1

PrivateRouteTable2:
  Type: AWS::EC2::RouteTable
  Properties:
    VpcId: !Ref VPC
    Tags:
      - Key: Name
        Value: !Sub '${AWS::StackName}-route-table-private-2'
      - Key: Project
        Value: !Ref ProjectName

PrivateSubnetRouteTableAssociationAZ2:
  Type: AWS::EC2::SubnetRouteTableAssociation
  Properties:
    RouteTableId: !Ref PrivateRouteTable2
    SubnetId: !Ref PrivateSubnet2

In this article, as shown in the diagram above, one of the private subnets is not used. The additional subnet makes it easier for us to switch to a Multi-AZ database instance deployment in the future.

After we have defined the two private subnets, we can thus proceed to configure the subnet group as follows.

DBSubnetGroup:
  Type: 'AWS::RDS::DBSubnetGroup'
  Properties:
    DBSubnetGroupDescription: !Sub 'Subnet group for ${AWS::StackName}-core-db DB Cluster'
    SubnetIds:
      - !Ref PrivateSubnet1
      - !Ref PrivateSubnet2
    Tags:
      - Key: Project
        Value: !Ref ProjectName

Step 5: Define Database Cluster and Instance

As mentioned earlier, we will be using Amazon Aurora. So what is Aurora?

In 2014, Aurora was introduced to the public. Aurora is a fully-managed MySQL- and PostgreSQL-compatible RDBMS. It offers up to 5x the throughput of MySQL and 3x that of PostgreSQL, at 1/10th the cost of commercial databases.

Five years after that, in 2019, Aurora Serverless was generally available in several regions such as US, EU, and Japan. Aurora Serverless is a flexible and cost-effective RDBMS option on AWS for apps with variable or unpredictable workloads because it offers an on-demand and auto-scaling way to run Aurora database clusters.

In 2022, Aurora Serverless v2 became generally available, with support for CloudFormation.

RDSDBCluster:
  Type: 'AWS::RDS::DBCluster'
  Properties:
    Engine: aurora-mysql
    DBClusterIdentifier: !Sub '${AWS::StackName}-core-db'
    DBSubnetGroupName: !Ref DBSubnetGroup
    NetworkType: IPV4
    VpcSecurityGroupIds:
      - !Ref DatabaseSecurityGroup
    AvailabilityZones:
      - !Select [0, !GetAZs '']
    EngineVersion: !Ref EngineVersion
    MasterUsername: !Ref MasterUsername
    MasterUserPassword: !Ref MasterUserPassword
    ServerlessV2ScalingConfiguration:
      MinCapacity: !Ref MinCapacity
      MaxCapacity: !Ref MaxCapacity

RDSDBInstance:
  Type: 'AWS::RDS::DBInstance'
  Properties:
    Engine: aurora-mysql
    DBInstanceClass: db.serverless
    DBClusterIdentifier: !Ref RDSDBCluster

The ServerlessV2ScalingConfiguration property applies only to Aurora Serverless v2. Here, we configure the minimum and maximum capacities of our database cluster to be 0.5 and 1 ACUs, respectively.

We choose 0.5 for the minimum because it allows our database instance to scale down the furthest when it is completely idle. For the maximum, we choose the lowest possible value, i.e. 1 ACU, to avoid unexpected charges.

Step 6: Allow Connection from Bastion Host to the Database Instance

Finally, we need to allow the traffic from our bastion host to the database. Hence, our database security group template should be defined in the following manner.

DatabaseSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupName: !Sub '${AWS::StackName}-core-database-sg'
    GroupDescription: !Sub 'Security group for ${AWS::StackName} core database'
    VpcId: !Ref VPC

DatabaseAllowInboundFromBastion:
  Type: AWS::EC2::SecurityGroupIngress
  Properties:
    GroupId: !Ref DatabaseSecurityGroup
    IpProtocol: tcp
    FromPort: 3306
    ToPort: 3306
    SourceSecurityGroupId:
      Fn::GetAtt:
        - BastionSecurityGroup
        - GroupId

To connect to the database instance from the bastion host, we need to navigate to the folder containing the private key and perform the following.

ssh -i <private-key.pem> -f -N -L 3306:<db-instance-endpoint>:3306 ec2-user@<bastion-host-ip-address> -vvv

The -L option, in the format port:host:hostport, specifies that connections to the given TCP port on the local host are forwarded to the given host and port on the remote side. The -f flag sends ssh to the background, and -N tells it not to execute any remote command, since we only need the port forwarding; -vvv simply turns on verbose output for troubleshooting.
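Once the tunnel is up, any local MySQL client can connect through it by pointing at 127.0.0.1. For example, with the mysql command-line client, using the MasterUsername we defined in the parameters earlier:

mysql -h 127.0.0.1 -P 3306 -u admin -p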

We can get the endpoint and port of our DB instance from the AWS Console.

With the command above, we should be able to connect to our database instance via our bastion host, as shown in the screenshot below.

We can proceed to connect to our database instance after reaching this step.

Now, we are able to connect to our Aurora database on MySQL Workbench.

Connecting to our Aurora Serverless database on AWS!

Wrap-Up

That’s all for configuring the infrastructure described in the following diagram so that we can connect to our RDS databases in private subnets through a bastion host.

I have also attached the complete CloudFormation template below for your reference.

# This is the complete template for our scenario discussed in this article.
---
AWSTemplateFormatVersion: '2010-09-09'
Description: 'Setup and Access Private RDS Database via a Bastion Host'

Parameters:
  ProjectName:
    Type: String
    Default: my-project
  EC2InstanceType:
    Type: String
    Default: t2.micro
  EC2AMI:
    Type: String
    Default: ami-020283e959651b381 # Amazon Linux 2023 AMI 2023.3.20240219.0 x86_64 HVM kernel-6.1
  EC2KeyPairName:
    Type: String
    Default: my-project-ap-northeast-1-keypair
  MasterUsername:
    Type: String
    Default: admin
  MasterUserPassword:
    Type: String
    AllowedPattern: "[a-zA-Z0-9]+"
    NoEcho: true
  EngineVersion:
    Type: String
    Default: 8.0
  MinCapacity:
    Type: String
    Default: 0.5
  MaxCapacity:
    Type: String
    Default: 1

Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 38.0.0.0/16
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-vpc'
        - Key: Project
          Value: !Ref ProjectName

  PublicSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone: !Select [0, !GetAZs '']
      VpcId: !Ref VPC
      CidrBlock: 38.0.0.0/20
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-vpc-public-subnet1'
        - Key: AZ
          Value: !Select [0, !GetAZs '']
        - Key: Project
          Value: !Ref ProjectName

  PrivateSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [0, !GetAZs '']
      CidrBlock: 38.0.128.0/20
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-vpc-private-subnet1'
        - Key: AZ
          Value: !Select [0, !GetAZs '']
        - Key: Project
          Value: !Ref ProjectName

  PrivateSubnet2:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [1, !GetAZs '']
      CidrBlock: 38.0.144.0/20
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-vpc-private-subnet2'
        - Key: AZ
          Value: !Select [1, !GetAZs '']
        - Key: Project
          Value: !Ref ProjectName

  InternetGateway:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-igw'
        - Key: Project
          Value: !Ref ProjectName

  VPCGatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      InternetGatewayId: !Ref InternetGateway
      VpcId: !Ref VPC

  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-route-table'
        - Key: Project
          Value: !Ref ProjectName

  InternetRoute:
    Type: AWS::EC2::Route
    DependsOn: VPCGatewayAttachment
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway

  SubnetRouteTableAssociationAZ1:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PublicRouteTable
      SubnetId: !Ref PublicSubnet

  PrivateRouteTable1:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-route-table-private-1'
        - Key: Project
          Value: !Ref ProjectName

  PrivateSubnetRouteTableAssociationAZ1:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PrivateRouteTable1
      SubnetId: !Ref PrivateSubnet1

  PrivateRouteTable2:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-route-table-private-2'
        - Key: Project
          Value: !Ref ProjectName

  PrivateSubnetRouteTableAssociationAZ2:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PrivateRouteTable2
      SubnetId: !Ref PrivateSubnet2

  BastionSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Sub '${AWS::StackName}-bastion-sg'
      GroupDescription: !Sub 'Security group for ${AWS::StackName} bastion host'
      VpcId: !Ref VPC

  BastionAllowInboundSSHFromInternet:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref BastionSecurityGroup
      IpProtocol: tcp
      FromPort: 22
      ToPort: 22
      CidrIp: 0.0.0.0/0

  BastionInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: !Ref EC2AMI
      InstanceType: !Ref EC2InstanceType
      KeyName: !Ref EC2KeyPairName
      SubnetId: !Ref PublicSubnet
      SecurityGroupIds:
        - !Ref BastionSecurityGroup
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-bastion'
        - Key: Project
          Value: !Ref ProjectName

  DatabaseSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Sub '${AWS::StackName}-core-database-sg'
      GroupDescription: !Sub 'Security group for ${AWS::StackName} core database'
      VpcId: !Ref VPC

  DatabaseAllowInboundFromBastion:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref DatabaseSecurityGroup
      IpProtocol: tcp
      FromPort: 3306
      ToPort: 3306
      SourceSecurityGroupId:
        Fn::GetAtt:
          - BastionSecurityGroup
          - GroupId

  DBSubnetGroup:
    Type: 'AWS::RDS::DBSubnetGroup'
    Properties:
      DBSubnetGroupDescription: !Sub 'Subnet group for ${AWS::StackName}-core-db DB Cluster'
      SubnetIds:
        - !Ref PrivateSubnet1
        - !Ref PrivateSubnet2
      Tags:
        - Key: Project
          Value: !Ref ProjectName

  RDSDBCluster:
    Type: 'AWS::RDS::DBCluster'
    Properties:
      Engine: aurora-mysql
      DBClusterIdentifier: !Sub '${AWS::StackName}-core-db'
      DBSubnetGroupName: !Ref DBSubnetGroup
      NetworkType: IPV4
      VpcSecurityGroupIds:
        - !Ref DatabaseSecurityGroup
      AvailabilityZones:
        - !Select [0, !GetAZs '']
      EngineVersion: !Ref EngineVersion
      MasterUsername: !Ref MasterUsername
      MasterUserPassword: !Ref MasterUserPassword
      ServerlessV2ScalingConfiguration:
        MinCapacity: !Ref MinCapacity
        MaxCapacity: !Ref MaxCapacity

  RDSDBInstance:
    Type: 'AWS::RDS::DBInstance'
    Properties:
      Engine: aurora-mysql
      DBInstanceClass: db.serverless
      DBClusterIdentifier: !Ref RDSDBCluster

Infrastructure Management with Terraform

Last week, my friend working in the field of infrastructure management gave me an overview of Infrastructure as Code (IaC).

He came across a tool called Terraform, which can automate the deployment and management of cloud resources. Hence, together we researched ways to build a simple demo to demonstrate how Terraform can help with cloud infrastructure management.

We decided to start from a simple AWS cloud architecture as demonstrated below.

As illustrated in the diagram, we have a bastion server and an admin server.

A bastion server, aka a jump host, is a server that sits between the internal network of a company and an external network, such as the Internet. Its purpose is to provide an additional layer of security by limiting the number of entry points to the internal network and allowing for strict access controls and monitoring.

An admin server, on the other hand, is a server used by system admins to manage the cloud resources. Hence the admin server typically includes tools for managing cloud resources, monitoring system performance, deploying apps, and configuring security settings. It’s generally recommended to place an admin server in a private subnet to enhance security and reduce the attack surface of our cloud infrastructure.

In combination, the two servers help to ensure that the cloud infrastructure is secure, well-managed, and highly available.

Show Me the Code!

The complete source code of this project can be found at https://github.com/goh-chunlin/terraform-bastion-and-admin-servers-on-aws.

Infrastructure as Code (IaC)

As we can see in the architecture diagram above, the cloud resources are all on AWS. We can set them up by creating the resources one by one through the AWS Console. However, doing it manually is inefficient and hard to repeat. In fact, several other problems arise from setting things up manually in the AWS Console.

  • Manual cloud resource setup is more prone to human error and takes relatively longer;
  • It is difficult to identify which cloud resources are in use;
  • It is difficult to track modifications to the infrastructure;
  • It places a burden on infrastructure setup and configuration;
  • Redundant work is inevitable across the various development environments;
  • Only the person in charge (PIC) of the infrastructure can set it up.

A concept known as IaC is thus introduced to solve these problems.

IaC is a way to manage our infrastructure through code in configuration files instead of through manual processes. It is thus a key DevOps practice and a component of continuous delivery.

Based on the architecture diagram, the services and resources necessary for configuring with IaC can be categorised into three parts, i.e. Virtual Private Cloud (VPC), Key Pair, and Elastic Compute Cloud (EC2).

The resources necessary to be created.

There are currently many IaC tools available. They fall into two major groups, i.e. those using a declarative language and those using an imperative language. Terraform is one of them, and it uses HashiCorp Configuration Language (HCL), a declarative language.

The workflow for infrastructure provisioning using Terraform can be summarised as shown in the following diagram.

The basic process of Terraform. (Credit: HashiCorp Developer)

We first write the HCL code. Terraform then validates the code and applies it to the infrastructure if validation passes. Since Terraform uses a declarative language, it can usually identify resource dependencies itself, without us needing to specify them manually.

After terraform apply executes successfully, we can check the list of applied infrastructure through the command terraform state list. We can also check the output variables we defined through the command terraform output.
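The typical command sequence for this workflow therefore looks like the following.

terraform init        # download providers and configure the backend
terraform plan        # preview the changes before applying them
terraform apply       # create or update the resources
terraform state list  # list the resources Terraform is managing
terraform output      # show the output variables we defined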

When the command terraform apply is executed, a state file called terraform.tfstate will be automatically created.

After understanding the basic process of Terraform, we proceed to write the HCL for different modules of the infrastructure.

Terraform

The components of a Terraform code written with the HCL are as follows.

Terraform code.

In Terraform, there are three files, i.e. main.tf, variables.tf, and outputs.tf, recommended for a minimal module, even if they are empty. The file main.tf should be the primary entry point. The other two files, variables.tf and outputs.tf, should contain the declarations for variables and outputs, respectively.

For variables, we have the vars.tf file, which defines the necessary variables, and the terraform.tfvars file, which allocates values to the defined variables.
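As a sketch of how the two files relate, a variable such as resource_prefix (used later in this article) would be declared in vars.tf and assigned a value in terraform.tfvars; the value my-demo below is just an example.

# vars.tf
variable "resource_prefix" {
  type        = string
  description = "Prefix applied to the names of all resources"
}

# terraform.tfvars
resource_prefix = "my-demo"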

In the diagram above, we also see that there is a terraform block. It declares Terraform settings such as the required version and the backend where the state file is stored. For example, we use the following HCL code to set the Terraform version to use and also specify the location for storing the state file generated by Terraform.

terraform {
  backend "s3" {
    bucket  = "my-terraform-01"
    key     = "test/terraform.tfstate"
    region  = "ap-southeast-1"
  }
  required_version = ">=1.1.3"
}

Terraform uses a state file to map real world resources to our configuration, keep track of metadata, and to improve performance for large infrastructures. The state is stored by default in a file named “terraform.tfstate”.

This is a S3 bucket we use for storing our Terraform state file.

The reason why we keep our terraform.tfstate file in the cloud, i.e. the S3 bucket, is that state is a necessary requirement for Terraform to function, so we must make sure it is stored in a centralised repository which cannot be easily deleted. Doing this is also good for everyone in the team, because they will all work with the same state, so operations will be applied to the same remote objects.

Finally, we have a provider block which declares the cloud environment or provider to be managed with Terraform, as shown below. Here, we will be creating our resources in the AWS Singapore region.

provider "aws" {
  region = "ap-southeast-1"
}

Module 1: VPC

Firstly, in Terraform, we will have a VPC module created with resources listed below.

1.1 VPC

resource "aws_vpc" "my_simple_vpc" {
  cidr_block = "10.2.0.0/16"

  tags = {
    Name = "${var.resource_prefix}-my-vpc",
  }
}

The resource_prefix is a string to make sure that all the resources created with Terraform get the same prefix. If your organisation has different naming rules, feel free to change the format accordingly.

1.2 Subnets

The public subnet for the bastion server is defined as follows. The private IP of the bastion server will be in the format of 10.2.10.X. We also set map_public_ip_on_launch to true so that instances launched into the subnet are assigned a public IP address.

resource "aws_subnet" "public" {
  count                   = 1
  vpc_id                  = aws_vpc.my_simple_vpc.id
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  cidr_block              = "10.2.1${count.index}.0/24"
  map_public_ip_on_launch = true

  tags = tomap({
    Name = "${var.resource_prefix}-public-subnet${count.index + 1}",
  })
}

The private subnet for the admin server is defined as follows. The admin server will then have a private IP in the format of 10.2.20.X.

resource "aws_subnet" "private" {
  count                   = 1
  vpc_id                  = aws_vpc.my_simple_vpc.id
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  cidr_block              = "10.2.2${count.index}.0/24"
  map_public_ip_on_launch = false

  tags = tomap({
    Name = "${var.resource_prefix}-private-subnet${count.index + 1}",
  })
}

The aws_availability_zones data source is part of the AWS provider and retrieves a list of availability zones based on the arguments supplied. Here, we place the public subnet and the private subnet in the same (first) availability zone.
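For reference, the data source itself can be declared in the module with a single block like this:

```hcl
data "aws_availability_zones" "available" {
  state = "available" # only return zones that are currently available
}
```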

1.3 Internet Gateway

Normally, if we create an internet gateway via the AWS Console, for example, we may forget to associate it with the VPC. With Terraform, we can do the association in code and thus reduce the chance of setting up the internet gateway wrongly.

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.my_simple_vpc.id

  tags = {
    Name = "${var.resource_prefix}-igw"
  }
}

1.4 NAT Gateway

Even though Terraform is a declarative language, i.e. a language describing an intended goal rather than the steps to reach that goal, we can use the depends_on meta-argument to handle hidden resource or module dependencies that Terraform cannot automatically infer.

resource "aws_nat_gateway" "nat_gateway" {
  allocation_id = aws_eip.nat.id
  subnet_id     = element(aws_subnet.public.*.id, 0)
  depends_on    = [aws_internet_gateway.igw]

  tags = {
    Name = "${var.resource_prefix}-nat-gw"
  }
}

1.5 Elastic IP (EIP)

As you may have noticed, in the NAT gateway definition above, we assigned a public IP to it using an EIP. Since Terraform is declarative, the ordering of blocks is generally not significant, so we can define the EIP after the NAT gateway.

resource "aws_eip" "nat" {
  vpc        = true # in AWS provider v5 and later, use: domain = "vpc"
  depends_on = [aws_internet_gateway.igw]

  tags = {
    Name = "${var.resource_prefix}-NAT"
  }
}

1.6 Route Tables

Finally, we just need to link the resources above with both public and private route tables, as defined below.

resource "aws_route_table" "public_route" {
  vpc_id = aws_vpc.my_simple_vpc.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }

  tags = {
    Name = "${var.resource_prefix}-public-route"
  }
}

resource "aws_route_table_association" "public_route" {
  count          = 1
  subnet_id      = aws_subnet.public.*.id[count.index]
  route_table_id = aws_route_table.public_route.id
}

resource "aws_route_table" "private_route" {
  vpc_id = aws_vpc.my_simple_vpc.id
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.nat_gateway.id
  }

  tags = {
    Name = "${var.resource_prefix}-private-route",
  }
}

resource "aws_route_table_association" "private_route" {
  count          = 1
  subnet_id      = aws_subnet.private.*.id[count.index]
  route_table_id = aws_route_table.private_route.id
}
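The VPC module also needs to expose some outputs so that the main configuration can reference them later. A sketch matching the references used in main.tf:

```hcl
output "vpc_id" {
  value = aws_vpc.my_simple_vpc.id
}

output "public_subnet_ids" {
  value = aws_subnet.public.*.id
}

output "private_subnet_ids" {
  value = aws_subnet.private.*.id
}
```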

That’s all we need to set up the VPC on AWS, as illustrated in the diagram.

Module 2: Key Pair

Before we move on to create the two instances, we need to define a key pair. A key pair is a set of security credentials that we use to prove our identity when connecting to an EC2 instance. Hence, we need to ensure that we have access to the selected key pair before we launch the instances.

If we are doing this on the AWS Console, we will be seeing this part on the console as shown below.

The GUI on the AWS Console to create a new key pair.

So, we can use the same info to define the key pair.

resource "tls_private_key" "instance_key" {
  algorithm = "RSA"
}

resource "aws_key_pair" "generated_key" {
  key_name   = var.keypair_name
  public_key = tls_private_key.instance_key.public_key_openssh
}

resource "local_file" "key" {
  content         = tls_private_key.instance_key.private_key_pem
  filename        = "${var.keypair_name}.pem"
  file_permission = "0400"
}

The tls_private_key resource creates a PEM (and OpenSSH) formatted private key. This is not recommended for production, because it generates the private key file and keeps it unencrypted in the directory where we run the Terraform commands (and in the Terraform state). Instead, we should generate the private key outside of Terraform and distribute it securely to the system where Terraform will be run.

Module 3: EC2

Once we have the key pair, we can finally move on to define how the bastion and admin servers can be created. We can define a module for the servers as follows.

resource "aws_instance" "linux_server" {
  ami                         = var.ami
  instance_type               = var.instance_type
  subnet_id                   = var.subnet_id
  associate_public_ip_address = var.is_in_public_subnet
  key_name                    = var.key_name
  vpc_security_group_ids      = [ aws_security_group.linux_server_security_group.id ]
  tags = {
    Name = var.server_name
  }
  user_data = var.user_data
}

resource "aws_security_group" "linux_server_security_group" {
  name        = var.security_group.name
  description = var.security_group.description
  vpc_id      = var.vpc_id

  ingress {
    description = "SSH inbound"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description = "Allow all egress"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = var.security_group.name
  }
}

By default, AWS creates an ALLOW ALL egress rule when creating a new security group inside a VPC. However, Terraform removes this default rule and requires us to re-create it explicitly if we want it. That is why we include the protocol = "-1" egress block above.

The output.tf of EC2 instance module is defined as follows.

output "instance_public_ip" {
  description = "Public IP address of the EC2 instance."
  value       = aws_instance.linux_server.public_ip
}

output "instance_private_ip" {
  description = "Private IP address of the EC2 instance in the VPC."
  value       = aws_instance.linux_server.private_ip
}

With this definition, once the Terraform workflow is completed, the public IP of our bastion server and the private IP of our admin server will be displayed. We can then easily use these two IPs to connect to the servers.

Main Configuration

With all the above modules, we can finally define our AWS infrastructure using the following main.tf.

module "vpc" {
  source          = "./vpc_module"
  resource_prefix = var.resource_prefix
}
  
module "keypair" {
  source              = "./keypair_module"
  keypair_name        = "my_keypair"
}
    
module "ec2_bastion" {
  source              = "./ec2_module"
  ami                 = "ami-062550af7b9fa7d05"       # Ubuntu 20.04 LTS (HVM), SSD Volume Type
  instance_type       = "t2.micro"
  server_name         = "bastion_server"
  subnet_id           = module.vpc.public_subnet_ids[0]
  is_in_public_subnet = true
  key_name            = module.keypair.key_name
  security_group      = {
    name        = "bastion_sg"
    description = "This firewall allows SSH"
  }
  vpc_id              = module.vpc.vpc_id
}
    
module "ec2_admin" {
  source              = "./ec2_module"
  ami                 = "ami-062550af7b9fa7d05"       # Ubuntu 20.04 LTS (HVM), SSD Volume Type
  instance_type       = "t2.micro"
  server_name         = "admin_server"
  subnet_id           = module.vpc.private_subnet_ids[0]
  is_in_public_subnet = false
  key_name            = module.keypair.key_name
  security_group      = {
    name        = "admin_sg"
    description = "This firewall allows SSH"
  }
  user_data           = file("admin_server_init.sh")
  vpc_id              = module.vpc.vpc_id
  depends_on          = [module.vpc]
}

Here, we will pre-install the AWS Command Line Interface (AWS CLI) in the admin server. Hence, we have the following script in the admin_server_init.sh file. The script will be run when the admin server is launched.

#!/bin/bash
sudo apt-get update
sudo apt-get install -y awscli

However, since the script above downloads the AWS CLI from the Internet, we need to make sure that the routing from the private subnet to the Internet via the NAT gateway is already in place. Instead of using the depends_on meta-argument directly on the module, which has side effects (it makes every resource in the module wait on the whole dependency), we use the recommended approach, i.e. expression references.

Expression references let Terraform understand which value the reference derives from and avoid planning changes if that particular value hasn’t changed, even if other parts of the upstream object have planned changes.

Thus, I made the change accordingly with expression references. In the change, I forced the description of the security group which the admin server depends on to use the private route table association ID returned from the VPC module. Doing so makes sure that the admin server is created only after the private route table is set up properly.

With expression references, we force the admin server to be created later than the bastion server.
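As a sketch of the idea (the output name here is an assumption), the VPC module exposes the association ID, and main.tf threads it into a value that the admin server's security group actually uses:

```hcl
# In the VPC module: expose the private route table association ID.
output "private_route_table_association_id" {
  value = aws_route_table_association.private_route.id
}

# In main.tf, inside module "ec2_admin", reference that output so Terraform
# infers the dependency and creates the admin server only after the
# association exists:
#
#   security_group = {
#     name        = "admin_sg"
#     description = "SSH only (route assoc: ${module.vpc.private_route_table_association_id})"
#   }
```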

If we don’t force the admin server to be created after the private route table is completed, the script may fail, and we can find the error logs at /var/log/cloud-init-output.log on the admin server. In addition, remember that even though terraform apply runs without any error, it does not mean the user_data script ran successfully. This is because Terraform knows nothing about the status of the user_data script.

We can find the error in the log file cloud-init-output.log in the admin server.

Demo

With the Terraform files ready, we can now go through the Terraform workflow using its commands.

Before we begin, besides installing Terraform, since we will deploy the infrastructure on AWS, we also need to configure the AWS CLI on the machine where we will run the Terraform commands, using the following command.

aws configure

Only once that is done can we move on to the following steps.

Firstly, we need to download the plugins necessary for the defined provider, backend, etc.

Initialising Terraform with the command terraform init.

After it is successful, there will be a message saying “Terraform has been successfully initialized!” A hidden .terraform directory, which Terraform uses to manage cached provider plugins and modules, will be automatically created.

Only after initialisation is complete can we execute other commands, like terraform plan.

Result of executing the command terraform plan.

After running the command terraform plan, as shown in the screenshot above, we know that there are in total 17 resources to be added, and two outputs, i.e. the IPs of the two servers, will be generated.

Apply is successfully completed. All 17 resources added to AWS.

We can also run the command terraform output to get the two IPs. Meanwhile, we can find the my_keypair.pem file generated by the tls_private_key we defined earlier.

The PEM file is generated by Terraform.

Now, if we check the resources, such as the two EC2 instances, on AWS Console, we should see they are all there up and running.

The bastion server and admin server are created automatically with Terraform.

Now, let’s see if we can access the admin server via the bastion server using the private key. In fact, there is no problem accessing it, and we can also see that the AWS CLI is already installed properly, as shown in the screenshot below.
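For reference, the hop can be done in a single command with SSH's ProxyJump option (the IPs below are placeholders to be taken from terraform output; the ubuntu user matches the Ubuntu AMI used above):

```shell
# Restrict the key's permissions, then jump through the bastion to the admin server.
chmod 400 my_keypair.pem
ssh -i my_keypair.pem -J ubuntu@<bastion-public-ip> ubuntu@<admin-private-ip>
```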

With the success of user_data script, we can use AWS CLI on the admin server.

Deleting the Cloud Resources

To delete what we have just set up using the Terraform code, we simply run the command terraform destroy. The complete deletion of the cloud resources finishes within 3 minutes. This is definitely far more efficient than doing it manually on the AWS Console.

All the 17 resources have been deleted successfully and the private key file is deleted too.

Conclusion

That is all for what my friend and I researched.

If you are interested, feel free to check out our Terraform source code at https://github.com/goh-chunlin/terraform-bastion-and-admin-servers-on-aws.

Long Weekend Activity #3: Moving to the Cloud above Amazon

One day before the end of my long weekend, I decided to learn how to set up a Windows Server 2012 instance on Amazon EC2. I also noted down the setup steps for my future reference.

After signing up at the Amazon Web Services website, I visited the EC2 Dashboard from the AWS Management Console. Since I’d like to set up an instance in Singapore, I had to choose the region from the drop-down list at the top-right corner of the website.

Choosing region for the instance.

After the region was chosen, I clicked on the blue “Launch Instance” button located at the middle of the web page to launch my first virtual server on EC2. Normally I chose the Classic Wizard so that some configurations could be changed before the setup.

Create a new instance.

The following step was choosing an Amazon Machine Image (AMI). Somehow, the Root Device Size was shown as 0 GB, and I had no idea why. Since I only wanted to try out AWS, I chose the one with the Free Usage Tier, i.e. the Microsoft Windows Server 2012 Base.

Choose an AMI.

In the following steps, there were options to set the number of instances required, the instance type (set to Micro to enjoy the free usage tier), subnet, network interfaces, etc. After all these, there was a section to set the root volume size. By default, it was 0 GB, so the instance wouldn’t launch if the value was left at the default. I set it to 35 GB.

Set the volume size of the root to be 35GB.

After providing the instance details, the next step was creating a key pair, which would be used to decrypt the RDP password at a later stage. Thus, the key pair needed to be downloaded and stored safely on my computer.

Create a key pair.

There was also another section to set which ports would be open/blocked on the instance.

Set up security group to determine whether a network port is open or blocked on the instance.

Finally, after reviewing all the details, I just clicked on the “Launch” button to launch the instance.

Review the information provided earlier before the launch of the instance.

Right after the button was clicked, a new record was added to the Instances table, and its State immediately changed to “running”.

The new instance is successfully added.

By right-clicking on the instance and choosing the item “Get Windows Password”, I received the default Windows Administrator password which would be used to access the instance remotely via RDP.
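Nowadays, the same password retrieval can also be scripted with the AWS CLI (the instance ID and key path below are placeholders):

```shell
# Fetch the encrypted password data and decrypt it locally with the key pair.
aws ec2 get-password-data \
    --instance-id i-0123456789abcdef0 \
    --priv-launch-key my_keypair.pem
```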

Retrieve the Windows Administrator password.

Yup, finally I can start playing with Windows Server 2012. =D

Yesh, successfully access the new Windows Server 2012!