A Complete Guide on How to Autoscale Your ECS EC2 Based Application with Auto Scaling Groups Using CDK

Apr 5, 2024

Amazon Web Services (AWS) offers a variety of tools for building scalable applications, and Amazon ECS on EC2 combined with EC2 Auto Scaling is a powerful way to scale container workloads. In this guide, we’ll explore how to autoscale ECS applications that use the EC2 launch type, covering the configuration of Auto Scaling Groups and introducing warm pools for faster scale-out. We’ll use the AWS Cloud Development Kit (CDK) to define the infrastructure as code, and the guide is written for readers of all levels.

What is ECS EC2 Auto Scaling?

Auto Scaling for ECS on EC2 automatically adjusts the number of EC2 instances in your cluster based on demand. It ensures that your application has the right amount of capacity to handle the load at any given time.

Types of Scaling Policies

In ECS with EC2 Auto Scaling, there are primarily two types of scaling policy:

1. Target Tracking Scaling

This method adjusts the number of instances automatically based on a specified target for a specific metric, like CPU utilization or the number of requests per minute.

2. Step Scaling

This method adjusts the number of instances in steps, defined by your specified policies, in response to CloudWatch metrics. You can define different steps for different metric levels.
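The difference between the two policy types can be sketched in plain TypeScript. This is a simplified illustration, not AWS's actual algorithm: the function names, the proportional formula, and the example step boundaries are all assumptions chosen for clarity.

```typescript
// Hypothetical, simplified models of the two policy types (not AWS's real implementation).

/** Keep a result within the group's minimum and maximum capacity. */
function clampCapacity(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

/** Target tracking: resize in proportion to how far the metric is from its target. */
function targetTrackingCapacity(
  current: number, metric: number, target: number, min: number, max: number,
): number {
  const proposed = Math.ceil(current * (metric / target));
  return clampCapacity(proposed, min, max);
}

/** One step: when lower <= metric < upper, change capacity by `change` instances. */
interface ScalingStep {
  lower: number;
  upper: number;
  change: number;
}

/** Step scaling: find the step whose range contains the metric and apply its change. */
function stepScalingCapacity(
  current: number, metric: number, steps: ScalingStep[], min: number, max: number,
): number {
  const step = steps.find((s) => metric >= s.lower && metric < s.upper);
  const proposed = current + (step ? step.change : 0);
  return clampCapacity(proposed, min, max);
}

// Example step policy on CPU utilization (values are illustrative assumptions).
const exampleSteps: ScalingStep[] = [
  { lower: 0, upper: 30, change: -1 },
  { lower: 30, upper: 60, change: 0 },
  { lower: 60, upper: 80, change: 1 },
  { lower: 80, upper: Infinity, change: 3 },
];
```

With target tracking you only choose the target value, while step scaling makes you spell out how aggressively to react at each metric level.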

Configuring Warm Pools for Efficient Scaling

Warm pools in Auto Scaling Groups allow you to maintain a pool of pre-initialized instances that can quickly be brought into service. This significantly reduces the time to scale out and can improve the performance of your application under varying loads.

In the code below, we use the CDK to define an AutoScalingGroup and add a WarmPool to it:

const autoScalingGroup = new autoscaling.AutoScalingGroup(stack, 'EcsEc2Capacity', {
  // Configuration parameters
});

// Adding a Warm Pool
const warmPool = new autoscaling.WarmPool(stack, 'WarmPool', {
  autoScalingGroup: autoScalingGroup,
  maxGroupPreparedCapacity: 2,
  minSize: 1,
  poolState: autoscaling.PoolState.HIBERNATED,
  reuseOnScaleIn: false,
});

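Here, the WarmPool is configured with a maximum prepared capacity of 2 and a minimum size of 1. The poolState is set to HIBERNATED, which means warm-pool instances have their memory contents saved to the EBS root volume before being stopped, so they can resume where they left off instead of booting from scratch. Hibernation only works with instance types and AMIs that support it, and it requires an encrypted root volume large enough to hold the instance’s memory; PoolState.STOPPED and PoolState.RUNNING are the other options. With reuseOnScaleIn set to false, instances removed during scale-in are terminated rather than returned to the warm pool.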
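To see how these settings interact, here is a rough sketch of the warm pool sizing rule as I understand it from the EC2 Auto Scaling documentation: the pool targets the prepared-capacity ceiling (or the group's maximum capacity when no ceiling is set) minus the desired capacity, and never drops below minSize. The interface and function names are hypothetical.

```typescript
// Hypothetical helper modelling warm pool sizing; names are illustrative only.
interface WarmPoolSettings {
  groupMaxCapacity: number;          // maxCapacity of the Auto Scaling Group
  desiredCapacity: number;           // instances currently wanted in service
  minSize: number;                   // minimum instances kept in the warm pool
  maxGroupPreparedCapacity?: number; // optional ceiling on in-service + warm instances
}

function warmPoolTargetSize(settings: WarmPoolSettings): number {
  // Without an explicit ceiling, the group's maximum capacity is used.
  const ceiling = settings.maxGroupPreparedCapacity ?? settings.groupMaxCapacity;
  // The pool never shrinks below minSize and never goes negative.
  return Math.max(settings.minSize, ceiling - settings.desiredCapacity, 0);
}

// Values from this guide: ceiling of 2, desired capacity of 1, minSize of 1.
const guideWarmPoolSize = warmPoolTargetSize({
  groupMaxCapacity: 2,
  desiredCapacity: 1,
  minSize: 1,
  maxGroupPreparedCapacity: 2,
});
```

Under these assumptions the configuration in this guide keeps one hibernated instance ready, and minSize keeps that one instance around even when the group is already at its desired capacity of 2.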

Integrating with ECS

To make the Auto Scaling Group work with ECS, we use AsgCapacityProvider. This integrates the auto scaling group with ECS, allowing ECS tasks to be scheduled on the EC2 instances managed by the auto scaling group:

const capacityProvider = new ecs.AsgCapacityProvider(stack, 'AsgCapacityProvider', {
  autoScalingGroup,
  enableManagedTerminationProtection: true,
  enableManagedScaling: true,
});
Ec2cluster.addAsgCapacityProvider(capacityProvider);

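The enableManagedScaling parameter lets ECS manage the size of the Auto Scaling Group: ECS attaches a target tracking scaling policy to the group and adjusts it so that enough instances are available to meet the demands of your tasks. The enableManagedTerminationProtection parameter prevents instances that are running tasks from being terminated during scale-in; it requires scale-in protection on the group, which the full listing below enables with protectNewInstancesFromScaleIn().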
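Managed scaling is driven by a metric called CapacityProviderReservation, which AWS documents as the number of instances needed divided by the number of instances running, times 100. The sketch below models that ratio under stated assumptions: the function names are hypothetical, the target defaults to 100 percent, and the special cases ECS applies when no instances are running are deliberately left out.

```typescript
// Simplified model of the CapacityProviderReservation metric (illustrative only).
// Assumes at least one running instance; zero-instance cases are not modelled.
function capacityProviderReservation(instancesNeeded: number, instancesRunning: number): number {
  if (instancesRunning <= 0) {
    throw new Error("this sketch only covers groups with running instances");
  }
  return (instancesNeeded / instancesRunning) * 100;
}

type ScalingDirection = "scale-out" | "scale-in" | "steady";

// Hypothetical helper: compare the metric against the target capacity percentage.
function scalingDirection(reservation: number, targetCapacityPercent = 100): ScalingDirection {
  if (reservation > targetCapacityPercent) return "scale-out";
  if (reservation < targetCapacityPercent) return "scale-in";
  return "steady";
}
```

For example, if pending tasks need 3 instances but only 2 are running, the metric is 150, above the target of 100, so the group scales out.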

Here is the complete code:

import * as cdk from "aws-cdk-lib";
import * as ec2 from "aws-cdk-lib/aws-ec2";
import * as ecs from "aws-cdk-lib/aws-ecs";
import * as autoscaling from "aws-cdk-lib/aws-autoscaling";
import { Construct } from "constructs";
import { VpcConstruct } from "../infra-stack/vpc-const";


export interface StackProps extends cdk.StackProps {
  vpc: VpcConstruct;
  Alb: {
    alb: cdk.aws_elasticloadbalancingv2.ApplicationLoadBalancer;
    albSg: cdk.aws_ec2.SecurityGroup;
    httpslistener: cdk.aws_elasticloadbalancingv2.ApplicationListener;
  };
}

export class Stack extends cdk.Stack {
  constructor(scope: Construct, id: string, props: StackProps) {
    super(scope, id, props);

    // createCluster provisions the resources as a side effect and returns nothing
    createCluster(this, props);
  }
}
function createCluster(stack: Stack, props: StackProps) {
  const ecsSG = new ec2.SecurityGroup(stack, 'SecurityGroupEcsEc2', {
    vpc: props.vpc.vpc,
    allowAllOutbound: true,
  });
  const Ec2cluster = new ecs.Cluster(stack, 'EcsEc2Cluster', {
    vpc: props.vpc.vpc,
    containerInsights: true,
  });
  const autoScalingGroup = new autoscaling.AutoScalingGroup(stack, 'EcsEc2Capacity', {
    vpc: props.vpc.vpc,
    instanceType: new ec2.InstanceType('m6i.xlarge'),
    machineImage: ecs.EcsOptimizedImage.amazonLinux2(),
    // Use the VPC from props (StackProps defines vpc, not glvpc)
    vpcSubnets: props.vpc.vpc.selectSubnets({
      subnetGroupName: 'private-',
    }),
    minCapacity: 1,
    maxCapacity: 2,
    desiredCapacity: 1,
    blockDevices: [
      {
        deviceName: '/dev/xvda',
        volume: autoscaling.BlockDeviceVolume.ebs(100, {
          encrypted: true,
          volumeType: autoscaling.EbsDeviceVolumeType.GP3,
        }),
      },
    ],
  });
  autoScalingGroup.protectNewInstancesFromScaleIn();
  autoScalingGroup.connections.addSecurityGroup(ecsSG);

  const warmPool = new autoscaling.WarmPool(stack, 'WarmPool', {
    autoScalingGroup: autoScalingGroup,
    maxGroupPreparedCapacity: 2,
    minSize: 1,
    poolState: autoscaling.PoolState.HIBERNATED,
    reuseOnScaleIn: false,
  });
  const capacityProvider = new ecs.AsgCapacityProvider(stack, 'AsgCapacityProvider', {
    autoScalingGroup,
    enableManagedTerminationProtection: true,
    enableManagedScaling: true,
  });
  Ec2cluster.addAsgCapacityProvider(capacityProvider);
}
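One detail the listing above does not show: the AWS documentation on warm pools with ECS describes an agent setting, ECS_WARM_POOLS_CHECK, that stops instances from registering with the cluster while they are still in the warm pool. A minimal sketch of setting it through user data, assuming the Amazon Linux 2 ECS-optimized AMI used above; the helper name enableWarmPoolCheck is hypothetical.

```typescript
import * as autoscaling from "aws-cdk-lib/aws-autoscaling";

// Hypothetical helper: tells the ECS agent to check whether the instance is
// still in a warm pool before registering it with the cluster.
export function enableWarmPoolCheck(asg: autoscaling.AutoScalingGroup): void {
  asg.addUserData("echo ECS_WARM_POOLS_CHECK=true >> /etc/ecs/ecs.config");
}
```

You would call enableWarmPoolCheck(autoScalingGroup) inside createCluster, after the Auto Scaling Group is created. Verify the behavior in a test environment, since hibernated instances and agent versions may behave differently from this sketch.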

Conclusion

By following this guide, you’ve learned how to autoscale your ECS EC2-based application effectively using Auto Scaling Groups and Warm Pools with CDK. This setup not only optimizes resource use and costs but also ensures that your application remains responsive under varying loads. Whether you’re deploying new applications or looking to improve existing ones, these strategies can help you build a highly available and scalable system on AWS.

For more detailed, step-by-step instructions covering the full capabilities of ECS, EC2, and CDK, see the AWS documentation and the CDK API reference. Happy scaling!

Note: Always test your infrastructure as code changes in a safe environment before deploying to production.