A Complete Guide on How to Autoscale Your ECS-Based Application Using CDK

13 min read · Feb 15, 2023

Amazon Elastic Container Service (ECS) is a highly scalable, fully managed container orchestration service that makes it easy to run, stop, and manage Docker containers on a cluster. ECS Service Auto Scaling, which is built on Application Auto Scaling, dynamically adjusts the number of tasks in your ECS service based on metrics such as CPU utilization or request count, so that your application can handle varying levels of traffic without manual intervention. In this blog, we will discuss how to implement ECS Service Auto Scaling using the AWS Cloud Development Kit (CDK) and explore the different types of scaling with examples.

AWS ECS Auto Scaling with CDK

To implement AWS ECS Auto Scaling using CDK, you will need to define the following components:

ECS Cluster: The cluster is a logical grouping of tasks and services, together with the capacity (container instances or Fargate) on which they run.
ECS Service: The service runs and maintains a specified number of tasks created from a task definition. The task definition specifies the container image and the CPU and memory requirements.
Target Tracking Scaling Policy: The scaling policy is an Application Auto Scaling policy that adjusts the desired task count of your service based on a target metric. For example, you can set a target CPU utilization of 70%, and AWS will adjust the number of tasks running in the service to maintain that target.
Step Scaling Policy: The step scaling policy is an Application Auto Scaling policy that adjusts the desired task count of your service based on a series of scaling steps. For example, you can set a threshold for CPU utilization at which the number of tasks running in the service increases by a specified increment.
Scheduled Scaling Policy: Scheduled scaling adjusts the capacity limits of your service based on a schedule. For example, you can set a schedule to increase the number of tasks running in the service during peak traffic hours and decrease the number of tasks during off-peak hours.

Let’s explore these different types of scaling in more detail with examples.

Target Tracking Scaling Policy

Target tracking scaling is a method of scaling in which you set a target value for a specific metric, such as CPU utilization or request count, and AWS Auto Scaling adjusts the desired capacity of your service to maintain that target. With target tracking scaling, you don’t have to manually adjust the number of tasks running in your service; AWS does it for you based on the target metric.

To implement target tracking scaling for an ECS service, you need to define a ScalableTarget and a TargetTrackingScalingPolicy in your CDK code.

The ScalableTarget represents the ECS service that you want to scale. You need to specify the serviceNamespace, which in this case is ECS, the resourceId, which identifies the service in the form service/<cluster-name>/<service-name>, and the scalableDimension, which is ecs:service:DesiredCount. The minCapacity and maxCapacity parameters represent the minimum and maximum number of tasks that you want to run in the service.

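import * as cdk from 'aws-cdk-lib';
import * as autoscaling from 'aws-cdk-lib/aws-applicationautoscaling';

// Register the ECS service's desired task count as a scalable target.
const scalableTarget = new autoscaling.ScalableTarget(this, 'ScalableTarget', {
  serviceNamespace: autoscaling.ServiceNamespace.ECS,
  // ECS resource IDs take the form service/<cluster-name>/<service-name>.
  resourceId: `service/${cluster.clusterName}/${service.serviceName}`,
  scalableDimension: 'ecs:service:DesiredCount',
  minCapacity: 1,
  maxCapacity: 10,
});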
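The resourceId is easy to get wrong, because the cluster name has to come before the service name. As a minimal sketch, the format can be captured in a small helper. The function name ecsServiceResourceId and the example names are hypothetical; this is plain TypeScript and not part of the CDK.

```typescript
// Hypothetical helper (not a CDK API): builds the resource ID that
// Application Auto Scaling expects for an ECS service.
function ecsServiceResourceId(clusterName: string, serviceName: string): string {
  if (clusterName === "" || serviceName === "") {
    throw new Error("clusterName and serviceName must be non-empty");
  }
  // The cluster name comes first, then the service name.
  return `service/${clusterName}/${serviceName}`;
}

// Placeholder names, used only for illustration.
const exampleResourceId: string = ecsServiceResourceId("my-cluster", "my-service");
console.log(exampleResourceId); // service/my-cluster/my-service
```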

The TargetTrackingScalingPolicy represents the scaling policy that you want to use. You need to pass the scalable target through the scalingTarget property and specify the targetValue, which is the target value for the metric that you want to track. In this case, we're setting the target CPU utilization to 70%. You can also specify the scaleOutCooldown and scaleInCooldown parameters, which represent the amount of time that must pass after a scale-out or scale-in activity before another one of the same kind can start. Finally, you need to specify the predefinedMetric, which is the metric that you want to track. In this case, we're using the ECS_SERVICE_AVERAGE_CPU_UTILIZATION metric.

// Keep the service's average CPU utilization close to 70%.
const scalingPolicy = new autoscaling.TargetTrackingScalingPolicy(this, 'ScalingPolicy', {
  scalingTarget: scalableTarget,
  policyName: 'CPUUtilizationScalingPolicy',
  targetValue: 70,
  scaleOutCooldown: cdk.Duration.seconds(60),
  scaleInCooldown: cdk.Duration.seconds(60),
  predefinedMetric: autoscaling.PredefinedMetric.ECS_SERVICE_AVERAGE_CPU_UTILIZATION,
});

Instead of constructing the TargetTrackingScalingPolicy yourself, you can also call the scaleToTrackMetric method on the scalable target, which creates an equivalent policy and attaches it in one step. Use one approach or the other, not both.

scalableTarget.scaleToTrackMetric('Tracking', {
  targetValue: 70,
  predefinedMetric: autoscaling.PredefinedMetric.ECS_SERVICE_AVERAGE_CPU_UTILIZATION,
});

In this example, the predefinedMetric property takes a value of the PredefinedMetric enum. Because the scalable target already identifies the cluster and the service, we don't need to pass any metric dimensions.

With target tracking scaling, AWS Auto Scaling adjusts the desired capacity of your ECS service to maintain the target CPU utilization of 70%. If the CPU utilization exceeds the target, AWS Auto Scaling adds more tasks to the service; if the CPU utilization falls below the target, AWS Auto Scaling removes tasks from the service.
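To build some intuition for this behaviour, here is a rough sketch of a proportional calculation. AWS does not publish the exact algorithm, so treat this as an illustrative approximation only. The function name estimateDesiredCount and its parameters are assumptions made for this example.

```typescript
// Illustrative approximation only: not the algorithm AWS actually runs.
function estimateDesiredCount(
  currentTasks: number,
  metricValue: number,
  targetValue: number,
  minCapacity: number,
  maxCapacity: number,
): number {
  if (targetValue <= 0) {
    throw new Error("targetValue must be positive");
  }
  // Scale the task count in proportion to how far the metric is from the target.
  const proportional = Math.ceil(currentTasks * (metricValue / targetValue));
  // Never leave the bounds configured on the scalable target.
  return Math.min(maxCapacity, Math.max(minCapacity, proportional));
}

// 4 tasks at 90% CPU with a 70% target: roughly 6 tasks would be needed.
console.log(estimateDesiredCount(4, 90, 70, 1, 10));
```

In this sketch a metric above the target increases the task count, a metric below the target decreases it, and minCapacity and maxCapacity always have the last word.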

Target tracking scaling is a simple and effective way to scale your ECS service based on a specific metric. By setting a target value for a metric, you can ensure that your service is always running at the desired capacity, without the need for manual intervention.

In addition to CPU utilization, you can use target tracking scaling to scale your service based on a variety of other metrics, such as request count, network throughput, or memory utilization. Here are some examples of how you can use target tracking scaling to scale your ECS service based on different metrics:

Request Count Scaling: You can use request count as a metric to scale your service. For example, if you’re running a web application that experiences a high volume of traffic, you can set a target value for the request count and let AWS Auto Scaling adjust the number of tasks running in your service to maintain the target. To implement request count scaling, you can use the AWS/ApplicationELB namespace and the RequestCountPerTarget metric.

const scalableTarget = new autoscaling.ScalableTarget(this, 'ScalableTarget', {
  serviceNamespace: autoscaling.ServiceNamespace.ECS,
  resourceId: `service/${cluster.clusterName}/${service.serviceName}`,
  scalableDimension: 'ecs:service:DesiredCount',
  minCapacity: 1,
  maxCapacity: 10,
});

const scalingPolicy = new autoscaling.TargetTrackingScalingPolicy(this, 'ScalingPolicy', {
  scalingTarget: scalableTarget,
  policyName: 'RequestCountScalingPolicy',
  // Aim for about 1000 requests per target (task).
  targetValue: 1000,
  scaleOutCooldown: cdk.Duration.seconds(60),
  scaleInCooldown: cdk.Duration.seconds(60),
  predefinedMetric: autoscaling.PredefinedMetric.ALB_REQUEST_COUNT_PER_TARGET,
  // loadBalancerFullName already starts with "app/" and targetGroupFullName with "targetgroup/".
  resourceLabel: `${loadBalancer.loadBalancerFullName}/${targetGroup.targetGroupFullName}`,
});

// Shorthand alternative to constructing the policy yourself; use one or the other.
scalableTarget.scaleToTrackMetric('Tracking', {
  targetValue: 1000,
  predefinedMetric: autoscaling.PredefinedMetric.ALB_REQUEST_COUNT_PER_TARGET,
  resourceLabel: `${loadBalancer.loadBalancerFullName}/${targetGroup.targetGroupFullName}`,
});

In this example, we're using the ALB_REQUEST_COUNT_PER_TARGET predefined metric type, which corresponds to the RequestCountPerTarget metric. The resourceLabel tells Application Auto Scaling which load balancer and target group to count requests for; it combines the full name of the load balancer with the full name of the target group.
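The resource label for the request count metric has the form app/<load-balancer-name>/<load-balancer-id>/targetgroup/<target-group-name>/<target-group-id>. As a sketch of how that label relates to the two ARNs, the helper below derives it from concrete ARN strings. The function name albResourceLabel and the sample ARNs are made up for illustration; inside a CDK app you would normally use the loadBalancerFullName and targetGroupFullName properties instead, because ARNs are unresolved tokens at synthesis time.

```typescript
// Hypothetical helper (not a CDK API): derives the ALBRequestCountPerTarget
// resource label from a load balancer ARN and a target group ARN.
function albResourceLabel(loadBalancerArn: string, targetGroupArn: string): string {
  const lbMarker = ":loadbalancer/";
  const tgMarker = ":targetgroup/";
  const lbIndex = loadBalancerArn.indexOf(lbMarker);
  const tgIndex = targetGroupArn.indexOf(tgMarker);
  if (lbIndex === -1 || tgIndex === -1) {
    throw new Error("unexpected ARN format");
  }
  // Everything after "loadbalancer/": "app/<lb-name>/<lb-id>".
  const lbPart = loadBalancerArn.slice(lbIndex + lbMarker.length);
  // Everything from "targetgroup/" onward: "targetgroup/<tg-name>/<tg-id>".
  const tgPart = targetGroupArn.slice(tgIndex + 1);
  return `${lbPart}/${tgPart}`;
}

// Sample ARNs with a placeholder account ID and identifiers.
const sampleLabel: string = albResourceLabel(
  "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/my-alb/778d41231b141a0f",
  "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/943f017f100becff",
);
console.log(sampleLabel);
```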

Network Throughput Scaling: You can use network throughput as a metric to scale your service. For example, if your service handles a large volume of network traffic, you can set a target value for the network throughput and let AWS Auto Scaling adjust the number of tasks running in your service to maintain the target. There is no predefined network metric for ECS services, so you need to track a custom CloudWatch metric. The example below assumes that Container Insights is enabled on the cluster, which publishes the NetworkRxBytes and NetworkTxBytes metrics in the ECS/ContainerInsights namespace.

const scalableTarget = new autoscaling.ScalableTarget(this, 'ScalableTarget', {
  serviceNamespace: autoscaling.ServiceNamespace.ECS,
  resourceId: `service/${cluster.clusterName}/${service.serviceName}`,
  scalableDimension: 'ecs:service:DesiredCount',
  minCapacity: 1,
  maxCapacity: 10,
});

const scalingPolicy = new autoscaling.TargetTrackingScalingPolicy(this, 'ScalingPolicy', {
  scalingTarget: scalableTarget,
  policyName: 'NetworkThroughputScalingPolicy',
  targetValue: 100000000,
  scaleOutCooldown: cdk.Duration.seconds(60),
  scaleInCooldown: cdk.Duration.seconds(60),
  // Custom metric; assumes Container Insights is enabled on the cluster.
  customMetric: new cloudwatch.Metric({
    namespace: 'ECS/ContainerInsights',
    metricName: 'NetworkRxBytes',
    dimensionsMap: { ClusterName: cluster.clusterName, ServiceName: service.serviceName },
    statistic: 'Average',
  }),
});

In this example, we're using the customMetric property instead of a predefined metric type, with cloudwatch referring to the aws-cdk-lib/aws-cloudwatch module. We're passing in the cluster.clusterName and the service.serviceName as metric dimensions to specify the cluster and the service that we want to monitor.

Memory Utilization Scaling: You can use memory utilization as a metric to scale your service. For example, if your service requires a lot of memory to run, you can set a target value for the memory utilization and let AWS Auto Scaling adjust the number of tasks running in your service to maintain the target. To implement memory utilization scaling, you can use the AWS/ECS namespace and the MemoryUtilization metric.

const scalableTarget = new autoscaling.ScalableTarget(this, 'ScalableTarget', {
  serviceNamespace: autoscaling.ServiceNamespace.ECS,
  resourceId: `service/${cluster.clusterName}/${service.serviceName}`,
  scalableDimension: 'ecs:service:DesiredCount',
  minCapacity: 1,
  maxCapacity: 10,
});

const scalingPolicy = new autoscaling.TargetTrackingScalingPolicy(this, 'ScalingPolicy', {
  scalingTarget: scalableTarget,
  policyName: 'MemoryUtilizationScalingPolicy',
  targetValue: 70,
  scaleOutCooldown: cdk.Duration.seconds(60),
  scaleInCooldown: cdk.Duration.seconds(60),
  // No resourceLabel is needed: it only applies to the ALB request count metric.
  predefinedMetric: autoscaling.PredefinedMetric.ECS_SERVICE_AVERAGE_MEMORY_UTILIZATION,
});

// Shorthand alternative to constructing the policy yourself; use one or the other.
scalableTarget.scaleToTrackMetric('Tracking', {
  targetValue: 70,
  predefinedMetric: autoscaling.PredefinedMetric.ECS_SERVICE_AVERAGE_MEMORY_UTILIZATION,
});

In this example, we're using the ECS_SERVICE_AVERAGE_MEMORY_UTILIZATION predefined metric type to define the metric that we want to track. The cluster and the service that we want to monitor are already identified by the resourceId of the scalable target, so we don't need to pass them again.

Overall, target tracking scaling is a powerful feature that can help you automatically adjust the capacity of your ECS service based on different metrics. By setting a target value for a metric, you can ensure that your service is always running at the desired capacity, without the need for manual intervention.

Step Scaling Policy

Step scaling allows you to specify a set of thresholds that trigger scaling adjustments based on the value of a metric. For example, you can define a scaling policy that adds 1 task when the CPU utilization is between 60% and 80% and 2 tasks when it is 80% or higher, and that removes 1 task when the CPU utilization is between 20% and 40% and 2 tasks when it drops below 20%. To implement a step scaling policy, you can use the AWS/ECS namespace and the CPUUtilization metric.

const scalableTarget = new autoscaling.ScalableTarget(this, 'ScalableTarget', {
  serviceNamespace: autoscaling.ServiceNamespace.ECS,
  resourceId: `service/${cluster.clusterName}/${service.serviceName}`,
  scalableDimension: 'ecs:service:DesiredCount',
  minCapacity: 1,
  maxCapacity: 10,
});

const scalingPolicy = new autoscaling.StepScalingPolicy(this, 'ScalingPolicy', {
  scalingTarget: scalableTarget,
  metricAggregationType: autoscaling.MetricAggregationType.AVERAGE,
  adjustmentType: autoscaling.AdjustmentType.CHANGE_IN_CAPACITY,
  // The uncovered range between 40 and 60 is where no scaling happens.
  scalingSteps: [
    { upper: 20, change: -2 },
    { lower: 20, upper: 40, change: -1 },
    { lower: 60, upper: 80, change: +1 },
    { lower: 80, change: +2 },
  ],
  cooldown: cdk.Duration.seconds(60),
  metric: new cloudwatch.Metric({
    namespace: 'AWS/ECS',
    metricName: 'CPUUtilization',
    dimensionsMap: {
      ClusterName: cluster.clusterName,
      ServiceName: service.serviceName,
    },
    period: cdk.Duration.minutes(1),
  }),
});

// Shorthand alternative: scaleOnMetric creates the same kind of policy on the target.
scalableTarget.scaleOnMetric('Scaling', {
  metric: service.metricCpuUtilization(),
  scalingSteps: [
    { upper: 20, change: -2 },
    { lower: 20, upper: 40, change: -1 },
    { lower: 60, upper: 80, change: +1 },
    { lower: 80, change: +2 },
  ],
});

In this example, we're using the AWS/ECS namespace and the CPUUtilization metric to define the scaling policy. We're using the StepScalingPolicy class to create a new scaling policy and passing in the ScalableTarget instance that we created earlier through the scalingTarget property. We're using the scalingSteps property to define the scaling steps and their corresponding adjustments, and the cooldown property to specify the cooldown period between scaling events. Between 40% and 60% no adjustment is made. The scaleOnMetric call is a shorthand that creates the same kind of policy directly on the scalable target; use one approach or the other, not both.

Step Scaling Policy is a type of scaling policy that allows you to define a set of thresholds, or steps, that trigger scaling adjustments based on the value of a metric. Each step can define a different adjustment for the scaling action. This can be useful if you want to fine-tune the scaling actions of your service based on specific performance metrics.

The thresholds and adjustments are defined using a set of scaling steps. Each scaling step consists of a lower and upper bound, and an adjustment value. The lower and upper bounds define the range of metric values that will trigger the scaling action. The adjustment value defines the amount by which the service should be scaled up or down when the metric is within the defined range.
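The way a metric value is matched against the steps can be sketched in a few lines of plain TypeScript. This is a simplified model and not the CDK or AWS implementation: the ScalingStep interface and the function names are hypothetical, and the sketch assumes that the lower bound is inclusive and the upper bound is exclusive.

```typescript
// Hypothetical type for this sketch; it mirrors the lower/upper/change shape of a step.
interface ScalingStep {
  lower?: number; // inclusive lower bound; omitted means no lower bound
  upper?: number; // exclusive upper bound; omitted means no upper bound
  change: number; // tasks to add (positive) or remove (negative)
}

// Returns the adjustment of the first step whose range contains the metric value.
function changeForMetric(steps: ScalingStep[], metricValue: number): number {
  for (const step of steps) {
    const lower = step.lower ?? -Infinity;
    const upper = step.upper ?? Infinity;
    if (metricValue >= lower && metricValue < upper) {
      return step.change;
    }
  }
  // The value falls in a range that no step covers: leave the capacity alone.
  return 0;
}

// Applies the adjustment and keeps the result within the configured bounds.
function nextCapacity(
  currentTasks: number,
  steps: ScalingStep[],
  metricValue: number,
  minCapacity: number,
  maxCapacity: number,
): number {
  const adjusted = currentTasks + changeForMetric(steps, metricValue);
  return Math.min(maxCapacity, Math.max(minCapacity, adjusted));
}

const exampleSteps: ScalingStep[] = [
  { upper: 20, change: -2 },
  { lower: 20, upper: 40, change: -1 },
  { lower: 60, upper: 80, change: 1 },
  { lower: 80, change: 2 },
];

// 5 tasks at 85% CPU: the last step applies and two tasks are added.
console.log(nextCapacity(5, exampleSteps, 85, 1, 10));
```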

The Step Scaling Policy is useful in scenarios where you have specific thresholds that you want to use for scaling, such as when you want to scale your service up by a certain number of tasks when the CPU utilization exceeds a certain threshold. Step Scaling Policy can be used with any metric that is supported by Amazon CloudWatch, including custom metrics that you may have defined.

When you define a Step Scaling Policy with AWS ECS and CDK, you can use the AWS/ECS namespace and the CPUUtilization metric to define the scaling policy. You can then use the StepScalingPolicy class to create a new scaling policy and pass in the ScalableTarget instance that you created earlier through the scalingTarget property. You can use the scalingSteps property to define the scaling steps and their corresponding adjustments, and the cooldown property to specify the cooldown period between scaling events.

We're also using the metric property to define the metric that we want to monitor. In this case, we're monitoring the CPUUtilization metric for the ECS service that we created earlier. We're using the metric's dimensions to specify the cluster and service names, and the period property to specify the interval over which the metric data is aggregated.

Because the scaling policy receives the scalable target through its scalingTarget property, it is attached as soon as it is created, and no separate attach call is needed.

Application Auto Scaling then monitors the metric through CloudWatch alarms. If the metric value crosses a threshold defined in the scaling policy, it adjusts the desired task count of the ECS service by the corresponding step.

In conclusion, the Step Scaling Policy is a powerful feature of AWS ECS autoscaling that allows you to fine-tune your scaling actions based on specific performance metrics. With the AWS CDK, you can easily define and configure a Step Scaling Policy for your ECS services using TypeScript or other supported programming languages. By combining the Step Scaling Policy with other scaling policies such as Target Tracking Policy, you can ensure that your ECS services are always running at optimal capacity and performance while minimizing costs and resource waste.

Scheduled Scaling Policy

The Scheduled Scaling Policy allows you to define a schedule for scaling actions. This is useful if you know that your application will experience a surge in traffic at certain times, such as during a peak sales period or a special event. With Scheduled Scaling Policy, you can ensure that your ECS services are scaled out or in according to the predicted traffic pattern, rather than relying on reactive scaling policies.

To use Scheduled Scaling Policy with AWS ECS and CDK, you call the scaleOnSchedule method on a ScalableTarget. Each call defines one scheduled action: the schedule on which it fires and the capacity limits to apply at that time. Here's an example of how to define a scheduled action with the AWS CDK:

import * as autoscaling from 'aws-cdk-lib/aws-applicationautoscaling';

// Every day at 8:00 AM (UTC), raise the minimum to 2 tasks and keep the maximum at 10.
scalableTarget.scaleOnSchedule('ScheduledScalingAction', {
  schedule: autoscaling.Schedule.cron({ minute: '0', hour: '8' }),
  minCapacity: 2,
  maxCapacity: 10,
});

In this example, we're creating a scheduled action with a schedule that triggers every day at 8:00 AM. Cron schedules are evaluated in UTC by default. We're using the minCapacity and maxCapacity properties to specify the minimum and the maximum number of tasks to run from that point on. The ECS service and the scalable dimension are not repeated here, because they were already defined on the ScalableTarget.

Because scaleOnSchedule is called on the scalable target itself, the scheduled action is associated with that target automatically, and there is no separate attach step.
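A scheduled action changes the minimum and maximum capacity rather than setting the task count directly. If the current task count falls outside the new limits when the action fires, it is brought back inside them. Here is a minimal sketch of that behaviour; the CapacityBounds interface and the function name capacityAfterScheduledAction are hypothetical names for this example.

```typescript
// Hypothetical type for this sketch: the limits that a scheduled action applies.
interface CapacityBounds {
  minCapacity: number;
  maxCapacity: number;
}

// Returns the task count after a scheduled action applies new limits.
function capacityAfterScheduledAction(currentTasks: number, newBounds: CapacityBounds): number {
  if (newBounds.minCapacity > newBounds.maxCapacity) {
    throw new Error("minCapacity must not exceed maxCapacity");
  }
  // Below the new minimum: scale out to it. Above the new maximum: scale in to it.
  return Math.min(newBounds.maxCapacity, Math.max(newBounds.minCapacity, currentTasks));
}

// A service running 1 task is raised to 2 when the morning schedule fires.
console.log(capacityAfterScheduledAction(1, { minCapacity: 2, maxCapacity: 10 }));
```

A task count that is already within the new limits is left unchanged, which is why scheduled scaling combines well with target tracking: the schedule moves the limits, and the tracking policy works inside them.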

In addition to specifying the schedule for when the scaling action should be performed, scaleOnSchedule allows you to specify other properties to customize the scheduled action. Here are some of the other properties that you can use:

startTime and endTime: These properties allow you to specify a window of time during which the scheduled action is active. Before startTime and after endTime the action does not fire. This is useful for temporary events such as a sales campaign.
minCapacity and maxCapacity: These properties set the new lower and upper limits for the number of tasks. You must specify at least one of them.

Scheduled actions do not take a cooldown. If you want to prevent rapid oscillations in the number of tasks running, which can cause instability in your application, set cooldowns on the target tracking or step scaling policies that run alongside the schedule.

Here’s an example of how to use some of these additional properties:

const scalableTarget = new autoscaling.ScalableTarget(this, 'MyScalableTarget', {
  serviceNamespace: autoscaling.ServiceNamespace.ECS,
  resourceId: `service/${cluster.clusterName}/${service.serviceName}`,
  scalableDimension: 'ecs:service:DesiredCount',
  minCapacity: 1,
  maxCapacity: 10,
});

// Active only between February 20 and February 28, 2023 (UTC).
scalableTarget.scaleOnSchedule('MyScheduledScalingAction', {
  schedule: autoscaling.Schedule.cron({ minute: '0', hour: '8' }),
  startTime: new Date('2023-02-20T00:00:00.000Z'),
  endTime: new Date('2023-02-28T23:59:59.000Z'),
  minCapacity: 2,
  maxCapacity: 10,
});

In this example, we’re using the startTime and endTime properties to define a window of time between February 20 and February 28 during which the scheduled action fires. Calling scaleOnSchedule on the ScalableTarget object that we created associates the action with that target, so the schedule triggers at the specified times within that window.
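Behind the scenes, a cron schedule for Application Auto Scaling is a six-field expression of the form cron(minute hour day-of-month month day-of-week year), in which one of the two day fields must be a question mark. The sketch below builds such an expression from named fields. It is a hypothetical helper written for illustration, not the CDK's own implementation, and the CronFields interface and toCronExpression function are names chosen for this example.

```typescript
// Hypothetical type for this sketch: the named fields of a cron schedule.
interface CronFields {
  minute?: string;
  hour?: string;
  day?: string; // day of month
  month?: string;
  weekDay?: string; // day of week
  year?: string;
}

// Builds a six-field cron expression; unspecified fields default to "*".
function toCronExpression(fields: CronFields): string {
  if (fields.day !== undefined && fields.weekDay !== undefined) {
    throw new Error("Specify either day or weekDay, not both");
  }
  const minute = fields.minute ?? "*";
  const hour = fields.hour ?? "*";
  const month = fields.month ?? "*";
  const year = fields.year ?? "*";
  // Exactly one of the two day fields carries a value; the other is "?".
  const day = fields.day ?? (fields.weekDay !== undefined ? "?" : "*");
  const weekDay = fields.weekDay ?? "?";
  return `cron(${minute} ${hour} ${day} ${month} ${weekDay} ${year})`;
}

// Every day at 8:00 AM.
console.log(toCronExpression({ minute: "0", hour: "8" })); // cron(0 8 * * ? *)
```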

In conclusion, the Scheduled Scaling Policy is a powerful feature of AWS ECS autoscaling that allows you to proactively manage your application’s capacity and performance based on predicted traffic patterns. With the AWS CDK, you can easily define and configure Scheduled Scaling Policy for your ECS services using TypeScript or other supported programming languages. By combining Scheduled Scaling Policy with other scaling policies such as Target Tracking Policy and Step Scaling Policy, you can ensure that your ECS services are always running at optimal capacity and performance while minimizing costs and resource waste.

I hope this provides a detailed overview of using AutoScaling Policy with AWS ECS and CDK. Let me know if you have any further questions or if you need additional clarification on any of the topics covered.

❤️ Follow Siddhanth Dwivedi aka mafiaguy for more such awesome blogs.

Written by THE HOW TO BLOG |Siddhanth Dwivedi