CRITICAL

AWS IAM: Fix AccessDeniedException for EC2 Auto Scaling during High Traffic

Quick Fix Summary

TL;DR

Temporarily attach the 'AmazonEC2FullAccess' managed policy to the Auto Scaling IAM role.

The IAM role attached to your EC2 Auto Scaling group lacks the necessary permissions to perform actions (like launching instances, describing resources, or modifying load balancers) required to scale during a high-traffic event.

Diagnosis & Causes

  • IAM role attached to the Auto Scaling group has insufficient permissions.
  • IAM permissions boundaries or Service Control Policies (SCPs) are blocking required actions.

Recovery Steps

    Step 1: Identify the Failing IAM Role and Action

    Check CloudTrail logs to pinpoint the exact API call and the IAM role that is being denied. This is critical for targeted remediation.

    bash
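    # lookup-events cannot filter by error code, so filter denials with jq. On macOS/BSD use `date -u -v-1H` instead of `-d '1 hour ago'`.
    aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventSource,AttributeValue=ec2.amazonaws.com --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" --region us-east-1 --query 'Events[].CloudTrailEvent' --output json | jq -r '.[] | fromjson | select((.errorCode // "") | test("AccessDenied|UnauthorizedOperation")) | {eventTime, arn: .userIdentity.arn, errorCode, errorMessage, eventSource, eventName}'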
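
The `--start-time` value in the command above depends on `date` flags that differ between BSD (macOS) and GNU (Linux) implementations. A minimal sketch of a helper that tries both forms; the function name `one_hour_ago_utc` is hypothetical and not part of any AWS tooling:

```shell
# Print the UTC timestamp for one hour ago in ISO 8601 form.
# Tries the BSD/macOS syntax first, then falls back to the GNU syntax.
one_hour_ago_utc() {
  date -u -v-1H +%Y-%m-%dT%H:%M:%SZ 2>/dev/null \
    || date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ
}

# Usage (assumed): pass the result to CloudTrail, e.g.
#   aws cloudtrail lookup-events --start-time "$(one_hour_ago_utc)" ...
```

Wrapping the call this way keeps the CloudTrail command identical across workstations, which may matter when several responders share a runbook.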

    Step 2: Verify the IAM Role Attached to the Auto Scaling Group

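    Confirm which IAM roles are associated with your Auto Scaling group. The group itself acts through its service-linked role, while launched instances receive the instance profile defined in the launch template or launch configuration.

    bash
    aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names YOUR_ASG_NAME --query 'AutoScalingGroups[0].{ServiceLinkedRole:ServiceLinkedRoleARN,LaunchTemplate:LaunchTemplate,LaunchConfiguration:LaunchConfigurationName}' --region YOUR_REGION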
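
When the group uses a launch template, the role that launched instances assume can be traced through the template's instance profile. The following is a sketch under that assumption; the function name `role_for_launch_template` is hypothetical, and the launch template ID is taken from your own describe output:

```shell
# Print the ARN(s) of the IAM role(s) behind a launch template's instance profile.
# Arguments: launch template ID, optional template version (defaults to $Default).
role_for_launch_template() {
  lt_id="$1"
  lt_version='$Default'
  if [ -n "${2:-}" ]; then
    lt_version="$2"
  fi

  # The template may store the profile by name or by ARN; take whichever is set.
  profile="$(aws ec2 describe-launch-template-versions \
    --launch-template-id "$lt_id" --versions "$lt_version" \
    --query 'LaunchTemplateVersions[0].LaunchTemplateData.IamInstanceProfile.not_null(Name, Arn)' \
    --output text)"

  # If an ARN was returned, keep only the profile name after the last slash.
  profile="${profile##*/}"

  if [ -z "$profile" ] || [ "$profile" = "None" ]; then
    echo "No instance profile found on launch template $lt_id" >&2
    return 1
  fi

  aws iam get-instance-profile --instance-profile-name "$profile" \
    --query 'InstanceProfile.Roles[].Arn' --output text
}

# Usage (assumed): role_for_launch_template YOUR_LAUNCH_TEMPLATE_ID
```

The role printed here is the one to compare against the `userIdentity.arn` seen in the CloudTrail denials.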

    Step 3: Audit the Attached IAM Role Permissions

    Simulate the critical scaling actions to see which permissions are missing. Replace ACCOUNT_ID and YOUR_ASG_ROLE, and list the actions you want to test after --action-names (e.g., autoscaling:DescribeAutoScalingGroups, ec2:RunInstances).

    bash
    aws iam simulate-principal-policy --policy-source-arn arn:aws:iam::ACCOUNT_ID:role/YOUR_ASG_ROLE --action-names autoscaling:DescribeAutoScalingGroups ec2:RunInstances --region YOUR_REGION
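
The raw simulation output lists every evaluated action, allowed or not. As a sketch, the wrapper below narrows it to the actions that are not allowed; the function name `denied_actions` is hypothetical, and the action list in the usage comment is an assumption based on the permissions this guide mentions:

```shell
# Print every action from the list that the role is NOT allowed to perform,
# together with the decision (implicitDeny or explicitDeny).
# Arguments: role ARN, then one or more IAM action names.
denied_actions() {
  role_arn="$1"
  shift
  aws iam simulate-principal-policy \
    --policy-source-arn "$role_arn" \
    --action-names "$@" \
    --query "EvaluationResults[?EvalDecision!='allowed'].[EvalActionName,EvalDecision]" \
    --output text
}

# Usage (placeholders as in the step above):
#   denied_actions arn:aws:iam::ACCOUNT_ID:role/YOUR_ASG_ROLE \
#     ec2:RunInstances ec2:CreateTags \
#     elasticloadbalancing:RegisterTargets cloudwatch:DescribeAlarms
```

An `explicitDeny` result suggests a deny statement somewhere in the evaluated policies, whereas `implicitDeny` usually means no policy grants the action.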

    Step 4: Attach a Comprehensive Policy for Immediate Recovery

    For immediate recovery during an incident, attach the managed 'AmazonEC2FullAccess' policy to the role. This is a temporary fix to restore scaling capability.

    bash
    aws iam attach-role-policy --role-name YOUR_ASG_ROLE --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess

    Step 5: Create and Attach a Least-Privilege Custom Policy

    After recovery, replace the broad policy with a custom, least-privilege policy. Save your policy document as JSON in a file (e.g., asg-policy.json), then create the policy and attach it to the role.

    bash
    aws iam create-policy --policy-name ASG-LeastPrivilege-Policy --policy-document file://asg-policy.json
    aws iam attach-role-policy --role-name YOUR_ASG_ROLE --policy-arn arn:aws:iam::ACCOUNT_ID:policy/ASG-LeastPrivilege-Policy
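
The contents of asg-policy.json are not reproduced in this guide. The following is only an illustrative starting point: the action list is an assumption assembled from the permissions mentioned in this guide (launching and tagging instances, target group registration, reading alarms), and `"Resource": "*"` is a placeholder that should be narrowed for your account.

```shell
# Write an illustrative least-privilege policy to asg-policy.json.
# The actions below are an assumption; trim or extend them to match the
# denials found in CloudTrail, and replace "*" with specific resource ARNs.
cat > asg-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ScalingActions",
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "ec2:DescribeInstances",
        "ec2:RunInstances",
        "ec2:CreateTags",
        "elasticloadbalancing:RegisterTargets",
        "elasticloadbalancing:DeregisterTargets",
        "elasticloadbalancing:DescribeTargetHealth",
        "cloudwatch:DescribeAlarms"
      ],
      "Resource": "*"
    }
  ]
}
EOF
```

Running the simulation from Step 3 against the role after attaching this policy is a reasonable way to confirm that nothing required is still denied.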

    Step 6: Remove the Broad Managed Policy

    Once the custom policy is attached and verified, detach the temporary full-access policy to enforce least privilege.

    bash
    aws iam detach-role-policy --role-name YOUR_ASG_ROLE --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess
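
Detaching the broad policy before the custom one is in place would reintroduce the outage. A sketch of a guard that checks first; the function name `detach_broad_policy_if_safe` is hypothetical, and the policy name in the usage comment is the one used in Step 5:

```shell
# Detach AmazonEC2FullAccess only if the custom policy is already attached.
# Arguments: role name, custom policy name.
detach_broad_policy_if_safe() {
  role_name="$1"
  custom_policy="$2"

  # Text output is a tab-separated list of attached managed policy names.
  attached="$(aws iam list-attached-role-policies --role-name "$role_name" \
    --query 'AttachedPolicies[].PolicyName' --output text)"

  # Normalise separators to spaces so the name can be matched as a whole word.
  case " $(echo "$attached" | tr '\t\n' '  ') " in
    *" $custom_policy "*)
      aws iam detach-role-policy --role-name "$role_name" \
        --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess
      ;;
    *)
      echo "Refusing to detach: $custom_policy is not attached to $role_name" >&2
      return 1
      ;;
  esac
}

# Usage (assumed): detach_broad_policy_if_safe YOUR_ASG_ROLE ASG-LeastPrivilege-Policy
```

The check only confirms that the custom policy is attached, not that it grants everything needed, so it complements the simulation in Step 3 rather than replacing it.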

    Step 7: Check for Organizational SCPs

    If the issue persists, verify that no Service Control Policy (SCP) at the AWS Organizations level is denying the required actions for the entire account or OU.

    text
    # This requires Organizations permissions. Review SCPs in the AWS Console under AWS Organizations.
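
For teams that prefer the CLI to the console, the same review can be sketched with the Organizations API. This assumes credentials in the management account (or a delegated administrator) with Organizations read permissions; the function names `list_scps_for_target` and `show_scp` are hypothetical:

```shell
# List the SCPs attached directly to an account, OU, or root.
# Argument: account ID, OU ID, or root ID.
list_scps_for_target() {
  aws organizations list-policies-for-target \
    --target-id "$1" \
    --filter SERVICE_CONTROL_POLICY \
    --query 'Policies[].[Id,Name]' \
    --output text
}

# Print the JSON document of one SCP so its Deny statements can be reviewed.
# Argument: policy ID as printed by list_scps_for_target.
show_scp() {
  aws organizations describe-policy \
    --policy-id "$1" \
    --query 'Policy.Content' \
    --output text
}

# Usage (assumed):
#   list_scps_for_target ACCOUNT_ID
#   show_scp POLICY_ID
```

`list-policies-for-target` reports only directly attached policies. SCPs attached to parent OUs and the organization root also apply, so the listing may need repeating for each level above the account.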

    Architect's Pro Tip

    "This often happens when a custom IAM policy for Auto Scaling misses permissions for dependent services like EC2 (to launch instances), ELBv2 (to modify target groups), or CloudWatch (to read alarms). Always test scaling actions with `simulate-principal-policy` before deploying policy changes."

    Frequently Asked Questions

    Why did scaling work before but fail during high traffic?

    The IAM role might have had just enough permissions for steady-state operations (e.g., describing instances) but lacked permissions for scaling actions (e.g., RunInstances, CreateTags, modifying load balancer target groups) that are only triggered during a scale-out event.

    Is attaching 'AmazonEC2FullAccess' safe for production?

    As a permanent solution, no: it violates the principle of least privilege. As an incident-response action to restore service, however, it is an acceptable temporary fix. Replace it with a scoped policy as soon as possible (Steps 5 and 6).
