Unable to assume role and validate the specified targetGroupArn

22

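I am trying to create and deploy a cluster using terraform ecs_service, but I can't get it to work. My terraform apply always fails around the IAM roles, and I don't clearly understand what these roles do. Specifically, the error message is: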

InvalidParametersException: Unable to assume role and validate the specified targetGroupArn. Please verify that the ECS service role being passed has the proper permissions.

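I have also found that:

  1. When I specify iam_role in ecs_service, ECS complains that I need to use a service-linked role.
  2. When I comment out iam_role in ecs_service, ECS complains that the assumed role cannot validate the targetGroupArn.

My terraform spans many files. Below I have excerpted the parts that seem relevant to this problem. I have seen a few similar questions posted, but none of them offered a workable solution to my dilemma.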

## ALB

resource "aws_alb" "frankly_internal_alb" {
    name = "frankly-internal-alb"
    internal = false
    security_groups = ["${aws_security_group.frankly_internal_alb_sg.id}"]
    subnets = ["${aws_subnet.frankly_public_subnet_a.id}", "${aws_subnet.frankly_public_subnet_b.id}"]
}

resource "aws_alb_listener" "frankly_alb_listener" {
    load_balancer_arn = "${aws_alb.frankly_internal_alb.arn}"

    port = "8080"
    protocol = "HTTP"

    default_action {
        target_group_arn = "${aws_alb_target_group.frankly_internal_target_group.arn}"
        type = "forward"
    }
}

## Target Group

resource "aws_alb_target_group" "frankly_internal_target_group" {
    name = "internal-target-group"
    port = 8080
    protocol = "HTTP"
    vpc_id = "${aws_vpc.frankly_vpc.id}"

    health_check {
        healthy_threshold = 5
        unhealthy_threshold = 2
        timeout = 5
    }
}

## IAM

resource "aws_iam_role" "frankly_ec2_role" {
  name               = "franklyec2role"
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}

resource "aws_iam_role" "frankly_ecs_role" {
  name = "frankly_ecs_role"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "ecs.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}

# aggressively add permissions...
resource "aws_iam_policy" "frankly_ecs_policy" {
  name        = "frankly_ecs_policy"
  description = "A test policy"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "ec2:*",
        "ecs:*",
        "ecr:*",
        "autoscaling:*",
        "elasticloadbalancing:*",
        "application-autoscaling:*",
        "logs:*",
        "tag:*",
        "resource-groups:*"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy_attachment" "frankly_ecs_attach" {
  role       = "${aws_iam_role.frankly_ecs_role.name}"
  policy_arn = "${aws_iam_policy.frankly_ecs_policy.arn}"
}

## ECS

resource "aws_ecs_cluster" "frankly_ec2" {
    name = "frankly_ec2_cluster"
}

resource "aws_ecs_task_definition" "frankly_ecs_task" {
  family                = "service"
  container_definitions = "${file("terraform/task-definitions/search.json")}"

  volume {
    name      = "service-storage"

    docker_volume_configuration {
      scope         = "shared"
      autoprovision = true
    }
  }

  placement_constraints {
    type       = "memberOf"
    expression = "attribute:ecs.availability-zone in [us-east-1]"
  }
}

resource "aws_ecs_service" "frankly_ecs_service" {
  name            = "frankly_ecs_service"
  cluster         = "${aws_ecs_cluster.frankly_ec2.id}"
  task_definition = "${aws_ecs_task_definition.frankly_ecs_task.arn}"
  desired_count   = 2
  iam_role        = "${aws_iam_role.frankly_ecs_role.arn}"
  depends_on      = ["aws_iam_role.frankly_ecs_role", "aws_alb.frankly_internal_alb", "aws_alb_target_group.frankly_internal_target_group"]

  # network_configuration = {
  #   subnets = ["${aws_subnet.frankly_private_subnet_a.id}", "${aws_subnet.frankly_private_subnet_b}"]
  #   security_groups = ["${aws_security_group.frankly_internal_alb_sg}", "${aws_security_group.frankly_service_sg}"]
  #   # assign_public_ip = true
  # }

  ordered_placement_strategy {
    type  = "binpack"
    field = "cpu"
  }

  load_balancer {
    target_group_arn = "${aws_alb_target_group.frankly_internal_target_group.arn}"
    container_name   = "search-svc"
    container_port   = 8080
  }

  placement_constraints {
    type       = "memberOf"
    expression = "attribute:ecs.availability-zone in [us-east-1]"
  }
}

1
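Can you edit your question to show more context around the apply error? Is it caused by the aws_ecs_service resource? Can you also show the Terraform code for the ECS cluster instances? - ydaetskcoR
I ran into this exact problem. Running terraform destroy and then re-applying does not fix it. - Daniel Patrick
6 Answers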

64

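I was seeing exactly the same error message, but my problem was different:

I had specified the load balancer's ARN instead of the load balancer's target group ARN.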
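
As a rough illustration of the mix-up described above, here is a minimal sketch that reuses the resource names from the question. It uses Terraform 0.12+ expression syntax, which is an assumption (the question itself uses the older "${...}" interpolation style), and the remaining service arguments are simply copied from the question.

```hcl
resource "aws_ecs_service" "frankly_ecs_service" {
  name            = "frankly_ecs_service"
  cluster         = aws_ecs_cluster.frankly_ec2.id
  task_definition = aws_ecs_task_definition.frankly_ecs_task.arn
  desired_count   = 2

  load_balancer {
    # Correct: the ARN of the target group resource.
    target_group_arn = aws_alb_target_group.frankly_internal_target_group.arn

    # Wrong (reported above to produce the same error message):
    # target_group_arn = aws_alb.frankly_internal_alb.arn

    container_name = "search-svc"
    container_port = 8080
  }
}
```

A target group ARN contains ":targetgroup/" while a load balancer ARN contains ":loadbalancer/", so inspecting the value in the plan output is a quick way to tell the two apart.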


7
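You saved my day! - yeer
7
You made my day too! - chrizonline
However, in resource "aws_ecs_service" "frankly_ecs_service", the load_balancer block correctly has the target group ARN: "${aws_alb_target_group.frankly_internal_target_group.arn}" - Miguel Conde
@MiguelConde Yes, I think this is just a poor error message. I certainly wouldn't say this is the only reason you might get this message. It looks like you may have hit it for a different reason. - Daniel Patrick
1
In my case it was because I forgot to add an aws_iam_role_policy_attachment resource. - Miguel Conde
1
I would upvote this again if I could! When I deployed a new ECS service, one of the target groups had an invalid name, which caused this problem. - Neara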

2

For me, the problem was that I had forgotten to attach the correct policy to the service role. Attaching this AWS managed policy helped: arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceRole
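
A minimal sketch of that attachment in Terraform might look like the following. The role reference aws_iam_role.frankly_ecs_role is borrowed from the question and is an assumption about which role is passed as iam_role; the resource label ecs_service_role_attach is hypothetical.

```hcl
# Attach the AWS managed ECS service policy to the role passed as iam_role.
resource "aws_iam_role_policy_attachment" "ecs_service_role_attach" {
  role       = aws_iam_role.frankly_ecs_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceRole"
}
```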


1
To prevent the error:
Error: creating ECS Service (*****): InvalidParameterException: Unable to assume role and validate the specified targetGroupArn. Please verify that the ECS service role being passed has the proper permissions.

In my case, I am using the following configuration:

  1. Role trust relationship: add this statement to the trust policy
{
            "Sid": "ECSpermission",
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "ecs.amazonaws.com",
                    "ecs-tasks.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole"
}
  • Role permissions:
  • Add the AWS managed policies:

    • AmazonEC2ContainerRegistryFullAccess
    • AmazonEC2ContainerServiceforEC2Role
    • Add a custom inline policy (I know the permissions are broad):
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": [
                    "autoscaling:*",
                    "elasticloadbalancing:*",
                    "application-autoscaling:*",
                    "resource-groups:*"
                ],
                "Effect": "Allow",
                "Resource": "*"
            }
        ]
    }
    
    1. Declare your custom role with the iam_role argument in the "aws_ecs_service" resource
    resource "aws_ecs_service" "team_deployment" {
      name            = local.ecs_task
      cluster         = data.terraform_remote_state.common_resources.outputs.ecs_cluster.id
      task_definition = aws_ecs_task_definition.team_deployment.arn
      launch_type     = "EC2"
      iam_role        = "arn:aws:iam::****:role/my_custom_role"
    
      desired_count = 3
      enable_ecs_managed_tags = true
      force_new_deployment    = true
      scheduling_strategy     = "REPLICA"
      wait_for_steady_state   = false
    
    
      load_balancer {
        target_group_arn = data.terraform_remote_state.common_resources.outputs.target_group_api.arn
        container_name   = var.ecr_image_tag
        container_port   = var.ecr_image_port
      }
    }
    
    

    Of course, be careful with the value of the target_group_arn argument. It must be a target group ARN. Now everything works!

    Releasing state lock. This may take a few moments...
    
    Apply complete! Resources: 1 added, 2 changed, 0 destroyed.
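
The role described in this answer could also be managed in Terraform rather than referenced by a hard-coded ARN. The following is only a sketch under that assumption: the resource labels and the role name my_custom_role are hypothetical, while the trust statement, managed policies, and inline policy actions are the ones listed above.

```hcl
# Hypothetical role; the trust policy mirrors the statement shown in this answer.
resource "aws_iam_role" "my_custom_role" {
  name = "my_custom_role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Sid    = "ECSpermission"
      Effect = "Allow"
      Principal = {
        Service = ["ecs.amazonaws.com", "ecs-tasks.amazonaws.com"]
      }
      Action = "sts:AssumeRole"
    }]
  })
}

# AWS managed policies named in the answer.
resource "aws_iam_role_policy_attachment" "ecr_full_access" {
  role       = aws_iam_role.my_custom_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess"
}

resource "aws_iam_role_policy_attachment" "ecs_for_ec2" {
  role       = aws_iam_role.my_custom_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
}

# Custom inline policy with the same (broad) actions as in the answer.
resource "aws_iam_role_policy" "broad_inline" {
  name = "broad_inline"
  role = aws_iam_role.my_custom_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Resource = "*"
      Action = [
        "autoscaling:*",
        "elasticloadbalancing:*",
        "application-autoscaling:*",
        "resource-groups:*",
      ]
    }]
  })
}
```

With this in place, the service could set iam_role = aws_iam_role.my_custom_role.arn instead of the literal ARN string.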
    

    0

    I had attached the wrong role.

    resource "aws_ecs_service" "ECSService" {
      name    = "stage-quotation"
      cluster = aws_ecs_cluster.ECSCluster2.id
      load_balancer {
        target_group_arn = aws_lb_target_group.ElasticLoadBalancingV2TargetGroup2.arn
        container_name   = "stage-quotation"
        container_port   = 8000
      }
      desired_count                      = 1
      task_definition                    = aws_ecs_task_definition.ECSTaskDefinition.arn
      deployment_maximum_percent         = 200
      deployment_minimum_healthy_percent = 100
      iam_role                           = aws_iam_service_linked_role.IAMServiceLinkedRole4.arn #
      ordered_placement_strategy {
        type  = "spread"
        field = "instanceId"
      }
      health_check_grace_period_seconds = 0
      scheduling_strategy               = "REPLICA"
    }
    

    resource "aws_iam_service_linked_role" "IAMServiceLinkedRole2" {
      aws_service_name = "ecs.application-autoscaling.amazonaws.com"
    }
    resource "aws_iam_service_linked_role" "IAMServiceLinkedRole4" {
      aws_service_name = "ecs.amazonaws.com"
      description      = "Role to enable Amazon ECS to manage your cluster."
    }
    

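    Because of poor naming conventions, I had mistakenly used the Application Auto Scaling role. The correct role to use is the one defined above as IAMServiceLinkedRole4.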


    0

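    For me, I was using the output of a previous command. But that output was empty, so the target group ARN in the create-service call was empty as well.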
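
If the ARN is passed into Terraform from an earlier step, one way to catch an empty or wrong value before the ECS API call is a variable validation block. This is a sketch that assumes Terraform 0.13 or later; the variable name target_group_arn is hypothetical, and the regular expression is only a loose shape check, not an official ARN grammar.

```hcl
variable "target_group_arn" {
  type        = string
  description = "ARN of the target group the ECS service registers with."

  validation {
    # An empty string or a load balancer ARN fails this pattern,
    # so terraform plan stops before any service is created.
    condition     = can(regex("^arn:aws[a-z-]*:elasticloadbalancing:[a-z0-9-]+:[0-9]{12}:targetgroup/.+", var.target_group_arn))
    error_message = "The target_group_arn value must be a non-empty target group ARN containing ':targetgroup/'."
  }
}
```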


    -1

    I solved the problem by destroying my stack and redeploying.

