EventBridge trigger: SageMaker processing job completed

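I'm using AWS to build some ETL around machine learning models. When a given SageMaker processing job finishes, I want to trigger a Lambda function, and the event passed to the Lambda should contain the processing job's configuration (job name, arguments, etc.).

Q1: How can I trigger an event when a processing job finishes?

Q2: How can I pass the processing job configuration to the Lambda as the event?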

1 Answer

You can use the following EventBridge rule pattern:
{
  "source": ["aws.sagemaker"],
  "detail-type": ["SageMaker Processing Job State Change"],
  "detail": {
    "ProcessingJobStatus": ["Failed", "Completed", "Stopped"]
  }
}

Adjust the ProcessingJobStatus list to the statuses you want to handle.

You can then set the Lambda function as the target of the EventBridge rule.
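If you would rather wire this up from code than from the console, the following is a minimal boto3 sketch of the same setup. The function names `build_rule_pattern` and `wire_rule_to_lambda`, the statement ID, and the target ID are made up for illustration; the rule name, Lambda name, and ARN are values you would supply, and the caller is assumed to have permission to call EventBridge and Lambda.

```python
import json


def build_rule_pattern(statuses=("Failed", "Completed", "Stopped")):
    """Return the event pattern from this answer for the given job statuses."""
    return {
        "source": ["aws.sagemaker"],
        "detail-type": ["SageMaker Processing Job State Change"],
        "detail": {"ProcessingJobStatus": list(statuses)},
    }


def wire_rule_to_lambda(rule_name, function_name, function_arn):
    """Create the rule, allow EventBridge to invoke the Lambda, attach the target.

    All three arguments are placeholders supplied by the caller.
    """
    import boto3  # imported lazily so the pattern helper works without boto3

    events = boto3.client("events")
    lambda_client = boto3.client("lambda")

    # Create (or update) the rule with the pattern above.
    rule_arn = events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(build_rule_pattern()),
        State="ENABLED",
    )["RuleArn"]

    # Grant EventBridge permission to invoke the function for this rule only.
    lambda_client.add_permission(
        FunctionName=function_name,
        StatementId=f"{rule_name}-invoke",  # hypothetical statement ID
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    # Register the Lambda as the rule's target.
    events.put_targets(Rule=rule_name, Targets=[{"Id": "1", "Arn": function_arn}])
    return rule_arn
```

The `add_permission` call is the step the console normally performs for you; without it the rule matches but the Lambda is never invoked.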

Here is a sample event, taken from the AWS console, of the kind that will be passed to the Lambda:

{
  "version": "0",
  "id": "0a15f67d-aa23-0123-0123-01a23w89r01t",
  "detail-type": "SageMaker Processing Job State Change",
  "source": "aws.sagemaker",
  "account": "123456789012",
  "time": "2019-05-31T21:49:54Z",
  "region": "us-east-1",
  "resources": ["arn:aws:sagemaker:us-west-2:012345678987:processing-job/integ-test-analytics-algo-54ee3282-5899-4aa3-afc2-7ce1d02"],
  "detail": {
    "ProcessingInputs": [{
      "InputName": "InputName",
      "S3Input": {
        "S3Uri": "s3://input/s3/uri",
        "LocalPath": "/opt/ml/processing/input/local/path",
        "S3DataType": "MANIFEST_FILE",
        "S3InputMode": "PIPE",
        "S3DataDistributionType": "FULLYREPLICATED"
      }
    }],
    "ProcessingOutputConfig": {
      "Outputs": [{
        "OutputName": "OutputName",
        "S3Output": {
          "S3Uri": "s3://output/s3/uri",
          "LocalPath": "/opt/ml/processing/output/local/path",
          "S3UploadMode": "CONTINUOUS"
        }
      }],
      "KmsKeyId": "KmsKeyId"
    },
    "ProcessingJobName": "integ-test-analytics-algo-54ee3282-5899-4aa3-afc2-7ce1d02",
    "ProcessingResources": {
      "ClusterConfig": {
        "InstanceCount": 3,
        "InstanceType": "ml.c5.xlarge",
        "VolumeSizeInGB": 5,
        "VolumeKmsKeyId": "VolumeKmsKeyId"
      }
    },
    "StoppingCondition": {
      "MaxRuntimeInSeconds": 2000
    },
    "AppSpecification": {
      "ImageUri": "012345678901.dkr.ecr.us-west-2.amazonaws.com/processing-uri:latest"
    },
    "NetworkConfig": {
      "EnableInterContainerTrafficEncryption": true,
      "EnableNetworkIsolation": false,
      "VpcConfig": {
        "SecurityGroupIds": ["SecurityGroupId1", "SecurityGroupId2", "SecurityGroupId3"],
        "Subnets": ["Subnet1", "Subnet2"]
      }
    },
    "RoleArn": "arn:aws:iam::012345678987:role/SageMakerPowerUser",
    "ExperimentConfig": {},
    "ProcessingJobArn": "arn:aws:sagemaker:us-west-2:012345678987:processing-job/integ-test-analytics-algo-54ee3282-5899-4aa3-afc2-7ce1d02",
    "ProcessingJobStatus": "Completed",
    "LastModifiedTime": 1589879735000,
    "CreationTime": 1589879735000
  }
}
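On the Lambda side, the job configuration sits under the `detail` key of the event. Below is a minimal handler sketch that pulls out a few of the fields shown in the sample above; the shape of the returned `job` dict is just an illustration, and the field names are taken from that sample rather than from a guaranteed schema, so every lookup falls back to a default.

```python
def lambda_handler(event, context):
    """Extract processing job details from an EventBridge state change event."""
    detail = event.get("detail", {})

    cluster = detail.get("ProcessingResources", {}).get("ClusterConfig", {})
    outputs = detail.get("ProcessingOutputConfig", {}).get("Outputs", [])

    job = {
        "name": detail.get("ProcessingJobName"),
        "status": detail.get("ProcessingJobStatus"),
        "image_uri": detail.get("AppSpecification", {}).get("ImageUri"),
        "instance_type": cluster.get("InstanceType"),
        "instance_count": cluster.get("InstanceCount"),
        # One S3 URI per configured output; None if an output has no S3Output.
        "output_uris": [o.get("S3Output", {}).get("S3Uri") for o in outputs],
    }

    print(job)  # ends up in the function's CloudWatch logs
    return job
```

If a field you need is not present in the event, one option is to look the job up by `ProcessingJobName` with the SageMaker `DescribeProcessingJob` API from inside the handler.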

Edit:

If you want to match ProcessingJobName against a specific prefix:

{
  "source": ["aws.sagemaker"],
  "detail-type": ["SageMaker Processing Job State Change"],
  "detail": {
    "ProcessingJobStatus": ["Failed", "Completed", "Stopped"],
    "ProcessingJobName": [{
      "prefix": "standarize-data"
    }]
  }
}
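To sanity-check which job names such a pattern would catch before deploying it, a rough local approximation can help. The helper below, `matches_detail`, is hypothetical and deliberately simplified: it handles only exact string values and `prefix` matchers on top-level `detail` fields, and it is not EventBridge's actual matching logic.

```python
def matches_detail(pattern_detail, detail):
    """Approximate EventBridge matching for exact values and prefix matchers."""
    for field, allowed in pattern_detail.items():
        value = detail.get(field)
        if not isinstance(value, str):
            return False  # field missing or not a string: no match

        matched = False
        for candidate in allowed:
            if isinstance(candidate, dict) and "prefix" in candidate:
                matched = matched or value.startswith(candidate["prefix"])
            else:
                matched = matched or value == candidate
        if not matched:
            return False  # every field in the pattern must match
    return True


# The "detail" part of the prefix pattern from this answer.
PATTERN_DETAIL = {
    "ProcessingJobStatus": ["Failed", "Completed", "Stopped"],
    "ProcessingJobName": [{"prefix": "standarize-data"}],
}

print(matches_detail(PATTERN_DETAIL, {
    "ProcessingJobStatus": "Completed",
    "ProcessingJobName": "standarize-data-1643826039-USA-2021-2022",
}))
```

Values listed for one field are alternatives, while separate fields must all match, which is why a job with the right prefix but a status outside the list is rejected.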

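Thanks for your answer. What if I need to filter processing jobs whose name contains <some-keyword>? "ProcessingJobName": ["<some-keyword>"] Also, is it possible to filter on just the start of the name? For example, if the name is <some-keyword>-some-metadata - mxmrpn
This documentation covers the different cases that patterns can handle. Could you give an example of which kinds of keywords you want to match in which fields? - Kaustubh Khavnekar
This is the full processing job name: standarize-data-1643826039-USA-2021-2022 So I want to match every processing job that starts with standarize-data. - mxmrpn
@MaximoRipani I've updated my answer. - Kaustubh Khavnekar
Thanks! This really helped me a lot. - mxmrpn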
