Terraform Deployment
Metaflow ships with a Terraform template that automates the deployment of all the AWS resources needed to enable cloud-scaling in Metaflow.
The major components of the template are:
  • Amazon S3 - A dedicated private bucket and all appropriate permissions to serve as a centralized storage backend.
  • AWS Batch - A dedicated AWS Batch Compute Environment and Job Queue to extend Metaflow's compute capabilities to the cloud.
  • Amazon CloudWatch - Configuration to store and manage AWS Batch job execution logs.
  • AWS Step Functions - A dedicated role to allow scheduling Metaflow flows on AWS Step Functions.
  • Amazon EventBridge - A dedicated role to allow time-based triggers for Metaflow flows configured on AWS Step Functions.
  • Amazon DynamoDB - A dedicated Amazon DynamoDB table for tracking certain step executions on AWS Step Functions.
  • Amazon SageMaker - An Amazon SageMaker Notebook instance for interfacing with Metaflow flows.
  • AWS Fargate and Amazon Relational Database Service - A Metadata service running on AWS Fargate with a PostgreSQL database on Amazon Relational Database Service to log flow execution metadata.
  • Amazon API Gateway - A dedicated TLS termination point and an optional point of basic API authentication via key to provide secure, encrypted access to the Metadata service.
  • Amazon VPC Networking - A VPC with two customizable subnets and Internet connectivity.
  • AWS Identity and Access Management - Dedicated roles that follow the principle of least privilege for access to resources such as AWS Batch and Amazon SageMaker Notebook instances.
  • AWS Lambda - An AWS Lambda function that automates any migrations needed for the Metadata service.
To deploy the template, just follow the instructions listed here.
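Once the resources are deployed, Metaflow needs to know where they are. The sketch below shows one way the components listed above could map onto Metaflow configuration keys. The keys of EXAMPLE_OUTPUTS and all of their values are hypothetical placeholders, not the template's actual output names, and the METAFLOW_* keys should be checked against the Metaflow version in use.

```python
import json

# Hypothetical deployment outputs. These names and values are placeholders
# chosen for illustration; they are not the template's actual outputs.
EXAMPLE_OUTPUTS = {
    "s3_bucket": "example-metaflow-bucket",
    "batch_job_queue": "example-job-queue",
    "batch_s3_access_role_arn": "arn:aws:iam::123456789012:role/example-batch-s3",
    "metadata_service_url": "https://example.execute-api.us-west-2.amazonaws.com/api/",
    "step_functions_role_arn": "arn:aws:iam::123456789012:role/example-sfn",
    "eventbridge_role_arn": "arn:aws:iam::123456789012:role/example-events",
    "dynamodb_table": "example-sfn-state",
    "api_key": None,  # API Gateway key authentication is optional
}


def build_metaflow_config(outputs):
    """Map deployment outputs to Metaflow configuration keys (a sketch)."""
    config = {
        # Amazon S3 as the centralized storage backend
        "METAFLOW_DEFAULT_DATASTORE": "s3",
        "METAFLOW_DATASTORE_SYSROOT_S3": f"s3://{outputs['s3_bucket']}/metaflow",
        # Metadata service reached through Amazon API Gateway
        "METAFLOW_DEFAULT_METADATA": "service",
        "METAFLOW_SERVICE_URL": outputs["metadata_service_url"],
        # AWS Batch job queue and the role its jobs use to reach S3
        "METAFLOW_BATCH_JOB_QUEUE": outputs["batch_job_queue"],
        "METAFLOW_ECS_S3_ACCESS_IAM_ROLE": outputs["batch_s3_access_role_arn"],
        # AWS Step Functions, Amazon EventBridge, and Amazon DynamoDB
        "METAFLOW_SFN_IAM_ROLE": outputs["step_functions_role_arn"],
        "METAFLOW_EVENTS_SFN_ACCESS_IAM_ROLE": outputs["eventbridge_role_arn"],
        "METAFLOW_SFN_DYNAMO_DB_TABLE": outputs["dynamodb_table"],
    }
    # Include the auth key only when API key authentication is enabled.
    if outputs.get("api_key"):
        config["METAFLOW_SERVICE_AUTH_KEY"] = outputs["api_key"]
    return config


if __name__ == "__main__":
    print(json.dumps(build_metaflow_config(EXAMPLE_OUTPUTS), indent=2))
```

Building the configuration from a single dictionary keeps the optional API key handling in one place; the resulting JSON is the kind of content that could be saved as a Metaflow configuration file.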