Metaflow ships with a Terraform template that automates the deployment of all the AWS resources needed to enable cloud-scaling in Metaflow.
The major components of the template are:
Amazon S3 - A dedicated private bucket and all appropriate permissions to serve as a centralized storage backend.
AWS Batch - A dedicated AWS Batch Compute Environment and Job Queue to extend Metaflow's compute capabilities to the cloud.
Amazon CloudWatch - Configuration to store and manage AWS Batch job execution logs.
AWS Step Functions - A dedicated role to allow scheduling Metaflow flows on AWS Step Functions.
Amazon EventBridge - A dedicated role to allow time-based triggers for Metaflow flows configured on AWS Step Functions.
Amazon DynamoDB - A dedicated Amazon DynamoDB table for tracking certain step executions on AWS Step Functions.
Amazon SageMaker - An Amazon SageMaker Notebook instance for interfacing with Metaflow flows.
AWS Fargate and Amazon Relational Database Service - A Metadata service running on AWS Fargate, backed by a PostgreSQL database on Amazon Relational Database Service, to store flow execution metadata.
Amazon API Gateway - A dedicated TLS termination point, with optional API-key authentication, to provide secure, encrypted access to the Metadata service.
Amazon VPC Networking - A VPC with two customizable subnets and Internet connectivity.
AWS Identity and Access Management - Dedicated roles that follow the principle of least privilege for access to resources such as AWS Batch and Amazon SageMaker Notebook instances.
AWS Lambda - An AWS Lambda function that automates any migrations needed for the Metadata service.
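Once these resources are provisioned and Metaflow is configured to use them, the cloud-scaling features above are driven from the standard Metaflow CLI. As an illustrative sketch (the flow file `my_flow.py` is hypothetical):

```shell
# Run every step of a (hypothetical) flow on the AWS Batch compute
# environment created by the template, instead of on the local machine:
python my_flow.py run --with batch

# Compile and deploy the same flow to AWS Step Functions for
# production scheduling:
python my_flow.py step-functions create
```

The `--with batch` option applies the `@batch` decorator to every step at runtime, so no code changes are needed to move a flow from local execution to the cloud.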
To deploy the template, follow the instructions listed here.
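The deployment itself follows the standard Terraform workflow; the exact input variables depend on the template version, so the commands below are only a minimal sketch, assuming the template has been cloned locally and AWS credentials are available:

```shell
# Initialize the working directory (downloads providers and modules):
terraform init

# Preview the AWS resources the template will create:
terraform plan

# Create the resources; Terraform prints outputs (S3 bucket, Batch job
# queue, Metadata service URL, etc.) used to configure Metaflow:
terraform apply

# Interactively point Metaflow at the newly created resources:
metaflow configure aws
```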