AWS CloudFormation Deployment

Metaflow ships with an AWS CloudFormation template that automates the deployment of all the AWS resources needed to enable cloud-scaling in Metaflow. The Metaflow Sandbox uses a version of this template to automatically vend out sandboxes for evaluating Metaflow.

The major components of the template are:

  • Amazon S3 - A dedicated private bucket and all appropriate permissions to serve as a centralized storage backend.

  • AWS Batch - A dedicated AWS Batch Compute Environment and Job Queue to extend Metaflow's compute capabilities to the cloud.

  • Amazon CloudWatch - Configuration to store and manage AWS Batch job execution logs.

  • AWS Step Functions - A dedicated role to allow scheduling Metaflow flows on AWS Step Functions.

  • Amazon EventBridge - A dedicated role to allow time-based triggers for Metaflow flows configured on AWS Step Functions.

  • Amazon DynamoDB - A dedicated Amazon DynamoDB table for tracking certain step executions on AWS Step Functions.

  • Amazon SageMaker - An Amazon SageMaker Notebook instance for interfacing with Metaflow flows.

  • AWS Fargate and Amazon Relational Database Service - A Metadata service running on AWS Fargate with a PostgreSQL database on Amazon Relational Database Service to log flow execution metadata.

  • Amazon API Gateway - A dedicated TLS termination point and an optional point of basic API authentication via key, providing secure, encrypted access to the Metadata service.

  • Amazon VPC Networking - A VPC with two customizable subnets and Internet connectivity.

  • AWS Identity and Access Management - Dedicated roles that follow the principle of least privilege for access to resources such as AWS Batch and Amazon SageMaker Notebook instances.

  • AWS Lambda - An AWS Lambda function that automates any migrations needed for the Metadata service.

Steps for AWS CloudFormation Deployment

  1. Navigate to Services and select CloudFormation under the Management and Governance heading (or search for it in the search bar) in your AWS console.

  2. Click Create stack and select With new resources (standard).

  3. Download the template from this location and save it locally.

  4. Ensure Template is ready remains selected, choose Upload a template file, then click Choose file and upload the file saved in the previous step.

  5. Name your stack, select your parameters, and click Next. If you enable APIBasicAuth and/or CustomRole, further configuration is required after deployment.

  6. If desired, feel free to tag your stack in whatever way best fits your organization. When finished, click Next.

  7. Select the check box next to "I acknowledge that AWS CloudFormation might create IAM resources." and click Create stack.

  8. Wait roughly 10-15 minutes for deployment to complete. The Stack status will eventually change to CREATE_COMPLETE.
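The console steps above can also be scripted. The following is a minimal sketch, assuming the AWS CLI is installed and configured, that builds the equivalent `aws cloudformation` commands. The stack name, template filename, and parameter values are hypothetical placeholders, and `CAPABILITY_IAM` is assumed to be the CLI counterpart of the IAM acknowledgement in step 7; a given version of the template may require a different capability.

```python
import subprocess


def create_stack_command(stack_name, template_path, parameters=None):
    """Build the argv for `aws cloudformation create-stack`."""
    cmd = [
        "aws", "cloudformation", "create-stack",
        "--stack-name", stack_name,
        # Upload the locally saved template (step 3 and step 4).
        "--template-body", f"file://{template_path}",
        # Assumed equivalent of the IAM acknowledgement check box (step 7).
        "--capabilities", "CAPABILITY_IAM",
    ]
    if parameters:
        # Template parameters such as APIBasicAuth or CustomRole (step 5).
        cmd.append("--parameters")
        cmd.extend(
            f"ParameterKey={key},ParameterValue={value}"
            for key, value in parameters.items()
        )
    return cmd


def wait_command(stack_name):
    """Build the argv that blocks until the stack reaches CREATE_COMPLETE."""
    return [
        "aws", "cloudformation", "wait", "stack-create-complete",
        "--stack-name", stack_name,
    ]


def deploy(stack_name, template_path, parameters=None):
    """Create the stack, then wait for creation to finish (step 8)."""
    subprocess.run(
        create_stack_command(stack_name, template_path, parameters),
        check=True,
    )
    subprocess.run(wait_command(stack_name), check=True)
```

Calling `deploy("metaflow", "metaflow-cfn-template.yml", {"APIBasicAuth": "true"})` would run both commands in sequence; the parameter value shown is illustrative, so check the template for the values it actually accepts.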

Once complete, the Outputs tab lists values for the components created by this CloudFormation template. Each value corresponds to an environment variable (listed next to the output) that you'll set to enable cloud features within Metaflow.
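The same outputs can be read without the console. As a rough sketch, the function below parses the JSON printed by `aws cloudformation describe-stacks --stack-name <YOUR_STACK_NAME>` into a dictionary keyed by output name. The sample ARN and description are made up for illustration; only the `MetaflowUserRoleArn` key is taken from this document.

```python
import json


def stack_outputs(describe_stacks_json):
    """Map each OutputKey to a (value, description) pair."""
    stack = json.loads(describe_stacks_json)["Stacks"][0]
    return {
        out["OutputKey"]: (out["OutputValue"], out.get("Description", ""))
        for out in stack.get("Outputs", [])
    }


# Hypothetical, abbreviated response used only to show the shape of the data.
sample = json.dumps({
    "Stacks": [{
        "Outputs": [{
            "OutputKey": "MetaflowUserRoleArn",
            "OutputValue": "arn:aws:iam::123456789012:role/example",
            "Description": "Example description",
        }]
    }]
})

for key, (value, description) in stack_outputs(sample).items():
    print(f"{key} = {value}  ({description})")
```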

Additional Configuration

Did you choose to enable APIBasicAuth and/or CustomRole and are wondering how they work? Below are some details on what happens when those features are enabled and how to make use of them.

  • APIBasicAuth - In addition to TLS termination, Amazon API Gateway can generate an API key and restrict access to requests that pass that key in the 'x-api-key' HTTP header. This keeps flow information inaccessible from the general Internet while still allowing remote connectivity for authenticated clients. However, enabling this feature means that you'll need to request the API key from Amazon API Gateway, as exposing a credential as an output from CloudFormation is a potential security problem. CloudFormation does, however, output the ID of the API key that corresponds to your stack, making it easy to get the key and pass it to Metaflow. Follow one of the two instructions below to get the value for METAFLOW_SERVICE_AUTH_KEY.

    1. From the AWS CLI, run the following: aws apigateway get-api-key --api-key <YOUR_KEY_ID_FROM_CFN> --include-value | grep value

    2. From the AWS Console, navigate to Services and select API Gateway from Networking & Content Delivery (or search for it in the search bar). Click on your API, select API Keys from the left side, select the API key that corresponds to your stack name, and click Show next to API Key.

  • CustomRole - This template can create an optional role, assumable by users (or applications), with permissions limited to the resources required by Metaflow: the Amazon S3 bucket, AWS Batch Compute Environment, and Amazon SageMaker Notebook instance created by this template. You will, however, need to modify the trust policy for the role to grant access to the principals (users/roles/accounts) who will assume it, and you'll also need to have your users configure an appropriate role-assumption profile. The ARN of the Custom Role can be found in the Outputs tab of the CloudFormation stack under MetaflowUserRoleArn. To modify the trust policy to allow new principals, follow the directions here. Once you've granted access to the principals of your choice, have your users create a new profile for the AWS CLI that assumes the role ARN by following the directions here.
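Both follow-up steps above can be sketched in a few lines. This is an illustrative sketch rather than an official procedure: `api_key_value` reads the `value` field from the JSON printed by `aws apigateway get-api-key --include-value`, `auth_headers` shows the 'x-api-key' header a client would send, and `role_profile` renders a role-assumption section of the kind stored in `~/.aws/config`. The profile name, source profile, and ARN are hypothetical placeholders.

```python
import configparser
import io
import json


def api_key_value(get_api_key_json):
    """Return the key value, suitable for METAFLOW_SERVICE_AUTH_KEY."""
    return json.loads(get_api_key_json)["value"]


def auth_headers(api_key):
    """Header that API Gateway checks when APIBasicAuth is enabled."""
    return {"x-api-key": api_key}


def role_profile(profile_name, role_arn, source_profile="default"):
    """Render an AWS CLI config section that assumes role_arn."""
    config = configparser.ConfigParser()
    config[f"profile {profile_name}"] = {
        # ARN from the MetaflowUserRoleArn stack output.
        "role_arn": role_arn,
        # Existing profile whose credentials are used to assume the role.
        "source_profile": source_profile,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Hypothetical ARN; substitute the value from your own stack outputs.
print(role_profile("metaflow", "arn:aws:iam::123456789012:role/example"))
```

Users would then select the profile in the usual way for the AWS CLI and SDKs, for example by setting `AWS_PROFILE` to the profile name.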

Once you have followed all these steps, you can configure your Metaflow installation using the outputs from the CloudFormation stack.