AWS CDK Multi-Modal Agent Deployment

Introduction

This Python-based AWS CDK (Cloud Development Kit) project demonstrates how to deploy serverless Lambda functions built with the Strands Agent framework. The project includes two Lambda functions:

Weather Forecasting Agent - A simple agent that provides weather forecasting capabilities
Multi-Modal Processing Agent - An advanced agent that can process and analyze different types of media (images, documents, videos)

Prerequisites

AWS CLI installed and configured
AWS CDK CLI installed (provides the cdk command used for bootstrap, deploy, and destroy)
Python 3.8 or later
jq (optional) for formatting JSON output
AWS account with Bedrock access

Project Structure

agent_lambda/ - Contains the CDK stack definition in Python
app.py - Main CDK application entry point
layers/ - Contains Lambda layers for the Strands Agent framework
- package_for_lambda.py - Python script that packages Lambda code and dependencies into deployment archives
- lambda_requirements.txt - Dependencies for the Lambda functions
lambdas/code/ - Contains the Lambda function code
- lambda-s-agent/ - Weather forecasting agent Lambda function
- lambda-s-multimodal/ - Multi-modal processing agent Lambda function

Setup and Deployment

Create a Python virtual environment and install dependencies:

# Create a Python virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install CDK dependencies
pip install -r requirements.txt

# Install the Lambda dependencies as prebuilt wheels for Python 3.12 on Linux arm64 (the Lambda target)
pip install -r layers/lambda_requirements.txt --python-version 3.12 --platform manylinux2014_aarch64 --target layers/strands/_dependencies --only-binary=:all:

Package the lambda layers:

python layers/package_for_lambda.py
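
The contents of package_for_lambda.py are not reproduced in this README. As a rough sketch of what a packaging step like this typically involves, the snippet below zips a dependencies directory into a layer archive. Lambda expects Python layer content under a top-level python/ directory. The function name build_layer_zip and any paths passed to it are hypothetical, not taken from the repository's script.

```python
import zipfile
from pathlib import Path


def build_layer_zip(dependencies_dir, output_zip):
    """Zip every file under dependencies_dir into output_zip under a python/ prefix.

    Hypothetical helper for illustration; returns the number of files archived.
    """
    root = Path(dependencies_dir)
    out = Path(output_zip)
    out.parent.mkdir(parents=True, exist_ok=True)
    count = 0
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            # Skip directories, and skip the archive itself if it lives under root
            if not path.is_file() or path.resolve() == out.resolve():
                continue
            # Layers are extracted to /opt, and Python imports from /opt/python
            arcname = (Path("python") / path.relative_to(root)).as_posix()
            archive.write(path, arcname)
            count += 1
    return count
```

A call such as build_layer_zip("layers/strands/_dependencies", "layers/strands/layer.zip") would archive the directory populated by the pip command above; the output file name here is an assumption.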

Bootstrap your AWS environment (if not already done):

cdk bootstrap

Deploy the stack:

cdk deploy

Usage

After deployment, you can invoke the Lambda functions using the AWS CLI or AWS Console.

Weather Forecasting Agent

aws lambda invoke --function-name AgentSFunction \
      --region us-east-2 \
      --cli-binary-format raw-in-base64-out \
      --payload '{"prompt": "What is the weather in New York?"}' \
      output.json

Multi-Modal Processing Agent

aws lambda invoke --function-name MultimodalSFunction \
      --region us-east-2 \
      --cli-binary-format raw-in-base64-out \
      --payload '{"prompt": "Analyze this image", "s3object": "s3://your-bucket/path/to/image.jpg"}' \
      output.json
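
The same invocations can be made from Python. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured; the helper names build_payload and invoke_agent are hypothetical, while the function names, region, and payload keys are the ones used in the CLI commands above.

```python
import json


def build_payload(prompt, s3object=None):
    """Build the JSON payload used in the CLI examples, encoded as bytes."""
    payload = {"prompt": prompt}
    if s3object is not None:
        # Only the multi-modal agent takes an S3 object reference
        payload["s3object"] = s3object
    return json.dumps(payload).encode("utf-8")


def invoke_agent(function_name, prompt, s3object=None, region="us-east-2"):
    """Invoke a deployed agent synchronously and return the decoded JSON response."""
    import boto3  # imported lazily so build_payload works without boto3 installed

    client = boto3.client("lambda", region_name=region)
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=build_payload(prompt, s3object),
    )
    # The response body is a stream; read it fully, then parse it as JSON
    return json.loads(response["Payload"].read())
```

For example, invoke_agent("AgentSFunction", "What is the weather in New York?") mirrors the first CLI command, and passing s3object="s3://your-bucket/path/to/image.jpg" with "MultimodalSFunction" mirrors the second.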

If you have jq installed, you can pretty-print the response saved in output.json:

jq -r '.' ./output.json

Otherwise, open output.json to view the result.
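
If jq is not available, Python's standard library can do the same formatting. This is a small sketch; the helper name format_response is hypothetical, and it assumes output.json contains valid JSON.

```python
import json


def format_response(path="output.json"):
    """Return the JSON document at path re-serialized with two-space indentation."""
    with open(path, encoding="utf-8") as handle:
        return json.dumps(json.load(handle), indent=2, ensure_ascii=False)
```

Calling print(format_response()) from the directory containing output.json prints the indented response. Running python -m json.tool output.json from the shell gives a similar result without writing any code.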

Multi-Modal Processing Capabilities

The Multi-Modal Processing Agent can handle various types of media:

Images: PNG, JPEG/JPG, GIF, WebP
Documents: PDF, CSV, DOCX, XLS, XLSX
Videos: MP4, MOV, AVI, MKV, WebM

The agent uses custom tools built with the Strands Agent framework to process and analyze these media types.
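
The agent's actual tools are not shown in this README. To illustrate how an S3 object might be routed to the right kind of tool, the sketch below classifies an S3 URI by file extension using the formats listed above. The names MEDIA_TYPES and classify_media are hypothetical, and choosing by extension alone is an assumption; the real tools may inspect content differently.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Supported extensions, taken from the list of media types above
MEDIA_TYPES = {
    "image": {"png", "jpeg", "jpg", "gif", "webp"},
    "document": {"pdf", "csv", "docx", "xls", "xlsx"},
    "video": {"mp4", "mov", "avi", "mkv", "webm"},
}


def classify_media(s3_uri):
    """Return 'image', 'document', or 'video' for a supported S3 URI, else None."""
    key = urlparse(s3_uri).path  # object key, e.g. "/path/to/image.jpg"
    extension = PurePosixPath(key).suffix.lower().lstrip(".")
    for category, extensions in MEDIA_TYPES.items():
        if extension in extensions:
            return category
    return None
```

With this sketch, the s3object value from the usage example, s3://your-bucket/path/to/image.jpg, is classified as an image, and an unsupported extension returns None so the caller can reject the request early.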

Cleanup

To remove all resources created by this example:

cdk destroy
