Assignment Goal
Reference: Assignment Requirements
Keep the Assignment 2 behavior, but deploy the infrastructure with AWS CDK instead of manually creating resources in the AWS Console.
Business flow stays the same:
- S3 object changes are tracked
- total size history is stored in DynamoDB
- plotting Lambda generates a chart
- driver Lambda runs the demo sequence
CDK Project Setup
Typical project setup commands (distinct from cdk bootstrap, which is covered under Deploy Workflow):
cdk init app --language python
python3 -m venv .venv
source .venv/bin/activate
./.venv/bin/python -m pip install -r requirements.txt
Local synth-style check:
./.venv/bin/python app.py
Project Layout
assignment3/
├── app.py
├── assignment3_app/
│ ├── storage_stack.py
│ └── lambda_stack.py
├── lambdas/
│ ├── size_tracking_lambda.py
│ ├── plotting_lambda.py
│ └── driver_lambda.py
└── tests/
StorageStack
Creates the stateful resources first.
- data S3 bucket
- plot S3 bucket
- DynamoDB table
- GSI: GSI_SizeByBucket
Purpose:
- persist bucket size history
- store generated plot image
LambdaStack
Consumes the storage resources.
Creates:
- size-tracking Lambda (Python 3.12)
- plotting Lambda (Python 3.9 + matplotlib layer)
- driver Lambda (Python 3.12)
- API Gateway REST API
- EventBridge rule
- IAM permissions
- CloudWatch Log Groups
plotting_lambda.py
Reads history and generates the plot.
Does:
- get latest DynamoDB record
- query items within 10 seconds before that latest record
- query historical maximum from GSI
- generate matplotlib chart
- upload image to plot bucket
Only triggered via API Gateway (no automatic trigger)
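The two reads described above (latest record, then the 10-second window before it) can be sketched as follows. This is a minimal illustration, not the assignment's actual code: it assumes `time` is stored as a numeric epoch-seconds value, that `table` behaves like a boto3 DynamoDB `Table` resource, and it ignores pagination. `FakeTable` is a hypothetical in-memory stand-in so the sketch runs without AWS.

```python
class FakeTable:
    """Hypothetical in-memory stand-in for a boto3 Table, for local runs only."""

    def __init__(self, items):
        self.items = items

    def query(self, **kw):
        vals = kw["ExpressionAttributeValues"]
        rows = [i for i in self.items if i["bucket_name"] == vals[":b"]]
        if ":lo" in vals:
            rows = [i for i in rows if vals[":lo"] <= i["time"] <= vals[":hi"]]
        rows.sort(key=lambda i: i["time"],
                  reverse=not kw.get("ScanIndexForward", True))
        return {"Items": rows[: kw.get("Limit", len(rows))]}


def latest_item(table, bucket_name):
    """Newest record for the bucket: descending sort-key order, one item."""
    resp = table.query(
        KeyConditionExpression="bucket_name = :b",
        ExpressionAttributeValues={":b": bucket_name},
        ScanIndexForward=False,
        Limit=1,
    )
    items = resp.get("Items", [])
    return items[0] if items else None


def recent_window(table, bucket_name, window_seconds=10):
    """Records within `window_seconds` before (and including) the latest one."""
    latest = latest_item(table, bucket_name)
    if latest is None:
        return []
    end = latest["time"]
    resp = table.query(
        # "time" is aliased as #t to avoid clashing with DynamoDB reserved words.
        KeyConditionExpression="bucket_name = :b AND #t BETWEEN :lo AND :hi",
        ExpressionAttributeNames={"#t": "time"},
        ExpressionAttributeValues={
            ":b": bucket_name,
            ":lo": end - window_seconds,
            ":hi": end,
        },
    )
    return resp.get("Items", [])


table = FakeTable([
    {"bucket_name": "demo", "time": 100, "total_size": 18},
    {"bucket_name": "demo", "time": 105, "total_size": 27},
    {"bucket_name": "demo", "time": 120, "total_size": 0},
    {"bucket_name": "demo", "time": 125, "total_size": 2},
])
window = recent_window(table, "demo")
print([item["total_size"] for item in window])
```

Anchoring the window on the latest record rather than on the current clock means the plot still has data when the API is called some time after the last S3 change.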
S3: DataBucket
DynamoDB Data Model
Main table:
- partition key: bucket_name
- sort key: time
GSI:
- partition key: bucket_name
- sort key: total_size
Why:
- query recent history from main table
- query historical max without using scan
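To make the key layout concrete, here is the same model written in the shape that boto3's `create_table` and `query` calls accept. In this project the table is defined in StorageStack, so this is only an illustration. The table name, the attribute types (`S` for `bucket_name`, `N` for `time` and `total_size`), the projection, and the billing mode are assumptions, not taken from the assignment.

```python
# Assumed table name; the real one comes from the CDK stack.
TABLE_NAME = "BucketSizeHistory"

TABLE_DEFINITION = {
    "TableName": TABLE_NAME,
    "AttributeDefinitions": [
        {"AttributeName": "bucket_name", "AttributeType": "S"},  # assumed type
        {"AttributeName": "time", "AttributeType": "N"},         # assumed type
        {"AttributeName": "total_size", "AttributeType": "N"},   # assumed type
    ],
    # Main table: one partition per bucket, ordered by time.
    "KeySchema": [
        {"AttributeName": "bucket_name", "KeyType": "HASH"},
        {"AttributeName": "time", "KeyType": "RANGE"},
    ],
    # GSI: same partition, but ordered by total_size.
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "GSI_SizeByBucket",
            "KeySchema": [
                {"AttributeName": "bucket_name", "KeyType": "HASH"},
                {"AttributeName": "total_size", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},  # assumption
        }
    ],
    "BillingMode": "PAY_PER_REQUEST",  # assumption
}


def max_size_query(bucket_name):
    """Query arguments for the historical maximum: read the GSI in
    descending total_size order and keep only the first item."""
    return {
        "TableName": TABLE_NAME,
        "IndexName": "GSI_SizeByBucket",
        "KeyConditionExpression": "bucket_name = :b",
        "ExpressionAttributeValues": {":b": {"S": bucket_name}},
        "ScanIndexForward": False,
        "Limit": 1,
    }


print(max_size_query("demo-data-bucket")["IndexName"])
```

Because the GSI sorts each bucket's items by `total_size`, the maximum is a single-item read rather than a scan over the whole history.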
S3: PlotBucket
driver_lambda.py
Runs the demo sequence:
0. clean up leftover objects
1. create assignment1.txt (18 B)
2. update it (27 B)
3. delete it (0 B)
4. create assignment2.txt (2 B)
5. call plotting API
Expected visible history:
0 -> 18 -> 27 -> 0 -> 2
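The expected history can be checked with a small local simulation of the sequence. This sketch uses an in-memory dict in place of the S3 bucket and placeholder bytes of the stated lengths, since the real file contents are not given here. It records the total size after every step, which is what the size-tracking Lambda would do on each object change.

```python
def run_demo_sequence():
    """Simulate the driver steps and return the total size after each one."""
    bucket = {}   # key -> object body, stands in for the data bucket
    history = []  # total size recorded after each step

    steps = [
        ("cleanup", None, None),
        ("put", "assignment1.txt", b"x" * 18),  # placeholder content, 18 B
        ("put", "assignment1.txt", b"x" * 27),  # update: overwrite with 27 B
        ("delete", "assignment1.txt", None),
        ("put", "assignment2.txt", b"x" * 2),   # placeholder content, 2 B
    ]

    for action, key, body in steps:
        if action == "cleanup":
            bucket.clear()
        elif action == "put":
            bucket[key] = body
        elif action == "delete":
            bucket.pop(key, None)
        history.append(sum(len(b) for b in bucket.values()))

    return history


history = run_demo_sequence()
print(" -> ".join(str(size) for size in history))
```

In the deployed version the driver would typically pause between steps so that each change produces its own history record before the plotting API is called; the pause length is not specified here.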
size_tracking_lambda.py
Triggered by object changes.
Does:
- list objects in tracked bucket
- compute object_cnt
- compute total_size
- write one history item to DynamoDB
Input:
- S3 / EventBridge event
Output:
- DynamoDB history record
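A minimal sketch of the computation, kept free of AWS calls so it can run locally. It assumes `pages` is an iterable of `list_objects_v2` response pages (for example from `s3.get_paginator("list_objects_v2").paginate(Bucket=...)`) and that `time` is stored as epoch seconds. The function names are hypothetical.

```python
import time


def compute_bucket_totals(pages):
    """Return (object_cnt, total_size) across all listing pages."""
    object_cnt = 0
    total_size = 0
    for page in pages:
        # "Contents" is absent from the response when the bucket is empty.
        for obj in page.get("Contents", []):
            object_cnt += 1
            total_size += obj["Size"]
    return object_cnt, total_size


def build_history_item(bucket_name, pages, now=None):
    """Build the single history record written per object change."""
    object_cnt, total_size = compute_bucket_totals(pages)
    return {
        "bucket_name": bucket_name,
        # Assumption: epoch seconds; two events in the same second would
        # overwrite each other under a (bucket_name, time) key.
        "time": int(time.time()) if now is None else now,
        "object_cnt": object_cnt,
        "total_size": total_size,
    }


pages = [
    {"Contents": [{"Key": "assignment1.txt", "Size": 18}]},
    {"Contents": [{"Key": "assignment2.txt", "Size": 2}]},
]
item = build_history_item("demo-data-bucket", pages, now=1700000000)
print(item["object_cnt"], item["total_size"])
```

Recomputing the totals from a fresh listing, rather than adding or subtracting the size carried in the event, keeps each record correct even if an event is missed or delivered twice.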
End-to-End Flow
Automated tracking:
S3 object change
→ EventBridge (Object Created / Deleted)
→ size_tracking_lambda
→ DynamoDB history record
Plot generation (manual only):
driver_lambda
→ API Gateway
→ plotting_lambda
→ plot bucket
plotting_lambda has NO automatic trigger.
It is only invoked via API Gateway.
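For the automated path, the EventBridge rule needs an event pattern that selects S3 object events for the tracked bucket, and the Lambda needs to read the bucket name back out of the event. The sketch below shows both. The `matches` helper is a deliberately simplified local approximation of EventBridge matching, included only so the pattern can be tried without AWS, and the bucket name is a placeholder. S3 only sends these events when EventBridge notifications are enabled on the bucket.

```python
def s3_object_event_pattern(bucket_name):
    """Event pattern for object create/delete events on one bucket."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created", "Object Deleted"],
        "detail": {"bucket": {"name": [bucket_name]}},
    }


def bucket_from_event(event):
    """Extract the bucket name from an S3 event delivered via EventBridge."""
    return event["detail"]["bucket"]["name"]


def matches(pattern, event):
    """Simplified matcher: lists mean 'any of', dicts recurse."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


pattern = s3_object_event_pattern("demo-data-bucket")
sample_event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {
        "bucket": {"name": "demo-data-bucket"},
        "object": {"key": "assignment1.txt", "size": 18},
    },
}
print(matches(pattern, sample_event), bucket_from_event(sample_event))
```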
Deploy Workflow
Typical order:
- verify AWS account / region
- cdk bootstrap if needed
- cdk synth
- cdk deploy
- invoke driver Lambda
- validate DynamoDB + plot output
- cdk destroy when done
CDK Toolkit (CDKToolkit)
Created by cdk bootstrap.
This is the CDK bootstrap stack, not your business stack.
Typically provides:
- bootstrap S3 bucket for CDK assets
- IAM roles used during deployment
- ECR repository for container image assets and an SSM parameter that records the bootstrap version
Why it appears:
- CDK needs a place to stage assets before CloudFormation deploys them
CloudFormation
CDK does not create resources directly.
Actual path:
Python CDK code
→ cdk synth
→ CloudFormation template
→ CloudFormation creates / updates stacks
Responsible for:
- stack lifecycle
- updates / rollback
- resource state tracking
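One way to see this path in practice is to inspect the template that synth produces before deploying. The sketch below counts resource types in a CloudFormation template. It assumes the synthesized templates land under `cdk.out/` as `<StackName>.template.json`, which is the usual layout but worth confirming locally. The sample template and its logical IDs are made up for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def resource_type_counts(template):
    """Count resources per CloudFormation type in a template dict."""
    resources = template.get("Resources", {})
    return Counter(res["Type"] for res in resources.values())


def load_template(stack_name, out_dir="cdk.out"):
    """Load a synthesized template (assumed path: cdk.out/<stack>.template.json)."""
    path = Path(out_dir) / f"{stack_name}.template.json"
    return json.loads(path.read_text())


# Made-up sample in the shape of a synthesized storage stack template.
sample_template = {
    "Resources": {
        "DataBucket": {"Type": "AWS::S3::Bucket"},
        "PlotBucket": {"Type": "AWS::S3::Bucket"},
        "HistoryTable": {"Type": "AWS::DynamoDB::Table"},
    }
}
counts = resource_type_counts(sample_template)
for resource_type, count in sorted(counts.items()):
    print(resource_type, count)
```

Comparing these counts against the resources listed for each stack is a quick sanity check before running cdk deploy.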