1. Core Idea
Assignment 3 keeps the Assignment 2 application behavior, but replaces manual AWS Console setup with AWS CDK.
In practice, this means:
- the Lambda code still performs the same business logic
- the infrastructure is now defined in Python CDK code
- deployment, updates, and cleanup should happen through CDK and CloudFormation
2. Functional Behavior Inherited from Assignment 2
The system still needs the same core flow:
- object changes happen in S3
- a size-tracking Lambda records bucket history in DynamoDB
- a plotting Lambda generates a chart for the recent time window
- a driver Lambda performs the demo sequence and triggers plotting
The driver sequence is still:
- create assignment1.txt with "Empty Assignment 1"
- update assignment1.txt to "Empty Assignment 2222222222"
- delete assignment1.txt
- create assignment2.txt with "33"
- call the plotting API
3. Infrastructure Requirements
The assignment expects CDK to create the resources instead of manual console clicks.
That includes:
- S3 storage
- DynamoDB table
- three Lambda functions
- event wiring from S3 changes to the size-tracking flow
- a REST API for plotting
The design expectations are also important:
- use CDK instead of manual setup
- split resources into a reasonable number of stacks
- avoid hardcoding physical resource names
4. Current Project Interpretation
The current project uses two stacks:
- StorageStack
- LambdaStack
And the current implementation creates:
- 2 S3 buckets
- 1 DynamoDB table
- 1 GSI
- 3 Lambda functions
- 1 API Gateway REST API
- 1 EventBridge rule
Important detail:
- the assignment text is usually read as requiring a single bucket
- the current implementation uses one data bucket and one separate plot bucket
- that is still a reasonable design because the data bucket is the tracked bucket and the plot bucket stores the generated image output
5. Resource-Level Expectations
5.1 Data Model
The DynamoDB table should support storing history by bucket over time.
The current schema is:
- partition key: bucket_name
- sort key: time
The current GSI is:
- name: GSI_SizeByBucket
- partition key: bucket_name
- sort key: total_size
This supports:
- recent-history queries from the main table
- historical-maximum queries without using scan
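The two access patterns can be sketched as builders for the keyword arguments of a boto3 Table.query call. This is an illustration under assumptions: time is taken to be a numeric epoch timestamp in seconds, and the helper names recent_history_query and max_size_query are hypothetical.

```python
GSI_NAME = "GSI_SizeByBucket"


def recent_history_query(bucket_name: str, now: int, window_seconds: int = 10) -> dict:
    """Arguments for querying the main table over the last `window_seconds`."""
    return {
        # "#t" aliases the attribute name `time`, so the expression does not
        # depend on whether that word is reserved in DynamoDB expressions.
        "KeyConditionExpression": "bucket_name = :b AND #t BETWEEN :start AND :end",
        "ExpressionAttributeNames": {"#t": "time"},
        "ExpressionAttributeValues": {
            ":b": bucket_name,
            ":start": now - window_seconds,
            ":end": now,
        },
    }


def max_size_query(bucket_name: str) -> dict:
    """Arguments for reading the largest total_size from the GSI without a scan."""
    return {
        "IndexName": GSI_NAME,
        "KeyConditionExpression": "bucket_name = :b",
        "ExpressionAttributeValues": {":b": bucket_name},
        "ScanIndexForward": False,  # descending by the GSI sort key, total_size
        "Limit": 1,  # the first item in descending order is the maximum
    }
```

With a boto3 Table resource, these would be used as table.query(**recent_history_query(bucket, now)) and table.query(**max_size_query(bucket)). Because the GSI sorts by total_size within each bucket, the maximum is a single-item query instead of a full-table scan.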
5.2 Size-Tracking Lambda
This Lambda should:
- respond to object create and delete activity
- compute current object count and total size
- write a new history record to DynamoDB
Each record should include at least:
- bucket_name
- time
- object_cnt
- total_size
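A minimal sketch of the size-tracking logic, assuming the S3 client and the DynamoDB table are passed in (for example boto3.client("s3") and a boto3 Table resource). The function names and the use of whole epoch seconds for time are assumptions made for illustration.

```python
import time
from typing import Any, Iterable, Optional, Tuple


def summarize_pages(pages: Iterable[dict]) -> Tuple[int, int]:
    """Return (object_cnt, total_size) from list_objects_v2 result pages."""
    object_cnt = 0
    total_size = 0
    for page in pages:
        # "Contents" is absent when a page (or the whole bucket) is empty.
        for obj in page.get("Contents", []):
            object_cnt += 1
            total_size += obj["Size"]
    return object_cnt, total_size


def record_bucket_size(
    s3_client: Any, table: Any, bucket_name: str, now: Optional[int] = None
) -> dict:
    """List the bucket, compute its current size, and write one history record."""
    paginator = s3_client.get_paginator("list_objects_v2")
    object_cnt, total_size = summarize_pages(paginator.paginate(Bucket=bucket_name))
    item = {
        "bucket_name": bucket_name,
        # Assumption: epoch seconds. Two events in the same second would share
        # a key, so the later write would replace the earlier one.
        "time": int(time.time()) if now is None else now,
        "object_cnt": object_cnt,
        "total_size": total_size,
    }
    table.put_item(Item=item)
    return item
```

Recomputing the totals from a fresh listing on every event keeps each record self-contained, at the cost of one list call per object change.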
5.3 Plotting Lambda
This Lambda should:
- query the last 10 seconds of data for the tracked bucket
- query the historical maximum from the GSI
- generate a plot
- upload the plot to S3
- be callable synchronously through API Gateway
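The last three steps (generate, upload, respond) can be sketched as one function that takes the query results as inputs. This assumes a Lambda proxy integration behind API Gateway; publish_plot, render_png, and the default object key plot.png are hypothetical names, and the actual chart rendering is left to the caller.

```python
import json
from typing import Any, Callable, List


def publish_plot(
    recent_items: List[dict],
    max_size: int,
    s3_client: Any,
    render_png: Callable[[List[dict], int], bytes],
    plot_bucket: str,
    plot_key: str = "plot.png",
) -> dict:
    """Render the chart, upload it to S3, and return a proxy-style response."""
    # Order points by time so the chart reads left to right.
    points = sorted(recent_items, key=lambda item: item["time"])
    png_bytes = render_png(points, max_size)
    s3_client.put_object(
        Bucket=plot_bucket, Key=plot_key, Body=png_bytes, ContentType="image/png"
    )
    # Proxy integrations expect a statusCode and a string body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(
            {
                "bucket": plot_bucket,
                "key": plot_key,
                "points": len(points),
                # int() also converts Decimal values returned for DynamoDB numbers.
                "max_size": int(max_size),
            }
        ),
    }
```

Returning only after the upload completes is what makes the call synchronous from the caller's point of view: a 200 response means the plot object already exists in the bucket.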
5.4 Driver Lambda
This Lambda should:
- perform the required file operations in order
- leave enough time between steps for separate data points
- invoke the plotting API
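A rough sketch of the driver sequence, assuming an injected S3 client, a plain GET request to the plotting API, and a 2-second pause between steps. The pause length, the function names, and the HTTP method are assumptions; the file names and contents come from the driver sequence in section 2.

```python
import time
import urllib.request
from typing import Any, Callable, Optional

# (action, key, body) in the required order.
STEPS = [
    ("put", "assignment1.txt", b"Empty Assignment 1"),
    ("put", "assignment1.txt", b"Empty Assignment 2222222222"),
    ("delete", "assignment1.txt", None),
    ("put", "assignment2.txt", b"33"),
]


def call_plot_api(url: str) -> int:
    """Call the plotting API with a GET request and return the HTTP status."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.status


def run_driver(
    s3_client: Any,
    bucket_name: str,
    plot_api_url: str,
    pause_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    call_api: Optional[Callable[[str], int]] = None,
) -> int:
    """Run the file operations in order, pausing after each, then request a plot."""
    for action, key, body in STEPS:
        if action == "put":
            s3_client.put_object(Bucket=bucket_name, Key=key, Body=body)
        else:
            s3_client.delete_object(Bucket=bucket_name, Key=key)
        # Give the size-tracking Lambda time to write a separate data point.
        sleep(pause_seconds)
    return (call_api or call_plot_api)(plot_api_url)
```

The pauses must add up to less than the plotting window if every point is meant to appear in the chart: four 2-second pauses total 8 seconds, which fits inside a 10-second window.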
6. Naming Requirement
The important requirement is to avoid hardcoded physical names for deployable AWS resources.
For the current project, the practical interpretation is:
- do not hardcode bucket names
- do not hardcode Lambda function names
- do not hardcode CloudWatch LogGroup names
- let CDK generate those physical names
Using a fixed internal identifier such as the DynamoDB GSI name is acceptable because it is part of the table schema and is passed into Lambda code through environment variables.
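On the Lambda side, this could look like the sketch below: the code reads generated names from environment variables instead of embedding them. The variable names DATA_BUCKET, PLOT_BUCKET, TABLE_NAME, and GSI_NAME and the helper load_config are hypothetical, chosen only for illustration.

```python
import os
from typing import Mapping

# Hypothetical variable names that the CDK stack would set on each function.
REQUIRED = ("DATA_BUCKET", "PLOT_BUCKET", "TABLE_NAME", "GSI_NAME")


def load_config(env: Mapping[str, str] = os.environ) -> dict:
    """Read resource names from the environment and fail fast if any is missing."""
    missing = [name for name in REQUIRED if name not in env]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {name.lower(): env[name] for name in REQUIRED}
```

Failing at startup on a missing variable surfaces a wiring mistake in the stack immediately, instead of as a confusing error on the first AWS call.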
7. Demo Checklist
For the demo, you should be able to show:
- the repo at the required commit
- successful CDK deployment
- both stacks in CloudFormation
- the deployed AWS resources
- manual invocation of the driver Lambda
- DynamoDB history records
- the generated plot object
8. Operational Note
This project intentionally uses cleanup-friendly settings:
- S3 buckets use RemovalPolicy.DESTROY
- S3 buckets use auto_delete_objects=True
- the DynamoDB table uses RemovalPolicy.DESTROY
That makes redeploying easier for a class assignment, but it also means stack deletion or replacement can remove data.
9. One-Sentence Summary
Assignment 3 is Assignment 2 behavior packaged as CDK-managed infrastructure, with emphasis on reasonable stack design and non-hardcoded deployable resource names.