June 3, 2025
Need to react the instant a file lands in a bucket without babysitting a server? AWS Lambda + Amazon S3 is your peanut-butter-and-jelly combo. This how-to walks you through wiring the two together with Python, sprinkling in battle-tested tips (and a dash of Glitched-Goblet snark) along the way. I recently used this setup to automate a data pipeline, and it’s been a game-changer. So I thought I’d share the recipe.
| What | Why |
| --- | --- |
| An AWS account with Admin or equivalent rights | You'll create buckets, roles, and functions. |
| AWS CLI + configured credentials | Faster than clicking in the console. |
| Python 3.11+ locally (optional) | Handy for zipping up deployments or unit tests. |
| A splash of IAM know-how | Least-privilege saves future-you from 3 a.m. alerts. |
| Piece | Role |
| --- | --- |
| Handler (`lambda_handler`) | The first line executed when Lambda fires. Think of it as `main()` for serverless. (AWS Documentation) |
| Event | A JSON blob describing why the function was invoked: bucket name, object key, etc. |
| Context | Metadata about this invocation (request ID, memory limit, log group, etc.). Great for tracing and perf metrics. (AWS Documentation) |
Tip: Dump `json.dumps(event, indent=2)` to CloudWatch on day 1. Future you will thank current you when the event shape inevitably changes.
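For reference, here's a trimmed sketch of the event an `s3:ObjectCreated:*` notification delivers (real events carry more fields, like region, timestamps, and requester info):

```json
{
  "Records": [
    {
      "eventSource": "aws:s3",
      "eventName": "ObjectCreated:Put",
      "s3": {
        "bucket": { "name": "gg-demo-ingest-bucket" },
        "object": { "key": "uploads/demo.json", "size": 23 }
      }
    }
  ]
}
```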
Create a bucket to hold your files. Use the AWS CLI for speed (bucket names are globally unique, so swap in your own if this one's taken):

```bash
aws s3 mb s3://gg-demo-ingest-bucket
```
Optional filters: use prefix (`uploads/`) or suffix (`.json`) rules to avoid noisy triggers.
Create an IAM role that your Lambda function will assume. This role needs permissions to read from S3 and write logs to CloudWatch.
```bash
aws iam create-role \
  --role-name gg-s3-lambda-role \
  --assume-role-policy-document file://trust.json
```
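That `trust.json` is the trust policy that lets the Lambda service assume the role; a minimal version looks like this:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "lambda.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```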
Attach policies:

- `AWSLambdaBasicExecutionRole` (writes logs)
- `AmazonS3ReadOnlyAccess` (or a tighter bucket-scoped policy)
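If you stick with the managed policies, attaching them from the CLI looks like this:

```bash
aws iam attach-role-policy \
  --role-name gg-s3-lambda-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

aws iam attach-role-policy \
  --role-name gg-s3-lambda-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
```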
Here's a simple Lambda function that reads JSON files from S3 and logs the contents. It uses structured logging for better observability. We'll use `boto3` to interact with S3 and `json` to parse the file contents.
In order to get the file, we'll need the `bucket` and `key` from the event object. The `key` is URL-encoded, so we use `urllib.parse.unquote_plus` to decode it. We'll then read the file and parse it as JSON.
```python
import json
import logging
import urllib.parse

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Create the client once, outside the handler, so warm invocations reuse it
s3 = boto3.client("s3")


def lambda_handler(event, context):
    # S3 notifications typically deliver a single record per invocation
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    key = urllib.parse.unquote_plus(event["Records"][0]["s3"]["object"]["key"])
    logger.info("Processing s3://%s/%s", bucket, key)

    try:
        raw_bytes = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        payload = json.loads(raw_bytes)
        # TODO: do something cool
        logger.debug("Payload: %s", payload)
        return {"statusCode": 200, "body": "Processed ✔️"}
    except Exception:
        # logger.exception captures the full traceback; re-raising marks
        # the invocation as failed so Lambda's retries can kick in
        logger.exception("Failed on %s", key)
        raise
```
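Want to poke at the handler before deploying? Here's a minimal local smoke test (a hypothetical `local_test.py`; it needs AWS credentials and the demo object to actually exist, since the handler really calls S3):

```python
# local_test.py: quick local smoke test for the handler above
from lambda_function import lambda_handler

fake_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "gg-demo-ingest-bucket"},
                "object": {"key": "uploads/demo.json"},
            }
        }
    ]
}

# context is unused by our handler, so None is fine here
print(lambda_handler(fake_event, None))
```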
Here are the highlights of this function:

- `logger` instead of `print` ⇒ structured logs.
- `urllib.parse.unquote_plus` ⇒ keys with spaces or special characters decode correctly.
- `logger.exception` + `raise` ⇒ full traceback in the logs, and the invocation is marked as failed.

Package your function and deploy it to AWS Lambda. First, zip the code, then create the Lambda function using the AWS CLI. Make sure to replace `<acct>` with your actual AWS account ID.
```bash
zip function.zip lambda_function.py

aws lambda create-function \
  --function-name gg-s3-ingest \
  --runtime python3.12 \
  --handler lambda_function.lambda_handler \
  --role arn:aws:iam::<acct>:role/gg-s3-lambda-role \
  --zip-file fileb://function.zip
```
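One easy-to-miss step: S3 also needs permission to invoke the function, or the notification setup below will be rejected. Grant it with `add-permission` (the statement ID is just an arbitrary label):

```bash
aws lambda add-permission \
  --function-name gg-s3-ingest \
  --statement-id s3-invoke \
  --action lambda:InvokeFunction \
  --principal s3.amazonaws.com \
  --source-arn arn:aws:s3:::gg-demo-ingest-bucket
```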
Attach the bucket notification (heads-up: this call replaces the bucket's entire notification configuration, so merge in any existing rules):

```bash
aws s3api put-bucket-notification-configuration \
  --bucket gg-demo-ingest-bucket \
  --notification-configuration '{
    "LambdaFunctionConfigurations": [{
      "LambdaFunctionArn": "arn:aws:lambda:...:function:gg-s3-ingest",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": { "Key": { "FilterRules": [
        { "Name": "suffix", "Value": ".json" }
      ]}}
    }]
  }'
```
AWS Console fan? S3 → Properties → Event notifications does the same thing with a few clicks. (AWS Documentation)
Now, let's test the whole flow. Upload a JSON file to your bucket and watch the magic happen.
echo '{ "hello": "glitch" }' > demo.json
aws s3 cp demo.json s3://gg-demo-ingest-bucket/uploads/
Watch CloudWatch → Log groups → `/aws/lambda/gg-s3-ingest` for your log lines.
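Prefer to stay in the terminal? AWS CLI v2 can stream the same log group:

```bash
aws logs tail /aws/lambda/gg-s3-ingest --follow
```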
This step is extra credit. If you need to fan out work or keep your first function skinny, you can invoke another Lambda function from within your handler. This is useful for decoupling tasks or offloading heavy processing.
You can call another Lambda function using the `boto3` client. You'll need the other Lambda's name or ARN. Here's how you can do it:
```python
import json
import logging

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)
client = boto3.client("lambda")

def lambda_handler(event, context):
    payload = {"CustomerId": "123", "Amount": 50}
    resp = client.invoke(
        FunctionName="gg-payment-processor",  # name or full ARN
        InvocationType="RequestResponse",     # or "Event" for async
        Payload=json.dumps(payload).encode(),
    )
    # resp["Payload"] is a streaming body; json.load parses it directly
    result = json.load(resp["Payload"])
    logger.info("Downstream result: %s", result)
    return result
```
Permissions check: the calling function's role needs `lambda:InvokeFunction` on the callee.
Now, because this was a test, let's clean up the resources we created. Run the following commands to delete the Lambda function, the S3 bucket, and the IAM role:

```bash
aws lambda delete-function --function-name gg-s3-ingest
aws s3 rb s3://gg-demo-ingest-bucket --force
aws iam delete-role --role-name gg-s3-lambda-role
```
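If `delete-role` complains about attached policies, detach them first, then re-run it:

```bash
aws iam detach-role-policy \
  --role-name gg-s3-lambda-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

aws iam detach-role-policy \
  --role-name gg-s3-lambda-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
```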
Cool beans! You’ve now got a serverless pipeline that auto-reacts to bucket uploads, keeps your infra footprint tiny, and leaves plenty of headroom. Go forth and automate!
Thanks for reading! If you found this guide helpful, please consider sharing it with your fellow developers or bookmarking it for later.
Enjoy my content? You can read more on my blog at The Glitched Goblet or follow me on BlueSky at kaemonisland.bsky.social. I write new posts each week, so be sure to check back often!