
Summoning Your ML Model from the Cloud: FastAPI + AWS Lambda Speedrun

June 9, 2025

Intro

Hello, fellow wizards! If you've been following my posts, you'll know that I've been learning about Machine Learning and MLOps. Lately I've been working on deploying a Logistic Regression model that can be queried via API calls. I've already gone through training and fine-tuning the model, so in this article we'll cover the next steps: deployment and making a request.

TL;DR: In under 15 minutes we'll train a ~45 KB Logistic Regression that predicts whether a Magic: The Gathering card will become an EDH staple (EDHREC rank ≤ 5 000) using only four features: mana value, card type, color identity, and rarity. Then we'll sling the model into AWS Lambda via a containerised FastAPI app so you can make predictions on demand.

Why Even Go Serverless?

  • Cold starts are tolerable these days, and Lambda container images give you up to 10 GB of room, which is plenty for most models.
  • Scale to zero - pay only when your model actually gets summoned.
  • Ops? Lol - no fleet of EC2 instances to babysit.

If your model needs sub-100 ms latency 24/7, maybe spin up a GPU box instead. For everything else, Lambda is your bff.

Prerequisites

Tool         Tested Version   Purpose
Python       3.11             Training + inference script
FastAPI      0.110            Lightweight API
Docker CLI   25+              Build container image
AWS CLI      2.15             Push + deploy
AWS Account  -                Obviously
MTGJSON      -                Card data source

Why these four features?

  • Mana value - A strong signal of playability. Cheap cards slot into far more decks than expensive ones.
  • Card type (creature / non-creature) - Creatures are the most popular cards to play.
  • Color count - More colors means narrower deck pairings, whereas fewer colors can go into more decks.
  • Rarity tier - Mythics and rares tend to be pushed (though not always!). Encoding it gives the model extra signal for power level.

All four are text-or-numeric columns already present in Scryfall/MTGJSON dumps, which means no embeddings or heavy tokenisation are needed. Perfect for a Lambda free-tier budget.
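
Want to sanity-check the raw data before training? A quick pandas peek (a sketch; the column names assume the MTGJSON cards.csv schema used in train.py below) shows everything we need is already there:

import pandas as pd

# inspect the raw MTGJSON columns we'll engineer features from
cards = pd.read_csv("cards.csv")
print(cards[["name", "manaValue", "type", "colorIdentity", "rarity", "edhrecRank"]].head())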

1 · Train a snack-sized model

# train.py
import joblib
import pandas as pd
from pathlib import Path
from sklearn.linear_model import LogisticRegression

cards = pd.read_csv("cards.csv")  # download from https://mtgjson.com

# drop rows missing the numeric columns we train on
cards = cards.dropna(subset=["manaValue", "edhrecRank"])

# --- feature engineering --------------------------------------------
# colorIdentity arrives as a comma-separated string in the MTGJSON CSV
# (e.g. "B, G"), so count the entries rather than the characters
cards["numColors"] = cards["colorIdentity"].fillna("").apply(
    lambda s: len([c for c in str(s).split(",") if c.strip()])
)
cards["isCreature"] = cards["type"].str.contains("Creature", na=False).astype(int)
rarity_map = {"common": 0, "uncommon": 1, "rare": 2, "mythic": 3}
cards["rarityScore"] = cards["rarity"].str.lower().map(rarity_map).fillna(0)

X = cards[["manaValue", "numColors", "isCreature", "rarityScore"]]
y = (cards["edhrecRank"] <= 5000).astype(int)  # 1 = EDH staple

model = LogisticRegression(max_iter=1000).fit(X, y)
joblib.dump(model, "edh_staple_model.joblib")
size_kb = round(Path("edh_staple_model.joblib").stat().st_size / 1024, 1)
print(f"Saved model size: {size_kb} KB")

Result? ≈45 KB. Tiny enough that cold starts won't feel like summoning Eldrazi.
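
Before wrapping it in an API, a quick local sanity check (a sketch; the feature order must match the X columns above) confirms the model loads and predicts:

import joblib

model = joblib.load("edh_staple_model.joblib")
# [manaValue, numColors, isCreature, rarityScore]: a cheap mono-colored rare creature
print(model.predict_proba([[2, 1, 1, 2]])[0, 1])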

2 · Wrap the model in FastAPI (app.py)

This will create a simple API that accepts card features and returns the probability of being an EDH staple.

from fastapi import FastAPI
from pydantic import BaseModel
import joblib

model = joblib.load("edh_staple_model.joblib")
app = FastAPI(title="EDH-Staple-Predictor")

class CardFeatures(BaseModel):
    manaValue: float
    numColors: int # 0-5
    isCreature: int # 1 = Creature, 0 = not
    rarityScore: int # 0-common … 3-mythic

@app.post("/predict")
def predict(card: CardFeatures):
    feats = [[card.manaValue, card.numColors, card.isCreature, card.rarityScore]]
    prob = float(model.predict_proba(feats)[0, 1])
    return {"stapleProbability": round(prob, 3)}

# for local dev fun
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
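
You can poke the endpoint without spinning up a server at all. FastAPI's TestClient (a sketch; needs httpx installed, and assumes app.py is importable) drives the route in-process:

from fastapi.testclient import TestClient
from app import app

client = TestClient(app)
resp = client.post(
    "/predict",
    json={"manaValue": 2, "numColors": 1, "isCreature": 1, "rarityScore": 2},
)
print(resp.json())  # {"stapleProbability": <some probability>}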

3 · Dockerise → ECR → Lambda

Next, we'll need to create a Dockerfile to package our FastAPI app and model into a container image that AWS Lambda can run.

# Base image: AWS Lambda Python 3.11 runtime
FROM public.ecr.aws/lambda/python:3.11

# install dependencies first so Docker layer caching survives code changes
# (pydantic ships with fastapi; uvicorn/gunicorn aren't needed inside Lambda)
RUN pip install --no-cache-dir fastapi mangum joblib scikit-learn

COPY app.py edh_staple_model.joblib ./

# ASGI-to-Lambda shim (handler is defined at the bottom of app.py)
CMD [ "app.handler" ]

Heads-up: Mangum provides the ASGI-to-Lambda adapter referenced by the CMD line; it translates API Gateway (and Function URL) events into ASGI requests so FastAPI routing works unchanged. Add this to the bottom of app.py:

from mangum import Mangum
handler = Mangum(app)

Then, from the repo root:

# authenticate Docker with ECR and create the repo (one-time setup)
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <aws-id>.dkr.ecr.<region>.amazonaws.com
aws ecr create-repository --repository-name gg-staple-api

# build & push (replace with your IDs)
docker build -t gg-staple-api .
docker tag gg-staple-api:latest <aws-id>.dkr.ecr.<region>.amazonaws.com/gg-staple-api:latest
docker push <aws-id>.dkr.ecr.<region>.amazonaws.com/gg-staple-api:latest

# one-liner Lambda creation (create-function requires an execution role ARN)
aws lambda create-function \
  --function-name gg-staple-api \
  --package-type Image \
  --code ImageUri=<aws-id>.dkr.ecr.<region>.amazonaws.com/gg-staple-api:latest \
  --role arn:aws:iam::<aws-id>:role/<lambda-exec-role> \
  --memory-size 256 \
  --timeout 10
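
One missing piece before curl can reach the function: Lambda needs an HTTP front door. The test below assumes an API Gateway route, but a lighter-weight alternative is a Lambda Function URL, sketched here (auth disabled for demo purposes only; Mangum understands Function URL events):

# give the function a public HTTPS endpoint
aws lambda create-function-url-config \
  --function-name gg-staple-api \
  --auth-type NONE

# allow unauthenticated invocation of that URL
aws lambda add-permission \
  --function-name gg-staple-api \
  --statement-id public-url \
  --action lambda:InvokeFunctionUrl \
  --principal "*" \
  --function-url-auth-type NONE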

4 · Test the spell (curl)

Make sure your Lambda function is deployed and reachable over HTTPS (via an API Gateway route or the Function URL set up above). You can test it using curl or any HTTP client of your choice.

curl -X POST https://<api-gw>.execute-api.<region>.amazonaws.com/predict \
  -H "Content-Type: application/json" \
  -d '{"manaValue":2,"numColors":1,"isCreature":1,"rarityScore":2}'
# → {"stapleProbability":0.73}
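
Prefer Python to curl? The same request with the requests library (swap your real endpoint in for the placeholder URL):

import requests

url = "https://<api-gw>.execute-api.<region>.amazonaws.com/predict"
payload = {"manaValue": 2, "numColors": 1, "isCreature": 1, "rarityScore": 2}
print(requests.post(url, json=payload, timeout=10).json())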

5 · Side quests & power-ups

Pain point              Quick fix
Cold starts ≥ 500 ms    Use an ARM/Graviton base image or SnapStart + Provisioned Concurrency
Accuracy meh?           Try LightGBM (~200 KB) or XGBoost with tree_method=hist
Model getting chonky    Strip scikit-learn from the runtime; ship pure-NumPy weights (see the sketch below)
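
That last row deserves a sketch. A fitted LogisticRegression is just a weight vector plus an intercept, so you can export the numbers once and score with plain NumPy at runtime, no scikit-learn in the image (a sketch, assuming the model from train.py):

import joblib
import numpy as np

# one-time export on your dev box
model = joblib.load("edh_staple_model.joblib")
np.savez("weights.npz", coef=model.coef_[0], intercept=model.intercept_)

# at inference time: sigmoid(w.x + b) reproduces predict_proba[:, 1]
w = np.load("weights.npz")

def staple_probability(feats):
    z = float(np.dot(w["coef"], feats) + w["intercept"][0])
    return 1.0 / (1.0 + np.exp(-z))

print(staple_probability([2, 1, 1, 2]))  # same card as the curl example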

Obviously this model was kept small so that Lambda stays snappy. There is a lot more optimisation you can do to improve accuracy, such as:

  • Feature engineering: Add more features like keywords, card text, or even historical price trends.
  • Model selection: Try more complex models like LightGBM or XGBoost, which can handle categorical features better and often yield higher accuracy.
  • Hyperparameter tuning: Use tools like Optuna or Hyperopt to find the best hyperparameters for your model, as in the sketch below.
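
As an example of that last bullet, here's a minimal Optuna sketch that tunes the regularisation strength C via cross-validated AUC (assumes the X and y from train.py are in scope):

import optuna
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def objective(trial):
    # search the regularisation strength on a log scale
    C = trial.suggest_float("C", 1e-3, 10.0, log=True)
    clf = LogisticRegression(C=C, max_iter=1000)
    return cross_val_score(clf, X, y, cv=3, scoring="roc_auc").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=30)
print(study.best_params)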

Final Thoughts

With a pinch of feature engineering and a docker build && docker push, you've moved an ML model from your dev box to a globally scalable Lambda endpoint. No K8s, no autoscaling groups, just a pay-per-invocation goblet of goodness.

Feel free to fork the repo, toss in extra features (e.g., keywords in rules text), or swap in a boosted tree. All the plumbing stays the same.

Drink deeply, code boldly, and may your deployments stay glitch-free. brb, brewing more coffee.