A complete visual walkthrough: from user input → validation → ML model → Docker container
The user sees a Streamlit web form. When they hit Predict, these exact raw values are packaged into a JSON payload:
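The payload itself is not reproduced in this walkthrough, so the following is only a sketch of its likely shape. The fields age, weight, height, smoker and city are named later in the text; income_lpa and occupation, the units, and all values are assumptions made for illustration.

```python
import json

# Hypothetical form values. Only age, weight, height, smoker and city are
# named in the walkthrough; income_lpa and occupation are assumed field names.
form_values = {
    "age": 30,
    "weight": 70.0,               # assumed to be kilograms
    "height": 1.75,               # assumed to be metres
    "income_lpa": 10.0,           # assumed field name
    "smoker": False,
    "city": "Mumbai",
    "occupation": "private_job",  # assumed field name and value
}

# Serialise the raw values exactly as entered; no derived features yet.
payload = json.dumps(form_values)
print(payload)
```

With the requests library, passing the same dictionary as requests.post(url, json=form_values) would send an equivalent body to the /predict endpoint.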
The default API address is 18.117.168.71:8000. Users can override it in the sidebar.
FastAPI automatically parses the JSON body. The type annotation user_input: UserInput triggers Pydantic validation before any of your code runs.
Also available: GET / (welcome message) and GET /health (status + model version).
The UserInput Pydantic model does heavy lifting: it validates ranges, normalises text, and computes derived features automatically using @computed_field and @field_validator.
Also imports tier_1_cities and tier_2_cities from config/city_tier.py for the lookup.
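A minimal sketch of how such a model could look in Pydantic v2 is given below. It is not the project's actual code: the range limits, the risk and age thresholds, the label strings, and the shortened city lists are all assumptions, and only five of the raw fields are included.

```python
from pydantic import BaseModel, Field, computed_field, field_validator

# Illustrative subsets only; the real lists live in config/city_tier.py.
tier_1_cities = ["Mumbai", "Delhi", "Bangalore"]
tier_2_cities = ["Jaipur", "Lucknow", "Indore"]


class UserInput(BaseModel):
    # Range limits below are assumed, not taken from the project.
    age: int = Field(..., gt=0, lt=120)
    weight: float = Field(..., gt=0)          # assumed kilograms
    height: float = Field(..., gt=0, lt=2.5)  # assumed metres
    smoker: bool
    city: str

    @field_validator("city")
    @classmethod
    def normalise_city(cls, value: str) -> str:
        # Normalise text so "  mumbai " matches "Mumbai" in the lookup lists.
        return value.strip().title()

    @computed_field
    @property
    def bmi(self) -> float:
        return round(self.weight / (self.height ** 2), 2)

    @computed_field
    @property
    def lifestyle_risk(self) -> str:
        # Assumed thresholds and labels.
        if self.smoker and self.bmi > 30:
            return "high"
        if self.smoker or self.bmi > 27:
            return "medium"
        return "low"

    @computed_field
    @property
    def age_group(self) -> str:
        # Assumed age bands.
        if self.age < 25:
            return "young"
        if self.age < 45:
            return "adult"
        if self.age < 60:
            return "middle_aged"
        return "senior"

    @computed_field
    @property
    def city_tier(self) -> int:
        if self.city in tier_1_cities:
            return 1
        if self.city in tier_2_cities:
            return 2
        return 3  # everything else falls through to Tier 3


user = UserInput(age=30, weight=70, height=1.75, smoker=False, city="  mumbai ")
print(user.model_dump())
```

Because the derived values are declared with @computed_field, they appear in model_dump() alongside the raw inputs, so downstream code can read them like ordinary fields.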
Back in app.py, only 6 of the 11 available fields are extracted and passed to the prediction function. Raw inputs like age, weight, height, smoker, city are dropped — they already contributed to computed features.
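The selection step can be sketched with plain dictionaries. The four computed names come from this walkthrough; the two remaining model features are not named here, so income_lpa and occupation are assumed placeholders, as are all values.

```python
# All 11 fields after validation: 7 raw inputs plus 4 computed features.
# income_lpa and occupation are assumed names; values are illustrative.
all_fields = {
    "age": 30, "weight": 70.0, "height": 1.75, "smoker": False,
    "city": "Mumbai", "income_lpa": 10.0, "occupation": "private_job",
    "bmi": 22.86, "lifestyle_risk": "low", "age_group": "adult", "city_tier": 1,
}

# The 6 features handed to the prediction function (assumed ordering).
MODEL_FEATURES = [
    "bmi", "age_group", "lifestyle_risk", "city_tier",
    "income_lpa", "occupation",
]

# Keep only the model features; everything else is dropped.
model_input = {name: all_fields[name] for name in MODEL_FEATURES}
dropped = sorted(set(all_fields) - set(MODEL_FEATURES))

print(model_input)
print(dropped)
```

The dropped names are exactly the raw inputs listed above, whose information already lives in bmi, lifestyle_risk, age_group and city_tier.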
The model is a pre-trained scikit-learn classifier loaded from model/model.pkl at startup (not per request). It outputs 3 premium categories.
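The load-once pattern and the prediction wrapper can be sketched as follows. To keep the example runnable without model/model.pkl, a small stub stands in for the pickled classifier; the class labels, the fixed probabilities, and the output key names (borrowed from the PredictionResponse field names) are assumptions.

```python
class StubClassifier:
    """Stand-in exposing the scikit-learn style attributes the real model has."""

    classes_ = ["High", "Low", "Medium"]  # assumed category labels

    def predict_proba(self, rows):
        # Fixed probabilities for illustration; one row of output per input row.
        return [[0.1, 0.7, 0.2] for _ in rows]

    def predict(self, rows):
        # Pick the label with the highest probability for each row.
        return [
            self.classes_[max(range(len(p)), key=p.__getitem__)]
            for p in self.predict_proba(rows)
        ]


# Created once at import time, not per request. The real app would instead
# unpickle model/model.pkl here, for example with pickle.load().
model = StubClassifier()


def predict_output(user_input: dict) -> dict:
    rows = [user_input]  # the real code presumably builds a pandas DataFrame
    predicted = model.predict(rows)[0]
    probabilities = model.predict_proba(rows)[0]
    class_probabilities = {
        label: round(p, 4) for label, p in zip(model.classes_, probabilities)
    }
    return {
        "predicted_category": predicted,
        "confidence": round(max(probabilities), 4),
        "class_probabilities": class_probabilities,
    }


result = predict_output({"bmi": 22.86, "city_tier": 1})
print(result)
```

Loading at module level means the unpickling cost is paid once when the server starts, and every request reuses the same in-memory model.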
FastAPI returns a JSONResponse with status 200. The structure does NOT exactly match the PredictionResponse schema (it uses different keys), which is a minor inconsistency in the code.
Streamlit web UI. Renders input form, collects user values, sends HTTP POST to the FastAPI backend, and displays the prediction result.
Streamlit • requests
FastAPI application entry point. Defines 3 endpoints: / (welcome), /health (status), /predict (POST). Orchestrates the full prediction pipeline.
Pydantic model for request validation AND feature engineering. Computes bmi, lifestyle_risk, age_group, city_tier automatically from raw inputs.
Pydantic v2 • @computed_field
Pydantic model for API response documentation. Defines predicted_category, confidence, class_probabilities. Used as OpenAPI schema only (note: actual response uses JSONResponse directly).
Pydantic • OpenAPI docs
Loads the pickled ML model at startup. Provides predict_output() which wraps model.predict() and model.predict_proba(), returning a dict with prediction, confidence, and per-class probabilities.
scikit-learn • pandas • pickle
Pre-trained scikit-learn classifier (binary pickle). Has .classes_, .predict(), and .predict_proba(), as is typical of a RandomForest, GradientBoosting, or similar ensemble. Version: 1.0.0
Pre-trained • binary file
Pure data config. Two lists: tier_1_cities (7 metro cities) and tier_2_cities (48 smaller cities). Everything else → Tier 3. Used by UserInput.city_tier computed field.
Config • lookup lists
Containerises the FastAPI backend. Uses python:3.12-slim, installs requirements, copies all code, exposes port 8000, and runs uvicorn to serve the app.
Docker • uvicorn
Lists Python dependencies installed inside the Docker image. Likely includes fastapi, uvicorn, pydantic, scikit-learn, pandas, and streamlit.
pip • dependencies
The frontend (frontend.py) runs separately and points to the container's exposed port 8000.
Bind address: 0.0.0.0 (accessible from outside the container)
FastAPI_Key.pem is in the project folder. This is an AWS EC2 SSH key for the server at 18.117.168.71. It should NOT be committed to git or included in the Docker image in production. The COPY . . instruction currently copies it into the container.
GET / (welcome), GET /health (status check + model version), POST /predict (main prediction endpoint)