While Cerebrium’s default runtime works well for most app needs, teams sometimes need more control over their web server implementation. Using ASGI or WSGI servers through Cerebrium’s Python runtime feature enables capabilities like custom authentication, dynamic batching, frontend dashboards, public endpoints, and WebSocket connections.

Setting Up Custom Servers

Here’s a simple FastAPI server implementation that shows how custom servers work in Cerebrium:
from fastapi import FastAPI
app = FastAPI()

@app.post("/hello")
def hello():
    return {"message": "Hello Cerebrium!"}

@app.get("/health")
def health():
    return "OK"

@app.get("/ready")
def ready():
    return "OK"
Configure this server in cerebrium.toml by adding a Python runtime section:
[deployment]
name = "my-fastapi-app"

[runtime.python]
entrypoint = ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "5000"]
port = 5000
healthcheck_endpoint = "/health"
readycheck_endpoint = "/ready"

[dependencies.pip]
pydantic = "latest"
numpy = "latest"
loguru = "latest"
fastapi = "latest"
uvicorn = "latest"
The configuration uses the following key parameters:
  • entrypoint: The command that starts your server
  • port: The port your server listens on
  • healthcheck_endpoint: The endpoint used to confirm instance health. If unspecified, defaults to a TCP ping on the configured port. If the health check returns a non-200 response, the instance is considered unhealthy and is restarted if it does not recover in time.
  • readycheck_endpoint: The endpoint used to confirm that the instance is ready to receive requests. If unspecified, defaults to a TCP ping on the configured port. If the ready check returns a non-200 response, the instance is not a viable target for request routing.
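The difference between the two checks can be sketched with a small, dependency-free WSGI app in which /health answers as soon as the process is up while /ready returns a non-200 status until startup work has finished. This is an illustration under stated assumptions, not Cerebrium's prescribed setup: the names `_ready`, `load_resources`, and `probe` are hypothetical, and 503 is just one reasonable choice of non-200 response.

```python
import threading

# Set once slow startup work (for example, loading model weights) is done.
_ready = threading.Event()


def load_resources():
    """Hypothetical placeholder for slow startup work."""
    _ready.set()


def app(environ, start_response):
    """WSGI app exposing /health (liveness) and /ready (readiness)."""
    path = environ.get("PATH_INFO", "/")
    if path == "/health":
        # The process is running, so the instance counts as healthy.
        status, body = "200 OK", b"OK"
    elif path == "/ready":
        if _ready.is_set():
            status, body = "200 OK", b"OK"
        else:
            # Non-200: the instance should not receive traffic yet.
            status, body = "503 Service Unavailable", b"loading"
    else:
        status, body = "404 Not Found", b"not found"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]


def probe(path):
    """Call the app in-process and return (status, body); for local testing only."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app({"REQUEST_METHOD": "GET", "PATH_INFO": path}, start_response))
    return captured["status"], body
```

Calling `probe("/ready")` before and after `load_resources()` shows the status change. Serving this app on Cerebrium would require a WSGI server in the entrypoint (gunicorn is a common choice, assumed here rather than taken from the configuration above). The same pattern carries over to the FastAPI example by returning a 503 response from the ready handler until loading completes.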
You can also configure build settings in the Python runtime section:
[runtime.python]
python_version = "3.11"
docker_base_image_url = "debian:bookworm-slim"
use_uv = true
entrypoint = ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
port = 8000
healthcheck_endpoint = "/health"
readycheck_endpoint = "/ready"
For ASGI applications like FastAPI, include the appropriate server package (like uvicorn) in your dependencies. After deployment, your endpoints become available at https://api.aws.us-east-1.cerebrium.ai/v4/[project-id]/[app-name]/your/endpoint.
Our FastAPI Server Example provides a complete implementation.
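As a small illustration of the URL pattern above, the helper below joins the base URL, project ID, app name, and a route served by your app. The function name `endpoint_url` and the project ID `p-1234` used in the usage note are hypothetical; only the base URL and the path layout come from the text above.

```python
from urllib.parse import quote

# Base URL as documented above.
BASE_URL = "https://api.aws.us-east-1.cerebrium.ai/v4"


def endpoint_url(project_id: str, app_name: str, route: str) -> str:
    """Build the public URL for a route defined by your custom server."""
    # Strip a leading slash so "/hello" and "hello" give the same URL.
    route = route.lstrip("/")
    # Percent-encode each part; slashes inside the route are preserved.
    return (
        f"{BASE_URL}/{quote(project_id, safe='')}"
        f"/{quote(app_name, safe='')}/{quote(route)}"
    )
```

For example, `endpoint_url("p-1234", "my-fastapi-app", "/hello")` gives the address of the `/hello` route from the server shown earlier, assuming a project with that made-up ID.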

Request Headers

When using custom web servers, you can access the Cerebrium run ID through the X-Request-Id header, which is included in all requests to your endpoints. This header is particularly useful for tracking and debugging requests in your custom runtime implementation, as it corresponds to the run_id that Cerebrium uses internally.
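One way to make the run ID available throughout request handling is a small ASGI middleware that copies the header into a context variable, where logging code can pick it up. The sketch below uses only the standard library. The names `run_id_var`, `RunIdMiddleware`, `demo_app`, and `simulate_request` are hypothetical, and `demo_app` merely stands in for a real application.

```python
import asyncio
import contextvars

# Holds the run ID of the request currently being handled.
run_id_var = contextvars.ContextVar("run_id", default=None)


class RunIdMiddleware:
    """Pure ASGI middleware that copies X-Request-Id into run_id_var."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass other scope types (such as lifespan) through untouched.
            await self.app(scope, receive, send)
            return
        # ASGI supplies headers as (name, value) byte pairs with lowercase names.
        headers = dict(scope.get("headers", []))
        raw = headers.get(b"x-request-id")
        token = run_id_var.set(raw.decode("latin-1") if raw is not None else None)
        try:
            await self.app(scope, receive, send)
        finally:
            # Restore the previous value once the request is finished.
            run_id_var.reset(token)


async def demo_app(scope, receive, send):
    """Tiny ASGI app that echoes the run ID back in the response body."""
    body = (run_id_var.get() or "unknown").encode()
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": body})


app = RunIdMiddleware(demo_app)


def simulate_request(headers):
    """Drive the app once without a server and return the response body."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/", "headers": headers}
    asyncio.run(app(scope, receive, send))
    return sent[-1]["body"]
```

Because the middleware is written against the plain ASGI interface, it should also be usable with a FastAPI app through `app.add_middleware(RunIdMiddleware)`, after which handlers and log formatters can read `run_id_var.get()`.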