The one question to answer first
Where does your agent start? Find the file that runs before any model calls are made — main.py, app.py, a FastAPI lifespan hook, an AWS Lambda handler, a Celery worker entrypoint. That is where you add three lines.
Three lines at the top of that file
import apeiros
apeiros.instrument()
apeiros.start_session(budget=5.00)
# Your existing agent code below — no changes needed
from anthropic import Anthropic
client = Anthropic()
instrument() patches the Anthropic (and optionally OpenAI) client in memory at startup. Every subsequent client.messages.create() call is tracked automatically. start_session() initialises cost and warning accumulators for the current run.
What instrument() actually does
It reads response.usage.input_tokens and response.usage.output_tokens from every model response and stores the running cost in a local Python dict. Nothing is sent to any external service. No disk writes. No network calls.
Want to audit it? The full implementation is in apeiros/interceptor.py — 205 lines total. Read it in two minutes.
If you already track tokens manually
You probably have something like this:
# Before Apeiros — manual tracking
response = client.messages.create(...)
total_input += response.usage.input_tokens
total_output += response.usage.output_tokens
total_cost = (total_input + total_output) / 1000 * 0.008
After adding instrument(), delete that accumulator. Apeiros replaces it entirely and adds anomaly detection on top.
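One caveat if you keep a manual cross-check around: the formula above applies a single blended rate to input and output tokens, while providers generally price the two directions differently. A self-contained sketch with hypothetical rates (substitute your model's actual pricing):

```python
# Hypothetical per-token rates -- not real pricing, substitute your model's.
RATE_INPUT = 3.00 / 1_000_000    # dollars per input token
RATE_OUTPUT = 15.00 / 1_000_000  # dollars per output token

def cost(input_tokens: int, output_tokens: int) -> float:
    """Per-direction cost, unlike the single blended rate above."""
    return input_tokens * RATE_INPUT + output_tokens * RATE_OUTPUT

print(f"${cost(1240, 380):.4f}")
```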
Placement patterns
FastAPI
from contextlib import asynccontextmanager
import apeiros
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    apeiros.instrument()
    yield

app = FastAPI(lifespan=lifespan)

@app.post("/run-agent")
async def run_agent(request: AgentRequest):  # AgentRequest: your own request model
    apeiros.start_session(budget=2.00)
    # ... your agent logic ...
    result = apeiros.end_session()
    return result
AWS Lambda
import apeiros
apeiros.instrument() # runs once at cold start
def handler(event, context):
    apeiros.start_session(budget=1.00)
    # ... your agent logic ...
    return apeiros.end_session()
Celery worker
from celery import Celery
import apeiros
app = Celery("tasks")
apeiros.instrument() # module-level, runs at worker startup
@app.task
def run_agent_task(payload):
    apeiros.start_session(budget=3.00)
    # ... your agent logic ...
    return apeiros.end_session()
Standalone script
import apeiros
apeiros.instrument()
apeiros.start_session(budget=5.00, debug=True)
# Your agent code here
summary = apeiros.end_session()
print(f"Total cost: ${summary['total_cost']:.4f}")
Verify it's working
Pass debug=True to start_session(). Apeiros will print a line after each tracked model call:
[apeiros] step=1 tokens=1240+380 cost=$0.0129 budget_used=0.3%
[apeiros] step=2 tokens=850+210 cost=$0.0085 budget_used=0.5%
Once you see those lines, integration is complete.
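The budget_used field can be reproduced by hand if you want to sanity-check it. A sketch, assuming it is simply cumulative session cost divided by the session budget, rounded to one decimal place (both assumptions, not apeiros's documented behaviour):

```python
def budget_used_pct(cumulative_cost: float, budget: float) -> float:
    # Cumulative cost as a percentage of the session budget,
    # rounded to one decimal place like the debug output.
    return round(cumulative_cost / budget * 100, 1)

print(budget_used_pct(0.0129, 5.00))  # 0.3, matching step=1 above
```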
If you only use one provider
apeiros.instrument(providers=("anthropic",)) # skip OpenAI patching
apeiros.instrument(providers=("openai",)) # skip Anthropic patching
instrument() is idempotent — calling it more than once is safe.