intermediate · 6 min read

How to set a customer budget and get notified when it's exceeded

Pass budget= to start_session(), read the budget_exceeded warning, and wire it to your alerting system.

Two budget mechanisms

Apeiros provides two budget patterns depending on your use case:

  1. Session-level budget — cap spend on a single agent run via start_session(budget=X)
  2. Monthly per-customer cap — enforce a monthly limit by polling meter() from a background task

Both are covered below.

Session-level budget

import apeiros

apeiros.instrument()
apeiros.start_session(budget=0.50)  # $0.50 per session

# Run your agent...
result = apeiros.end_session()

When cumulative session cost exceeds the budget, the result dict includes a warning:

result = apeiros.end_session()

if any(w["type"] == "budget_exceeded" for w in result.get("warnings", [])):
    print("Session exceeded budget")
    print(result["total_cost"])

Three ways to respond to a budget warning

1. Abort the agent

result = apeiros.wrap(None, input_tokens=1200, output_tokens=380)

for warning in result.get("warnings", []):
    if warning["type"] == "budget_exceeded":
        raise RuntimeError(f"Budget exceeded: {warning['message']}")
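
A bare RuntimeError is hard for callers to tell apart from other failures. Below is a minimal sketch of a dedicated exception, assuming the result dict has the shape used in this guide (a "warnings" list whose entries carry "type" and "message" keys). The names BudgetExceededError and raise_if_over_budget are hypothetical and not part of the SDK.

```python
class BudgetExceededError(RuntimeError):
    """Hypothetical application-level exception; not provided by the SDK."""


def raise_if_over_budget(result: dict) -> None:
    """Raise BudgetExceededError if the result carries a budget_exceeded warning."""
    for warning in result.get("warnings", []):
        if warning.get("type") == "budget_exceeded":
            # "message" is assumed to be present, as in the examples in this guide.
            raise BudgetExceededError(warning.get("message", "budget exceeded"))
```

Because the exception subclasses RuntimeError, existing handlers that catch RuntimeError keep working, while new code can catch the budget case specifically.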

2. Switch to a cheaper model

# Check after each step — downgrade if approaching limit
state = apeiros.get_state()
if state["total_cost"] > 0.40:  # 80% of $0.50 budget
    model = "claude-3-haiku"    # switch to cheaper model for remaining steps
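
The downgrade rule can be factored into a small pure function so the threshold is not hard-coded next to each call. This is only a sketch: pick_model and its parameters are hypothetical names, and the model identifiers are the ones that already appear in this guide.

```python
def pick_model(
    total_cost: float,
    budget: float,
    default: str = "claude-3-5-sonnet",
    fallback: str = "claude-3-haiku",
    threshold: float = 0.8,
) -> str:
    """Return the fallback model once spend passes threshold * budget."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return fallback if total_cost > threshold * budget else default
```

Under these assumptions you would call pick_model(state["total_cost"], 0.50) before each step and pass the returned name to your model client.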

3. Fire an alert

import httpx

result = apeiros.end_session()
for warning in result.get("warnings", []):
    if warning["type"] == "budget_exceeded":
        httpx.post(
            "https://hooks.slack.com/services/YOUR/WEBHOOK/URL",
            json={"text": f"⚠️ Agent budget exceeded: {warning['message']}"},
        )
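
An alert that fails to deliver should not take the agent run down with it. The sketch below isolates delivery errors by taking the sender as a callable, which also makes the logic testable without a network. alert_on_budget_warnings and send are hypothetical names, and the result dict shape is assumed to match the examples in this guide.

```python
from typing import Callable


def alert_on_budget_warnings(result: dict, send: Callable[[dict], None]) -> int:
    """Send one payload per budget_exceeded warning; return how many were delivered."""
    sent = 0
    for warning in result.get("warnings", []):
        if warning.get("type") != "budget_exceeded":
            continue
        payload = {"text": f"Agent budget exceeded: {warning.get('message', '')}"}
        try:
            send(payload)
            sent += 1
        except Exception as exc:
            # A failed alert is logged, not raised, so the caller keeps running.
            print(f"alert delivery failed: {exc}")
    return sent
```

In production, send could wrap the webhook call, for example `lambda p: httpx.post(WEBHOOK_URL, json=p, timeout=5.0).raise_for_status()`, where WEBHOOK_URL is your own setting.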

Monthly per-customer budget via meter()

For capping total monthly spend per customer, poll meter() from a background task:

import apeiros
from apeiros import ApeirosAgent

PLAN_PRICE      = 299.0
COST_CAP        = 250.0   # hard stop if a customer exceeds $250/mo in AI costs
ALERT_THRESHOLD = 200.0   # warn at $200/mo

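# Run this every 15 minutes from a cron job or background thread.
# enforce_cap() and send_alert() are application-defined; they are not defined in this snippet.
def check_customer_budgets():
    # _registry maps each customer_id to its list of in-memory task dicts
    for customer_id, tasks in ApeirosAgent._registry.items():
        monthly_cost = sum(t.get("cost_estimate", 0.0) for t in tasks)

        if monthly_cost >= COST_CAP:
            enforce_cap(customer_id)
        elif monthly_cost >= ALERT_THRESHOLD:
            send_alert(customer_id, monthly_cost)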
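
A poller that runs every 15 minutes will re-send the same alert on every pass while a customer stays above the threshold. One way to avoid that, sketched here under the same thresholds, is to remember each customer's last state and act only on transitions. classify_spend and changed_states are hypothetical helpers, and spend_by_customer stands in for whatever totals you compute from the registry.

```python
from typing import Dict

COST_CAP = 250.0
ALERT_THRESHOLD = 200.0


def classify_spend(monthly_cost: float) -> str:
    """Map a monthly cost to 'ok', 'alert', or 'cap'."""
    if monthly_cost >= COST_CAP:
        return "cap"
    if monthly_cost >= ALERT_THRESHOLD:
        return "alert"
    return "ok"


def changed_states(
    spend_by_customer: Dict[str, float], last_state: Dict[str, str]
) -> Dict[str, str]:
    """Return customers whose state changed since the previous poll.

    last_state is updated in place so the next poll compares against this one.
    """
    changes = {}
    for customer_id, cost in spend_by_customer.items():
        state = classify_spend(cost)
        if last_state.get(customer_id, "ok") != state:
            changes[customer_id] = state
        last_state[customer_id] = state
    return changes
```

Only the customers returned by changed_states would then be passed to your alerting or enforcement functions.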

Hard-stop a customer mid-month

The SDK stores everything in memory and does not enforce caps automatically, so the decision stays in your code. Check the customer's spend before calling start_task():

from apeiros import ApeirosAgent

MONTHLY_COST_CAP = 250.0

def get_customer_spend(customer_id: str) -> float:
    tasks = ApeirosAgent._registry.get(customer_id, [])
    return sum(t.get("cost_estimate", 0.0) for t in tasks)


def run_customer_task(customer_id: str, task_name: str, tokens: int):
    if get_customer_spend(customer_id) >= MONTHLY_COST_CAP:
        raise PermissionError(
            f"Customer {customer_id} has reached their monthly usage limit."
        )

    agent = ApeirosAgent(customer_id=customer_id, model="claude-3-5-sonnet")
    agent.start_task(task_name)
    agent.update_tokens(tokens)
    agent.end_task()
    return agent.cost_estimate

Note on persistence: The SDK stores cost data in memory. If your process restarts, the registry resets. For durable monthly caps that survive restarts, persist customer_report() output to your database at the end of each session and load it back at startup.
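
As an illustration of durable storage, here is a minimal sketch using SQLite from the Python standard library. The table layout and the function names open_store, add_spend, and get_spend are assumptions made for this example. The shape of customer_report() output is not shown in this guide, so the sketch takes the customer ID, month, and cost as explicit arguments.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the spend database; pass a file path for real durability."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS monthly_spend ("
        "customer_id TEXT NOT NULL, month TEXT NOT NULL, cost REAL NOT NULL, "
        "PRIMARY KEY (customer_id, month))"
    )
    return conn


def add_spend(conn: sqlite3.Connection, customer_id: str, month: str, cost: float) -> None:
    """Add one session's cost to the customer's running total for the month."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO monthly_spend VALUES (?, ?, 0.0)",
            (customer_id, month),
        )
        conn.execute(
            "UPDATE monthly_spend SET cost = cost + ? "
            "WHERE customer_id = ? AND month = ?",
            (cost, customer_id, month),
        )


def get_spend(conn: sqlite3.Connection, customer_id: str, month: str) -> float:
    """Return the stored total, or 0.0 if the customer has no row for that month."""
    row = conn.execute(
        "SELECT cost FROM monthly_spend WHERE customer_id = ? AND month = ?",
        (customer_id, month),
    ).fetchone()
    return row[0] if row else 0.0
```

Keying rows by month (for example "2025-01") means a new month starts from zero without a reset job, and a cap check can read get_spend() instead of the in-memory registry.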

The five automatic detectors

budget_exceeded is one of five anomaly detectors that run on every tracked call:

| Detector | Triggers when |
|----------|--------------|
| budget_exceeded | Session cost crosses the budget= threshold |
| retry_loop | Same prompt runs 3+ times in a row |
| context_bloat | Context window grows faster than 20% per step |
| token_acceleration | Token rate doubles between steps |
| tool_amplification | Tool calls multiply tokens by 5× or more |

All five appear in the warnings list of the result dict from wrap() or end_session().
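
Since every detector reports through the same list, a single pass can summarize which ones fired. This is a small sketch assuming each warning is a dict with a "type" key, as in the examples in this guide; count_warnings is a hypothetical helper name.

```python
from collections import Counter


def count_warnings(result: dict) -> Counter:
    """Count warnings in a result dict by detector type."""
    return Counter(w.get("type", "unknown") for w in result.get("warnings", []))
```

The resulting counts can be logged per session or exported as metrics, for example to see how often retry_loop fires alongside budget_exceeded.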

