Multi-Tenant API Keys: Production-Grade Auth with cm_* Tokens

CallMissed

Most AI APIs treat keys as a binary: you have one, or you don't. That works for a hobby project. It does not work when you are deploying agents in production with separate environments, separate teams, separate budgets, and a security review in your future. CallMissed's cm_* API keys are designed for what comes after the prototype.

The key format

A CallMissed key looks like this:

Code
cm_live_<random>
cm_test_<random>

The prefix tells you the environment at a glance — live for production traffic, test for sandboxed traffic against test models. Past that, the key is a high-entropy random string. You can paste it into your secrets manager and never look at it again.
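Because the environment is encoded in the prefix, tooling can branch on it without calling the API. Here is a minimal sketch in Python; the helper names `key_environment` and `redact` are illustrative, not part of any CallMissed SDK, and the random suffix is treated as opaque.

```python
def key_environment(api_key: str) -> str:
    """Return "live" or "test" based on the key's prefix.

    Only the documented prefixes are checked; the random suffix is treated
    as opaque (its length and alphabet are not assumed here).
    """
    for env in ("live", "test"):
        prefix = f"cm_{env}_"
        # Require at least one character after the prefix.
        if api_key.startswith(prefix) and len(api_key) > len(prefix):
            return env
    raise ValueError("not a recognizable cm_* key")


def redact(api_key: str) -> str:
    """Keep the environment prefix for debugging, hide the secret part."""
    return f"cm_{key_environment(api_key)}_****"
```

A check like this is handy at service startup, for example to refuse to boot a production deployment that was handed a test key.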

What a key actually authorizes

A key is bound to a tenant and carries a set of scopes. Scopes determine which API surfaces the key can hit:

  • llm — chat completions, the OpenAI- and Anthropic-compatible endpoints
  • voice — voice agent session creation, STT, TTS
  • analytics — read-only access to usage and audit logs
  • bots — CRUD on bots and knowledge bases
  • admin — tenant-level management (do not expose this in client code)

A frontend that only needs to start voice sessions should hold a key with just the voice scope. A backend job that pulls usage stats should hold an analytics key. Compromise of one does not expand to the other.
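The scope model boils down to a set-membership check. The sketch below is an illustration of that idea, not CallMissed's implementation: the `authorize` function is hypothetical, and it assumes scopes are independent (for example, that admin does not silently imply the others).

```python
KNOWN_SCOPES = frozenset({"llm", "voice", "analytics", "bots", "admin"})


def authorize(key_scopes: set[str], required_scope: str) -> bool:
    """Allow a request only if the key carries the scope the surface needs."""
    unknown = set(key_scopes) - KNOWN_SCOPES
    if unknown:
        raise ValueError(f"unknown scopes: {sorted(unknown)}")
    return required_scope in key_scopes


frontend_key = {"voice"}       # only starts voice sessions
stats_job_key = {"analytics"}  # only reads usage and audit logs

print(authorize(frontend_key, "voice"))      # True
print(authorize(frontend_key, "analytics"))  # False
```

If the frontend key leaks, the attacker can start voice sessions and nothing else.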

    Tenant isolation, in practice

    Every database query that touches user data filters by tenant_id. Every key resolves to a tenant on every request. The two are joined by middleware, not by user code, which means no individual endpoint can accidentally leak data across tenants.

    This matters more than it sounds. Most multi-tenant breaches in AI infrastructure are not crypto failures — they are missed WHERE tenant_id = ? clauses in some endpoint that was added late and reviewed loosely. CallMissed enforces the join at the framework level so the failure mode is "endpoint returns nothing" rather than "endpoint returns someone else's conversations."
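To make the "enforce it once, in the framework" idea concrete, here is a small sketch using Python's built-in sqlite3. The `TenantScopedStore` class and the `conversations` table are invented for illustration; CallMissed's actual middleware is not shown in this post.

```python
import sqlite3


class TenantScopedStore:
    """Data access layer that injects the tenant filter itself."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str):
        self._conn = conn
        self._tenant_id = tenant_id  # resolved from the API key by middleware

    def conversations(self) -> list[str]:
        # The WHERE clause lives here, once, not in each endpoint.
        rows = self._conn.execute(
            "SELECT body FROM conversations WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return [body for (body,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversations (id INTEGER PRIMARY KEY, tenant_id TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO conversations (tenant_id, body) VALUES (?, ?)",
    [("tenant_a", "hello from A"), ("tenant_b", "hello from B")],
)

print(TenantScopedStore(conn, "tenant_a").conversations())  # ['hello from A']
print(TenantScopedStore(conn, "tenant_c").conversations())  # []
```

Endpoint code only ever receives a store that is already bound to a tenant, so a forgotten filter cannot happen at the endpoint level. An unknown tenant gets an empty list, not another tenant's rows.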

    Per-key budgets

    Set a hard cap when you create the key:

    Code
    {
      "name": "production-frontend",
      "scopes": ["llm"],
      "budget_usd": 500.00,
      "budget_window": "month"
    }

    When the cap is reached, the key returns 402 Payment Required until the window resets or you raise the cap. Combined with per-tenant defaults, this lets you give a single contractor a $50/month key without handing them the keys to your production wallet.
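The cap behaves like a small state machine. The sketch below models it in Python so the semantics are easy to test; `BudgetedKey` is a hypothetical class, and it assumes the request that crosses the cap still succeeds while later ones get 402. The real cutoff behavior may differ.

```python
from dataclasses import dataclass

OK = 200
PAYMENT_REQUIRED = 402


@dataclass
class BudgetedKey:
    budget_usd: float
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> int:
        """Record spend and return the HTTP status the caller would see."""
        if self.spent_usd >= self.budget_usd:
            return PAYMENT_REQUIRED  # cap reached: reject until reset or raise
        self.spent_usd += cost_usd
        return OK

    def reset_window(self) -> None:
        """Start a new budget window (for example, a new month)."""
        self.spent_usd = 0.0


contractor = BudgetedKey(budget_usd=50.0)
print(contractor.charge(30.0))  # 200
print(contractor.charge(30.0))  # 200 (this call crosses the cap)
print(contractor.charge(1.0))   # 402
```

On the client side, a 402 from a budgeted key is a signal to stop and alert, not to retry.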

    Expiration windows

    Keys can be created with expires_at. After that timestamp, the key is rejected. This is the right tool for:

  • Vendor integrations (give a partner a key valid for 90 days; they renew if they keep shipping)
  • CI environments (rotate quarterly without remembering to)
  • Demos (give an investor a key for the pitch week, not forever)

    Long-lived keys without expirations are still supported, just not encouraged.
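Computing expires_at at creation time is a one-liner worth wrapping. This sketch assumes the field accepts an ISO 8601 UTC timestamp, which the post does not specify; `expiring_key_spec` and `is_expired` are illustrative helpers.

```python
from datetime import datetime, timedelta, timezone


def expiring_key_spec(
    name: str, scopes: list[str], valid_days: int, now: datetime
) -> dict:
    """Build a key-creation body whose expires_at is valid_days from now."""
    expires = now + timedelta(days=valid_days)
    # Assumption: expires_at is an ISO 8601 string; check the API reference.
    return {"name": name, "scopes": scopes, "expires_at": expires.isoformat()}


def is_expired(spec: dict, now: datetime) -> bool:
    """Mirror the documented rule: after expires_at, the key is rejected."""
    return now > datetime.fromisoformat(spec["expires_at"])


now = datetime(2026, 4, 1, tzinfo=timezone.utc)
partner = expiring_key_spec("partner-integration", ["llm"], valid_days=90, now=now)
print(partner["expires_at"])  # 2026-06-30T00:00:00+00:00
```

Passing `now` in explicitly keeps the helper deterministic, which makes the expiry logic easy to unit test.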

    Audit log

    Every API call writes a row: timestamp, key ID (not the key itself), endpoint, model, request size, token counts, latency, response status. The audit endpoint exposes the same data:

    Code
    GET /api/v1/audit?key_id=...&from=2026-04-01

    Filter by key, by tenant, by user. When something looks weird, this is the first place you look.
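For scripted investigations, the query above can be assembled with the standard library. The base URL and key ID below are placeholders; only the path and the key_id and from parameters come from the example request.

```python
from urllib.parse import urlencode


def audit_url(base_url: str, key_id: str, from_date: str) -> str:
    """Build the audit query shown above for one key and a start date."""
    query = urlencode({"key_id": key_id, "from": from_date})
    return f"{base_url}/api/v1/audit?{query}"


# "https://api.example.com" and "key_123" are placeholders, not real values.
print(audit_url("https://api.example.com", "key_123", "2026-04-01"))
# https://api.example.com/api/v1/audit?key_id=key_123&from=2026-04-01
```

Using `urlencode` rather than string concatenation keeps the URL valid if an ID ever contains characters that need escaping.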

    Rotation without downtime

    Rotating a key is three steps:

  • Create a new key with the same scopes and budget
  • Push the new key to your secrets manager and let your services pick it up
  • Revoke the old key once dashboards confirm it is no longer in use

    Both keys are valid simultaneously during the overlap window. CallMissed does not enforce a "one active key per service" model: that is an operational decision, and overlapping a rotation is the right way to do it without downtime.
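The rotation steps above can be sketched as a small function. `KeyStore` here is an in-memory stand-in rather than a real client, and `deploy` and `old_key_idle` are hypothetical callbacks for your secrets manager and your usage dashboards.

```python
class KeyStore:
    """In-memory stand-in for the key service; not a real client."""

    def __init__(self):
        self.active: dict[str, dict] = {}
        self._counter = 0

    def create(self, scopes: list[str], budget_usd: float) -> str:
        self._counter += 1
        key_id = f"key_{self._counter}"
        self.active[key_id] = {"scopes": list(scopes), "budget_usd": budget_usd}
        return key_id

    def revoke(self, key_id: str) -> None:
        del self.active[key_id]


def rotate(store: KeyStore, old_id: str, deploy, old_key_idle) -> str:
    """Create a replacement, deploy it, and revoke the old key only when idle."""
    old = store.active[old_id]
    new_id = store.create(old["scopes"], old["budget_usd"])  # same scopes, budget
    deploy(new_id)  # push to the secrets manager
    # Both keys are valid at this point: this is the overlap window.
    if old_key_idle(old_id):
        store.revoke(old_id)
    return new_id
```

If `old_key_idle` returns False, the old key simply stays active and the revoke can be retried later, which is what makes the overlap safe.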

    Common pitfalls

    A few mistakes we see often:

  • Embedding cm_live_* keys in client-side code. They will end up in source maps, browser caches, and Sentry breadcrumbs. Use a server-side proxy or short-lived tokens.
  • One key per project, forever. Scope creep eventually gives that key all the powers, and revoking it becomes a multi-team migration. Create new scoped keys for new use cases.
  • No expiration on contractor keys. A consultant leaves, the key stays. Set expirations.
  • Treating cm_test_* as private. They are not. Test keys hit a sandboxed model surface; assume they are visible in logs you do not control.
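The first pitfall is easy to catch mechanically. Below is a rough sketch of a check you could run over built frontend bundles or in a pre-commit hook; the character class for the random part is an assumption, since the post only says it is a high-entropy string.

```python
import re

# Assumption: the random part uses letters, digits, "_" or "-".
LIVE_KEY = re.compile(r"cm_live_[A-Za-z0-9_-]+")


def find_live_keys(text: str) -> list[str]:
    """Return every substring that looks like a cm_live_* key."""
    return LIVE_KEY.findall(text)


bundle = 'const client = new Client("cm_live_abc123"); // test: cm_test_zzz'
print(find_live_keys(bundle))  # ['cm_live_abc123']
```

A non-empty result on anything shipped to the browser should fail the build.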

    Generating a key

    The dashboard at /api-keys is the canonical UI. The same operations are available over the API for automated provisioning — most production deployments script key creation in their Terraform or Pulumi setup so a new environment ships with the right keys at the right scopes from day one.
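A provisioning script might look roughly like the sketch below. The body fields match the budget example earlier in this post, but the base URL, the /api/v1/keys path, and the Bearer header are all assumptions made for illustration; check the API reference for the real values. The request is built but deliberately not sent.

```python
import json
import urllib.request

# Hypothetical values: the real base URL, path, and auth header may differ.
BASE_URL = "https://api.callmissed.example"
CREATE_KEY_PATH = "/api/v1/keys"


def build_create_key_request(
    management_key: str, spec: dict
) -> urllib.request.Request:
    """Prepare (but do not send) a key-creation request for one environment."""
    return urllib.request.Request(
        BASE_URL + CREATE_KEY_PATH,
        data=json.dumps(spec).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {management_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


spec = {
    "name": "production-frontend",
    "scopes": ["llm"],
    "budget_usd": 500.00,
    "budget_window": "month",
}
req = build_create_key_request("cm_live_<random>", spec)
print(req.get_method(), req.full_url)
```

Keeping the spec as plain data makes it easy to hold one spec per environment in version control and review scope changes like any other diff.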

    The bar for production AI infrastructure is the same as production database infrastructure. Scopes, budgets, expirations, audit trails, rotation. Skipping any of those is a "we'll regret this in six months" pattern. CallMissed gives you all of them out of the box.
