Finance layer for AI products

Wrap your LLM client
See per-customer profit.

One line of code wraps Anthropic, OpenAI, Bedrock, Google, Vercel AI, and Azure. TokenOps captures every event, reconciles your monthly vendor invoice, and breaks cost down by the customer who incurred it — so you actually know which accounts are profitable.

Tenant id + key pair in your inbox within a business day. No credit card.

6Vendor wrappers
30MCP tools
4Workflow tools
<5 minnpm install → first event
server/llm.tstokenops · v0.4.1
import { Anthropic } from "@anthropic-ai/sdk";import { TokenOps, wrapAnthropic } from "@tokenops/sdk"; const tokenops = new TokenOps({ tenantId: env.TOKENOPS_TENANT }); // One line. Every call now attributed.const client = wrapAnthropic(new Anthropic(), tokenops); const res = await client.messages.create({ model: "claude-sonnet-4-5", messages, customerId: req.user.org, // ← attributes the spend});
Live · last 5 customers
May 2026 · running
CustomerLLM costRevenueMargin
Acme Robotics$4,820.12$9,000.00+46.4%
Brightline$1,283.00$1,900.00+32.5%
Northwind Labs$2,104.55$2,400.00+12.3%
Verdant Inc.$580.40$1,500.00+61.3%
Halcyon$3,210.00$2,000.0060.5%
Captures events from
AnthropicOpenAIAmazon BedrockGoogle AIVercel AIAzure OpenAIAnthropicOpenAIAmazon BedrockGoogle AIVercel AIAzure OpenAI
The wedge

Per-customer cost is the question your spreadsheet can't answer.

AI-native companies pay six vendors, get one invoice each, and have no shared schema for cost. That makes the most important question — is this customer profitable? — a research project every month.

Invoice mystery

The Anthropic invoice doesn't say which customer cost you what.

Vendor bills aggregate at the workspace level. By the time it lands, you've lost the join between spend and the customers who incurred it.

Anthropic · April 2026INV-44-2089

claude-sonnet-4-5 · 412M in$1,236.00
claude-sonnet-4-5 · 88M out$1,320.00
claude-haiku-4-5 · 1.2B in$960.00

Total$3,516.00
↳ by customer: ?
Rate drift

Effective rate ≠ list rate. You find out at month end.

Caching, batch, region, and committed-use discounts mean what you actually pay diverges from the price card on the docs. You don't see the gap until reconciliation.

Effective $ / 1M tokensApril 2026
List: $3.00Effective: $2.41Δ −19.7%
Friday at 4pm

Your CEO wants per-customer gross margin. By Friday.

One spreadsheet, six tabs of CSVs, two SQL joins, and a promise to ‘firm it up next month.’ This is most AI startups today.

CEO
Sarah · Thu 3:47pm
Quick one — can you send me gross margin by customer for the last 90 days before Friday? Need it for the board deck.
What it does

One SDK. One reconciliation engine.
One MCP your agents already speak.

TokenOps slots in at the LLM client and stays out of your way. You wrap once, attribute by customer, and the rest — capture, normalize, reconcile, expose — happens for you.

01

Wrap any LLM client

One-line wrappers for Anthropic, OpenAI, Bedrock, Google, Vercel AI, and Azure. We never proxy your traffic — calls go directly to the vendor.

@tokenops/sdkTypeScript & PythonEdge runtimes
wrapAnthropic(client, tokenops)
wrapOpenAI(client, tokenops)
wrapBedrock(client, tokenops)
// …Google, Vercel AI, Azure
02

Reconcile invoice → events

Every event is priced against captured tokens. Monthly vendor invoices are diffed line by line so you see effective rate, drift, and savings per (vendor, period, customer).

Effective vs. listPer-customer gross marginAnomaly detection
claude-sonnet-4-5$1236.00$1198.40−3.0%
claude-haiku-4-5$960.00$982.10+2.3%
gpt-4o · batch$481.20$312.30−35.1%
03

Answer via MCP

A 30-tool MCP catalog plus 4 composite workflows. Connect Claude Desktop, Cursor, or Codex and ask finance questions in plain English — the agent reaches for the right tools.

30 tools4 workflowsstdio · SSE · HTTP
claude-desktop3 tools called
tokenops.list_customers
tokenops.cost_by_customer
tokenops.gross_margin
MCP catalog

Thirty tools, grouped into five families. Your agent picks the right one.

Connect TokenOps to Claude Desktop, Cursor, Codex, or any MCP-aware client. Read tools are scoped by tenant; write tools require a separate key. No surprises.

ReadWrite
C
tokenops.customers.*7 read tools
list_customersget_customercustomers_by_spendsearch_customerscustomer_metadataactive_customerschurn_risk
E
tokenops.events.*6 read · 1 write
list_eventsget_eventevents_by_customerevents_by_vendorsearch_eventscost_per_eventtag_event
V
tokenops.vendors.*5 read tools
list_vendorsvendor_summarylist_modelspricing_for_modelvendor_invoice_history
F
tokenops.finance.*8 read · 1 write
cost_by_customercost_by_vendorgross_margineffective_raterate_driftreconcile_invoicemonthly_pnlltv_estimateset_price_per_token
S
tokenops.system.*2 read · 1 write
healthlist_workflowsrotate_key
claude-desktop · mcpconnected
What was our gross margin on Acme last month? And which customers lost money?
Workflow tools

Four composite tools for the four finance questions you actually ask.

Each workflow stitches a dozen MCP reads into one structured answer — the kind of multi-step reasoning agents do well, served as a single tool so it's deterministic.

tokenops.workflows.build_pnl

Build a per-customer P&L

Joins captured events to your revenue source of truth and outputs gross margin per customer for any period. Used as the seed for monthly board updates.

await tokenops.workflows.build_pnl(period="2026-04", group_by="customer")
tokenops.workflows.reconcile

Reconcile vendor invoice

Diffs the month's invoice from any of the six vendors against captured events. Flags drift, missing line items, and quietly applied volume discounts.

await tokenops.workflows.reconcile(vendor="anthropic", invoice_id="INV-44-2089")
tokenops.workflows.anomaly

Detect cost anomalies

Watches per-customer cost run-rate and alerts when a customer's daily spend moves more than 2σ from its 30-day baseline. Catches runaway loops before the invoice does.

await tokenops.workflows.anomaly(window="30d", sigma=2)
tokenops.workflows.price_change

Forecast a price change

Simulates a new $/seat or $/usage price against the last 90 days. Returns customers who'd churn at margin <0% and projected revenue lift.

await tokenops.workflows.price_change(new_unit=4.50, lookback="90d")
How it works

Install. Wrap. See. Around five minutes.

01

Install the SDK

Drop @tokenops/sdk (or the Python package) into the service that holds your LLM clients. Edge runtimes welcome.

terminal
$ yarn add @tokenops/sdk$ export TOKENOPS_TENANT=…
02

Wrap your LLM client

One line per vendor. Pass customerId on each call — that's the join key for per-customer cost.

llm.ts
const client = wrapAnthropic(new Anthropic(), tokenops);// every call now attributed
03

See per-customer cost

Open the dashboard, hit /api/cost, or ask your agent. Cost lands within seconds; the monthly vendor invoice is reconciled when it arrives.

May · MTDon track
$48,210.55
LLM cost across 142 customers
FAQ

Common questions, candid answers.

Still curious? Talk to us

How is this different from Helicone, Langfuse, or Braintrust?

Those are LLM observability tools — they help you debug prompts, track latency, and grade outputs. TokenOps is a finance layer. We share the capture step, but the rest of the product is reconciliation, gross margin, and the MCP layer that lets an agent answer ‘is Acme profitable?’ Use both — they're complements, not substitutes.

Do you proxy our LLM calls?

No. The SDK wraps your client in your runtime. Calls go directly to Anthropic, OpenAI, Bedrock, Google, Vercel AI, or Azure. We capture metadata after the response returns — never inserted between you and the vendor.

What does the SDK actually capture?

Per request: model, input/output tokens, vendor-reported usage, latency, your customerId, and any tags you attach. Prompt content is opt-in and stays in your tenant — most teams leave it off.

How does invoice reconciliation work?

Each month you forward the vendor invoice (PDF or CSV) to TokenOps, or we pull it via the vendor's billing API where available. We line-item match it against captured events, surface the effective rate, and flag drift on each (vendor, model, period).

What models and vendors are supported?

Anthropic, OpenAI, Amazon Bedrock, Google AI, the Vercel AI SDK, and Azure OpenAI. Pricing is kept current for Claude, GPT-4/4o, Gemini, Llama on Bedrock, and the major open-weight models served by these clouds. New models usually land within a week.

Pricing?

Early access is free while we're shaping the product. Pricing will be usage-based with a generous free tier — TokenOps should pay for itself in the savings it surfaces on your first vendor reconciliation.

Early access

Stop guessing per-customer cost.

Drop your work email. We'll send a tenant id, an SDK key, and a thirty-second wiring guide within a business day.

  • No proxy. Your LLM calls go direct to the vendor.
  • Prompt content stays in your tenant by default.
  • MCP layer ships with read-only keys by default.