The Claude Billing Split: What Actually Changed on June 15

On June 15, 2026, Anthropic split Claude's billing into two separate consumption pools. If you missed the announcement, you are not alone — it was buried in a support article. But the implications for every engineering team running Claude are significant and immediate.

Here is what changed, what it actually costs you, and what to do about it.

35%: Token Count Inflation
Same prompts, same model name, but a new tokenizer shipped in April 2026 pushes token counts up to 35% higher, for zero additional capability gain.

2 buckets: Billing Split, June 15
Agent SDK and claude-p usage moved off your plan limit into a separate monthly credit pool. Pro gets $20/mo. Max 20x gets $200/mo. Credits don't roll over.

With Trimio Governance
Per-user spend limits, model routing, and token-level attribution mean Anthropic's billing changes don't blindside your finance team.

What Actually Happened on June 15

Anthropic announced a structural change to how Claude subscriptions are billed. Starting June 15, 2026, two distinct consumption pools now exist for any Claude subscriber:

Pool 1 — Interactive use: Claude Code, Claude Cowork, and direct Claude web/app interactions. This stays on your plan's usage limit. Your Pro $20/month still gets you the same allocation of Claude. That part didn't change.

Pool 2 — Agent SDK and programmatic access: Claude Agent SDK, claude-p command-line tool, Claude Code GitHub Actions, and any third-party application built on the Agent SDK. This moved to a separate credit pool, sized by plan tier:

Plan	Monthly Agent SDK Credit
Pro	$20
Max 5x	$100
Max 20x	$200
Team (Standard seat)	$20
Team (Premium seat)	$100
Enterprise (usage-based)	$20
Enterprise (Premium seat)	$200

Credits refresh monthly. Unused credits do not roll over. When the credit is exhausted, usage moves to standard API rates — but only if you have enabled usage credits. If you have not, Agent SDK requests stop when the credit runs out.
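The exhaustion rules above are easy to misread, so here is a minimal sketch of them as a toy model. The class and field names are hypothetical, and splitting a single request across the credit and API billing is an assumption made for illustration, not documented Anthropic behavior.

```python
from dataclasses import dataclass


@dataclass
class AgentCreditPool:
    """Toy model of the monthly Agent SDK credit (names are illustrative)."""
    monthly_credit_usd: float      # e.g. 20.0 for a Pro seat
    usage_credits_enabled: bool    # whether overflow to standard API rates is allowed
    spent_usd: float = 0.0         # spend drawn from the credit this month
    overflow_usd: float = 0.0      # spend billed at standard API rates

    def charge(self, cost_usd: float) -> bool:
        """Bill one request; return False if the request would be blocked."""
        remaining = self.monthly_credit_usd - self.spent_usd
        if cost_usd <= remaining:
            self.spent_usd += cost_usd
            return True
        if not self.usage_credits_enabled:
            return False  # credit exhausted, no overflow: the request stops
        # Assumption: drain the credit, then bill the remainder at API rates.
        self.spent_usd = self.monthly_credit_usd
        self.overflow_usd += cost_usd - remaining
        return True

    def start_new_month(self) -> None:
        """Credits refresh monthly; nothing unused carries over."""
        self.spent_usd = 0.0
        self.overflow_usd = 0.0


capped = AgentCreditPool(20.0, usage_credits_enabled=False)
capped.charge(15.0)            # fits inside the credit
blocked = capped.charge(10.0)  # would exceed the credit, so it is refused
```

The point of the model is the branch in the middle: with usage credits disabled, the failure mode is a hard stop, not a bigger bill.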

What It Means

Anthropic carved Agent SDK usage out of your plan's ceiling and put it in a smaller bucket that refreshes monthly but does not roll over. If you are running OpenClaw, Conductor, Claude Code in CI, or any third-party agent tool on a Pro or Max plan, that usage has a new, smaller cap. This is not a price change. It is a structural change to how much you can consume before getting cut off.

The April 2026 Change You Probably Missed

The June 15 split was announced publicly, but a quieter change shipped in April alongside Claude Opus 4.7: a new tokenizer that fundamentally alters how many tokens Anthropic counts per input.

Under the new tokenizer, a prompt that registered as 1,000 tokens under Claude Opus 4 can register as up to 1,350 tokens under Opus 4.7. The per-token price is technically unchanged: $5 per million input tokens, $25 per million output tokens. But the bill for identical content goes up by as much as 35%.
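To see what that does to a bill, here is the arithmetic as a short sketch. It uses only the figures quoted above (1,000 versus 1,350 tokens, $5 per million input tokens); the one-million-call volume is a hypothetical chosen to make the numbers readable.

```python
INPUT_USD_PER_MTOK = 5.00  # input price quoted in the article


def input_cost_usd(tokens: int, usd_per_mtok: float = INPUT_USD_PER_MTOK) -> float:
    """Cost of a given number of input tokens at a per-million-token rate."""
    return tokens / 1_000_000 * usd_per_mtok


old_tokens = 1_000   # same prompt, old tokenizer
new_tokens = 1_350   # same prompt, new tokenizer (worst case cited)
inflation = new_tokens / old_tokens - 1

calls = 1_000_000    # hypothetical monthly call volume
old_bill = input_cost_usd(old_tokens * calls)
new_bill = input_cost_usd(new_tokens * calls)

print(f"inflation: {inflation:.0%}")
print(f"old bill: ${old_bill:,.2f}  new bill: ${new_bill:,.2f}")
```

The per-token rate never moves in this calculation. The whole increase comes from the token count.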

This is not a pricing increase. It is a tokenizer artifact that nobody announced with a press release. And it has real implications for any team running high-volume workloads:

Long system prompts: Architectural context, multi-role agent setups, detailed instruction files all tokenize differently under the new tokenizer. The delta is proportional to prompt complexity.
Code-heavy inputs: The new tokenizer is more efficient for natural language but less efficient for syntax-heavy content. If you are using Claude for code generation or code review, your token inflation will be higher than the average.
Structured outputs: JSON schemas, XML-formatted responses, and any output with heavy formatting characters see disproportionate token inflation under the new tokenizer.

What to Do Now

Run your most common production prompts through Anthropic's tokenizer tool before assuming token count parity between Opus 4 and Opus 4.7. If you are doing structured output or code-heavy tasks, the 35% inflation is real and your cost model needs to reflect it.
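A minimal sketch of that benchmark follows. It deliberately takes the two token counters as plain functions, because how you obtain counts (Anthropic's tokenizer tool, a token-counting endpoint, logged usage from past calls) depends on your setup. The function name and the stand-in counters in the demo are hypothetical and exist only so the example runs offline.

```python
from typing import Callable, Sequence

TokenCounter = Callable[[str], int]


def tokenizer_delta(
    prompts: Sequence[str],
    count_old: TokenCounter,
    count_new: TokenCounter,
) -> float:
    """Aggregate relative token increase across prompts (0.35 means +35%)."""
    old_total = sum(count_old(p) for p in prompts)
    new_total = sum(count_new(p) for p in prompts)
    if old_total == 0:
        raise ValueError("old tokenizer reported zero tokens; nothing to compare")
    return new_total / old_total - 1.0


# Stand-in counters for illustration only: replace with real token counts.
def fake_old(prompt: str) -> int:
    return len(prompt.split())


def fake_new(prompt: str) -> int:
    return len(prompt.split()) + prompt.count("{")  # pretend braces cost extra


sample_prompts = ["summarize this report", 'return {"status": "ok"} as JSON']
delta = tokenizer_delta(sample_prompts, fake_old, fake_new)
print(f"aggregate token delta: {delta:+.1%}")
```

Run it separately for each workload class (system prompts, code, structured output) so the averages do not hide the worst case.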

Combined Effect: What This Costs a Real Engineering Team

Take a concrete example: a team of 15 engineers, all running Claude Code locally and in CI, plus one shared production automation pipeline running on the Agent SDK.

Under the old model (pre-June 15, pre-April tokenizer change):

15 × Pro ($20/mo): $300/month for interactive use
Production pipeline: counted against plan limits
Token inflation: none (old tokenizer)

Under the new model (post-June 15, post-April tokenizer):

15 × Pro: $300/month — interactive use unchanged
Agent SDK credit per Pro seat: $20/month — total $300 for the team
Production pipeline: burns through $300 in weeks, then hits standard API rates at $5/MTok input
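Token inflation: up to +35% on every call, so the $300 credit covers roughly a quarter less work (1/1.35 ≈ 0.74). At $5/MTok input, $300 pays for 60M billed tokens, which now carry the content of about 44M old-tokenizer tokens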

The real monthly bill for a team running Claude Code locally and Agent SDK in production: $800–$1,200/month where it used to be $300. For the same engineers, doing the same work, with the same tool.
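The numbers in this example can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: all credit spend is uncached input at $5/MTok, the per-seat credits are treated as one team pool (as the example does, though credits may in practice be per seat), and the $800 and $1,200 usage figures are picked to match the range quoted above. The function names are hypothetical.

```python
INPUT_USD_PER_MTOK = 5.0   # standard input rate quoted in the article
TOKEN_INFLATION = 0.35     # upper-bound tokenizer inflation quoted in the article


def credit_buying_power(credit_usd: float) -> tuple[float, float]:
    """Return (billed tokens, old-tokenizer-equivalent tokens) for a credit.

    Assumes the whole credit is spent on uncached input tokens.
    """
    billed = credit_usd / INPUT_USD_PER_MTOK * 1_000_000
    return billed, billed / (1 + TOKEN_INFLATION)


def monthly_bill(
    seats: int,
    seat_price_usd: float,
    credit_per_seat_usd: float,
    agent_usage_usd: float,
) -> float:
    """Subscriptions plus any Agent SDK usage beyond the pooled credit."""
    subscriptions = seats * seat_price_usd
    pooled_credit = seats * credit_per_seat_usd  # assumption: team-wide pool
    overage = max(0.0, agent_usage_usd - pooled_credit)
    return subscriptions + overage


billed, old_equivalent = credit_buying_power(300.0)
low = monthly_bill(15, 20.0, 20.0, agent_usage_usd=800.0)
high = monthly_bill(15, 20.0, 20.0, agent_usage_usd=1200.0)
print(f"{billed / 1e6:.0f}M billed tokens ~ {old_equivalent / 1e6:.1f}M old-tokenizer tokens")
print(f"monthly bill range: ${low:,.0f} to ${high:,.0f}")
```

Because the credit equals the subscription total in this scenario, the bill simply tracks Agent SDK usage once that usage passes $300.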

Claude Cost Evolution: 15-Engineer Team

Scenario	Monthly cost
Pre-June 2026 (flat)	$300/mo
Post-June 15 + token inflation	$1,000/mo
With Trimio governance	$600/mo

Same team, same workload, three cost realities. Trimio's LCR + compression brings the $1,000 post-change bill down to about $600, a reduction of roughly 40%.

Why This Matters Beyond the Bill

The billing split reveals something important about how Anthropic is positioning its pricing going forward. The company is moving deliberately toward a model where interactive human use and programmatic machine use are priced, metered, and governed independently.

This mirrors a broader industry pattern. GitHub began metering Copilot premium requests in June 2025. OpenAI's API has been metered from the start. Google has been gradually shifting Gemini pricing toward usage-based for enterprise. Every major AI provider is converging on the same model: access cost + consumption cost, separated.

The implication for engineering teams is not just "your bill went up." The implication is that AI tool spend is now a variable cost that responds to how your engineers use the tools — not just how many seats you have.

The Shift

AI tool spend is no longer a headcount line item. It is a behavior line item. Two engineers on the same plan, doing the same job, can cost $50/month or $3,000/month depending on how they use the tools. Finance needs attribution data it does not currently have.
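The attribution finance is missing is, at its simplest, a per-user rollup of token usage priced at the published rates. A minimal sketch, assuming you can log a user identifier and token counts for each call (the `Call` record and its field names are hypothetical):

```python
from collections import defaultdict
from typing import Iterable, NamedTuple

INPUT_USD_PER_MTOK = 5.0    # rates quoted in the article
OUTPUT_USD_PER_MTOK = 25.0


class Call(NamedTuple):
    """One logged model call (field names are illustrative)."""
    user: str
    input_tokens: int
    output_tokens: int


def cost_by_user(calls: Iterable[Call]) -> dict[str, float]:
    """Sum the dollar cost of each user's calls."""
    totals: dict[str, float] = defaultdict(float)
    for call in calls:
        totals[call.user] += (
            call.input_tokens * INPUT_USD_PER_MTOK
            + call.output_tokens * OUTPUT_USD_PER_MTOK
        ) / 1_000_000
    return dict(totals)


log = [
    Call("alice", 1_000_000, 0),
    Call("bob", 0, 1_000_000),
    Call("alice", 1_000_000, 0),
]
print(cost_by_user(log))
```

Even this crude rollup shows which usage patterns drive the bill, which a seat count cannot.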

What Teams Running Agentic Workflows Need to Do

1. Audit your current Agent SDK consumption. If you are running OpenClaw, Conductor, Claude Code in CI, or any automated agent pipeline on Pro or Max plans, measure your actual token consumption against the new credit now. The $20 credit per Pro seat will run out faster than you think. A team of 10 engineers running moderate agentic workloads will burn through $200 in less than a week.

2. Move shared production automation to a proper API key. Anthropic's own guidance is explicit: "Teams running shared production automation should use Claude Platform with an API key for predictable pay-as-you-go billing." The Agent SDK credit is sized for individual experimentation, not production workloads. If your pipeline is business-critical, it needs a proper API billing relationship.

3. Benchmark your tokenizer delta. Before assuming cost parity between Opus 4 and Opus 4.7, run your production prompts through Anthropic's tokenizer. The 35% inflation is real for structured outputs and code-heavy inputs. Update your cost models accordingly.

4. Enable budget controls before you need them. When the credit runs out and usage credits are not enabled, Agent SDK requests stop. If your CI pipeline is running Claude Code on a Pro seat credit and it hits the ceiling mid-build, the job fails. Set up spend alerts and hard caps before the credit runs out, not after.
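Steps 1 and 4 can be sketched together: estimate how long a credit lasts at your measured burn rate, then put a guard in front of agent runs that alerts near the limit and refuses before crossing it. All names here are hypothetical, the $40/day burn and the 80% alert threshold are illustrative assumptions, and a real setup would persist spend somewhere durable rather than in memory.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a request would push spend past the hard cap."""


def days_until_exhausted(credit_usd: float, daily_burn_usd: float) -> float:
    """Days of runway left at a constant daily burn rate."""
    if daily_burn_usd <= 0:
        return float("inf")
    return credit_usd / daily_burn_usd


class BudgetGuard:
    """Tracks spend against a hard cap and flags an alert threshold."""

    def __init__(self, hard_cap_usd: float, alert_fraction: float = 0.8) -> None:
        self.hard_cap_usd = hard_cap_usd
        self.alert_at_usd = hard_cap_usd * alert_fraction
        self.spent_usd = 0.0

    def reserve(self, estimated_cost_usd: float) -> bool:
        """Book a request's estimated cost before running it.

        Returns True once cumulative spend has reached the alert threshold.
        Raises BudgetExceeded, without booking, if the cap would be exceeded.
        """
        if self.spent_usd + estimated_cost_usd > self.hard_cap_usd:
            raise BudgetExceeded(
                f"would spend {self.spent_usd + estimated_cost_usd:.2f} "
                f"of a {self.hard_cap_usd:.2f} cap"
            )
        self.spent_usd += estimated_cost_usd
        return self.spent_usd >= self.alert_at_usd


runway_days = days_until_exhausted(200.0, 40.0)  # hypothetical $40/day team burn
guard = BudgetGuard(hard_cap_usd=20.0)           # one Pro seat's credit
needs_alert = guard.reserve(17.0)                # past 80% of the cap
```

In CI, calling `reserve` before each agent run lets the job skip the step or fail fast with a clear message, instead of dying mid-build when the credit runs out.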

How Trimio Handles This

Before Trimio

Claude billing changes blindside finance

Token inflation and billing split hit as line items with no per-user attribution. Finance sees the bill, not the cause. Engineering doesn't know their usage patterns are driving costs.

With Trimio

Every call attributed, every ceiling enforced

Trimio routes Claude API calls through a proxy layer that captures token counts, model selection, and per-user attribution at call time. When Anthropic changes tokenizer behavior, Trimio sees the delta and adjusts your cost model automatically. Per-user spend caps stop runaway pipelines before they hit the credit ceiling.

When Anthropic shipped the tokenizer change in April and the billing split in June, most teams found out from their invoice. Trimio customers found out from their dashboard — and had 30 days to adjust before the changes went live.

Stop watching invoices. Start controlling spend.

Trimio gives engineering teams token-level visibility and finance teams the ceiling they need.

One URL. Zero code changes. Per-user spend limits, model routing, and real-time attribution — without migrating your entire stack.
