🔢 Output/Input Price Multiple Analysis

Understanding the Cost Differential Between Input and Output Tokens

Understanding the Output/Input Multiple

What is the Output/Input Multiple?

The output/input multiple is the per-token output price divided by the per-token input price. It shows how many times more a generated (output) token costs than a prompt (input) token. For example, a 4x multiple means that generating a token costs 4 times as much as reading one.
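As a minimal sketch of the definition above, the multiple can be computed directly from two list prices. The function name and the example prices are hypothetical, chosen only for illustration.

```python
def output_input_multiple(input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the output/input price multiple for prices in $ per million tokens."""
    if input_price_per_m <= 0:
        raise ValueError("input price must be positive to form a ratio")
    return output_price_per_m / input_price_per_m


# Hypothetical prices: $2.50 per million input tokens, $10.00 per million output tokens.
multiple = output_input_multiple(2.50, 10.00)
print(f"{multiple:.2f}x")  # prints "4.00x"
```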

Median Multiple: 4.00x

Why Does This Matter?

  • Generation-Heavy Workloads: If your application generates long outputs (content creation, code generation, long-form answers), output tokens dominate the bill, so a lower multiple at a comparable input price saves you money
  • Analysis-Heavy Workloads: If you send large prompts but get short answers (document analysis, classification), the multiple matters less
  • Agentic Workflows: In workflows with multiple round-trips, output from one call becomes input for the next - each generated token is billed once at the output rate and then again at the input rate on every later call that includes it
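The three workload shapes above can be compared with a small cost model. This is a sketch under simplifying assumptions: the prices ($1 input, $4 output per million tokens) and token counts are hypothetical, and the agent loop assumes no caching, no tool results, and a fixed output length per turn.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost in dollars of one call, with prices given in $ per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


def output_share(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Fraction of a call's cost that comes from output tokens."""
    output_cost = output_tokens * output_price_per_m / 1_000_000
    total = request_cost(input_tokens, output_tokens,
                         input_price_per_m, output_price_per_m)
    return output_cost / total


def agent_loop_cost(initial_prompt_tokens, output_tokens_per_turn, turns,
                    input_price_per_m, output_price_per_m):
    """Total cost when each turn re-sends the prompt plus all earlier outputs."""
    total = 0.0
    context_tokens = initial_prompt_tokens
    for _ in range(turns):
        total += request_cost(context_tokens, output_tokens_per_turn,
                              input_price_per_m, output_price_per_m)
        # This turn's output becomes part of the next turn's input.
        context_tokens += output_tokens_per_turn
    return total


# Hypothetical model with a 4x multiple: $1 in, $4 out per million tokens.
gen_heavy = output_share(500, 4_000, 1.0, 4.0)
analysis_heavy = output_share(50_000, 200, 1.0, 4.0)
print(f"generation-heavy output share: {gen_heavy:.1%}")
print(f"analysis-heavy output share:   {analysis_heavy:.1%}")
print(f"3-turn agent loop: ${agent_loop_cost(1_000, 500, 3, 1.0, 4.0):.4f}")
```

In this model the multiple drives almost the whole bill for the generation-heavy call and very little of it for the analysis-heavy one, while the agent loop pays for earlier outputs again as input on each later turn.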

⚠️ Important Caveats

These ratios likely understate the actual cost differential because they don't account for:

  • Prompt Caching: Many providers (Anthropic, OpenAI, Google) offer prompt caching that significantly reduces input token costs for repeated prefixes. When caching is active, the effective cost of cached input tokens can drop to 10% or even 1% of the listed rate (discounts vary by provider and apply only to the cached portion of the prompt), making the output/input multiple dramatically higher in practice.
  • Bulk Pricing: Some providers offer volume discounts on input tokens or special pricing for batch API calls, which aren't reflected in the base per-token rates shown here.
  • Free Tiers: Certain context amounts may be free or discounted (e.g., first N tokens free), affecting the real-world cost structure.

In real-world scenarios with prompt caching and bulk discounts, output tokens can effectively cost 10x-100x as much as input tokens, not just the 2-5x shown in base pricing.
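To make the caching caveat concrete, here is a rough sketch of an effective multiple when part of the prompt is served from cache. The function and its parameters are hypothetical. `cache_read_ratio=0.1` mirrors the 10% figure mentioned above, and the model deliberately ignores cache-write surcharges, batch discounts, and free tiers.

```python
def effective_multiple(input_price_per_m, output_price_per_m,
                       cached_fraction, cache_read_ratio):
    """Output price divided by the blended (cached + uncached) input price.

    cached_fraction:  share of input tokens served from cache, 0.0 to 1.0
    cache_read_ratio: cached-token price as a fraction of the list input price
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    blended = (1.0 - cached_fraction) + cached_fraction * cache_read_ratio
    effective_input = input_price_per_m * blended
    if effective_input <= 0:
        raise ValueError("effective input price must be positive")
    return output_price_per_m / effective_input


# Hypothetical model with a 4x list multiple ($1 in, $4 out per million tokens).
for fraction in (0.0, 0.5, 0.9, 1.0):
    m = effective_multiple(1.0, 4.0, fraction, cache_read_ratio=0.1)
    print(f"{fraction:.0%} of input cached -> effective multiple {m:.1f}x")
```

Under these assumptions a 4x list multiple stays at 4x with no caching and rises to about 40x when the entire prompt is read from cache at 10% of the list rate.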

Models Analyzed: 198
Median Multiple: 4.00x
Equal Pricing (1x): -
High Multiple (5x+): -

Output/Input Multiple vs Average Cost

This quadrant chart shows how the output/input pricing structure relates to overall model cost. Models are split into four quadrants by the median multiple (4.00x) and the median average cost.
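The quadrant split described above could be reproduced roughly as follows. This is a sketch, not the page's actual method: it assumes "average cost" is the simple mean of the input and output prices, that models sitting exactly on a median fall on the "high" side, and that model names are unique. The model names and prices are hypothetical.

```python
from statistics import median


def classify_quadrants(models):
    """Map each model name to a (multiple side, cost side) quadrant.

    models: list of (name, input_price_per_m, output_price_per_m) tuples.
    """
    multiples = {name: out / inp for name, inp, out in models}
    # Assumption: average cost is the unweighted mean of the two prices.
    avg_costs = {name: (inp + out) / 2 for name, inp, out in models}
    median_multiple = median(multiples.values())
    median_cost = median(avg_costs.values())

    quadrants = {}
    for name in multiples:
        multiple_side = ("high multiple" if multiples[name] >= median_multiple
                         else "low multiple")
        cost_side = "high cost" if avg_costs[name] >= median_cost else "low cost"
        quadrants[name] = (multiple_side, cost_side)
    return quadrants


# Hypothetical models: (name, input $/M, output $/M).
models = [
    ("model-a", 1.0, 4.0),
    ("model-b", 2.0, 4.0),
    ("model-c", 1.0, 8.0),
    ("model-d", 10.0, 20.0),
]
for name, quadrant in classify_quadrants(models).items():
    print(name, quadrant)
```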

Distribution of Output/Input Multiples

How common are different pricing structures across all models?
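A distribution like this can be tabulated by bucketing each model's multiple. In the sketch below, the "equal (1x)" and "5x+" buckets follow the summary labels used on this page. The intermediate bin edges and the sample multiples are assumptions made for illustration.

```python
import math
from collections import Counter


def bucket(multiple: float) -> str:
    """Assign an output/input multiple to a labelled bucket."""
    if math.isclose(multiple, 1.0):
        return "equal (1x)"
    if multiple < 1.0:
        return "below 1x"
    if multiple < 3.0:
        return "1x-3x"
    if multiple < 5.0:
        return "3x-5x"
    return "5x+"


def distribution(multiples):
    """Count how many multiples fall into each bucket."""
    return Counter(bucket(m) for m in multiples)


# Hypothetical sample of multiples.
sample = [1.0, 2.0, 4.0, 4.0, 5.0, 8.0]
counts = distribution(sample)
for label in ("below 1x", "equal (1x)", "1x-3x", "3x-5x", "5x+"):
    print(f"{label:>10}: {counts[label]}")
```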

All Models by Output/Input Multiple

Model Name | Vendor | Input ($/M) | Output ($/M) | Out/In Multiple | Avg Cost ($/M)