AI Model Costs

Compare sticker price, token efficiency, and real task cost across frontier models

Token Cost vs Effective Cost
Each model has two bars: the raw cost of 2M input + 1M output tokens, and the effective cost after adjusting for validated Artificial Analysis (AA) token-usage efficiency.

The adjustment, model by model

The gray bar is the published token price applied to a fixed 2M-input + 1M-output workload.

The purple bar applies the model's AA token-usage multiplier to that raw cost. If purple is below gray, the model uses fewer tokens than the median model; if purple is above gray, it burns more tokens than the median model.
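As a rough sketch of the two bars, the snippet below computes a gray (raw) and a purple (effective) cost for one model. It assumes prices are quoted in USD per million tokens and that the multiplier scales the whole workload cost uniformly; the function names and the example prices are hypothetical, not taken from any real price sheet.

```python
def raw_cost(input_price_per_m: float, output_price_per_m: float,
             input_tokens_m: float = 2.0, output_tokens_m: float = 1.0) -> float:
    """Gray bar: per-million-token prices applied to the fixed 2M + 1M workload."""
    return input_price_per_m * input_tokens_m + output_price_per_m * output_tokens_m


def effective_cost(raw: float, usage_multiplier: float) -> float:
    """Purple bar: raw cost scaled by the token-usage multiplier (1.0 = median usage)."""
    return raw * usage_multiplier


# Hypothetical prices (USD per million tokens) and a hypothetical multiplier.
gray = raw_cost(input_price_per_m=3.0, output_price_per_m=15.0)
purple = effective_cost(gray, usage_multiplier=0.5)

print(f"gray=${gray:.2f} purple=${purple:.2f}")
```

With a multiplier below 1.0, the purple bar lands below the gray bar, which is the token-efficient case described above.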

Token Efficiency
Each dot shows a model's token efficiency relative to median AA token usage. The 1× line is median efficiency; 2× means the model uses half as many tokens as the median for the same tasks, while 0.5× means it uses twice as many.

Why sticker price moves

Token efficiency compares how many tokens each model uses on the Artificial Analysis Index against the median model.

A model at 1× is exactly median. A model at 2× uses half as many tokens as the median, so it is twice as token-efficient. A model at 0.5× uses twice as many tokens as the median.
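One way to read this definition is as the median token count divided by the model's own token count. The sketch below works under that assumption; the model names and token counts are made up for illustration. It also derives a usage multiplier as the reciprocal of efficiency, which is an inference from the wording here rather than a documented formula.

```python
from statistics import median

# Hypothetical token counts (in millions) used on the same benchmark suite.
tokens_used = {"model_a": 50.0, "model_b": 100.0, "model_c": 200.0}

median_tokens = median(tokens_used.values())

# Efficiency relative to the median: 2.0 means half the median's tokens.
efficiency = {name: median_tokens / used for name, used in tokens_used.items()}

# Assumed relationship: the usage multiplier is the inverse of efficiency.
usage_multiplier = {name: 1.0 / eff for name, eff in efficiency.items()}

for name in tokens_used:
    print(name, efficiency[name], usage_multiplier[name])
```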

The vertical scale is logarithmic, so each gridline is one doubling or halving of token efficiency.
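On a base-2 logarithmic axis, a dot's distance from the 1× line is the number of doublings or halvings. A minimal sketch, assuming the chart uses base 2 (which the "one doubling per gridline" description suggests); the function name is hypothetical.

```python
import math


def doublings_from_median(efficiency: float) -> float:
    """Signed number of gridlines above (+) or below (-) the 1x median line."""
    return math.log2(efficiency)


for eff in (0.5, 1.0, 2.0, 4.0):
    print(f"{eff}x -> {doublings_from_median(eff):+.0f} gridline(s)")
```

This is why 2× and 0.5× sit the same distance from the median line, one gridline above and one below.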

IQ vs Effective Cost
Each model's estimated IQ is plotted against its per-task effective cost (token cost × token usage multiplier).

Quality per task dollar

Effective cost combines the price sheet with observed token usage, so it approximates what a task costs in practice rather than only what tokens cost in isolation.

The strongest cost story is not the cheapest model overall. It is which models sit high on IQ while staying far to the right on the reversed cost axis, where further right means cheaper.
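One possible way to make "high on IQ and low on cost" concrete is a Pareto filter: keep only the models that no other model beats on both IQ and effective cost. This is an illustrative technique, not something the chart itself is stated to compute; the `Model` type, the function name, and all numbers below are hypothetical.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    iq: float
    effective_cost: float  # USD per task; lower is better


def pareto_frontier(models: list[Model]) -> list[Model]:
    """Return models not dominated by a cheaper-or-equal model with higher-or-equal IQ."""
    frontier: list[Model] = []
    best_iq = float("-inf")
    # Walk from cheapest to most expensive; on cost ties, try the higher IQ first.
    for m in sorted(models, key=lambda m: (m.effective_cost, -m.iq)):
        if m.iq > best_iq:  # strictly better IQ than anything cheaper
            frontier.append(m)
            best_iq = m.iq
    return frontier


# Hypothetical data points.
models = [
    Model("a", iq=120, effective_cost=30.0),
    Model("b", iq=110, effective_cost=5.0),
    Model("c", iq=105, effective_cost=12.0),
    Model("d", iq=125, effective_cost=60.0),
]

frontier_names = [m.name for m in pareto_frontier(models)]
print(frontier_names)
```

In this made-up data, model "c" drops out because "b" is both cheaper and higher on IQ; the remaining models each trade more cost for more IQ.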