How smart is your AI model, really?

AI IQ intelligently estimates the IQs of popular AI models

AI Models by IQ
Each model's estimated IQ plotted on a standard normal IQ distribution
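
To make the distribution placement concrete, here is a minimal sketch that converts an estimated IQ into a percentile. It assumes the conventional IQ scale (mean 100, standard deviation 15); the page does not state the parameters it uses, and `iq_percentile` is a hypothetical helper name.

```python
from statistics import NormalDist

# Assumption: conventional IQ scale (mean 100, SD 15); not stated on the page.
IQ_SCALE = NormalDist(mu=100, sigma=15)


def iq_percentile(iq: float) -> float:
    """Percentage of the reference distribution scoring at or below `iq`."""
    return 100 * IQ_SCALE.cdf(iq)


for iq in (100, 115, 130):
    print(iq, round(iq_percentile(iq), 1))
```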

How AI IQ estimates model intelligence

  1. We archive source captures from public benchmark leaderboards and extract only source-backed values
  2. We map each benchmark score to an implied IQ using calibrated difficulty curves
  3. We group 18 benchmarks into five reasoning dimensions: fluid abstraction, mathematical, programmatic, critical, and agentic
  4. We conservatively fill missing benchmark and dimension estimates only inside the scoring pipeline
  5. Every derived IQ averages all five dimensions, so missing coverage cannot make a model look better by omission
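
Steps 2, 4, and 5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the site's actual pipeline: the curve anchors are made up, the helper names are hypothetical, and "conservative fill" is modeled here as reusing the model's lowest observed dimension estimate.

```python
from bisect import bisect_right
from statistics import mean

DIMENSIONS = ("fluid_abstraction", "mathematical", "programmatic", "critical", "agentic")

# Hypothetical (benchmark score, implied IQ) anchors; real curves are calibrated per benchmark.
EXAMPLE_CURVE = [(0.0, 70.0), (50.0, 100.0), (100.0, 145.0)]


def implied_iq(score, curve):
    """Piecewise-linear interpolation through the anchors, clamped at both ends."""
    xs = [x for x, _ in curve]
    if score <= xs[0]:
        return curve[0][1]
    if score >= xs[-1]:
        return curve[-1][1]
    i = bisect_right(xs, score)  # first anchor strictly above the score
    (x0, y0), (x1, y1) = curve[i - 1], curve[i]
    return y0 + (y1 - y0) * (score - x0) / (x1 - x0)


def derived_iq(dimension_iqs):
    """Average all five dimensions, filling gaps instead of skipping them.

    Assumption: a missing dimension is filled with the lowest observed
    dimension estimate, so thin coverage can only pull the average down.
    """
    if not dimension_iqs:
        raise ValueError("at least one dimension estimate is required")
    floor = min(dimension_iqs.values())
    return mean(dimension_iqs.get(d, floor) for d in DIMENSIONS)


print(implied_iq(75.0, EXAMPLE_CURVE))
partial = {"mathematical": 130.0, "programmatic": 120.0, "critical": 110.0}
print(derived_iq(partial))
```

With the two missing dimensions filled at the floor, the example averages to 116 rather than the 120 that averaging only the three observed dimensions would give.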

IQ vs Effective Cost
Each model's estimated IQ plotted against its per-task effective cost (sticker price × usage multiplier)

Effective cost & iso-curves

Effective cost on the X-axis is token cost (the cost of 2M input + 1M output tokens) × token usage multiplier (this model's AA token usage ÷ the median AA token usage). It is what each model spends on a task that the median model completes with that 2:1 token mix.
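
The arithmetic can be written out as a short function. This is a sketch: the parameter names are hypothetical, and it assumes prices are quoted per million tokens.

```python
def effective_cost(input_price_per_mtok, output_price_per_mtok,
                   model_token_usage, median_token_usage):
    """Token cost for 2M input + 1M output tokens, scaled by relative token usage.

    Assumption: prices are in currency units per million tokens.
    """
    token_cost = 2 * input_price_per_mtok + 1 * output_price_per_mtok
    usage_multiplier = model_token_usage / median_token_usage
    return token_cost * usage_multiplier


# Made-up numbers: 3.00 per 1M input tokens, 15.00 per 1M output tokens,
# and a model that uses 1.5x as many tokens as the median.
print(effective_cost(3.0, 15.0, model_token_usage=150, median_token_usage=100))
```

In this made-up case the sticker cost is 21.00 and the usage multiplier is 1.5, so a model that looks cheap per token can still land further along the cost axis if it uses more tokens than the median.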

Iso-curves trace lines of equal preference for IQ versus cost. The slider weights quality vs cost: center is 1:1, drag toward Cost to make cost matter more, or toward IQ to make quality matter more. Models above and to the right of a curve are strictly better.

Frontier IQ Over Time
X = release date. Y = estimated IQ. Provider step-lines connect each provider's flagship frontier checkpoints over time.

Tracking frontier progress

Each dot is a model with a known release date and a derived IQ estimate. Models are positioned left-to-right by release date, so the chart shows how the frontier changes over time rather than just where models rank today.

Provider-colored lines connect each lab's flagship frontier checkpoints. Codex, mini, nano, flash, coder, and smaller open-weight variants are omitted so the chart tracks each lab's main offering rather than every SKU.
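
One way to picture the filtering and step-line logic is the sketch below. It rests on assumptions the page does not confirm: variants are detected by name tokens, and a "frontier checkpoint" is taken to mean a flagship release that sets a new high for its provider. The model names and values are invented for illustration.

```python
import re
from datetime import date

# Variant markers named on the page; matching them as whole name tokens is an assumption.
EXCLUDED_TOKENS = {"codex", "mini", "nano", "flash", "coder"}


def is_flagship(name):
    """True if no excluded marker appears as a whole token of the model name."""
    # Split on non-alphanumerics so "mini" does not match inside a longer word.
    tokens = re.split(r"[^a-z0-9]+", name.lower())
    return EXCLUDED_TOKENS.isdisjoint(tokens)


def frontier_steps(models):
    """Return (release_date, iq) points where a flagship model sets a new high.

    `models` is an iterable of (name, release_date, iq) tuples for one provider.
    """
    steps, best = [], float("-inf")
    for name, released, iq in sorted(models, key=lambda m: m[1]):
        if is_flagship(name) and iq > best:
            steps.append((released, iq))
            best = iq
    return steps


# Invented example data for a hypothetical provider.
lab_a = [
    ("lab-a-1", date(2024, 1, 1), 110),
    ("lab-a-1-mini", date(2024, 3, 1), 105),
    ("lab-a-2", date(2024, 6, 1), 108),
    ("lab-a-3", date(2025, 1, 1), 125),
]
print(frontier_steps(lab_a))
```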

This view is most useful for spotting whether a new release is actually ahead of its direct predecessor, or whether source coverage and conservative imputations are shaping the comparison.

AI Models by EQ
Each model's estimated EQ plotted on the same standard normal distribution used for IQ

How AI IQ estimates emotional intelligence

  1. We pull in each model's Text Arena Elo score and EQ-Bench 3 Elo score
  2. We map each source score to an estimated EQ using calibrated piecewise-linear scales
  3. EQ-Bench 3 is retained as the dedicated emotional/social reasoning signal, but treated as style-sensitive because it is judged by Claude
  4. Anthropic models receive a 300-point Elo adjustment on EQ-Bench before mapping
  5. The composite EQ requires both source-backed components and is the average of the mapped Text Arena and EQ-Bench 3 estimates
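
The steps above can be sketched as follows. Everything numeric except the 300-point figure is made up: the scale anchors are hypothetical, the function names are illustrative, and the sign of the Anthropic adjustment is an assumption (modeled here as a deduction, since the page gives only its size).

```python
# Assumption: the adjustment is a deduction; the page states only "300-point".
ANTHROPIC_EQBENCH_ADJUSTMENT = -300

# Hypothetical (Elo, EQ) anchors for the two piecewise-linear scales.
ARENA_SCALE = [(1200, 85.0), (1300, 100.0), (1450, 115.0)]
EQBENCH_SCALE = [(800, 80.0), (1200, 100.0), (1600, 120.0)]


def to_eq(elo, scale):
    """Piecewise-linear map from an Elo score to an estimated EQ, clamped at the ends."""
    if elo <= scale[0][0]:
        return scale[0][1]
    if elo >= scale[-1][0]:
        return scale[-1][1]
    for (x0, y0), (x1, y1) in zip(scale, scale[1:]):
        if x0 <= elo <= x1:
            return y0 + (y1 - y0) * (elo - x0) / (x1 - x0)


def composite_eq(provider, arena_elo, eqbench_elo):
    """Average of the two mapped signals, or None if either source is missing."""
    if arena_elo is None or eqbench_elo is None:
        return None
    if provider == "Anthropic":
        eqbench_elo += ANTHROPIC_EQBENCH_ADJUSTMENT  # applied before mapping
    return (to_eq(arena_elo, ARENA_SCALE) + to_eq(eqbench_elo, EQBENCH_SCALE)) / 2


print(composite_eq("Anthropic", 1300, 1500))
print(composite_eq("OtherLab", 1300, 1500))
print(composite_eq("OtherLab", 1300, None))
```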

IQ vs EQ
X = composite EQ. Y = IQ. Color = model provider.

IQ and EQ tradeoffs

IQ summarizes benchmark-based reasoning ability across fluid abstraction, mathematical reasoning, programmatic reasoning, critical reasoning, and agentic reasoning dimensions.

EQ estimates interaction quality from Text Arena and EQ-Bench 3 signals, then maps those scores onto the same kind of normalized scale so models can be compared directly.

Iso-curves trace lines of equal preference between IQ and EQ. The slider weights the two: center is 1:1, drag toward EQ to make EQ matter more, or toward IQ to make IQ matter more. Models above and to the right of a curve are strictly better at that preference.
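
As a rough illustration of how a slider weight changes the ranking, the sketch below scores models with a weighted sum of IQ and EQ. The linear form is an assumption made for simplicity; the page does not give the function behind its iso-curves, and the two models are invented.

```python
def preference(iq, eq, iq_weight=0.5):
    """Weighted preference score; iq_weight=0.5 corresponds to the 1:1 center.

    Assumption: a linear blend. Models with the same score would sit on the
    same iso-line under this simplification.
    """
    if not 0.0 <= iq_weight <= 1.0:
        raise ValueError("iq_weight must be between 0 and 1")
    return iq_weight * iq + (1.0 - iq_weight) * eq


# Invented (IQ, EQ) pairs for two hypothetical models.
model_a = (120, 100)
model_b = (110, 110)

for w in (0.25, 0.5, 0.75):
    print(w, preference(*model_a, iq_weight=w), preference(*model_b, iq_weight=w))
```

At the 1:1 center the two invented models tie. Weighting toward IQ favors the first and weighting toward EQ favors the second, whereas a model that is higher on both axes would win at every weight.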

IQ vs EQ vs Cost in 3D
3D scatter: X = EQ, Y = IQ, Z = effective cost (log). Color = provider. Drag to rotate.

Three dimensions, one view

Most charts on this page reduce model comparison to two axes. This one keeps all three: EQ (X), IQ (Y), and effective cost (Z, the log-scaled depth axis). Effective cost is sticker price for a 2M-input + 1M-output workload multiplied by the blended usage multiplier.

Drag to rotate the cloud. The dashed line is the central tradeoff axis: it is perpendicular to the isoquant surface at the middle of the cube and points toward higher IQ, higher EQ, and lower effective cost. Models nearer the green end are stronger all-around deals; models nearer the red end give up capability, cost efficiency, or both.

Color = provider, matching the legend below.

IQ Methodology
EQ Methodology