The presence premium study

The data

Necessity by concept state

How often developing a concept turned a failing answer into a sufficient one for that concept's own-intent query, restricted to concepts in the essential and important tiers. The rates in all three natural states are low and close together.

Concept state	n	Necessary
Flagged missing (add it)	24	17%
Flagged thin (deepen it)	186	10%
Already developed (placebo)	171	8%

The missing-minus-thin gap is +7 points, with a 95% bootstrap interval of [−8, +15] that crosses zero, so the data show no presence premium. The placebo arm at 8% confirms the instrument does not credit redundant additions.
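To make the interval concrete, here is a minimal sketch of a percentile bootstrap for the gap between two arms. This page reports 95% bootstraps over per-topic means but not the exact resampling scheme, so resampling each arm's topics independently is an assumption, and the per-topic rates below are made-up placeholders, not study data.

```python
import random
from statistics import mean

def bootstrap_gap_ci(arm_a, arm_b, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap for mean(arm_a) - mean(arm_b).

    arm_a and arm_b are lists of per-topic necessity rates. Topics are
    resampled with replacement within each arm (an assumed scheme).
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_boot):
        sample_a = rng.choices(arm_a, k=len(arm_a))
        sample_b = rng.choices(arm_b, k=len(arm_b))
        gaps.append(mean(sample_a) - mean(sample_b))
    gaps.sort()
    low = gaps[int((alpha / 2) * n_boot)]
    high = gaps[int((1 - alpha / 2) * n_boot) - 1]
    return mean(arm_a) - mean(arm_b), low, high

# Hypothetical per-topic rates, for illustration only.
missing_by_topic = [0.0, 0.5, 0.0, 0.0, 1.0, 0.0, 0.0, 0.25]
thin_by_topic = [0.1, 0.0, 0.2, 0.1, 0.0, 0.15, 0.1, 0.05, 0.2, 0.1]

gap, low, high = bootstrap_gap_ci(missing_by_topic, thin_by_topic)
crosses_zero = low <= 0 <= high  # the reading applied to [-8, +15] above
```

With few topics in the smaller arm, the resampled means vary widely, which is one way a positive point estimate can sit inside an interval that still includes zero.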

How often the page already answered the flagged concept

The reason the necessity rates are low and flat: in every state, most concepts’ own-intent queries were already answerable from the page as written. The flag tracks the words on the page, not what the page can answer.

Concept state	Already answerable as-is
Flagged missing	62%
Flagged thin	57%
Already developed	74%

Nearly two-thirds of flagged-missing concepts were already answerable. The missing label is a weak proxy for a real retrieval gap.

The controlled leave-one-out result

When we removed a concept the page demonstrably answered, creating a guaranteed gap in the page's content, and then added it back, presence mattered. The contrast with deepening a thin concept is the one comparison that holds.

Move	n	Necessary
Close a controlled real gap (leave-one-out)	60	37%
Close a controlled real gap, when removal broke the answer	27	82%
Deepen a flagged-thin concept (for comparison)	186	10%

Real-gap minus deepening is +27 points, with a 95% bootstrap interval of [+22, +49]. Removing a covered concept opened a real answerability gap only 45% of the time (27 of 60); in the remaining cases the page still answered the query through other content.
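The figures in this table can be cross-checked with a little arithmetic. The sketch below uses only the reported numbers. Reading the overall rate as "share of removals that broke the answer, times the necessity rate among those" is an inference from the table, not something the study states.

```python
# Reported values from the leave-one-out table.
n_loo = 60                # controlled leave-one-out tests
n_broke = 27              # tests where removal broke the answer
rate_overall = 0.37       # necessity rate across all 60
rate_given_broke = 0.82   # necessity rate among the 27
rate_deepen = 0.10        # deepening a flagged-thin concept

# Share of removals that opened a real answerability gap.
share_broke = n_broke / n_loo

# Point gap between closing a controlled gap and deepening, in points.
gap_pp = round((rate_overall - rate_deepen) * 100)

# Part of the overall rate attributable to the broken-answer cases.
from_broken = share_broke * rate_given_broke
residual = rate_overall - from_broken

print(f"share broke: {share_broke:.0%}, gap: +{gap_pp} points")
print(f"from broken cases: {from_broken:.3f}, residual: {residual:.3f}")
```

The residual is close to zero, which suggests (within rounding) that almost all of the 37% comes from the cases where removal actually broke the answer.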

Among concepts the page genuinely could not answer

A cross-check on the natural arms, narrowed to concepts whose query the page could not fully answer as written. Here too the direction favours closing absences, though the natural missing cell is tiny, which is exactly why the controlled arm was needed.

Concept state	n	Necessary
Flagged missing	9	44%
Flagged thin	79	24%
Already developed	45	31%

This is directional support, on small numbers, that real gaps matter more; the controlled leave-one-out arm is the load-bearing version of this test.

The pre-registered gates

Gate	Pre-registered test	Result	Verdict
G1: presence premium	missing minus thin gap ≥ 25pp, CI > 0	+7pp [−8, +15]	FAIL
G3: placebo behaves	padding a developed concept lifts ≤ 12%	8%	PASS
G4: instrument control	known-answer cases scored correctly	8/8, 16/16	PASS
G5: corpus yield	≥ 60 natural absent ablations	36	RELAXED
LOO: controlled gap	real-gap vs deepening gap, CI > 0	+27pp [+22, +49]	HOLDS

Confidence intervals are 95% bootstraps over per-topic means. G1 is the pre-registered premium test and failed. Natural absences were too scarce to meet G5 (36 against a 60 target), so it was relaxed and the controlled leave-one-out arm carries the decisive presence test.
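As a rough illustration, the gate logic can be written as simple predicates over the reported values. The function names are hypothetical, and reading "CI > 0" as "the lower bound of the interval is above zero" is an assumption about how the gates were scored.

```python
def g1_presence_premium(gap_pp, ci_low_pp, min_gap_pp=25):
    """G1: gap of at least 25 points and an interval entirely above zero."""
    return gap_pp >= min_gap_pp and ci_low_pp > 0

def g3_placebo(placebo_rate_pct, ceiling_pct=12):
    """G3: padding an already developed concept lifts at most 12%."""
    return placebo_rate_pct <= ceiling_pct

def g5_corpus_yield(n_absent, target=60):
    """G5: at least 60 natural absent ablations."""
    return n_absent >= target

def loo_controlled_gap(ci_low_pp):
    """LOO: real-gap vs deepening interval entirely above zero."""
    return ci_low_pp > 0

# Values taken from the gate table.
results = {
    "G1": g1_presence_premium(gap_pp=7, ci_low_pp=-8),
    "G3": g3_placebo(placebo_rate_pct=8),
    "G5": g5_corpus_yield(n_absent=36),
    "LOO": loo_controlled_gap(ci_low_pp=22),
}
print(results)
```

G5 evaluates to false against its original target, which matches the table's RELAXED entry: the target was not met, and the gate was relaxed instead of being enforced.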

The control battery

To rule out a blunt instrument, we ran the exact answer-and-judge machinery on hand-built cases with known answers.

Control type	Expected	Correct
Positive (obvious lift)	necessary	8 / 8
Negative, already answered	not necessary	8 / 8
Negative, irrelevant addition	not necessary	8 / 8

The instrument detects necessity when it is there and rejects it when it is not, so the low rates above are a real property of the flagged set, not a measurement failure.
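The necessity criterion used throughout (a failing answer becomes a sufficient one) can be sketched against the three control types. The `Trial` type and the before/after outcomes assigned to each control are illustrative assumptions, not the study's actual cases or code.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    sufficient_before: bool  # was the answer sufficient before the addition?
    sufficient_after: bool   # was it sufficient after the addition?

def is_necessary(trial: Trial) -> bool:
    """An addition is necessary if it turns a failing answer into a sufficient one."""
    return (not trial.sufficient_before) and trial.sufficient_after

# One hypothetical case per control type, paired with the expected label.
controls = {
    "positive (obvious lift)": (Trial(False, True), True),
    "negative, already answered": (Trial(True, True), False),
    "negative, irrelevant addition": (Trial(False, False), False),
}

correct = sum(is_necessary(trial) == expected
              for trial, expected in controls.values())
print(f"{correct} / {len(controls)} control types scored as expected")
```

Under this definition an addition to an already answerable page can never count as necessary, which is why a high already-answerable share pushes necessity rates down.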

The corpus

The corpus comprises 45 real English explanatory pages, one per topic, across general, professional, and expert subjects, chosen to be narrow slices of broad topics. We ran 398 natural concept ablations plus 60 controlled leave-one-out tests, scored across roughly six thousand model calls. Answers were written by one neutral model (Qwen2.5-72B) and scored by a cross-family panel of three (Gemini 2.5 Flash, DeepSeek V4 Pro, GPT-4.1-mini). None of these models is from Anthropic, because production runs an Anthropic model. This study follows the Depth Necessity Study, which tested the thin arm alone on a different corpus.
