ContentGrapher
The presence premium study

The data

Necessity by concept state

How often developing a concept turned a failing answer to its own-intent query into a sufficient one, within the essential and important tiers. All three natural states score low and close together.

| Concept state | n | Necessary |
|---|---|---|
| Flagged missing (add it) | 24 | 17% |
| Flagged thin (deepen it) | 186 | 10% |
| Already developed (placebo) | 171 | 8% |

The missing minus thin gap is +7 points, with a 95% bootstrap interval of [−8, +15] that crosses zero. No presence premium. The placebo arm at 8% confirms the instrument does not credit redundant additions.
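As a rough illustration of how an interval like this can be computed, the sketch below resamples topics with replacement and takes the 2.5th and 97.5th percentiles of the resampled gap. The function name `bootstrap_gap_ci`, the per-topic input shape, and the toy numbers are all assumptions made for illustration; they are not the study's code or data.

```python
import random
from statistics import mean

def bootstrap_gap_ci(topics, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(missing) - mean(thin), resampling topics.

    `topics` is a list of (missing_rate, thin_rate) pairs, one per topic,
    each rate in percent. This input shape is an assumption for illustration.
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_boot):
        # Resample whole topics with replacement, keeping the original count.
        sample = [rng.choice(topics) for _ in topics]
        gaps.append(mean(m for m, _ in sample) - mean(t for _, t in sample))
    gaps.sort()
    lo = gaps[int((alpha / 2) * n_boot)]
    hi = gaps[int((1 - alpha / 2) * n_boot) - 1]
    point = mean(m for m, _ in topics) - mean(t for _, t in topics)
    return point, lo, hi

# Toy per-topic rates (hypothetical, not the study's data).
toy_topics = [(0, 10), (50, 0), (0, 20), (25, 10), (0, 5), (30, 15)]
point, lo, hi = bootstrap_gap_ci(toy_topics)
crosses_zero = lo <= 0 <= hi
```

An interval that contains zero, as `crosses_zero` checks here, means the resampled gaps include both positive and negative values, so the sign of the gap is not settled by the data.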

How often the page already answered the flagged concept

The reason the necessity rates are low and flat: in every state, most concepts’ own-intent queries were already answerable from the page as written. The flag tracks the words on the page, not what the page can answer.

| Concept state | Already answerable as-is |
|---|---|
| Flagged missing | 62% |
| Flagged thin | 57% |
| Already developed | 74% |

Nearly two-thirds of flagged-missing concepts were already answerable. The missing label is a weak proxy for a real retrieval gap.

The controlled leave-one-out result

When we removed a concept the page demonstrably answered, creating a known absence, and then added it back, presence mattered. The contrast against deepening a thin concept is the one comparison that holds.

| Move | n | Necessary |
|---|---|---|
| Close a controlled real gap (leave-one-out) | 60 | 37% |
| Close a controlled real gap, when removal broke the answer | 27 | 82% |
| Deepen a flagged-thin concept (for comparison) | 186 | 10% |

Real-gap minus deepening is +27 points, 95% bootstrap interval [+22, +49]. Removing a covered concept opened a real answerability gap only 45% of the time; in the remaining cases, the page still answered the query through other content.
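The figures in this section can be reconciled with a little arithmetic. In the sketch below, the two `n` values and the percentages are the reported ones; the counts of necessary verdicts are back-calculated from rounded percentages, so treat them as assumptions rather than reported figures.

```python
n_loo = 60      # controlled leave-one-out tests (reported)
n_broke = 27    # tests where removal broke the answer (reported)

# Share of removals that opened a real answerability gap, in percent.
share_broke = round(100 * n_broke / n_loo)

# Implied counts of "necessary" verdicts, back-calculated from rounded rates.
necessary_all = round(0.37 * n_loo)
necessary_broke = round(0.82 * n_broke)

# Real-gap rate minus deepening rate, in percentage points.
gap_vs_deepening = 37 - 10
```

If the back-calculated counts are right, the necessary verdicts in the full leave-one-out arm and in the broke-the-answer subset are the same size, which would suggest the lift comes almost entirely from cases where removal actually broke the answer.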

Among concepts the page genuinely could not answer

A cross-check on the natural arms, narrowed to concepts whose query the page could not fully answer as written. Here too the direction favours closing absences, though the natural missing cell is tiny, which is exactly why the controlled arm was needed.

| Concept state | n | Necessary |
|---|---|---|
| Flagged missing | 9 | 44% |
| Flagged thin | 79 | 24% |
| Already developed | 45 | 31% |

Directional support that real gaps matter more, on small numbers; the controlled leave-one-out arm is the load-bearing version of this test.
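The small cell sizes here line up with the answerability rates reported earlier. The sketch below assumes the narrowed set is simply the complement of the "already answerable as-is" share within each arm, and that both were computed on the same concepts; neither assumption is stated outright in the text, and the helper name is hypothetical.

```python
# arm: (n in the necessity table, share already answerable, n in the narrowed table)
arms = {
    "Flagged missing": (24, 0.62, 9),
    "Flagged thin": (186, 0.57, 79),
    "Already developed": (171, 0.74, 45),
}

def implied_unanswered(n, answerable_share):
    """Concepts left once the already-answerable share is set aside."""
    return round(n * (1 - answerable_share))

# Difference between the implied and the reported narrowed cell size, per arm.
diffs = {
    arm: implied_unanswered(n, share) - reported
    for arm, (n, share, reported) in arms.items()
}
```

Under those assumptions each implied count lands within one concept of the reported cell size, a difference that rounding of the percentages can account for.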

The pre-registered gates

| Gate | Pre-registered test | Result |
|---|---|---|
| G1: presence premium | missing minus thin gap ≥ 25pp, CI > 0 | +7pp [−8, +15] (FAIL) |
| G3: placebo behaves | padding a developed concept lifts ≤ 12% | 8% (PASS) |
| G4: instrument control | known-answer cases scored correctly | 8/8, 16/16 (PASS) |
| G5: corpus yield | ≥ 60 natural absent ablations | 36 (RELAXED) |
| LOO: controlled gap | real-gap vs deepening gap, CI > 0 | +27pp [+22, +49] (HOLDS) |

Confidence intervals are 95% bootstraps over per-topic means. G1 is the pre-registered premium test and failed. Natural absences were too scarce to meet G5 (36 against a 60 target), so it was relaxed and the controlled leave-one-out arm carries the decisive presence test.
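The gate rules reduce to simple threshold checks. The sketch below encodes the G1 and LOO rules as stated in the table and applies them to the reported numbers; the `GapResult` type and the function names are hypothetical, introduced only for this illustration.

```python
from dataclasses import dataclass

@dataclass
class GapResult:
    point: float  # gap in percentage points
    lo: float     # lower bound of the 95% interval
    hi: float     # upper bound of the 95% interval

def passes_premium_gate(result, min_gap=25.0):
    """G1-style rule: gap of at least `min_gap` pp and an interval entirely above zero."""
    return result.point >= min_gap and result.lo > 0

def ci_above_zero(result):
    """LOO-style rule: interval entirely above zero."""
    return result.lo > 0

g1 = GapResult(point=7, lo=-8, hi=15)
loo = GapResult(point=27, lo=22, hi=49)

g1_passes = passes_premium_gate(g1)   # fails on both the size and the interval
loo_holds = ci_above_zero(loo)        # lower bound is above zero
g3_passes = 8 <= 12                   # placebo lift within the 12% ceiling
g5_met = 36 >= 60                     # natural absences short of the target
```

Note that the LOO rule as written asks only for an interval above zero, a weaker condition than G1, which also requires the gap itself to reach 25pp.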

The control battery

To rule out a blunt instrument, we ran the exact answer-and-judge machinery on hand-built cases with known answers.

| Control type | Expected | Correct |
|---|---|---|
| Positive (obvious lift) | necessary | 8 / 8 |
| Negative, already answered | not necessary | 8 / 8 |
| Negative, irrelevant addition | not necessary | 8 / 8 |

The instrument detects necessity when it is there and rejects it when it is not, so the low rates above are a real property of the flagged set, not a measurement failure.

The corpus

45 real English explanatory pages, one per topic, across general, professional, and expert subjects, chosen to be narrow slices of broad topics. We ran 398 natural concept ablations plus 60 controlled leave-one-out tests, scored across roughly six thousand model calls. Answers were written by one neutral model (Qwen2.5-72B) and scored by a cross-family panel of three (Gemini 2.5 Flash, DeepSeek V4 Pro, GPT-4.1-mini). All four are non-Anthropic models, because production runs an Anthropic model. This study follows the Depth Necessity Study, which tested the thin arm alone on a different corpus.
