You can now read a controlled study showing that, on pages with five or more flagged gaps, ContentGrapher's gap picks lift AI retrieval by 11.2 percentage points more than adding the same volume of content to parts the page already covers.
Until now, the case that filling flagged gaps actually improves AI retrieval rested on the underlying scoring math and small spot checks. There was no controlled evidence that the specific picks matter, separate from the general effect of adding more structural content to a page.
Two questions kept surfacing from users evaluating ContentGrapher: do the analyzer's picks beat random structural addition, and is the gap count itself meaningful? A study designed around a decoy arm, where the control treatment adds equal content to parts the analyzer said were already fine, is the only way to isolate the signal.
Coverage Score v3 stabilised the underlying instrument three days ago. With the scoring noise reduced, running a 40-page controlled test became worthwhile, because variation in the result is now signal rather than measurement drift.
The Decoy Study is live at contentgrapher.io/research/decoy-study. Forty third-party pages were each rendered in three versions (original, treatment, decoy), then run through a standard retrieval pipeline with 320 questions written by two AI personas blind to the page contents.
The headline finding is the split. On pages with five or more flagged gaps, the analyzer's specific picks lifted retrieval by 11.2 percentage points more than the decoy. On pages with two or fewer flagged gaps, the two arms moved retrieval by the same amount. This makes the gap count itself a signal: when ContentGrapher flags a lot of gaps, which gap you fill matters; when it flags few, the page is already close to complete and any structural addition helps about the same.
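For readers who want to check the split against the per-page data, the comparison can be sketched roughly as follows. This is a minimal illustration, not the study's actual analysis code: the field names (`flagged_gaps`, `original`, `treatment`, `decoy`), the `three_or_four` bucket, and the sample numbers are all assumptions made up for the example, and the real data export may be shaped differently.

```python
from statistics import mean


def bucket(flagged_gaps):
    """Group a page by gap count, using the two thresholds the study reports."""
    if flagged_gaps >= 5:
        return "five_or_more"
    if flagged_gaps <= 2:
        return "two_or_fewer"
    # Assumption: how the study treats pages with 3 or 4 gaps is not stated here.
    return "three_or_four"


def lift_gap_by_bucket(pages):
    """Mean of (treatment lift - decoy lift) per bucket, in percentage points."""
    diffs = {}
    for page in pages:
        treatment_lift = page["treatment"] - page["original"]
        decoy_lift = page["decoy"] - page["original"]
        diffs.setdefault(bucket(page["flagged_gaps"]), []).append(
            treatment_lift - decoy_lift
        )
    return {name: mean(values) for name, values in diffs.items()}


# Placeholder retrieval rates in percent. These are NOT study results.
pages = [
    {"flagged_gaps": 6, "original": 40, "treatment": 58, "decoy": 50},
    {"flagged_gaps": 7, "original": 30, "treatment": 52, "decoy": 40},
    {"flagged_gaps": 1, "original": 70, "treatment": 75, "decoy": 75},
    {"flagged_gaps": 2, "original": 65, "treatment": 71, "decoy": 71},
]

result = lift_gap_by_bucket(pages)
print(result)
```

Subtracting the decoy lift from the treatment lift on each page cancels out the general benefit of adding structural content, so what remains is the effect of the analyzer's specific picks.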
The study page links to the full methodology, the per-page data, and a NotebookLM demo. One page that did not fit the pattern is also reported in full.