Changelog · 29 June 2026

We audited our own tool, and it was over-flagging gaps

ContentGrapher exists to tell you which concepts your page is missing. So we held it to the same standard the product holds your content to, and measured how often it was right. The answer was uncomfortable: most of the time it told you to add a concept, that concept was already on the page. Here is what we found, how we measured it, and what we changed.

The finding: across 793 live analyses, 87.6% of our “add this” recommendations pointed at a concept the page already covered. A safeguard shipped on 29 June took that figure to zero, and a second change corrected the coverage scores those false gaps had been quietly dragging down.

What was going wrong

When ContentGrapher reads a page, it lays out the concepts a strong explanation should contain, then checks which ones are present. The recommendations that fall out of that check are the heart of the product. The problem was that the “present or missing” decision was unreliable in one direction. It was marking concepts as missing when they were plainly on the page, and then telling you to add them.

This was not a handful of edge cases. We sorted every “add this” recommendation in production into three buckets, by whether the page could be shown to already cover the concept.

Where the recommendation pointed	Share
A concept already on the page (a quote from the page proved it)	87.6%
Of those, ones clearly and thoroughly covered	~69%
Genuinely missing from the page	8.5%
Ambiguous (mentioned, but no clean supporting quote)	~4%

The already-covered, genuinely-missing, and ambiguous rows are the three exclusive buckets and sum to 100%. The clearly-covered row is a subset of the first: the most indefensible recommendations of all. On pages that received any “add” recommendation, 93.5% had at least one false one.

“Add this” recommendations pointing at content already on the page

Before: 87.6% of these recommendations pointed at a concept the page already covered.

After the fix: 0.0%. The tool no longer recommends adding what it can prove is already there.

Measured across 793 live English analyses (2,339 recommendations), with a 95% confidence interval of 86.2% to 88.8%. Recommendations recompute every time a report is opened, so the fix reached every existing analysis at once.

Why it happened

To decide whether a concept was on the page, the tool searched the page for the concept’s name. But that name is the tool’s own shorthand. It might call something “pricing tier” while your page says “free plan,” “$19 a month,” or “what you’ll pay.” When the exact phrase did not appear, the tool concluded the concept was absent, even when the page covered it thoroughly.
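The failure mode is easy to reproduce in a few lines. The sketch below is an illustration only, not ContentGrapher's code: `name_match_present` is a hypothetical helper that treats a concept as present only when its exact name appears in the page text, and the sample page is made up from the phrases quoted above.

```python
def name_match_present(concept_name: str, page_text: str) -> bool:
    """Naive check (hypothetical): 'present' only if the exact name appears."""
    return concept_name.lower() in page_text.lower()


# A made-up page that clearly covers pricing without ever saying "pricing tier".
page = (
    "Start on the free plan, or upgrade for $19 a month. "
    "Here is what you'll pay as your team grows."
)

# The tool's own shorthand never appears, so the concept looks absent.
print(name_match_present("pricing tier", page))
# The page's own wording is found without trouble.
print(name_match_present("free plan", page))
```

The first check fails and the second succeeds, which is exactly the one-directional error described here: a covered concept is reported as a gap because the page uses different words.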

That single wrong call did real damage. It pulled down the page’s coverage score, and it generated an “add this” note for content that already existed. A reader following the advice would write something their page already said.

How we measured it

We did not want to fix this on a hunch, so we measured it first, and kept measuring. The discipline was the same one the product applies to your content: a number with a margin of error, checked against the evidence, not an assertion.

A. A committed measurement. A read-only instrument reads our production analyses and counts how often an "add this" recommendation points at something the page already covers, reporting the rate with a 95% confidence interval.
B. Fresh samples, not stale numbers. After each change we re-ran the live tool over a fresh sample of about 150 pages and measured again, rather than trusting the figures from before the change.
C. A gate per change. Each change had to clear a numbered target before we let ourselves start the next one. A change that missed its target did not ship.
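For readers who want to run the same kind of measurement, here is a minimal sketch of a rate with a 95% confidence interval. This post does not say which interval method the instrument uses; the Wilson score interval below is one common choice and is an assumption here, as are the function name and the example counts, which are made up and are not results from the audit.

```python
import math
from typing import Tuple


def wilson_interval(hits: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= hits <= total:
        raise ValueError("hits must be between 0 and total")
    p = hits / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width, center + half_width


# Hypothetical counts for illustration only: 30 false "add this"
# recommendations out of 150 checked.
low, high = wilson_interval(30, 150)
print(f"rate = {30 / 150:.1%}, 95% CI = {low:.1%} to {high:.1%}")
```

Reporting the interval alongside the rate is what makes a per-change gate meaningful: a change clears its target only if the measured range, not just the point estimate, supports it.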

What we changed

Two changes shipped on 29 June, in order, each measured before the next began.

1. A safeguard on the recommendations. The tool now refuses to tell you to add a concept when it can point to the exact words on your page that already cover it. Because recommendations are recalculated every time a report is opened, this corrected every existing analysis the moment it shipped. The rate of "add this" recommendations for already-covered concepts went from 87.6% to zero.
2. A correction to the score. When the name-matching step had wrongly written a concept off, the tool now keeps the model’s real reading of how well that concept is covered. This roughly halved how often a present concept was treated as a gap, from about 48% to about 23%, and lifted the average coverage score by about 0.12.
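The two changes can be sketched together. Everything in this block is an assumption made for illustration: the `Concept` fields, the 0.0 to 1.0 coverage scale, the 0.5 gap threshold, and the use of a supporting quote as the signal that a concept was wrongly written off. It shows the shape of the logic, not the shipped implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Concept:
    name: str
    name_matched: bool                      # did the exact-name search find it?
    model_coverage: float                   # model's reading, 0.0 to 1.0 (assumed scale)
    supporting_quote: Optional[str] = None  # exact words from the page, if any


def coverage_before(c: Concept) -> float:
    """Old behaviour (as described): a failed name match zeroes the concept."""
    return c.model_coverage if c.name_matched else 0.0


def coverage_after(c: Concept) -> float:
    """Score correction: keep the model's reading when a quote backs it up."""
    if c.name_matched or c.supporting_quote:
        return c.model_coverage
    return 0.0


def add_recommendations(concepts: List[Concept], gap_threshold: float = 0.5) -> List[str]:
    """Safeguard: never recommend adding a concept that has a supporting quote."""
    recs = []
    for c in concepts:
        if c.supporting_quote:
            continue  # the page provably covers it, so stay silent
        if coverage_after(c) < gap_threshold:
            recs.append(c.name)
    return recs


concepts = [
    Concept("pricing tier", False, 0.8, "upgrade for $19 a month"),
    Concept("refund policy", False, 0.0),
]
print(add_recommendations(concepts))
print(coverage_before(concepts[0]), coverage_after(concepts[0]))
```

In this sketch the quoted concept is dropped from the recommendations and recovers its score, while the concept with no evidence on the page is still reported as a gap.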

Concepts treated as gaps that the page provably already covered

Before: about 48%

After the fix: about 23%

Concept level, measured on a freshly run 150-page sample. Roughly halved.

About the scores going up

Correcting a systematic under-count means scores move up, and we want to be plain about why. The increase is the tool measuring more honestly, not your content changing. On our sample, the share of pages reaching the top coverage band rose from about 16% to about 45%, and the average score rose by about 0.12. Of the concepts that were upgraded, 99.8% carry a verified quote from the page, so the lift is backed by evidence on the page, not granted.

Past reports keep the scores they were given. The corrected measurement applies when you re-analyze a page, and the app shows a “scoring improved” note when a page’s history spans this update, so the jump reads as what it is: a better measurement of the same content.

Share of pages reaching the top coverage band

Before: about 16%

After re-analysis: about 45%

On the same 150-page sample, the average coverage score rose by about 0.12. Of the concepts that improved, 99.8% carry a verified quote from the page: a correction, not inflation.

What’s next

There is a deeper version of this still on the bench. Rather than matching a concept’s formal name, the tool would count the natural-language ways a page actually refers to it: synonyms, examples, and paraphrases. The two changes above already removed the visible problem and most of the score error, so this one is accuracy polish. It is specified, but not yet shipped.
