Product
ContentGrapher
ContentGrapher is a structural completeness analysis tool for explanatory content. It maps the concept graph present in a page. It builds an explanation framework representing what the topic and audience require. It measures the coverage gap between the two.
Structural completeness is distinct from prose quality, keyword density, and competitive coverage. A page is structurally complete when its concept graph contains the entities and relationships a complete explanation requires, each developed at appropriate depth and connected explicitly to the others.
ContentGrapher models each required concept as an independently matchable explanatory unit. A page missing three of eight required concepts has three units with no content to match against, regardless of overall prose quality. ContentGrapher expresses that gap as a coverage score: each concept is rated on an ordinal 0–3 scale, and the ratings are aggregated into one of four named bands (Incomplete, Partial, Near-complete, Complete). In the Findability Study, ContentGrapher identified concepts that warranted a dedicated page. When those pages were built and tested against 166 real search questions across 30 topics, AI retrieval rose from 4% to 84%; the 4% baseline is how often the same questions returned answers without the recommended page.
Origin
ContentGrapher was created by Daniel Cheung. The foundational concept originated in ContentGraph, an open-source prototype. ContentGraph proved the core analysis approach and is available for forking and adaptation.
ContentGrapher is the production evolution of that prototype. It adds three capabilities ContentGraph did not ship: audience specification, the boundary layer, and the re-analysis loop with delta view. Analysis is operator-funded; no API key is required from the user.
Key concepts
Structural completeness
The degree to which a page fully explains what it claims to explain. Distinct from prose quality, keyword density, and competitive coverage. A page can rank well and be structurally incomplete; a page can be well-written and fail to cover the concepts its topic requires.
Observed map
The map of entities and relationships present in a piece of content. Relationships are extracted as subject-verb-object triples, classified as explicit (directly stated) or implied (inferable from context). The observed map assesses content against eight explanatory dimensions: whatIsIt, howDoesItWork, whatDoesItDependOn, whatDoesItAffectOrProduce, whoInteractsWithIt, whatConstraintsMatter, whatAlternativesOrDistinctionsMatter, whatExampleGroundsIt.
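As a rough illustration of the data described above, the sketch below models a relationship as a subject-verb-object triple tagged explicit or implied. The dimension names come from the text; the `Relationship` and `EdgeKind` types, the `implied_only` helper, and the sample triples are hypothetical and do not reflect ContentGrapher's actual schema.

```python
from dataclasses import dataclass
from enum import Enum

# The eight explanatory dimensions, named exactly as in the text above.
DIMENSIONS = (
    "whatIsIt",
    "howDoesItWork",
    "whatDoesItDependOn",
    "whatDoesItAffectOrProduce",
    "whoInteractsWithIt",
    "whatConstraintsMatter",
    "whatAlternativesOrDistinctionsMatter",
    "whatExampleGroundsIt",
)


class EdgeKind(Enum):
    EXPLICIT = "explicit"  # directly stated in the content
    IMPLIED = "implied"    # inferable from context


@dataclass(frozen=True)
class Relationship:
    """Hypothetical record for one subject-verb-object triple."""
    subject: str
    verb: str
    obj: str
    kind: EdgeKind


def implied_only(relationships):
    """Return the triples that are inferable from context but never stated."""
    return [r for r in relationships if r.kind is EdgeKind.IMPLIED]


# Made-up triples for illustration only.
edges = [
    Relationship("coverage score", "measures", "coverage gap", EdgeKind.EXPLICIT),
    Relationship("snapshot", "belongs to", "series", EdgeKind.IMPLIED),
]

for r in implied_only(edges):
    print(f"{r.subject} -[{r.verb}]-> {r.obj} ({r.kind.value})")
```

Keeping the explicit/implied flag on the edge itself makes it straightforward to list the relationships a writer may need to state outright.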
Integration state
A per-concept classification indicating how effectively a concept is developed and connected within the content. Four states: well_integrated, weakly_integrated, underexplained, naming_inconsistent.
Each concept also carries a short phrase describing its functional purpose in the explanation. The phrase is assigned to concepts in both the observed map and the explanation framework. Examples: "a prerequisite the reader needs first", "the mechanism that enables the anchor", "a constraint on when it applies", "a grounding example".
Explanation framework
The set of concepts a complete explanation of a topic requires for a specific audience. Derived from topic inference and demand signals. Each concept carries a priority (essential, important, or useful) and a presence field (yes, partial, or no).
Coverage score
A per-concept score on an ordinal 0–3 scale measuring the degree to which the observed concept graph matches the explanation framework, based on integration state and depth assessment. Aggregated into a page-level score falling into one of four named bands: Incomplete, Partial, Near-complete, or Complete. The Complete band threshold is 0.70. Coverage score is tracked across snapshots within a series, enabling a trend view across re-analyses.
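The text does not say how per-concept scores are aggregated, so the following is only a sketch under a stated assumption: the page-level score is the mean per-concept score normalized to the 0 to 1 range. The 0.70 Complete threshold is from the text; the function names are hypothetical, and thresholds for the other three bands are not given, so they are not modeled.

```python
COMPLETE_THRESHOLD = 0.70  # stated in the text above
MAX_CONCEPT_SCORE = 3      # top of the ordinal 0-3 scale


def page_score(concept_scores):
    """ASSUMPTION: page score = sum of per-concept scores / maximum possible sum."""
    if not concept_scores:
        raise ValueError("at least one concept score is required")
    for name, score in concept_scores.items():
        if score not in (0, 1, 2, 3):
            raise ValueError(f"{name!r}: score must be an integer from 0 to 3")
    return sum(concept_scores.values()) / (MAX_CONCEPT_SCORE * len(concept_scores))


def is_complete(concept_scores):
    """True when the assumed page-level score reaches the Complete threshold."""
    return page_score(concept_scores) >= COMPLETE_THRESHOLD


# Made-up per-concept scores for illustration only.
scores = {"anchor": 3, "mechanism": 3, "constraint": 2, "example": 1}
print(page_score(scores), is_complete(scores))
```

If the real aggregation weights concepts by priority (essential, important, useful), the normalization step would change but the threshold check would stay the same.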
Boundary layer
A scope diagnosis applied during Phase 1. Classifies each concept by whether it belongs on this page: core (essential to the page's Primary Retrieval Role; must be fully covered here), supportive (relevant but subordinate; brief mention sufficient), adjacent (belongs on a separate, more focused page), excluded (out of scope; its presence weakens the retrieval signal). Also identifies the page's Primary Retrieval Role (PRR): explain (defines or explains what something is), guide (procedural how-to execution), compare (alternatives and trade-offs), evaluate (whether X is right for a given situation), convert (transactional, drives a next step). How consistently AI models make this judgment is measured in the Agreement Study, where eight models read the same 49 pages twice. Boundary calls clustered into three stable agreement bands that cut across the open-source and closed-model divide.
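The scope classes and retrieval roles above form two small closed vocabularies, which the sketch below encodes as enums. The enum values come from the text; the type names, the `group_by_scope` helper, and the sample concepts are hypothetical.

```python
from enum import Enum


class Scope(Enum):
    CORE = "core"              # must be fully covered on this page
    SUPPORTIVE = "supportive"  # brief mention sufficient
    ADJACENT = "adjacent"      # belongs on a separate, more focused page
    EXCLUDED = "excluded"      # out of scope for this page


class PrimaryRetrievalRole(Enum):
    EXPLAIN = "explain"
    GUIDE = "guide"
    COMPARE = "compare"
    EVALUATE = "evaluate"
    CONVERT = "convert"


def group_by_scope(classified):
    """Group concept names by scope; every scope appears, even when empty."""
    groups = {scope: [] for scope in Scope}
    for concept, scope in classified.items():
        groups[scope].append(concept)
    return {scope: sorted(names) for scope, names in groups.items()}


# Made-up classification for illustration only.
page = {
    "coverage score": Scope.CORE,
    "integration state": Scope.CORE,
    "pricing": Scope.ADJACENT,
}
groups = group_by_scope(page)
print(groups[Scope.CORE], groups[Scope.ADJACENT])
```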
Writing guidance
The four output types from Phase 2b: toAdd (concepts the explanation framework requires that are entirely absent from the content), toClarify (concepts present but classified as weakly_integrated or underexplained), toMakeExplicit (relationships present as implied edges that the framework requires to be explicit), sentenceGuidance (multi-concept sequences with worked examples showing how to connect the concepts in prose).
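Three of the four outputs can be read as set operations over the observed map and the explanation framework, which the sketch below illustrates. The output keys and integration states are from the text; the function name, its parameters, and the sample data are hypothetical. sentenceGuidance is left out because it requires generated prose rather than set logic.

```python
NEEDS_CLARIFYING = {"weakly_integrated", "underexplained"}


def writing_guidance(required_concepts, observed_states, implied_edges, required_edges):
    """
    required_concepts: concept names in the explanation framework
    observed_states:   concept name -> integration state, for concepts found in the content
    implied_edges:     (subject, verb, object) triples observed only as implied
    required_edges:    triples the framework requires to be explicit
    """
    return {
        # Required by the framework, entirely absent from the content.
        "toAdd": sorted(c for c in required_concepts if c not in observed_states),
        # Present, but weakly integrated or underexplained.
        "toClarify": sorted(c for c, s in observed_states.items() if s in NEEDS_CLARIFYING),
        # Implied in the content, required to be explicit.
        "toMakeExplicit": sorted(set(implied_edges) & set(required_edges)),
    }


# Made-up inputs for illustration only.
guidance = writing_guidance(
    required_concepts={"anchor", "mechanism", "constraint"},
    observed_states={"anchor": "well_integrated", "mechanism": "underexplained"},
    implied_edges=[("mechanism", "enables", "anchor")],
    required_edges=[("mechanism", "enables", "anchor"), ("constraint", "limits", "anchor")],
)
print(guidance)
```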
Content architecture
A structured page blueprint produced by Phase 2c. Specifies the recommended concept order, which concepts to treat as core versus supportive, and which adjacent concepts belong on separate pages.
Re-analysis delta view
A before-and-after comparison of two snapshots within a series. Shows which coverage gaps closed and which remain after content edits. Coverage score and concept presence are both tracked.
Series
The set of analyses of the same URL under one account, grouped as a named unit. Snapshots are created within a series, not as standalone analyses. The re-analysis delta view compares two snapshots within a series.
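A before-and-after comparison of two snapshots can be sketched as a diff over the presence field. This is a minimal illustration, assuming a gap is any framework concept whose presence is not "yes"; the `Snapshot` type, the `delta_view` function, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Hypothetical record of one analysis within a series."""
    label: str
    coverage_score: float
    presence: dict  # concept name -> "yes", "partial", or "no"


def delta_view(before, after):
    """Compare two snapshots of the same URL and report closed and remaining gaps."""
    # ASSUMPTION: a gap is any concept whose presence was not "yes" in the earlier snapshot.
    open_before = [c for c, p in before.presence.items() if p != "yes"]
    return {
        "closed": sorted(c for c in open_before if after.presence.get(c) == "yes"),
        "remaining": sorted(c for c in open_before if after.presence.get(c) != "yes"),
        "scoreChange": after.coverage_score - before.coverage_score,
    }


# Made-up snapshots for illustration only.
first = Snapshot("draft", 0.5, {"anchor": "yes", "mechanism": "partial", "constraint": "no"})
second = Snapshot("revised", 0.75, {"anchor": "yes", "mechanism": "yes", "constraint": "partial"})
print(delta_view(first, second))
```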
Concept graph output
Analysis produces two distinct graph outputs. The observed map contains concepts and relationships extracted from the content. The explanation framework contains concepts the topic and audience require. Coverage is the intersection of the two.
Indigo nodes: well integrated in the content. Amber: weakly integrated. Red: underexplained. Grey: observed only, present but not structurally required. Orange dashed: gaps, required by the explanation framework but absent from content.
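The legend above amounts to a lookup from a node's integration state and framework membership to a display style. The sketch below is one hedged reading of it: it assumes grey takes precedence for concepts the framework does not require, and it returns `None` for naming_inconsistent because the legend assigns that state no color. The function name and style strings are hypothetical.

```python
STATE_COLORS = {
    "well_integrated": "indigo",
    "weakly_integrated": "amber",
    "underexplained": "red",
}


def node_color(state, required):
    """
    state:    integration state, or None when the concept is absent from the content
    required: True when the explanation framework requires the concept
    """
    if state is None:
        if required:
            return "orange-dashed"  # gap: required but absent from the content
        raise ValueError("a concept that is neither observed nor required is not a node")
    if not required:
        return "grey"  # observed only; ASSUMPTION: grey overrides the state color
    # The legend lists no color for naming_inconsistent, so this yields None for it.
    return STATE_COLORS.get(state)


print(node_color("well_integrated", True), node_color(None, True))
```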
ContentGraph and ContentGrapher
ContentGraph and ContentGrapher run the same core analysis pipeline. ContentGrapher adds three capabilities ContentGraph did not ship.
Pipeline
Phase 1: Extract & Observe
Scrapes the URL, extracts main content, maps every concept present, assigns integration states, identifies relationships, and applies the boundary layer.
Phase 2a: Explanation Framework
Builds the ideal concept set for the topic and audience. Assigns priority (essential, important, useful) and presence (yes, partial, no) per concept.
Phase 2b: Writing Guidance
Generates toAdd, toClarify, toMakeExplicit, and sentenceGuidance output from the gap between Phase 1 and Phase 2a.
Phase 2c: Content Architecture
Produces a page blueprint specifying concept order, scope assignments (core vs. supportive), and adjacent-concept routing.
Input: any publicly accessible URL. Analysis runs on frontier AI models.
Audience specification
ContentGraph infers the audience from content. ContentGrapher accepts an explicit audience definition before analysis runs. The explanation framework is then built for that specific audience: what they already know, what they are deciding, and what a complete explanation requires for them. The same page analyzed for a technical evaluator versus a first-time buyer produces different frameworks.
Boundary layer
ContentGrapher classifies each concept relative to the page's scope (core, supportive, adjacent, excluded), identifies the Primary Retrieval Role, and diagnoses whether the page mixes roles in ways that dilute its retrieval signal.
Re-analysis loop
ContentGraph is ephemeral: the session ends and history is lost. ContentGrapher persists named snapshots within a series (a set of analyses of the same URL) and produces a delta view after each re-analysis, showing which gaps closed and which remain. Coverage score is tracked over time.
Feature comparison
| Feature | ContentGraph | ContentGrapher |
|---|---|---|
| Concept graph (observed) | ✓ | ✓ |
| Integration states | ✓ | ✓ |
| Explanation framework | ✓ | ✓ |
| Writing guidance | ✓ | ✓ |
| Content architecture | ✓ | ✓ |
| Audience specification | ✗ | ✓ |
| Boundary layer (scope diagnosis) | ✗ | ✓ |
| Re-analysis + delta view | ✗ | ✓ |
| Analysis history | ✗ | ✓ |
| Shareable links | โ | โ |
| Markdown export | โ | โ |
| JSON export | โ | โ |
| Price | Free, open-source | 5 free; packs from $29 |
Limitations
The framework models retrieval behavior, not retrieval outcomes. The explanation framework represents what a complete explanation ideally requires. A structurally complete page is not guaranteed to be cited. The Decoy Study is our first controlled measurement of how filling flagged gaps changes AI retrieval. Across 40 third-party pages, filling gaps selected by ContentGrapher's recommendations outperformed random structural addition by 11 percentage points on pages with five or more flagged gaps.
Anchor misidentification propagates through both outputs. If the primary subject is misidentified, the observed map and explanation framework are built around the wrong topic. ContentGrapher surfaces the detected anchor and allows correction before analysis continues.
The framework reflects model training knowledge. The explanation framework is built using the model's topic knowledge. For fast-moving domains, the framework may be incomplete or out of date.
Relationship extraction favors explicit prose. Relationships implied in tables, lists, code blocks, or visual layouts may not be extracted or may be classified as implied rather than explicit. Content that encodes structure in non-prose formats may produce a less complete observed map than its actual coverage warrants.
Analysis targets explanatory content. ContentGrapher is designed for content whose purpose is to explain: definitions, guides, how-tos, comparisons, evaluations. It applies less directly to narrative, opinion, or transactional content.
Out of scope
ContentGrapher does not benchmark content against competing pages. Demand signals inform the explanation framework as inputs to topic coverage requirements, not as competitive coverage targets. Inclusion decisions are determined by what the explanation requires for the topic and audience.
ContentGrapher does not generate content, monitor AI-generated responses, predict rankings, or measure prose readability. Its output is structural: a concept map, a coverage gap, writing guidance specific to identified gaps, and a page architecture blueprint.