ContentGrapher
The findability study, June 2026

The full data

Every table behind the study, including all 30 per-domain results, the classifier stability runs, and the complete results of the unpublished hardening run the study is built on.

Headline: all 30 pages, six conditions

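| Condition | Routing @0.6 | Findability @0.6 (cosine) | Findability (judge) |
| --- | --- | --- | --- |
| Treatment | 83.1% | 84.2% | 68.3% |
| Treatment-narrow | 83.1% | 84.2% | 67.5% |
| Decoy | 0.0% | 3.9% | 16.9% |
| Addition-only | 83.1% | 84.2% | 68.3% |
| Random | 0.0% | 6.7% | 17.5% |
| Source-only (before) | 0.0% | 6.4% | 18.1% |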

Judge findability is the 3-temperature majority vote. The judge credits decoy hubs when residual source-page content partially answers a query, which is why its gap is narrower than the cosine gap. Both are reported; the claim uses cosine with the judge as corroboration.
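The majority-vote rule can be sketched as follows. This is a minimal illustration, assuming each judge call returns a boolean "found" verdict and that the three calls differ only in sampling temperature; the function name and the example verdicts are hypothetical and not taken from the study's code.

```python
from collections import Counter

def majority_vote(verdicts: list[bool]) -> bool:
    """Return the verdict given by more than half of the judge calls."""
    if len(verdicts) % 2 == 0:
        # An even (or zero) number of calls could tie, so reject it up front.
        raise ValueError("use an odd number of calls so a tie cannot occur")
    counts = Counter(verdicts)
    return counts[True] > counts[False]

# Hypothetical verdicts for one query at three temperatures.
print(majority_vote([True, False, True]))  # True
```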

Bootstrapped confidence intervals

10,000 resamples over per-source means. The pre-registered floor required the lower bound of the confidence interval to sit above 5pp.
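For readers who want to reproduce this kind of interval, here is a minimal sketch of a percentile bootstrap over per-source means. It assumes the percentile method and one value per source; the study's exact resampling code is not shown here, and the function name, seed, and example deltas are hypothetical.

```python
import random
from statistics import mean

def bootstrap_ci(per_source, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-source values."""
    rng = random.Random(seed)
    n = len(per_source)
    # Resample sources with replacement and record each resample's mean.
    means = sorted(
        mean(rng.choices(per_source, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(per_source), lower, upper

# Hypothetical per-source deltas in percentage points (not the study's data).
deltas = [100.0, 66.7, 83.3, 50.0, 100.0, 75.0, 100.0, 33.3]
point, lower, upper = bootstrap_ci(deltas)
print(f"mean {point:.1f}pp, 95% CI [{lower:.1f}, {upper:.1f}]")
print("clears 5pp floor:", lower > 5.0)
```

Resampling whole sources, rather than individual queries, keeps queries from the same page together, which is what "over per-source means" implies.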

| Delta | Mean | 95% CI |
| --- | --- | --- |
| Routing delta (treatment minus decoy) | 83.1pp | [75.8, 89.7] |
| Findability delta (treatment minus decoy, cosine) | 80.3pp | [72.2, 88.1] |
| Treatment routing, absolute | 83.1% | [75.6, 90.0] |
| Treatment findability, absolute | 84.2% | [76.9, 90.8] |

By retrieval role

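| Role | n | Routing @0.6 | Findability @0.6 | Decoy findability |
| --- | --- | --- | --- | --- |
| explain | 8 | 79.2% | 79.2% | 4.2% |
| convert | 6 | 83.3% | 83.3% | 2.8% |
| guide | 6 | 91.7% | 97.2% | 11.1% |
| compare | 5 | 85.0% | 85.0% | 0.0% |
| evaluate | 5 | 76.7% | 76.7% | 0.0% |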

The two roles admitted under relaxed corpus gates (compare, evaluate) sit inside the range of the others. The weakest role, evaluate at 76.7%, still beats its decoy by 73pp.
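The per-role rows reconcile with the headline figures when each role is weighted by its source count. The short check below uses only the numbers in the role table; the variable and function names are illustrative.

```python
# (role, n sources, routing %, findability %, decoy findability %)
ROLES = [
    ("explain", 8, 79.2, 79.2, 4.2),
    ("convert", 6, 83.3, 83.3, 2.8),
    ("guide", 6, 91.7, 97.2, 11.1),
    ("compare", 5, 85.0, 85.0, 0.0),
    ("evaluate", 5, 76.7, 76.7, 0.0),
]

def weighted_mean(column: int) -> float:
    """Average one metric column, weighting each role by its source count."""
    total_sources = sum(row[1] for row in ROLES)
    return sum(row[1] * row[column] for row in ROLES) / total_sources

print(round(weighted_mean(2), 1))  # 83.1 (headline treatment routing)
print(round(weighted_mean(3), 1))  # 84.2 (headline treatment findability)
print(round(weighted_mean(4), 1))  # 3.9  (headline decoy findability)
```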

All 30 domains

Per-source treatment results at the strictest threshold. Adjacent-share is the density of belongs-elsewhere material on the source page. Note that it does not predict accuracy: the two 3% sources both hit 100% while the 39% and 41% sources sit at 67%.

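| Domain | Role | Words | Adj-share | Routing | Findability |
| --- | --- | --- | --- | --- | --- |
| research.ibm.com | explain | 1,790 | 22% | 100% | 100% |
| yoast.com | convert | 1,729 | 16% | 100% | 100% |
| uniqode.com | convert | 1,781 | 16% | 100% | 100% |
| intrepidtravel.com | explain | 3,205 | 15% | 100% | 100% |
| veterinary.rossu.edu | guide | 1,851 | 12% | 67% | 100% |
| business.adobe.com (buyer) | explain | 2,669 | 12% | 67% | 67% |
| vuejs.org | explain | 1,500 | 10% | 100% | 100% |
| docs.anthropic.com (buyer) | compare | 1,246 | 39% | 67% | 67% |
| wpic.co | evaluate | 1,343 | 14% | 33% | 33% |
| blog.hootsuite.com (buyer) | guide | 3,485 | 6% | 83% | 83% |
| techpp.com | compare | 2,319 | 9% | 83% | 83% |
| remote100k.com | evaluate | 1,506 | 7% | 100% | 100% |
| aws.amazon.com (buyer) | convert | 1,221 | 26% | 83% | 83% |
| semrush.com (buyer) | guide | 3,123 | 5% | 100% | 100% |
| joist.com | compare | 3,237 | 5% | 75% | 75% |
| workday.com | evaluate | 1,520 | 3% | 100% | 100% |
| gov.uk | guide | 1,272 | 12% | 100% | 100% |
| uplead.com | convert | 1,331 | 10% | 67% | 67% |
| rapidseedbox.com | compare | 5,015 | 3% | 100% | 100% |
| assetpanda.com | evaluate | 2,294 | 5% | 100% | 100% |
| docs.aws.amazon.com (buyer) | explain | 1,472 | 17% | 50% | 50% |
| vanquis.com | convert | 1,811 | 7% | 67% | 67% |
| growthmarketing.studio | guide | 1,787 | 6% | 100% | 100% |
| dataally.ai | compare | 2,041 | 8% | 100% | 100% |
| teramind.co | evaluate | 2,430 | 5% | 50% | 50% |
| zendesk.com (buyer) | explain | 2,935 | 8% | 50% | 50% |
| buffer.com (buyer) | explain | 1,331 | 5% | 100% | 100% |
| capitalworldgroup.com | explain | 1,469 | 41% | 67% | 67% |
| oxfordpartners.com.au | convert | 1,346 | 16% | 83% | 83% |
| nextjs.org | guide | 1,370 | 13% | 100% | 100% |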

“buyer” marks the 8 sources admitted under the buyer-recognition gate. aws.amazon.com and docs.aws.amazon.com share a root domain by accepted exception; they are distinct properties with different roles.

Classifier stability runs

Mean Jaccard between rerun belongs-elsewhere sets: 0.615 (floor: 0.7)
Sources below the floor: 7 of 10
Best source (concept membership): Jaccard 1.0 across all three pairs
Worst source: Jaccard 0.281
Destination-name (slug) stability: 0.368
Top-3 destination set identical across reruns: 2 of 10

10 sources, 3 fresh classifier reruns each, identical inputs. The published claim attaches to the modal belongs-elsewhere list across reruns, per the pre-registered policy described in the methodology.
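The stability figure can be illustrated with a short sketch: the Jaccard index between two sets, averaged over the three pairs that three reruns produce. The function names and concept labels below are hypothetical, and treating two empty sets as identical is an assumption rather than the study's stated convention.

```python
from itertools import combinations
from statistics import mean

def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0  # assumption: two empty sets count as identical
    return len(a & b) / len(a | b)

def mean_pairwise_jaccard(reruns: list[set[str]]) -> float:
    """Average Jaccard over every pair of reruns (3 reruns give 3 pairs)."""
    return mean(jaccard(a, b) for a, b in combinations(reruns, 2))

# Hypothetical belongs-elsewhere sets from three reruns on one source.
reruns = [
    {"pricing", "setup", "faq"},
    {"pricing", "setup"},
    {"pricing", "faq", "roadmap"},
]
score = mean_pairwise_jaccard(reruns)
print(f"{score:.3f}", "below floor" if score < 0.7 else "meets floor")
```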

Embedding portability

Measurement re-run on the 10 cleanest-signal sources, destination prose unchanged, fresh embeddings and indexes per model. Floor: positive routing delta on at least 7 of 10 sources per model.

| Model | Routing delta | Findability delta | Positive sources |
| --- | --- | --- | --- |
| text-embedding-3-large (baseline) | 96.7pp | 97.5pp | 10/10 |
| text-embedding-3-small | 95.0pp | 91.7pp | 10/10 |
| bge-m3 (open source, local) | 98.3pp | 59.2pp | 9/10 |

voyage-3-large was specified and not run (no API key was provisioned). bge-m3's findability delta is compressed by the fixed 0.6 threshold being effectively stricter in its cosine space; its routing delta is the largest of the three.
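To illustrate why a fixed cutoff can behave differently across embedding spaces, the sketch below applies the same 0.6 threshold to two sets of similarity scores. The scores are invented for illustration only: both lists rank the matches in the same order, but the second model's cosines sit lower overall, so fewer of them clear the bar. None of the numbers come from the study.

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norms = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norms

def pass_rate(similarities: list[float], threshold: float = 0.6) -> float:
    """Share of query-to-page similarities at or above the threshold."""
    return sum(s >= threshold for s in similarities) / len(similarities)

# Hypothetical top-match similarities for the same four queries.
model_a = [0.82, 0.74, 0.69, 0.63]
model_b = [0.71, 0.62, 0.58, 0.55]  # same order, compressed range

print(pass_rate(model_a))  # 1.0
print(pass_rate(model_b))  # 0.5
```

Routing depends only on which page ranks first, so it is unaffected by this kind of shift, while any threshold-gated metric is.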

Real-page sidecar (teramind.co)

Four source pages where the recommendation mapped to a page that already exists on the site. Each cell shows purpose-written destination / real existing page.

| Source page | Queries | Routing @0.6 | Findability @0.6 |
| --- | --- | --- | --- |
| /blog/how-to-detect-shadow-ai | 2 | 100% / 50% | 100% / 100% |
| /blog/pros-and-cons-of-employee-monitoring | 4 | 50% / 0% | 50% / 0% |
| /blog/insider-threats | 6 | 67% / 33% | 100% / 83% |
| /blog/ai-usage-control | 2 | 100% / 0% | 100% / 0% |
| Mean | 14 | 79% / 21% | 88% / 46% |

Illustrative only: 4 sources, 14 queries, operator-judged mappings. The Mean row weights the four source pages equally, and its query count is the total across pages. Where the real page genuinely covers the moved concept it reaches 83% to 100% findability; where the mapped concepts were product-specific, marketing-written pages lost to purpose-written explainers.

The unpublished hardening run, in full

The study's direct predecessor ran the same six-condition design on 10 sources and was held back from publication for the reasons documented in the methodology. Its complete headline table is published here for the record, because the published study's design claims only make sense against it.

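| Condition (n=10) | Routing @0.6 | Findability @0.6 (cosine) | Findability (judge) |
| --- | --- | --- | --- |
| Before (source only) | n/a | 10% | 37% |
| Treatment | 92% | 95% | 75% |
| Treatment-narrow | 92% | 95% | 75% |
| Decoy | 0% | 12% | 33% |
| Addition-only | 92% | 95% | 77% |
| Random | 0% | 13% | 35% |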

Supporting results from that run: sign test 58 of 58 non-tied wins (p = 3.47e-18); chunk-size sweep at 256, 512, and 1024 tokens identical on the tested source; judge internal consistency 94% 3-of-3; cross-model judge calibration 93% agreement on 30 borderline cases.
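The reported sign-test p-values are consistent with a one-sided exact binomial test under a null win probability of 0.5, with ties excluded. The sketch below is a reconstruction under that assumption, not the study's own code; the function name is hypothetical.

```python
from math import comb

def sign_test_one_sided(wins: int, losses: int) -> float:
    """P(at least `wins` successes in wins+losses fair coin flips)."""
    n = wins + losses
    # Count outcomes at least as extreme as the observed number of wins.
    tail = sum(comb(n, k) for k in range(wins, n + 1))
    return tail / 2 ** n

print(f"{sign_test_one_sided(58, 0):.2e}")   # 3.47e-18
print(f"{sign_test_one_sided(164, 0):.1e}")  # 4.3e-50
```

With no losses the tail collapses to a single outcome, so the p-value is simply 0.5 raised to the number of non-tied comparisons. That matches both the 58/58 figure here and the 164/164 figure reported for the published study.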

Hardening run vs the published study

| Measure | Hardening run (unpublished) | The findability study |
| --- | --- | --- |
| Sources | 10 | 30, role-balanced |
| Routing accuracy @0.6 (treatment / decoy) | 92% / 0% | 83.1% / 0.0% |
| Findability @0.6, cosine (treatment / decoy) | 95% / 12% | 84.2% / 3.9% |
| Findability, judge (treatment / decoy) | 75% / 33% | 68.3% / 16.9% |
| Findability delta | +83pp (no CI at n=10) | +80.3pp, 95% CI [72.2, 88.1] |
| Sign test | 58/58 wins, p = 3.47e-18 | 164/164 wins, p = 4.3e-50 |
| Addition-only vs treatment | identical (removal decorative) | identical (reproduced) |
| Random arm vs decoy | indistinguishable | indistinguishable (reproduced) |
| Embedding models | 1 | 3 |
| Classifier stability | not measured | Jaccard 0.615, 7/10 below floor |

Reading the drop from 95% to 84%. The published study's absolute numbers are lower than the hardening run's because the corpus tripled and was forced to include the page types the original gates excluded. That is the point of the exercise: the hardening run's 95% was measured on a corpus selected for exactly the anatomy the signal works best on. The 84% with a confidence interval is the number we are willing to publish; every structural pattern (removal decorative, random arm at baseline, treatment-narrow indistinguishable) reproduced exactly at triple the sample.
