The language study: Data

The data

Per language, averaged over 12 topics

For each language: the reported completeness score, the same score with its name-check neutralised, the independent reviewer accuracy, and the share of concepts the analysis named in English.

Language	Reported	Gap vs EN	Model saw	Score counted	Accuracy	EN-named
English	0.60	—	58%	47%	2.03	17%
French	0.41	−0.19	65%	18%	2.14	69%
German	0.45	−0.15	62%	29%	2.11	42%
Japanese	0.37	−0.23	68%	16%	2.92	74%
Korean	0.37	−0.23	68%	11%	2.81	77%

“Model saw” is the share of concepts the analysis judged well-connected; “Score counted” is the share that survived the name-check. The gap between them is the broken step. Round-trip control (English → Japanese → English): score gap +0.01 [−0.03, 0.05]. n = 12 topics, three runs each.
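
The size of that gap can be read straight off the table by subtracting "Score counted" from "Model saw" for each language. The sketch below does only that arithmetic; the names `SHARES`, `dropped_points`, and `LOSS` are illustrative and are not part of the study's tooling.

```python
# Percentages copied from the per-language table: (model_saw, score_counted).
SHARES = {
    "English": (58, 47),
    "French": (65, 18),
    "German": (62, 29),
    "Japanese": (68, 16),
    "Korean": (68, 11),
}


def dropped_points(model_saw: int, score_counted: int) -> int:
    """Percentage points lost between the analysis and the name-check."""
    return model_saw - score_counted


# Hypothetical helper table: points lost per language.
LOSS = {lang: dropped_points(*pair) for lang, pair in SHARES.items()}

for lang, loss in LOSS.items():
    print(f"{lang:<9} {loss:>3} points lost at the name-check")
```

On these figures English loses 11 points at the name-check, while the four other languages lose between 33 and 57.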

Per topic, reported score by language

Each row is one page, scored in all five languages on identical content. The non-English score is lower in nearly every cell. The few exceptions (CBT in French and Korean, Load balancing in Japanese, plus a tie for Bayes' theorem in German) show that the effect is a strong tendency, not a fixed penalty.

Topic	Tier	EN	FR	DE	JA	KO
Compound interest	general	0.64	0.38	0.59	0.40	0.40
Photosynthesis	general	0.42	0.23	0.40	0.27	0.32
Cognitive behavioral therapy	general	0.63	0.66	0.61	0.54	0.70
Cancer	general	0.69	0.48	0.57	0.24	0.31
DNS	professional	0.64	0.35	0.39	0.36	0.43
Load balancing	professional	0.65	0.31	0.44	0.69	0.28
Kubernetes	professional	0.68	0.43	0.53	0.31	0.44
Scrum	professional	0.61	0.51	0.52	0.32	0.32
DNA	expert	0.68	0.56	0.08	0.27	0.41
Bayes' theorem	expert	0.47	0.32	0.47	0.32	0.35
Complementary medicine	expert	0.47	0.30	0.45	0.46	0.19
Cellular respiration	expert	0.63	0.40	0.34	0.32	0.34

The DNA row in German (0.08) is an extreme case of the same mechanism plus a one-off analysis hiccup on that run; the pattern is consistent across the rest. Tiers are general, professional, and expert.
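
As a cross-check, the per-language averages and the list of exception cells can be recomputed from the per-topic table. This is a minimal sketch that assumes the per-language figures are plain unweighted means over the 12 topics; the source does not state the aggregation, and the names `SCORES`, `column_means`, and `exceptions` are illustrative.

```python
LANGS = ["EN", "FR", "DE", "JA", "KO"]

# Reported scores copied from the per-topic table, in LANGS order.
SCORES = {
    "Compound interest": [0.64, 0.38, 0.59, 0.40, 0.40],
    "Photosynthesis": [0.42, 0.23, 0.40, 0.27, 0.32],
    "Cognitive behavioral therapy": [0.63, 0.66, 0.61, 0.54, 0.70],
    "Cancer": [0.69, 0.48, 0.57, 0.24, 0.31],
    "DNS": [0.64, 0.35, 0.39, 0.36, 0.43],
    "Load balancing": [0.65, 0.31, 0.44, 0.69, 0.28],
    "Kubernetes": [0.68, 0.43, 0.53, 0.31, 0.44],
    "Scrum": [0.61, 0.51, 0.52, 0.32, 0.32],
    "DNA": [0.68, 0.56, 0.08, 0.27, 0.41],
    "Bayes' theorem": [0.47, 0.32, 0.47, 0.32, 0.35],
    "Complementary medicine": [0.47, 0.30, 0.45, 0.46, 0.19],
    "Cellular respiration": [0.63, 0.40, 0.34, 0.32, 0.34],
}


def column_means(scores):
    """Unweighted mean per language (an assumption about the aggregation)."""
    n = len(scores)
    return {
        lang: sum(row[i] for row in scores.values()) / n
        for i, lang in enumerate(LANGS)
    }


def exceptions(scores):
    """Cells where a non-English score is strictly above the English score."""
    return [
        (topic, lang)
        for topic, row in scores.items()
        for lang, value in zip(LANGS[1:], row[1:])
        if value > row[0]
    ]


MEANS = column_means(SCORES)
GAPS = {lang: MEANS[lang] - MEANS["EN"] for lang in LANGS[1:]}
EXCEPTIONS = exceptions(SCORES)

for lang in LANGS[1:]:
    print(f"{lang}: mean {MEANS[lang]:.3f}, gap vs EN {GAPS[lang]:+.3f}")
print("Cells above English:", EXCEPTIONS)
```

Under that assumption the recomputed means land within rounding of the per-language table, and the strict comparison flags three cells: CBT in French and Korean, and Load balancing in Japanese.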
