Definition
What does this page mean by "what AI systems cite"?
This page is a synthesis of three public assets: the original Russian-language citation audit on vc.ru [1], the English LinkedIn version for an international AEO and GEO audience [2], and a separate Runet market analysis built on 150 million links collected from six AI services [3]. Together they answer a founder-level question: what makes a page retrievable, quotable, and reusable inside ChatGPT, Alice, Perplexity, Google AI Overviews, Claude, and Gemini?
The short answer is practical. AI systems reward pages that solve a broad commercial question on a trusted surface, state the answer early, and package the answer into reusable chunks such as question-led sections, tables, checklists, and source-backed statistics [1, 2]. This is a different optimization target from classic SEO alone. The English source makes that explicit by framing retrieval as chunk selection rather than site-wide favoritism: the model picks dense, relevant tokens, not the prettiest domain homepage [2].
| Signal | What the source showed | Why it matters |
|---|---|---|
| Audit size | 158 publications reviewed, with 78 audited across 28 criteria [1, 2]. | Large enough to compare platform, topic, age, and structure patterns instead of one-off anecdotes. |
| Platform effect | vc.ru reached a 52% citation rate in the sample; same-author copies on a corporate blog and Medium were 0% [1, 2]. | Off-site authority can matter more than generic "good writing" on a weak surface. |
| Age effect | Articles older than two months were cited 43% of the time, versus 15% at one month and 7% for fresh pages [2]. | AI visibility needs indexing and trust accumulation time. |
| Structure effect | Question-form H2s lifted citation by 19%, and comparison tables or checklists added another 18% in the sample [2]. | Reusable answer units are easier for retrieval systems to extract and reuse. |
| Market scale | The partner dataset processed 150 million links from six AI services [3]. | AI traffic is not a toy signal. It is large enough to shape where brands publish and measure visibility. |
Methodology
What did the 158-publication audit actually test?
The original audit came from an internal Humanswith.ai experiment that asked a concrete question: why would one article get cited by ChatGPT, Perplexity, Claude, Google AI Overviews, and other systems while a near-duplicate on another surface remained invisible? [1, 2] The team reviewed 158 publications, audited 78 of them across 28 criteria, and then compared citation behavior across the major systems in the sample [1, 2].
The audit did not treat "citation" as a vague vibe. It compared topic framing, surface authority, publication age, section structure, use of facts, table or checklist presence, and heading format. That matters because it separates two different problems. One is writing clarity. The other is retrieval fitness. The sources argue that AI citation depends more on the second problem than most content teams assume [1, 2].
For a founder or CMO, that distinction matters. You should treat the audit as directional evidence about what makes a business page reusable inside AI answers, not as proof that one platform will always beat another in every market.
Findings
What patterns explained citation better than generic writing quality?
The most important finding in both the Russian and English sources is almost awkwardly simple: writing quality alone did not explain citation outcomes well [1, 2]. In the sample, cited vc.ru articles averaged a quality score of 60.4, while uncited articles averaged 62.5 [1, 2]. The ignored articles were not dramatically worse. In formal terms they were slightly better. Citation still went elsewhere.
The sources reduce that result to a five-level hierarchy: topic first, platform second, age third, high-impact text structure fourth, and minimum structural threshold fifth [1, 2]. Remove a high-level multiplier and the lower-level craft improvements do not rescue the page.
Working formula from the public English adaptation: right topic × authoritative platform × 2+ months of age × question-based headings × minimum structure threshold (FAQ + lists + tables + 1500+ words) = LLM citation [2].
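Because the working formula is multiplicative, one missing factor zeroes out the whole product. A minimal Python sketch of that reading follows. It treats each factor as a pass/fail gate, which is an assumption made for illustration: the sources do not publish a scoring function, the field names are hypothetical, and "2+ months" is approximated as 60 days. It is a readiness checklist, not a predictor of citation.

```python
from dataclasses import dataclass

MIN_AGE_DAYS = 60   # "2+ months", approximated as 60 days (assumption)
MIN_WORDS = 1500    # word threshold named in the working formula


@dataclass
class PageSignals:
    """Hypothetical per-page signals; field names are illustrative only."""
    broad_commercial_topic: bool
    authoritative_platform: bool
    age_days: int
    question_headings: bool
    has_faq: bool
    has_lists: bool
    has_tables: bool
    word_count: int


def factor_checks(page: PageSignals) -> dict[str, bool]:
    """Evaluate each factor of the working formula as a pass/fail gate."""
    return {
        "topic": page.broad_commercial_topic,
        "platform": page.authoritative_platform,
        "age": page.age_days >= MIN_AGE_DAYS,
        "question_headings": page.question_headings,
        "structure": (
            page.has_faq
            and page.has_lists
            and page.has_tables
            and page.word_count >= MIN_WORDS
        ),
    }


def citation_ready(page: PageSignals) -> bool:
    """True only if every factor passes (a product of gates is a logical AND)."""
    return all(factor_checks(page).values())


def missing_factors(page: PageSignals) -> list[str]:
    """Names of the factors that currently fail, in formula order."""
    return [name for name, ok in factor_checks(page).items() if not ok]
```

For example, a well-structured page on a trusted platform that is only ten days old fails the check, and `missing_factors` reports `age` as the single blocker.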
Cross-engine reading
What did the English LinkedIn version add for non-Russian readers?
The English adaptation turned the field study into a clearer model of retrieval behavior. It states that the LLM does not pick websites in the abstract; it picks chunks that survive retrieval and Top-K selection [2]. That framing helps explain why broad commercial topics, question-form headings, and self-contained sections outperform dense but poorly packaged material.
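The chunk-and-Top-K idea can be sketched in a few lines of Python. This is a toy stand-in under stated assumptions: production systems typically rank with embeddings and rerankers rather than the simple term-overlap score used here, and the function names are hypothetical. The point is only to show why a self-contained, question-led chunk outranks generic copy for a matching query.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query: str, chunk: str) -> float:
    """Share of chunk tokens that are query terms (a crude density measure)."""
    query_terms = set(tokenize(query))
    chunk_tokens = tokenize(chunk)
    if not chunk_tokens:
        return 0.0
    hits = sum(1 for token in chunk_tokens if token in query_terms)
    return hits / len(chunk_tokens)


def top_k_chunks(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks with the highest overlap score, best first."""
    ranked = sorted(chunks, key=lambda c: overlap_score(query, c), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    chunks = [
        "Welcome to our company. We are passionate about innovation.",
        "What is a good citation rate? A good citation rate depends on "
        "platform and page age.",
        "Our team has offices in several cities.",
    ]
    for chunk in top_k_chunks("what is a good citation rate", chunks, k=1):
        print(chunk)
```

Under this scoring, only the question-led chunk shares terms with the query, so it is the one that survives a Top-1 cut.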
The English version also adds an engine-by-engine reading that the Russian article only hints at. Google AI Overviews still inherit part of classic Google authority, citing roughly 62% of domains that already rank in traditional results [2]. Perplexity is described as stricter about sources and primary references, while Claude is portrayed as more conservative and more likely to lean toward established publishers and well-structured documentation [2]. Those differences do not erase the core pattern. They show where the same article structure needs stronger sourcing or a different distribution surface.
For this page, that matters because the target audience is not only a Russian-language GEO operator. It is also an English-speaking founder, CMO, or operator who needs one coherent explanation of why citation readiness depends on structure, not on "AI-ready" buzzwords.
Market picture
What does the 150M-link Runet dataset change about the market picture?
The 150M-link source is different from the two citation-audit assets and should be treated differently [3]. It is a partner-dataset market analysis, not a first-party product claim. Gregory Shevchenko introduces it as work from GPTfox and explicitly frames it as a dataset Humanswith.ai uses in Russian-market projects rather than a tool to promote on the personal site [3].
Its value is scale. The dataset covers 150 million links extracted from real answers across six AI services, which means it can show which categories and surfaces already absorb AI traffic at market level [3]. In the public write-up, financial aggregators, review platforms, and strong UGC surfaces appear repeatedly. The article also argues that vc.ru keeps outrunning classic editorial media in many GEO-style comparisons, which supports the platform-authority conclusion from the 158-publication audit [3].
The correct use of this dataset is strategic. It helps answer where Russian-language AI traffic already clusters and which third-party surfaces deserve distribution effort. The incorrect use is to cite it as if it were a neutral product benchmark for Humanswith.ai itself. This page keeps that caveat visible on purpose [3].
Implications
What should founders and CMOs do with these findings?
The immediate lesson is not "write more content." It is "publish the right answer on the right surface, then give it time to become retrievable." The sources point to five operating rules that are practical enough to reuse [1, 2, 3].
- Pick a broad business question before you pick a narrow vertical angle.
- Publish first on surfaces that already carry trust in your market, then connect that authority back to the first-party site.
- Make each H2 behave like a user question and make the next paragraph answer it directly.
- Package evidence into tables, checklists, and dated source-backed statements instead of burying it in narrative paragraphs.
- Measure citation, not only traffic, and allow at least a two-month window before calling the page dead [2].
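Two of these rules are easy to check mechanically: whether the H2s read as questions, and whether a page is old enough to judge. The Python sketch below is one hedged way to do that. The question-word list, the 60-day default for "two months", and the function names are assumptions for illustration, not part of the audit's method.

```python
from datetime import date, timedelta

# Illustrative list of English question openers (assumption, not from the audit).
QUESTION_WORDS = {
    "what", "why", "how", "which", "when", "who", "where",
    "does", "do", "is", "are", "should", "can",
}


def is_question_heading(heading: str) -> bool:
    """True if the heading ends with '?' or opens with a question word."""
    text = heading.strip().lower()
    if not text:
        return False
    first_word = text.split()[0]
    return text.endswith("?") or first_word in QUESTION_WORDS


def question_heading_share(headings: list[str]) -> float:
    """Fraction of headings that read as user questions."""
    if not headings:
        return 0.0
    return sum(is_question_heading(h) for h in headings) / len(headings)


def ready_to_evaluate(published: date, today: date, min_days: int = 60) -> bool:
    """True once the page has had at least min_days to become retrievable."""
    return (today - published) >= timedelta(days=min_days)
```

A page with headings such as "What did the audit test?" and "Methodology" scores 0.5 on `question_heading_share`, and a page published one week ago returns `False` from `ready_to_evaluate`, which is a reminder not to call it dead yet.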
On this site, that is why the research archive and the writing archive exist as first-party citation targets, while public platform assets on vc.ru and LinkedIn remain visible as supporting authority surfaces rather than hidden "off-site" work.
Caveats
Which caveats matter before anyone turns this into a universal law?
Three caveats belong next to every citation claim on this topic. First, the 158-publication audit is a real field study, but it is still sample-bound [1, 2]. Second, the English adaptation adds cross-engine interpretation, which is useful but should not be mistaken for a fully controlled benchmark [2]. Third, the 150M-link Runet article is a partner-dataset market analysis, so its strategic value is strongest when you use it to understand category-level traffic patterns, not to borrow someone else's product positioning [3].
Those caveats do not weaken the case for AEO and GEO. They make the case cleaner. A useful research page should tell readers which claims are first-party findings, which claims are synthesis, and which claims depend on partner data.
Research method
How this page was assembled
This page is a founder-authored synthesis, not a new experiment. I used three source files from the local authority knowledge base, checked each numerical claim against those files, and kept the public-source caveats visible inside the body and source list. The goal was to create one page that can answer five recurring questions: definition, findings, methodology, implications, and caveats.
- Primary source 1: the original Russian-language vc.ru publication on the 158-publication citation audit.
- Primary source 2: the public English LinkedIn adaptation that expands the retrieval explanation and adds cross-engine commentary.
- Supporting source 3: the vc.ru article on 150 million Runet AI links, clearly labeled here as partner-dataset evidence.
Sources
References and source notes
Source 1 · VC.ru Russian-language original: We analyzed 158 publications to understand what ChatGPT and Alice cite.
Russian-language original field study. Best source for the base sample, platform effect, and early practical formula.
Source 2 · LinkedIn public English version: We audited 158 articles to find out what ChatGPT actually cites.
Best source for the English-language explanation of retrieval logic, age curve, structure lifts, and cross-engine reading.
Source 3 · VC.ru partner-dataset market analysis: How AI traffic in Runet actually works: analysis of 150 million links.
Use as market evidence with caveat. This is partner data introduced by Gregory Shevchenko, not a product endorsement page.
FAQ
Frequently asked questions
Q: Does better writing alone get a page cited by AI systems?
A: No. The audit's core surprise is that generic writing quality did not explain citation as well as topic framing, platform authority, publication age, and answer-ready structure [1, 2].
Q: How long should a team wait before judging a new page?
A: The public English source reports 43% citation for pages older than two months, versus 15% at one month and 7% for fresh pages [2]. That does not mean every page needs two months, but it does mean that one week is the wrong evaluation window.
Q: Should founders treat the 150M-link article as a product proof page?
A: No. Treat it as partner-dataset market evidence about where AI traffic clusters in the Russian-language market [3]. That is why this page repeats the caveat instead of turning the source into a sales proxy.
Q: What internal pages should I read next on this site?
A: Start with the research archive for the authority layer and the writing archive for the broader editorial map. The speaking archive adds public talks and decks that reinforce the same topic set.
Q: Why do you recommend six FAQ questions?
A: Six is a practical baseline: it gives you multiple reusable answer chunks, covers objections, and increases the odds that one answer matches a prompt. Use fewer if you genuinely have fewer questions; do not pad with filler.
Q: Should FAQ answers cite sources?
A: When you make factual or comparative claims, yes. Keep a visible Sources section with links to the exact pages behind the claims, and keep the visible FAQ aligned with the FAQ schema when you update the page.
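One way to keep the visible FAQ and the FAQ schema aligned is to generate both from the same list of question and answer pairs. The Python sketch below assumes that "FAQ schema" here means schema.org `FAQPage` markup in JSON-LD; the page does not say which markup it uses, and the helper name `faq_jsonld` is hypothetical.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build schema.org FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    # ensure_ascii=False keeps non-English text readable in the output.
    return json.dumps(data, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    faq = [
        (
            "Does better writing alone get a page cited by AI systems?",
            "No. Topic framing, platform authority, publication age, and "
            "answer-ready structure explained citation better.",
        ),
    ]
    print(faq_jsonld(faq))
```

Rendering the on-page FAQ from the same `faq` list means an edit to one answer updates the visible text and the structured data together.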