The Science — Papers & Research
The foundational GEO paper (KDD 2024), large-scale citation studies, LLM bias research, search volume decline data, and myth-busting with empirical evidence.
1. The Foundational Paper: GEO — Generative Engine Optimization
Authors: Aggarwal, Murahari, Rajpurohit, Kalyan, Narasimhan, Deshpande
Institutions: Princeton, Georgia Tech, Allen Institute for AI, IIT Delhi
Venue: KDD 2024 (30th ACM SIGKDD Conference)
arXiv: 2311.09735
GEO-Bench: The Benchmark
| Parameter | Detail |
|---|---|
| Total queries | 10,000 (8K train / 1K validation / 1K test) |
| Query distribution | 80% informational, 10% transactional, 10% navigational |
| Domains | 25 categories |
| Sources per query | Top 5 Google search results |
| Data sources | 9 datasets (MS MARCO, ORCAS-1, Natural Questions, AllSouls, LIMA, etc.) |
Nine Optimization Strategies Tested
| Strategy | Visibility Change | Notes |
|---|---|---|
| Quotation Addition | +41% | Most effective overall |
| Statistics Addition | +32% | Quantitative > qualitative |
| Fluency Optimization | +28% | Readability improvements |
| Cite Sources | +27% | +115% for rank-5 sites |
| Technical Terms | +18% | Domain-specific terminology |
| Easy-to-Understand | +14% | Simplified language |
| Authoritative | +12% | Persuasive tone |
| Unique Words | +6% | Minimal impact |
| Keyword Stuffing | -9% | Harmful — traditional SEO tactic backfires |
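The visibility changes above are relative improvements in how much of a generated answer is attributed to a source. The sketch below illustrates that arithmetic under a simplifying assumption: visibility is taken as a plain word-count share of the answer, whereas the paper's reported metrics also account for position within the answer. The names `word_share` and `relative_change` and the sample answers are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# An answer is modeled as (sentence, cited_source_id) pairs.
# This representation is an assumption made for the sketch.
Answer = List[Tuple[str, int]]


def word_share(answer: Answer, source_id: int) -> float:
    """Fraction of the answer's words found in sentences citing source_id."""
    total = sum(len(sentence.split()) for sentence, _ in answer)
    if total == 0:
        return 0.0
    cited = sum(
        len(sentence.split()) for sentence, src in answer if src == source_id
    )
    return cited / total


def relative_change(before: float, after: float) -> float:
    """Percentage change of `after` relative to the `before` baseline."""
    if before == 0:
        raise ValueError("baseline share is zero; relative change is undefined")
    return (after - before) / before * 100.0


# Made-up answers generated before and after optimizing source 1.
baseline = [("alpha beta gamma delta", 1), ("one two three four five six", 2)]
optimized = [("alpha beta gamma delta epsilon zeta", 1), ("one two three four", 2)]

before = word_share(baseline, 1)
after = word_share(optimized, 1)
print(f"share before: {before:.2f}, after: {after:.2f}")
print(f"visibility change: {relative_change(before, after):+.1f}%")
```

Because the change is relative to the baseline share, a source that starts with a small share can show a large percentage gain from a modest absolute increase, which is worth keeping in mind when reading the rank-5 figures.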
The Democratization Effect
Lower-ranked websites benefit dramatically more from GEO than top-ranked ones:
| Google Rank | Strategy | Visibility Change |
|---|---|---|
| Rank 5 | Cite Sources | +115.1% |
| Rank 5 | Statistics Addition | +97.9% |
| Rank 1 | Cite Sources | -30.3% |
| Rank 1 | Statistics Addition | -20.6% |
Domain-Specific Effectiveness
| Domain | Most Effective Strategy |
|---|---|
| Debate / History | Authoritative, Quotation Addition |
| Science | Authoritative, Fluency Optimization |
| Business / Health | Fluency Optimization |
| Law & Government | Cite Sources, Statistics Addition |
| Facts / Statements | Cite Sources |
| People & Society | Quotation Addition |
| Opinion | Statistics Addition |
Best Combination
Fluency Optimization + Statistics Addition was the best-performing pair, improving 5.5% over the best single method. Cite Sources averaged a 31.4% improvement when combined with other methods. Gains from combining strategies do not always add up.
2. Large-Scale Citation Studies
Profound: 680 Million Citations
| Platform | #1 Source | Share | #2 Source |
|---|---|---|---|
| ChatGPT | Wikipedia | 7.8% | Reddit (1.8%) |
| Perplexity | Reddit | 6.6% | YouTube (2.0%) |
| Google AI Overviews | Reddit | 2.2% | YouTube (1.9%) |
Cross-platform overlap is remarkably low: only 11% of domains are cited by both ChatGPT and Perplexity. Commercial domains (.com) account for 80.41% of all citations; nonprofit (.org) accounts for 11.29%.
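A cross-platform overlap figure like the 11% above depends on what the shared domains are divided by. The sketch below assumes overlap is measured against the union of domains cited by either platform; the study's exact definition is not given here, so treat that choice as an assumption. The function name `domain_overlap` and the domain sets are hypothetical placeholders, not data from the study.

```python
from typing import Set


def domain_overlap(a: Set[str], b: Set[str]) -> float:
    """Share of all cited domains (union) that appear in both sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Placeholder domain sets for two platforms (illustrative only).
platform_a = {"a.example", "b.example", "c.example", "d.example"}
platform_b = {"d.example", "e.example", "f.example", "g.example"}

overlap = domain_overlap(platform_a, platform_b)
print(f"domains cited by both: {overlap:.1%}")
```

Dividing by the smaller set instead of the union would give a larger number for the same data, so overlap figures from different studies are only comparable when the denominator matches.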
Yext: 17.2 Million Citations
Yext analyzed citations across ChatGPT, Perplexity, Google AI Overviews, and Claude. Its key finding: “There is no single AI optimization strategy” that works across all models, because each platform has fundamentally different citation patterns. The study recommends optimizing per platform rather than using a universal approach.
Ahrefs: 17 Million Citations
AI-cited content is 25.7% fresher than traditional search results: it tends to be more recently published and more frequently updated. The top 30 domains capture 67% of all citations.
Brandlight: Traditional vs. AI Search
3. Search Volume Decline Data
| Source / Metric | Prediction / Finding |
|---|---|
| Gartner (Feb 2024) | Traditional search volume drops 25% by 2026 |
| Gartner (extended) | Organic traffic drops 50% by 2028 |
| Bain & Company | 60% of searches now end without a click (zero-click) |
| Ahrefs | AI Overviews reduce CTR for top pages by 58% |
| Apple | Safari had first-ever search decline (May 2025) |
| Google searches/user | Dropped 20% YoY in U.S. (2025) |
| AI chatbot traffic | 80.92% YoY growth |
| Consumer AI usage | 58% use AI for product recommendations (up from 25% in 2023) |
4. The a16z Perspective
“Visibility means showing up directly in the answer itself, rather than ranking high.”
Zach Cohen & Seema Amble, a16z (May 2025)
Key data points from a16z’s “How Generative Engine Optimization Rewrites the Rules of Search”:
| Metric | Value |
|---|---|
| SEO market foundation | $80 billion+ |
| AI search query length | 23 words avg (vs. 4 in traditional) |
| AI session depth | 6 minutes average |
| ChatGPT → Vercel signups | 10% of new signups |
| New success metric | "Reference rates" — how often cited in AI answers |
“How you're encoded into the AI layer is the new competitive advantage.”
a16z
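The "reference rate" metric in the table above can be read as the fraction of AI answers that cite a given domain. A minimal sketch under that reading follows; a16z does not specify a formula here, so the definition, the function names `cites_domain` and `reference_rate`, and the sample URLs are all assumptions for illustration.

```python
from typing import List
from urllib.parse import urlparse


def cites_domain(url: str, domain: str) -> bool:
    """True if the URL's host is `domain` or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return host == domain or host.endswith("." + domain)


def reference_rate(answers: List[List[str]], domain: str) -> float:
    """Fraction of answers with at least one citation on `domain`.

    Each answer is represented by the list of URLs it cites.
    """
    if not answers:
        return 0.0
    hits = sum(
        1 for urls in answers if any(cites_domain(u, domain) for u in urls)
    )
    return hits / len(answers)


# Made-up citation lists for four sampled answers.
sample_answers = [
    ["https://example.com/pricing", "https://other.example.org/a"],
    ["https://news.example.net/story"],
    ["https://docs.example.com/guide"],
    [],
]

rate = reference_rate(sample_answers, "example.com")
print(f"reference rate for example.com: {rate:.0%}")
```

Counting an answer once no matter how many times it cites the domain keeps the metric comparable across platforms that cite very different numbers of sources per response.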
5. LLM Bias and Stochasticity Research
LLM Whisperer (Carnegie Mellon, 2024)
The study ran 449 prompts across 77 product categories, sampling 1,000 responses per prompt. Subtle synonym replacements increased brand mention likelihood by up to 78%, and semantically equivalent prompts produced absolute mention differences of 7.4% to 18.6%. The largest swing observed was 100 percentage points: InstantPot went from a 0% to a 100% mention rate between equivalent prompts.
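Mention likelihood in a study like this is estimated by sampling many responses per prompt and counting how often the brand appears. The sketch below assumes simple case-insensitive substring matching and uses canned responses in place of real model output. The brand name "AcmePot", the response texts, and the function `mention_rate` are hypothetical.

```python
from typing import List


def mention_rate(responses: List[str], brand: str) -> float:
    """Fraction of responses that mention `brand` (case-insensitive)."""
    if not responses:
        return 0.0
    needle = brand.lower()
    hits = sum(1 for text in responses if needle in text.lower())
    return hits / len(responses)


# Canned responses standing in for samples from two paraphrased prompts.
responses_prompt_a = [
    "The AcmePot is a solid pick for beginners.",
    "Consider the AcmePot or a stovetop model.",
    "Many cooks recommend acmepot for weeknight meals.",
    "A basic stovetop cooker works well.",
]
responses_prompt_b = [
    "A basic stovetop cooker works well.",
    "Look at budget electric models first.",
    "The AcmePot is one option.",
    "Any multi-cooker with a timer will do.",
]

rate_a = mention_rate(responses_prompt_a, "AcmePot")
rate_b = mention_rate(responses_prompt_b, "AcmePot")
print(f"prompt A: {rate_a:.0%}, prompt B: {rate_b:.0%}")
print(f"absolute difference: {abs(rate_a - rate_b) * 100:.1f} percentage points")
```

With only a handful of samples the estimate is noisy, which is presumably why the study draws 1,000 responses per prompt before comparing paraphrases.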
Position Bias in LLM Recommendations
First-mentioned brands receive “direct-answer language,” while later positions get “other options include” framing. ChatGPT cites only 3–4 brands per response versus 13 for Perplexity, creating winner-take-all dynamics. There is less than a 1-in-100 chance of any platform producing the same recommendation list twice.
6. Myth-Busting
| Myth | Reality | Evidence |
|---|---|---|
| llms.txt files help AI find you | Zero evidence of any effect | No peer-reviewed study; no LLM provider has confirmed they use it |
| Schema markup increases citations | Improves accuracy, NOT frequency | Search Atlas: 748K queries show no frequency effect |
| Backlinks drive AI visibility | Weak/neutral correlation | Digital Bloom, Seer Interactive |
| Keyword optimization works for AI | Keyword stuffing is HARMFUL (-9%) | GEO paper (KDD 2024) |
| Fresh content helps | TRUE: the best-supported claim in this list | 65% of AI hits on <1yr content; 3x boost for 14-day freshness |
7. Follow-Up Papers
| Paper | Key Finding |
|---|---|
| CORE (Jin et al., 2026) | 91.4% promotion success rate @Top-5 by targeting synthesis stage (not retrieval) |
| E-GEO (2025) | GEO signals diverge substantially from SEO signals in e-commerce contexts |
| Diagnosing Citation Failures | Asks WHY a document fails to be cited, rather than generic rewriting |
| Beyond Keywords (Content-Centric Agents) | End-to-end GSEO framework for post-keyword era |
| Kumar & Lakkaraju (2024) | Strategic text sequences can manipulate LLM recommendations (adversarial angle) |
8. Implications for Bitsy
Sources
- GEO Paper — Aggarwal et al. (KDD 2024)
- GEO Paper — ACM Digital Library
- CORE Paper — Jin et al. (2026)
- E-GEO Paper (2025)
- LLM Whisperer — CMU (2024)
- Profound — AI Platform Citation Patterns
- Ahrefs — 17 Million AI Citations Analysis
- Brandlight — SEO vs. GEO Overlap
- a16z — GEO Over SEO (May 2025)
- Gartner — Search Volume Decline Prediction
- Digital Bloom — 2025 AI Citation Report
- Genezio — Content Types Driving LLM Mentions
- Search Atlas — LLM Citation Behavior Comparison