The citation data from production sites
Across the active client sites running on the authority site system, factual comparison content earns more than 40% of AI citations — consistently, across Google AI Overviews, Perplexity, and ChatGPT browse.
That’s not a hypothesis. It’s measured from real sites, in real niches, over a 12-month period spanning the transition from traditional search rankings to AI-mediated answers.
Here’s what the data shows about what gets cited, and how to build for it.
What AI models are looking for
AI search engines aren’t searching your site the way Google’s crawler does. They’re retrieving and synthesising. The process is:
- Chunk — the content is broken into semantic chunks
- Retrieve — relevant chunks are retrieved for the query
- Synthesise — a response is generated, with citations where the source is clear
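To make the chunk and retrieve steps concrete, here's a toy sketch. It is an illustration only, not how any particular AI search engine works: real systems typically use embedding similarity rather than word overlap, and the function names and sample page text below are hypothetical.

```python
import re

def chunk_by_heading(text):
    """Split plain text into (heading, body) chunks.
    Assumption: any line ending in '?' is a question heading."""
    chunks, heading, body = [], None, []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith("?"):
            if heading is not None:
                chunks.append((heading, " ".join(body)))
            heading, body = line, []
        elif heading is not None:
            body.append(line)
    if heading is not None:
        chunks.append((heading, " ".join(body)))
    return chunks

def tokens(s):
    """Lowercased set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))

def retrieve(chunks, query, top_k=1):
    """Rank chunks by how many query words they share (a stand-in for real relevance scoring)."""
    q = tokens(query)
    ranked = sorted(chunks, key=lambda c: len(q & tokens(c[0] + " " + c[1])), reverse=True)
    return ranked[:top_k]

# Hypothetical page content with two question headings.
PAGE = """
What is a pillar page?
A pillar page is a comprehensive overview of a broad topic.
How does a cluster spoke differ from a pillar page?
A cluster spoke covers one specific comparison or how-to question.
"""

chunks = chunk_by_heading(PAGE)
best = retrieve(chunks, "cluster spoke vs pillar page")[0]
print(best[0])
```

The point of the sketch: a chunk that opens with a question heading and a direct answer carries its own context, so it can still match a query once it's separated from the rest of the page.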
Content that performs well in this pipeline has three characteristics:
Direct-answer H2s. The heading is a question. The first paragraph under that heading is a direct answer. No preamble, no “in this section we’ll explore.”
Factual comparison. “X vs Y” structure, with clear criteria and verifiable claims. This is the content type AI models cite most because it’s the easiest to attribute to a specific source — the comparison is unique to that piece.
Named authorship. Citations require attribution. If there’s no named author — no Person schema, no byline — there’s nothing to attribute the citation to. Anonymous content gets used but rarely cited.
The schema that makes you citation-ready
Based on what’s working on production sites, four schema types directly correlate with citation frequency:
FAQPage schema. On every cluster page, with questions drawn from actual People Also Ask data. Format: each entry is @type: Question with name: [the question] and an acceptedAnswer of @type: Answer with text: [the answer]. The answer should be 40–100 words: direct, complete, and attributable.
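As a minimal sketch, the FAQPage format can be generated as JSON-LD like this. The property names (mainEntity, acceptedAnswer, and so on) are schema.org vocabulary; the helper name faq_page, the word-count check, and the sample question and answer are my own assumptions for illustration.

```python
import json

def faq_page(qa_pairs, min_words=40, max_words=100):
    """Build a FAQPage JSON-LD dict from (question, answer) pairs.
    Raises ValueError if an answer falls outside the 40-100 word guideline."""
    entities = []
    for question, answer in qa_pairs:
        word_count = len(answer.split())
        if not min_words <= word_count <= max_words:
            raise ValueError(f"Answer to {question!r} has {word_count} words")
        entities.append({
            "@type": "Question",
            "name": question,
            "acceptedAnswer": {"@type": "Answer", "text": answer},
        })
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": entities,
    }

# Hypothetical example question and answer (the answer is 50 words long).
answer = (
    "A pillar page covers a broad topic end to end and targets definitional "
    "queries, while a cluster spoke answers one specific comparison or how-to "
    "question in depth. Use the pillar page to define the topic, then link out "
    "to spokes that each resolve a single narrower question for the reader."
)
schema = faq_page([("How does a cluster spoke differ from a pillar page?", answer)])

# Paste the output into a <script type="application/ld+json"> tag on the page.
print(json.dumps(schema, indent=2))
```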
Article schema with dateModified. AI models weight recency. Content with a dateModified in the last 12 months ranks higher in AI retrieval for time-sensitive queries. This means you need an annual refresh cadence — not just updating the date, but actually updating the content.
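A rough sketch of Article markup with dateModified, plus a staleness check you could run across a site. The schema.org property names are real; the function names, the 365-day threshold as a default, and the example headline, author, and dates are assumptions for illustration.

```python
from datetime import date, timedelta

def article_schema(headline, author_name, published, modified):
    """Build an Article JSON-LD dict with ISO 8601 dates."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }

def is_stale(schema, today, max_age_days=365):
    """True if dateModified is more than max_age_days before today."""
    modified = date.fromisoformat(schema["dateModified"])
    return (today - modified) > timedelta(days=max_age_days)

# Hypothetical article, last updated in January 2025.
article = article_schema(
    headline="X vs Y: a factual comparison",
    author_name="Jane Example",
    published=date(2024, 3, 1),
    modified=date(2025, 1, 10),
)

# Flag the page for the refresh calendar if it's older than 12 months.
needs_refresh = is_stale(article, today=date(2026, 3, 1))
```

A check like this only tells you which pages are due. The date should change only when the content actually has.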
Person schema with sameAs. Point the sameAs property at your LinkedIn, YouTube, and Twitter profiles, plus any published books or speaking appearances. This builds entity authority: the signal that tells AI models the named author is a real person with verifiable expertise.
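Here's a minimal sketch of Person markup with sameAs. The property names come from schema.org; the helper name, the https-only check, and every name and URL below are placeholders, not real profiles.

```python
def person_schema(name, job_title, author_page_url, profile_urls):
    """Build a Person JSON-LD dict; sameAs lists external profiles for the same person."""
    same_as = []
    for url in profile_urls:
        if not url.startswith("https://"):
            raise ValueError(f"Expected an absolute https URL, got {url!r}")
        if url not in same_as:  # drop duplicates, keep order
            same_as.append(url)
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "jobTitle": job_title,
        "url": author_page_url,
        "sameAs": same_as,
    }

# Placeholder author and profile URLs.
author = person_schema(
    name="Jane Example",
    job_title="Founder",
    author_page_url="https://example.com/about/jane",
    profile_urls=[
        "https://www.linkedin.com/in/jane-example",
        "https://www.youtube.com/@jane-example",
        "https://www.linkedin.com/in/jane-example",  # duplicate, removed
    ],
)
```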
Speakable schema. Underused but effective. Marks specific passages as optimised for voice and AI retrieval. Works particularly well for FAQ sections and definition blocks.
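For reference, speakable is attached to an Article or WebPage as a SpeakableSpecification that points at page elements by CSS selector (or XPath). A minimal sketch, where the helper name and the selectors .faq-answer and .definition are hypothetical and would need to match your own templates:

```python
def with_speakable(page_schema, css_selectors):
    """Return a copy of an Article/WebPage JSON-LD dict with a speakable block added."""
    if not css_selectors:
        raise ValueError("Provide at least one CSS selector")
    marked = dict(page_schema)  # shallow copy so the input is left unchanged
    marked["speakable"] = {
        "@type": "SpeakableSpecification",
        "cssSelector": list(css_selectors),
    }
    return marked

# Hypothetical page and selectors for FAQ answers and definition blocks.
page = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "X vs Y: a factual comparison",
}
marked_page = with_speakable(page, [".faq-answer", ".definition"])
```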
The content architecture
The three-tier system produces citation-ready content by design:
Pillar pages are comprehensive. They earn citations for broad definitional queries — “what is [X]” type questions.
Cluster spokes are specific. They earn citations for comparison and how-to queries — “how does X compare to Y,” “what’s the best X for Y situation.”
Entity pages are factual. Pure information density with primary source citations. These earn citations for data queries — statistics, definitions, regulatory information.
The pattern that maximises citations: a cluster spoke written as a factual comparison, with FAQPage schema, a named author with Person schema, and a dateModified within the last 6 months.
What to build next
If you’re building for AI citations in 2026:
- Audit your existing content for direct-answer H2s. Rewrite headings to match question intent.
- Add FAQPage schema to every cluster page. Pull questions from PAA data.
- Build or update your author page with Person schema and sameAs properties.
- Establish an annual refresh calendar: schedule dateModified updates with substantive content changes.
- Create factual comparison content for your top 3 topic clusters.
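The first item on that list, the heading audit, is easy to script. This is a rough sketch under my own assumptions: a heading counts as direct-answer ready if it ends in a question mark and starts with a common question word. The word list and function names are illustrative, not a standard.

```python
import re

# Assumed list of words that typically open a question heading.
QUESTION_WORDS = {
    "what", "how", "why", "when", "where", "which", "who",
    "is", "are", "does", "do", "can", "should",
}

def is_question_heading(heading):
    """True if the heading ends with '?' and opens with a question word."""
    text = heading.strip()
    first = re.match(r"[a-z]+", text.lower())
    return text.endswith("?") and first is not None and first.group() in QUESTION_WORDS

def headings_to_rewrite(headings):
    """Return the headings that are not phrased as questions."""
    return [h for h in headings if not is_question_heading(h)]

# Hypothetical H2s pulled from an existing page.
h2s = [
    "What AI models are looking for",
    "What is a pillar page?",
    "The content architecture",
    "How does X compare to Y?",
]
todo = headings_to_rewrite(h2s)
print(todo)
```

Treat the output as a review queue, not a verdict. Some headings read better as statements, and the rewrite still has to match what people actually ask.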
The Playbook covers all of this in Chapter 5, with the full schema templates and the content brief format I use on production builds.