What ChatGPT actually searches for: 5 million fanout queries analyzed

A new study by Peec AI, based on 5 million query fanouts collected across ChatGPT, Perplexity, and Grok between April 1 and April 21, 2026, reveals that the AI platforms running behind popular chatbot interfaces do not simply search for what users type. According to Peec AI, ChatGPT consistently rewrites user queries before executing them - injecting words like "best," "reviews," and the current year even when those terms never appeared in the original prompt.

What the study measured

The data set behind the research covers 5 million query fanouts collected over a 21-day period. Tomek Rudzki, a GEO (generative engine optimization) expert at Peec AI, published the findings on May 5, 2026, in a post on the company's blog.

The term query fanout refers to the set of additional searches an AI system runs behind the scenes after receiving a single user prompt. When a user asks ChatGPT "what are the best project management tools?", according to Peec AI, the model does not execute that one query alone. It generates multiple sub-queries covering comparisons, reviews, specific brands, and the current year, then synthesizes the results from all of them into a single answer. These background searches are the fanouts.

The mechanics matter because, as Peec AI notes, ChatGPT uses Reciprocal Rank Fusion (RRF) - an algorithm that merges the ranked result lists returned by multiple sub-queries into a single ranking. A page's fused score is the sum of its contributions from every list in which it appears, so content that surfaces across several fanout searches receives a higher composite score than content that answers only one. That scoring dynamic has direct consequences for which pages get cited in AI responses and which do not.

The finding that ChatGPT uses RRF to merge results from parallel sub-queries comes from Metehan Yesilyurt, a researcher at Peec AI. Understanding the specific terms ChatGPT tends to inject therefore gives content teams a structural advantage: they can ensure their pages address the angles the model is most likely to search for.
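To make the scoring dynamic concrete, here is a minimal sketch of RRF in its standard textbook form, where each result list contributes 1 / (k + rank) to a page's score. The constant k = 60 is the value commonly used in the literature; the source does not say what constant or variant ChatGPT uses, and the URLs and result lists below are invented for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of URLs into one ranking.

    Each URL earns 1 / (k + rank) for every list it appears in
    (rank starts at 1). k=60 is an assumption, not a documented
    ChatGPT setting.
    """
    scores = defaultdict(float)
    for results in ranked_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical results for three fanout queries from one prompt.
fanout_results = [
    ["listicle.example/best-pm-tools", "vendor-a.example", "vendor-b.example"],
    ["reviews.example/pm-tools", "listicle.example/best-pm-tools"],
    ["vendor-a.example", "listicle.example/best-pm-tools"],
]

fused = reciprocal_rank_fusion(fanout_results)
for url, score in fused:
    print(f"{score:.4f}  {url}")
```

In this toy data the listicle page ranks first only once, yet it tops the fused ranking because it appears in all three lists, while the review page that ranks first for a single sub-query lands third.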

The top 10 injected words

According to Peec AI's analysis of fanout data from April 1-21, 2026, the ten most frequently injected words - those appearing in fanout queries despite being absent from the original user prompt - ranked as follows, by share of affected ChatGPT responses:

"best": 15.33%
"what": 8.72%
"review(s)": 6.84%
"2026": 5.44%
"top": 5.24%
"comparison": 4.48%
"vs": 4.27%
"company/companies": 4.02%
"service(s)": 3.99%
"software": 3.47%

The list is dominated by commercial and evaluative terms. Six of the ten - best, reviews, top, comparison, vs, and software - are vocabulary typical of purchase research, competitive analysis, or product evaluation, and "company/companies" and "service(s)" are category nouns that point in the same commercial direction. Only "what" and the year modifier sit outside that cluster.
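The definition of an injected word (present in a fanout query, absent from the prompt) can be sketched in a few lines. This is an illustration only: Peec AI's tokenization, normalization, and exact denominator are not described in the source, so the function names, the simple lowercase tokenizer, and the sample records are all assumptions.

```python
import re
from collections import Counter

TOKEN = re.compile(r"[a-z0-9]+")


def tokens(text):
    """Lowercase word tokens; no stemming, so 'shop' and 'shops' differ."""
    return set(TOKEN.findall(text.lower()))


def injected_terms(prompt, fanouts):
    """Terms that appear in any fanout query but not in the prompt."""
    prompt_tokens = tokens(prompt)
    injected = set()
    for query in fanouts:
        injected |= tokens(query) - prompt_tokens
    return injected


def injection_rates(records):
    """Share of prompts in which each term was injected at least once."""
    counts = Counter()
    for prompt, fanouts in records:
        counts.update(injected_terms(prompt, fanouts))
    return {term: n / len(records) for term, n in counts.items()}


# Hypothetical (prompt, fanout queries) pairs.
records = [
    ("should I use a CRM for my shop",
     ["best crm for small shops 2026", "crm reviews"]),
    ("how can I track invoices",
     ["best invoice tracking software"]),
]

rates = injection_rates(records)
for term in ("best", "reviews", "2026", "software"):
    print(term, rates[term])
```

A production analysis would likely group variants such as "review" and "reviews", as the study's "review(s)" entry implies, which this sketch does not attempt.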

Why "best" wins: the listicle mechanism

The word "best" leads the rankings by a wide margin. According to Peec AI, for advice-style questions - prompts framed as "Should I use X?" or "Do you recommend Y?" - ChatGPT injects the word "best" in 24.3% of fanouts, or roughly one in every four.

The practical implication is significant. Listicle formats built around phrases like "10 Best..." or "Top 7..." are structurally aligned with what the model is actually querying, regardless of how the user framed the original question. This is why, according to the Peec AI research, pages from review aggregators and best-of compilations consistently appear as cited sources in ChatGPT answers. The model is constructing queries that retrieve exactly those formats.

Rudzki identified four-word phrase patterns (four-grams) in the original user prompts that tend to trigger this behaviour. According to Peec AI's analysis, prompts in which someone asks for a recommendation about what to buy, which service to use, or how to choose between options are among the most reliable triggers for "best" injection.

This finding has a direct connection to the economics of AI-visible content. Sites operating in the G2, Gartner, Clutch, and Forbes category - platforms built specifically around ranked product comparisons - benefit from this injection behaviour regardless of whether users ask for a ranked list. The model searches for one anyway.

SEO analyst Lily Ray examined the listicle strategy and its risks on May 13, 2026, arguing that the tactic works but is being eroded as AI systems identify and discount content produced specifically to game citation mechanics. The Peec AI data explains, at a mechanistic level, why the strategy has been so effective in the first place.

"Reviews" as a hidden validation layer

The third most injected word is "review(s)" at 6.84%. According to Peec AI, ChatGPT looks for reviews even when users never ask for them. A query like "what are the best tools for X?" triggers internal review-hunting sub-queries behind the scenes.

Rudzki examined this behaviour using a sample project for the fintech brand Revolut. He found two sources driving how AI described the brand: Glassdoor with a 4.87 out of 5 rating - strongly positive - and Sitejabber with 1.3 out of 5, an outlier that contained multiple reviews flagged as suspicious. Most brand monitoring programmes focus on major platforms. But according to Peec AI, smaller niche review sites get read by AI too, and their influence on how a brand is described in AI responses can be disproportionate to their traffic or reputation.

For SEO and GEO teams, the finding raises a question that goes beyond on-site content: which third-party review sites is ChatGPT citing, and what do those pages say about the brand? The answer is not always obvious. Tools like Peec AI, Profound, WAIKAY, and others have emerged specifically to track brand visibility in AI-generated answers, a category that barely existed 18 months ago.

The freshness signal: why "2026" appears in 5.44% of queries

LLMs surface fresh content because their training data has a cutoff, and they use web retrieval to compensate. According to Peec AI, ChatGPT adds the current year to its fanout queries in 5.44% of prompts. Grok does this even more frequently.

The consequence is straightforward. A page that has not been updated recently will underperform against a newer version of nominally the same content, all other things being equal. According to Peec AI, the pages most worth updating are those already serving as sources for AI responses - refreshing an article already in circulation delivers more lift than publishing a new article with no existing citation record.

The year-modifier finding also explains why content marked as "updated for 2026" or published in early 2026 tends to appear more often in AI responses to time-sensitive queries. ChatGPT is effectively filtering for recency as a proxy for relevance.

How ChatGPT, Perplexity, and Grok differ

The most visible structural difference between the three platforms is volume. According to Peec AI's data, Perplexity averages 1.4 fanouts per query, ChatGPT averages 2.1, and Grok reaches 6.8 - more than triple ChatGPT's rate.
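For readers who want to check the ratios or compute the same metric on their own logs, a small sketch follows. The averaging function and its record layout are assumptions for illustration; only the three reported averages come from Peec AI.

```python
from collections import defaultdict


def average_fanouts(records):
    """records: iterable of (platform, [fanout queries]) pairs, one per prompt."""
    fanout_total = defaultdict(int)
    prompt_total = defaultdict(int)
    for platform, fanouts in records:
        fanout_total[platform] += len(fanouts)
        prompt_total[platform] += 1
    return {p: fanout_total[p] / prompt_total[p] for p in prompt_total}


# Averages reported by Peec AI for April 1-21, 2026.
reported = {"Perplexity": 1.4, "ChatGPT": 2.1, "Grok": 6.8}

grok_vs_chatgpt = reported["Grok"] / reported["ChatGPT"]        # about 3.24
grok_vs_perplexity = reported["Grok"] / reported["Perplexity"]  # about 4.86
print(f"Grok runs {grok_vs_chatgpt:.2f}x ChatGPT's fanouts per query")
print(f"Grok runs {grok_vs_perplexity:.2f}x Perplexity's fanouts per query")
```

The first ratio confirms the "more than triple" comparison; against Perplexity the gap is closer to five times.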

Volume alone does not capture the differences in strategy. Each platform treats the fanout differently.

Perplexity, according to Peec AI, takes a minimal approach: it commonly strips the user's query down to its core, removing filler language, and runs a cleaner version of the same search. No new angles are added. No additional context is injected. For anyone trying to understand what Perplexity is optimizing for, this approach offers very little signal to act on.

ChatGPT stays close to the original intent but injects specific brand names, product comparisons, and the vocabulary described above. According to Peec AI, a question like "What CRM has the best support for syncing data?" becomes, inside ChatGPT's search layer, something closer to "CRM integration sync capabilities - Salesforce, HubSpot, Microsoft Dynamics comparison."

Grok operates differently again. According to Peec AI, it treats queries like research briefs. It starts broad, adds year modifiers (2025, 2026), then generates brand-versus-brand comparisons, then targets specific trusted sources using the site: operator. A single query about the best dash cam can produce between five and eight fanouts that collectively map the full purchase decision journey.

The Grok approach is distinctive in another way: it explicitly names the sources it trusts. According to Peec AI, Grok uses the site: operator in 18.3% of all chats. Reddit appears in 10.5% of all Grok chats, and 9 out of 10 of those are structured as site:reddit.com searches - a deliberate directive to pull community opinions, not a passing mention. Wirecutter and Consumer Reports appear in Grok's fanouts too, and according to Peec AI, 100% of those appearances were injected by Grok itself; they were not present in the original user prompt.

This explicit trusted-source targeting has no direct equivalent in ChatGPT's documented behaviour. It means that for Grok specifically, being mentioned on Reddit, Wirecutter, or Consumer Reports is not just a general authority signal - it is a pathway into a model that actively searches those domains by name.
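Metrics such as "site: operator in 18.3% of all chats" can be derived from raw fanout logs with a simple pattern match. The sketch below is an assumption about how such a count might be done, not Peec AI's method; the regular expression, function names, and sample chats are hypothetical.

```python
import re
from collections import Counter

# Matches "site:example.com" and captures the domain part only.
SITE_OP = re.compile(r"\bsite:([a-z0-9.-]+)", re.IGNORECASE)


def site_targets(query):
    """Domains named with the site: operator in one fanout query."""
    return [d.lower().rstrip(".") for d in SITE_OP.findall(query)]


def site_operator_stats(chats):
    """chats: list of fanout-query lists, one inner list per chat.

    Returns (share of chats using site:, {domain: share of chats}).
    """
    if not chats:
        return 0.0, {}
    chats_with_site = 0
    domain_chats = Counter()
    for fanouts in chats:
        domains = {d for query in fanouts for d in site_targets(query)}
        if domains:
            chats_with_site += 1
        domain_chats.update(domains)  # each domain counted once per chat
    total = len(chats)
    return chats_with_site / total, {d: n / total for d, n in domain_chats.items()}


# Hypothetical Grok chats.
chats = [
    ["best dash cam 2026",
     "dash cam reviews site:reddit.com",
     "dash cam ratings site:consumerreports.org"],
    ["best dash cam"],
]

share, by_domain = site_operator_stats(chats)
print(share, by_domain)
```

Counting each domain once per chat, rather than once per query, matches the per-chat framing of the figures quoted above.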

Context for the marketing community

PPC Land's coverage of Google's AI Mode has documented how Google's own system employs what the company calls a "query fan-out technique," breaking user questions into dozens of sub-queries executed simultaneously. Google's VP of Product Robby Stein explained this mechanism publicly on October 30, 2025. Google's fan-out count, according to data shared by Nectiv co-founder Chris Long via LinkedIn, sits at around 9 fanouts per query in Gemini 3 - significantly higher than ChatGPT's 2.1 average.

The Peec AI research provides rare empirical grounding for what has largely been anecdotal optimisation advice. Cyrus Shepard's analysis of 54 experiments, published on May 7, 2026, placed URL accessibility and search rank at the top of a scored list of AI citation factors - findings that align with the Peec AI dataset in one key respect: AI models are still querying the web and surfacing ranked results. What the fanout data adds is specificity about which additional terms get appended, meaning teams optimising only for the literal keyword a user typed are covering less than half the actual query space.

Google's AI search guide, published on May 15, 2026, explicitly rated certain fan-out targeting tactics as counterproductive, cautioning against creating large volumes of content aimed at individual sub-queries. The Peec AI data suggests the more productive application of fanout research is topical coverage - ensuring a single strong piece addresses the cluster of terms ChatGPT is likely to inject, rather than attempting to build separate content for each possible sub-query.

The "software" entry at 3.47% in the injected-word table prompted discussion among practitioners following the study's publication. Austin S., listed as Head of Organic and AI Search at Amazon Ads, noted in comments on the LinkedIn post sharing the study that the term suggests an inherent B2B skew in ChatGPT's fanout vocabulary. Rudzki confirmed in replies that "software" appears even in fanouts for prompts phrased as "best ways to do..." or "how can I..." - contexts where a software recommendation was not obviously implied by the user's question. That pattern suggests ChatGPT's training data has a strong association between advice-seeking queries and software product categories, regardless of whether the question was commercial in nature.

How to observe fanouts directly

Peec AI outlines three methods for accessing fanout data on specific prompts.

The manual method involves opening an AI search engine, pulling up the browser's network tab, and watching API calls fire as the model generates its answer. The fanout queries appear inside those requests. It works but operates one query at a time.
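One way to speed up the manual method is to export the network tab as a HAR file (a JSON format most browsers can save) and search the captured responses for a phrase you expect in a fanout query. The sketch below relies only on the general HAR layout (log.entries, request.url, response.content.text); which endpoints and fields actually carry fanout queries is not stated in the source, so the sample data and URLs are invented.

```python
import base64


def response_text(entry):
    """Return the response body of one HAR entry as text ('' if absent)."""
    content = entry.get("response", {}).get("content", {})
    text = content.get("text") or ""
    if content.get("encoding") == "base64":
        # HAR exports may store bodies base64-encoded.
        text = base64.b64decode(text).decode("utf-8", errors="replace")
    return text


def requests_mentioning(har, needle):
    """URLs of captured requests whose response body contains `needle`."""
    needle = needle.lower()
    return [
        entry["request"]["url"]
        for entry in har.get("log", {}).get("entries", [])
        if needle in response_text(entry).lower()
    ]


# Synthetic HAR fragment; real exports contain many more fields.
sample_har = {"log": {"entries": [
    {"request": {"url": "https://chat.example/api/conversation"},
     "response": {"content": {"text": '{"queries": ["best dash cam 2026"]}'}}},
    {"request": {"url": "https://chat.example/static/app.js"},
     "response": {"content": {"text": "console.log('hi')"}}},
]}}

hits = requests_mentioning(sample_har, "dash cam")
print(hits)
```

With a real export, the dictionary would come from json.load on the saved .har file; the matching requests are then the ones worth inspecting by hand.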

The second approach uses the Peec AI dashboard, where the "Latest Fanout Queries" section for any tracked prompt shows what ChatGPT, Grok, and Perplexity searched for in generating the response, without requiring manual inspection of network traffic.

The third approach uses the Peec AI MCP (Model Context Protocol) integration, which Peec AI describes as an official partnership with Claude. According to Peec AI, connecting the MCP to a Claude instance allows content teams to pull batches of fanout data across tracked prompts and use Claude to identify common themes across the injected terms.

The iPullRank AI search manual, released on August 29, 2025, covered fan-out techniques as one of its core chapters, establishing fanout optimization as a recognized discipline rather than an experimental tactic. The Peec AI dataset, derived from 5 million real queries rather than constructed experiments, provides the kind of scale that prior fanout research had lacked.

The study adds a quantitative dimension to a body of qualitative observation that has been accumulating since AI search became a mainstream channel. The 15.33% rate for "best," the 5.44% rate for the current year, the 6.84% rate for "reviews," and the divergence in fanout counts across platforms give marketing teams numbers to work with rather than patterns to guess at.

Timeline

July 16, 2025: Surfer publishes early AI search fanout study documenting how ChatGPT and Google AI Mode use sub-queries to build responses
August 29, 2025: iPullRank releases a 20-chapter AI search manual including query fan-out optimization techniques
October 30, 2025: Google VP of Product Robby Stein publicly explains Google's query fan-out technique for AI Mode
April 1-21, 2026: Peec AI collects 5 million query fanouts from ChatGPT, Perplexity, and Grok for the study period
April 27, 2026: Datos Q1 2026 state of search report documents ChatGPT's position shifting in post-search destination rankings
May 5, 2026: Tomek Rudzki publishes "Patterns we see in ChatGPT query fanouts" on the Peec AI blog
May 7, 2026: Cyrus Shepard publishes analysis of 54 AI citation experiments, ranking URL accessibility and search rank at the top of AI citation factors
May 13, 2026: Lily Ray discusses the listicle strategy and its vulnerabilities in AI search, contextualizing why "best" injection makes those formats effective
May 15, 2026: Google publishes AI search guide cautioning against content created solely to target estimated fan-out sub-queries
June 6, 2026: Chris Long of Nectiv shares the Peec AI study on LinkedIn, noting a Gemini 3 fanout count of approximately 9 per query - significantly higher than ChatGPT's 2.1 average

Summary

Who: Tomek Rudzki, GEO expert at Peec AI, authored the study. Chris Long, co-founder of Nectiv, shared and provided commentary on the research. Metehan Yesilyurt, a GEO researcher at Peec AI, contributed the finding about ChatGPT's use of the Reciprocal Rank Fusion algorithm.

What: A study analyzing 5 million query fanouts collected from ChatGPT, Perplexity, and Grok between April 1 and April 21, 2026. The research identifies the most frequently injected words in ChatGPT fanout queries, compares fanout volume and strategy across three AI platforms, and explains why content formats like listicles and review pages perform disproportionately well in AI search results.

When: The data collection period ran from April 1 to April 21, 2026. The study was published on May 5, 2026. Chris Long shared the findings publicly on June 6, 2026.

Where: The study was published on the Peec AI blog at peec.ai. Chris Long's commentary appeared in a LinkedIn post. The underlying data was collected by Peec AI's tracking infrastructure, which monitors fanout queries across ChatGPT, Perplexity, and Grok.

Why: Query fanouts represent the actual search vocabulary AI systems use to build their responses - not the vocabulary users type. Because ChatGPT uses the RRF algorithm to combine scores across multiple sub-queries, content that matches only the user's literal phrasing covers a fraction of the query space the model is actually evaluating. The research gives marketing and content teams a data-driven basis for understanding which terms and angles to cover in order to increase the probability of being cited by AI search engines.