Perplexity co-authored the study that praises its own AI agents

AI agents cut task time 87 percent in a study Perplexity co-authored using its own data and products, even as copyright and security cases mount against it.

A study circulating since June 8, 2026 makes a confident case for autonomous AI agents. It reports that an agent performs roughly fifty times more work per session than a chatbot, finishes the average task in a fraction of the time, and leaves users measurably more satisfied. The findings flatter one company above all others. That company is Perplexity, which supplied the data, whose two products are the only ones compared, and three of whose employees appear among the paper's four authors.

None of that makes the numbers wrong. It does change how they should be read. The paper is a non-peer-reviewed preprint, posted to arXiv on June 5, 2026 under the identifier 2606.07489 and dated June 8. Its results are the kind a vendor would want in circulation: a demonstration that its newer, paid agent outperforms its older, free assistant. For a marketing and advertising readership now weighing agent tools from many suppliers, the more useful question is not what the study found, but how much weight an interested party's findings should carry.

A study, and who wrote it

The paper, "How AI Agents Reshape Knowledge Work: Autonomy, Efficiency, and Scope," lists Jeremy Yang of Harvard University alongside Kate Zyskowski, Noah Yonack and Jerry Ma, all of Perplexity. The lead author is based at Harvard Business School; the correspondence addresses point to Harvard and to Perplexity directly. Three of the four authors, then, work for the company whose products the study evaluates, and an earlier paper by an overlapping group examined Perplexity usage in 2025.

The data are Perplexity's own. The comparison sets the company's answer engine, Perplexity Search, against its autonomous agent, Perplexity Computer, which was released on February 25, 2026. No competing product appears anywhere in the analysis. The headline result, that the agent beats the assistant, is therefore a within-company contrast between a basic tier and a premium one, rather than a test of agents against the wider field.

The method does have a real strength worth stating plainly. Rather than rely on surveys, the authors matched near-identical opening queries that the same users sent to both products, keeping only pairs with a cosine similarity above 0.99, and drew 10,000 such pairs from 8,357 individuals. Matching the task content this way controls for the fact that people send easy questions to a chatbot and hard jobs to an agent. The design is sound. What it cannot do is remove the interest the authors have in the outcome.
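The matching step can be pictured with a short sketch. It is an illustration under stated assumptions, not the authors' pipeline: the article gives only the 0.99 cosine-similarity threshold and the same-user requirement, so the toy embedding vectors, the tuple layout and the function names below are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def matched_pairs(search_queries, agent_queries, threshold=0.99):
    """Keep (search_id, agent_id) pairs sent by the same user whose
    query embeddings exceed the similarity threshold."""
    pairs = []
    for s_user, s_id, s_vec in search_queries:
        for a_user, a_id, a_vec in agent_queries:
            if s_user == a_user and cosine_similarity(s_vec, a_vec) > threshold:
                pairs.append((s_id, a_id))
    return pairs

# Hypothetical toy data: (user, query id, embedding vector).
search = [("u1", "s1", [1.0, 0.0, 0.0]), ("u2", "s2", [0.0, 1.0, 0.0])]
agent = [
    ("u1", "a1", [0.999, 0.01, 0.0]),  # near-identical query, same user
    ("u1", "a2", [0.5, 0.5, 0.0]),     # same user, different task
    ("u2", "a3", [0.0, 1.0, 0.0]),     # identical query, same user
    ("u3", "a4", [1.0, 0.0, 0.0]),     # identical text, different user
]
pairs = matched_pairs(search, agent)
```

Only the first and third agent queries survive here. The second fails the threshold and the fourth fails the same-user rule, which is the sense in which the design holds task content constant.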

What the study claims

Taken at face value, the figures are large. According to the paper, a Computer session ran 26 minutes of autonomous work against 33 seconds for a Search session, a 48-fold gap. On matched tasks, the agent cut average completion time from 269 minutes to 36, which the authors translate into an 87 percent reduction in time and a 94 percent reduction in cost after applying wages from the United States Bureau of Labor Statistics for May 2025. The savings were largest in labour-heavy domains, with programming compressing from 596 minutes to 48.
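The headline percentages can be checked from the figures quoted above. The sketch below uses only the rounded numbers reported in this article, and the helper names are ours.

```python
def percent_reduction(before, after):
    """Relative drop from `before` to `after`, in percent."""
    return (before - after) / before * 100

def fold_gap(larger, smaller):
    """How many times `smaller` fits into `larger`."""
    return larger / smaller

# Matched tasks, in minutes, as quoted from the paper.
time_cut = percent_reduction(269, 36)
programming_cut = percent_reduction(596, 48)

# Session length: 26 minutes converted to seconds, against 33 seconds.
session_gap = fold_gap(26 * 60, 33)
```

The time reduction rounds to 87 percent, matching the paper, and programming works out to roughly 92 percent. The session gap from these rounded inputs comes to a little over 47, so the paper's 48-fold figure presumably rests on unrounded values.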

The study also reports a quality gain. Using a signal it calls next-turn dissatisfaction, it found that the agent drew meaningful dissatisfaction on 1.3 percent of queries against 2.9 percent for the assistant, a 55 percent reduction. And it argues for a broader effect it considers more important than speed: scope expansion. Agent users worked outside their primary occupation 59 percent of the time against 50 percent with the assistant, and each agent task required substantive expertise in 2.40 distinct knowledge domains against 1.74. The authors describe the result as a "reduction in coordination costs," with the worker's role shifting, in their words, from "operator to supervisor."

For marketers, two data points stand out inside the paper. Marketing and sales ranked as the third-busiest subject area in a sample of 100,000 agent queries, at 7.6 percent. And among the work behaviours that grew most under the agent, the activity labelled creating visual designs or displays rose 18 points, with preparing informational or instructional materials up 16. Those are content-production tasks. The study does not measure marketing outcomes, but it documents an agent absorbing the kind of artifact-building work marketing teams perform daily.

How "quality" was actually measured

The satisfaction claim deserves the closest reading, because it is the one most likely to be quoted out of context. The 55 percent figure does not come from any independent assessment of whether the agent's output was accurate, original or correct. It is a behavioural proxy, scored from what a user did in the next turn: whether they re-asked, corrected, reported an error or retried. A user who accepts a polished but flawed deliverable registers as satisfied. The measure captures satisfaction, in other words, and satisfaction is not the same thing as correctness.
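A minimal sketch shows why a proxy of this kind cannot see a flawed answer that the user accepts. The action labels are hypothetical stand-ins for the four behaviours named above, and the paper's actual scoring may differ.

```python
# Hypothetical labels for what a user does in the turn after a response.
DISSATISFIED_ACTIONS = {"re_ask", "correction", "error_report", "retry"}

def dissatisfaction_rate(next_turn_actions):
    """Share of queries whose next turn signals dissatisfaction."""
    if not next_turn_actions:
        return 0.0
    flagged = sum(1 for action in next_turn_actions
                  if action in DISSATISFIED_ACTIONS)
    return flagged / len(next_turn_actions)

def relative_reduction(baseline, treated):
    """Fractional drop from the baseline rate to the treated rate."""
    return (baseline - treated) / baseline

# Four toy queries. The last user accepted an output that was wrong,
# but took no corrective action, so the proxy cannot flag it.
sample = ["accept", "retry", "accept", "accept_flawed_output"]
rate = dissatisfaction_rate(sample)

# The reported rates: 2.9 percent for the assistant, 1.3 for the agent.
reported = relative_reduction(2.9, 1.3)
```

In the toy sample only the retry counts, giving a rate of one in four even though two of the four outputs were unsatisfactory in fact. The reported rates do reproduce the 55 percent reduction, so the arithmetic holds up. The open question is what the proxy measures.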

The efficiency numbers rest on modelling rather than observation. The counterfactual assumes a person using only Search would perform every non-research step by hand, with the time for each tool action estimated from an assumed schedule and, separately, by a language model reading the query text. The authors stress-tested those assumptions and reported the cost advantage survived large adjustments, which is to their credit. They are also candid that the supporting interviews, conducted with 25 users, are self-reported and, in their own phrase, "subject to recall and selection bias."

The sample is narrow in a way the authors disclose. The 90-day window, they write, "captures an early-adoption period in which users are disproportionately power users and paying subscribers." Whether the gains hold for a general population, or for skeptical buyers rather than enthusiasts, remains untested. The matched-pair method also covers only those agent tasks that have a close assistant equivalent, and many do not.

What independent research found

Vendor-independent work points in a more cautious direction on the question the study treats as settled. A November 2025 study from Carnegie Mellon and Stanford found AI agents completing tasks far faster but with significant quality gaps, and reported that AI automation can slow human work by 17.7 percent once verification time is counted. That research, which compared 48 workers against four agent frameworks across 16 tasks, was not produced by a company selling the tools. Its speed findings broadly echo the Perplexity paper. Its verdict on quality does not.

The wider market has hedged its bets too. Gartner has predicted that more than 40 percent of agentic AI projects will be cancelled by the end of 2027 because of cost and unclear value, and ad tech is bracing for hallucination risk as AI agents move into real-time bidding, where a single decimal error carries immediate financial exposure. These are not reasons to dismiss the study. They are reasons to treat a single favourable paper as one input among many.

The record the study does not mention

A study praising the satisfaction and capability of Perplexity Computer arrives against a documented history of disputes over how Perplexity's products gather and reproduce content. That history is relevant precisely because the qualities the paper celebrates, speed and autonomous data gathering, are the same qualities at issue in the litigation.

Cloudflare documented what it called stealth crawling by Perplexity, reporting that the company obscured its crawler identity and impersonated a standard Chrome browser to reach pages that had blocked it through robots.txt, with the infrastructure firm counting 20 to 25 million daily requests from the declared crawler and another 3 to 6 million from an undeclared one. Cloudflare's chief executive likened the conduct to that of state-linked hackers rather than a reputable AI company. Perplexity denied training models on the data and disputed the characterisation.
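The mechanics of the dispute are easier to follow with an example. The sketch below uses Python's standard-library robots.txt parser. The rules, the URL and the browser string are hypothetical, and it illustrates only how robots.txt rules key on a self-declared user agent, not what any particular crawler did.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/article"  # placeholder URL

# A crawler that announces itself under the blocked name is refused.
declared = parser.can_fetch("PerplexityBot", url)

# The same rules permit a request that presents a generic browser string.
browser_like = parser.can_fetch(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/124.0", url
)
```

Here `declared` is False and `browser_like` is True. Because robots.txt depends on crawlers identifying themselves honestly, a block aimed at a named crawler does not bind a request that presents a different identity, which is the gap Cloudflare's allegation turns on.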

The copyright claims have multiplied. Encyclopaedia Britannica and Merriam-Webster sued Perplexity in September 2025, alleging that its PerplexityBot scraped their sites and that its retrieval system reproduced copyrighted material. Reddit filed a separate federal suit in October 2025 accusing Perplexity and several data brokers of circumventing technical controls. In the most pointed example for this study, a CNN lawsuit accuses Perplexity of copying 17,000 works, with the complaint alleging that the Comet browser assistant reproduced verbatim text from a paywalled article. Comet is not one of the two products the paper compares, but it comes from the same company and relies on the same kind of autonomous content gathering the paper celebrates. In that complaint, it is accused of lifting content it should not have been able to read.

The autonomy itself has drawn legal limits. Amazon sued Perplexity in November 2025 over Comet agents accessing its marketplace while presenting a Chrome user-agent, and a federal court blocked Comet's agents from Amazon's password-protected accounts in March 2026. Perplexity characterised the action as an attempt to block innovation and appealed. A coalition of publishers then backed Amazon in the appeal, warning that agent spoofing corrupts advertising metrics and undermines publisher revenue. On the security side, Comet faced documented prompt-injection vulnerabilities, in which malicious page content could hijack an assistant holding broad access to a user's accounts. A separate class action alleging Perplexity routed chat data to Meta and Google was voluntarily dismissed in May 2026, leaving the underlying privacy questions unresolved.

These are allegations and disputes, several contested by Perplexity and at least one already dropped. They do not establish that the study's numbers are false. They do establish that the company has a direct commercial interest in a flattering account of the very products under scrutiny, and that independent parties have repeatedly challenged how those products operate.

Why this matters for the marketing community

Vendor-authored efficiency figures rarely stay in academic papers. They migrate into pitch decks, board memos and budget cases. The marketing industry has spent a year absorbing agent products from the largest platforms: McKinsey named agentic AI the most significant emerging trend for marketing, Adobe shipped an Experience Platform Agent Orchestrator, and Amazon launched an Ads Agent for campaign management. A clean claim that agents cut task cost by 94 percent is exactly the sort of figure that lands in those discussions, and exactly the sort that benefits from a note on where it came from.

The publisher angle sharpens the point. For ad-funded media, Perplexity's contested crawling and agent behaviour are not abstract. The publishers backing Amazon argued that AI agents disguising themselves as human visitors distort the traffic and engagement metrics that advertising depends on. A paper celebrating an agent's ability to gather and act on web content, written by that agent's maker, sits uneasily beside a separate body of reporting in which the same content-gathering is the subject of court filings.

Industry bodies have tried to hold both ideas at once. IAB Spain placed agentic AI at the centre of its 2026 roadmap while insisting human supervision remains critical. That is roughly the posture the evidence supports. Agents are fast, and the speed gains appear real across independent and vendor studies alike. The claims about quality, scope and cost are softer, rest partly on proxies, and in this instance come from a source with something to sell. Read that way, the Harvard and Perplexity paper is a data point worth knowing, not a benchmark to plan around.

Timeline

2022: Perplexity launches Perplexity Search, the answer engine used as the study's baseline.
August 2024: Cloudflare documents what it describes as Perplexity stealth crawling that obscured the crawler's identity to bypass site blocks.
July 5, 2025: Gartner warns that more than 40 percent of agentic AI projects may be cancelled by the end of 2027.
July 27, 2025: McKinsey identifies agentic AI as the most significant emerging trend for marketing.
September 10, 2025: Encyclopaedia Britannica and Merriam-Webster sue Perplexity for copyright infringement.
October 22, 2025: Reddit sues Perplexity and several data brokers over circumventing technical controls.
October 25, 2025: Perplexity's Comet browser faces documented prompt-injection vulnerabilities.
November 9, 2025: Amazon sues Perplexity over Comet agents accessing its marketplace.
November 10, 2025: A Carnegie Mellon and Stanford study reports agents working far faster but with quality gaps.
February 25, 2026: Perplexity Computer is released.
February 27 to May 27, 2026: the study's 90-day observation window.
March 9, 2026: a federal court blocks Comet's agents from Amazon's password-protected accounts.
April 25, 2026: PPC Land documents how ad tech braces for AI agents amid hallucination risk in real-time bidding.
April 29, 2026: publishers file a brief backing Amazon, warning agent spoofing corrupts ad metrics.
May 1, 2026: a privacy class action against Perplexity is voluntarily dismissed with claims unresolved.
Early June 2026: CNN sues Perplexity over the alleged copying of 17,000 works.
June 5, 2026: the study is posted to arXiv as preprint 2606.07489.
June 8, 2026: the paper "How AI Agents Reshape Knowledge Work" is published.

Summary

Who: Jeremy Yang of Harvard University and three Perplexity employees, Kate Zyskowski, Noah Yonack and Jerry Ma, authored the study using Perplexity's own production data.

What: A non-peer-reviewed preprint reporting that Perplexity's autonomous agent outperforms its conversational assistant, with 26 minutes of autonomous work per session against 33 seconds, an 87 percent cut in task time, a 94 percent cut in cost, and 55 percent lower user dissatisfaction. The quality figure rests on a behavioural proxy rather than an independent accuracy assessment, the efficiency figures rely on modelled assumptions, and the only products compared are Perplexity's own.

When: The paper is dated June 8, 2026 and was posted to arXiv as a preprint on June 5, with data spanning February 27 to May 27, 2026, the period following the February 25 release of Perplexity Computer.

Where: The analysis was conducted inside Perplexity's product ecosystem, using United States Bureau of Labor Statistics wage data and the Department of Labor's O*NET activity taxonomy.

Why: The findings matter to marketing because agent vendors and platforms are circulating efficiency claims as buying arguments, and marketing and sales ranked among the busiest uses in the data. The relevance is heightened, and complicated, by a documented record in which Perplexity faces copyright suits from CNN, Reddit and Encyclopaedia Britannica, a court order limiting its Comet agent on Amazon, publisher warnings that agent spoofing corrupts ad metrics, security flaws in Comet, and an unresolved privacy case, all of which bear on how much weight a self-authored study of the same products should carry.

Luis Rijo