A paper published on April 21, 2026, on arXiv by researchers from Washington University in St. Louis and the University of California, Los Angeles, presents what is described as one of the first large-scale empirical studies of behavioral transfer between humans and their autonomous AI agents in a natural deployment setting. The findings carry direct implications for anyone deploying AI agents in marketing workflows, advertising platforms, or any context where an agent interacts publicly on behalf of a person or brand.

The paper, titled "Behavioral Transfer in AI Agents: Evidence and Privacy Implications," uses data from Moltbook, a social media platform launched on January 28, 2026, where autonomous AI agents interact with one another. The platform is built on OpenClaw, an open-source framework that allows users to deploy AI agents locally, powered by large language model APIs. A distinguishing feature of the platform is that each agent is publicly linked to its human owner's Twitter/X account, creating a rare empirical opportunity: researchers can compare what the agent says publicly with what its owner says on an independent platform.

The dataset and methodology

The study's core dataset consists of 10,659 matched human-agent pairs. As of February 2, 2026, the Moltbook platform listed approximately 1.6 million registered AI agents. Most were dormant. The researchers collected data on February 2, 2026, and February 7, 2026, retrieving 86,497 posts authored by 20,894 distinct agents that had published at least one post. Of these, 17,745 agents - or 84.9% - had linked Twitter accounts. The final analysis sample of 10,659 pairs required that the agent had at least one post with content text, that the owner's Twitter profile was retrievable, and that the owner had at least one non-verification tweet.

The researchers constructed 43 text-based behavioral features organized across four dimensions: topics (6 features), values (7 features), affect (12 features), and style (18 features). Topics covered content domains such as crypto, AI, development, trading, philosophy, and meme-slang. Values included five moral foundations measured using keyword dictionaries adapted from established social science frameworks, plus two political ideology measures. Affect spanned seven sentiment features measured using the VADER lexicon and five discrete emotion categories. Style covered five lexical complexity measures, eight communication-style indicators, and five pronoun-use features. Each feature was computed separately for the human's Twitter posts and the agent's Moltbook posts, then compared at the pair level using Spearman rank correlations.
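The per-feature pipeline described above can be sketched in a few lines: compute one behavioral feature for each side of every matched pair, then take the Spearman rank correlation of the human column against the agent column across pairs. The sketch below uses a capitalization-ratio feature and a stdlib Spearman implementation; the three matched pairs are invented for illustration, not drawn from the paper's data.

```python
from statistics import mean

def rank(values):
    """Average ranks, 1-based (ties get the mean of their positions)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank over the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def cap_ratio(text):
    """Fraction of alphabetic characters that are uppercase."""
    letters = [c for c in text if c.isalpha()]
    return sum(c.isupper() for c in letters) / len(letters) if letters else 0.0

# Hypothetical matched pairs: (owner's tweets, agent's posts), concatenated.
pairs = [("GM! BUY THE DIP", "HUGE day for the agent economy"),
         ("quietly shipping", "small update, nothing major"),
         ("Launch Day!!", "Big News For Everyone")]
humans = [cap_ratio(h) for h, _ in pairs]
agents = [cap_ratio(a) for _, a in pairs]
print(round(spearman(humans, agents), 3))  # → 0.5
```

In the study this comparison is repeated for each of the 43 features, with significance corrected for multiple testing.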

Behavioral transfer is real and statistically robust

The headline result is stark. Of the 43 behavioral features tested, 37 - or 86% - showed statistically significant positive correlation between the human owner and their agent, after correction for multiple testing. That result held across all four behavioral dimensions. Transfer was not confined to one or two obvious dimensions. It appeared in topics, in moral values, in emotional tone, and in the fine-grained stylistic habits of how people write.

The strongest individual effect was found in capitalization ratio - the fraction of characters that are uppercase - which showed a Spearman correlation of 0.174. Crypto topic intensity followed at 0.166, negative sentiment proportion at 0.153, use of third-person pronouns at 0.142, and average text length at 0.139. Even more granular features such as formality index (0.085), emoji rate (0.068), and exclamation rate (0.038) showed significant transfer.

These are modest absolute values by some standards. The researchers note that prior work in computational linguistics finds that the same individual's stylistic behavior is consistent across platforms at roughly 0.10 to 0.15 correlation. The human-to-agent correlations fall in a comparable range - notable given that the two platforms have entirely different formats, norms, and audiences. Twitter's short-form constraint produces a human median of 106 characters per post; Moltbook's forum-style interface yields an agent median of 471 characters. Platform format is not the source of the similarity.

A series of robustness checks confirmed the pattern holds under scrutiny. Permutation tests, which randomly reassign agents to humans 10,000 times to build a null distribution, confirmed that all 37 significant features exceeded the baseline, ruling out the possibility that platform-level topic overlap is responsible for the correlations. A separate test using Sentence-BERT neural text embeddings - a method that captures overall semantic similarity without any predefined keyword lists - found that matched human-agent pairs showed a mean cosine similarity of 0.288, compared to 0.205 for randomly paired humans and agents. The difference is statistically significant at p less than 0.0001, with a Cohen's d of 96.2.
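The logic of the permutation test is simple to sketch: hold the humans fixed, repeatedly shuffle which agent is paired with which human, recompute the similarity statistic under each shuffle, and ask how often the shuffled value matches or beats the observed one. The sketch below uses a toy agreement statistic and invented feature values; the paper's version does this per feature with 10,000 reassignments.

```python
import random

def permutation_pvalue(human_vals, agent_vals, stat, n_perm=2000, seed=0):
    """One-sided permutation test: randomly reassign agents to humans and
    count how often the shuffled statistic meets or exceeds the observed."""
    rng = random.Random(seed)
    observed = stat(human_vals, agent_vals)
    exceed = 0
    shuffled = list(agent_vals)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if stat(human_vals, shuffled) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

def agreement(h, a):
    """Toy pair statistic: mean of (1 - absolute difference)."""
    return sum(1 - abs(x - y) for x, y in zip(h, a)) / len(h)

# Hypothetical feature values where matched pairs track each other closely.
humans = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
agents = [0.12, 0.18, 0.33, 0.41, 0.52, 0.58, 0.71, 0.79]
obs, p = permutation_pvalue(humans, agents, agreement)
print(obs > 0.9, p < 0.05)  # → True True
```

Because the null distribution is built from the same posts with only the pairing broken, anything driven purely by platform-level topic overlap would survive the shuffle; the matched-pair excess would not.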

The researchers also addressed the possibility that some agents are human-controlled "puppet" accounts rather than truly autonomous systems. Moltbook's posting patterns are consistent with autonomous operation: the normalized Shannon entropy of the hourly posting distribution is 0.984 out of a maximum of 1.0, and 26.9% of posts occur between midnight and 8AM Eastern Time, a pattern inconsistent with human-controlled activity. Applying two detection methods - one based on four temporal criteria and one based on inter-post interval variability - and removing suspected puppets (up to 20.1% of the sample) left 32 to 36 of the original 37 significant features intact, with mean changes in correlation magnitude below 0.009.
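The temporal check rests on a standard quantity: the Shannon entropy of the hour-of-day posting distribution, normalized by its maximum (log2 of 24), so that a perfectly uniform around-the-clock pattern scores 1.0. A minimal sketch, with invented posting histories:

```python
from collections import Counter
from math import log2

def normalized_hourly_entropy(hours):
    """Shannon entropy of the hour-of-day posting distribution,
    divided by log2(24) so uniform round-the-clock posting scores 1.0."""
    counts = Counter(hours)
    total = len(hours)
    h = -sum((c / total) * log2(c / total) for c in counts.values())
    return h / log2(24)

# A round-the-clock agent (posts in every hour) vs. a 9-to-5 human pattern.
always_on = list(range(24)) * 10
office_hours = list(range(9, 17)) * 30
print(round(normalized_hourly_entropy(always_on), 3),
      round(normalized_hourly_entropy(office_hours), 3))  # → 1.0 0.654
```

The platform-wide value of 0.984 reported in the paper sits near the uniform extreme, which is why the researchers read it as evidence of autonomous rather than human-timed posting.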

How the transfer happens

The paper examines four possible channels through which behavioral similarity could arise: explicit bio-based configuration written by the owner; workspace configuration files assembled into the agent's system prompt; platform-mediated injection where the platform automatically scrapes owner Twitter content; and accumulated owner-agent interaction over time.

The bio-based explanation is tested directly. Approximately half the agents had no configured bio. Restricting the analysis to this no-bio subset, 33 of the original 37 significant features remained significant, with median correlation 0.074 compared to 0.067 in the full sample. Bio-based configuration cannot explain the observed pattern.

A second test addresses dimension-specific workspace configuration. If owners separately configure their agents for topics, values, affect, and style, then agents that align with their owner on one dimension would not necessarily align on others. The data contradict this. For each of the 10,659 pairs, the researchers computed a dimension-specific transfer score using cosine similarity of z-normalized feature vectors within each behavioral dimension. All six pairwise correlations between these dimension-specific scores were significantly positive, with a mean Spearman correlation of 0.092. The strongest cross-dimension coherence was between affect and style (0.224), followed by values-affect (0.114). Cross-dimension coherence is the opposite of what targeted, dimension-specific configuration would predict.
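The dimension-specific transfer score described above can be sketched directly: z-normalize each feature across all pairs, then take the cosine similarity of the human and agent vectors within a dimension, pair by pair. The sketch below does this for a hypothetical three-feature affect dimension over three invented pairs.

```python
from statistics import mean, pstdev

def znorm_columns(matrix):
    """Z-score each feature (column) across all pairs (rows)."""
    cols = list(zip(*matrix))
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in matrix]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical affect features (e.g., positive, negative, anger rates)
# for three human-agent pairs; values are invented for illustration.
human_affect = [[0.6, 0.1, 0.05], [0.2, 0.5, 0.30], [0.4, 0.3, 0.10]]
agent_affect = [[0.5, 0.2, 0.08], [0.1, 0.6, 0.25], [0.5, 0.2, 0.12]]
hz, az = znorm_columns(human_affect), znorm_columns(agent_affect)
scores = [round(cosine(h, a), 3) for h, a in zip(hz, az)]
print(scores)
```

Computing such a score separately for topics, values, affect, and style yields four numbers per pair; the paper's test then correlates those scores with one another across the 10,659 pairs.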

Platform-mediated injection - the possibility that Moltbook's backend automatically feeds Twitter content into agent prompts - appears inconsistent with the platform's published privacy policy, according to the paper. Reading a user's tweet history via the Twitter API requires explicit OAuth scope permissions that are neither disclosed in Moltbook's privacy policy nor consistent with the scale of 1.6 million registered accounts.

The channel most consistent with all the evidence, according to the researchers, is accumulated owner-agent interaction. Through ongoing conversations with their owners, agents are exposed to owner-specific language, task contexts, feedback, and files from the owner's computing environment. This accumulated exposure can embed owner-specific behavioral characteristics across dimensions that owners never explicitly configured. The OECD's February 2026 working paper on agentic AI similarly identified persistent memory profiles as a structural property of agentic systems, and noted that security and privacy concerns were shared by more than 80% of the developer population surveyed.

Privacy disclosure: 34.6% of agents, sensitive information surfaced

The most consequential finding concerns privacy. The researchers used an LLM-as-judge approach, submitting each of the 44,588 agent posts individually to Claude Haiku via the Anthropic Batches API, instructing the model to act as a privacy auditor and identify any disclosures of personally identifiable or sensitive information about the human owner. The taxonomy covered six categories across three tiers: highly sensitive health and financial data (Tier 1); moderately sensitive location and occupational details (Tier 2); and lower-sensitivity behavioral patterns and relational connections (Tier 3).

The classifier initially flagged 9,601 posts. Of those, 6,220 received high-confidence ratings. A human validation exercise on a stratified sample of 600 flagged posts found a false positive rate of 12.0% for high-confidence predictions and 41.1% for medium-confidence ones. A separate validation of 361 non-disclosure posts found a false negative rate of 1.7%. The main analysis uses only the high-confidence classifications, yielding 6,220 posts - or 14.0% of all posts - identified as containing owner-revealing content.
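Taking the reported counts and error rates at face value, a back-of-envelope adjustment shows how stable the headline rate is under the measured classification error. This is an illustrative calculation, not a figure from the paper:

```python
# Figures reported from the validation exercise.
total_posts = 44588    # agent posts audited
flagged_high = 6220    # high-confidence disclosure flags
fp_rate = 0.120        # false positives among high-confidence flags
fn_rate = 0.017        # false negatives among non-flagged posts

true_positives = flagged_high * (1 - fp_rate)      # flags likely to survive
missed = (total_posts - flagged_high) * fn_rate    # disclosures likely missed
adjusted = (true_positives + missed) / total_posts
print(f"error-adjusted disclosure rate: {adjusted:.1%}")  # → 13.7%
```

The false positives removed and the false negatives recovered roughly offset, leaving the adjusted rate close to the raw 14.0%.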

At the agent level, 3,685 of the 10,659 agents - or 34.6% - had at least one detectable disclosure event. The most prevalent category was occupational information, appearing in 75.5% of disclosing posts. Location appeared in 27.2%, relational connections in 12.9%, financial details in 12.2%, behavioral patterns in 10.4%, and health information in 2.4%.

The verbatim examples in the paper are striking. One agent disclosed its owner's severe hemophilia, ADHD, and a completed fifteen-year benzodiazepine taper. Another described the owner as facing court seizure of bank accounts, inheritance disputes, and difficulty feeding his family. A third revealed an owner's daily 06:25 CET morning schedule, including children's weather checks and the name of the owner's ski-tracking app. None of these details appeared in the owners' public Twitter histories.

This is a privacy mechanism distinct from previously documented channels. Prior research on AI privacy risks has focused on adversarial extraction of training data, inference of attributes from linguistic patterns at scale, or cross-platform data sharing. The mechanism documented here requires no external actor and operates through ordinary use. Owner-specific context accumulated by agents through daily tasks - browsing files, drafting communications, completing work-related activities - may both generate behavioral transfer and result in that context surfacing in public discourse. Spain's data protection authority mapped precisely these structural privacy risks in a March 2026 guidance document, noting that agentic AI systems "operate autonomously, access multiple internal and external services simultaneously, build persistent memory profiles of users, and execute consequential actions without mandatory human checkpoints."

Transfer predicts disclosure risk

The relationship between behavioral transfer and privacy disclosure is quantified directly. The researchers construct a holistic transfer score for each pair using cosine similarity between 43-dimensional z-normalized behavioral vectors. In logistic regressions predicting whether an agent produced at least one privacy disclosure post, a one-standard-deviation increase in holistic transfer score was associated with a 1.32-percentage-point higher probability of disclosure (p = 0.003), with comprehensive controls applied on both the human and agent side.
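The quantity being reported is an average marginal effect: fit a logistic regression of disclosure on the transfer score, then average, over the observed sample, the change in predicted probability from a one-standard-deviation increase in the score. The sketch below simulates data with an invented coefficient (0.3, not the paper's estimate) and fits by plain gradient descent, without the paper's controls.

```python
from math import exp
import random

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def fit_logit(x, y, lr=0.5, steps=2000):
    """Full-batch gradient descent for logit(P(y=1)) = b0 + b1*x."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            err = sigmoid(b0 + b1 * xi) - yi
            g0 += err
            g1 += err * xi
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

def ame_one_sd(x, b0, b1):
    """Average marginal effect of a one-SD increase in x: the mean change
    in predicted probability across observations."""
    m = sum(x) / len(x)
    sd = (sum((v - m) ** 2 for v in x) / len(x)) ** 0.5
    return sum(sigmoid(b0 + b1 * (v + sd)) - sigmoid(b0 + b1 * v)
               for v in x) / len(x)

# Simulated pairs: disclosure probability rises with the transfer score.
rng = random.Random(42)
transfer = [rng.gauss(0.0, 1.0) for _ in range(800)]
disclosed = [1 if rng.random() < sigmoid(-1.5 + 0.3 * t) else 0
             for t in transfer]
b0, b1 = fit_logit(transfer, disclosed)
ame = ame_one_sd(transfer, b0, b1)
print(f"one-SD effect on disclosure probability: {ame:+.3f}")
```

The paper's 1.32-percentage-point figure corresponds to an `ame` of about 0.013 in these units, estimated with full controls on both the human and agent side.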

The effect strengthens as measurement quality improves. Restricting to pairs where the owner had at least ten tweets - the maximum collected - the marginal effect increases to 3.40 percentage points (p less than 0.001). Restricting to agents with at least three posts, the effect reaches 2.67 percentage points (p less than 0.001). A simulation-based sensitivity analysis propagating empirically estimated classification error rates through 1,000 Monte Carlo iterations found the coefficient on holistic transfer was positive in all 1,000 iterations and statistically significant at p less than 0.05 in 96.8% of them, with a mean average marginal effect of 1.27 percentage points and 95% simulation interval of 0.84 to 1.76 percentage points.

Context for the marketing and ad tech industry

The study's empirical setting - a crypto and AI-heavy early-adopter platform - is acknowledged as a limitation. Whether similar patterns arise across more heterogeneous populations, including in marketing and advertising contexts, remains an open question. Still, the structural finding applies to any agentic deployment where an agent accumulates owner-specific context through daily use. As the UK's four-regulator foresight paper on agentic AI noted on March 31, 2026, agents executing purchases or managing communications may simultaneously pull personal data from several sources and execute actions without the user experiencing each step as a separate decision. That structural feature is not platform-specific.

The ad tech industry has been adopting autonomous agents on a compressed timeline. IAB Tech Lab published its agentic AI standards roadmap and launched boot camps beginning February 12, 2026, and The Trade Desk launched its first in-platform AI agent, Koa Agents, on April 22, 2026, with Stagwell as pilot partner. In that context, a study showing that agents deployed for personal tasks can expose sensitive owner information in public discourse introduces a governance question that the industry's current standards activity has not directly addressed.

The paper suggests several design responses, framed as illustrative rather than prescriptive. Agent frameworks could implement transfer-aware content screening, applying additional checks before publication from agents whose behavioral profiles closely mirror their owners. Platforms could offer transparency tools showing owners what behavioral profile their agent has internalized. Agent architectures could separate owner context into tiered memory, where explicitly private information is available for task completion but excluded from public-facing outputs. Post-hoc auditing tools could surface samples of an agent's public posts for owner review. None of these directions has been adopted in current commercial agentic platforms as far as public documentation indicates.

Summary

Who: Researchers Shilei Luo and Zhiqi Zhang (Washington University in St. Louis), Hengchen Dai (UCLA), and Dennis Zhang (Washington University in St. Louis)

What: A study using 10,659 matched human-agent pairs from Moltbook demonstrating that AI agents systematically mirror their owners' behavioral characteristics across 43 features spanning topics, values, affect, and style; and that 34.6% of agents disclosed sensitive personal information about their owners in public posts, with disclosure risk significantly predicted by the degree of behavioral transfer

When: Data collected February 2-7, 2026; paper submitted to arXiv on April 21, 2026

Where: The empirical setting is Moltbook, a social platform for autonomous AI agents launched January 28, 2026, built on the OpenClaw framework; the study was conducted at Washington University in St. Louis and UCLA

Why: As AI agents accumulate owner-specific context through daily use - accessing files, drafting communications, completing tasks - they embed behavioral characteristics that go beyond explicit configuration, creating a systematic mechanism through which private human context can surface in public agent output, with implications for platform design, regulatory compliance, and the governance of agentic systems in advertising and marketing
