Meta to use public posts to train AI models, but users can opt out
Meta announced plans to harvest public user content for AI training, raising privacy concerns.

Meta this month revealed a significant update to its privacy policy that will allow the company to use public posts and comments from users aged 18 or older to train its AI models. The announcement, made on April 7, 2025, details how the company plans to leverage public user content to develop and improve its generative AI systems. Users have until May 27, 2025, when the policy takes effect, to object to the use of their information.
The updated privacy policy specifies that Meta will collect "public information such as public posts and comments from accounts of people aged 18 or older and your interactions with AI at Meta features." According to the policy documentation, this information will be used "on the basis of legitimate interests to develop and improve generative AI models for AI at Meta."
This change represents a substantial expansion in how Meta utilizes user-generated content. While the company has long collected user data for advertising purposes, this new approach explicitly directs that data toward developing artificial intelligence models that power features like Meta AI and AI creative tools.
Travel blogger Nate Hake expressed alarm about the policy on X on April 19. "Sadly almost every platform is doing the same thing. And since the web is mostly walled gardens, it's getting to the point where the only option to protect our rights is to just not use the Internet at all," Hake stated in response to concerns about similar practices on other platforms.
The policy change arrives amid growing competition in the generative AI sector, with companies racing to develop increasingly sophisticated AI systems. These models require enormous datasets to perform effectively, making user-generated content an attractive resource for technology companies.
What content is being used?
Meta's policy distinguishes between different types of content. According to the documentation, "Some of your information and activity are always public." This includes names, Facebook and Instagram usernames, profile pictures, and activity in public groups, on Facebook Pages, and in channels, as well as activity on public content such as comments, ratings, or reviews on Marketplace or on public Instagram accounts.
The policy further clarifies: "When content is public, it can be seen by anyone on or across our Products and in some cases, off our Products, even if they don't have an account." This definition creates a broad scope of content that Meta can potentially use for AI training.
In addition to content explicitly set to public, Meta's definition includes avatars and "other content that you can choose to set to Public, such as posts, photos and videos that you post to your profile, stories or reels."
The expanded data collection specifically excludes "private messages with friends and family" unless users or someone else in the chat shares those messages with Meta's AIs. The company also states that it is not using public information from accounts of EU users under the age of 18, unless that information is shared as part of interactions with AI at Meta features.
Opt-out mechanism implemented amid criticism
Users concerned about their content being used for AI training can object to the use of their information. Meta has created a form through which users can register objections. The form states: "You have the right to object to the use of your information for these purposes. If you submit an objection, we'll send an email confirming that we won't use your interactions with AI at Meta features or your public information from Meta Products for future development and improvement of generative AI models for AI at Meta."
However, some users have reported difficulties with the opt-out process. On X, user Gisele Navarro noted, "I also tried to opt out but got rejected," suggesting potential implementation issues with the objection system. Another user, Arun Chandra, claimed "Meta sends an email with a broken link to pretend they're complying, dodging blame."
The objection form requires users to provide their country of residence, name, and email address. For individuals who have concerns about their personal information appearing in AI responses, additional documentation is required, including screenshots showing the problematic AI output.
While Meta frames the opt-out mechanism as respecting user choice, critics argue that requiring users to take active steps to protect their content places an undue burden on individuals rather than implementing privacy by default.
Legal basis and international implications
Meta's approach varies by region. In documentation, the company states: "In the European region and the United Kingdom, we rely on the basis of legitimate interests to collect and process any personal information included in the publicly available and licensed sources, as well as public information people have shared on Meta Products and interactions with AI at Meta features, to develop and improve AI at Meta."
For European users, Meta Platforms Ireland Limited is identified as the data controller responsible for personal information. The policy notes that users have "the right to lodge a complaint with your local supervisory authority or our lead supervisory authority – the Irish Data Protection Commission."
Outside the European region, including the UK, Meta Platforms Inc. assumes the role of data controller. The policy acknowledges that information may be subject to global transfers "both internally within Meta Companies, and externally with our partners, measurement vendors, service providers and other third parties."
This regional variation highlights how different privacy regulations shape data practices across jurisdictions, with European users benefiting from additional protections under frameworks like the General Data Protection Regulation.
Technical details of AI training process
Meta's documentation provides insight into how collected data is used for AI development. The company explains that generative AI models are "trained on billions of pieces of information from different types of data such as text, images and audio." By studying this information, the models "can learn things such as the relationship and associations between different types of content."
The company distinguishes between models that generate text (large language models or LLMs) and those that generate images. Text models are "trained on massive amounts of text to help them predict typical sequences, such as those found in our everyday language," while image models "are trained by looking at billions of images and their text captions."
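Meta's documentation describes this only at a conceptual level. As a rough illustration of what "predicting typical sequences" means, and not a depiction of Meta's actual training pipeline, the following Python sketch learns simple word-to-word associations from a small, made-up corpus and then predicts the word most likely to follow a given word. Large language models do something far more sophisticated with neural networks and billions of documents, but the basic idea of learning statistical relationships from text is the same.

```python
from collections import defaultdict, Counter

# Toy corpus standing in for "massive amounts of text" (illustrative only).
corpus = [
    "public posts can be seen by anyone",
    "public posts train the model",
    "the model predicts the next word",
]

# Count which word follows which (a simple bigram model).
follow_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1

def predict_next(word):
    """Return the word most often observed after `word` in the corpus."""
    if word not in follow_counts:
        return None
    return follow_counts[word].most_common(1)[0][0]

print(predict_next("public"))  # -> "posts"
print(predict_next("the"))     # -> "model"
```

In this sketch, more public text in the corpus means more reliable associations, which is why user-generated content is valuable training data for companies building these systems.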
When third-party content is collected from public sources, Meta states it doesn't "specifically link this data to any Meta account." However, the company acknowledges that even people who don't use Meta products may have their information processed: "Even if you don't use our Products or have an account, we may still process information about you to develop and improve AI at Meta."
This processing occurs when people appear in images shared on Meta products or when someone mentions information about non-users in posts or captions. Meta notes this "could include information about people who are under 18 years old" if they appear in content uploaded by others.