Developer revives RSS with AI while Google targets syndication infrastructure
As Google's proposed XSLT removal threatens RSS feed display, Evan Schwartz launches Scour, a machine learning-powered service that uses semantic matching to filter 12,903 sources.

While Google systematically dismantles RSS infrastructure through XSLT removal proposals and Mozilla deprecates feed support in browsers, software engineer Evan Schwartz is moving in the opposite direction. Three months ago, Schwartz announced Scour's public launch on Reddit's r/rss community, introducing an artificial intelligence-powered RSS aggregation service that directly counters the tech industry's abandonment of syndication formats.
The timing proves significant as Google's latest XSLT removal proposal specifically targets RSS feed display capabilities, continuing a documented pattern that began with Google Reader's discontinuation in 2013. According to Schwartz's announcement on Reddit, "Scour searches through noisy feeds for content matching your interests. You can sign up for free, add topics you're interested in (anything from 'RSS' to 'Pourover coffee brewing techniques'), and import feeds via OPML or scour the 3,200+ feeds that have already been added."
The service has expanded significantly since its initial introduction, now monitoring 12,903 sources and processing over 868,105 posts monthly according to data displayed on the platform's homepage. The system employs semantic matching technology rather than traditional keyword filtering, representing a departure from conventional RSS reader approaches that rely primarily on manual feed curation.
Binary vector embeddings power content filtering
Scour utilizes Mixedbread's mxbai-embed-large-v1 embedding model to analyze content semantically rather than through keyword matching. According to Schwartz's technical documentation, "Scour checks feeds for new content every ~15 minutes. It runs the text of each post through an embedding model, a text quality model, and language classifier. When you load your feed, Scour compares the post embedding to each of your interests to find relevant content (using Hamming Distance between binary vector embeddings)."
The binary quantization approach reduces storage requirements and accelerates similarity calculations compared to traditional vector embeddings. Schwartz explained in user discussions that "some models, including the one I'm using, have been specifically trained to try to minimize the accuracy hit" from binary quantization. The system processes content through multiple filtering layers, including quality assessment and language identification, before presenting results to users.
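A minimal sketch of that comparison, written in Rust (the language Scour is built in) with toy dimensions rather than Scour's actual values: a float embedding is binarized by sign, and relevance is scored by the Hamming distance between the packed bit vectors.

```rust
/// Quantize a float embedding to a binary vector: one bit per dimension,
/// set when the component is positive (sign binarization).
fn binarize(embedding: &[f32]) -> Vec<u64> {
    embedding
        .chunks(64)
        .map(|chunk| {
            chunk.iter().enumerate().fold(0u64, |bits, (i, &v)| {
                if v > 0.0 { bits | (1u64 << i) } else { bits }
            })
        })
        .collect()
}

/// Hamming distance: XOR the packed words and count differing bits.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

fn main() {
    // Toy 8-dimension "embeddings"; real models emit hundreds of dimensions.
    let post = binarize(&[0.3, -0.1, 0.7, 0.2, -0.4, 0.9, -0.2, 0.1]);
    let interest = binarize(&[0.2, -0.3, 0.6, -0.1, -0.5, 0.8, -0.1, 0.2]);
    // Lower distance = closer in meaning; the relevance cutoff is tuned, not fixed.
    println!("hamming distance: {}", hamming(&post, &interest));
}
```

Because the XOR-and-popcount loop operates on packed 64-bit words, each comparison touches far less memory than a float dot product, which is where the speed advantage Schwartz describes comes from.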
The service automatically deduplicates content appearing across multiple feeds and provides cross-platform comment aggregation. Users can locate discussion threads for specific articles across Reddit, Hacker News, Lobsters, and Bluesky through integrated discovery features. According to the platform documentation, "Under each item, you can find the links to comment threads on Reddit, HN, Lobsters, Bluesky, etc, if it's been posted there."
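The announcement does not detail the deduplication logic, but the general technique is straightforward. The sketch below keys posts on a normalized link; the normalization heuristics are assumptions for illustration, not Scour's actual rules.

```rust
use std::collections::HashSet;

/// Normalize a link so syndicated copies of the same post collide:
/// strip scheme, "www.", query strings, and trailing slashes.
/// (Illustrative heuristics; a production system would do more.)
fn canonical(url: &str) -> String {
    let mut u = url
        .trim_start_matches("https://")
        .trim_start_matches("http://")
        .trim_start_matches("www.");
    if let Some(idx) = u.find('?') {
        u = &u[..idx];
    }
    u.trim_end_matches('/').to_lowercase()
}

fn main() {
    let mut seen = HashSet::new();
    let posts = [
        "https://example.com/post/rss-revival",
        "http://www.example.com/post/rss-revival/?utm_source=feed",
        "https://example.com/post/another-story",
    ];
    for p in posts {
        // Only the first occurrence of each canonical URL survives.
        if seen.insert(canonical(p)) {
            println!("keep: {p}");
        }
    }
}
```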
Growth trajectory accelerates through community feedback
User engagement data reveals substantial growth patterns since the initial announcement. According to Schwartz's May 2025 update, the service "scoured 458,947 posts from 4,323 feeds" that month, compared to 281,968 posts from 3,034 sources in April and 276,710 posts from 2,872 sources in March. This represents consistent month-over-month increases in both content processing volume and source diversity.
Community feedback has driven feature development priorities according to update documentation. Reddit user Vahe Hovhannisyan suggested infinite scroll functionality, which Schwartz implemented in May alongside emoji-based interest identification. Another user requested dark mode toggles, leading to customizable theme controls within weeks of the suggestion.
The platform introduced email digest functionality in response to user requests, delivering weekly summaries every Friday. According to Schwartz's April update, "I'm almost done building out automatic email updates so you'll hopefully have the first personalized weekly roundup in your inboxes this Friday." The digests compile the first 10 results from weekly feed views, providing users with curated content summaries.
User testimonials highlight the discovery capabilities that distinguish Scour from traditional aggregators. One user described the experience as "not a feed reader, it's a serendipity machine," while another compared functionality to "a grep, but smart, for my feeds." A third user noted finding "a new JavaScript state management library on GitHub" and feeling "right on the pulse of things" for the first time since StumbleUpon's discontinuation.
Feed ecosystem integration addresses RSS reader compatibility
The platform maintains comprehensive RSS ecosystem compatibility through multiple export and syndication options. Every user's personalized feed generates RSS, Atom, and JSON formats for consumption through external readers. Individual interest categories create separate feeds, enabling granular content management across reading applications.
OPML import and export functionality facilitates migration from existing RSS readers. Users can import subscription lists from applications like Inoreader or export Scour-generated interest feeds for use in other platforms. According to platform documentation, "You can export all of your interest feeds as an OPML file on your Interests page or via this link."
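OPML itself is a small XML dialect, which is why subscription lists travel well between readers. A minimal export of interest feeds might look like the following sketch; the feed titles and URLs are hypothetical placeholders, not Scour's real export format.

```rust
/// Emit a minimal OPML 2.0 document from (title, feed URL) pairs.
/// Real exports carry more attributes (text, htmlUrl, categories),
/// and this sketch skips XML escaping for brevity.
fn to_opml(feeds: &[(&str, &str)]) -> String {
    let mut out = String::from(
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n\
         <opml version=\"2.0\">\n  <head><title>Interest feeds</title></head>\n  <body>\n",
    );
    for (title, url) in feeds {
        out.push_str(&format!(
            "    <outline type=\"rss\" title=\"{title}\" xmlUrl=\"{url}\"/>\n"
        ));
    }
    out.push_str("  </body>\n</opml>\n");
    out
}

fn main() {
    // Hypothetical per-interest feed URLs of the kind Scour exposes.
    let feeds = [
        ("RSS", "https://example.com/interests/rss.xml"),
        ("Pourover coffee", "https://example.com/interests/coffee.xml"),
    ];
    print!("{}", to_opml(&feeds));
}
```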
The system automatically discovers RSS feeds from blog URLs, eliminating manual feed location requirements. Schwartz noted that "When you add a feed by URL on Scour, it'll automatically search all the common paths for the feed. It can even treat some blogs that don't have feeds as if they did." This functionality extends RSS aggregation to publications that haven't implemented standard syndication formats.
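Schwartz does not enumerate which paths Scour probes, but the technique can be sketched as requesting a handful of conventional feed locations until one responds. The path list below is a guess at common conventions, not Scour's actual probe order.

```rust
// Sketch of path-based feed discovery. Requires the `reqwest` crate
// with its "blocking" feature enabled in Cargo.toml.
fn discover_feed(site: &str) -> Option<String> {
    const COMMON_PATHS: [&str; 5] = ["/feed", "/rss", "/rss.xml", "/atom.xml", "/index.xml"];
    for path in COMMON_PATHS {
        let candidate = format!("{}{}", site.trim_end_matches('/'), path);
        // A HEAD request would be lighter; GET keeps the sketch simple.
        if let Ok(resp) = reqwest::blocking::get(&candidate) {
            if resp.status().is_success() {
                return Some(candidate);
            }
        }
    }
    None
}

fn main() {
    match discover_feed("https://example.com") {
        Some(url) => println!("found feed: {url}"),
        None => println!("no feed at common paths"),
    }
}
```

A production implementation would typically also parse the page's `<link rel="alternate">` tags, the standard autodiscovery mechanism most feed readers rely on.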
Cross-platform content discovery features integrate social media and traditional web publishing. The service identifies when articles appear across Reddit, Hacker News, and other platforms, providing users with comprehensive discussion context beyond original publication venues.
Monetization strategy balances free access with premium features
The service is currently free to use while Schwartz develops paid enhancement options. According to his March update, "Everything Scour currently does is free and I plan to keep it that way. I am working on this full time and hoping to make a small business of it, so I'll be adding some additional paid features."
The monetization approach follows models established by services like BearBlog, focusing on useful free functionality with premium enhancements for power users. Potential paid features include advanced filtering options, expanded historical content access, and enhanced personalization algorithms according to user feedback submissions.
Mixedbread provides embedding API access as a sponsorship arrangement, reducing operational costs for the core service. According to Schwartz's May documentation, "This month, I chatted with one of their co-founders, they liked Scour, and they offered to sponsor Scour's embeddings." The partnership demonstrates how AI infrastructure costs can be managed through strategic relationships rather than direct user fees.
The technical infrastructure uses the Rust programming language with what Schwartz describes as the "MASH" stack. Performance optimization keeps feed loading under 100 milliseconds for most queries, though "Hot mode" content filtering increases response times to approximately 250 milliseconds.
Marketing implications for content aggregation landscape
Scour's approach represents broader shifts in content discovery that affect marketing strategies across digital channels. The platform's emphasis on semantic relevance over source authority mirrors changes documented in AI-powered search systems, where content quality and topical alignment increasingly determine visibility.
The service's growth parallels concerns about RSS infrastructure degradation as major platforms reduce support for traditional syndication formats. Google's proposed XSLT removal and Mozilla's RSS deprecation create opportunities for independent aggregation services to address content discovery gaps.
Content creators benefit from increased visibility through semantic matching rather than keyword optimization. The platform's discovery mechanism identifies relevant content regardless of specific terminology, potentially surfacing materials that traditional search algorithms might overlook. This aligns with semantic analysis trends in SEO tooling that prioritize meaning over exact keyword matches.
Publishers gain audience reach through automated feed discovery and cross-platform content integration. The service's ability to identify and aggregate discussions across multiple platforms provides content creators with comprehensive engagement metrics beyond individual publication analytics.
Technical architecture enables rapid scaling
The binary vector approach provides computational advantages that support the service's growth trajectory. Processing over 868,000 posts monthly requires efficient similarity calculations, which binary quantization enables through reduced storage and accelerated distance computations. The 15-minute update frequency across 12,903 sources demonstrates the architecture's scalability.
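The arithmetic behind that advantage is simple to work through. Assuming 1,024-dimension vectors (the size mxbai-embed-large-v1 reports), a float32 embedding occupies 4 KB while its binary counterpart fits in 128 bytes, a 32x reduction that compounds across Scour's monthly volume:

```rust
fn main() {
    const DIMS: u64 = 1024;               // assumed 1,024-dimension embeddings
    const POSTS_PER_MONTH: u64 = 868_105; // figure from Scour's homepage

    let float32_bytes = DIMS * 4; // 4 bytes per component
    let binary_bytes = DIMS / 8;  // 1 bit per component
    println!(
        "per post:  {float32_bytes} B float32 vs {binary_bytes} B binary ({}x smaller)",
        float32_bytes / binary_bytes
    );
    println!(
        "per month: {:.1} GB float32 vs {:.1} GB binary",
        (POSTS_PER_MONTH * float32_bytes) as f64 / 1e9,
        (POSTS_PER_MONTH * binary_bytes) as f64 / 1e9
    );
}
```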
Content quality assessment operates through multiple model layers beyond semantic matching. The system evaluates text quality, language identification, and relevance scoring before presenting results to users. This multi-stage filtering addresses common RSS aggregation challenges including spam content, duplicate posts, and irrelevant materials.
The platform's recommendation engine suggests both new feeds and additional interest topics based on user engagement patterns. According to platform documentation, "Scour recommends feeds to you based on which ones have content related to your interests. It also suggests other topics you might be interested in." This recommendation system helps users discover content sources they might not locate through manual browsing.
Integration with existing RSS infrastructure maintains compatibility while extending functionality beyond traditional feed readers. The service operates as both a standalone platform and a feed generator for external applications, addressing diverse user preferences for content consumption methods.
Future development roadmap addresses user feedback
Community-driven feature development continues shaping platform capabilities. Recent implementations include interest-specific filtering, which allows users to view content related to individual topics rather than combined feeds. This granular control addresses user requests for more targeted content discovery options.
Email digest customization represents ongoing development priorities. Users have requested alternative delivery schedules and content formatting options for weekly summaries. The current Friday delivery schedule may expand to include daily or bi-weekly options based on user preferences and engagement metrics.
Advanced filtering capabilities could include negative topic filtering, enabling users to exclude specific subjects from their feeds. This enhancement would address scenarios where users appreciate most content from particular sources while avoiding specific coverage areas.
The platform's bookmarklet functionality enables quick feed addition from web browsers, streamlining the subscription process for new content sources. Future development may include browser extensions or mobile applications to expand accessibility across different usage contexts.
Timeline
- Three months ago: Evan Schwartz announces Scour on Reddit's r/rss community
- March 2025: Platform processes 276,710 posts from 2,872 sources, introduces likes/dislikes functionality
- April 2025: Service grows to 281,968 posts from 3,034 sources, adds "Hot Mode" and design refresh
- May 2025: Expansion to 458,947 posts from 4,323 feeds, launches email digests and infinite scroll
- August 2025: RSS integration gains attention as content fragmentation increases
- Present day: Platform monitors 12,903 sources, processing 868,105+ monthly posts
- Related development: Google's XSLT removal proposal threatens RSS infrastructure
PPC Land explains
RSS (Really Simple Syndication): RSS represents a web feed format that enables automatic content distribution from publishers to subscribers in a standardized, machine-readable format. Originally developed in the late 1990s, RSS became fundamental to blog ecosystems and news aggregation services, allowing users to subscribe to content updates without manually visiting websites. Scour's implementation extends traditional RSS functionality through AI-powered filtering and semantic analysis, addressing the format's limitations in high-volume content environments.
Semantic matching: Semantic matching technology analyzes content meaning and context rather than relying on exact keyword correspondence. Unlike traditional keyword-based filtering systems, semantic matching uses machine learning models to understand conceptual relationships between words and phrases. Scour employs this approach to identify relevant content even when articles use different terminology to discuss similar topics, significantly improving content discovery accuracy compared to conventional RSS readers.
Binary vector embeddings: Binary vector embeddings convert text content into numerical representations using only binary values (0 and 1) rather than continuous numbers. This quantization approach reduces storage requirements and accelerates similarity calculations while maintaining semantic meaning capture. Scour utilizes binary embeddings through Mixedbread's mxbai-embed-large-v1 model, enabling efficient content comparison across hundreds of thousands of posts while minimizing computational overhead and infrastructure costs.
Content aggregation: Content aggregation involves collecting and organizing information from multiple sources into unified interfaces for easier consumption. Modern aggregation services like Scour address information overload by filtering and categorizing content based on user preferences rather than simply combining feeds. The practice has evolved from basic RSS compilation to sophisticated AI-driven curation that considers semantic relevance, content quality, and user engagement patterns.
Feed discovery: Feed discovery encompasses the automated identification and integration of content sources without manual intervention. Scour's feed discovery system automatically locates RSS feeds from blog URLs and can treat publications without standard syndication formats as feed sources. This capability reduces barriers to content source addition and helps users access a broader range of publications than traditional RSS readers typically support.
OPML (Outline Processor Markup Language): OPML provides a standardized format for importing and exporting RSS subscription lists between different feed readers and aggregation services. Scour's OPML support enables users to migrate existing subscriptions from other platforms and export interest-based feeds for use in external applications. This compatibility maintains ecosystem interoperability while allowing users to leverage Scour's AI capabilities alongside existing RSS workflows.
Embedding models: Embedding models transform text content into numerical vectors that capture semantic meaning and enable computational analysis of language. These models learn relationships between words and concepts through training on large text datasets, producing representations that group similar content in mathematical space. Scour's use of Mixedbread's embedding model enables the platform to understand content relationships and match articles to user interests based on meaning rather than surface-level keyword matching.
User interests: User interests in Scour's context represent topic categories that guide content filtering and recommendation algorithms. Users can define interests ranging from broad subjects like "RSS" to specific areas like "Pourover coffee brewing techniques." The system uses these interest definitions to evaluate content relevance through semantic matching, creating personalized feeds that surface materials aligned with individual preferences rather than general popularity metrics.
Content filtering: Content filtering involves automated selection and organization of information based on predetermined criteria or learned preferences. Scour's filtering system operates through multiple layers including semantic relevance, content quality assessment, language identification, and deduplication. This multi-stage approach addresses common RSS aggregation challenges including spam content, duplicate posts, and irrelevant materials while maintaining high-quality content delivery.
Platform integration: Platform integration refers to Scour's ability to connect with and enhance existing RSS ecosystem tools and workflows. The service generates feeds in RSS, Atom, and JSON formats for external reader consumption while providing cross-platform content discovery across Reddit, Hacker News, Lobsters, and Bluesky. This integration approach allows users to leverage Scour's AI capabilities within their preferred reading environments rather than requiring complete workflow changes.
Summary
Who: Evan Schwartz, software engineer and solo developer, launched Scour as an AI-powered RSS aggregation service with community feedback driving development priorities.
What: Scour provides semantic content filtering across 12,903 RSS sources using binary vector embeddings and machine learning to surface relevant articles from noisy feeds, processing over 868,000 posts monthly.
When: Initial announcement occurred three months ago on Reddit, with consistent monthly growth from 2,872 sources in March 2025 to 12,903 sources by August 2025.
Where: The web-based service operates globally with technical infrastructure built using Rust and the "MASH" stack, sponsored by Mixedbread for embedding API access.
Why: The platform addresses information overload in RSS feeds and news aggregators where valuable content gets buried in high-volume sources like Hacker News, using AI to surface relevant materials based on user interests rather than popularity metrics.