Databricks today launched CustomerLake, an agentic customer data platform built natively inside the Databricks environment, with Integral Ad Science joining as a launch partner - a pairing that places IAS's contextual and media quality signals directly inside the data infrastructure where marketers already store and process their customer records.
What CustomerLake actually is
For years the problem with customer data platforms has been straightforward, if frustrating. Brands pour resources into CRMs, CDPs, and data warehouses, yet the intelligence needed to act on that data lives somewhere else entirely. The result is delay, manual labor, and campaigns that launch on stale signals - by the time an audience is built, the moment has often passed.
CustomerLake is Databricks' attempt to collapse that gap. According to Databricks, the product brings identity resolution, audience segmentation, and campaign activation directly to the governed data foundation where a brand's customer records already reside. Nothing is copied or moved to a separate CDP silo. The profiles, the AI models, and the campaign logic all operate within the same Databricks environment.
The architecture has a specific name: Databricks calls it an Agentic CDP, distinguishing it from two existing categories. Traditional bundled CDPs are standalone platforms that require data to be extracted and loaded into a separate system. Composable CDPs are assembled by connecting modular tools on top of a data warehouse. CustomerLake takes a third path: it is embedded natively in Databricks, which means it operates on the same Unity Catalog governance layer and the same Lakehouse storage architecture that data engineering teams already use.
How the technical stack works
CustomerLake is built on several interlocking Databricks components, and understanding them individually helps clarify what the product can and cannot do.
Lakeflow handles data ingestion. According to Databricks, it ingests and transforms batch and real-time customer data - transactions, behavioral events, engagement records, campaign responses - with native connectors to marketing tools, advertising platforms, and operational databases.
Unity Catalog governs the data once it is ingested. It provides unified access controls, lineage tracking, and business semantics across all the data assets CustomerLake uses. Critically, it supports federated access: data in Snowflake, BigQuery, cloud object storage, and other enterprise systems can be queried without replication. A single governance layer covers the full scope of a brand's data.
Genie is the natural language interface. Marketers can ask questions of their governed customer data in plain language, receive recommendations, and surface campaign insights without filing requests with data engineering teams. According to Databricks, this component is designed to free data teams from ad hoc requests so they can focus on strategic engineering.
Real-Time Profile API enables personalization at the moment of interaction. Customer attributes are available through a real-time serving layer, meaning that content, channel selection, and message delivery can be adjusted based on current signals rather than batch-processed snapshots.
Agentic Identity Resolution handles the matching of customer records across sources. According to Databricks, it combines AI-driven learning with existing rules, improving match accuracy while reducing the manual tuning that identity resolution has traditionally required.
The 3P Data Marketplace brings in outside data: brands can enrich their own first-party profiles with third-party identity graphs and data providers that are built directly into the Databricks data foundation. No separate integration project is required.
Infinity campaigns are the operational output of the agentic layer. Rather than manual campaign execution, infinity campaigns are described by Databricks as continuous, agent-driven engagement loops. The intent is to replace periodic batch campaigns with something closer to ongoing automated engagement that responds to customer signals as they arrive.
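To make the identity resolution component more concrete, the sketch below shows one generic way to combine a deterministic rule with a fuzzy score when matching records across sources. It is not Databricks' implementation, which has not been published in this detail. The record fields, the 0.85 threshold, and the function names are all assumptions chosen for illustration, and a simple string-similarity score stands in for the AI-driven matching Databricks describes.

```python
from difflib import SequenceMatcher

# Hypothetical customer records from different source systems.
records = [
    {"id": "crm-1", "email": "ana.diaz@example.com", "name": "Ana Diaz", "zip": "94105"},
    {"id": "pos-7", "email": "ana.diaz@example.com", "name": "A. Diaz", "zip": "94105"},
    {"id": "web-3", "email": "", "name": "Ana Diaz", "zip": "94105"},
    {"id": "crm-2", "email": "li.wei@example.com", "name": "Li Wei", "zip": "10001"},
]

NAME_THRESHOLD = 0.85  # assumed cutoff; a learned model would replace this


def is_match(a, b):
    """Apply the exact-email rule first, then fall back to name similarity plus same zip."""
    if a["email"] and a["email"].lower() == b["email"].lower():
        return True
    name_score = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return a["zip"] == b["zip"] and name_score >= NAME_THRESHOLD


def resolve(records):
    """Group matching records into unified profiles using union-find."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(records):
        for b in records[i + 1:]:
            if is_match(a, b):
                parent[find(a["id"])] = find(b["id"])

    clusters = {}
    for r in records:
        clusters.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(sorted(c) for c in clusters.values())


profiles = resolve(records)
print(profiles)
```

In this toy data the three Ana Diaz records collapse into one profile even though one of them has no email, because the fuzzy fallback links it to the CRM record. The pairwise comparison is quadratic, so a production system would need blocking or indexing to scale.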
The IAS integration
Integral Ad Science is among the announced launch partners for CustomerLake, and the nature of the integration is worth examining closely, because it goes beyond a simple data feed.
According to IAS, the partnership enables two distinct capabilities for brands using CustomerLake. The first is a connection between a brand's first-party records and IAS's contextual and attention signals, established through clean-room connections directly within the CustomerLake environment. IAS processes more than 300 billion media transactions daily. By linking first-party customer records to those signals - including content affinity scores, viewability data, and brand safety classifications - brands can overlay media interaction and page-level context onto their Customer 360 profiles. The output is sharper segmentation tied to actual media behavior, not just demographic or transactional data.
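The overlay described above can be pictured as a join between first-party profiles and partner signals on a shared pseudonymous key. The sketch below is a simplified illustration under stated assumptions, not the IAS or Databricks clean-room mechanism: a SHA-256 hash of a normalized email stands in for whatever privacy-preserving match a real clean room performs, and every field name (`content_affinity`, `viewability`, `lifetime_value`) is invented for the example.

```python
import hashlib


def hashed_key(email):
    """Normalize and hash an email so both sides join on a pseudonymous key."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


# Hypothetical first-party profiles held by the brand.
first_party = [
    {"email": "Ana.Diaz@example.com", "lifetime_value": 1200},
    {"email": "li.wei@example.com", "lifetime_value": 300},
]

# Hypothetical partner signals keyed by the same hash; field names are invented.
partner_signals = {
    hashed_key("ana.diaz@example.com"): {
        "content_affinity": "automotive",
        "viewability": 0.82,
    },
}


def enrich(profiles, signals):
    """Left-join signals onto profiles; unmatched profiles keep only their own fields."""
    enriched = []
    for profile in profiles:
        key = hashed_key(profile["email"])
        # The raw email is deliberately left out of the enriched row.
        row = {"key": key, "lifetime_value": profile["lifetime_value"]}
        row.update(signals.get(key, {}))
        enriched.append(row)
    return enriched


customer_360 = enrich(first_party, partner_signals)
for row in customer_360:
    print(row)
```

The left join matters here: a profile with no matching signal stays in the Customer 360 with its first-party fields intact rather than being dropped from segmentation.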
The second capability concerns lookalike audience development. According to IAS, clients can enhance their first-party audiences with IAS's media quality understanding, including the contextual and behavioral footprints of high-value customer cohorts. The goal is to identify new audiences that resemble existing high-value segments. Contextual segments built from joint customer analysis can be constructed inside the CustomerLake environment and then deployed through preferred demand-side platforms, providing campaign targeting extension without reliance on third-party cookies.
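One common way to approach lookalike development, shown here only as a hedged sketch, is to average the feature vectors of a high-value seed cohort and rank candidate segments by cosine similarity to that average. Neither IAS nor Databricks has said this is the method used. The three-dimensional affinity vectors, the segment names, and the function names below are all hypothetical.

```python
import math

# Hypothetical contextual-affinity vectors: [automotive, travel, finance].
seed_cohort = {
    "cust-1": [0.9, 0.1, 0.3],
    "cust-2": [0.8, 0.2, 0.4],
}
prospects = {
    "seg-a": [0.85, 0.15, 0.35],
    "seg-b": [0.1, 0.9, 0.2],
}


def centroid(vectors):
    """Average the vectors component by component."""
    vectors = list(vectors)
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_lookalikes(seed, candidates):
    """Score each candidate against the seed centroid, highest similarity first."""
    center = centroid(seed.values())
    scored = [(name, cosine(vec, center)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranking = rank_lookalikes(seed_cohort, prospects)
for name, score in ranking:
    print(name, round(score, 3))
```

Because the inputs are contextual affinities aggregated at the segment level rather than individual identifiers, a ranking like this does not depend on cross-site tracking, which is the property the cookieless framing relies on.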
What makes both capabilities practical is speed. In a conventional workflow, integrating third-party verification signals with first-party data is a multi-team project that can run for weeks. By the time the data is processed, the campaign window has frequently passed. According to IAS, its signals flow into customer data continuously within the CustomerLake environment, keeping profiles current automatically. When a campaign begins, it draws on the most recent signals available rather than a snapshot assembled weeks earlier.
Jay Malepati, Global Director of Customer and Marketing Data Science at Circle K, provided a direct account of the architecture's practical impact. "CustomerLake is proving to be a major unlock for our architecture because it allows us to build targeted audiences natively in Databricks, activate them seamlessly in Adobe and measure downstream campaign impact without moving our entire data lake into another platform," Malepati said. "This gives our teams a faster, more governed path from customer data to campaign execution and fundamentally changes our speed to market."
Why the martech stack context matters
The pattern CustomerLake is designed to address is one that PPC Land has tracked from multiple angles in recent years. When Treasure Data expanded AI marketing cloud access through AWS Marketplace in late 2025, the emphasis was on reducing the operational complexity that accumulates when brands manage data infrastructure separately from campaign execution. When Adobe launched AI agents for enterprise customer experience management in September 2025, the same tension was visible: the most capable AI tools are useful only when grounded in current, governed, trustworthy data.
The difficulty has never been the absence of data. It has been the separation of the data from the intelligence needed to act on it. CustomerLake's embedded approach - keeping profiles, models, and activation tooling in a single governed environment - addresses that structural problem directly.
The agentic framing is significant. PPC Land coverage of what 28 marketing executives predicted would transform digital advertising in 2026 captured a recurring theme: speed of insight to action. The time between a signal arriving and a campaign responding to it has become a competitive variable. Agentic systems that can observe, decide, and act without human bottlenecks at each step are the mechanism through which that gap narrows. CustomerLake's infinity campaigns are precisely this: continuous agent-driven loops rather than discrete batch executions.
IAS's expanding data infrastructure role
Today's announcement is not an isolated data partnership for IAS. Over the past 18 months the company has been extending the contexts in which its signals are active.
In March 2026, IAS and Mastercard connected media quality signals to purchase data for pre-bid targeting, making IAS quality data a live input to programmatic optimization rather than a post-campaign reporting artifact. In April 2026, IAS Total TV launched, giving connected TV buyers show-level, genre-level, and rating-level transparency across inventory from Disney, NBCUniversal, Paramount, and Prime Video. Earlier, in January 2026, IAS launched automated supply path optimization through its Total Visibility product, with quality spend in testing rising from 88% to 97% in a single month.
The Databricks CustomerLake partnership extends this trajectory. Rather than IAS signals appearing in post-campaign reports or feeding a single pre-bid decision layer, they are now available as a continuous input to the customer profiles from which all downstream campaign decisions flow. The scale behind those signals is substantial: IAS processed over 280 billion daily interactions through its AI-powered models during the period covered by its Ethical AI Certification from the Alliance for Audited Media, awarded in July 2025. The CustomerLake integration routes a portion of that signal volume into Databricks customer profiles on an ongoing basis.
This positions IAS not only as a verification and measurement layer - the role it has traditionally occupied - but as a data enrichment input at the profile level, upstream of campaign execution rather than downstream of it.
The addressability dimension
Third-party cookie deprecation continues to reshape how audience building works. PPC Land's coverage of data clean room developments and contextual targeting performance shows a consistent pattern: the industry increasingly relies on first-party data enriched with contextual and behavioral signals as a substitute for individual-level third-party identifiers.
CustomerLake is structured to operate in exactly this environment. According to IAS, contextual segments built within the CustomerLake environment can be deployed through preferred DSPs without requiring third-party cookie support. The lookalike audience extension capability builds on IAS's media quality intelligence - content affinity signals, engagement footprints, behavioral patterns across high-value cohorts - rather than deterministic cross-site tracking.
Databricks has separately been building this kind of integration story for some time. When LG Ad Solutions integrated ACR data with the Databricks Marketplace in October 2025, the principle was the same: make it possible for brands to access rich external datasets through a governed Databricks environment without rebuilding their data pipelines. CustomerLake extends that approach to the full CDP workflow.
What the product does not do
CustomerLake does not replace existing martech tools. According to Databricks, it connects to the existing martech and adtech ecosystem through native integrations for marketing tools, advertising platforms, third-party data providers, and identity graphs. The intent is integration, not replacement.
CustomerLake also does not move or copy customer data. The profiles are built from data that already lives in Databricks. This matters for compliance: the Unity Catalog governance layer, access controls, and data lineage that enterprises already rely on for regulatory compliance apply to CustomerLake profiles and audiences without requiring a separate data governance implementation.
Whether the product can work with customer data outside Databricks is addressed in the product's FAQ: according to Databricks, Unity Catalog supports federated access to data across Databricks, Snowflake, BigQuery, cloud object storage, and other enterprise systems, meaning the scope of the Customer 360 is not limited to data already ingested into the Lakehouse.
Why this matters for the marketing community
What CustomerLake represents, taken together with IAS's role as a launch partner, is a structural shift in where media quality intelligence enters the marketing workflow. Today that intelligence is predominantly an output: it appears in reports, it informs post-campaign optimization, and it gates pre-bid decisions at the inventory level. Within the CustomerLake model, IAS signals become a profile-level input - part of the Customer 360 that defines who a customer is, how they behave in media environments, and what targeting approaches are most likely to reach similar prospects.
For media and marketing teams, the practical consequence is that campaign decisions - audience selection, message sequencing, timing, channel allocation - can be grounded in a richer and more current understanding of customer media behavior than first-party CRM data alone provides. For data teams, the consequence is fewer integration projects: IAS signals flow into profiles through the CustomerLake environment rather than requiring a separate data ingestion pipeline.
The agentic layer amplifies both effects. An AI agent selecting audiences, composing messaging, and deploying campaigns in a continuous loop is only as good as the data it draws on. CustomerLake structures the environment so that the most current first-party and third-party signals are always available to those agents, without the latency that accumulates when signals pass through multiple system boundaries before reaching the point of decision.
Timeline
- July 2025 - IAS becomes the first company to receive Ethical AI Certification from the Alliance for Audited Media, covering AI governance, data quality, and bias controls across its platform that processes up to 280 billion daily media interactions
- October 22, 2025 - LG Ad Solutions integrates ACR data with Databricks Marketplace via Delta Sharing, providing viewing data from 33 million opted-in LG Smart TVs to contracted data partners in Databricks' governed environment
- December 2025 - IAS launches IAS Agent, an AI-powered campaign assistant that surfaces campaign insights up to five times faster than manual analysis, with global rollout at no additional cost in early Q1 2026
- January 2026 - IAS launches automated supply path optimization via Total Visibility, with quality spend rising from 88% to 97% in testing and a 33% increase in conversions reported in one month
- March 26, 2026 - IAS and Mastercard announce Sales Outcomes, linking media quality signals to anonymized purchase data for in-flight programmatic optimization, with U.S. availability from Q2 2026
- April 27, 2026 - IAS Total TV launches, giving CTV advertisers show-level, genre-level, and rating-level transparency across Disney, NBCUniversal, Paramount, and Prime Video inventory
- June 4, 2026 - IAS publishes Stellantis case study covering a global attention measurement program across more than 13 billion impressions in 19 countries, with a 165% increase in engagement on high-attention inventory
- June 16, 2026 - Databricks launches CustomerLake, an Agentic CDP embedded natively in the Databricks platform, with IAS joining as a launch partner; IAS publishes partnership details on its blog
Summary
Who: Databricks, the data and AI company headquartered in San Francisco, and Integral Ad Science (IAS), a global media measurement and optimization platform owned by private equity firm Novacap since late 2025. Circle K is among the named enterprise users of CustomerLake.
What: Databricks launched CustomerLake, an Agentic CDP built natively within the Databricks platform, encompassing identity resolution, audience segmentation, real-time profile serving, infinity campaigns, and natural language access to customer data through Genie. IAS joined as a launch partner, enabling first-party data to be linked to IAS's contextual and attention signals through clean-room connections, and allowing lookalike audiences to be built and deployed through DSPs without third-party cookie reliance.
When: The announcement was made on June 16, 2026, with IAS publishing its partnership details on the same date.
Where: CustomerLake operates natively within the Databricks platform across all cloud environments where Databricks runs - Amazon Web Services, Microsoft Azure, and Google Cloud Platform. IAS signal integration and audience deployment operate through the CustomerLake environment and connect to preferred DSPs for campaign activation.
Why: Brands have long held large volumes of first-party customer data but faced structural delays when connecting that data to the intelligence needed to act on it. CustomerLake eliminates the separate CDP silo by embedding data collection, profile building, identity resolution, and campaign activation directly in the governed data foundation. IAS's role as a launch partner adds media quality and contextual enrichment to customer profiles at the point of origin, keeping signals current automatically rather than requiring periodic manual data transfers.