
How a New AI Architecture Unifies 1,000+ Sources and 100 Million Rows in 5 Minutes

Gur Dotan
Apr 15 - 7 min read

In our “Engineering Energizers” Q&A series, we shine a spotlight on the brilliant engineers behind Salesforce’s innovations. Today, we feature Gur Dotan, a Software Engineering Architect who spearheaded the architectural vision and technical direction for Marketing Intelligence. This new Salesforce product unifies ad data, surfaces AI-driven insights, and automates campaign optimization.

Discover how Gur’s team consolidated fragmented data from hundreds of ad sources, migrated to Salesforce-native technologies, and scaled metadata and data processing to manage over 100 million rows, all while cutting setup time from weeks to just five minutes.

What is your team’s mission?

The team’s mission is to empower digital marketers to optimize ad spend and maximize revenue through unified, data-driven insights. The challenge stems from the fragmentation of data across multiple advertising platforms — Google Ads, Facebook, Snapchat, LinkedIn, and others — each with inconsistent reporting formats and isolated analytics.

The true difficulty lies not just in the scale or variety of data, but in the disconnect between the raw data and the actionable insights marketers need. A typical marketing organization might allocate $100 million in ad spend across hundreds of campaigns and channels. The critical need is to quickly identify what’s working, what isn’t, and where to reallocate budget in real time.

The goal is to bridge the gap between raw ad data, actionable insights, and execution. Marketing Intelligence is designed to close this loop by unifying and harmonizing cross-channel data at scale, uncovering insights through AI, and feeding those recommendations back into campaign platforms to optimize performance.

Marketing Intelligence High-Level Architecture.

Why is solving cross-channel marketing analytics at scale such a technically complex problem?

The primary challenge lies in ingesting, transforming, and harmonizing vast amounts of inconsistent advertising data from hundreds of external accounts. This must be done in a way that makes the system accessible to business users, not just technical experts. Our largest customers often connect between 100 and 2,000 different data sources, which can include dozens or hundreds of individual accounts across the platforms I mentioned earlier. This data is often messy, with varying schemas and granularities, and it changes frequently.

With traditional data solutions, integrating this scale of data required extensive custom setup, often taking weeks or even months. With Marketing Intelligence, the team has completely overhauled this architecture on the Salesforce platform to automate the setup process. We have abstracted away the complex Data Cloud configurations, such as data streams, transforms, mappings, and DLOs, and packaged them into a streamlined, five-minute user flow. Now, a marketer can simply select a data source and the system automatically provisions the necessary pipelines and schema mappings behind the scenes.

Gur dives deep into Salesforce Engineering’s culture.

What were the primary engineering challenges your team encountered while unifying fragmented advertising data across thousands of sources?

Unifying data across hundreds of sources pushed every layer of the technology stack. Previously, implementing a solution like this required a professional services team to spend weeks configuring the necessary data infrastructure, and deep technical expertise was essential to normalize, cleanse, and prepare the data for analytics.

With Marketing Intelligence, a new feature called Data Pipelines was introduced. This feature automates the provisioning of all necessary Data Cloud components for each source, including data streams, transformations, data lake objects, and semantic mappings. What once took days or weeks now takes just five minutes.

The system was designed to scale linearly. Each new ad account triggers a set of automated provisioning templates that deploy a complete ingestion pipeline, mapped to the customer’s standard data model. Behind the scenes, all related Data Cloud objects are created, each linked to a harmonized schema that supports seamless workflows and analytics.
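To make that template-driven flow concrete, here is a minimal sketch of the pattern in Python. Every name in it (AdAccount, provision_pipeline, the component dictionaries) is a hypothetical illustration of deriving all pipeline objects from a single connected-account descriptor, not an actual Data Cloud API:

```python
# Hypothetical sketch of template-driven pipeline provisioning. None of
# these names are real Data Cloud APIs; they illustrate deriving every
# pipeline object from a single connected-account descriptor.
from dataclasses import dataclass


@dataclass
class AdAccount:
    platform: str   # e.g. "google_ads", "facebook"
    account_id: str


# One harmonized schema that every platform-specific source maps into.
HARMONIZED_SCHEMA = ["date", "campaign", "impressions", "clicks", "spend"]


def provision_pipeline(account: AdAccount) -> dict:
    """Expand one connected ad account into all the objects its
    ingestion pipeline needs, instead of configuring each by hand."""
    prefix = f"{account.platform}_{account.account_id}"
    return {
        "data_stream": {"name": f"{prefix}_stream", "source": account.platform},
        "transform": {"name": f"{prefix}_xform", "output": HARMONIZED_SCHEMA},
        "data_lake_object": {"name": f"{prefix}_dlo", "columns": HARMONIZED_SCHEMA},
        "semantic_mapping": {"dlo": f"{prefix}_dlo",
                             "model": "unified_ad_performance"},
    }


components = provision_pipeline(AdAccount("google_ads", "123-456-7890"))
print(components["data_lake_object"])  # every object derived from one descriptor
```

The key design choice the sketch captures is that nothing is configured per component: connect one account and every downstream object is derived from the same harmonized schema.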

Close coordination with the Data Cloud team was essential, particularly to develop incremental transforms — a capability that did not exist initially but was crucial to avoid the cost of reprocessing entire datasets every day.
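The value of incremental transforms is easiest to see in code. Below is a minimal watermark-based sketch, assuming rows carry an ingested_at timestamp; the state handling and field names are stand-ins for illustration, not Data Cloud’s actual transform mechanism:

```python
# Illustrative watermark-based incremental transform. The state handling
# and field names are stand-ins, not Data Cloud's actual transform API.
from datetime import datetime, timezone

# In practice the watermark would be persisted between runs.
_last_watermark = datetime(1970, 1, 1, tzinfo=timezone.utc)


def run_incremental_transform(rows: list[dict]) -> list[dict]:
    """Transform only rows ingested since the previous run, so a daily
    run costs proportional to new data rather than the full history."""
    global _last_watermark
    new_rows = [r for r in rows if r["ingested_at"] > _last_watermark]
    if new_rows:
        _last_watermark = max(r["ingested_at"] for r in new_rows)
    # Harmonize just the delta; a full transform would rescan everything.
    return [{**r, "spend_usd": round(r["spend_micros"] / 1e6, 2)}
            for r in new_rows]
```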

Data pipeline example with Google Ads.

You rebuilt this product on top of modern Salesforce technologies including Data Cloud, Agentforce, and Tableau. What integration challenges arose during that transition, and how did your team resolve them?

The most challenging aspect wasn’t just integration — it was replatforming. The process involved dismantling a legacy system built on one set of architectural assumptions and reconstructing it using entirely different Salesforce-native components. Every “Lego piece” had changed.

Rather than relying on Datorama’s legacy architecture, the new system now utilizes Data Cloud for ingestion and transformation, Agentforce for automation, Tableau for analytics, and the core platform for orchestration and trust. This required a thorough reevaluation of how each workflow operates — from campaign data onboarding to dashboarding to insight generation.

The integration challenges extended beyond the technical realm — organizational coordination was equally crucial. The team had to drive the development of new features across multiple clouds. For example, our collaboration with the semantic layer team produced several key features essential for marketing, such as logical views, goals, semantic model inheritance, and shared tables. These features also benefit data analysts in general. Another feature we strongly advocated for, 1-Click OAuth, was recently introduced to the Data Cloud connector framework, enabling non-technical marketers to easily connect and authorize their data sources. These are prominent examples where our product’s requirements actually drove improvements that now serve all Data Cloud and Tableau Next customers.

This cross-cloud dependency management became a cornerstone of the architecture strategy — and a key success factor in launching the product.

How did you architect the system to support querying hundreds of millions of rows of advertising data with low latency?

Scalability focused on two critical areas: metadata scalability and data scalability. For metadata scalability, supporting hundreds of connected sources required deploying 1,000+ Data Lake Objects (DLOs) into a customer’s org. This pushed the core platform to its limits. Extensive platform limit testing was conducted to identify performance degradation points, and the Data Cloud team collaborated closely to address these bottlenecks.

For data scalability, the challenge was even greater. Based on historical usage patterns, customers are expected to ingest anywhere from 100 million to over 1 billion rows of advertising data over time. To maintain query performance, data partitioning strategies were implemented and deduplication was integrated into the ingestion process. Common marketing data practices such as pattern extraction and data classification were rebuilt to be materialized rather than computed at query time to further accelerate queries. The system optimizes data at the time of load to minimize query costs and maximize responsiveness.
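As a rough illustration of that load-time strategy, the sketch below deduplicates on a composite key, materializes a derived classification column, and partitions rows by date. The column names and the classification rule are assumptions made for the example, not the product’s actual ingestion logic:

```python
# Sketch of load-time optimization: deduplicate on a natural key,
# materialize derived columns, and partition by date so queries stay
# cheap. Column names and the classification rule are assumptions.
from collections import defaultdict


def load_batch(rows: list[dict], partitions: dict, seen: set) -> None:
    """Do the expensive work once at ingestion so it never runs per query."""
    for row in rows:
        # Deduplicate during ingestion on a composite natural key.
        key = (row["source"], row["campaign_id"], row["date"])
        if key in seen:
            continue
        seen.add(key)
        # Materialize classification at load time instead of query time.
        name = row["campaign_name"].lower()
        row["channel"] = "brand" if "brand" in name else "performance"
        # Partition by date so time-range queries scan only relevant slices.
        partitions[row["date"]].append(row)


partitions = defaultdict(list)
seen: set = set()
load_batch([{"source": "google_ads", "campaign_id": "c1",
             "date": "2025-01-01", "campaign_name": "Brand US"}],
           partitions, seen)
```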

To ensure robust performance, a suite of synthetic data generators was developed to simulate large-scale environments and benchmark the agentic workflows and dashboarding layers under real-world loads. This approach provided the confidence that the product could scale seamlessly with the largest customers from the very first day.
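A synthetic generator for this kind of benchmarking can be quite small. The sketch below is an assumption about its shape, not the team’s actual tooling; it streams deterministic, plausible ad rows so large-scale tests are repeatable without persisting the dataset:

```python
# Illustrative synthetic ad-data generator for load benchmarking. The
# schema and distributions are assumptions, not the team's actual tooling.
import random
from datetime import date, timedelta


def generate_rows(n_rows: int, n_campaigns: int = 500, seed: int = 42):
    """Stream n_rows of plausible ad performance data; a fixed seed keeps
    benchmark runs repeatable without storing the generated data."""
    rng = random.Random(seed)
    start = date(2024, 1, 1)
    for _ in range(n_rows):
        impressions = rng.randint(100, 1_000_000)
        clicks = rng.randint(0, impressions // 20)
        yield {
            "date": (start + timedelta(days=rng.randint(0, 364))).isoformat(),
            "campaign_id": f"cmp-{rng.randint(1, n_campaigns)}",
            "impressions": impressions,
            "clicks": clicks,
            "spend": round(clicks * rng.uniform(0.1, 5.0), 2),
        }


# Streams rows one at a time, so even a 100M-row benchmark fits in memory:
# for row in generate_rows(100_000_000): ingest(row)
```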

Gur explains why engineers should join Salesforce.

AI plays a central role in this product. From a system design perspective, where does agentic automation provide meaningful value — and where does it introduce new complexity?

Agentic automation is a cornerstone of Marketing Intelligence, but the design centers on real marketing workflows, not abstract AI concepts.

For the marketing data specialist, a data prep agent was developed to analyze incoming campaign data, classify dimensions, detect anomalies, and recommend harmonization patterns. Tasks that once required manual SQL and offline data modeling now come with agent-driven suggestions based on the structure and quality of the data.
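The deterministic signals underneath such suggestions can be sketched without any LLM in the loop. The heuristics below (type-based column classification and z-score anomaly flagging) are illustrative assumptions, not the agent’s actual logic:

```python
# Toy sketch of data-prep signals: classify columns as metrics or
# dimensions and flag anomalous values with a z-score. Heuristics are
# illustrative only; a real agent layers reasoning on top of such signals.
import statistics


def classify_columns(rows: list[dict]) -> dict[str, str]:
    """Guess whether each column is a metric (numeric) or a dimension."""
    sample = rows[0]
    return {
        col: "metric" if isinstance(val, (int, float)) else "dimension"
        for col, val in sample.items()
    }


def detect_anomalies(rows: list[dict], column: str,
                     z_threshold: float = 3.0) -> list[dict]:
    """Flag rows whose value sits more than z_threshold standard
    deviations from the column mean."""
    values = [r[column] for r in rows]
    if len(values) < 2:
        return []
    mean, stdev = statistics.mean(values), statistics.stdev(values)
    if stdev == 0:
        return []
    return [r for r in rows if abs(r[column] - mean) / stdev > z_threshold]
```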

For the marketing manager, the agent focuses on campaign optimization. The agent identifies underperforming campaigns, analyzes performance deltas, and recommends actions such as pausing a campaign or reallocating budget. These agent-driven recommendations bridge the gap between insight and action.
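The shape of such a recommendation can be illustrated with a simple rule over period-on-period performance deltas. The thresholds and field names below are assumptions for the example, not the product’s decision logic:

```python
# Illustrative recommendation rule over campaign performance deltas.
# Thresholds and field names are assumptions, not the product's logic.

def recommend(campaign: dict) -> str:
    """Compare this period's ROAS (revenue / spend) to the prior period
    and turn the delta into an action a marketer can take."""
    roas_now = campaign["revenue"] / campaign["spend"]
    roas_prev = campaign["prev_revenue"] / campaign["prev_spend"]
    delta = (roas_now - roas_prev) / roas_prev
    if roas_now < 1.0 and delta < -0.25:
        return "pause: losing money and trending down"
    if delta < -0.10:
        return "reallocate: shift budget toward better-performing campaigns"
    return "keep: performance within expected range"


print(recommend({"revenue": 800, "spend": 1000,
                 "prev_revenue": 1500, "prev_spend": 1000}))
# -> pause: losing money and trending down
```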

The team is also exploring agent-to-agent workflows. For instance, a data prep agent can pass harmonized data to a visualization agent, which then proposes relevant dashboards. Additionally, creative intelligence is being prototyped, where the system analyzes ad creatives and suggests new variants using generative AI.

While AI reduces complexity for users, it introduces new challenges of its own, particularly around explainability and trust. To ensure system integrity, internal quality controls are stringent, including a minimum 80% code coverage standard across all production code and a robust testing pyramid of unit and functional tests. These safeguards help ensure the reliability of automated decisions, even as the product evolves.
