Activate Your Entire Salesforce CRM Data in Databricks: The Power of Zero Copy Data Sharing

In an era where every enterprise thrives on intelligent insights, the ability to unify, activate, and reason on your most critical business data is paramount. For years, Salesforce has empowered organizations with the world’s leading CRM and an unparalleled view of customer interactions. We are excited to announce a significant and transformative partnership with Databricks, bringing Salesforce’s extensive CRM data together with the Databricks platform. This collaboration goes beyond traditional integration; it’s about establishing true data intelligence, where the rich, contextualized data spanning your entire Salesforce ecosystem—from sales and service to marketing and commerce—is seamlessly available and actionable within your Databricks environment. This represents a pivotal leap towards achieving full-loop data fluidity, accelerating innovation, and driving unprecedented business outcomes.

The Power of Zero Copy: Bi-Directional Data Activation

This powerful integration is anchored by two core Zero Copy capabilities, ensuring your data is always where it needs to be without costly duplication. First, our existing Zero Copy File Federation enables Salesforce Data Cloud to directly access vast datasets within Databricks Lakehouse, powering real-time customer views and AI initiatives directly on your lake data. Now, we are thrilled to announce the Beta launch of Zero Copy Data Sharing with Databricks, an equally disruptive innovation. This capability provides seamless outbound data integration from Salesforce Data Cloud directly into Databricks, built on open standards like Iceberg. It eliminates the need to copy or move data, ensuring your enriched, unified customer profiles, engagements, segments, and insights—curated across your entire Salesforce CRM through Data Cloud—are immediately available to your data scientists and analysts in Databricks for advanced AI/ML modeling, comprehensive analytics, and broader enterprise use cases. 

With this, we complete the virtuous cycle of bi-directional Zero Copy support for Databricks, through Zero Copy Data Federation (in Beta) and Zero Copy Data Sharing (in Beta), enabling seamless data fluidity without data duplication or compute overhead.

Revolutionizing Productivity: Bypass the ETL Bottleneck

Zero Copy Data Sharing revolutionizes productivity for IT teams, SIs, developers, CDOs, and CIOs by bringing refined Salesforce enterprise data to business users faster and without friction. By eliminating the need for costly and time-consuming Extract, Transform, Load (ETL) pipelines to move these valuable insights, it significantly reduces operational overhead and accelerates time-to-value for data consumers in Databricks. This directly addresses the way complex traditional integration processes have historically delayed critical decision-making.

For data scientists, data analysts, and data citizens, this partnership provides unparalleled access to intuitive low-code and no-code user interfaces to unlock Salesforce’s rich enterprise data. Zero Copy Data Sharing ensures that the unified customer profiles, engagement data, and segment insights built within Salesforce Data Cloud are seamlessly available in Databricks. This integration becomes pivotal for building and executing cross-platform analytics use cases and robust AI/ML models. Critically, this data integrates natively with Databricks Unity Catalog, providing a single source of truth and enabling governance across all Databricks services—from data warehousing and data engineering to AI/ML and business intelligence. Powering customer experiences with a rich unified profile combining the breadth of Salesforce CRM and Databricks data now truly just takes a few clicks!
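
As a rough illustration of what that governance looks like in practice, the short notebook sketch below grants an analyst group read access to a shared Data Cloud table once it appears in Unity Catalog. The catalog, schema, table, and group names are placeholders, not names produced by the integration.

    # Illustrative sketch: catalog, schema, table, and group names are placeholders.
    # Once shared Data Cloud objects are registered in Unity Catalog, they are governed
    # with the same GRANT/REVOKE model as any other Databricks asset.
    spark.sql("""
        GRANT SELECT
        ON TABLE salesforce_dc_share.crm.unified_individual
        TO `customer_analytics_team`
    """)

    # Verify who can read the shared table.
    display(spark.sql("SHOW GRANTS ON TABLE salesforce_dc_share.crm.unified_individual"))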

Challenges of Traditional ETL Processes

Traditionally, data engineers build ETL pipelines to move or copy data from Salesforce to Databricks. These Extract, Transform, Load (ETL) processes have long been the backbone of data integration in organizations. However, as data volumes explode and the demand for real-time insights intensifies, ETL comes with its fair share of challenges that can significantly hinder efficiency, increase costs, and delay critical business decisions.

  • Complexity and Cost: Traditional ETL processes often involve building, maintaining, and scaling complex pipelines. This demands specialized skills, significant time, and substantial investment in both infrastructure and personnel. Managing intricate data movement, diverse transformations, and precise scheduling tasks can be cumbersome and costly, leading to increased operational overheads and a drain on valuable data engineering resources.
  • Data Freshness and Delays: Copying data during ETL processes inherently introduces delays, often resulting in stale data that fails to reflect the near real-time state of the business. Lengthy ETL job durations further exacerbate this issue, leading to significant delays in data availability for analytics and AI. Furthermore, common failures in ETL processes, such as timeouts, schema changes, or data quality issues, can prolong data processing time and disrupt crucial data delivery schedules.
  • Data Governance Concerns: With the growing importance of stringent data governance, security, and compliance regulations, organizations need to maintain granular control over their data. Traditional ETL pipelines, which involve copying or moving data across multiple systems, can significantly complicate data lineage tracking, raise concerns about data security, and introduce complexities in ensuring regulatory compliance across every copy.
  • Scalability and Flexibility Limitations: Traditional ETL processes may struggle to scale effectively to handle ever-growing volumes of data or evolving business requirements. Scaling up existing ETL pipelines can be a daunting challenge, often requiring extensive redesign, reimplementation, and significant downtime. Additionally, rigid ETL architectures frequently lack the agility and flexibility to rapidly adapt to new data sources, changing data formats, or dynamic integration patterns, slowing down innovation.

Enhance Customer Insights with Salesforce and Databricks

With the challenges of traditional ETL processes clear, the integration between Salesforce Data Cloud and Databricks brings forth a revolutionary approach to data utilization. Zero Copy Data Sharing ensures that comprehensive insights from your Salesforce CRM, unified within Data Cloud, are seamlessly available in your Databricks Lakehouse. This empowers business units across any vertical to create an actionable, single source of truth for every customer, driving significant growth opportunities.

Here are some examples of how this integration transforms customer insights across different industries and use cases:

  • Retailers: Retailers can now leverage Zero Copy Data Sharing to bring their rich, unified customer profiles, loyalty data, and campaign insights from Salesforce Data Cloud directly into their Databricks Lakehouse. This allows them to combine that data with point-of-sale data, such as transaction and product details, for comprehensive customer lifetime value analysis, inventory optimization, and hyper-personalized recommendations (see the sketch after this list). With this holistic view, governed by Unity Catalog, retailers can run highly targeted campaigns and build advanced predictive models that truly understand customer behavior.
  • Financial Services: Financial services customers can leverage the comprehensive customer profile data from Salesforce Financial Services Cloud, enrich that data within Databricks, and run sophisticated machine learning models to predict future returns for their clients. With a 360-degree view of the customer powered by Salesforce and accessible in Databricks, they can apply Databricks Machine Learning capabilities to build and train next-best-action models, focusing on increased relevance and deep personalization in every client interaction.
  • Healthcare: Healthcare organizations can leverage Zero Copy Data Sharing to gather patient profile data from Salesforce Health Cloud and seamlessly share it with Databricks. They can then apply Databricks AI capabilities to develop predictive models that improve patient health outcomes and offer doctors valuable insights. With this unified data from Salesforce and Databricks, healthcare services can significantly enhance their operations by improving patient care, optimizing resource allocation, and streamlining operational processes.
  • Education: Education sectors can now leverage Zero Copy Data Sharing to join student profiles from Salesforce Education Cloud with student engagement and learning pattern data from Databricks. This enables them to build visual analytics to accurately measure student performance, create personalized learning paths, and foster better educational outcomes. Additionally, using Databricks machine learning models, they can predict at-risk students to proactively intervene and improve retention rates.
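
To make the retail example above concrete, here is a minimal PySpark sketch that joins a shared unified-profile table from the Data Cloud share with point-of-sale transactions already in the lakehouse to compute a simple lifetime-value aggregate. Every catalog, table, and column name here is an illustrative placeholder; your share and lakehouse tables will differ.

    from pyspark.sql import functions as F

    # Placeholder names: substitute the catalog created for your Data Cloud share
    # and your own lakehouse tables and columns.
    profiles = spark.table("salesforce_dc_share.crm.unified_individual")   # shared via Zero Copy
    pos_txns = spark.table("lakehouse.sales.pos_transactions")             # native lakehouse data

    # Simple lifetime-value style aggregate per unified customer profile.
    clv = (
        pos_txns
        .join(profiles, pos_txns.customer_id == profiles.individual_id)
        .groupBy(profiles.individual_id, profiles.loyalty_tier)
        .agg(
            F.sum("transaction_amount").alias("total_spend"),
            F.countDistinct("transaction_id").alias("order_count"),
        )
        .withColumn("avg_order_value", F.col("total_spend") / F.col("order_count"))
    )

    display(clv)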

Securely connect and authenticate to Databricks 

For a Data Cloud admin or data-aware specialist, activating your Data Cloud insights in Databricks takes three simple steps: assembling the relevant datasets into a data share in Data Cloud, creating a data share target that points to Databricks, and linking the data share to that target. Let’s look at each in detail. Leveraging the OIDC-based secure authentication model, the Databricks admin creates a connection using the Salesforce core org ID and the Data Cloud tenant endpoint. The Data Cloud admin then enters the resulting connection ID and account URL in the data share target to establish the connection securely.

With the core tenant ID and tenant endpoint from Salesforce, the Databricks admin opens Catalog in the Databricks workspace and establishes the connection to Salesforce Data Cloud using the connection type Salesforce Data Cloud File Sharing.

Once the secure connection is established, the Data Cloud admin or data-aware specialist creates the data share by assembling the necessary objects.
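
For teams that prefer SQL to the Catalog Explorer UI, the sketch below shows roughly what the Databricks side of this handshake could look like, assuming the standard Unity Catalog CREATE CONNECTION and CREATE FOREIGN CATALOG pattern. The connection type name and option keys are placeholders, not confirmed syntax; follow the official setup guide for the exact values issued by your Salesforce org.

    # Hypothetical sketch only: the connection type and option keys are placeholders.
    # CREATE CONNECTION and CREATE FOREIGN CATALOG are standard Unity Catalog commands;
    # consult the Databricks and Salesforce documentation for the exact connection type
    # and required options for Data Cloud file sharing.
    spark.sql("""
        CREATE CONNECTION IF NOT EXISTS salesforce_dc_sharing
        TYPE salesforce_data_cloud_file_sharing          -- placeholder type name
        OPTIONS (
          core_tenant_id  '<salesforce-core-org-id>',    -- provided by the Salesforce admin
          tenant_endpoint '<data-cloud-tenant-endpoint>' -- provided by the Salesforce admin
        )
    """)

    # Surface the shared Data Cloud objects as a catalog in this workspace.
    spark.sql("""
        CREATE FOREIGN CATALOG IF NOT EXISTS salesforce_dc_share
        USING CONNECTION salesforce_dc_sharing
    """)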

Viewing the data in Databricks

Once the data share is successfully linked to its target in Data Cloud, the Databricks admin can query and view all of the shared objects in Unity Catalog in near real time.
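
Once the link is in place, exploring the share is ordinary Unity Catalog work, as in the minimal sketch below; the catalog, schema, and table names are again placeholders.

    # Placeholder names: use the catalog surfaced for your Data Cloud share.
    # List everything Salesforce is sharing into this workspace.
    display(spark.sql("SHOW TABLES IN salesforce_dc_share.crm"))

    # Query a shared object directly; no copy or refresh job is involved.
    engagements = spark.table("salesforce_dc_share.crm.email_engagement")
    display(engagements.limit(10))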

With the change data capture functionality built into Data Cloud, Databricks users have continuous access to the most current data. All modifications made to the data share are automatically propagated to Databricks, with no user intervention required.

Unlock Customer Insights with Zero Copy Data Sharing

Organizations can now truly revolutionize customer experiences and achieve unparalleled success by combining the power of Salesforce’s comprehensive CRM data with Databricks’ leading Lakehouse platform. Thanks to Zero Copy Data Sharing, this trusted, secure, and easy-to-use integration ensures that the unified, real-time customer data and insights you build in Data Cloud—originating from your entire Salesforce CRM—are immediately actionable within your Databricks environment without any ETL overhead.

This seamless, zero-copy flow is directly integrated and governed by Databricks Unity Catalog. This means the rich Salesforce data not only becomes accessible but is also securely managed, discoverable, and immediately usable across all of Databricks’ powerful services, including data warehousing, data engineering, machine learning, and business intelligence. This represents true full-loop data intelligence, where insights flow from Salesforce for activation within Databricks, enabling advanced analytics, AI-driven initiatives, and a comprehensive 360-degree customer view without the overhead and complexity of traditional ETL processes. We’re empowering teams to gain invaluable insights and personalize every customer touchpoint at unprecedented scale.

Get Started Today!

Contact your Salesforce AE today to learn how you can start leveraging Salesforce Data Cloud’s Zero Copy Data Sharing to activate your customer insights, or get started at salesforce.com/data.
