The Challenge with Flows Today
Salesforce flows sit at the heart of modern CRM automation, yet authoring them still requires a unique mix of declarative drag-and-drop skills and Apex know-how. To ease this process, Salesforce has committed to incorporating cutting-edge generative AI technologies such as Agentforce for Flow (A4F, now generally available). A4F uses AI to generate complete Salesforce flows from a user prompt, which can then be readily deployed in Flow Builder. These tools have already seen rapid adoption by Salesforce Admins, with thousands of unique org sign-ups within the first few months.
In Figure 2 below, we present a snapshot of results with our A4F models across two deployments: v1, which uses Mistral-Nemo (12B) fine-tuned on text-to-flow data, and v2, which uses a stronger Mistral-Small (32B) backbone as well as a larger training corpus that includes synthetic training samples. As a metric, we report the ready-to-activate rate: the percentage of generations that can be directly activated in a production environment. We benchmark these models against a frontier closed-source LLM, and report performance for two types of flows – those containing only standard objects, and those that also contain custom objects. Despite starting from a significantly smaller backbone than the closed-source LLM, our A4F models strongly outperform the closed-source baseline, especially on custom flows!
This first generation of A4F models, though capable, still treats text-to-flow generation as a token generation problem: accepting a user prompt as input and generating flow metadata as output (formatted as a JSON string; see Figure 1 above). This design passes up the ability to leverage the extensive business know-how underpinning Salesforce flows, e.g. that every flow can be represented as a graph of node “elements” joined by edge “connectors”, with precise triggers that dictate when it runs (in the example above, at 6 am daily). Without this knowledge, we find that models struggle to generate complex flows (e.g. those with large or unusual structure or details), which poses a challenge to deploying them in production.
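To make this graph structure concrete, a flow can be modeled as typed elements joined by directed connectors, plus a trigger describing when it runs. The sketch below is a minimal, hypothetical data model for illustration only – it is not Salesforce's actual flow metadata schema, and all names are invented:

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    """A node in the flow graph, e.g. a lookup, decision, or update."""
    name: str
    kind: str  # e.g. "RecordLookup", "Decision", "RecordUpdate"

@dataclass
class Connector:
    """A directed edge: control passes from `source` to `target`."""
    source: str
    target: str

@dataclass
class Flow:
    trigger: str  # e.g. a schedule such as "daily at 6am"
    elements: list[Element] = field(default_factory=list)
    connectors: list[Connector] = field(default_factory=list)

    def successors(self, name: str) -> list[str]:
        """Names of elements reachable in one step from `name`."""
        return [c.target for c in self.connectors if c.source == name]

# A toy scheduled flow: look up records, branch, then update.
flow = Flow(trigger="daily at 6am")
flow.elements += [Element("find_overdue", "RecordLookup"),
                  Element("check_any", "Decision"),
                  Element("mark_escalated", "RecordUpdate")]
flow.connectors += [Connector("find_overdue", "check_any"),
                    Connector("check_any", "mark_escalated")]
print(flow.successors("check_any"))  # → ['mark_escalated']
```

Encoding this structure explicitly – rather than leaving it implicit in a JSON string – is what allows a model to reason about topology and triggers directly.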
To remedy this, we set out to train Enterprise General Intelligence (EGI) models for flow – proprietary models fine-tuned to surpass out-of-the-box frontier models on enterprise tasks – that explicitly encode such structure and can continually self-improve from interaction within a rich flow simulation environment called Flow Simulator (FlowSim).
How we used Flow Simulator to train EGI models for A4F
Flow Simulator (FlowSim) is a comprehensive framework for building evaluation and training environments that simulate real-world enterprise scenarios. It enables benchmarking and optimization of agents, ensuring they perform reliably in real business applications.
To train flow generation models with FlowSim, we first hand-designed a Domain Specific Language (DSL) representation for flows: a set of function primitives and data models that encode flow structure and domain knowledge, and that can be composed to construct any flow. We implement this DSL in code as a Python schema, and then translate our existing flow metadata from JSON to the DSL. Finally, we train EGI models by fine-tuning a strong open-source backbone to generate DSL flow representations (instead of JSON), in addition to a chain-of-thought trace. With this, we effectively reduce the task to code generation – a task at which LLMs already excel!
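As a rough intuition for what "function primitives that compose into a flow" could look like, here is a heavily simplified, hypothetical sketch – the real DSL is a much richer Python schema, and every name below is invented for illustration:

```python
def new_flow(trigger):
    """DSL primitive: start a flow definition with a trigger."""
    return {"trigger": trigger, "elements": [], "connectors": []}

def add_element(f, name, kind):
    """DSL primitive: declare a node element (returns f for chaining)."""
    f["elements"].append({"name": name, "kind": kind})
    return f

def connect(f, source, target):
    """DSL primitive: add an edge connector between two elements."""
    f["connectors"].append({"source": source, "target": target})
    return f

# The model's output is then a small program composing these primitives,
# rather than a raw JSON string:
f = new_flow("daily at 6am")
add_element(f, "find_overdue", "RecordLookup")
add_element(f, "notify_owner", "Action")
connect(f, "find_overdue", "notify_owner")
```

Because the generation target is now code against a typed schema, many structural mistakes (dangling connectors, unknown element kinds) become mechanically checkable instead of silent JSON errors.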
We also design automated metrics to evaluate the quality of the flow generations along two dimensions: validity (whether the generated flow is syntactically correct) and correctness (whether the generated flow matches the ground truth). By running our fine-tuned model within simulated orgs and automatically scoring its generations using these metrics as rewards, we continue to train the model with reinforcement learning.
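To illustrate the two dimensions, a toy version of such scoring might check validity syntactically and approximate correctness by edge overlap with the ground truth. This is a deliberately crude sketch – the actual metrics run inside sandbox Salesforce orgs and match topology far more carefully:

```python
def is_valid(flow):
    """Validity (toy): every connector references a declared element."""
    names = {e["name"] for e in flow["elements"]}
    return all(c["source"] in names and c["target"] in names
               for c in flow["connectors"])

def correctness(generated, reference):
    """Correctness (toy): fraction of ground-truth edges recovered,
    a crude stand-in for full topology and flow-type matching."""
    gen = {(c["source"], c["target"]) for c in generated["connectors"]}
    ref = {(c["source"], c["target"]) for c in reference["connectors"]}
    return len(gen & ref) / max(len(ref), 1)

def reward(generated, reference):
    """Scalar reward for RL: invalid flows score 0, else edge overlap."""
    return correctness(generated, reference) if is_valid(generated) else 0.0
```

Collapsing both checks into a single scalar is what lets the simulator's judgment be fed back directly as a reinforcement learning reward.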
In summary, by reformulating text-to-flow generation as code generation (in a domain specific language) and applying the EGI playbook, we train text-to-flow models that deliver highly accurate production-ready flows in less time.
EGI Phase → Our Build Phase:

1. Synthesize
• Data Curation: thousands of flows annotated by human experts, including for failed prompts, as well as validated model-generated flows from synthetic user prompts.
• Defining a Domain Specific Language (DSL) for flow: a hand-designed Python schema enriched with domain knowledge and real-world constraints (from developer docs).

2. Measure
• Evaluation: automatically measure the correctness (e.g. topology and flow type) and validity (e.g. ability to load and save) of generated flows within sandbox Salesforce orgs.

3. Train
• EGI Fine-Tuning: train EGI models for <prompt> → <chain-of-thought> + <DSL> generation, starting from a strong open-source base model (Mistral-Small (34B)).
• Iterative self-improvement with Reinforcement Learning (RL): train the EGI model in the FlowSim simulation environment using RL with environment rewards.
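The iterative self-improvement step can be sketched at a high level as: sample candidate flows in simulation, score them with the automated metrics, and update the model on the highest-reward generations. Below is a heavily simplified, rejection-sampling-style sketch of that loop – the production setup uses proper RL with environment rewards, and the function names (`generate`, `score`, `fine_tune`) are hypothetical stand-ins for model and simulator calls:

```python
def self_improve(model, prompts, generate, score, fine_tune,
                 samples_per_prompt=4, threshold=0.8):
    """One toy round of simulator-driven self-improvement.

    generate(model, prompt) -> a candidate flow (DSL program)
    score(flow, prompt)     -> simulator reward in [0, 1]
    fine_tune(model, pairs) -> model updated on (prompt, flow) pairs
    """
    keep = []
    for prompt in prompts:
        # Sample several candidates per prompt inside the simulator.
        candidates = [generate(model, prompt)
                      for _ in range(samples_per_prompt)]
        best = max(candidates, key=lambda flow: score(flow, prompt))
        if score(best, prompt) >= threshold:  # keep only high-reward flows
            keep.append((prompt, best))
    return fine_tune(model, keep)
```

Repeating such rounds lets the model bootstrap from its own best generations, which is the sense in which it "continually self-improves from interaction" with FlowSim.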
To benchmark performance, we had flow experts create a challenging test split of highly complex flows for “AI Appdev” – an ambitious ongoing effort toward fully autonomous software development. As the figure below shows, the first generation of A4F models performs modestly on this difficult test set, achieving ready-to-activate rates of 32-35%. We note that ready-to-activate rate is a stringent metric: flow generations that are not deemed “ready to activate” are typically still largely accurate and can be successfully activated with only a few human edits. Next, we benchmark our EGI models and find that they perform significantly better, with the EGI RL model achieving a 48% activation rate (a ~50% relative improvement), despite being trained on 88% less data!
What’s Next
While these early findings showcase the potential of EGI in action, they are only scratching the surface. With Salesforce’s Flow Simulator, we hope to turbocharge EGI model development for a range of enterprise applications within a single comprehensive and tightly integrated ecosystem. Follow us on X to stay tuned for what’s next!

Viraj Prabhu
Research Scientist, AI Research
Viraj Prabhu is a Research Scientist at Salesforce AI working on developing digital AI agents that can perceive, plan, reason, and act in novel environments towards accomplishing complex goals. Previously, he was a graduate student at Georgia Tech, where he earned his PhD (advised by Judy Hoffman) and Master’s (advised by Devi Parikh, and awarded the MS research award) degrees, both in Computer Science. He has over a decade of experience in AI research spanning a diverse range of topics in computer vision, NLP, and multimodal AI.

Zeyuan Chen
Senior Manager, Research
Zeyuan Chen is a Senior Manager of Research at Salesforce AI Research, where he has been contributing since 2019. His work focuses on advancing computer vision, machine learning, multimodal AI, AI agents, and workflow automation through code generation and data visualization. He holds a Bachelor’s degree from Huazhong University of Science and Technology, a Master’s from Cornell University, and a Ph.D. from North Carolina State University, experiences that have shaped his journey in AI research.

Ran Xu
Director, AI Research
Ran Xu received his Ph.D. in computer science from the University at Buffalo in 2015. Currently, he leads a group of exceptional computer vision and multimodal AI researchers at Salesforce to push the boundaries of research and productive AI for CRM.

Denise Pérez
Senior Product Marketing Manager
I am an AI storyteller and thought leader at Salesforce AI Research, where I shape the narrative on what’s next in AI. I help define how tomorrow’s AI is understood today. Since 2021, I’ve been bridging cutting-edge research with real-world impact, translating complex breakthroughs into compelling narratives for Salesforce, our CRM customers, and beyond. I’m passionate about making AI understandable, human, and impossible to ignore.

Silvio Savarese
Executive Vice President and Chief Scientist, Salesforce AI Research
Silvio Savarese is the Executive Vice President and Chief Scientist of Salesforce AI Research, as well as an Adjunct Faculty of Computer Science at Stanford University, where he served as an Associate Professor with tenure until winter 2021. At Salesforce, he shapes the scientific direction and long-term AI strategy by aligning research and innovation efforts with Salesforce’s mission and objectives. He leads the AI Research organization, including AI for C360 and CRM, AI for Trust, AI for developer productivity, and operational efficiency.