10 Steps To Launch an Agentic AI Pilot Program

June 7, 2025

Launching a pilot is a great way to explore artificial intelligence (AI) agents in a controlled setting. You can create an agent, test it, and address any challenges before rolling it out. If you want the pilot to succeed, though, you need to do it right: By some estimates, more than 80% of AI projects fail, twice the failure rate of tech projects that don’t involve AI. (Our recent article on why many agentic AI pilot programs fail offers tips on righting the ship.)

The best pilots are built on a solid plan that lays out how they’ll be designed, executed, and scaled. But you don’t need to figure this out on your own. Check out our 10-step plan below, as well as guidance and insights from companies that have already done the work, including Big Brothers Big Sisters of America, Engine, and Salesforce.

1. Find the right use case

If there’s one piece of advice Travis Gibson, chief technology officer at Big Brothers Big Sisters of America (BBBSA), has for companies that want to launch a pilot, it’s this: “Start with the problem,” he said, “not the technology.”

BBBSA easily identified a process it found challenging. For more than 100 years, the nonprofit’s mission has been to match adults (“Bigs”) with young people (“Littles”) in a mentoring relationship that benefits both. About 135,000 kids currently have a Big, but BBBSA has a waiting list of 30,000 Littles.

In the past, matching specialists needed six to eight weeks to review all the information they’d collected about Bigs and Littles, including their common interests and backgrounds, and make a match. Gibson and his team realized that an AI agent could do this more quickly. So, BBBSA used Agentforce, the Salesforce platform for building AI agents, to create one.

The team built its agent last November and offered it as a pilot to 15 agencies in March. The agent analyzes the data and suggests 8 to 10 potential matches for each kid, explaining why each match might work. Human specialists then review the recommendations and make the final decision.

2. Ask whether agentic AI is the right choice

Once you’ve identified the problem you want to solve, ask yourself which tool is the best fit: generative AI, predictive AI, a chatbot, or an agent.

Generative AI is great for creating content, such as reports or emails, while predictive AI excels at using data and algorithms to predict outcomes and events. Chatbots can handle preprogrammed tasks without many variables, such as a simple product return. “But they’re pretty deterministic, and they can’t behave any other way unless you change the code,” said Irina Gutman, regional vice president, Agentforce accelerator team, Salesforce professional services.

If you need an AI that can make its own decisions, go with an AI agent. It can analyze data, use reasoning, and take action, all while communicating in natural language. “But there’s inherent variability in agentic technology,” said Gutman, “which means you’ll also need to monitor it.”

3. Start small, with a repeatable task

“Customers will say to me, ‘Give me an example of the sexiest, most complex, advanced agent you’ve ever built,’” said Gutman. “And my answer is, ‘Sure, we can do that. But actually, what you want is to start with the most boring, most repeatable, and most low-hanging fruit.’”

In other words, keep it simple, practical, and impactful. That’s what Engine, a business travel management platform, did with its Agentforce pilot. Engine is known for its stellar customer service, offering 24/7, no-hold chat and calls. Every year, the company receives 550,000 customer inquiries, 60% of which are initiated via its mobile chat function.

Engine had grown rapidly and needed to scale, but it wasn’t sure it wanted to add to its head count. To take the strain off its existing customer service agents, the company looked at the easy-to-manage, repeatable tasks it could hand off to an agent.

“We had conversations in which we were asking, ‘Where’s the repetitive work?’” said Joshua Stern, Engine’s director of GTM systems. “‘Where’s the work that you can program or teach an agent to handle, as opposed to the work that’s customer-facing, and where it makes the most sense for our human agents to be spending their time?’”

The answer: An agent could address customers’ most common request, “Cancel my reservation.” It’s a narrowly defined, repeatable task that’s perfect for a pilot, and it requires the autonomy and decision-making skills of an AI agent. For example, the agent would need to decide whether a customer was eligible to cancel a reservation.

4. Need help? Work with a Salesforce partner

Your IT team may feel that it has all the training, tools, and support it needs to launch Agentforce on its own. After all, new research shows that businesses can develop an AI agent 16 times faster with Agentforce than by building one from scratch.

But if your team needs an extra hand in launching an agentic AI pilot program, contact a Salesforce partner. These companies, which are part of the Salesforce ecosystem, can help you build an AI agent. They’re one reason why Engine and BBBSA were so successful with their pilots: Both worked with outside partners.

Engine partnered with Astound Digital last October to build its cancellation agent, Eva (Engine Virtual Assistant), and deployed it in less than two weeks.

BBBSA enlisted Coastal Cloud to build its agent, a process that took two months. “That was the beauty of the solution,” Gibson said. “We didn’t have to hire AI or machine learning developers.” BBBSA relied instead on Coastal Cloud’s Salesforce solution architect and Salesforce administrators. “That allowed us to move fast,” Gibson said, “and get a cost-effective solution.”

To find a partner for your business, contact Salesforce’s professional services team or follow this guide.

5. Make sure your data is in good shape

You know the saying “perfect is the enemy of good”? It applies to your data, too. Agents do need decent data to train on and use for work. “But you may not need the entire set of data to be ready,” Gutman said. “You just have to have enough clean data and integrations to start on your way.”

For BBBSA, this was easy. The nonprofit’s technology strategy for hiring, budget, and match-making was already built around Salesforce’s deeply unified platform. Its data was in good shape, and much of it was managed by Data Cloud, Salesforce’s hyperscale data engine. “One of the great things about the agentic AI solution,” Gibson said, “was that it could use a lot of our unstructured data to match Bigs and Littles on their common likes and dislikes.”

6. Define your agent’s role clearly

If you want your pilot to succeed, your AI agent needs clear instructions on how to do its job. Here’s what Engine wanted Eva to do:

Validate that the customer on the chat was authorized to change their booking (or to do so on the company’s behalf).
Evaluate all existing travel reservations.
Cancel the customer’s requested booking.
Sync the cancellation to Engine’s internal booking platform, which was connected to Agentforce via a custom Application Programming Interface (API).

“We also had to make sure our agent was equipped with the right messaging,” Stern said, “that it was saying the right things, and keeping the customer informed.”
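
The four responsibilities above, plus the customer messaging Stern mentions, can be sketched as a single ordered workflow. This is only an illustration in Python, not Engine’s implementation or the Agentforce API: the `Reservation` and `BookingPlatform` classes, the `handle_cancellation` function, and every field name and message are hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Reservation:
    # Hypothetical record shape; a real booking system will differ.
    reservation_id: str
    company_id: str
    cancellable: bool = True
    status: str = "booked"


@dataclass
class BookingPlatform:
    """Stand-in for an internal booking system reached through a custom API."""
    synced: list = field(default_factory=list)

    def sync_cancellation(self, reservation_id: str) -> None:
        self.synced.append(reservation_id)


def handle_cancellation(customer, reservations, requested_id, platform):
    """Run the four steps in order; return (resolved, message_for_customer)."""
    # Step 1: validate that the customer may change bookings for this company.
    if not customer.get("authorized"):
        return False, "I can't verify your access, so I'm handing you to a teammate."

    # Step 2: evaluate the existing reservations that belong to this customer.
    owned = [r for r in reservations if r.company_id == customer["company_id"]]
    target = next((r for r in owned if r.reservation_id == requested_id), None)
    if target is None:
        return False, "I couldn't find that reservation. A teammate will follow up."
    if not target.cancellable or target.status != "booked":
        return False, "That reservation isn't eligible for cancellation."

    # Step 3: cancel the requested booking.
    target.status = "cancelled"

    # Step 4: sync the cancellation to the internal booking platform.
    platform.sync_cancellation(target.reservation_id)
    return True, f"Reservation {target.reservation_id} has been cancelled."
```

Every exit path returns a message, so the customer is kept informed whether the agent resolves the request or hands it to a human.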

7. Put the right guardrails in place

As much as an agent needs to know what to do, it also needs to know what not to do. If you’re using Agentforce, some of those guardrails will already be in place. Salesforce’s Einstein Trust Layer is designed to reduce bias and toxicity and to support security and compliance.

But every organization needs to create its own guardrails, too. “When we were preparing for our pilot,” said Gibson, “we spent a lot of time with Coastal Cloud going through what we call our standards of practices” — BBBSA’s rules that make sure Bigs and Littles are matched on certain criteria such as gender, lived experiences, career aspirations, and hobbies. The nonprofit trained the agent on research showing what data leads to the longest-lasting matches.

BBBSA then put several guardrails in place. Matches, for example, needed to be the same gender. Some agencies in the pilot wanted to be able to filter for age or religious backgrounds, so the nonprofit included those. But the biggest guardrail BBBSA installed was that the agent could only make recommendations. When it comes to matching a Big and a Little, a human always has the final say.
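
One way to picture these guardrails is as filters applied before any ranking, with the output limited to recommendations that a person must approve. The Python sketch below is an assumption-laden illustration, not BBBSA’s or Coastal Cloud’s logic: the `recommend_bigs` function, the dictionary fields, the `max_age` filter, and the shared-interest score are all hypothetical.

```python
def recommend_bigs(little, bigs, max_age=None, limit=10):
    """Return up to `limit` recommendations; never a final match."""
    recommendations = []
    for big in bigs:
        # Hard guardrail from the article: matches must be the same gender.
        if big["gender"] != little["gender"]:
            continue
        # Optional agency-level filter (hypothetical form of an age filter).
        if max_age is not None and big["age"] > max_age:
            continue
        # Illustrative scoring only: count the interests both people share.
        shared = sorted(set(big["interests"]) & set(little["interests"]))
        if not shared:
            continue
        recommendations.append({
            "big": big["name"],
            "score": len(shared),
            # Each suggestion carries a plain-language reason for the reviewer.
            "why": "Shared interests: " + ", ".join(shared),
            # Biggest guardrail: the agent only recommends; a human decides.
            "status": "pending_human_review",
        })
    # Highest score first; ties are broken alphabetically for stable output.
    recommendations.sort(key=lambda r: (-r["score"], r["big"]))
    return recommendations[:limit]
```

Because the function returns only a ranked list marked for review, no code path lets the agent finalize a match on its own.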

8. Get comfortable with iterating

If you want your agentic AI pilot program to be successful, you’ll have to make a few tweaks along the way.

Engine, for example, spent a week building its agent with Astound Digital. “Then we spent probably a week, maybe a bit longer, doing back-end testing,” Stern said, “and trying to break it, and seeing ‘oh, this is how it broke,’ and fixing a prompt in some way.”

Once past that phase, Engine continued to monitor its agent’s performance. The team learned that it needed to refine the agent-to-human handoff — in particular, human agents needed to get up to speed quickly and not ask customers repetitive questions. “We had to continuously improve on that to make sure we were giving the customers the best experience,” said Stern.

9. Measure the outcomes of your agentic AI pilot

There’s a simple way to see if your pilot’s a success: Look at the numbers. “Sometimes, that means we need to look beneath the surface,” said Gutman, “to measure not just an immediate number, but also what it means to the business.”

For Engine, this was simple. The company went live with its agent in less than two weeks, and Eva now handles 30 to 40 cancellations per week, lightening the load of its human agents.

BBBSA is still in the midst of its six-month pilot, and is looking for more data points, including ones that require time to assess. The nonprofit is tracking whether the agent takes fewer steps to identify potential matches and is evaluating the quality of its recommendations. BBBSA is also measuring agencies’ adoption and engagement rates during the pilot.

Gibson said that while the organization hopes to cut its match time in half, “We also need to use this solution for a while to see whether match retention is increasing.” In other words, it’s not enough for its agent to make recommendations quickly. The quality of the match matters just as much.
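
To make a goal like “cut match time in half” measurable, it helps to agree on the calculation before the pilot ends. The Python sketch below shows one possible approach, assuming match times are logged in days and adoption is tracked per agency. The `pilot_summary` function and its inputs are hypothetical, and any numbers passed to it would be your own pilot data, not figures from BBBSA.

```python
from statistics import median


def pilot_summary(baseline_days, pilot_days, agencies_invited, agencies_active):
    """Compare time-to-match before and during a pilot, plus agency adoption."""
    base = median(baseline_days)
    pilot = median(pilot_days)
    return {
        "baseline_median_days": base,
        "pilot_median_days": pilot,
        # Fraction of the baseline time saved (0.5 means time was halved).
        "reduction": 1 - pilot / base,
        "met_halving_goal": pilot <= base / 2,
        # Share of invited agencies that actually used the agent.
        "adoption_rate": agencies_active / agencies_invited,
    }
```

A summary like this covers only speed and adoption. Match retention, the quality signal Gibson describes, has to be tracked separately over a longer period.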

10. Get ready to scale

Once your pilot has proven successful, how do you scale? Often, one step at a time.

After BBBSA completes its pilot, it plans to roll out its agent over a six-month period, onboarding 10 to 20 agencies a month.

Engine is building its agentic capabilities one phase at a time. After successfully rolling out Eva as a cancellation agent, the company expanded its role to handle FAQs. Engine went live with this new capability in April, and Eva now answers simple questions that previously required a human.

Engine also launched a new pilot: a second AI agent that analyzes Eva’s work. The new agent reviews which prompts work best, how quickly Eva responds, and whether it needs improvement in any areas. All of which foretells a much bigger role for agentic AI. “It’s important, when we’re talking about this stuff, to keep in mind that this is just the beginning,” said Stern. “This is basic stuff we’re testing, and Engine is still in very early stages here.”

Successful agentic AI pilot programs take it step by step

Though you may want to rush into a pilot, you’re more likely to succeed if you take it step by step. Our game plan for designing, executing, and scaling a pilot can help you focus on the problem you need to solve, and implement your agent in an intelligent way.
