Your company may be using generative AI to analyze vast datasets, shape business strategies, streamline interactions, and provide personalized responses to customers. So you may be at the point where you’re asking a critical question about your tech stack: “Should we train our own large language model (LLM) or license a pre-trained LLM?” It’s an important question and, for the majority of companies, taking the pre-trained LLM path is often the smarter choice.
Sure, it sounds appealing to have something completely bespoke from the start, but in reality, LLM training is a large undertaking that might not be worth the time, money, and effort. Between steep costs, legal approvals, difficulty scaling and updating, and a hefty environmental impact, there are a few downsides to consider.
Many leaders don’t realize you can customize an LLM without having to fully train it from the ground up. A pre-trained model (also known as a no-training model) already has foundational knowledge and skills. It also allows you to add your own data to customize it and give it the context needed to yield great results from your generative AI queries.
Here, we’ll unpack three compelling reasons why choosing a pre-trained LLM may be the best choice for your business.
Opening up untapped data better grounds the pre-trained LLM
Today’s public-facing LLMs, like GPT-4 (a pre-trained model that powers ChatGPT), are trained on a vast range of sources across the internet. That information isn’t always trustworthy and won’t serve your business needs without adding the context of your own data. Sure, it may be useful enough to help craft basic, impersonal sales emails or spin up a quick marketing campaign. But to gain more insights specific to your business and to get the most out of your pre-trained LLM, you need to incorporate your own data.
Most companies have a treasure trove of untapped or trapped data: data that is fragmented or siloed within different teams, organizations, or documentation across the company. This data could be structured (Excel spreadsheets or SQL databases, for example) or unstructured (sales pitch emails, information-rich chat logs, PDFs, and the like). In fact, 81% of IT leaders say data silos are holding back their companies, according to a recent MuleSoft report.
Sharing trapped data takes it out of silos and opens it up to people across your organization. It also allows you to ground the LLM, meaning you add the contextual information needed to hone generative AI queries, in a way everyone can access. So when you need to summarize a financial report or train customer service agents, an LLM that uses your real-time data will deliver greater performance and accuracy for your teams. Because it’s grounded with your own data (from financial reports, sales or marketing campaign results, HR history, and more), the LLM will have more context to work with.
“Imagine all of this data that cuts across multiple scenarios and personas being available to you,” said Jayesh Govindarajan, senior vice president of Salesforce AI. “Wouldn’t it be great if all that data could be used effectively as context to an LLM along with the prompt or instruction for a task to be completed?”
Could the results of sales calls benefit marketing? Could customer service queries benefit IT development? The answer to both is yes. And with this approach, you can unlock extremely valuable information sharing across the company.
“Organizations can focus on the data they already know and trust, and, most importantly, own — helping avoid the numerous pitfalls of copyright, toxicity, and unpredictability that so often undercut a generative AI deployment’s reliability,” said Silvio Savarese, Salesforce’s chief scientist leading its AI research team. “And because these datasets are so tightly focused on a domain-specific task, they can train powerful, purpose-built models that do things no general-purpose alternative can touch.”
Get more out of generative AI
Building and training an LLM can cost a company millions depending on its size and needs. You have to gather and prepare vast amounts of data to train it; purchase and assemble computational resources like GPUs and storage space to run it; hire data scientists and natural language and machine learning engineers to build and run it; and more. For example, OpenAI CEO Sam Altman has said it cost the company more than $100 million to train GPT-4, and it cost Google around $190 million to train Gemini Ultra, according to Stanford’s AI Index report. Then you have to account for the time it takes to train the LLM, which can be weeks or even months.
On the other hand, companies using a pre-trained LLM will gain value faster because it’s a plug-and-play model. Adding your own data to the LLM gives it more context to generate better results. You do this by using low-cost prompt grounding and retrieval augmented generation (RAG), an AI technique that automatically retrieves your most current and relevant proprietary data and inserts it directly into your LLM prompt. Doing that can cost hundreds to thousands of dollars. Because you skip the data gathering, hardware, and hiring that training your own LLM requires, a pre-trained model will essentially cost what you pay a provider in monthly or annual service fees, plus the cost of grounding and RAG. That is hundreds of thousands of dollars, even millions, less than training a model yourself.
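To make the idea of grounding a prompt with RAG more concrete, here is a minimal sketch in Python. It is an illustration under loud assumptions, not how any particular product implements it: the three documents are made up, the function names (`retrieve`, `build_grounded_prompt`) are hypothetical, and retrieval uses simple word overlap where a production system would typically use an embedding model and a vector database. The sketch stops at building the prompt and does not call any LLM.

```python
import math
import re
from collections import Counter

# Hypothetical company snippets standing in for "trapped" data.
DOCUMENTS = [
    "Q3 sales calls: customers asked most often about invoice export to Excel.",
    "HR policy: employees accrue 1.5 vacation days per month of service.",
    "Support log: password reset emails are delayed when the mail queue is full.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents whose wording overlaps most with the question."""
    q_vec = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(q_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_grounded_prompt(question, documents, k=1):
    """Insert the retrieved snippets into the prompt as context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


prompt = build_grounded_prompt("Why are password reset emails delayed?", DOCUMENTS)
print(prompt)
```

The point of the design is that the model itself never changes. Updating what the LLM "knows" is a matter of updating the documents that get retrieved, which is why this approach is cheap and update-friendly compared with retraining.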
Between using a pre-trained LLM and training your own is the option to fine-tune a pre-trained model. While this can increase your costs to the tens of thousands of dollars and requires more AI expertise, fine-tuning lets you optimize the pre-trained model for use within a specific industry, domain, or set of problems, Savarese explained. RAG can then be used with the fine-tuned model in the same way as with a pre-trained model.
“The vast majority of companies can get to value very quickly by using the no-training approach,” Govindarajan said. “The no-training approach also solves for grounding the model in the company’s data in such a way that it is update-friendly.”
In addition to the time saved and the near-immediate value, you don’t need to hire as many people to run the LLM. In most cases, it’s almost a no- or low-code process to get your pre-trained model up and running — eliminating the need to hire large teams of engineers and data scientists. You then ground the LLM with your own data to build better prompts, which you can easily update to keep your LLM fresh and relevant.
“Going with a no-training approach, you can plug in your data and start building generative AI experiences without doing much at all on the modeling layer,” Govindarajan said. “An out-of-the-box LLM can already do great summarization because it understands linguistics and text and has great compositional capabilities.”
Pre-trained LLMs are more sustainable
Now that you know you can save money and time by choosing a pre-trained LLM, let’s talk about the planet. AI can unlock insights that could help reduce global greenhouse gas emissions by 5% to 10% by 2030. For example, AI can support climate action by helping governments shape climate strategy, supporting communities by predicting extreme weather events, and assisting organizations in reducing their carbon footprint.
But AI also sucks up a lot of power. LLMs need an immense amount of computational resources to train and operate. The data centers that provide those resources use a lot of energy and water, which results in carbon emissions, water depletion, and an impact on the supply chain.
Training an LLM requires considerable resource use, so anytime you can avoid duplicating that phase, it’s a win for the environment. Training GPT-3 emitted 552 tons of carbon dioxide equivalent. That’s the same as driving a car 1.4 million miles. Instead of training a new model from scratch, grounding a pre-trained LLM with your data while leveraging RAG lets organizations use AI effectively with much less impact.
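The driving comparison above can be checked with simple arithmetic. The sketch below assumes the 552 tons are metric tons and assumes an average gasoline passenger car emits roughly 400 grams of CO2 per mile; both are assumptions for illustration, and the per-mile figure varies by vehicle and source.

```python
# Figure quoted in the article for training GPT-3 (assumed to be metric tons).
GPT3_TRAINING_TONNES_CO2E = 552

# Assumption: rough per-mile emissions of an average gasoline passenger car.
GRAMS_CO2_PER_MILE = 400

GRAMS_PER_TONNE = 1_000_000  # 1 metric ton = 1,000,000 grams


def equivalent_car_miles(tonnes_co2e, grams_per_mile=GRAMS_CO2_PER_MILE):
    """Convert an emissions total into the car-miles that would emit the same."""
    return tonnes_co2e * GRAMS_PER_TONNE / grams_per_mile


miles = equivalent_car_miles(GPT3_TRAINING_TONNES_CO2E)
print(f"{miles:,.0f} miles")  # prints "1,380,000 miles"
```

Under those assumptions the result is about 1.38 million miles, which is consistent with the roughly 1.4 million miles cited.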
Beyond the training phase, the ongoing operation of AI systems also has an environmental impact. This operational impact is correlated with LLM size, measured by the number of parameters that make up the model (usually in the billions or trillions). The larger the model, the greater the impact.
For example, Salesforce AI Research’s xGen-7B LLM has a size of 7 billion parameters while GPT-3 is composed of 175 billion parameters. The smaller model, in addition to using fewer resources for training, also uses less compute in ongoing operation. Under the right circumstances, Savarese said, these small models can offer the best of all worlds: reduced cost, lower environmental impact, and improved performance. They are often equal to large models, he added, when it comes to certain tasks, including knowledge retrieval, technical support, and answering customer questions.
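For a rough sense of scale, the sketch below compares the two parameter counts mentioned above. It assumes each parameter is stored as a 16-bit number (2 bytes), which is a common but not universal choice, and it estimates only the memory needed to hold the weights. It is a proxy for relative size, not a measurement of energy use or cost.

```python
# Parameter counts as stated in the article.
PARAMS = {
    "xGen-7B": 7_000_000_000,
    "GPT-3": 175_000_000_000,
}

# Assumption: weights stored in a 16-bit format (2 bytes per parameter).
BYTES_PER_PARAM = 2


def weight_memory_gb(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Approximate memory needed just to hold the model weights, in gigabytes."""
    return num_params * bytes_per_param / 1e9


size_ratio = PARAMS["GPT-3"] / PARAMS["xGen-7B"]

for name, count in PARAMS.items():
    print(f"{name}: about {weight_memory_gb(count):.0f} GB of weights")
print(f"GPT-3 has {size_ratio:.0f}x the parameters of xGen-7B")
```

Under these assumptions the smaller model needs about 14 GB for its weights versus about 350 GB for the larger one, a 25x difference that carries through to the hardware required every time the model answers a query.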
“Not all of the models need to be these massive models,” said Boris Gamazaychikov, senior manager of emissions reduction at Salesforce. “You can get the work done with smaller, more domain-specific models, which is a big thing Salesforce is working on. We can create an efficient, effective model for CRM use cases. We can layer in your data on top without additional training and that makes things more efficient.”
Another factor to consider is the impact on supply chains related to GPU manufacturing and building data centers. Lead times for the hardware needed to build LLMs and power data centers are growing, and deliveries have slowed. Affordable real estate near power sources is hard to find. And the sustainability of energy use for the data centers housing the GPUs depends on where they’re located. Gamazaychikov said data centers’ emissions will vary widely based on whether local power grids rely on fossil fuels (like in India or the upper Great Plains) or renewable energy (like in Scandinavian countries or Eastern Canada).
“Customers should try to understand the related energy efficiency of the various models and what options they have,” Gamazaychikov said. “For now, it’s unfortunately pretty opaque, but a good proxy is model size (like xGen vs. GPT-3), which is something providers will offer.”
Before you make the investment
Many companies will start using generative AI more in their everyday work. It is an investment, but how much of your budget it takes can vary based on the LLM you choose. Going with a pre-trained model can help you save money and reduce your impact on the environment. With greater access to more company data, your teams can use the LLM more efficiently and more easily ground it with updated data, leading to more productivity across your organization.