“We’re going to fine-tune a model.” Seven times out of ten, it’s not the right solution. Here’s how to tell apart the three main approaches to customizing AI for your context.
Prompt engineering
You write carefully structured instructions, you provide examples in the prompt, and you iterate. No changes to the model, just better input. It’s the cheapest approach, the fastest to deploy, and the easiest to modify.
When to use: for 80% of SMB use cases. If the task can be explained to a human in one page, it can probably be explained to an LLM in a prompt.
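A minimal sketch of what this looks like in practice: clear instructions plus a few examples, nothing else. It uses the OpenAI Python SDK, but any chat API would do; the model name, the labels, and the example tickets are placeholders, not a prescription.

```python
# Prompt engineering only: structured instructions + few-shot examples.
# No changes to the model, just better input.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM = """You are a support-ticket classifier for an SMB.
Return exactly one label: BILLING, TECHNICAL, or SALES.
Answer with the label only, no explanation."""

# A couple of worked examples steer the model's behavior.
FEW_SHOT = [
    {"role": "user", "content": "I was charged twice this month."},
    {"role": "assistant", "content": "BILLING"},
    {"role": "user", "content": "The export button crashes the app."},
    {"role": "assistant", "content": "TECHNICAL"},
]

def classify(ticket: str) -> str:
    messages = [{"role": "system", "content": SYSTEM}, *FEW_SHOT,
                {"role": "user", "content": ticket}]
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return resp.choices[0].message.content.strip()

print(classify("Can I get a quote for 50 extra seats?"))  # expected: SALES
```

Iterating here means editing the instructions and the examples, redeploying in minutes.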
RAG (Retrieval-Augmented Generation)
You index your knowledge base (internal documents, FAQs, manuals, databases). At query time, you retrieve the relevant passages and inject them into the prompt. The model answers using your up-to-date information.
When to use: when your use case depends on information that changes (internal policies, product catalog, technical documentation), or that’s too large to fit in a prompt. It’s the standard approach for internal help chatbots, customer support tools, and document search assistants.
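Here is a deliberately tiny sketch of the index-retrieve-inject loop. It uses TF-IDF retrieval from scikit-learn to stay self-contained; real systems usually use embeddings and a vector store, and the documents and question below are placeholders.

```python
# Minimal RAG loop: index documents, retrieve the best passages, build the prompt.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

DOCS = [
    "Refund policy: customers may request a refund within 30 days of purchase.",
    "Shipping: standard delivery takes 3 to 5 business days within the EU.",
    "Warranty: hardware is covered for 2 years against manufacturing defects.",
]

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(DOCS)  # the "index"

def retrieve(question: str, k: int = 2) -> list[str]:
    # Score every indexed passage against the question, keep the top k.
    q_vec = vectorizer.transform([question])
    scores = cosine_similarity(q_vec, doc_matrix)[0]
    top = scores.argsort()[::-1][:k]
    return [DOCS[i] for i in top]

def build_prompt(question: str) -> str:
    context = "\n".join(retrieve(question))
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")

print(build_prompt("How long do I have to ask for a refund?"))
# The resulting prompt is then sent to the LLM, as in the previous sketch.
```

The key property: updating your knowledge base means re-indexing documents, not retraining anything.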
Fine-tuning
You take a base model and train it on your own examples (typically high-quality question/answer pairs). The model “learns” your style, your business vocabulary, or a specific behavior that’s hard to obtain otherwise.
When to use: when you need a very specific style (brand tone, strict output format), rare business vocabulary, or a precise classification task on a narrow domain. And when you have the data: a minimum of 500-1,000 quality examples, ideally more.
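Most of the work is in the dataset, not the training run. Below is a sketch of preparing the question/answer pairs as JSONL in the chat format most providers expect; the pairs and the system message are placeholders, and a real dataset needs hundreds of consistent examples.

```python
# Prepare a fine-tuning dataset: one JSON record per example, chat format.
import json

examples = [
    {"question": "How do I reset my badge?",
     "answer": "Go to reception with your ID; a new badge is issued within 10 minutes."},
    {"question": "Can I expense a taxi after 10 pm?",
     "answer": "Yes, late-night taxis are reimbursed with a receipt."},
    # ... 500-1,000 more pairs, consistent in tone and format
]

with open("training.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        record = {"messages": [
            {"role": "system", "content": "Answer in the company's concise, friendly tone."},
            {"role": "user", "content": ex["question"]},
            {"role": "assistant", "content": ex["answer"]},
        ]}
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# training.jsonl is then uploaded to the provider's fine-tuning endpoint.
```

If assembling this file already feels impossible, that’s usually the answer: you don’t have the data for fine-tuning yet.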
Quick decision framework
1. Does it work with a good prompt and examples? If yes, stop there.
2. If not, can the missing information be indexed and retrieved on demand? If yes, RAG.
3. If not, do you have 500+ examples of the ideal output? If yes, fine-tuning may be justified. If not, go back to 1.
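The same framework, written as a function to make the order of the questions explicit. The three checks are judgment calls in practice; the booleans here are just stand-ins for those judgments.

```python
# The decision framework above, as code. Each argument is a judgment call.
def choose_approach(good_prompt_works: bool,
                    info_can_be_indexed: bool,
                    has_500_plus_examples: bool) -> str:
    if good_prompt_works:
        return "prompt engineering"   # 1. stop here if a good prompt suffices
    if info_can_be_indexed:
        return "RAG"                  # 2. missing info can be retrieved on demand
    if has_500_plus_examples:
        return "fine-tuning"          # 3. enough examples of the ideal output
    return "back to step 1: improve the prompt"

print(choose_approach(False, True, False))  # -> "RAG"
```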
Winning combinations
RAG + good prompt engineering covers 90% of enterprise AI solutions. Fine-tuning + RAG is for cases where you want a very specific tone on a changing knowledge base. Fine-tuning alone: almost never, except for very narrow classification cases.
The trap: starting with “we’re going to fine-tune” before exhausting the first two options. Fine-tuning costs more, takes longer, and complicates maintenance. It’s a precision tool, not a starting point.