Part 5 · 6 chapters

Training and Fine-Tuning Language Models

Here you learn how models are specialized: pretraining versus fine-tuning, efficient techniques like LoRA, instruction tuning, modern alignment methods, and how to evaluate the result honestly.

Chapter 20

Pretraining vs. Fine-Tuning vs. In-Context Learning

We now have prepared data and a clear picture of how models are built. Part V is about putting that data to work — training and shaping models. But before you train anything, you face a decision that can save you enormous time and money, or cost you both if you get it wrong: *which* of three very different approaches should you use to make a model do what you want? This chapter lays out pretraining, fine-tuning, and in-context learning side by side, and gives you a practical guide for choosing. Getting this choice right is one of the most valuable skills in the whole field, and beginners get it wrong constantly.

Chapter 21

Fine-Tuning Your First Model

Having decided that fine-tuning is genuinely the right tool, it is time to do it. This chapter is a hands-on walkthrough of fine-tuning a model from start to finish: preparing your data, choosing a base model, setting the key knobs, training, and — crucially — checking whether it actually worked. We keep the code at a practical, illustrative level, because the exact libraries change, but the *workflow* is durable. By the end you will understand every step well enough to fine-tune a small model yourself and to read any fine-tuning tutorial without feeling lost.

Chapter 22

Parameter-Efficient Fine-Tuning: LoRA, QLoRA, and PEFT

Fine-tuning as described in the last chapter means, taken literally, adjusting *every* weight in a model. For a model with billions of weights, that demands far more memory and computing power than a single ordinary machine can supply. This chapter explains the family of techniques that sidesteps the problem by asking a liberating question: what if we adjusted only a tiny fraction of the model's weights? The answer, **parameter-efficient fine-tuning**, is what put fine-tuning of large models within reach of individuals. We build the intuition from the ground up, no heavy mathematics required.

Chapter 23

Instruction Tuning and Alignment

We have the tools to fine-tune a model. Now we turn to the two-stage process that uses those tools to accomplish something specific and important: transforming a raw base model — a mere text continuer — into the helpful, honest, well-behaved assistant you actually want to interact with. Chapter 17 covered the *data* for this; here we cover the *process* and the deeper idea behind it, called alignment. This is the stage that makes a model usable and safe, and understanding it clarifies both what today's assistants are and why they sometimes behave as they do.

Chapter 24

RLHF, DPO, and Modern Alignment Methods

Chapter 23 told us that alignment learns from human preferences, and Chapter 18 showed us what that preference data looks like. This chapter closes the loop by explaining *how* preference data is actually turned into a better-aligned model. We will demystify RLHF (the original, powerful, and somewhat painful method) and then meet DPO, the simpler successor that has replaced it in many settings. As always, we favor intuition over equations; by the end you will understand what these intimidating acronyms really do and why the field moved from one to the other.

Chapter 25

Evaluating Models: Benchmarks, Metrics, and Pitfalls

We have now built a model from the ground up — pretrained, fine-tuned, instruction-tuned, and aligned. But a question has been lurking under every chapter of this part: how do you actually *know* it is any good? Evaluation is one of the most underrated skills in all of AI, and one of the easiest to get wrong. This chapter, closing Part V, covers how models are evaluated — benchmarks, metrics, model judges, and humans — and, just as importantly, the many ways evaluation can quietly mislead you. A model is only ever as trustworthy as the evaluation that vouches for it.
