Complete Curriculum

All 51 Chapters

From Data and Models to Autonomous Agents — A Complete Beginner-to-Expert Guide. Each chapter is standalone and includes exercises with detailed solutions.

Chapter 1 · Foundations

What Is an AI Agent? From Chatbots to Autonomous Systems

We begin with the single most important idea in the book: what actually separates an *agent* from a plain chatbot or a script. By the end of this chapter you will be able to look at any AI product and say whether it is agentic, and why.

⏱ 3 min · ✏️ 4 exercises · 🖼 Figures

Chapter 2 · Foundations

Why Now? A Short History of AI, LLMs, and the Agentic Shift

The dream of building machines that can act on our behalf is not new — researchers have chased it since the 1950s, through waves of soaring hope and crushing disappointment. So why are useful agents suddenly possible now, in our decade and not an earlier one? This chapter answers that question by telling the story of how we got here. You do not need any technical background to follow it; think of it as a guided tour through seventy years of trying, failing, and finally succeeding. Understanding this arc will make everything that follows feel less like magic and more like the natural next chapter of a long story.

⏱ 8 min · ✏️ 6 exercises

Chapter 3 · Foundations

Setting Up Your Workspace: Tools, Keys, and Environments

Every craft has a moment before the real work begins, when you lay out your tools and make sure everything is where it should be. A clean, repeatable setup is not glamorous, but it is the difference between spending your evenings building agents and spending them fighting error messages. In this chapter we install everything you need, one piece at a time, and we explain *why* each piece exists so that nothing feels like magic. By the end you will have a working environment and you will have made your very first call to a language model.

⏱ 15 min · ✏️ 6 exercises · 🖼 Figures

Chapter 4 · Foundations

A Gentle Programming Refresher for AI Builders

You do not need to be an expert programmer to build agents, but you do need to be comfortable reading and adapting a small amount of code. This chapter refreshes exactly that — the handful of Python ideas that appear again and again in the rest of the book, plus a glance at JavaScript so it never looks foreign. If you have programmed before, treat this as a warm-up. If you are new or rusty, go slowly and type every example yourself, because code, like a musical instrument, is learned by playing rather than by watching.

⏱ 8 min · ✏️ 6 exercises

Chapter 5 · Foundations

The Math You Actually Need (Intuition First)

Many people approach anything labelled "math" with a flinch, bracing for walls of symbols. This chapter asks for none of that. We are not here to prove theorems; we are here to build *intuition* — a feel — for just three ideas that quietly power everything in AI: vectors, probability, and gradients. Each comes with a picture and a tiny example, and you will never be asked to derive anything. By the end, these words will feel like friends rather than threats, and the inner workings of later chapters will click into place.

⏱ 8 min · ✏️ 6 exercises · 🖼 Figures

Chapter 6 · Machine Learning Essentials

How Machines Learn: Core Concepts

In Chapter 2 we saw the great reversal at the heart of modern AI: instead of writing rules by hand, we show a machine examples and let it find the rules itself. That single idea — learning from data — is what this chapter unpacks. We will meet the three fundamental ways machines learn, define the small vocabulary you will hear in every AI conversation, and understand the one goal that everything in this field quietly serves. No code is required to follow it; the ideas are what matter, and they will anchor everything that comes after.

⏱ 7 min · ✏️ 5 exercises

Chapter 7 · Machine Learning Essentials

Neural Networks from Scratch (The Intuition + a Tiny Build)

"Neural network" is one of those phrases that sounds like it requires a doctorate to understand. It does not. By the end of this chapter you will know exactly what a neural network is, because you will have built the smallest possible one yourself, in a few lines of code. The secret, which the intimidating name hides, is that a neural network is just numbers, multiplication, addition, and one simple bend — repeated many times. We build it from a single piece and assemble upward, so nothing is ever mysterious.
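
As a taste of how small that single piece is, here is a minimal sketch of one artificial neuron in plain Python. The function names, the example numbers, and the choice of ReLU as the "bend" are assumptions made for illustration; the chapter itself may build its neuron differently.

```python
def relu(x):
    # The "bend": negative values become zero, positive values pass through.
    return max(0.0, x)


def neuron(inputs, weights, bias):
    # Multiply each input by its weight, add everything up, add the bias.
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    # Apply the bend to the result.
    return relu(total)


# Hypothetical inputs and weights, chosen only to make the arithmetic easy.
output = neuron([1.0, 2.0], [0.5, -0.25], 0.1)
print(output)
```

A full network is many of these neurons arranged in layers, each feeding its output to the next.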

⏱ 6 min · ✏️ 5 exercises · 🖼 Figures

Chapter 8 · Machine Learning Essentials

Training a Model: Loss, Gradients, and Backpropagation

We left Chapter 7 with a neural network full of random weights, producing nonsense. This chapter answers the question that makes everything else possible: how does that random machine become genuinely skilled? The answer is **training**, and it turns out to be a surprisingly intuitive loop — measure how wrong you are, work out which way is better, take a small step, and repeat. We will build that intuition piece by piece, and even watch a tiny model learn in a handful of lines of code.
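
The loop described above can be sketched in a few lines. This is an illustrative example rather than the chapter's own code: the data, the one-weight model `y = w * x`, the learning rate, and the step count are all assumptions chosen to keep it tiny.

```python
# Toy data that follows y = 2 * x, so the ideal weight is 2.0.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w = 0.0               # start from a wrong guess
learning_rate = 0.05  # size of each small step (an assumed value)

for step in range(200):
    # Measure how wrong we are: the loss is the mean squared error.
    # Work out which way is better: the gradient of that loss with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    # Take a small step in the direction that reduces the loss.
    w -= learning_rate * grad

loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(w, loss)
```

Backpropagation is the technique for computing that gradient efficiently when the model has many layers of weights instead of one.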

⏱ 8 min · ✏️ 5 exercises · 🖼 Figures

Chapter 9 · Machine Learning Essentials

Embeddings: Turning Meaning into Numbers

We have reached one of the most quietly important ideas in all of modern AI. Computers work only with numbers, yet we want them to work with the meaning of words. **Embeddings** are the bridge: they turn words, sentences, and whole documents into lists of numbers arranged so that *meaning becomes geometry*. This single idea powers search, memory, and the retrieval that grounds agents, including the RAG system you will build in Chapter 36. By the end of this chapter, the magic behind that kind of system will be demystified.
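
To make "meaning becomes geometry" concrete, here is a small sketch that compares vectors by the angle between them (cosine similarity). The three-number vectors are invented by hand for illustration; they do not come from a real embedding model, which would produce hundreds or thousands of numbers per item.

```python
import math


def cosine_similarity(a, b):
    # Dot product divided by the product of the two vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy "embeddings" (hypothetical values, not model output).
toy = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "car": [0.1, 0.2, 0.95],
}

sim_close = cosine_similarity(toy["cat"], toy["kitten"])
sim_far = cosine_similarity(toy["cat"], toy["car"])
print(sim_close, sim_far)
```

With these toy values, "cat" scores much closer to "kitten" than to "car". Retrieval relies on this same comparison: it finds the stored vectors most similar to the query's vector.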

⏱ 7 min · ✏️ 5 exercises · 🖼 Figures

Chapter 10 · Inside Large Language Models

The Transformer Architecture, Explained Simply

The transformer is the single most important invention behind modern AI. Nearly every large language model you have heard of is built on it. Yet its core idea is surprisingly graspable, and you do not need any equations to truly understand it. In this chapter we take the transformer apart piece by piece, using pictures and analogies, until the word *attention* stops being jargon and starts being obvious. By the end you will understand not just how a transformer works, but *why* this particular design unlocked everything that followed.

⏱ 10 min · ✏️ 6 exercises · 🖼 Figures

Chapter 11 · Inside Large Language Models

Tokenization: How Text Becomes Tokens

In Chapter 10 we said a model first splits text into *tokens*, then waved our hands and moved on. It is time to lift the lid on that very first step, because it is far more consequential than it appears. Tokenization quietly determines how much you pay, how much text a model can handle at once, and why models sometimes fail at tasks that look trivially easy to a human. This is a longer chapter than its humble subject suggests, and deliberately so: understanding tokenization will save you money, prevent confusion, and demystify a whole category of strange model behavior. We take it slowly, with plenty of examples, and assume no prior knowledge.

⏱ 10 min · ✏️ 6 exercises · 🖼 Figures

Chapter 12 · Inside Large Language Models

Attention and the Context Window

We met attention in Chapter 10 as the idea that every word looks at every other word. Here we deepen that picture and then follow it to one of its most important practical consequences: the **context window**, the fixed amount of text a model can hold in mind at once. The context window shapes how much a model can read, how much it costs, what it forgets, and why techniques like retrieval exist at all. For anyone building agents — which accumulate long histories of steps and observations — managing this limit well is not a side detail; it is a central skill. We take our time, with analogies and concrete strategies throughout.

⏱ 9 min · ✏️ 6 exercises · 🖼 Figures

Chapter 13 · Inside Large Language Models

How LLMs Are Pretrained

We have built up, piece by piece, all the machinery of learning: neural networks, loss, gradients, the training loop. Now we put it to work at a scale that is genuinely hard to imagine, and watch how it produces a large language model. The astonishing part of this chapter is how *simple* the core idea is. An LLM learns from one deceptively humble task — predicting the next token — repeated over a quantity of text no human could read in a thousand lifetimes. Understanding pretraining clarifies the whole pipeline you set out to learn: where a model's knowledge comes from, why it sometimes invents things, and what still has to happen before a raw model becomes the helpful assistant you talk to.

⏱ 9 min · ✏️ 6 exercises · 🖼 Figures

Chapter 14 · Inside Large Language Models

Open vs. Closed Models and the Modern Landscape

You now understand how language models are built. This final chapter of Part III turns practical: with so many models available, how do you choose one for your own project? We will map the landscape — the big split between models you access over the internet and models you run yourself — and lay out a clear framework for deciding. One warning up front, and it matters: the specific models, their names, and their prices change every few months, so this chapter deliberately avoids naming today's leaders. Instead we teach the *durable distinctions and questions* that will still guide you long after this year's leaderboard is forgotten.

⏱ 8 min · ✏️ 6 exercises

Chapter 15 · Data Preparation

Where Training Data Comes From

We now begin the part of the book devoted to data preparation, and it deserves the attention. In Chapter 13 we saw that a model is, in the deepest sense, a reflection of the data it learned from. Everything it knows, every skill it has, and every flaw it carries traces back to that data. This chapter surveys where training data actually comes from, the genuinely thorny legal and ethical questions it raises, and how to think about data quality from the very first step. It assumes no background, and it sets up the hands-on cleaning, instruction, preference, and synthetic-data chapters that follow.

⏱ 8 min · ✏️ 6 exercises

Chapter 16 · Data Preparation

Cleaning, Deduplicating, and Filtering Data

If Chapter 15 was about choosing your ingredients, this chapter is about washing and chopping them. Raw data — especially raw web text — is genuinely filthy: full of duplicates, junk, broken formatting, and content you do not want anywhere near your model. Cleaning it is unglamorous, hands-on work, and it is some of the highest-leverage work in all of machine learning. We will walk through a practical cleaning pipeline step by step, with runnable code you can adapt to your own datasets, and we will see exactly why one step — deduplication — matters far more than beginners expect.

⏱ 7 min · ✏️ 6 exercises · 🖼 Figures

Chapter 17 · Data Preparation

Building Datasets for Instruction Tuning

In Chapter 13 we ended with an unsettling fact: a freshly pretrained model is only a *text continuer*, not a helpful assistant. Ask it a question and it might continue with more questions. This chapter is about the data that fixes that — the instruction–response pairs used to teach a model to actually follow requests. We will see exactly what a good example looks like, what separates a great dataset from a useless one, how much data you really need, and how to build your own. The actual training process comes in Part V; here, our entire focus is the data that makes it work.

⏱ 8 min · ✏️ 6 exercises

Chapter 18 · Data Preparation

Preference and RLHF Data: How Human Feedback Is Collected

Instruction tuning, from the last chapter, teaches a model to follow requests. But it leaves a subtler question unanswered: when two responses are both reasonable, which is *better* — more helpful, more honest, safer, better phrased? Teaching that judgment requires a different kind of data, built not from single right answers but from human *preferences* between alternatives. This chapter explains what preference data is, why it takes the form of comparisons, how it is collected, and the genuine difficulties involved. The training methods that consume this data come in Part V; here we focus entirely on the data itself.

⏱ 8 min · ✏️ 6 exercises · 🖼 Figures

Chapter 19 · Data Preparation

Synthetic Data and Data Augmentation

The last two chapters revealed an expensive truth: good instruction and preference data is largely made by hand, and human effort does not scale cheaply. So a natural question arises — could a capable model help generate the data used to train models? It can, and **synthetic data** has become one of the most important and fastest-growing techniques in the field. But it comes with serious, sometimes subtle dangers. This chapter, closing Part IV, explains what synthetic data and data augmentation are, how they are produced, their genuine advantages, and the risks that make verification absolutely non-negotiable.

⏱ 7 min · ✏️ 6 exercises

Chapter 20 · Training and Fine-Tuning Language Models

Pretraining vs. Fine-Tuning vs. In-Context Learning

We now have prepared data and a clear picture of how models are built. Part V is about putting that data to work — training and shaping models. But before you train anything, you face a decision that can save you enormous time and money, or cost you both if you get it wrong: *which* of three very different approaches should you use to make a model do what you want? This chapter lays out pretraining, fine-tuning, and in-context learning side by side, and gives you a practical guide for choosing. Getting this choice right is one of the most valuable skills in the whole field, and beginners get it wrong constantly.

⏱ 7 min · ✏️ 6 exercises · 🖼 Figures

Chapter 21 · Training and Fine-Tuning Language Models

Fine-Tuning Your First Model

Having decided that fine-tuning is genuinely the right tool, it is time to do it. This chapter is a hands-on walkthrough of fine-tuning a model from start to finish: preparing your data, choosing a base model, setting the key knobs, training, and — crucially — checking whether it actually worked. We keep the code at a practical, illustrative level, because the exact libraries change, but the *workflow* is durable. By the end you will understand every step well enough to fine-tune a small model yourself and to read any fine-tuning tutorial without feeling lost.

⏱ 7 min · ✏️ 6 exercises · 🖼 Figures

Chapter 22 · Training and Fine-Tuning Language Models

Parameter-Efficient Fine-Tuning: LoRA, QLoRA, and PEFT

The fine-tuning of the last chapter, taken literally, means adjusting *every* weight in a model. For a model with billions of weights, that demands enormous memory and computing power — far beyond a single ordinary machine. This chapter explains the clever family of techniques that changed everything by asking a liberating question: what if we only adjusted a tiny fraction of the model? The answer, **parameter-efficient fine-tuning**, is what put fine-tuning of large models within reach of individuals. We build the intuition from the ground up, no heavy mathematics required.

⏱ 7 min · ✏️ 6 exercises · 🖼 Figures

Chapter 23 · Training and Fine-Tuning Language Models

Instruction Tuning and Alignment

We have the tools to fine-tune a model. Now we turn to the two-stage process that uses those tools to accomplish something specific and important: transforming a raw base model — a mere text continuer — into the helpful, honest, well-behaved assistant you actually want to interact with. Chapter 17 covered the *data* for this; here we cover the *process* and the deeper idea behind it, called alignment. This is the stage that makes a model usable and safe, and understanding it clarifies both what today's assistants are and why they sometimes behave as they do.

⏱ 7 min · ✏️ 6 exercises · 🖼 Figures

Chapter 24 · Training and Fine-Tuning Language Models

RLHF, DPO, and Modern Alignment Methods

Chapter 23 told us that alignment learns from human preferences, and Chapter 18 showed us what that preference data looks like. This chapter closes the loop by explaining *how* preference data is actually turned into a better-aligned model. We will demystify RLHF — the original, powerful, and somewhat painful method — and then meet DPO, the simpler successor that has largely replaced it for many uses. As always, we favor intuition over equations; by the end you will understand what these intimidating acronyms really do and why the field moved from one to the other.

⏱ 7 min · ✏️ 6 exercises · 🖼 Figures

Chapter 25 · Training and Fine-Tuning Language Models

Evaluating Models: Benchmarks, Metrics, and Pitfalls

We have now built a model from the ground up — pretrained, fine-tuned, instruction-tuned, and aligned. But a question has been lurking under every chapter of this part: how do you actually *know* it is any good? Evaluation is one of the most underrated skills in all of AI, and one of the easiest to get wrong. This chapter, closing Part V, covers how models are evaluated — benchmarks, metrics, model judges, and humans — and, just as importantly, the many ways evaluation can quietly mislead you. A model is only ever as trustworthy as the evaluation that vouches for it.

⏱ 7 min · ✏️ 6 exercises

Chapter 26 · Using Language Models in Practice

Running Inference: Local and in the Cloud

We have spent five parts understanding how models are built. Now the book pivots to the half you will spend most of your time in: *using* them. This part is about putting a finished model to work, and it begins with the most basic act of all — running the model to get an answer, which is called inference. We will see the two places you can run a model, what actually happens inside when it generates text, the settings that shape its output, and how to think about speed and cost. Everything here is practical and beginner-friendly, and it is the ground floor for building agents.

⏱ 7 min · ✏️ 6 exercises

Chapter 27 · Using Language Models in Practice

Prompt Engineering Fundamentals

If you learn one practical skill from this book, make it this one. Prompt engineering — the craft of writing the input that gets a model to do what you want — is the cheapest, fastest, and highest-leverage skill in all of applied AI. A better prompt costs nothing, takes effect instantly, and frequently outperforms expensive fine-tuning. This chapter teaches the fundamentals with plenty of before-and-after examples, and it assumes nothing. Master these basics and you will solve the large majority of tasks without ever touching training, exactly as Chapter 20 promised.

⏱ 7 min · ✏️ 6 exercises · 🖼 Figures

Chapter 28 · Using Language Models in Practice

Advanced Prompting: Chain-of-Thought, Few-Shot, and Self-Consistency

With the fundamentals of clear, specific prompting in hand, we can add a few powerful techniques that push quality higher on the genuinely hard tasks — the ones involving reasoning, multi-step logic, or a precise format. These techniques are not always needed, and using them where they do not belong just wastes tokens, so we will be equally clear about *when* to use each. By the end you will know how to make a model think more carefully, learn from examples, and double-check itself, and how to tell whether any of it actually helped.

⏱ 7 min · ✏️ 6 exercises · 🖼 Figures

Chapter 29 · Using Language Models in Practice

Structured Outputs and Function/Tool Calling

This is the most important chapter in the book's transition from understanding models to building agents — the hinge on which everything that follows turns. Until now, our models have produced prose for humans to read. But an agent needs two things a chatbot does not: output that *programs* can act on, and the ability to *use tools* to affect the world. Both come from the same underlying capability, and this chapter builds it carefully from the ground up. Take your time here; once tool calling clicks, agents stop being mysterious, because an agent is essentially a loop around the idea you are about to learn.

⏱ 8 min · ✏️ 6 exercises · 🖼 Figures

Chapter 30 · Using Language Models in Practice

Working with LLM APIs in Code

We close Part VI with the practical engineering that surrounds every real model call. Using a model in a playground is easy; using it reliably inside a program means handling the message format, carrying on a conversation, recovering from failures, and keeping costs under control. None of this is glamorous, but it is exactly the difference between a fragile demo and something you can depend on — and it is the immediate groundwork for the agents you will build next. As always, everything is hands-on and beginner-friendly, building on the first API call you made all the way back in Chapter 3.

⏱ 6 min · ✏️ 6 exercises

Chapter 31 · The Core of AI Agents

Anatomy of an Agent: Perception, Reasoning, and Action

Welcome to the heart of the book. Everything so far — how models work, how they are trained, how to use them, and the tool calling of Chapter 29 — was preparation for this part, where we finally build agents in earnest. We begin by laying out the complete anatomy of an agent, expanding the tiny loop from Chapter 1 into a full architecture and naming every component you will build in the chapters ahead. By the end you will have a clear mental blueprint of what an agent *is*, made of parts you already understand, and a working loop in code. This chapter is the map for all of Part VII.

⏱ 7 min · ✏️ 6 exercises · 🖼 Figures

Chapter 32 · The Core of AI Agents

The ReAct Pattern: Reasoning + Acting

In the last chapter we built the agent loop. Now we sharpen the reasoning-and-acting cycle into a specific, powerful pattern called **ReAct** — short for Reasoning + Acting — which was the breakthrough that first made tool-using agents genuinely reliable. The idea is simple to state and surprisingly deep: instead of thinking everything through up front or acting blindly, the agent alternates between thinking and acting, one step at a time. We will implement it from scratch and trace it in detail, so you understand not just how it works but why it works so well.

⏱ 7 min · ✏️ 6 exercises · 🖼 Figures

Chapter 33 · The Core of AI Agents

Tool Use: Giving Agents Hands

Tools are how an agent reaches out and affects the world — the hands attached to the model's brain. Chapter 29 introduced the mechanism of tool calling; this chapter is about the craft of *building* tools well: designing them so the model uses them correctly, validating what the model sends, handling the inevitable failures, and keeping everything safe. Good tools are the difference between an agent that reliably gets things done and one that flails. We will build several real tools and cover the practices that make them dependable.

⏱ 7 min · ✏️ 6 exercises

Chapter 34 · The Core of AI Agents

Memory: Short-Term, Long-Term, and Episodic

An agent that forgets everything the moment it finishes a step cannot pursue a goal, hold a conversation, or learn from experience. Memory is what gives an agent continuity — the ability to carry context across steps and across sessions. This chapter builds the main kinds of agent memory from the ground up, explains when each is needed, and reveals a satisfying connection: long-term memory for agents turns out to be the retrieval you already learned, applied to the agent's own past. By the end you will know how to give an agent a working memory and a lasting one.

⏱ 7 min · ✏️ 6 exercises

Chapter 35 · The Core of AI Agents

Planning and Task Decomposition

Some goals are too big to accomplish in a single step. "Research a topic and write a report" is not one action but dozens, and an agent that tries to do it all at once will flounder. Planning is how an agent tackles such goals: by breaking them into smaller, achievable pieces and working through them. This chapter covers task decomposition, the two broad styles of planning, and the crucial tension between planning ahead and adapting as you go — a tension you have already glimpsed in ReAct. Planning is where the components of Part VII come together to handle real complexity.

⏱ 7 min · ✏️ 6 exercises

Chapter 36 · The Core of AI Agents

Retrieval-Augmented Generation (RAG)

A language model is a brilliant mind locked in a room with no library. It knows nothing about your company's documents, last week's meeting notes, or the PDF sitting on your desktop — and when you ask it about something it has not seen, it will sometimes invent a confident, fluent, and completely wrong answer. *Retrieval-Augmented Generation*, or RAG, is the single most important technique for fixing both problems at once. In this chapter we build a complete RAG pipeline from scratch, one stage at a time, so that you understand every moving part before any framework hides them from you.

⏱ 10 min · ✏️ 6 exercises · 🖼 Figures

Chapter 37 · The Core of AI Agents

Vector Databases and Semantic Search

We close the core of agents with the storage layer that quietly powers two of its most important capabilities. Both the RAG of Chapter 36 and the agent memory of Chapter 34 work by storing embeddings and finding the most similar ones — and when you have more than a handful of items, you need a specialized tool to do that quickly: a vector database. This chapter explains what one is, why a plain list does not scale, how vector databases achieve their speed, and how to use one in practice. It is the infrastructure beneath retrieval and memory, and it completes Part VII.

⏱ 7 min · ✏️ 6 exercises

Chapter 38 · Building Real-World Agents

Agent Frameworks Overview: LangGraph, CrewAI, and Agent SDKs

You have built agents from scratch in Part VII — the loop, tools, memory, planning. That foundation is exactly what lets this part go faster, because now we meet the **frameworks** that package all of that machinery for you. This chapter is a tour of the major framework families, what each is good at, and how to choose among them without getting swept up in hype. Because this is a fast-moving area, we focus on durable patterns rather than today's exact APIs, and we lean on a reassuring truth: every framework is built from the concepts you already understand.

⏱ 7 min · ✏️ 6 exercises

Chapter 39 · Building Real-World Agents

Building a Single Agent with LangGraph

Time to build. In this chapter we take one framework — LangGraph — and use it to construct a complete single agent, turning the loop you already know into an explicit, controllable graph. LangGraph represents an agent as a graph of steps, which makes its control flow visible and easy to shape, and it is an excellent vehicle for understanding how a real framework organizes an agent. The code here shows the *shape* of LangGraph rather than its exact, ever-changing API, so the lessons transfer even as details evolve; always check the current documentation when you build for real.

⏱ 7 min · ✏️ 6 exercises · 🖼 Figures

Chapter 40 · Building Real-World Agents

The Model Context Protocol (MCP)

As you connect agents to more and more tools and data sources, a quiet problem emerges: every connection is custom, one-off wiring. The Model Context Protocol, or MCP, is the standard that solves this — a universal way for agents to plug into tools and data, much as a single port lets any device connect to any computer. MCP has gone from a niche idea to a widely adopted standard in a remarkably short time, and understanding it is increasingly essential for building real agents. This chapter explains what MCP is, why a standard matters so much, and how to use and build with it.

⏱ 7 min · ✏️ 6 exercises · 🖼 Figures

Chapter 41 · Building Real-World Agents

Multi-Agent Systems and Orchestration

So far we have built single agents. But some problems are best tackled not by one generalist agent but by a *team* of specialized agents working together — a researcher, a writer, a reviewer, each focused on what it does best. This chapter explores multi-agent systems: how several agents collaborate, the patterns for coordinating them, and — just as importantly — when a team of agents helps and when a single well-designed agent is the wiser choice. As always, we favor clear thinking over hype, because multiplying agents is as easy to overdo as it is to underuse.

⏱ 7 min · ✏️ 6 exercises

Chapter 42 · Building Real-World Agents

Giving Agents Real Tools: Web, Code, Files, and APIs

An agent is only as capable as the tools it can reach, and the toy tools of earlier chapters — a calculator, a clock — were just warm-ups. This chapter connects agents to the messy, powerful real world: browsing the web, running code, reading and writing files, and calling external services. These capabilities are what make agents genuinely useful — and genuinely risky. So we cover each with two lenses always in view: **reliability** (real tools fail constantly) and **security** (real tools are where an agent can do real harm). This is where careful engineering separates a robust agent from a dangerous demo.

⏱ 7 min · ✏️ 6 exercises

Chapter 43 · Advanced and Cutting-Edge Topics

Agentic RAG and GraphRAG

Part IX takes you to the cutting edge — the techniques and concerns that separate a hobby agent from a production-grade one. We begin where retrieval left off. The basic RAG of Chapter 36 was a fixed pipeline: embed the question, retrieve, generate. It works, but it is rigid. This chapter shows two ways retrieval grows up: **agentic RAG**, where the agent decides for itself when and how to retrieve, and **GraphRAG**, which retrieves over a web of connected facts rather than isolated chunks. Both make retrieval smarter, and both build directly on what you already know.

⏱ 7 min · ✏️ 6 exercises · 🖼 Figures

Chapter 44 · Advanced and Cutting-Edge Topics

Evaluating and Observing Agents

Chapter 25 taught you to evaluate models; agents are harder still. A model produces one output you can judge, but an agent takes many steps — reasoning, calling tools, observing results — and any of them can go wrong. To know whether an agent works, you must be able to *see* what it did and *judge* the whole path it took, not just its final answer. This chapter covers observability (seeing the agent's steps) and evaluation (judging its behavior), the twin disciplines that turn a black box into something you can trust and improve. For agents that take real actions, this is not optional.

⏱ 7 min · ✏️ 6 exercises

Chapter 45 · Advanced and Cutting-Edge Topics

Guardrails, Safety, and Security

This chapter has been foreshadowed since Chapter 1, and now it arrives. An agent that takes actions in the world can cause real harm in a way a chatbot never could — and the more capable your agents become, the more this matters. We confront agent safety and security directly: the central threat of prompt injection, the danger of tool misuse, and the layered guardrails that keep agents in check. None of this is about fear; it is about building responsibly. An agent you cannot keep safe is an agent you should not deploy, and this chapter is how you keep it safe.

⏱ 7 min · ✏️ 6 exercises

Chapter 46 · Advanced and Cutting-Edge Topics

Small Models, Local Agents, and Cost Optimization

There is a powerful instinct, when building agents, to reach for the biggest, most capable model for everything. It is usually a mistake — an expensive one. Many tasks do not need a frontier model, and using one anyway wastes money and time. This chapter is about doing more with less, deliberately: when a small model is the right choice, how local agents fit in, the routing pattern that combines small and large models, and the concrete levers for controlling cost. Right-sizing your models to your tasks is one of the most practical skills in production agent building.

⏱ 6 min · ✏️ 6 exercises

Chapter 47 · Advanced and Cutting-Edge Topics

Deploying Agents to Production

A working agent in a notebook is a wonderful thing — and it is not a product. Production means turning that prototype into a service others can rely on: reliable when things go wrong, monitored so you know how it behaves, secure against attack, affordable at scale, and maintainable as it evolves. This chapter covers the gap between "it works on my machine" and "it works for real users," pulling together threads from across the book into a practical guide for shipping agents responsibly. It is the difference between a demo and dependability.

⏱ 7 min · ✏️ 6 exercises

Chapter 48 · Advanced and Cutting-Edge Topics

The Frontier: Latest Developments and What Comes Next

We close Part IX by looking outward — at where the field is heading and, far more usefully, at how to keep up as it gets there. A chapter titled "the latest developments" risks being out of date by the time you read it, so this one is built differently. Rather than a snapshot of today's headlines, it offers durable directions, the problems that will stay hard for a while, the fundamentals that will not change, and a practical habit for staying current. The half-life of a specific tool is short; the half-life of a way of thinking is long, and that is what we aim for here.

⏱ 6 min · ✏️ 6 exercises

Chapter 49 · Capstone Projects

Capstone 1: A Research Assistant Agent

You have learned every piece of agent building; now you put them together. The capstones are projects — complete agents you build end to end, applying the concepts from across the book. Our first capstone is a research assistant: give it a question, and it searches the web, reads sources, and produces a clear summary with citations. As we build, watch how naturally the pieces from earlier chapters click into place. The goal is not just a working agent but the satisfying realization that you already know how to make one.

⏱ 6 min · ✏️ 6 exercises · 🖼 Figures

Chapter 50 · Capstone Projects

Capstone 2: A Coding Agent with MCP

Our second capstone is more ambitious and more dangerous: a coding agent that reads a codebase, makes a change, runs the tests, and reports the result — connected to real developer tools through MCP (Chapter 40), and kept safe by the guardrails of Chapter 45. Coding agents are among the most useful and most popular agents being built today, and they bring together tool use, code execution, MCP, and safety in a single project. Because this agent touches files and runs code, safety is not a footnote here; it is woven through every step.

⏱ 6 min · ✏️ 6 exercises · 🖼 Figures

Chapter 51 · Capstone Projects

Capstone 3: A Multi-Agent Workflow

Here is the capstone of capstones — and the final chapter of the book. We build a small team of agents that collaborate to plan, execute, and review a real task from beginning to end. This project pulls together nearly everything you have learned: planning, the agent loop, tools, multi-agent coordination, review, and observability. If you can build and understand this, you have genuinely earned the title this book promised. We will build it step by step, and then close with a reflection on the journey from zero to here.

⏱ 6 min · ✏️ 6 exercises · 🖼 Figures