Part VII · The Core of AI Agents

Chapter 31 · Anatomy of an Agent: Perception, Reasoning, and Action


Welcome to the heart of the book. Everything so far — how models work, how they are trained, how to use them, and the tool calling of Chapter 29 — was preparation for this part, where we finally build agents in earnest. We begin by laying out the complete anatomy of an agent, expanding the tiny loop from Chapter 1 into a full architecture and naming every component you will build in the chapters ahead. By the end you will have a clear mental blueprint of what an agent *is*, made of parts you already understand, and a working loop in code. This chapter is the map for all of Part VII.

Recap: The Agent Loop

Two ideas from earlier converge here. In Chapter 1 we met the agent loop — perceive, reason, act, observe, repeating until a goal is met. In Chapter 29 we made it concrete: an agent is a loop around tool calling, where the model decides which tool to use, your code runs it, and the result feeds back. Now we expand that loop into a complete architecture, so you can see all the moving parts and how they fit together.

The Anatomy: Five Components

Strip any agent down — from the simplest to the most sophisticated — and you find the same five components. Each maps to something you already know or are about to learn.

The model: the reasoning engine, the "brain" that decides what to do. This is the LLM from Parts III to V.
Tools: the "hands" that let the agent act in the world, introduced in Chapter 29 and detailed in Chapter 33.
Memory: the "notebook" that lets the agent remember across steps and sessions, covered in Chapter 34.
The loop (orchestration): the machinery that keeps the cycle running and decides when to stop, covered in this chapter and in the frameworks of Part VIII.
The goal and instructions: what the agent is trying to achieve and how it should behave, set in the system message (Chapter 30).

Figure 31.1 — Agent anatomy: a model (brain) at the center, with tools (hands), memory (notebook), a goal, and the loop that orchestrates them.
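
To make the five components concrete, here is a minimal sketch of how they might be bundled into one object. The `Agent` class and its field names are illustrative assumptions, not something the later chapters depend on, and the model is a stand-in lambda rather than a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Agent:
    """Hypothetical container for the five components (names are illustrative)."""
    model: Callable[[list], Any]                      # the brain: messages in, decision out
    tools: dict                                       # the hands: tool name -> function
    instructions: str                                 # the goal and behavior rules
    memory: list = field(default_factory=list)        # the notebook: notes kept across steps
    max_steps: int = 10                               # the loop: a bound for orchestration


agent = Agent(
    model=lambda messages: "final answer",            # stand-in for a real LLM call
    tools={"add": lambda a, b: a + b},
    instructions="Answer arithmetic questions.",
)
```

Nothing here runs a loop yet; the point is only that each component has an obvious home in code.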

Perception: Taking In Information

Perception is everything the agent takes in before it reasons: the goal it was given, the conversation so far, the results of any tools it has used, and anything pulled from its memory. In practice, all of this is assembled into the model's input — which means it all shares the context window from Chapter 12. Perception, for an agent, is largely the art of deciding what to put into that limited context at each step.
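
One way to picture perception is as a function that assembles the model's input for a single step. The sketch below is an assumption-laden illustration: `build_context` and its parameters are hypothetical, and it counts characters as a crude stand-in for tokens. It always keeps the instructions, notes from memory, and the goal, then adds the newest history entries until the budget runs out.

```python
def build_context(instructions, goal, memories, history, max_chars=2000):
    """Assemble one step's model input from everything the agent perceives.

    Characters are a rough proxy for tokens; a real agent would count tokens.
    """
    messages = [{"role": "system", "content": instructions}]
    if memories:
        notes = "\n".join(f"- {m}" for m in memories)
        messages.append({"role": "system", "content": "Relevant notes:\n" + notes})
    messages.append({"role": "user", "content": goal})

    used = sum(len(m["content"]) for m in messages)
    kept = []
    for message in reversed(history):          # walk from newest to oldest
        used += len(message["content"])
        if used > max_chars:
            break                              # older entries no longer fit
        kept.append(message)
    return messages + kept[::-1]               # restore chronological order


history = [
    {"role": "tool", "content": "x" * 50},     # older result
    {"role": "tool", "content": "y" * 50},     # newer result
]
context = build_context(
    "Be brief.", "Find flights.", ["User prefers mornings."], history, max_chars=150
)
```

With this small budget only the newer tool result survives, which is the trade-off perception has to make at every step.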

Reasoning: Deciding What to Do

Reasoning is the model's core job in the loop: given everything it perceives, decide the next action. Usually that decision is one of two things — call a tool (to gather information or act) or give the final answer (because the goal is met). This is exactly the tool-calling decision from Chapter 29, now framed as the thinking step of an ongoing loop. The model is the brain, and reasoning is what the brain does each time around.
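
The two-way decision can be represented as a small data type. The sketch below assumes, purely for illustration, that the model was told to emit JSON such as `{"tool": "search", "args": {...}}` when it wants to act and plain text when it is done; real tool-calling APIs return structured tool calls instead, so treat `parse_decision` as a hypothetical helper.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Decision:
    is_final_answer: bool
    text: str = ""                              # used when this is a final answer
    tool: str = ""                              # used when this is a tool call
    args: dict = field(default_factory=dict)


def parse_decision(raw: str) -> Decision:
    """Turn raw model output into a Decision (assumed JSON-or-text convention)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return Decision(is_final_answer=True, text=raw)    # plain text: final answer
    if isinstance(data, dict) and "tool" in data:
        return Decision(is_final_answer=False, tool=data["tool"],
                        args=data.get("args", {}))
    return Decision(is_final_answer=True, text=raw)         # valid JSON, but not a tool call


call = parse_decision('{"tool": "search", "args": {"q": "flights"}}')
answer = parse_decision("The cheapest flight is on Tuesday.")
```

The field names match the attributes the loop later in this chapter reads from its decision object.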

Action: Affecting the World

Action is carrying out the model's chosen step by executing a tool — running a search, doing a calculation, calling an API, writing a file. As we stressed in Chapter 29, the model only requests the action; your code executes it, which is where control and safety live. Action is the moment an agent stops merely thinking and actually does something.
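
Because your code executes the action, it can also refuse or contain it. Here is a minimal sketch of that idea, assuming a hypothetical `registry` dictionary that acts as an allowlist of tools. Note that this `run_tool` takes the registry as an extra argument for self-containment, which differs from the two-argument placeholder used later in the chapter.

```python
def run_tool(name, args, registry):
    """Execute a requested tool, but only if it appears in the registry (an allowlist)."""
    if name not in registry:
        return {"ok": False, "output": f"unknown tool: {name}"}
    try:
        return {"ok": True, "output": registry[name](**args)}
    except Exception as exc:                   # report the failure instead of crashing the loop
        return {"ok": False, "output": f"{type(exc).__name__}: {exc}"}


registry = {"add": lambda a, b: a + b}         # the only tool this agent may use

good = run_tool("add", {"a": 2, "b": 3}, registry)
blocked = run_tool("delete_files", {}, registry)
```

Returning a result dictionary rather than raising keeps failures visible to the model, so it can try something else on the next round.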

Observation: Reading the Result

Observation is reading the result of an action and feeding it back into the agent's perception for the next round. The search returned these results; the calculation gave this number; the API call succeeded or failed. This is the step that closes the loop — without it, the agent would act blindly and never learn what its actions accomplished. Observation is what makes the cycle a cycle.
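
In code, observation often amounts to packaging a tool result as a message the model can read on the next round. The helper below is a hypothetical sketch: the name `to_observation`, the message wording, and the truncation limit are all assumptions chosen for illustration.

```python
def to_observation(tool_name, ok, output, max_chars=500):
    """Package a tool result as a history message for the next round."""
    text = str(output)
    if len(text) > max_chars:
        text = text[:max_chars] + " ...[truncated]"    # keep huge results from flooding the context
    status = "succeeded" if ok else "failed"
    return {"role": "tool", "content": f"{tool_name} {status}: {text}"}


success = to_observation("search", True, "3 results")
failure = to_observation("api_call", False, "timeout")
```

Stating success or failure explicitly gives the model what it needs to decide whether to retry, change course, or finish.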

Putting the Loop Together in Code

Here is the tiny agent from Chapter 1, now fleshed out into the smallest real agent — a loop that perceives, reasons, acts, and observes, using the tool-calling idea from Chapter 29.

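```python
# model_decide and run_tool are placeholders: the first wraps your LLM call,
# the second looks up and executes the requested tool.
def run_agent(goal, tools, max_steps=10):
    history = [{"role": "user", "content": goal}]   # perception starts with the goal

    for step in range(max_steps):                    # the loop, with a safety bound
        decision = model_decide(history, tools)      # REASON: tool call or final answer?

        if decision.is_final_answer:
            return decision.text                     # goal met -- we are done

        # Record the requested action so later steps can see what was tried
        history.append({"role": "assistant",
                        "content": f"call {decision.tool}({decision.args})"})
        result = run_tool(decision.tool, decision.args)   # ACT (your code executes)
        history.append({"role": "tool", "content": str(result)})  # OBSERVE: feed result back

    return "Stopped: reached the step limit without finishing."
```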

Read it against the four steps: the goal seeds perception, `model_decide` reasons, `run_tool` acts, and appending the result observes; then the loop repeats. This handful of lines is the skeleton of every agent in this book. Everything else adds capability around this core.

State: What the Agent Carries Between Steps

Notice the `history` variable threading through the loop. That is the agent's state: the goal, plus the growing record of actions taken and results observed. State is what lets each step build on the last instead of starting fresh. But notice, too, that it grows with every step, and all of it competes for the context window (Chapter 12). Managing this accumulating state (keeping what matters, summarizing or dropping the rest) is one of the central challenges of agent building, and the reason memory (Chapter 34) exists.
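
As a rough illustration of "keeping what matters," the sketch below keeps the goal and the most recent messages and collapses everything in between into a single note. The function `compact_history` and its parameters are hypothetical, and the default summary is only a placeholder count; a real agent might ask the model to write the summary instead.

```python
def compact_history(history, keep_recent=4, summarize=None):
    """Keep the goal and the newest messages; collapse the middle into one note."""
    if len(history) <= keep_recent + 1:
        return list(history)                    # already small enough
    cut = len(history) - keep_recent
    goal, middle, recent = history[0], history[1:cut], history[cut:]
    if summarize is None:
        # Placeholder summary; swap in a model-written summary if you have one.
        summary = f"[{len(middle)} earlier messages omitted]"
    else:
        summary = summarize(middle)
    return [goal, {"role": "system", "content": summary}] + recent


history = [{"role": "user", "content": "Find the cheapest flight."}]
history += [{"role": "tool", "content": f"result {i}"} for i in range(10)]
compacted = compact_history(history, keep_recent=3)
```

The goal always survives compaction, because an agent that forgets what it was asked to do cannot finish.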

Why the Loop Matters

The loop is precisely what separates an agent from a single model call. One call is a single shot: ask, answer, done. The loop lets the agent take many steps — gathering information, trying things, recovering from errors, and adapting as it goes. A question that no single response could answer becomes solvable when the model can search, read the result, search again, and reason over what it found. The loop turns a one-shot text generator into something that can pursue a goal.

When Does the Loop Stop?

A loop that never ends is a serious bug, so every agent needs clear stopping conditions. The natural one is success: the model decides the goal is met and gives a final answer. But you must also bound the loop defensively — with a maximum number of steps (as in the code above), a budget on cost or time, and a way to bail out on repeated errors. Without these guards, a confused agent can spin forever, calling tools in circles and running up cost.
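
These guards can sit side by side in one loop. The sketch below is a minimal illustration under stated assumptions: `step_fn` is a hypothetical callable that performs one reason-act-observe round and reports `("final", answer)`, `("continue", None)`, or `("error", message)`. It covers the step limit, a time budget, and repeated errors; a cost budget could be checked in the same way.

```python
import time


def run_bounded(step_fn, max_steps=10, max_seconds=30.0, max_consecutive_errors=3):
    """Call step_fn() until it returns a final answer or a guard trips."""
    deadline = time.monotonic() + max_seconds
    errors = 0
    for _ in range(max_steps):
        if time.monotonic() > deadline:
            return "stopped: time budget exhausted"
        status, value = step_fn()
        if status == "final":
            return value                        # success: the natural stopping condition
        errors = errors + 1 if status == "error" else 0   # a good step resets the streak
        if errors >= max_consecutive_errors:
            return "stopped: too many consecutive errors"
    return "stopped: step limit reached"


scripted = iter([("continue", None), ("final", "42")])    # a fake two-step run
outcome = run_bounded(lambda: next(scripted))
```

Each guard returns a distinct message, so the caller can tell a finished agent from one that was cut off.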

The Components Map to the Rest of Part VII

This chapter gave you the anatomy and the loop; the rest of Part VII fills in each component in depth. Chapter 32 refines the reasoning-and-acting cycle into the ReAct pattern. Chapter 33 develops tools — the hands. Chapter 34 builds memory — the notebook. Chapter 35 adds planning — decomposing big goals into steps. And Chapters 36 and 37 cover retrieval and vector databases — how agents access knowledge and remember at scale. Keep this map in mind, and each chapter will slot into place.

Summary

An agent is built from five components: a model (the reasoning brain), tools (the hands), memory (the notebook), the loop that orchestrates them, and a goal that directs them. It runs the perceive–reason–act–observe cycle: perception assembles everything into the model's context, reasoning decides the next action, action executes a tool through your code, and observation feeds the result back to close the loop. The agent carries state — the goal and the growing history of actions and observations — which accumulates in the context window and must be managed. The loop, properly bounded with stopping conditions, is what turns a single model call into a goal-pursuing agent, and the rest of Part VII develops each component in turn.

With the anatomy in place, Chapter 32 sharpens the most important pattern for the reasoning-and-acting cycle: ReAct, which interleaves thinking and acting and was an early, influential technique for making tool-using agents more reliable.

Practice

Exercises

1. Draw the agent architecture from this chapter, labeling all five components, and write one sentence on what each contributes.
2. Take a real task you would want an agent to do (for example, 'find the three cheapest flights to a city next month') and map it onto the perceive–reason–act–observe cycle in detail, naming the tools it would need.
3. Explain what would break if you removed the 'observe' step from the loop. Why is feeding results back essential?
4. Implement the `run_agent` loop skeleton from this chapter using placeholder functions. Confirm it stops both when a final answer is given and when the step limit is reached.
5. Explain, in your own words, why an agent loop can solve problems that a single model call cannot. Give a concrete example.
6. Describe three different stopping conditions an agent should have, and explain what could go wrong if the loop had none of them.
