Part 6 · Using Language Models in Practice

Chapter 29Structured Outputs and Function/Tool Calling

⏱ 8 min read·✏️ 6 exercises·🖼 1 figure·Using Language Models in Practice

This is the most important chapter in the book's transition from understanding models to building agents — the hinge on which everything that follows turns. Until now, our models have produced prose for humans to read. But an agent needs two things a chatbot does not: output that *programs* can act on, and the ability to *use tools* to affect the world. Both come from the same underlying capability, and this chapter builds it carefully from the ground up. Take your time here; once tool calling clicks, agents stop being mysterious, because an agent is essentially a loop around the idea you are about to learn.

Why Agents Need More Than Prose

Everything so far has treated the model's output as text for a person to read. That is perfect for a chatbot, but useless for automation. If your program asks a model to "extract the customer's name and order number," it cannot easily use a flowing paragraph — it needs structured data it can read reliably. And a model on its own cannot check today's weather, do exact arithmetic, or look something up in your database; it can only generate text. To do real work, an agent needs both structured outputs (machine-readable responses) and tool calling (the ability to use external capabilities). This chapter covers both, in that order, because tool calling builds directly on structured output.

Structured Outputs: Machine-Readable Responses

A structured output is a response in a fixed, predictable format that code can parse — most commonly JSON, the simple key-value format you met in earlier chapters. Instead of "The customer is Maria Lopez and her order is 4471," you ask the model to return a tidy object your program can read directly. The difference between prose and structure is the difference between something a human reads and something a program uses.

Getting Reliable JSON

To get structured output, you tell the model exactly the format you want — ideally showing the shape — and many providers offer a dedicated structured-output mode that guarantees valid JSON. Clear instructions plus an example of the desired shape go a long way.

python

prompt = (
    "Extract the name and order number from the message below. "
    'Respond with ONLY a JSON object like {"name": "...", "order": ...}. '
    "No extra text.\n\n"
    "Message: Hi, this is Maria Lopez, my order 4471 hasn't arrived."
)
# Expected response:  {"name": "Maria Lopez", "order": 4471}

Validating the Output

Models can occasionally produce malformed JSON — a stray word, a missing bracket. So you never blindly trust the output; you parse it safely and handle failure, exactly the verification mindset from Chapter 19. This single habit prevents a whole class of crashes.

python

import json

def parse_safely(text):
    try:
        return json.loads(text)          # success: usable structured data
    except json.JSONDecodeError:
        return None                       # failure: handle it, don't crash

data = parse_safely(response_text)
if data is None:
    # retry, ask the model to fix it, or fall back gracefully
    ...

Tool Calling: Giving the Model Hands

Now the centerpiece. A model is, as we said in Chapter 1, a brain in a jar — brilliant at reasoning, but with no way to act. Tool calling gives it hands. The idea: you describe a set of tools the model may use — a web search, a calculator, a database lookup, anything — and when the model decides it needs one, it produces a structured request to call that tool with specific arguments. Your code runs the tool and hands the result back, and the model continues with that new information. Notice that this depends entirely on structured output: the model's request to call a tool is a structured object, which is why we built that idea first.

How Tool Calling Works, Step by Step

The full cycle has six steps, and you will recognize the shape immediately — it is the agent loop from Chapter 1, now made concrete.

You define the tools — each with a name, a description of what it does, and the parameters it accepts.
You send the request plus the tool definitions — the user's question and the list of tools the model is allowed to use.
The model decides — it either answers directly, or returns a structured request to call a specific tool with specific arguments.
Your code executes the tool — runs the actual search, calculation, or lookup the model asked for.
You send the result back — feeding the tool's output to the model.
The model produces the final answer — or decides to call another tool, and the loop repeats.

Figure 29.1 — The tool-calling loop: the model decides to call a tool, your code executes it and returns the result, and the model continues — the agent loop from Chapter 1 made real.

A Concrete Example

Let us make it tangible with a calculator tool — something models genuinely benefit from, since exact arithmetic is not their strength. First we define the tool, then we walk the flow.

python

# 1. Define the tool: a name, a description, and its parameters.
tools = [{
    "name": "calculator",
    "description": "Evaluate a basic arithmetic expression and return the result.",
    "parameters": {"expression": "a math expression, e.g. '23 - 8 + 12'"},
}]

# 4. The function your code runs when the model asks for the tool.
def calculator(expression):
    return eval(expression)        # (use a safe evaluator in real code)

# 2-3. Send the request with the tools; the model may return a tool call:
#      {"tool": "calculator", "arguments": {"expression": "23 - 8 + 12"}}
# 4-5. You run calculator("23 - 8 + 12") -> 27, and send 27 back.
# 6.   The model replies: "The shop now has 27 apples."

Follow the numbers and you can see the whole dance: the model recognized it needed arithmetic, asked for the calculator, your code did the math, and the model wove the result into a natural answer. The model supplied the reasoning; your tool supplied the capability.

The Model Doesn't Run the Tool — You Do

This point is crucial and often misunderstood. The model never executes anything itself. It only requests a tool call by producing a structured message; your code decides whether and how to run it. This is not a limitation — it is the central safety property of tool calling. Because execution is entirely in your hands, the model cannot do anything you have not explicitly built and permitted. You control which tools exist, what they are allowed to do, and whether to honor any given request. Keep this firmly in mind; it is the foundation of building agents safely, a theme we return to in Part IX.

Why This Is the Foundation of Agents

Now we can state plainly what an agent is, with no hand-waving left. An agent is a loop around tool calling. The model reasons about a goal, decides to call a tool, observes the result, reasons again, calls another tool, and continues until the goal is met — precisely the perceive–reason–act–observe loop from Chapter 1. Tool calling is the mechanism that turns a passive text generator into a system that acts. Everything in Part VII — memory, planning, the ReAct pattern, RAG as a tool — is built on top of the capability you have just learned. This is the hinge, and you have turned it.

Designing Good Tools

Because the model reads your tool descriptions to decide what to call, the quality of those descriptions directly shapes how well it uses them. Give each tool a clear, specific name, a description that plainly states what it does and when to use it, and well-defined parameters. Validate the arguments the model provides before acting on them. We devote Chapter 33 to the craft of tool design; for now, remember that a tool the model cannot understand is a tool it will misuse.

Summary

This chapter built the technical foundation of agents. Structured outputs are machine-readable responses — usually JSON — that programs can act on, obtained by clearly specifying the format and always validating the result rather than trusting it. Tool calling builds on structured output to give a model hands: you define tools, the model returns a structured request to call one, your code executes it and returns the result, and the model continues — the six-step loop that is exactly the agent loop from Chapter 1. Critically, the model only requests tool calls while your code executes them, which is the central safety property. An agent is a loop around tool calling, which is why this chapter is the hinge on which the entire agents half of the book turns.

One practical layer remains before we build agents in earnest: the everyday engineering of calling models in real code. Chapter 30 closes Part VI with messages, conversations, error handling, and cost control.

Practice

Exercises

1Write a prompt that makes a model return a strict JSON object with specific fields you define (for example, extracting a name, date, and topic from a sentence). Run it and confirm the structure.
2Implement the `parse_safely` function and feed it both valid JSON and deliberately broken JSON. Confirm it returns the data in one case and handles the failure gracefully in the other.
3Define a simple tool (such as a calculator or a clock) with a name, description, and parameters. Describe, step by step, what would happen when the model decides to call it.
4Explain, in your own words, why the model only *requests* a tool call rather than executing it, and why this is the key safety property of tool calling.
5Map the six steps of the tool-calling loop onto the perceive–reason–act–observe loop from Chapter 1. Which tool-calling steps correspond to which parts of the agent loop?
6Write clear and unclear descriptions for the same tool, and explain why the model would choose and use the tool more reliably with the clear description.

View detailed solutions for all chapters →