Part 1 · Foundations: The World of AI Agents

Chapter 2 · Why Now? A Short History of AI, LLMs, and the Agentic Shift


The dream of building machines that can act on our behalf is not new — researchers have chased it since the 1950s, through waves of soaring hope and crushing disappointment. So why are useful agents suddenly possible now, in our decade and not an earlier one? This chapter answers that question by telling the story of how we got here. You do not need any technical background to follow it; think of it as a guided tour through seventy years of trying, failing, and finally succeeding. Understanding this arc will make everything that follows feel less like magic and more like the natural next chapter of a long story.

A Field of Broken Promises (and Real Progress)

It helps to start with a little humility. Artificial intelligence has been "just around the corner" for almost its entire history. Again and again, researchers promised thinking machines within a decade, fell short, and watched funding and enthusiasm collapse. These collapses were so regular they earned a name: AI winters.

But here is the encouraging part. Each cycle of hype and disappointment still left behind real, lasting tools — and each built the foundation for the next. The progress was genuine even when the promises were not. To understand today's agents, we will walk through three great eras, and see why only the most recent one made general-purpose agents practical.

Era One: Hand-Written Rules (1950s–1980s)

The first approach to AI was the most intuitive: if you want a machine to be intelligent, simply tell it everything it needs to know. Researchers wrote long lists of rules — statements of the form "if this is true, then do that" — and strung them together. This is now called symbolic AI, and its most famous products were expert systems that encoded the knowledge of doctors or engineers into thousands of hand-written rules.

Picture teaching someone to cook entirely by written instructions, never letting them taste or practice: "If the onion is translucent, lower the heat. If the pan smokes, remove it." For a tightly bounded task with clear rules, this works remarkably well. Chess programs eventually reached world-class play in much this way, pairing hand-written rules for judging a position with brute-force search, because chess is a closed world with exact rules.
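To make the idea concrete, here is a minimal sketch of the rule-writing approach in Python, built on the cooking instructions above. The rule list, the fact names such as pan_smoking, and the decide function are all invented for illustration; real expert systems held thousands of rules and used more elaborate machinery to combine them.

```python
# A toy rule-based system for the cooking example.
# Every name here (RULES, decide, the fact keys) is a hypothetical illustration.

RULES = [
    # Each rule is a pair: (condition on the observed facts, action to take).
    (lambda facts: facts.get("pan_smoking", False), "remove the pan from the heat"),
    (lambda facts: facts.get("onion") == "translucent", "lower the heat"),
]


def decide(facts):
    """Return the action of the first rule whose condition matches, else None."""
    for condition, action in RULES:
        if condition(facts):
            return action
    return None  # no rule was written for this situation


print(decide({"onion": "translucent"}))   # lower the heat
print(decide({"pan_smoking": True}))      # remove the pan from the heat
print(decide({"smell": "burnt garlic"}))  # None
```

The last call is the telling one. Nobody wrote a rule for burnt garlic, so the system has nothing to say, and the only remedy is for a person to write yet another rule.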

But the real world is not chess. The number of rules needed to handle ordinary life is effectively infinite, and much of what humans know is common sense that nobody can fully write down. How do you state, in rules, everything a five-year-old knows about how objects fall, what people want, or what a sentence means? You cannot. Symbolic AI hit this wall more than once, and an AI winter followed each time: first in the 1970s, and again when the expert-systems boom collapsed in the late 1980s.

Era Two: Learning From Data (1990s–2010s)

The second era replaced hand-written rules with machine learning. The idea is a genuine reversal. Rather than a human writing "this email is spam if it contains these words," you show the machine thousands of emails already labelled spam or not spam, and it discovers the telltale patterns on its own. We will explore exactly how this works in Part II; for now, the headline is what matters: the machine learns from examples instead of instructions.
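The reversal can be sketched in a few lines of Python. This is a deliberately simple word-counting toy, not how production spam filters work: the four training emails are made up, and the train and classify functions are hypothetical names chosen for this example. The point to notice is that no line of the code says which words signal spam; that information comes entirely from the labelled examples.

```python
from collections import Counter

# A tiny, made-up training set: (email text, label).
EXAMPLES = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting moved to friday", "not_spam"),
    ("lunch on friday with the team", "not_spam"),
]


def train(examples):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "not_spam": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def classify(counts, text):
    """Label a new email by which class has seen its words more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    not_spam_score = sum(counts["not_spam"][w] for w in words)
    return "spam" if spam_score > not_spam_score else "not_spam"


model = train(EXAMPLES)
print(classify(model, "claim your free prize"))   # spam
print(classify(model, "team meeting on friday"))  # not_spam
```

To teach this program about a new kind of spam, you would add more labelled emails rather than edit the code, which is the essence of learning from examples instead of instructions.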

This shift quietly transformed daily life. The spam filter that keeps your inbox clean, the recommendations that suggest what to watch next, the fraud detection that flags a strange purchase — all are machine learning, learning patterns from mountains of past data. It worked far better than hand-written rules ever had.

Yet it still had an important limitation. Humans had to carefully decide which features of the data the machine should pay attention to — which words, which numbers, which signals. A great deal of human expertise went into preparing the data just so. The machine learned the patterns, but people still had to point it at the right things to look at.

Era Three: Deep Learning (2012 Onward)

Around 2012, a long-simmering idea finally boiled over. Deep learning uses artificial neural networks — loosely inspired by the brain, and covered properly in Part II — with many stacked layers. Given enough data and enough computing power, these networks could learn not just the patterns but the features themselves. Humans no longer had to hand-pick what to look at; the network figured out what mattered on its own.

Three ingredients arrived together to make this possible: enormous datasets from the internet, powerful processors called GPUs (originally built for video games) that could do the necessary math at massive scale, and improved techniques for training deep networks. A landmark moment came in 2012, when a deep network called AlexNet won the ImageNet image-recognition competition by a wide margin, and the field never looked back. Within a few years, machines could see, translate between languages, and recognize speech with startling accuracy.

The Language Model Breakthrough

In 2017 came the invention that set the stage for everything in this book: the transformer, the neural network design we will explore in Chapter 10. It was unusually good at handling language, and — crucially — it could be trained efficiently at enormous scale.

Then researchers tried something almost embarrassingly simple: make the models bigger and feed them more text. Far more. And something unexpected happened. As the models grew, they did not just get a little better — they developed surprising new abilities that no one had explicitly built in, like following instructions, answering questions, and writing code. These are sometimes called emergent abilities, because they emerged from scale rather than from design. When one of these models was wrapped in a friendly chat interface and released to the public in late 2022, hundreds of millions of people met capable AI for the first time, and the world's attention turned overnight.

From Answering to Acting: The Agentic Shift

A chatbot, however impressive, only answers. The shift to agents — systems that take actions in the world — required three things to fall into place at once, and they only did so very recently.

Models that can reason well enough. An agent must break a goal into steps and decide what to do next. Only the latest generation of models is reliable enough at this to trust with multi-step tasks.
Reliable tool calling. An agent needs to use tools — search the web, run code, call an API. This depends on the model producing precise, machine-readable instructions on demand, a capability that matured only in the last few years (and which we build in Chapter 29).
The loop and the frameworks. Wrapping a model in the perceive–reason–act–observe loop from Chapter 1, with memory and error handling, turns an answer-machine into an actor. The tools to do this reliably are very new.

When these three lined up, the change was sudden. A model that could reliably decide "call this tool with these arguments," placed inside a loop, stopped merely describing what to do and started doing it. That is the transformation happening right now, as you read this — and it is why this is the moment to learn how to build agents.
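A minimal sketch of that idea in Python may help, under some loud assumptions. The fake_model function is a hand-scripted stand-in for a real language model, the JSON shape with "tool" and "arguments" keys is an invented format rather than any vendor's actual API, and add is a pretend tool. What the sketch does show faithfully is the shape of the loop: the model proposes a tool call, the program carries it out, and the result is fed back in.

```python
import json


def add(a, b):
    """A pretend tool the agent is allowed to use."""
    return a + b


TOOLS = {"add": add}


def fake_model(goal, observations):
    """Hand-scripted stand-in for a language model (hypothetical behaviour).

    It returns machine-readable JSON: either a tool call or a final answer.
    """
    if not observations:
        return json.dumps({"tool": "add", "arguments": {"a": 2, "b": 3}})
    return json.dumps({"final_answer": f"The result is {observations[-1]}"})


def run_agent(goal, max_steps=5):
    observations = []  # a very simple memory of what has happened so far
    for _ in range(max_steps):
        reply = json.loads(fake_model(goal, observations))  # reason
        if "final_answer" in reply:
            return reply["final_answer"]
        tool = TOOLS.get(reply.get("tool"))
        if tool is None:
            # Basic error handling: tell the model its request failed.
            observations.append("error: unknown tool")
            continue
        result = tool(**reply["arguments"])  # act
        observations.append(result)          # observe
    return "Stopped: step limit reached"


print(run_agent("What is 2 + 3?"))  # The result is 5
```

The step limit is a small but deliberate design choice: a loop that acts on its own needs a guaranteed way to stop.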

Why This Matters for You

There is a remarkable consequence of this history. For the first time, a single motivated person — with no research lab, no enormous budget, and no doctorate — can build agents that genuinely act in the world. The hardest parts, the models themselves, have already been built by others and are available to you over an API. The gap between an idea and a working agent has collapsed from years to an afternoon.

That is the opportunity this book is built around. You are learning a craft at the exact moment it has become accessible, useful, and in demand. The earlier eras laid every brick; you get to build on top of all of them.

Summary

AI progressed through three great eras: hand-written rules that could not capture the messy real world; machine learning that learned patterns from examples but still needed human guidance about what to look at; and deep learning that learned the features themselves once data and computing power were abundant. The transformer and the discovery that scale brings emergent abilities produced today's powerful language models. Agents arrived when capable reasoning, reliable tool calling, and the agent loop finally came together — which is happening now, and which is why one person can build real agents today.

In the next chapter we stop reading and start doing: we set up your workspace and make your very first call to a language model.

Practice

Exercises

1. Draw a simple timeline of the three eras described in this chapter, from the 1950s to today, and mark the rough point where building agents became practical.
2. In your own words, explain why hand-written rules could master chess but not everyday language. What is it about the real world that defeats the rule-writing approach?
3. Describe the key reversal between Era One (rules) and Era Two (machine learning). Why was learning from examples such a powerful change?
4. List the three ingredients that had to arrive together for agents to become practical, and write one sentence on why each is necessary.
5. Find one AI agent product released recently and write three sentences describing what goal it pursues and what tools it appears to use.
6. The chapter argues 'we are early' in the agentic era. Write a short paragraph on what that implies for how you should learn, and why chasing every new tool might be less wise than mastering foundations.
