5. LLM Agents

LLM-based agents (LLM agents) are AI applications that combine a large language model (LLM) with modules for planning, memory, and tool usage to execute complex tasks. This chapter explores the architecture, components, and functionality of LLM agents, and highlights how they differ from traditional automation systems.

Core Components of LLM Agents

LLM agents are built upon several core modules, including planning, memory, and tool usage, each of which is discussed later in this chapter.

How LLM Agents Work

LLM agents operate by processing user input to determine the necessary steps for task completion. The process involves:

A high-level schematic of this architecture is shown below:

[Figure: high-level schematic of the LLM agent architecture]

Agents vs. Automation

LLM agents extend beyond simple automation by incorporating decision-making, adaptability, and memory, which allow them to handle complex, dynamic scenarios.

LLM Agents:

Traditional Automations:

For example:

Moreover, in complex workflows, multiple specialized agents can work together. Consider an ecosystem where separate agents function as a personal assistant, marketing assistant, finance & accounting agent, or software developer, collaborating to execute comprehensive tasks.
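To make the idea of cooperating specialized agents concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the role names mirror the ones mentioned above, while `make_agent`, `ROUTES`, `route`, and `run_workflow` are hypothetical helpers, and the keyword router stands in for whatever coordination logic a real system would use (often an LLM itself).

```python
from typing import Callable, Dict, List

Agent = Callable[[str], str]


def make_agent(role: str) -> Agent:
    """Build a stub agent; a real agent would call an LLM with a role-specific prompt."""
    def agent(task: str) -> str:
        return f"[{role}] handled: {task}"
    return agent


# Hypothetical registry of specialized agents, keyed by role.
AGENTS: Dict[str, Agent] = {
    role: make_agent(role)
    for role in ("personal assistant", "marketing", "finance", "developer")
}

# Hypothetical keyword-to-role routing table (an assumption, not a standard).
ROUTES: Dict[str, str] = {
    "invoice": "finance",
    "campaign": "marketing",
    "bug": "developer",
}


def route(task: str) -> str:
    """Pick a role by keyword; fall back to the personal assistant."""
    lowered = task.lower()
    for keyword, role in ROUTES.items():
        if keyword in lowered:
            return role
    return "personal assistant"


def run_workflow(tasks: List[str]) -> List[str]:
    """Send each task to the agent chosen by the router and collect the results."""
    return [AGENTS[route(task)](task) for task in tasks]


results = run_workflow([
    "Send the invoice to the client",
    "Fix the login bug",
    "Book a dentist appointment",
])
for line in results:
    print(line)
```

The router is deliberately simple so the division of labor stays visible; swapping it for a model-driven dispatcher would not change the surrounding structure.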

Some notable examples include:

Planning Module – Without Feedback

The planning module is responsible for decomposing user requests into detailed steps or subtasks. Key features include:

[Figure: Pasted image 20250327101351.png]
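As a rough illustration of one-shot task decomposition, the sketch below asks a model for a numbered plan and parses it into subtasks, with no feedback step afterwards. The `fake_llm` function is a hypothetical stand-in for a real model call, and the prompt wording and the `plan` helper are assumptions made for this example.

```python
from typing import Callable, List


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; returns a canned numbered plan."""
    return (
        "1. Search for flights to Lisbon\n"
        "2. Compare prices\n"
        "3. Book the cheapest option"
    )


def plan(request: str, llm: Callable[[str], str]) -> List[str]:
    """Ask the model once for a numbered plan and parse it into subtasks."""
    prompt = f"Break the following request into numbered steps:\n{request}"
    steps: List[str] = []
    for line in llm(prompt).splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop a leading "1." or "2)" style marker if present.
        head, sep, rest = line.partition(" ")
        if sep and head.rstrip(".)").isdigit():
            line = rest.strip()
        steps.append(line)
    return steps


subtasks = plan("Book me a cheap flight to Lisbon", fake_llm)
for number, task in enumerate(subtasks, start=1):
    print(number, task)
```

Because the plan is produced in a single pass, any mistake in an early step carries through to the rest, which is the limitation the feedback-based approaches try to address.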

Planning with Feedback

Traditional planning modules often lack mechanisms for feedback, making long-horizon planning difficult. To address this limitation, a reflection mechanism is introduced, allowing the model to iteratively refine its execution plan based on feedback from:

This iterative process is essential for complex real-world tasks where trial and error are integral to achieving optimal results. Two popular methods that incorporate feedback are ReAct and Reflexion.
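The refine-on-feedback idea can be sketched as a small loop: run the plan, read the feedback, revise, and repeat. This is a generic illustration under stated assumptions, not an implementation of ReAct or Reflexion. The names `refine_plan`, `toy_execute`, and `toy_revise` are hypothetical, and the toy environment and reviser stand in for a real environment and an LLM.

```python
from typing import Callable, List, Tuple

Plan = List[str]


def refine_plan(
    initial: Plan,
    execute: Callable[[Plan], Tuple[bool, str]],
    revise: Callable[[Plan, str], Plan],
    max_rounds: int = 3,
) -> Tuple[Plan, List[str]]:
    """Run a plan, collect feedback, and revise until success or the round limit."""
    plan, feedback_log = list(initial), []
    for _ in range(max_rounds):
        ok, feedback = execute(plan)
        feedback_log.append(feedback)
        if ok:
            break
        plan = revise(plan, feedback)
    return plan, feedback_log


def toy_execute(plan: Plan) -> Tuple[bool, str]:
    """Toy environment (assumption): the plan only succeeds if it includes a login step."""
    if "log in" not in plan:
        return False, "error: not authenticated"
    return True, "ok"


def toy_revise(plan: Plan, feedback: str) -> Plan:
    """Toy reviser standing in for an LLM that reads feedback and edits the plan."""
    if "not authenticated" in feedback:
        return ["log in"] + plan
    return plan


final_plan, log = refine_plan(["download report"], toy_execute, toy_revise)
print(final_plan)
print(log)
```

The `max_rounds` cap is a design choice for the sketch: without some bound, a reviser that never fixes the problem would loop forever.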

ReAct

ReAct enables an LLM to solve complex tasks by interleaving three steps repeatedly:

  1. Thought: The agent deliberates about the next step.
  2. Action: The agent executes an action.
  3. Observation: The agent receives feedback from the environment, which informs further thoughts.

[Figure: Pasted image 20250327101602.png]
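The Thought, Action, Observation cycle can be sketched as a short loop. In this minimal example the model is replaced by `scripted_policy`, a hypothetical stand-in that returns a fixed thought and action, and `calculator` is a toy tool. Both names, the `finish` action, and the trace format are assumptions made for illustration rather than part of any particular ReAct implementation.

```python
from typing import Callable, Dict, List, Optional, Tuple

Tool = Callable[[str], str]
Step = Tuple[str, str, str]  # (thought, action, observation)


def scripted_policy(question: str, trace: List[Step]) -> Tuple[str, str, str]:
    """Hypothetical stand-in for the LLM: returns (thought, action name, argument)."""
    if not trace:
        return ("I need to compute the product first.", "calculator", "6 * 7")
    last_observation = trace[-1][2]
    return (f"The calculator returned {last_observation}.", "finish", last_observation)


def calculator(expression: str) -> str:
    """Toy tool that multiplies two integers written as 'a * b'."""
    left, right = expression.split("*")
    return str(int(left) * int(right))


def react(
    question: str,
    policy: Callable[[str, List[Step]], Tuple[str, str, str]],
    tools: Dict[str, Tool],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[Step]]:
    """Interleave thought, action, and observation until the policy finishes."""
    trace: List[Step] = []
    for _ in range(max_steps):
        thought, action, argument = policy(question, trace)  # Thought
        if action == "finish":
            return argument, trace
        observation = tools[action](argument)                # Action
        trace.append((thought, f"{action}[{argument}]", observation))  # Observation
    return None, trace


answer, steps = react("What is 6 times 7?", scripted_policy, {"calculator": calculator})
print(answer)
for thought, action, observation in steps:
    print(thought, "|", action, "|", observation)
```

Each observation is appended to the trace that the policy sees on the next turn, which is how feedback from the environment informs later thoughts.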

Memory in LLM Agents

Memory plays a critical role in maintaining context and enhancing decision-making:

Memory Formats

Memory can be represented in various formats, including:

Some systems, like Ghost in the Minecraft, use a hybrid approach where keys are in natural language and values are embedding vectors, combining the strengths of different formats.
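A key-value memory of this hybrid kind might look like the sketch below, where keys are natural-language strings and values are embedding vectors. It is a toy illustration and not Ghost in the Minecraft's actual implementation: `toy_embed` is a hypothetical hashed bag-of-words embedding used in place of a real embedding model, and `KeyValueMemory`, `write`, and `nearest_key` are names invented for this example.

```python
import math
import zlib
from typing import Dict, List

DIM = 64  # arbitrary toy dimension (assumption)


def toy_embed(text: str) -> List[float]:
    """Hypothetical embedding: hashed bag of words, L2-normalized."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class KeyValueMemory:
    """Memory whose keys are natural language and whose values are vectors."""

    def __init__(self) -> None:
        self.entries: Dict[str, List[float]] = {}

    def write(self, key: str, content: str) -> None:
        """Store the embedding of `content` under a human-readable key."""
        self.entries[key] = toy_embed(content)

    def nearest_key(self, query: str) -> str:
        """Return the key whose stored vector is most similar to the query.

        Raises ValueError if the memory is empty.
        """
        q = toy_embed(query)
        return max(
            self.entries,
            key=lambda k: sum(a * b for a, b in zip(q, self.entries[k])),
        )


memory = KeyValueMemory()
memory.write("how to craft a wooden pickaxe", "craft planks sticks pickaxe")
memory.write("how to find diamonds", "dig deep mine diamonds")
print(memory.nearest_key("craft planks sticks pickaxe"))
```

The readable keys make the memory easy to inspect and prompt with, while the vector values allow similarity search. A real embedding model would also match paraphrased queries, which this word-hashing toy does not attempt.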

Tool Usage in LLM Agents

The tool usage module empowers agents to interact with external environments and APIs. Examples of tools include:

LLM agents leverage tools in diverse ways: