Technology foundations of AI agents: architectures, protocols (MCP, A2A), reasoning patterns, and foundational papers.
15 resources
OECD landscape paper mapping agentic AI architectures, capability tiers, and governance touchpoints for policymakers. Synthesises definitions across vendors and academic work into a shared vocabulary, flagging where existing AI policy instruments need adjustment for agents.
Stanford HAI's annual index chapter on technical performance, tracking benchmark progress for reasoning, coding, and tool-using agents. Covers capability jumps on SWE-bench, GAIA, and WebArena plus compute and cost trends across frontier model families.
IBM explainer introducing AI agent architectures, planning loops, memory, and tool use for enterprise teams. Walks through single-agent, multi-agent, and hierarchical patterns with use cases across IT operations, customer service, and supply chain automation.
OpenAI handbook on deciding when an agent is appropriate, selecting models, writing clear instructions, defining tools, and adding safety guardrails. Covers single-agent and manager patterns with examples using the OpenAI Agents SDK.
Anthropic engineering write-up distinguishing deterministic workflows from agents and documenting composable patterns like prompt chaining, routing, parallelisation, orchestrator-workers, and evaluator-optimiser. Recommends starting with the simplest pattern and adding autonomy only where it pays off.
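The prompt-chaining pattern from that write-up can be sketched in a few lines: each step's output feeds the next prompt template. The `llm()` stub below is a deterministic placeholder standing in for a real model call, and the templates are illustrative.

```python
def llm(prompt: str) -> str:
    # Placeholder model: echoes the prompt's last line upper-cased.
    # A real implementation would call a model API here.
    return prompt.splitlines()[-1].upper()

def chain(text: str, steps: list[str]) -> str:
    """Run a fixed sequence of prompt templates, threading output forward."""
    for template in steps:
        text = llm(template.format(input=text))
    return text

result = chain("draft outline", [
    "Expand this outline into prose:\n{input}",
    "Tighten the prose:\n{input}",
])
```

Because each step is a plain function of the previous output, the chain stays deterministic and debuggable, which is the write-up's argument for preferring workflows like this before reaching for full agent autonomy.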
Chip Huyen's long-form essay breaking down agent components (planning, tool use, memory, reflection), common failure modes, and evaluation challenges. Covers ReAct-style loops, function calling, and practical trade-offs when moving from prototypes to production agents.
Ada Lovelace Institute policy briefing on advanced AI assistants as systems that act on a user's behalf. Examines concentration of power, delegation risks, and policy levers, calling for UK-specific rules on consent and accountability.
Academic survey cataloguing LLM agent architectures, planning and reasoning methods, memory mechanisms, tool-use strategies, and multi-agent coordination. Also maps application domains from coding and science to robotics, with open challenges for each layer.
Yao et al. introduce the ReAct prompting framework that interleaves chain-of-thought reasoning traces with tool actions, letting models plan, act, and observe in a loop. Evaluated on HotpotQA, FEVER, ALFWorld, and WebShop.
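The ReAct loop described above alternates reasoning text with tool actions until a final answer appears. This toy sketch uses a scripted stand-in for the model and a made-up `calculator` tool; the `Thought:`/`Action:`/`Observation:` trace format loosely follows the paper's style.

```python
TOOLS = {"calculator": lambda expr: str(eval(expr))}  # toy tool registry

def scripted_model(trace: str) -> str:
    # Stand-in for an LLM: first requests a calculation, then finishes.
    if "Observation:" not in trace:
        return "Thought: I need arithmetic.\nAction: calculator[2 + 3]"
    return "Thought: Done.\nFinal Answer: 5"

def react(question: str, max_steps: int = 5) -> str:
    trace = f"Question: {question}"
    for _ in range(max_steps):
        step = scripted_model(trace)
        trace += "\n" + step
        if "Final Answer:" in step:
            return step.split("Final Answer:")[1].strip()
        # Parse "Action: tool[arg]", run the tool, append the observation.
        action = step.split("Action:")[1].strip()
        name, arg = action.split("[", 1)
        observation = TOOLS[name.strip()](arg.rstrip("]"))
        trace += f"\nObservation: {observation}"
    return "no answer"

answer = react("What is 2 + 3?")
```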
Meta AI paper showing how a language model can teach itself to call external APIs (calculator, search, translator, calendar, Q&A) through self-supervised fine-tuning on API-augmented data, improving zero-shot performance without losing core skills.
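The core idea is training data where API calls are embedded inline in text and later replaced by call-plus-result. A rough sketch of the execution side, with a marker syntax that is illustrative rather than the paper's exact format:

```python
import re

# Toy API registry; Toolformer's actual tools include search, a
# calculator, translation, a calendar, and Q&A.
APIS = {"Calculator": lambda expr: round(eval(expr), 2)}

def execute_calls(text: str) -> str:
    """Replace each inline [Api(arg)] marker with the call and its result."""
    def run(match: re.Match) -> str:
        api, arg = match.group(1), match.group(2)
        return f"[{api}({arg}) -> {APIS[api](arg)}]"
    return re.sub(r"\[(\w+)\(([^)]*)\)\]", run, text)

filled = execute_calls("That is [Calculator(400/1400)] of the total.")
```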
Google Research paper showing that prompting large models with worked-example reasoning chains elicits multi-step arithmetic, commonsense, and symbolic reasoning. Establishes the chain-of-thought technique that underpins most modern agent planning loops.
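The technique amounts to few-shot exemplars that include worked reasoning before the answer, nudging the model to emit its own reasoning chain. A minimal prompt-construction sketch; the exemplar is paraphrased for illustration, not quoted from the paper:

```python
# (question, worked reasoning, answer) triples used as few-shot exemplars.
EXEMPLARS = [
    ("Roger has 5 balls and buys 2 cans of 3 balls each. How many balls?",
     "Roger starts with 5. 2 cans of 3 balls is 6 more. 5 + 6 = 11.",
     "11"),
]

def build_cot_prompt(question: str) -> str:
    """Prepend worked-reasoning exemplars, then pose the new question."""
    parts = []
    for q, reasoning, answer in EXEMPLARS:
        parts.append(f"Q: {q}\nA: {reasoning} The answer is {answer}.")
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

prompt = build_cot_prompt("A farm has 4 pens of 6 hens. How many hens?")
```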
Model Context Protocol 2025-11-25 specification defining the JSON-RPC interface that lets AI applications connect to tools, resources, and prompts through standard servers. Covers transport, authentication, capability negotiation, and progress notifications for agent integrations.
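MCP messages are JSON-RPC 2.0; a tool invocation, for example, uses the `tools/call` method with a tool name and arguments. The sketch below shows the message shape only; the `get_weather` tool and its arguments are hypothetical.

```python
import json

def mcp_request(request_id: int, method: str, params: dict) -> str:
    """Serialise an MCP message in JSON-RPC 2.0 framing."""
    return json.dumps({
        "jsonrpc": "2.0",   # MCP mandates JSON-RPC 2.0
        "id": request_id,
        "method": method,
        "params": params,
    })

msg = mcp_request(1, "tools/call", {
    "name": "get_weather",                 # hypothetical tool
    "arguments": {"city": "Paris"},        # illustrative arguments
})
```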
GitHub monorepo for the Model Context Protocol project containing the specification, schemas, reference SDKs, and example servers. Entry point for developers building MCP-compatible tools, clients, or servers across Python, TypeScript, and other languages.
Google Cloud explainer introducing the Model Context Protocol, its client-server architecture, and how MCP servers expose tools and data to agents. Walks through Vertex AI integration patterns and compares MCP with bespoke tool plumbing.
Google's launch post for Agent2Agent, an open protocol that lets agents built on different stacks discover each other, exchange capabilities, and coordinate long-running tasks. Supported at launch by 50+ partners including Atlassian, Salesforce, and SAP.
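Discovery in A2A works through a published "agent card", a JSON document describing an agent's endpoint and skills so other agents can find and invoke it. The sketch below follows the launch material loosely; the agent name, URL, and skill are all made up for illustration.

```python
import json

# Hypothetical agent card; real cards carry more fields (auth schemes,
# supported modalities, versioning).
agent_card = {
    "name": "invoice-agent",                # illustrative agent name
    "url": "https://example.com/a2a",       # placeholder endpoint
    "capabilities": {"streaming": True},
    "skills": [
        {"id": "reconcile", "description": "Match invoices to purchase orders"},
    ],
}

card_json = json.dumps(agent_card, indent=2)
```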