Building AI Agents: Architecture, Implementation, and Best Practices
Comprehensive guide to building AI agents covering architecture patterns, tool integration, multi-agent systems, error handling, and production deployment strategies. Learn from real-world implementations and avoid common pitfalls.
Introduction
2025 has been declared the "year of the AI agent." What started as experimental chatbots has evolved into autonomous systems capable of planning, reasoning, using tools, and executing multi-step workflows without human intervention. The agentic AI market reached $10.41 billion in 2025, growing at 56.1% year-over-year, with 99% of enterprise developers now exploring or building AI agents.
But building production-ready AI agents requires more than chaining LLM calls together. This guide covers the architecture patterns, implementation strategies, and operational practices needed to deploy agents that are reliable, observable, and safe enough for enterprise use.
What Defines an AI Agent?
An AI agent is fundamentally different from a stateless LLM completion:
Traditional LLM:
- Single request/response cycle
- No memory between interactions
- No access to external tools
- No ability to plan multi-step actions
AI Agent:
- Persistent execution across multiple steps
- Maintains conversation and task context
- Can use tools (APIs, databases, file systems)
- Plans and reasons about how to accomplish goals
- Decides when tasks are complete
The key distinction: Agents have agency. They don't just answer questions—they decompose problems, choose tools, execute actions, evaluate results, and iterate until objectives are met.
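The contrast is visible directly in code. Below is a minimal sketch under assumed interfaces — `llm.complete`, `llm.decide`, and the `tools` map are stand-ins for illustration, not a real SDK:

```javascript
// Stateless completion: one call in, one answer out
async function answerOnce(llm, question) {
  return llm.complete(question);
}

// Agent: a loop that plans, acts via tools, and carries history forward
async function agentLoop(llm, tools, goal) {
  const history = [];
  while (true) {
    const decision = await llm.decide(goal, history); // reason about the next step
    if (decision.done) return decision.answer;        // the agent decides completion
    const result = await tools[decision.tool](decision.input); // execute a tool
    history.push({ decision, result });               // maintain task context
  }
}
```

Everything that follows in this guide is a refinement of that loop: how to prompt it, constrain it, observe it, and keep it from running away.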
Core Architecture Patterns
1. ReAct (Reasoning + Acting)
The most widely adopted pattern for AI agents. The agent alternates between reasoning about what to do next and taking actions.
Flow:
1. Thought: Analyze the current state and decide next action
2. Action: Execute a tool or API call
3. Observation: Receive results from the action
4. (Repeat until task complete)
Example Implementation:
class ReActAgent {
constructor(llm, tools) {
this.llm = llm;
this.tools = tools;
this.maxIterations = 10;
}
async run(task) {
const history = [];
let iteration = 0;
while (iteration < this.maxIterations) {
// Reasoning step
const prompt = this.buildPrompt(task, history);
const response = await this.llm.complete(prompt);
// Parse thought and action
const { thought, action, actionInput } = this.parseResponse(response);
history.push({ thought, action, actionInput });
// Check for completion
if (action === 'finish') {
return actionInput; // Final answer
}
// Execute action
const tool = this.tools.find(t => t.name === action);
if (!tool) {
throw new Error(`Unknown tool: ${action}`);
}
try {
const observation = await tool.execute(actionInput);
history.push({ observation });
} catch (error) {
history.push({ observation: `Error: ${error.message}` });
}
iteration++;
}
throw new Error('Agent exceeded maximum iterations');
}
buildPrompt(task, history) {
return `
You are an AI agent. Use tools to accomplish tasks.
Available tools:
${this.tools.map(t => `- ${t.name}: ${t.description}`).join('\n')}
Task: ${task}
${history.map(h => {
if (h.thought) return `Thought: ${h.thought}\nAction: ${h.action}\nAction Input: ${h.actionInput}`;
if (h.observation) return `Observation: ${h.observation}`;
}).join('\n')}
What is your next thought and action?
Format:
Thought: [your reasoning]
Action: [tool name or "finish"]
Action Input: [input for tool or final answer]
`.trim();
}
parseResponse(response) {
const thoughtMatch = response.match(/Thought: (.*?)(?=\n|$)/);
const actionMatch = response.match(/Action: (.*?)(?=\n|$)/);
const inputMatch = response.match(/Action Input: ([\s\S]*?)$/);
return {
thought: thoughtMatch?.[1]?.trim() || '',
action: actionMatch?.[1]?.trim() || '',
actionInput: inputMatch?.[1]?.trim() || ''
};
}
}
Strengths:
- Transparent reasoning (you can see the agent's thought process)
- Works with any LLM that can follow instructions
- Easy to debug when things go wrong
Weaknesses:
- Can get stuck in loops ("thought → same action → thought → same action")
- LLM output parsing is fragile (especially with weaker models)
- Wastes tokens on verbose reasoning
2. Tool-Augmented Generation
Instead of free-form reasoning, the agent gets structured tool definitions and returns JSON function calls.
Example with OpenAI Function Calling:
const tools = [
{
type: 'function',
function: {
name: 'search_database',
description: 'Search product database by query',
parameters: {
type: 'object',
properties: {
query: { type: 'string', description: 'Search query' },
limit: { type: 'number', description: 'Max results', default: 10 }
},
required: ['query']
}
}
},
{
type: 'function',
function: {
name: 'send_email',
description: 'Send email to customer',
parameters: {
type: 'object',
properties: {
to: { type: 'string', description: 'Recipient email' },
subject: { type: 'string' },
body: { type: 'string' }
},
required: ['to', 'subject', 'body']
}
}
}
];
async function runAgent(userMessage) {
const messages = [{ role: 'user', content: userMessage }];
while (true) {
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages,
tools,
tool_choice: 'auto'
});
const message = response.choices[0].message;
messages.push(message);
// Agent finished
if (!message.tool_calls) {
return message.content;
}
// Execute tool calls
for (const toolCall of message.tool_calls) {
const args = JSON.parse(toolCall.function.arguments);
const result = await executeTool(toolCall.function.name, args);
messages.push({
role: 'tool',
tool_call_id: toolCall.id,
content: JSON.stringify(result)
});
}
}
}
async function executeTool(name, args) {
switch (name) {
case 'search_database':
return await db.products.search(args.query, args.limit ?? 10); // schema defaults aren't auto-applied
case 'send_email':
return await sendEmail(args.to, args.subject, args.body);
default:
throw new Error(`Unknown tool: ${name}`);
}
}
Strengths:
- Structured, type-safe tool calls (no parsing)
- Native support in OpenAI, Anthropic, Google models
- Parallel tool execution (agent can call multiple tools simultaneously)
Weaknesses:
- Less transparency (no explicit reasoning)
- Requires models with function calling support
- Still possible to make wrong tool choices
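The `runAgent` loop above awaits each tool call in turn. Since the model can return several independent calls in one turn, they can be dispatched concurrently — a sketch that assumes the tools involved are side-effect-safe to run in parallel:

```javascript
// Run all tool calls from a single model turn concurrently.
// Promise.all preserves order, so the tool messages line up with the calls.
async function executeToolCallsInParallel(toolCalls, executeTool) {
  return Promise.all(
    toolCalls.map(async (toolCall) => {
      const args = JSON.parse(toolCall.function.arguments);
      const result = await executeTool(toolCall.function.name, args);
      return {
        role: 'tool',
        tool_call_id: toolCall.id,
        content: JSON.stringify(result)
      };
    })
  );
}
```

Append the returned messages before the next model call, exactly as the sequential loop does.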
3. Multi-Agent Systems
For complex tasks, use specialized agents that collaborate:
class MultiAgentOrchestrator {
constructor() {
this.agents = {
researcher: new ResearchAgent(),
writer: new WriterAgent(),
editor: new EditorAgent()
};
}
async generateArticle(topic) {
// Step 1: Research agent gathers information
const research = await this.agents.researcher.run({
task: `Research ${topic} and gather key facts, statistics, and expert opinions`
});
// Step 2: Writer agent creates draft
const draft = await this.agents.writer.run({
task: `Write a 1000-word article about ${topic}`,
context: research
});
// Step 3: Editor agent refines
const final = await this.agents.editor.run({
task: 'Edit for clarity, accuracy, and engagement',
content: draft,
researchContext: research
});
return final;
}
}
When to use:
- Tasks requiring distinct expertise (research vs. writing vs. coding)
- Workflow has clear sequential or parallel stages
- Need specialization (one agent fine-tuned for SQL, another for API calls)
Frameworks:
- CrewAI: Specialized in multi-agent collaboration with role definitions
- AutoGen: Microsoft's framework for conversational multi-agent systems
- LangGraph: Stateful, graph-based agent orchestration
Tool Integration Best Practices
1. Define Clear Tool Contracts
const tools = [
{
name: 'get_weather',
description: 'Get current weather for a city. Use when user asks about weather conditions.',
parameters: {
city: {
type: 'string',
description: 'City name (e.g., "San Francisco" or "London, UK")',
required: true
},
units: {
type: 'string',
enum: ['celsius', 'fahrenheit'],
description: 'Temperature units',
default: 'celsius'
}
},
// Critical: Return value schema
returns: {
type: 'object',
properties: {
temperature: { type: 'number' },
condition: { type: 'string' },
humidity: { type: 'number' }
}
}
}
];
Why this matters:
- Helps LLM understand when and how to use each tool
- Enables validation of tool inputs before execution
- Documents expected outputs for chaining tools
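A minimal sketch of the validation step this enables — it assumes the simplified `parameters` shape used in the `get_weather` contract above, not the full JSON Schema dialect:

```javascript
// Validate tool arguments against the declared parameter contract.
// Returns an array of error strings; empty means the call is safe to execute.
function validateArgs(tool, args) {
  const errors = [];
  for (const [name, spec] of Object.entries(tool.parameters)) {
    const value = args[name];
    if (value === undefined) {
      if (spec.required) errors.push(`Missing required parameter: ${name}`);
      continue; // optional parameter omitted — nothing more to check
    }
    if (spec.type && typeof value !== spec.type) {
      errors.push(`Parameter ${name} should be ${spec.type}, got ${typeof value}`);
    }
    if (spec.enum && !spec.enum.includes(value)) {
      errors.push(`Parameter ${name} must be one of: ${spec.enum.join(', ')}`);
    }
  }
  return errors;
}
```

Returning the errors to the agent (rather than throwing) gives the model a chance to correct its own call on the next turn.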
2. Implement Tool Safety Guards
class SafeToolExecutor {
constructor(tools) {
this.tools = tools;
this.rateLimits = new Map();
this.dangerousTools = new Set(['delete_database', 'send_bulk_email']);
}
async execute(toolName, args, context) {
// Rate limiting
if (this.isRateLimited(toolName, context.userId)) {
throw new Error(`Rate limit exceeded for ${toolName}`);
}
// Require human approval for dangerous operations
if (this.dangerousTools.has(toolName)) {
if (!context.humanApproved) {
return {
requiresApproval: true,
message: `Tool ${toolName} requires human approval`,
pendingArgs: args
};
}
}
// Input validation
this.validateArgs(toolName, args);
// Execute with timeout
const tool = this.tools.find(t => t.name === toolName);
return await this.executeWithTimeout(tool, args, 30000);
}
async executeWithTimeout(tool, args, timeoutMs) {
return Promise.race([
tool.execute(args),
new Promise((_, reject) =>
setTimeout(() => reject(new Error('Tool execution timeout')), timeoutMs)
)
]);
}
}
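On the calling side, the `requiresApproval` result needs a handler that pauses for a human and retries with approval recorded. A sketch — `requestHumanApproval` is a stand-in for whatever approval UI, Slack prompt, or review queue you actually use:

```javascript
// Caller-side handling of the approval flow from SafeToolExecutor.
// requestHumanApproval(toolName, args) resolves to true/false.
async function executeWithApproval(executor, toolName, args, context, requestHumanApproval) {
  const result = await executor.execute(toolName, args, context);
  if (result && result.requiresApproval) {
    const approved = await requestHumanApproval(toolName, result.pendingArgs);
    if (!approved) {
      return { cancelled: true, message: `Human rejected ${toolName}` };
    }
    // Re-run with the approval recorded in the context
    return executor.execute(toolName, args, { ...context, humanApproved: true });
  }
  return result;
}
```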
3. Handle Tool Errors Gracefully
async function executeToolWithRecovery(tool, args, agent) {
try {
return await tool.execute(args);
} catch (error) {
// Log error
console.error(`Tool ${tool.name} failed:`, error);
// Return error to agent with suggestions
return {
error: error.message,
suggestion: getErrorRecoverySuggestion(error, tool)
};
}
}
function getErrorRecoverySuggestion(error, tool) {
if (error.message.includes('not found')) {
return `Try using ${tool.name} with a different query or check if the resource exists`;
}
if (error.message.includes('unauthorized')) {
return 'This operation requires authentication. Use the login tool first.';
}
if (error.message.includes('rate limit')) {
return 'Rate limit exceeded. Wait 60 seconds or use cached data instead.';
}
return 'An unexpected error occurred. Try a different approach.';
}
Memory and State Management
Agents need memory to maintain context across interactions:
Short-Term Memory (Conversation Context)
class ConversationMemory {
constructor(maxTokens = 4000) {
this.messages = [];
this.maxTokens = maxTokens;
}
addMessage(role, content) {
this.messages.push({ role, content, timestamp: Date.now() });
this.prune();
}
prune() {
// Estimate token count (rough: 4 chars = 1 token)
while (this.estimateTokens() > this.maxTokens && this.messages.length > 1) {
// Keep system message, remove oldest user/assistant messages
this.messages.splice(1, 1);
}
}
estimateTokens() {
return this.messages.reduce((sum, msg) =>
sum + Math.ceil(msg.content.length / 4), 0
);
}
getMessages() {
return this.messages;
}
}
Long-Term Memory (Vector Store)
import { Pinecone } from '@pinecone-database/pinecone';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
class AgentMemory {
constructor() {
this.embeddings = new OpenAIEmbeddings();
this.pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
this.index = this.pinecone.index('agent-memory');
}
async remember(userId, content, metadata = {}) {
const vector = await this.embeddings.embedQuery(content);
await this.index.upsert([{
id: `${userId}-${Date.now()}`,
values: vector,
metadata: {
userId,
content,
timestamp: Date.now(),
...metadata
}
}]);
}
async recall(userId, query, topK = 5) {
const vector = await this.embeddings.embedQuery(query);
const results = await this.index.query({
vector,
topK,
filter: { userId },
includeMetadata: true
});
return results.matches.map(m => m.metadata.content);
}
}
// Usage in agent
async function runAgentWithMemory(userId, userMessage) {
const memory = new AgentMemory();
// Recall relevant past interactions
const context = await memory.recall(userId, userMessage);
const prompt = `
Previous relevant context:
${context.join('\n')}
Current user message: ${userMessage}
Respond appropriately using context from previous interactions.
`;
const response = await agent.run(prompt);
// Store this interaction for future recall
await memory.remember(userId, `User: ${userMessage}\nAssistant: ${response}`);
return response;
}
Production Deployment Considerations
1. Observability
import { Langfuse } from 'langfuse';
class ObservableAgent {
constructor(agent) {
this.agent = agent;
this.langfuse = new Langfuse({
publicKey: process.env.LANGFUSE_PUBLIC_KEY,
secretKey: process.env.LANGFUSE_SECRET_KEY
});
}
async run(task, userId) {
const trace = this.langfuse.trace({
name: 'agent-execution',
userId,
metadata: { task }
});
try {
const result = await this.agent.run(task, {
onThought: (thought) => {
trace.event({ name: 'thought', metadata: { thought } });
},
onAction: (action, input) => {
trace.event({ name: 'action', metadata: { action, input } });
},
onObservation: (observation) => {
trace.event({ name: 'observation', metadata: { observation } });
}
});
trace.update({ output: result, status: 'success' });
return result;
} catch (error) {
trace.update({ status: 'error', metadata: { error: error.message } });
throw error;
} finally {
await this.langfuse.shutdown();
}
}
}
2. Cost Controls
class CostAwareAgent {
constructor(agent, budget) {
this.agent = agent;
this.dailyBudget = budget;
this.usage = new Map(); // userId -> daily cost
}
async run(userId, task) {
const today = new Date().toISOString().split('T')[0];
const key = `${userId}-${today}`;
const currentCost = this.usage.get(key) || 0;
if (currentCost >= this.dailyBudget) {
throw new Error('Daily budget exceeded');
}
const startTokens = this.agent.getTokenCount();
const result = await this.agent.run(task);
const endTokens = this.agent.getTokenCount();
const tokensUsed = endTokens - startTokens;
const cost = this.calculateCost(tokensUsed);
this.usage.set(key, currentCost + cost);
return {
result,
tokensUsed,
cost,
remainingBudget: this.dailyBudget - (currentCost + cost)
};
}
calculateCost(tokens) {
// GPT-4 pricing: $0.03 per 1K input tokens, $0.06 per 1K output
// Simplified: assume 50/50 split
return (tokens / 1000) * 0.045;
}
}
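The diff-based counter above works, but chat APIs already return exact token counts with each response, and input and output tokens are priced differently. A variant that prices from the response's `usage` block — the rates here are illustrative, so check your provider's current pricing:

```javascript
// Illustrative per-1K-token rates; replace with current provider pricing
const PRICING = {
  'gpt-4': { input: 0.03, output: 0.06 }
};

// Price a single completion from the usage block the API returns
function costFromUsage(model, usage) {
  const rates = PRICING[model];
  if (!rates) throw new Error(`No pricing configured for ${model}`);
  return (usage.prompt_tokens / 1000) * rates.input +
         (usage.completion_tokens / 1000) * rates.output;
}
```

Summing `costFromUsage` over every LLM call in a run gives a precise per-task cost, which is usually more trustworthy than a 50/50 split assumption.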
3. Reliability Patterns
class ReliableAgent {
constructor(agent, options = {}) {
this.agent = agent;
this.maxRetries = options.maxRetries || 3;
this.backoffMs = options.backoffMs || 1000;
}
async run(task) {
let lastError;
for (let attempt = 0; attempt < this.maxRetries; attempt++) {
// Checkpoint system: save state before each attempt so the catch block can restore it
const checkpoint = this.agent.getState();
try {
const result = await this.executeWithTimeout(task, 120000);
// Validate result
if (!this.isValidResult(result)) {
throw new Error('Invalid agent output');
}
return result;
} catch (error) {
lastError = error;
console.warn(`Agent attempt ${attempt + 1} failed:`, error.message);
// Exponential backoff
if (attempt < this.maxRetries - 1) {
await this.sleep(this.backoffMs * Math.pow(2, attempt));
// Restore from last checkpoint
this.agent.restoreState(checkpoint);
}
}
}
throw new Error(`Agent failed after ${this.maxRetries} attempts: ${lastError.message}`);
}
async executeWithTimeout(task, timeoutMs) {
return Promise.race([
this.agent.run(task),
new Promise((_, reject) =>
setTimeout(() => reject(new Error('Agent timeout')), timeoutMs)
)
]);
}
isValidResult(result) {
return result !== null &&
result !== undefined &&
typeof result === 'string' &&
result.length > 0;
}
sleep(ms) {
return new Promise(resolve => setTimeout(resolve, ms));
}
}
Common Pitfalls and Solutions
1. Agent Gets Stuck in Loops
Problem: Agent repeatedly tries the same failed action.
Solution: Track action history and detect loops:
class LoopDetector {
constructor(windowSize = 3) {
this.history = [];
this.windowSize = windowSize;
}
addAction(action, input) {
this.history.push({ action, input: JSON.stringify(input) });
if (this.history.length > this.windowSize * 2) {
this.history.shift();
}
}
isInLoop() {
if (this.history.length < this.windowSize * 2) return false;
const recent = this.history.slice(-this.windowSize);
const previous = this.history.slice(-this.windowSize * 2, -this.windowSize);
return JSON.stringify(recent) === JSON.stringify(previous);
}
}
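Wiring the detector into an agent loop might look like this — a sketch where the repeated-action signal is fed back to the model as an observation rather than silently retried:

```javascript
// Record each action on the detector; if a loop is detected, push a corrective
// observation into the agent's history and signal the loop to break or re-plan.
function checkForLoop(detector, action, actionInput, history) {
  detector.addAction(action, actionInput);
  if (detector.isInLoop()) {
    history.push({
      observation: 'You are repeating the same action with the same input. ' +
        'Try a different tool or different input.'
    });
    return true;
  }
  return false;
}
```

Telling the model *why* it was interrupted tends to recover better than a bare retry, since the loop usually stems from the model not seeing that its approach failed.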
2. Hallucinated Tool Calls
Problem: Agent invents tools that don't exist.
Solution: Strict tool validation + error feedback:
async function executeTool(toolName, args, availableTools) {
const tool = availableTools.find(t => t.name === toolName);
if (!tool) {
return {
error: `Tool "${toolName}" does not exist.`,
availableTools: availableTools.map(t => t.name),
suggestion: `Did you mean one of: ${availableTools.map(t => t.name).join(', ')}?`
};
}
return await tool.execute(args);
}
3. Context Window Overflow
Problem: Agent runs out of context trying to track long conversations.
Solution: Summarization + selective context:
// Rough token estimate (~4 characters per token), matching ConversationMemory above
function estimateTokens(messages) {
return messages.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
}
async function manageContext(messages, llm) {
const tokenCount = estimateTokens(messages);
const maxTokens = 8000;
if (tokenCount > maxTokens * 0.8) {
// Summarize older messages
const toSummarize = messages.slice(0, -10);
const recent = messages.slice(-10);
const summary = await llm.complete(`
Summarize this conversation history concisely:
${toSummarize.map(m => `${m.role}: ${m.content}`).join('\n')}
`);
return [
{ role: 'system', content: `Previous conversation summary: ${summary}` },
...recent
];
}
return messages;
}
Conclusion
AI agents are no longer science fiction—they're powering customer service, code generation, data analysis, and workflow automation in production systems. But building reliable agents requires careful architecture:
- Choose the right pattern: ReAct for transparency, function calling for structure, multi-agent for complexity
- Design safe tools: Validate inputs, implement rate limiting, require human approval for destructive actions
- Manage state properly: Short-term memory for conversations, long-term vector storage for recall
- Prioritize observability: Log every thought, action, and observation for debugging
- Build in reliability: Retries, timeouts, loop detection, and graceful error handling
The agents you build today will be your team's autonomous coworkers tomorrow. Build them to be reliable, observable, and safe—because once they're running in production, you need to trust them to make the right decisions.
Written by StaticBlock Editorial
StaticBlock Editorial is a technical writer and software engineer specializing in web development, performance optimization, and developer tooling.