Commit b67019e: Add AI agent monitoring tutorial Part 1 (#3670)

Authored by Yusu-f, LinaLam, juliettech13, chitalian, and tinomarques.

8 files changed: +286 −0 lines
{
  "title": "Building and Monitoring AI Agents: A Step-by-Step Guide (Part 1)",
  "title1": "Building and Monitoring AI Agents (Part 1): A Step-by-Step Guide",
  "title2": "Building and Monitoring AI Agents (Part 1): A Step-by-Step Guide",
  "description": "Building reliable AI agents requires robust monitoring and observability. In this tutorial, we'll build a simple financial research assistant and optimize its performance using Helicone's agentic AI observability features.",
  "images": "/static/blog/ai-agent-monitoring-tutorial/building-and-monitoring-ai-agents.webp",
  "time": "10 minute read",
  "author": "Yusuf Ishola",
  "date": "May 2, 2025",
  "badge": "Best Practices"
}
**Time to complete: ~30 minutes**

Your AI agent worked perfectly in testing, but now in production it's making bizarre recommendations and you have no idea why. Sound familiar? As AI agents grow increasingly complex, the black-box problem is becoming the number one obstacle to reliable deployment.

![Building and Monitoring AI Agents](/static/blog/ai-agent-monitoring-tutorial/building-and-monitoring-ai-agents.webp)

In this first part of our **two-part series** on AI agent observability, we'll build a financial research assistant that demonstrates the key components of a modern AI agent. In part two, we'll explore how to monitor it effectively with Helicone's agentic AI observability features.

Let's get started!

## Table of Contents

## Prerequisites

Before we dive in, you'll need:

- **<a href="https://nodejs.org/" target="_blank" rel="noopener">Node.js 16+</a>** installed on your machine
- **<a href="https://platform.openai.com/api-keys" target="_blank" rel="noopener">OpenAI API key</a>**
- **<a href="https://www.alphavantage.co/support/#api-key" target="_blank" rel="noopener">Alpha Vantage API key</a>** (free tier available)
- **<a href="https://helicone.ai/signup" target="_blank" rel="noopener">Helicone API key</a>** (free tier available)

## Quick Start

Want to skip ahead and try the code immediately? Clone the GitHub repository and install the dependencies:

```bash
git clone https://github.com/Yusu-f/helicone-agent-tutorial.git
cd helicone-agent-tutorial
npm install
```

Create a `.env` file with your API keys:

```bash
OPENAI_API_KEY=your_openai_key_here
ALPHA_VANTAGE_API_KEY=your_alpha_vantage_key_here
HELICONE_API_KEY=your_helicone_key_here
```

Then run the assistant:

```bash
npm start
```

This gives you a version of the financial assistant with basic Helicone monitoring.
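Since all three keys are read from `.env`, a fail-fast check at startup makes missing-key errors obvious instead of surfacing later as cryptic API failures. A minimal sketch (the `missingEnvKeys` helper is illustrative and not part of the tutorial repo):

```javascript
// Illustrative startup check: fail fast if any required API key is absent.
const requiredEnvKeys = [
  "OPENAI_API_KEY",
  "ALPHA_VANTAGE_API_KEY",
  "HELICONE_API_KEY",
];

// Returns the subset of `keys` that are missing or empty in `env`.
function missingEnvKeys(env, keys) {
  return keys.filter((k) => !env[k] || env[k].trim() === "");
}

const missing = missingEnvKeys(process.env, requiredEnvKeys);
if (missing.length > 0) {
  console.error(`Missing environment variables: ${missing.join(", ")}`);
}
```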
In part 2, we'll show you how to add comprehensive monitoring to your AI agent with Helicone's Sessions feature.

## How We'll Build Our Financial Assistant

Our financial assistant does two things:

1. Fetches real-time price information and news for specific tickers
2. Uses RAG to answer questions about financial concepts

The agent intelligently determines which approach to take for each query—a pattern applicable to many domains beyond finance, including customer support, healthcare, and legal applications.

## Key Components of Our AI Agent

### 1. Tools

Our agent uses OpenAI's function-calling (tools) API to determine how to handle different queries:
```javascript
// Define OpenAI tools for function calling
const tools = [
  {
    type: "function",
    function: {
      name: "getStockData",
      description:
        "Get current price and other information for a specific stock by ticker symbol",
      parameters: {
        type: "object",
        properties: {
          ticker: {
            type: "string",
            description: "The stock ticker symbol, e.g., AAPL for Apple Inc.",
          },
        },
        required: ["ticker"],
      },
    },
  },
  // ...getStockNews and searchFinancialTerms are declared the same way
];
```

This approach allows the model to decide which functions to call based on the user's query.
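Once the model returns `tool_calls`, each call names a declared function and carries JSON-encoded arguments, and our code has to dispatch it to a local implementation. A hedged sketch of that dispatch step (the stub implementations and `runToolCall` helper are illustrative; the tutorial repo's actual wiring may differ):

```javascript
// Map tool names (as declared in `tools`) to local implementations.
// These are stubs for illustration only.
const toolImplementations = {
  getStockData: async ({ ticker }) => `stub data for ${ticker}`,
  getStockNews: async ({ ticker }) => `stub news for ${ticker}`,
  searchFinancialTerms: async ({ query }) => `stub definition for ${query}`,
};

// Execute one tool call from the model's response and return the
// `role: "tool"` message to append back into the conversation.
async function runToolCall(toolCall) {
  const impl = toolImplementations[toolCall.function.name];
  if (!impl) {
    return {
      role: "tool",
      tool_call_id: toolCall.id,
      content: `Unknown tool: ${toolCall.function.name}`,
    };
  }
  const args = JSON.parse(toolCall.function.arguments || "{}");
  const result = await impl(args);
  return { role: "tool", tool_call_id: toolCall.id, content: String(result) };
}
```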
### 2. Basic Helicone Monitoring

The financial assistant uses Helicone's basic monitoring to track the cost, latency, and error rate of our LLM calls. You can create an account for free <a href="https://www.helicone.ai/signup" target="_blank" rel="noopener">here</a>.

```javascript
// Route all OpenAI calls through Helicone's proxy for monitoring
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://oai.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
  },
});
```
### 3. RAG & External API Access

For common financial-term queries, we use a vector store to retrieve relevant information, while for stock-specific queries, such as real-time price information or news, we connect to the Alpha Vantage API:

```javascript
async function searchFinancialTerms(query, vectorStore) {
  console.log("Searching for financial term definitions in knowledge base...");

  // Get relevant documents from vector store with similarity scores
  const resultsWithScores = await vectorStore.similaritySearchWithScore(query, 2);

  // Process results...
}
```
```javascript
async function getStockData(ticker) {
  try {
    const url = `https://www.alphavantage.co/query?function=GLOBAL_QUOTE&symbol=${ticker}&apikey=${ALPHA_VANTAGE_API_KEY}`;
    const response = await axios.get(url);

    // ...parse the quote out of response.data
  } catch (error) {
    // ...handle network and API errors
  }
}
```
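Alpha Vantage's GLOBAL_QUOTE endpoint nests its fields under a `"Global Quote"` key with numbered names like `"05. price"` (per Alpha Vantage's public documentation; treat the exact field names as something to verify against a live response). A small parsing sketch for the elided step above (`parseGlobalQuote` is a hypothetical helper, not code from the tutorial repo):

```javascript
// Extract the fields the assistant cares about from a GLOBAL_QUOTE payload.
// Returns null when the payload lacks the expected shape, e.g. an invalid
// ticker or a rate-limit notice returned in place of a quote.
function parseGlobalQuote(payload) {
  const quote = payload && payload["Global Quote"];
  if (!quote || !quote["05. price"]) return null;
  return {
    symbol: quote["01. symbol"],
    price: parseFloat(quote["05. price"]),
    changePercent: quote["10. change percent"],
  };
}
```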
The RAG implementation provides domain-specific knowledge to the agent. However, as we'll see later, **without proper monitoring**, pinpointing what causes the system to fail when it does can be difficult.
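One common source of such failures is passing low-relevance matches to the model anyway. A hedged sketch of score-based filtering on the `[document, score]` pairs returned by `similaritySearchWithScore` (the `filterByScore` helper and threshold are illustrative; also check your vector store's score semantics, since some stores return distances where lower is better):

```javascript
// Keep only retrieval results whose similarity score clears a threshold.
// Assumes higher score = more similar; invert the comparison for
// distance-based stores.
function filterByScore(resultsWithScores, minScore = 0.75) {
  return resultsWithScores
    .filter(([, score]) => score >= minScore)
    .map(([doc]) => doc);
}
```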
### 4. Minimal Agent Loop (Tool Calling)

We expose three tools to the LLM:

- `getStockData`: Retrieves current price and market information for a specific ticker
- `getStockNews`: Fetches the latest news articles related to a stock ticker
- `searchFinancialTerms`: Queries our vector database for information about financial concepts

The LLM may call a tool, receive its output as feedback, and then answer the user.

The loop allows our agent to call tools and process results for as long as needed to generate an appropriate response:
```javascript
async function processQuery(userQuery, vectorStore) {
  let messages = [
    {
      role: "system",
      content: `You're a financial assistant. Use tools when needed. If you have enough information to answer, reply normally.`,
    },
    { role: "user", content: userQuery },
  ];

  // Add chat history for context if available
  if (chatHistory.length > 0) {
    messages.splice(1, 0, ...chatHistory);
  }

  while (true) {
    console.log("Sending query to OpenAI...");
    const llmResp = await openai.chat.completions.create({
      model: "gpt-3.5-turbo",
      tools,
      messages,
      temperature: 0.1,
    });

    const msg = llmResp.choices[0].message;

    if (msg.tool_calls && msg.tool_calls.length > 0) {
      // Execute the helper and push the messages into history...

      continue;
    }

    // No tool call → LLM has produced the final answer
    return msg.content;
  }
}
```
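One caveat with `while (true)`: if the model keeps requesting tools, the loop never exits. A bounded variant of the same loop can be sketched as follows (here `callModel` stands in for the OpenAI client and `executeToolCalls` for the tool-dispatch step; both are assumptions for this sketch, not code from the tutorial repo):

```javascript
// Bounded agent loop (illustrative). Appends the assistant's tool-calling
// turn plus one `role: "tool"` message per call, then asks the model again,
// giving up after `maxIterations` rounds.
async function runAgentLoop(callModel, executeToolCalls, messages, maxIterations = 5) {
  for (let i = 0; i < maxIterations; i++) {
    const msg = await callModel(messages);
    if (msg.tool_calls && msg.tool_calls.length > 0) {
      // Keep the assistant turn, then append the tool results.
      messages.push(msg);
      const toolMessages = await executeToolCalls(msg.tool_calls);
      messages.push(...toolMessages);
      continue;
    }
    return msg.content; // final answer
  }
  return "Sorry, I couldn't complete that request."; // iteration guard hit
}
```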
## Testing Our Financial Assistant

Now, let's take our financial assistant for a spin!

Run the following command to start the assistant:

```bash
npm start
```

We can view the results of our queries in the **Helicone dashboard**.

> Prompt: What is Tesla's stock price?

Result:

![Example of a financial research assistant responding to the prompt 'What is Tesla's stock price?' displaying the stock information.](/static/blog/ai-agent-monitoring-tutorial/tesla-prompt.webp)

> Prompt: What is GreenEnergy's profit margin?

Result:

![Example of an AI agent failing to retrieve information about GreenEnergy's profit margin despite having it in the knowledge base.](/static/blog/ai-agent-monitoring-tutorial/greenenergy-profit-margin.webp)

It looks like there's an issue with our agent! It can't find GreenEnergy's profit margin despite it being in our knowledge base.

Something is obviously wrong with our RAG implementation—but what?

This is where **observability** comes in!
## Debugging Our Financial Assistant

Looking at our implementation, there are several blind spots that could cause issues:

- **Hallucinations and retrieval issues**: Our agent failed to answer the query related to GreenEnergy despite having the requisite information—how do we pinpoint the problem?
- **Cost visibility**: How many tokens is each component of our agent consuming? Which queries are most expensive?
- **Latency issues**: If the agent becomes slow, which step is causing the bottleneck?
- **Error patterns**: Are certain types of queries consistently failing? Where in the pipeline do these failures occur?

In <a href="https://www.helicone.ai/blog/ai-agent-monitoring-tutorial-part-2" target="_blank" rel="noopener">Part 2 of this tutorial</a> on AI agent optimization, we'll add Helicone to our financial assistant to gain comprehensive visibility into every step of the process. Here's a preview of what you can see:

![Helicone dashboard showcasing the Sessions feature for debugging AI agents, demonstrating AI observability and agent monitoring capabilities.](/static/blog/ai-agent-monitoring-tutorial/sessions-ai-agent.webp)

We'll monitor each step of the agent's workflow, resolve bugs, and gain insights into useful metrics like cost, latency, and error rates.

Stay tuned!
<CallToAction
  title="Observe Your AI Agents with Helicone ⚡️"
  description="Stop building AI in the dark. Get complete visibility into every step of your AI workflows, track costs down to the penny, and debug complex issues in minutes instead of days."
  primaryButtonText="Start Monitoring for Free"
  primaryButtonLink="https://helicone.ai/signup"
  secondaryButtonText="See How Debugging Works"
  secondaryButtonLink="https://docs.helicone.ai/features/sessions"
/>

### You might also like:

- **<a href="https://www.helicone.ai/blog/ai-agent-monitoring-tutorial-part-2" target="_blank" rel="noopener">Part 2: Step-by-Step Guide to Building and Optimizing AI Agents</a>**
- **<a href="https://www.helicone.ai/blog/debugging-chatbots-and-ai-agents-with-sessions" target="_blank" rel="noopener">Debugging RAG Chatbots and AI Agents with Sessions</a>**
- **<a href="https://www.helicone.ai/blog/full-guide-to-improving-ai-agents" target="_blank" rel="noopener">The Full Developer's Guide to Building Effective AI Agents</a>**
- **<a href="https://www.helicone.ai/blog/agentic-rag-full-developer-guide" target="_blank" rel="noopener">Building Agentic RAG Systems: A Developer's Guide to Smarter Information Retrieval</a>**

<FAQ
  items={[
    {
      question: "Why do AI agents need specialized observability tools?",
      answer: "AI agents have unique monitoring challenges including non-deterministic execution paths, multi-step LLM calls, complex branching logic, and dependencies on external systems. Unlike traditional applications with fixed flows, agents' decision trees vary with each request. Standard monitoring tools can't track these dynamic workflows or evaluate response quality across interconnected steps, which is why specialized tools like Helicone's session-based tracing are essential for AI agent observability."
    },
    {
      question: "What are the biggest blind spots when deploying AI agents to production?",
      answer: "The most dangerous blind spots include: undetected hallucinations in responses, hidden cost escalations from inefficient prompts, silent failures in multi-step reasoning chains, data leakage in RAG implementations, inconsistent performance across different user segments, and degrading accuracy over time as data or usage patterns change. Without proper observability, these issues can persist for weeks before being discovered, potentially causing significant business impact."
    },
    {
      question: "What metrics should I monitor for any AI agent?",
      answer: "Critical metrics for all AI agents include: end-to-end latency of complete workflows, token usage per step and total cost per request, step completion rates showing where agents get stuck, retrieval quality for RAG implementations, routing accuracy between different processing pathways, error rates for external API calls, and user satisfaction with responses. Tracking these metrics helps identify bottlenecks, optimize costs, and ensure reliable agent performance."
    },
    {
      question: "How do I implement observability across different AI agent frameworks?",
      answer: "Helicone offers flexible integration options for all major AI frameworks. For LangChain, CrewAI, and LlamaIndex, direct integrations are available. For custom agents or other frameworks, you can typically use either Helicone's proxy approach (changing just the base URL) or the SDK integration. The Sessions feature works consistently across most major frameworks to trace multi-step agent workflows regardless of your technology choices, giving you a unified view of all AI operations."
    }
  ]}
/>

<Questions />

bifrost/app/blog/page.tsx (5 additions, 0 deletions):

```diff
@@ -234,6 +234,11 @@ export const BLOG_CONTENT: BlogStructure[] = [
       folderName: "the-complete-guide-to-LLM-observability-platforms",
     },
   },
+  {
+    dynmaicEntry: {
+      folderName: "ai-agent-monitoring-tutorial",
+    },
+  },
   {
     dynmaicEntry: {
       folderName: "implement-and-monitor-cag",
```