Commit bd5df65 ("modules")
1 parent 7fc223b

5 files changed: +347 additions, −0 deletions

module-2.md

Lines changed: 29 additions & 0 deletions
# Adding turn detection

In this exercise, we'll add a semantic turn detector model to the agent.

Step 1: Install the package

```shell
uv add "livekit-agents[turn-detector]"
```

Step 2: Import the package

```python
from livekit.plugins.turn_detector.multilingual import MultilingualModel
```

Step 3: Add the model to the agent (inside the `AgentSession` constructor)

```python
turn_detection=MultilingualModel(),
```

Step 4: Run the agent

```shell
uv run agent.py console
```

Now you should be able to see the turn detection in action.
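To build intuition for what the model decides, here is a toy heuristic sketch of the turn detector's input/output contract. This is NOT the LiveKit `MultilingualModel` (which is a trained transformer); it only illustrates that a turn detector maps the transcript so far to a "finished speaking?" judgment, and the cue list is an invented assumption.

```python
# Toy sketch of a turn detector's contract, not the real model:
# given the transcript so far, estimate whether the user has finished
# their turn. The cue list below is purely illustrative.
TRAILING_CUES = ("and", "but", "so", "because", "um", "uh")

def likely_end_of_turn(transcript: str) -> bool:
    """Return True if the utterance looks complete, False if the user
    probably intends to keep talking."""
    words = transcript.strip().rstrip(".?!").lower().split()
    if not words:
        return False
    # Trailing off with a conjunction or filler suggests an unfinished turn.
    return words[-1] not in TRAILING_CUES

print(likely_end_of_turn("What's the weather in Tokyo?"))   # True
print(likely_end_of_turn("I was thinking that we could, um"))  # False
```

The real model makes this decision semantically across many languages, so it handles cases a keyword heuristic like this would miss.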

module-3.md

Lines changed: 58 additions & 0 deletions
# Customizing the agent's behavior

Exercise 1: Change the agent's instructions and personality. Modify the system prompt in the `Assistant` class:

```python
instructions="""
You are a hilariously funny voice AI assistant.
You are also a bit sarcastic.
Assist the user, but don't be too helpful.
""",
```

Exercise 2: Change the agent's voice. Modify the `openai.TTS` constructor:

```python
tts=openai.TTS(voice="ash"),
```

Exercise 3: Add the fallback adapters.

Import the `stt`, `llm`, and `tts` modules:

```python
from livekit.agents import stt, llm, tts
```

Add the fallback adapters to the `AgentSession` constructor, using OpenAI as the fallback (since it's already installed).

First, extract the VAD from the `AgentSession` constructor into a variable so it can be shared:

```python
vad = silero.VAD.load()
# ...
vad=vad,
```

Then add the fallback adapters to the `AgentSession` constructor:

```python
llm=llm.FallbackAdapter(
    [
        openai.LLM(model="gpt-4.1"),
        openai.LLM(model="gpt-4o-mini"),
    ]
),
stt=stt.FallbackAdapter(
    [
        deepgram.STT(model="nova-3", language="multi"),
        stt.StreamAdapter(stt=openai.STT(model="gpt-4o-transcribe"), vad=vad),
    ]
),
tts=tts.FallbackAdapter(
    [
        openai.TTS(voice="ash"),
        deepgram.TTS(),
    ]
),
```
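The idea behind all three adapters is the same: try each provider in order and fall through to the next on failure. Here is a minimal stdlib sketch of that pattern (this is not the LiveKit implementation, which also tracks provider health, timeouts, and streaming; all names below are invented for illustration):

```python
# Minimal sketch of the fallback pattern the adapters implement:
# try each provider in order, return the first success, and raise
# only when every provider has failed.
class AllProvidersFailed(Exception):
    pass

class FallbackSketch:
    def __init__(self, providers):
        self.providers = providers  # ordered: primary first

    def run(self, request):
        errors = []
        for provider in self.providers:
            try:
                return provider(request)
            except Exception as exc:
                errors.append(exc)  # remember why, then fall through
        raise AllProvidersFailed(errors)

def flaky_primary(req):
    raise ConnectionError("primary is down")

def backup(req):
    return f"handled: {req}"

adapter = FallbackSketch([flaky_primary, backup])
print(adapter.run("hello"))  # handled: hello
```

Ordering matters: put your preferred (cheapest or highest-quality) provider first, since the others are only consulted when it fails.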

module-4.md

Lines changed: 110 additions & 0 deletions
# Adding metrics collection

In this exercise, we'll add metrics collection to the agent. This includes stats on each component, as well as a custom stat measuring the total time for the agent to respond in audio.

Step 1: Add the import to the `agent.py` file:

```python
from livekit.agents import metrics, MetricsCollectedEvent, AgentStateChangedEvent
```

Step 2: Add the metrics collection to the `entrypoint` function, before the `session.start` call:

```python
usage_collector = metrics.UsageCollector()
last_eou_metrics: metrics.EOUMetrics | None = None

@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    nonlocal last_eou_metrics
    if ev.metrics.type == "eou_metrics":
        last_eou_metrics = ev.metrics

    metrics.log_metrics(ev.metrics)
    usage_collector.collect(ev.metrics)

async def log_usage():
    summary = usage_collector.get_summary()
    logger.info(f"Usage: {summary}")

ctx.add_shutdown_callback(log_usage)

@session.on("agent_state_changed")
def _on_agent_state_changed(ev: AgentStateChangedEvent):
    if (
        ev.new_state == "speaking"
        and last_eou_metrics
        and last_eou_metrics.speech_id == session.current_speech.id
    ):
        logger.info(
            f"Agent response - Time to first audio frame: {ev.created_at - last_eou_metrics.last_speaking_time}"
        )
```

Now you should see real metrics appear in the console when you run the agent.
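The custom stat above is just the difference of two timestamps: the moment the user stopped speaking (recorded from the end-of-utterance metrics) and the moment the agent entered the "speaking" state. A stdlib sketch of that arithmetic, using plain epoch floats as stand-ins for the event timestamps:

```python
import time

# Sketch of the time-to-first-audio computation: one handler stores the
# moment the user stopped speaking; when the agent starts speaking, the
# latency is simply the difference between the two timestamps.
last_speaking_time = time.time()   # stand-in for last_eou_metrics.last_speaking_time
time.sleep(0.05)                   # stand-in for STT + LLM + TTS latency
agent_speaking_at = time.time()    # stand-in for ev.created_at

time_to_first_audio = agent_speaking_at - last_speaking_time
print(f"Time to first audio frame: {time_to_first_audio:.3f}s")
```

In the real handler, the `speech_id` check ensures the two timestamps belong to the same conversational turn before the subtraction is logged.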
# Pre-emptive generation

Now we'll turn on a feature to speed up handling of long messages.

Add pre-emptive generation to the `AgentSession` constructor:

```python
preemptive_generation=True,
```

Compare the complete response latency before and after the change.
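To see why this helps, here is a stdlib sketch of the scheduling difference (not the LiveKit implementation; the sleep durations are invented stand-ins): without pre-emption the LLM call starts only after the turn detector confirms end-of-turn, while with pre-emption the reply is generated concurrently with the confirmation and is simply discarded if the user keeps talking.

```python
import asyncio
import time

async def confirm_end_of_turn():
    await asyncio.sleep(0.10)   # stand-in: turn-detector confirmation delay
    return True

async def generate_reply():
    await asyncio.sleep(0.15)   # stand-in: LLM + TTS latency

async def sequential():
    # Without pre-emptive generation: confirm first, then generate.
    await confirm_end_of_turn()
    await generate_reply()

async def preemptive():
    # With pre-emptive generation: start generating immediately,
    # overlapping with the end-of-turn confirmation.
    reply = asyncio.create_task(generate_reply())
    await confirm_end_of_turn()
    await reply   # mostly finished by the time confirmation arrives

for variant in (sequential, preemptive):
    start = time.perf_counter()
    asyncio.run(variant())
    print(f"{variant.__name__}: {time.perf_counter() - start:.2f}s")
```

The sequential path pays both delays back to back; the pre-emptive path pays only the longer of the two.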
# Optional: Langfuse tracing

To add Langfuse to the agent, create an account at [Langfuse](https://langfuse.com/) and get an API key (you'll need to create an organization and a project first).

Step 1: Add your keys to the `.env.local` file:

```
LANGFUSE_PUBLIC_KEY=
LANGFUSE_SECRET_KEY=
LANGFUSE_HOST=
```

Step 2: Import the telemetry modules:

```python
from livekit.agents.telemetry import set_tracer_provider
import os
import base64
```

Step 3: Define the `setup_langfuse` function in the `agent.py` file:

```python
def setup_langfuse(
    host: str | None = None, public_key: str | None = None, secret_key: str | None = None
):
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    public_key = public_key or os.getenv("LANGFUSE_PUBLIC_KEY")
    secret_key = secret_key or os.getenv("LANGFUSE_SECRET_KEY")
    host = host or os.getenv("LANGFUSE_HOST")

    if not public_key or not secret_key or not host:
        raise ValueError("LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_HOST must be set")

    langfuse_auth = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = f"{host.rstrip('/')}/api/public/otel"
    os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {langfuse_auth}"

    trace_provider = TracerProvider()
    trace_provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
    set_tracer_provider(trace_provider)
```

Step 4: Add the `setup_langfuse` function call to the `entrypoint` function:

```python
async def entrypoint(ctx: JobContext):
    setup_langfuse()
```
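The only non-obvious line in `setup_langfuse` is the auth header: the OTLP exporter authenticates to Langfuse with HTTP Basic auth, where the credential is the base64 encoding of `public_key:secret_key`. A quick stdlib check of that encoding with dummy keys:

```python
import base64

# Verify the Basic-auth encoding used for OTEL_EXPORTER_OTLP_HEADERS.
# The keys here are dummies; real keys come from the .env.local file.
public_key, secret_key = "pk-test", "sk-test"
auth = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
header = f"Authorization=Basic {auth}"
print(header)
```

Decoding the value on the receiving side recovers the original `public_key:secret_key` pair, which is all HTTP Basic auth is.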

module-5.md

Lines changed: 69 additions & 0 deletions
# Add function tools

In this exercise, we'll add function tools to the agent.

Step 1: Add the imports to the `agent.py` file:

```python
from livekit.agents.llm import function_tool
from livekit.agents import RunContext
import aiohttp
```

Step 2: Add a simple function tool to the `Assistant` class:

```python
@function_tool
async def lookup_weather(self, context: RunContext, location: str):
    """Use this tool to look up current weather information in the given location.

    If the location is not supported by the weather service, the tool will indicate this. You must tell the user the location's weather is unavailable.

    Args:
        location: The location to look up weather information for (e.g. city name)
    """

    logger.info(f"Looking up weather for {location}")

    try:
        async with aiohttp.ClientSession() as session:
            async with session.get(f"http://shayne.app/weather?location={location}") as response:
                if response.status == 200:
                    data = await response.json()
                    condition = data.get("condition", "unknown")
                    temperature = data.get("temperature", "unknown")
                    unit = data.get("unit", "degrees")
                    return f"{condition} with a temperature of {temperature} {unit}"
                else:
                    logger.error(f"Weather API returned status {response.status}")
                    return "Weather information is currently unavailable for this location."
    except Exception as e:
        logger.error(f"Error fetching weather: {e}")
        return "Weather service is temporarily unavailable."
```
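Under the hood, a decorator like `@function_tool` registers the coroutine, along with its name and docstring, so the LLM can be offered the tool and the framework can dispatch the model's tool call back to your Python function. A stdlib sketch of that mechanism (this is not the LiveKit implementation; the registry and `dispatch` helper are invented for illustration):

```python
import asyncio
import inspect

# Sketch of a @function_tool-style decorator: record the coroutine and its
# docstring in a registry, keyed by name, so tool calls can be dispatched.
TOOL_REGISTRY = {}

def function_tool(fn):
    TOOL_REGISTRY[fn.__name__] = {
        "fn": fn,
        "description": inspect.getdoc(fn),  # what the LLM sees
    }
    return fn

@function_tool
async def lookup_weather(location: str) -> str:
    """Look up current weather information in the given location."""
    return f"sunny in {location}"

async def dispatch(tool_name: str, **kwargs):
    # What the framework does when the model emits a tool call.
    return await TOOL_REGISTRY[tool_name]["fn"](**kwargs)

result = asyncio.run(dispatch("lookup_weather", location="Tokyo"))
print(result)  # sunny in Tokyo
```

This is why the docstring in the real tool matters: it is the description the model reads when deciding whether and how to call the tool.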
# Add MCP servers

In this exercise, we'll add an MCP server to the agent.

Step 1: Install the MCP package:

```shell
uv add "livekit-agents[mcp]"
```

Step 2: Add the import to the `agent.py` file:

```python
from livekit.agents import mcp
```

Step 3: Add the MCP servers to the `Assistant` class's super constructor:

```python
mcp_servers=[
    mcp.MCPServerHTTP(url="https://shayne.app/sse"),
],
```

Your agent is now connected to a simple MCP server that supports a tool called `add_numbers`.

module-6.md

Lines changed: 81 additions & 0 deletions
# Collecting recording consent

In this exercise, we'll add a task to the agent to collect recording consent.

Step 1: Add the import to the `agent.py` file:

```python
from livekit.agents import AgentTask
```

Step 2: Define the `CollectConsent` class:

```python
class CollectConsent(AgentTask[bool]):
    def __init__(self, chat_ctx=None):
        super().__init__(
            instructions="""
            Ask for recording consent and get a clear yes or no answer.
            Be polite and professional.
            """,
            chat_ctx=chat_ctx,
        )

    async def on_enter(self) -> None:
        await self.session.generate_reply(instructions="""
            Briefly introduce yourself, then ask for permission to record the call for quality assurance and training purposes.
            Make it clear that they can decline.
            """)

    @function_tool
    async def consent_given(self) -> None:
        """Use this when the user gives consent to record."""
        self.complete(True)

    @function_tool
    async def consent_denied(self) -> None:
        """Use this when the user denies consent to record."""
        self.complete(False)
```

Step 3: Start the task in the `Assistant` class's `on_enter` method, then proceed based on the result:

```python
async def on_enter(self) -> None:
    if await CollectConsent(chat_ctx=self.chat_ctx):
        logger.info("User gave consent to record.")
        await self.session.generate_reply(instructions="Thank the user for their consent then offer your assistance.")
    else:
        logger.info("User did not give consent to record.")
        await self.session.generate_reply(instructions="Let the user know that the call will not be recorded, then offer your assistance.")
```
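The pattern to notice is that `CollectConsent` is itself awaitable: `await CollectConsent(...)` suspends until one of the tools calls `self.complete(True)` or `self.complete(False)`, and that value becomes the result of the `await`. A stdlib sketch of this control flow (not the LiveKit implementation; the class and callback are invented for illustration):

```python
import asyncio

class TaskSketch:
    """Awaitable task that finishes when complete() is called,
    mimicking the AgentTask[bool] pattern."""

    def __init__(self):
        self._future = None

    def complete(self, result: bool):
        # In the real task, a tool like consent_given() calls this.
        self._future.set_result(result)

    def __await__(self):
        self._future = asyncio.get_running_loop().create_future()
        # Simulate the conversation eventually resolving the task:
        asyncio.get_running_loop().call_soon(self.complete, True)
        return self._future.__await__()

async def main():
    consent = await TaskSketch()  # suspends until complete() fires
    return consent

print(asyncio.run(main()))  # True
```

This is why Step 3 can branch directly on `if await CollectConsent(...)`: the task's completion value is the boolean the consent tools supplied.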
# Add a handoff

In this exercise, we'll add a handoff to the agent.

Step 1: Create another agent to hand off to:

```python
class Manager(Agent):
    def __init__(self, chat_ctx=None):
        super().__init__(
            instructions="""You are a manager for a team of helpful voice AI assistants.
            A customer has been escalated to you.
            Provide your assistance and be professional.
            """,
            tts=openai.TTS(voice="coral"),
            chat_ctx=chat_ctx,
        )
```

Step 2: Add the handoff to the `Assistant` class as a function tool:

```python
@function_tool
async def escalate_to_manager(self, context: RunContext):
    """Use this tool to escalate the call to the manager, upon user request."""
    return Manager(chat_ctx=self.chat_ctx), "Escalating to manager..."
```

Now you can ask the assistant to escalate to the manager.
