
🤖 cnoe-agent-utils


  • Reusable utilities and abstractions for building agent-based (LLM-powered) systems.
  • Centralized LLM Factory supporting major providers (AWS, Azure, GCP, OpenAI, Gemini, Anthropic).
  • Centralized Tracing Utilities (since v0.2.0) to eliminate duplicated tracing code across CNOE agents.

Key Features

Core Utilities

  • Unified interface (LLM Factory) for seamless LLM instantiation across multiple clouds and vendors.
    • 🏭 LLM Factory for easy model instantiation across:
      • ☁️ AWS
      • ☁️ Azure
      • ☁️ GCP Vertex
      • πŸ€– Google Gemini
      • πŸ€– Anthropic Claude
      • πŸ€– OpenAI
  • Simple, environment-variable-driven configuration.
  • Example scripts for each LLM provider with setup instructions.

Agent Tracing (since v0.2.0)

  • Centralized tracing logic: Removes 350+ lines of repeated code per agent.
  • Single import/decorator: No more copy-pasting tracing logic.
  • Environment-based toggling: Use ENABLE_TRACING env var to control all tracing.
  • A2A Tracing Disabling: Single method to monkey-patch/disable agent-to-agent tracing everywhere.
  • Graceful fallback: Works with or without Langfuse; tracing is zero-overhead when disabled.

Note: Check out this tutorial on Tracing.

🚀 LLM Factory Getting Started

🛡️ Create and Activate a Virtual Environment

It is recommended to use a virtual environment to manage dependencies:

python3 -m venv .venv
source .venv/bin/activate
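
On Windows, the activation command differs (Command Prompt shown; PowerShell uses .venv\Scripts\Activate.ps1):

.venv\Scripts\activate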

⚡ Prerequisite: Install uv

Before running the examples, install uv:

pip install uv
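
Alternatively, uv provides a standalone installer (check the uv documentation for the current command):

curl -LsSf https://astral.sh/uv/install.sh | sh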

📦 Installation

Installation Options

Default Installation (recommended for most users):

pip install cnoe-agent-utils

This installs all dependencies and provides full functionality. It's equivalent to pip install 'cnoe-agent-utils[all]'.

Minimal Installation (specific functionality only): use one of these when you need only a particular provider or want to minimize package size:

# Anthropic Claude support only
pip install "cnoe-agent-utils[anthropic]"

# OpenAI support (openai.com GPT models) only
pip install "cnoe-agent-utils[openai]"

# Azure OpenAI support (Azure-hosted GPT models) only
pip install "cnoe-agent-utils[azure]"

# AWS support (Bedrock, etc.) only
pip install "cnoe-agent-utils[aws]"

# Google Cloud support (Vertex AI, Gemini) only
pip install "cnoe-agent-utils[gcp]"

# Advanced tracing and observability (Langfuse, OpenTelemetry) only
pip install "cnoe-agent-utils[tracing]"

# Development dependencies (testing, linting, etc.)
pip install "cnoe-agent-utils[dev]"

Using uv

# Default installation (all dependencies)
uv add cnoe-agent-utils

# Minimal installation (specific functionality only)
uv add "cnoe-agent-utils[anthropic]"
uv add "cnoe-agent-utils[openai]"
uv add "cnoe-agent-utils[azure]"
uv add "cnoe-agent-utils[aws]"
uv add "cnoe-agent-utils[gcp]"
uv add "cnoe-agent-utils[tracing]"

Local Development

If you are developing locally:

git clone https://github.com/cnoe-io/cnoe-agent-utils.git
cd cnoe-agent-utils
uv sync
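
From a local checkout, the provider examples can be run directly; running the unit tests is a useful smoke check as well (assuming pytest comes in through the dev dependencies):

# Run a provider example against your own credentials
uv run examples/test_openai.py

# Run the unit tests (assumes pytest is installed via the dev extra)
uv run pytest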

🧑‍💻 Usage

To test integration with different LLM providers, configure the required environment variables for each provider as shown below. Then, run the corresponding example script using uv.
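
All providers follow the same pattern: LLMFactory is constructed with a provider key, reads that provider's environment variables, and get_llm() returns a LangChain-compatible chat model. A minimal sketch using the "aws-bedrock" key from the prompt-caching example further below (the keys for the other providers can be found in their example scripts):

from cnoe_agent_utils.llm_factory import LLMFactory

# The provider key selects which backend the factory configures from
# environment variables. "aws-bedrock" is the key used in the prompt-caching
# example below; check the example scripts for the other providers' keys.
llm = LLMFactory("aws-bedrock").get_llm()

# get_llm() returns a LangChain chat model, so the standard invoke() API applies
response = llm.invoke("In one sentence, what is platform engineering?")
print(response.content)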


🤖 Anthropic

Set the following environment variables:

export ANTHROPIC_API_KEY=<your_anthropic_api_key>
export ANTHROPIC_MODEL_NAME=<model_name>

# Optional: Enable extended thinking for Claude 4+ models
export ANTHROPIC_THINKING_ENABLED=true
export ANTHROPIC_THINKING_BUDGET=1024  # Default: 1024, Min: 1024

Run the example:

uv run examples/test_anthropic.py

☁️ AWS Bedrock (Anthropic Claude)

Set the following environment variables:

export AWS_PROFILE=<your_aws_profile>
export AWS_REGION=<your_aws_region>
export AWS_BEDROCK_MODEL_ID="us.anthropic.claude-3-7-sonnet-20250219-v1:0"
export AWS_BEDROCK_PROVIDER="anthropic"

# Optional: Enable extended thinking for Claude 4+ models
export AWS_BEDROCK_THINKING_ENABLED=true
export AWS_BEDROCK_THINKING_BUDGET=1024  # Default: 1024, Min: 1024

Run the example:

uv run examples/test_aws_bedrock_claude.py

AWS Bedrock Prompt Caching

AWS Bedrock supports prompt caching to reduce latency and costs by caching repeated context across requests. This feature is particularly beneficial for:

  • Multi-turn conversations with long system prompts
  • Repeated use of large context documents
  • Agent systems with consistent instructions

Enable prompt caching:

export AWS_BEDROCK_ENABLE_PROMPT_CACHE=true

Supported Models:

For the latest list of models that support prompt caching and their minimum token requirements, see the AWS Bedrock Prompt Caching documentation.

Implementation Note: When AWS_BEDROCK_ENABLE_PROMPT_CACHE=true, the library uses ChatBedrockConverse, which has native prompt caching support. If your model doesn't support caching, AWS Bedrock will return a clear error message. There's no need to validate model compatibility in advance; AWS handles this automatically.

Note: Model IDs may include regional prefixes (us., eu., ap., etc.) depending on your AWS account configuration. Pass the full model ID as provided by AWS:

  • Example: us.anthropic.claude-3-7-sonnet-20250219-v1:0
  • Example: anthropic.claude-opus-4-1-20250805-v1:0

Benefits:

  • Up to 85% reduction in latency for cached content
  • Up to 90% reduction in costs for cached tokens
  • 5-minute cache TTL (automatically managed by AWS)
  • Maximum 4 cache checkpoints per request

Usage Example:

import os
from cnoe_agent_utils.llm_factory import LLMFactory
from langchain_core.messages import SystemMessage, HumanMessage

# Enable caching
os.environ["AWS_BEDROCK_ENABLE_PROMPT_CACHE"] = "true"

# Initialize LLM
llm = LLMFactory("aws-bedrock").get_llm()

# Create cache point for system message
cache_point = llm.create_cache_point()

# Build messages with cache control
messages = [
    SystemMessage(content=[
        {"text": "You are a helpful AI assistant with expertise in..."},
        cache_point  # Marks cache checkpoint
    ]),
    HumanMessage(content="What is your primary function?")
]

# Invoke with caching
response = llm.invoke(messages)

# Check cache statistics in response metadata
if hasattr(response, 'response_metadata'):
    usage = response.response_metadata.get('usage', {})
    print(f"Cache read tokens: {usage.get('cacheReadInputTokens', 0)}")
    print(f"Cache creation tokens: {usage.get('cacheCreationInputTokens', 0)}")

Run the caching example:

uv run examples/aws_bedrock_cache_example.py

Monitoring Cache Performance:

Cache hit/miss statistics are available in:

  1. Response metadata - cacheReadInputTokens and cacheCreationInputTokens
  2. CloudWatch metrics - Track cache performance across all requests
  3. Application logs - Enable via AWS_CREDENTIALS_DEBUG=true

Best Practices:

  • Use cache for system prompts and context that remain consistent across requests
  • Ensure cached content meets minimum token requirements (see AWS documentation for model-specific limits)
  • Place cache points strategically (after system messages, large context documents, or tool definitions)
  • Monitor cache hit rates to optimize placement

☁️ Azure OpenAI

Set the following environment variables:

export AZURE_OPENAI_API_KEY=<your_azure_openai_api_key>
export AZURE_OPENAI_API_VERSION=<api_version>
export AZURE_OPENAI_DEPLOYMENT=gpt-4.1
export AZURE_OPENAI_ENDPOINT=<your_azure_openai_endpoint>

Run the example:

uv run examples/test_azure_openai.py

🤖 OpenAI

Set the following environment variables:

export OPENAI_API_KEY=<your_openai_api_key>
export OPENAI_ENDPOINT=https://api.openai.com/v1
export OPENAI_MODEL_NAME=gpt-4.1

Optional configuration:

export OPENAI_DEFAULT_HEADERS='{"my-header-key":"my-value"}'
export OPENAI_USER=user-identifier

Run the example:

uv run examples/test_openai.py

🤖 Google Gemini

Set the following environment variable:

export GOOGLE_API_KEY=<your_google_api_key>

Run the example:

uv run examples/test_google_gemini.py

☁️ GCP Vertex AI

Set the following environment variables:

export GOOGLE_APPLICATION_CREDENTIALS=~/.config/gcp.json
export VERTEXAI_MODEL_NAME="gemini-2.0-flash-001"

# Optional: Enable extended thinking for Claude 4+ models on Vertex AI
export VERTEXAI_THINKING_ENABLED=true
export VERTEXAI_THINKING_BUDGET=1024  # Default: 1024, Min: 1024

Run the example:

uv run examples/test_gcp_vertexai.py

These examples demonstrate how to use the LLM Factory and other utilities provided by the library.


📜 License

Apache 2.0 (see LICENSE)


👥 Maintainers

See MAINTAINERS.md

  • Contributions welcome via PR or issue!
