Merged
5 changes: 1 addition & 4 deletions .github/workflows/ci.yml
@@ -12,7 +12,7 @@ permissions:

env:
MIX_ENV: test
-  ELIXIR_VERSION: "1.17.3"
+  ELIXIR_VERSION: "1.18.3"
OTP_VERSION: "27.2"

jobs:
@@ -96,9 +96,6 @@ jobs:
fail-fast: false
matrix:
include:
-          - os: ubuntu-latest
-            elixir: "1.17.3"
-            otp: "27.2"
- os: ubuntu-latest
elixir: "1.18.3"
otp: "27.2"
23 changes: 23 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,29 @@

All notable changes to this project will be documented in this file.

## [0.13.2] - 2026-03-07

### Added

- **Auto-update memory**: `Nous.Plugins.Memory` can now automatically reflect on conversations and update memories after each run — no explicit tool calls needed. Enable with `auto_update_memory: true` in `memory_config`. Configurable reflection model, frequency, and context limits.
- New `after_run/3` callback in `Nous.Plugin` behaviour — runs once after the entire agent run completes. Wired into both `AgentRunner.run/3` and `run_with_context/3`.
- `Nous.Plugin.run_after_run/4` helper for executing the hook across all plugins
- New config options: `:auto_update_memory`, `:auto_update_every`, `:reflection_model`, `:reflection_max_tokens`, `:reflection_max_messages`, `:reflection_max_memories`
- New example: `examples/memory/auto_update.exs`
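
A minimal sketch of enabling the new behavior (the model string and ETS store here are illustrative choices, following `examples/memory/auto_update.exs`):

```elixir
# Sketch: enable automatic memory reflection after each run.
agent =
  Nous.new("openai:gpt-4o-mini",
    plugins: [Nous.Plugins.Memory],
    deps: %{
      memory_config: %{
        store: Nous.Memory.Store.ETS,
        auto_update_memory: true,
        # Reflect after every run; raise this to reflect less often.
        auto_update_every: 1,
        # Optional: use a cheaper/faster model for the reflection step.
        reflection_model: "openai:gpt-4o-mini",
        reflection_max_tokens: 500
      }
    }
  )
```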

## [0.13.1] - 2026-03-06

### Added

- **Vertex AI provider**: `Nous.Providers.VertexAI` for accessing Gemini models through Google Cloud Vertex AI. Supports enterprise features (VPC-SC, CMEK, regional endpoints, IAM).
- Three auth modes: app config Goth (`config :nous, :vertex_ai, goth: MyApp.Goth`), per-model Goth (`default_settings: %{goth: MyApp.Goth}`), or direct access token (`api_key` / `VERTEX_AI_ACCESS_TOKEN`)
- Bearer token auth via `api_key` option, `VERTEX_AI_ACCESS_TOKEN` env var, or Goth integration
- Goth integration (`{:goth, "~> 1.4", optional: true}`) for automatic service account token management — reuse existing Goth processes from PubSub, etc.
- URL auto-construction from `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_REGION` env vars
- `Nous.Providers.VertexAI.endpoint/2` helper to build endpoint URLs
- Reuses existing Gemini message format, response parsing, and stream normalization
- Model string: `"vertex_ai:gemini-2.0-flash"`
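
The `endpoint/2` helper and model string above combine as in this sketch (project and region are placeholders, and the access token is assumed to be set in the environment):

```elixir
# Sketch: build the Vertex AI endpoint explicitly and parse a model.
model =
  Nous.Model.parse("vertex_ai:gemini-2.0-flash",
    base_url: Nous.Providers.VertexAI.endpoint("my-project", "us-central1"),
    api_key: System.get_env("VERTEX_AI_ACCESS_TOKEN")
  )
```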

## [0.12.2] - 2026-03-04

### Fixed
79 changes: 79 additions & 0 deletions README.md
@@ -92,6 +92,8 @@ IO.puts("Tokens: #{result.usage.total_tokens}")
| LM Studio | `lmstudio:qwen3` | ✅ |
| OpenAI | `openai:gpt-4` | ✅ |
| Anthropic | `anthropic:claude-sonnet-4-5-20250929` | ✅ |
| Google Gemini | `gemini:gemini-2.0-flash` | ✅ |
| Google Vertex AI | `vertex_ai:gemini-2.0-flash` | ✅ |
| Groq | `groq:llama-3.1-70b-versatile` | ✅ |
| Ollama | `ollama:llama2` | ✅ |
| OpenRouter | `openrouter:anthropic/claude-3.5-sonnet` | ✅ |
@@ -106,9 +108,86 @@ All HTTP providers use pure Elixir HTTP clients (Req + Finch). LlamaCpp runs in-
agent = Nous.new("lmstudio:qwen3") # Local (free)
agent = Nous.new("openai:gpt-4") # OpenAI
agent = Nous.new("anthropic:claude-sonnet-4-5-20250929") # Anthropic
agent = Nous.new("vertex_ai:gemini-2.0-flash") # Google Vertex AI
agent = Nous.new("llamacpp:local", llamacpp_model: llm) # Local NIF
```

### Google Vertex AI Setup

Vertex AI provides enterprise access to Gemini models. To use it with a service account:

**1. Create a service account:**

```bash
export PROJECT_ID="your-project-id"

# Enable Vertex AI API
gcloud services enable aiplatform.googleapis.com --project=$PROJECT_ID

# Create service account
gcloud iam service-accounts create nous-vertex-ai \
--display-name="Nous Vertex AI" \
--project=$PROJECT_ID

# Grant permission
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:nous-vertex-ai@${PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"

# Download key and store as env var
gcloud iam service-accounts keys create /tmp/sa.json \
--iam-account="nous-vertex-ai@${PROJECT_ID}.iam.gserviceaccount.com"

# Set the env vars
export GOOGLE_CREDENTIALS="$(cat /tmp/sa.json)"
export GOOGLE_CLOUD_PROJECT="$PROJECT_ID"
export GOOGLE_CLOUD_REGION="us-central1"
```

**2. Add Goth to your deps** (handles token refresh from the service account):

```elixir
{:goth, "~> 1.4"}
```

**3. Start Goth in your supervision tree:**

```elixir
credentials = System.get_env("GOOGLE_CREDENTIALS") |> Jason.decode!()

children = [
{Goth, name: MyApp.Goth, source: {:service_account, credentials}}
]
```

**4. Configure Nous to use Goth:**

```elixir
# Option A: Via app config (recommended for production)
# config/config.exs
config :nous, :vertex_ai, goth: MyApp.Goth

# Then just use it — no extra options needed:
agent = Nous.new("vertex_ai:gemini-2.0-flash")
{:ok, result} = Nous.run(agent, "Hello from Vertex AI!")
```

```elixir
# Option B: Per-model (useful for multiple projects/regions)
agent = Nous.new("vertex_ai:gemini-2.0-flash",
default_settings: %{goth: MyApp.Goth}
)
```

```elixir
# Option C: Direct access token (no Goth needed, e.g. for quick testing)
# First, in your shell: export VERTEX_AI_ACCESS_TOKEN="$(gcloud auth print-access-token)"

agent = Nous.new("vertex_ai:gemini-2.0-flash")
```

See [`examples/providers/vertex_ai_goth_test.exs`](examples/providers/vertex_ai_goth_test.exs) for a runnable example.

## Features

### Tool Calling
10 changes: 10 additions & 0 deletions docs/getting-started.md
@@ -49,6 +49,16 @@ export ANTHROPIC_API_KEY="sk-ant-your-key"
export OPENAI_API_KEY="sk-your-key"
```

**Google Vertex AI:**
```bash
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_REGION="us-central1" # optional, defaults to us-central1
# Option A: Use gcloud access token
export VERTEX_AI_ACCESS_TOKEN="$(gcloud auth print-access-token)"
# Option B: Use Goth with service account (recommended for production)
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
```

**Test cloud setup:**
```bash
mix run -e "
2 changes: 2 additions & 0 deletions examples/README.md
@@ -44,6 +44,7 @@ Provider-specific configuration and features:
| [providers/openai.exs](https://github.com/nyo16/nous/blob/master/examples/providers/openai.exs) | GPT models, function calling, settings |
| [providers/lmstudio.exs](https://github.com/nyo16/nous/blob/master/examples/providers/lmstudio.exs) | Local AI with LM Studio |
| [providers/vllm_sglang.exs](https://github.com/nyo16/nous/blob/master/examples/providers/vllm_sglang.exs) | vLLM & SGLang high-performance local inference |
| [providers/vertex_ai.exs](https://github.com/nyo16/nous/blob/master/examples/providers/vertex_ai.exs) | Google Vertex AI with Goth auth |
| [providers/llamacpp.exs](https://github.com/nyo16/nous/blob/master/examples/providers/llamacpp.exs) | Local NIF-based inference via llama.cpp |
| [providers/switching_providers.exs](https://github.com/nyo16/nous/blob/master/examples/providers/switching_providers.exs) | Provider comparison and selection |

@@ -59,6 +60,7 @@ Persistent agent memory with hybrid search:
| [memory/duckdb_full.exs](https://github.com/nyo16/nous/blob/master/examples/memory/duckdb_full.exs) | DuckDB with FTS + vector search |
| [memory/hybrid_full.exs](https://github.com/nyo16/nous/blob/master/examples/memory/hybrid_full.exs) | Muninn + Zvec for maximum search quality |
| [memory/cross_agent.exs](https://github.com/nyo16/nous/blob/master/examples/memory/cross_agent.exs) | Two agents sharing memory with scoping |
| [memory/auto_update.exs](https://github.com/nyo16/nous/blob/master/examples/memory/auto_update.exs) | Auto-update memory after each run (no explicit tool calls) |

## Advanced Examples

82 changes: 82 additions & 0 deletions examples/memory/auto_update.exs
@@ -0,0 +1,82 @@
# Auto-Update Memory
#
# Demonstrates automatic memory updates after each agent run.
# Instead of the agent explicitly calling remember/recall/forget tools,
# a reflection step runs after each conversation turn and updates
# memories automatically — similar to Claude Code's "recalled/wrote memory".
#
# Run: OPENAI_API_KEY="sk-..." mix run examples/memory/auto_update.exs
#
# You can also use a local model:
# mix run examples/memory/auto_update.exs

# Choose a model (local or cloud)
model = (System.get_env("OPENAI_API_KEY") && "openai:gpt-4o-mini") || "lmstudio:qwen3-4b"

alias Nous.Memory.Store

# Create an agent with auto_update_memory enabled
agent =
Nous.new(model,
plugins: [Nous.Plugins.Memory],
instructions: "You are a helpful personal assistant. Remember what the user tells you.",
deps: %{
memory_config: %{
store: Store.ETS,
auto_update_memory: true,
auto_update_every: 1,
# Use a cheaper/faster model for the reflection step (optional)
# reflection_model: "openai:gpt-4o-mini",
reflection_max_tokens: 500
}
}
)

IO.puts("=== Auto-Update Memory Demo ===\n")

# Turn 1: Tell the agent something personal
IO.puts("--- Turn 1 ---")
{:ok, result1} = Nous.run(agent, "My name is Alice and I work as a data scientist at Acme Corp.")
IO.puts("Agent: #{result1.output}\n")

# Check what memories were auto-created
store_state = result1.context.deps[:memory_config][:store_state]
{:ok, memories} = Store.ETS.list(store_state, [])
IO.puts("Memories after turn 1 (#{length(memories)}):")
for m <- memories, do: IO.puts(" - [#{m.type}] #{m.content}")
IO.puts("")

# Turn 2: Continue the conversation (pass context for continuity)
IO.puts("--- Turn 2 ---")

{:ok, result2} =
Nous.run(agent, "Actually, I just switched jobs. I'm now at TechCorp as a ML engineer.",
context: result1.context
)

IO.puts("Agent: #{result2.output}\n")

# Check memories again — should have updated, not duplicated
store_state = result2.context.deps[:memory_config][:store_state]
{:ok, memories} = Store.ETS.list(store_state, [])
IO.puts("Memories after turn 2 (#{length(memories)}):")
for m <- memories, do: IO.puts(" - [#{m.type}] (#{m.id |> String.slice(0..7)}) #{m.content}")
IO.puts("")

# Turn 3: Ask something that requires memory
IO.puts("--- Turn 3 ---")
{:ok, result3} = Nous.run(agent, "What do you know about me?", context: result2.context)
IO.puts("Agent: #{result3.output}\n")

# Final memory state
store_state = result3.context.deps[:memory_config][:store_state]
{:ok, memories} = Store.ETS.list(store_state, [])
IO.puts("=== Final Memory State (#{length(memories)} memories) ===")

for m <- memories do
IO.puts(" [#{m.type}, importance: #{m.importance}] #{m.content}")
end

run_count = result3.context.deps[:memory_config][:_run_count]
IO.puts("\nReflection runs completed: #{run_count}")
IO.puts("Done!")
136 changes: 136 additions & 0 deletions examples/providers/vertex_ai.exs
@@ -0,0 +1,136 @@
#!/usr/bin/env elixir

# Nous AI - Google Vertex AI Provider
#
# Vertex AI provides enterprise access to Gemini models with features like
# VPC-SC, CMEK, regional endpoints, and IAM-based access control.
#
# Prerequisites:
# - A Google Cloud project with Vertex AI API enabled
# - Authentication (one of):
# a) Access token: `export VERTEX_AI_ACCESS_TOKEN=$(gcloud auth print-access-token)`
# b) Goth with service account: `export GOOGLE_APPLICATION_CREDENTIALS=/path/to/sa.json`
# - Project configuration:
# `export GOOGLE_CLOUD_PROJECT=your-project-id`
# `export GOOGLE_CLOUD_REGION=us-central1` (optional, defaults to us-central1)

IO.puts("=== Nous AI - Vertex AI Provider ===\n")

# ============================================================================
# Option 1: Using environment variables
# ============================================================================

IO.puts("--- Setup with Environment Variables ---")

project = System.get_env("GOOGLE_CLOUD_PROJECT")
token = System.get_env("VERTEX_AI_ACCESS_TOKEN")

if project && token do
IO.puts("Project: #{project}")
IO.puts("Region: #{System.get_env("GOOGLE_CLOUD_REGION", "us-central1")}\n")

# With env vars set, just use the model string
agent =
Nous.new("vertex_ai:gemini-2.0-flash",
instructions: "You are a helpful assistant. Be concise."
)

case Nous.run(agent, "What is Elixir? Answer in one sentence.") do
{:ok, result} ->
IO.puts("Response: #{result.output}")
IO.puts("Tokens: #{result.usage.total_tokens}")

{:error, error} ->
IO.puts("Error: #{inspect(error)}")
end
else
IO.puts("""
Skipping: Set these environment variables to test:
export GOOGLE_CLOUD_PROJECT=your-project-id
export VERTEX_AI_ACCESS_TOKEN=$(gcloud auth print-access-token)
""")
end

IO.puts("")

# ============================================================================
# Option 2: Explicit configuration
# ============================================================================

IO.puts("--- Explicit Configuration ---")

IO.puts("""
# Pass base_url and api_key directly:
model = Nous.Model.parse("vertex_ai:gemini-2.0-flash",
base_url: Nous.Providers.VertexAI.endpoint("my-project", "us-central1"),
api_key: access_token
)
""")

# ============================================================================
# Option 3: Using Goth (recommended for production)
# ============================================================================

IO.puts("--- Goth Integration (Production) ---")

IO.puts("""
# 1. Add {:goth, "~> 1.4"} to your deps
# 2. Start Goth in your supervision tree:
#
# children = [
# {Goth, name: MyApp.Goth}
# ]
#
# 3. Set GOOGLE_APPLICATION_CREDENTIALS to your service account JSON
# 4. Configure Nous:
#
# config :nous, :vertex_ai,
# goth: MyApp.Goth,
# base_url: "https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1"
#
# 5. Use it:
# agent = Nous.new("vertex_ai:gemini-2.0-flash")
""")

# ============================================================================
# Streaming
# ============================================================================

IO.puts("--- Streaming ---")

if project && token do
agent =
Nous.new("vertex_ai:gemini-2.0-flash",
instructions: "You are a helpful assistant."
)

case Nous.run_stream(agent, "Write a haiku about Elixir.") do
{:ok, stream} ->
stream
|> Enum.each(fn
{:text_delta, text} -> IO.write(text)
{:finish, _} -> IO.puts("\n")
_ -> :ok
end)

{:error, error} ->
IO.puts("Streaming error: #{inspect(error)}")
end
else
IO.puts("Skipping streaming demo (no credentials)\n")
end

# ============================================================================
# Available Gemini Models on Vertex AI
# ============================================================================

IO.puts("--- Available Models ---")

IO.puts("""
Model | Description
-------------------------------|-------------------------------------------
gemini-2.0-flash | Fast, efficient for most tasks
gemini-2.0-flash-lite | Lightweight, lowest latency
gemini-2.5-pro-preview-06-05 | Most capable, best for complex reasoning
gemini-2.5-flash-preview-05-20 | Balanced speed and capability
""")