Merged
5 changes: 1 addition & 4 deletions .github/workflows/ci.yml
@@ -12,7 +12,7 @@ permissions:

env:
MIX_ENV: test
-  ELIXIR_VERSION: "1.17.3"
+  ELIXIR_VERSION: "1.18.3"
OTP_VERSION: "27.2"

jobs:
@@ -96,9 +96,6 @@ jobs:
fail-fast: false
matrix:
include:
-          - os: ubuntu-latest
-            elixir: "1.17.3"
-            otp: "27.2"
- os: ubuntu-latest
elixir: "1.18.3"
otp: "27.2"
23 changes: 23 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,29 @@

All notable changes to this project will be documented in this file.

## [0.13.2] - 2026-03-07

### Added

- **Auto-update memory**: `Nous.Plugins.Memory` can now automatically reflect on conversations and update memories after each run — no explicit tool calls needed. Enable with `auto_update_memory: true` in `memory_config`. Configurable reflection model, frequency, and context limits.
- New `after_run/3` callback in `Nous.Plugin` behaviour — runs once after the entire agent run completes. Wired into both `AgentRunner.run/3` and `run_with_context/3`.
- `Nous.Plugin.run_after_run/4` helper for executing the hook across all plugins
- New config options: `:auto_update_memory`, `:auto_update_every`, `:reflection_model`, `:reflection_max_tokens`, `:reflection_max_messages`, `:reflection_max_memories`
- New example: `examples/memory/auto_update.exs`
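
A minimal sketch of enabling the new behavior (the model string and ETS store here are illustrative choices, following `examples/memory/auto_update.exs`):

```elixir
# Sketch: enable automatic memory reflection after each run.
agent =
  Nous.new("openai:gpt-4o-mini",
    plugins: [Nous.Plugins.Memory],
    deps: %{
      memory_config: %{
        store: Nous.Memory.Store.ETS,
        auto_update_memory: true,
        # Reflect after every run; raise this to reflect less often.
        auto_update_every: 1,
        # Optional: use a cheaper/faster model for the reflection step.
        reflection_model: "openai:gpt-4o-mini",
        reflection_max_tokens: 500
      }
    }
  )
```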

## [0.13.1] - 2026-03-06

### Added

- **Vertex AI provider**: `Nous.Providers.VertexAI` for accessing Gemini models through Google Cloud Vertex AI. Supports enterprise features (VPC-SC, CMEK, regional endpoints, IAM).
- Three auth modes: app config Goth (`config :nous, :vertex_ai, goth: MyApp.Goth`), per-model Goth (`default_settings: %{goth: MyApp.Goth}`), or direct access token (`api_key` / `VERTEX_AI_ACCESS_TOKEN`)
- Bearer token auth via `api_key` option, `VERTEX_AI_ACCESS_TOKEN` env var, or Goth integration
- Goth integration (`{:goth, "~> 1.4", optional: true}`) for automatic service account token management — reuse existing Goth processes from PubSub, etc.
- URL auto-construction from `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_REGION` env vars
- `Nous.Providers.VertexAI.endpoint/2` helper to build endpoint URLs
- Reuses existing Gemini message format, response parsing, and stream normalization
- Model string: `"vertex_ai:gemini-2.0-flash"`
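
The `endpoint/2` helper and model string above combine as in this sketch (project and region are placeholders, and the access token is assumed to be set in the environment):

```elixir
# Sketch: build the Vertex AI endpoint explicitly and parse a model.
model =
  Nous.Model.parse("vertex_ai:gemini-2.0-flash",
    base_url: Nous.Providers.VertexAI.endpoint("my-project", "us-central1"),
    api_key: System.get_env("VERTEX_AI_ACCESS_TOKEN")
  )
```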

## [0.12.2] - 2026-03-04

### Fixed
79 changes: 79 additions & 0 deletions README.md
@@ -92,6 +92,8 @@ IO.puts("Tokens: #{result.usage.total_tokens}")
| LM Studio | `lmstudio:qwen3` | ✅ |
| OpenAI | `openai:gpt-4` | ✅ |
| Anthropic | `anthropic:claude-sonnet-4-5-20250929` | ✅ |
| Google Gemini | `gemini:gemini-2.0-flash` | ✅ |
| Google Vertex AI | `vertex_ai:gemini-2.0-flash` | ✅ |
| Groq | `groq:llama-3.1-70b-versatile` | ✅ |
| Ollama | `ollama:llama2` | ✅ |
| OpenRouter | `openrouter:anthropic/claude-3.5-sonnet` | ✅ |
@@ -106,9 +108,86 @@ All HTTP providers use pure Elixir HTTP clients (Req + Finch). LlamaCpp runs in-
agent = Nous.new("lmstudio:qwen3") # Local (free)
agent = Nous.new("openai:gpt-4") # OpenAI
agent = Nous.new("anthropic:claude-sonnet-4-5-20250929") # Anthropic
agent = Nous.new("vertex_ai:gemini-2.0-flash") # Google Vertex AI
agent = Nous.new("llamacpp:local", llamacpp_model: llm) # Local NIF
```

### Google Vertex AI Setup

Vertex AI provides enterprise access to Gemini models. To use it with a service account:

**1. Create a service account:**

```bash
export PROJECT_ID="your-project-id"

# Enable Vertex AI API
gcloud services enable aiplatform.googleapis.com --project=$PROJECT_ID

# Create service account
gcloud iam service-accounts create nous-vertex-ai \
--display-name="Nous Vertex AI" \
--project=$PROJECT_ID

# Grant permission
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:nous-vertex-ai@${PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"

# Download key and store as env var
gcloud iam service-accounts keys create /tmp/sa.json \
--iam-account="nous-vertex-ai@${PROJECT_ID}.iam.gserviceaccount.com"

# Set the env vars
export GOOGLE_CREDENTIALS="$(cat /tmp/sa.json)"
export GOOGLE_CLOUD_PROJECT="$PROJECT_ID"
export GOOGLE_CLOUD_REGION="us-central1"
```

**2. Add Goth to your deps** (handles token refresh from the service account):

```elixir
{:goth, "~> 1.4"}
```

**3. Start Goth in your supervision tree:**

```elixir
credentials = System.get_env("GOOGLE_CREDENTIALS") |> Jason.decode!()

children = [
{Goth, name: MyApp.Goth, source: {:service_account, credentials}}
]
```

**4. Configure Nous to use Goth:**

```elixir
# Option A: Via app config (recommended for production)
# config/config.exs
config :nous, :vertex_ai, goth: MyApp.Goth

# Then just use it — no extra options needed:
agent = Nous.new("vertex_ai:gemini-2.0-flash")
{:ok, result} = Nous.run(agent, "Hello from Vertex AI!")
```

```elixir
# Option B: Per-model (useful for multiple projects/regions)
agent = Nous.new("vertex_ai:gemini-2.0-flash",
default_settings: %{goth: MyApp.Goth}
)
```

```elixir
# Option C: Direct access token (no Goth needed, e.g. for quick testing)
# First, in your shell: export VERTEX_AI_ACCESS_TOKEN="$(gcloud auth print-access-token)"

agent = Nous.new("vertex_ai:gemini-2.0-flash")
```

See [`examples/providers/vertex_ai_goth_test.exs`](examples/providers/vertex_ai_goth_test.exs) for a runnable example.

## Features

### Tool Calling
10 changes: 10 additions & 0 deletions docs/getting-started.md
@@ -49,6 +49,16 @@ export ANTHROPIC_API_KEY="sk-ant-your-key"
export OPENAI_API_KEY="sk-your-key"
```

**Google Vertex AI:**
```bash
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_REGION="us-central1" # optional, defaults to us-central1
# Option A: Use gcloud access token
export VERTEX_AI_ACCESS_TOKEN="$(gcloud auth print-access-token)"
# Option B: Use Goth with service account (recommended for production)
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
```

**Test cloud setup:**
```bash
mix run -e "
2 changes: 2 additions & 0 deletions examples/README.md
@@ -44,6 +44,7 @@ Provider-specific configuration and features:
| [providers/openai.exs](https://github.com/nyo16/nous/blob/master/examples/providers/openai.exs) | GPT models, function calling, settings |
| [providers/lmstudio.exs](https://github.com/nyo16/nous/blob/master/examples/providers/lmstudio.exs) | Local AI with LM Studio |
| [providers/vllm_sglang.exs](https://github.com/nyo16/nous/blob/master/examples/providers/vllm_sglang.exs) | vLLM & SGLang high-performance local inference |
| [providers/vertex_ai.exs](https://github.com/nyo16/nous/blob/master/examples/providers/vertex_ai.exs) | Google Vertex AI with Goth auth |
| [providers/llamacpp.exs](https://github.com/nyo16/nous/blob/master/examples/providers/llamacpp.exs) | Local NIF-based inference via llama.cpp |
| [providers/switching_providers.exs](https://github.com/nyo16/nous/blob/master/examples/providers/switching_providers.exs) | Provider comparison and selection |

@@ -59,6 +60,7 @@ Persistent agent memory with hybrid search:
| [memory/duckdb_full.exs](https://github.com/nyo16/nous/blob/master/examples/memory/duckdb_full.exs) | DuckDB with FTS + vector search |
| [memory/hybrid_full.exs](https://github.com/nyo16/nous/blob/master/examples/memory/hybrid_full.exs) | Muninn + Zvec for maximum search quality |
| [memory/cross_agent.exs](https://github.com/nyo16/nous/blob/master/examples/memory/cross_agent.exs) | Two agents sharing memory with scoping |
| [memory/auto_update.exs](https://github.com/nyo16/nous/blob/master/examples/memory/auto_update.exs) | Auto-update memory after each run (no explicit tool calls) |

## Advanced Examples

82 changes: 82 additions & 0 deletions examples/memory/auto_update.exs
@@ -0,0 +1,82 @@
# Auto-Update Memory
#
# Demonstrates automatic memory updates after each agent run.
# Instead of the agent explicitly calling remember/recall/forget tools,
# a reflection step runs after each conversation turn and updates
# memories automatically — similar to Claude Code's "recalled/wrote memory".
#
# Run: OPENAI_API_KEY="sk-..." mix run examples/memory/auto_update.exs
#
# You can also use a local model:
# mix run examples/memory/auto_update.exs

# Choose a model (local or cloud)
model = (System.get_env("OPENAI_API_KEY") && "openai:gpt-4o-mini") || "lmstudio:qwen3-4b"

alias Nous.Memory.Store

# Create an agent with auto_update_memory enabled
agent =
Nous.new(model,
plugins: [Nous.Plugins.Memory],
instructions: "You are a helpful personal assistant. Remember what the user tells you.",
deps: %{
memory_config: %{
store: Store.ETS,
auto_update_memory: true,
auto_update_every: 1,
# Use a cheaper/faster model for the reflection step (optional)
# reflection_model: "openai:gpt-4o-mini",
reflection_max_tokens: 500
}
}
)

IO.puts("=== Auto-Update Memory Demo ===\n")

# Turn 1: Tell the agent something personal
IO.puts("--- Turn 1 ---")
{:ok, result1} = Nous.run(agent, "My name is Alice and I work as a data scientist at Acme Corp.")
IO.puts("Agent: #{result1.output}\n")

# Check what memories were auto-created
store_state = result1.context.deps[:memory_config][:store_state]
{:ok, memories} = Store.ETS.list(store_state, [])
IO.puts("Memories after turn 1 (#{length(memories)}):")
for m <- memories, do: IO.puts(" - [#{m.type}] #{m.content}")
IO.puts("")

# Turn 2: Continue the conversation (pass context for continuity)
IO.puts("--- Turn 2 ---")

{:ok, result2} =
Nous.run(agent, "Actually, I just switched jobs. I'm now at TechCorp as a ML engineer.",
context: result1.context
)

IO.puts("Agent: #{result2.output}\n")

# Check memories again — should have updated, not duplicated
store_state = result2.context.deps[:memory_config][:store_state]
{:ok, memories} = Store.ETS.list(store_state, [])
IO.puts("Memories after turn 2 (#{length(memories)}):")
for m <- memories, do: IO.puts(" - [#{m.type}] (#{m.id |> String.slice(0..7)}) #{m.content}")
IO.puts("")

# Turn 3: Ask something that requires memory
IO.puts("--- Turn 3 ---")
{:ok, result3} = Nous.run(agent, "What do you know about me?", context: result2.context)
IO.puts("Agent: #{result3.output}\n")

# Final memory state
store_state = result3.context.deps[:memory_config][:store_state]
{:ok, memories} = Store.ETS.list(store_state, [])
IO.puts("=== Final Memory State (#{length(memories)} memories) ===")

for m <- memories do
IO.puts(" [#{m.type}, importance: #{m.importance}] #{m.content}")
end

run_count = result3.context.deps[:memory_config][:_run_count]
IO.puts("\nReflection runs completed: #{run_count}")
IO.puts("Done!")
136 changes: 136 additions & 0 deletions examples/providers/vertex_ai.exs
@@ -0,0 +1,136 @@
#!/usr/bin/env elixir

# Nous AI - Google Vertex AI Provider
#
# Vertex AI provides enterprise access to Gemini models with features like
# VPC-SC, CMEK, regional endpoints, and IAM-based access control.
#
# Prerequisites:
# - A Google Cloud project with Vertex AI API enabled
# - Authentication (one of):
# a) Access token: `export VERTEX_AI_ACCESS_TOKEN=$(gcloud auth print-access-token)`
# b) Goth with service account: `export GOOGLE_APPLICATION_CREDENTIALS=/path/to/sa.json`
# - Project configuration:
# `export GOOGLE_CLOUD_PROJECT=your-project-id`
# `export GOOGLE_CLOUD_REGION=us-central1` (optional, defaults to us-central1)

IO.puts("=== Nous AI - Vertex AI Provider ===\n")

# ============================================================================
# Option 1: Using environment variables
# ============================================================================

IO.puts("--- Setup with Environment Variables ---")

project = System.get_env("GOOGLE_CLOUD_PROJECT")
token = System.get_env("VERTEX_AI_ACCESS_TOKEN")

if project && token do
IO.puts("Project: #{project}")
IO.puts("Region: #{System.get_env("GOOGLE_CLOUD_REGION", "us-central1")}\n")

# With env vars set, just use the model string
agent =
Nous.new("vertex_ai:gemini-2.0-flash",
instructions: "You are a helpful assistant. Be concise."
)

case Nous.run(agent, "What is Elixir? Answer in one sentence.") do
{:ok, result} ->
IO.puts("Response: #{result.output}")
IO.puts("Tokens: #{result.usage.total_tokens}")

{:error, error} ->
IO.puts("Error: #{inspect(error)}")
end
else
IO.puts("""
Skipping: Set these environment variables to test:
export GOOGLE_CLOUD_PROJECT=your-project-id
export VERTEX_AI_ACCESS_TOKEN=$(gcloud auth print-access-token)
""")
end

IO.puts("")

# ============================================================================
# Option 2: Explicit configuration
# ============================================================================

IO.puts("--- Explicit Configuration ---")

IO.puts("""
# Pass base_url and api_key directly:
model = Nous.Model.parse("vertex_ai:gemini-2.0-flash",
base_url: Nous.Providers.VertexAI.endpoint("my-project", "us-central1"),
api_key: access_token
)
""")

# ============================================================================
# Option 3: Using Goth (recommended for production)
# ============================================================================

IO.puts("--- Goth Integration (Production) ---")

IO.puts("""
# 1. Add {:goth, "~> 1.4"} to your deps
# 2. Start Goth in your supervision tree:
#
# children = [
# {Goth, name: MyApp.Goth}
# ]
#
# 3. Set GOOGLE_APPLICATION_CREDENTIALS to your service account JSON
# 4. Configure Nous:
#
# config :nous, :vertex_ai,
# goth: MyApp.Goth,
# base_url: "https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1"
#
# 5. Use it:
# agent = Nous.new("vertex_ai:gemini-2.0-flash")
""")

# ============================================================================
# Streaming
# ============================================================================

IO.puts("--- Streaming ---")

if project && token do
agent =
Nous.new("vertex_ai:gemini-2.0-flash",
instructions: "You are a helpful assistant."
)

case Nous.run_stream(agent, "Write a haiku about Elixir.") do
{:ok, stream} ->
stream
|> Enum.each(fn
{:text_delta, text} -> IO.write(text)
{:finish, _} -> IO.puts("\n")
_ -> :ok
end)

{:error, error} ->
IO.puts("Streaming error: #{inspect(error)}")
end
else
IO.puts("Skipping streaming demo (no credentials)\n")
end

# ============================================================================
# Available Gemini Models on Vertex AI
# ============================================================================

IO.puts("--- Available Models ---")

IO.puts("""
Model | Description
-------------------------------|-------------------------------------------
gemini-2.0-flash | Fast, efficient for most tasks
gemini-2.0-flash-lite | Lightweight, lowest latency
gemini-2.5-pro-preview-06-05 | Most capable, best for complex reasoning
gemini-2.5-flash-preview-05-20 | Balanced speed and capability
""")