feat: Add Vertex AI provider for Google Cloud Gemini access#28

Merged
nyo16 merged 10 commits into master from feat/vertex-ai-provider
Mar 10, 2026
Conversation

nyo16 (Owner) commented Mar 10, 2026

Summary

Add Nous.Providers.VertexAI for accessing Gemini models through Google Cloud
Vertex AI with enterprise features (VPC-SC, CMEK, regional endpoints, IAM).

Authentication

Three auth modes are supported:

  • App config Goth (production): config :nous, :vertex_ai, goth: MyApp.Goth
  • Per-model Goth: default_settings: %{goth: MyApp.Goth}
  • Direct access token: api_key option or VERTEX_AI_ACCESS_TOKEN env var

Goth ({:goth, "~> 1.4"}) is an optional dependency — reuse existing Goth
processes from PubSub, Cloud Storage, etc.
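The three modes above can be wired up roughly like this (a sketch: `MyApp.Goth` is a placeholder for your own Goth child spec, and the per-model shape is inferred from the `default_settings` option named above):

```elixir
# config/runtime.exs — app-wide Goth (production mode):
config :nous, :vertex_ai, goth: MyApp.Goth

# Per-model Goth, passed wherever the model's settings are built
# (exact call site depends on your setup):
#   default_settings: %{goth: MyApp.Goth}

# Direct access token, e.g. via the env var instead of Goth:
#   export VERTEX_AI_ACCESS_TOKEN=$(gcloud auth print-access-token)
```

The app-config route is the least invasive when you already run a Goth process for PubSub or Cloud Storage, since the provider just reuses it.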

What's included

  • Nous.Providers.VertexAI — provider module with Bearer token auth and SSE streaming
  • Nous.Providers.VertexAI.endpoint/2 — helper to build endpoint URLs
  • URL auto-construction from GOOGLE_CLOUD_PROJECT / GOOGLE_CLOUD_REGION env vars
  • Reuses all existing Gemini infrastructure (message format, response parsing, stream normalization)
  • Model string: "vertex_ai:gemini-2.0-flash"
  • README with full service account setup guide
  • Example scripts with Goth integration
  • 774 tests passing, 0 failures
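As a rough sketch of how these pieces fit together (only the model-string prefix, the `endpoint/2` helper, and the env vars come from this PR; the argument shapes are assumptions):

```elixir
# Project and region are picked up from the environment when not
# passed explicitly:
#   export GOOGLE_CLOUD_PROJECT=my-gcp-project
#   export GOOGLE_CLOUD_REGION=europe-west1

# endpoint/2 builds the regional endpoint URL; the arguments shown
# here (model name plus options) are an illustrative guess:
url = Nous.Providers.VertexAI.endpoint("gemini-2.0-flash", region: "europe-west1")

# The provider is selected via the "vertex_ai:" model-string prefix:
model = "vertex_ai:gemini-2.0-flash"
```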

Tested

Verified end-to-end with service account on europe-west1 (Belgium) — both
streaming and non-streaming requests working.

nyo16 added 10 commits March 6, 2026 14:50
Add Nous.Providers.VertexAI for accessing Gemini models through
Google Cloud Vertex AI with enterprise features (VPC-SC, CMEK,
regional endpoints, IAM-based access control).

Authentication supports:
- Direct access token via api_key option or VERTEX_AI_ACCESS_TOKEN env
- Goth integration (optional dep) for automatic service account token
  management — reuses existing Goth processes from PubSub etc.
- URL auto-construction from GOOGLE_CLOUD_PROJECT/GOOGLE_CLOUD_REGION

Reuses all existing Gemini infrastructure: message format conversion,
response parsing, and stream normalization.

Interactive test script that reads service account credentials from
GOOGLE_CREDENTIALS env var and tests both streaming and non-streaming
requests against Vertex AI.

Add after_run/3 callback to the Plugin behaviour, wired into AgentRunner
after successful runs. The Memory plugin uses this to automatically
reflect on conversations and update memories without agent tool calls,
similar to Claude Code's "recalled/wrote memory" behavior.

Configurable via memory_config: auto_update_memory, auto_update_every,
reflection_model, reflection_max_tokens, reflection_max_messages, and
reflection_max_memories.
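The options above might be set like this (a sketch: the keys come from the commit message, the values and the map shape are illustrative):

```elixir
# Illustrative memory_config for the Memory plugin's auto-update
# behaviour; values here are made-up examples, not defaults.
memory_config = %{
  auto_update_memory: true,          # reflect after successful runs
  auto_update_every: 3,              # e.g. reflect every 3rd run
  reflection_model: "gemini-2.0-flash",
  reflection_max_tokens: 512,
  reflection_max_messages: 20,       # cap on conversation context fed in
  reflection_max_memories: 50        # cap on stored memories considered
}
```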
- Add examples/memory/auto_update.exs demonstrating multi-turn
  auto-memory with context continuation
- Document auto-update config options in Plugins.Memory moduledoc
- Update Plugin moduledoc with after_run callback execution order
- Add auto_update.exs to examples README
- Add v0.13.2 changelog entry
- Add Google Vertex AI Setup section with service account creation steps
- Update test example to use gemini-3.1-pro model
- Expand Vertex AI README with all three auth modes (app config, per-model, direct token)
- Update changelog with Goth config details
- Bump version to 0.13.2
@nyo16 nyo16 merged commit a172877 into master Mar 10, 2026
6 checks passed
@nyo16 nyo16 deleted the feat/vertex-ai-provider branch March 10, 2026 14:49
