Merged
29 changes: 27 additions & 2 deletions CHANGELOG.md
@@ -2,7 +2,32 @@

All notable changes to this project will be documented in this file.

## [0.13.2] - 2026-03-07
## [0.12.8] - 2026-03-12

### Fixed

- **Vertex AI v1/v1beta1 bug**: `Model.parse("vertex_ai:gemini-2.5-pro-preview-06-05")` with `GOOGLE_CLOUD_PROJECT` set was storing a hardcoded `v1` URL in `model.base_url`, causing the provider's `v1beta1` selection logic to be bypassed. Preview models now correctly use `v1beta1` at request time.

### Added

- **Vertex AI input validation**: Project ID and region from environment variables are now validated with helpful error messages instead of producing opaque DNS/HTTP errors.
- **`GOOGLE_CLOUD_LOCATION` support**: Added as a fallback for `GOOGLE_CLOUD_REGION`, consistent with other Google Cloud libraries and tooling.
- Multi-region example script: `examples/providers/vertex_ai_multi_region.exs`

## [0.12.7] - 2026-03-10

### Fixed

- **Vertex AI model routing**: Fixed `build_request_params/3` not including the `"model"` key in the params map, causing `chat/2` and `chat_stream/2` to always fall back to `"gemini-2.0-flash"` regardless of the requested model.
- **Vertex AI 404 on preview models**: Use `v1beta1` API version for preview and experimental models (e.g., `gemini-3.1-pro-preview`). The `v1` endpoint returns 404 for these models.

### Added

- `Nous.Providers.VertexAI.api_version_for_model/1` — returns `"v1beta1"` for preview/experimental models, `"v1"` for stable models.
- `Nous.Providers.VertexAI.endpoint/3` now accepts an optional model name to select the correct API version.
- Debug logging for Vertex AI request URLs.

## [0.12.6] - 2026-03-07

### Added

@@ -12,7 +37,7 @@ All notable changes to this project will be documented in this file.
- New config options: `:auto_update_memory`, `:auto_update_every`, `:reflection_model`, `:reflection_max_tokens`, `:reflection_max_messages`, `:reflection_max_memories`
- New example: `examples/memory/auto_update.exs`

## [0.13.1] - 2026-03-06
## [0.12.5] - 2026-03-06

### Added

116 changes: 92 additions & 24 deletions README.md
@@ -93,7 +93,7 @@ IO.puts("Tokens: #{result.usage.total_tokens}")
| OpenAI | `openai:gpt-4` | ✅ |
| Anthropic | `anthropic:claude-sonnet-4-5-20250929` | ✅ |
| Google Gemini | `gemini:gemini-2.0-flash` | ✅ |
| Google Vertex AI | `vertex_ai:gemini-2.0-flash` | ✅ |
| Google Vertex AI | `vertex_ai:gemini-3.1-pro-preview` | ✅ |
| Groq | `groq:llama-3.1-70b-versatile` | ✅ |
| Ollama | `ollama:llama2` | ✅ |
| OpenRouter | `openrouter:anthropic/claude-3.5-sonnet` | ✅ |
@@ -108,15 +108,46 @@ All HTTP providers use pure Elixir HTTP clients (Req + Finch). LlamaCpp runs in-
agent = Nous.new("lmstudio:qwen3") # Local (free)
agent = Nous.new("openai:gpt-4") # OpenAI
agent = Nous.new("anthropic:claude-sonnet-4-5-20250929") # Anthropic
agent = Nous.new("vertex_ai:gemini-2.0-flash") # Google Vertex AI
agent = Nous.new("vertex_ai:gemini-3.1-pro-preview") # Google Vertex AI
agent = Nous.new("llamacpp:local", llamacpp_model: llm) # Local NIF
```

### Google Vertex AI Setup

Vertex AI provides enterprise access to Gemini models. To use it with a service account:
Vertex AI provides enterprise access to Gemini models via Google Cloud. It supports
VPC-SC, CMEK, IAM, regional/global endpoints, and all the latest Gemini models.

**1. Create a service account:**
#### Supported Models

| Model | Model ID | Endpoint | API Version |
|-------|----------|----------|-------------|
| Gemini 3.1 Pro (preview) | `gemini-3.1-pro-preview` | global only | v1beta1 |
| Gemini 3 Flash (preview) | `gemini-3-flash-preview` | global only | v1beta1 |
| Gemini 3.1 Flash-Lite (preview) | `gemini-3.1-flash-lite-preview` | global only | v1beta1 |
| Gemini 2.5 Pro | `gemini-2.5-pro` | regional + global | v1 |
| Gemini 2.5 Flash | `gemini-2.5-flash` | regional + global | v1 |
| Gemini 2.0 Flash | `gemini-2.0-flash` | regional + global | v1 |

> **Note:** Preview and experimental models automatically use the `v1beta1` API version.
> The Gemini 3.x preview models are **global endpoint only** — set `GOOGLE_CLOUD_LOCATION=global`.

#### Regional vs Global Endpoints

Vertex AI offers two endpoint types:

- **Regional** (e.g., `us-central1`, `europe-west1`): Low-latency, data residency guarantees
```
https://us-central1-aiplatform.googleapis.com/v1/projects/{project}/locations/us-central1
```
- **Global**: Higher availability, required for Gemini 3.x preview models
```
https://aiplatform.googleapis.com/v1beta1/projects/{project}/locations/global
```

The provider automatically selects the correct hostname and API version based on the
region and model name. Set `GOOGLE_CLOUD_LOCATION=global` for Gemini 3.x preview models.
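That selection can be sketched as follows. The `EndpointSketch` module below is illustrative only — the library's actual entry points are `Nous.Providers.VertexAI.endpoint/3` and `api_version_for_model/1`, and the preview/experimental substring check shown here is an assumption about the rule, not its exact implementation:

```elixir
defmodule EndpointSketch do
  # Preview/experimental models need v1beta1; stable models use v1.
  # The "preview" / "-exp" substring check is an assumption, not the library's exact rule.
  def api_version(model) do
    if String.contains?(model, ["preview", "-exp"]), do: "v1beta1", else: "v1"
  end

  # Global endpoint: no region prefix in the hostname.
  def endpoint(project, "global", model) do
    "https://aiplatform.googleapis.com/#{api_version(model)}" <>
      "/projects/#{project}/locations/global"
  end

  # Regional endpoint: region-prefixed hostname.
  def endpoint(project, region, model) do
    "https://#{region}-aiplatform.googleapis.com/#{api_version(model)}" <>
      "/projects/#{project}/locations/#{region}"
  end
end
```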

#### Step 1: Create a Service Account

```bash
export PROJECT_ID="your-project-id"
@@ -129,64 +160,101 @@ gcloud iam service-accounts create nous-vertex-ai \
--display-name="Nous Vertex AI" \
--project=$PROJECT_ID

# Grant permission
# Grant the Vertex AI User role
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:nous-vertex-ai@${PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"

# Download key and store as env var
gcloud iam service-accounts keys create /tmp/sa.json \
# Download the key file
gcloud iam service-accounts keys create /tmp/sa-key.json \
--iam-account="nous-vertex-ai@${PROJECT_ID}.iam.gserviceaccount.com"
```

#### Step 2: Set Environment Variables

```bash
# Load the service account JSON into an env var (recommended — no file path dependency)
export GOOGLE_CREDENTIALS="$(cat /tmp/sa-key.json)"

# Required: your GCP project ID
export GOOGLE_CLOUD_PROJECT="your-project-id"

# Set the env vars
export GOOGLE_CREDENTIALS="$(cat /tmp/sa.json)"
export GOOGLE_CLOUD_PROJECT="$PROJECT_ID"
export GOOGLE_CLOUD_REGION="us-central1"
# Required for Gemini 3.x preview models (global endpoint only)
export GOOGLE_CLOUD_LOCATION="global"

# Or use a regional endpoint for stable models:
# export GOOGLE_CLOUD_LOCATION="us-central1"
# export GOOGLE_CLOUD_LOCATION="europe-west1"
```

**2. Add Goth to your deps** (handles token refresh from the service account):
Both `GOOGLE_CLOUD_REGION` and `GOOGLE_CLOUD_LOCATION` are supported (consistent with
other Google Cloud libraries). `GOOGLE_CLOUD_REGION` takes precedence if both are set.
Defaults to `us-central1` if neither is set.
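The fallback order described above amounts to this (an illustrative sketch, not the library's actual code):

```elixir
defmodule RegionSketch do
  @default "us-central1"

  # GOOGLE_CLOUD_REGION wins, then GOOGLE_CLOUD_LOCATION, then the default.
  def resolve(env \\ System.get_env()) do
    env["GOOGLE_CLOUD_REGION"] || env["GOOGLE_CLOUD_LOCATION"] || @default
  end
end
```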

#### Step 3: Add Goth to Your Application

Goth handles OAuth2 token fetching and auto-refresh from the service account credentials.

```elixir
# mix.exs
{:goth, "~> 1.4"}
```

**3. Start Goth in your supervision tree:**

```elixir
# application.ex — start Goth in your supervision tree
credentials = System.get_env("GOOGLE_CREDENTIALS") |> Jason.decode!()

children = [
{Goth, name: MyApp.Goth, source: {:service_account, credentials}}
]
```

**4. Configure Nous to use Goth:**
#### Step 4: Configure and Use

```elixir
# Option A: Via app config (recommended for production)
# Option A: App config (recommended for production)
# config/config.exs
config :nous, :vertex_ai, goth: MyApp.Goth

# Then just use it — no extra options needed:
agent = Nous.new("vertex_ai:gemini-2.0-flash")
# Then use it — Goth handles token refresh automatically:
agent = Nous.new("vertex_ai:gemini-3.1-pro-preview")
{:ok, result} = Nous.run(agent, "Hello from Vertex AI!")
```

```elixir
# Option B: Per-model (useful for multiple projects/regions)
agent = Nous.new("vertex_ai:gemini-2.0-flash",
# Option B: Per-model Goth (useful for multiple projects)
agent = Nous.new("vertex_ai:gemini-3-flash-preview",
default_settings: %{goth: MyApp.Goth}
)
```

```elixir
# Option C: Direct access token (no Goth needed, e.g. for quick testing)
# export VERTEX_AI_ACCESS_TOKEN="$(gcloud auth print-access-token)"
agent = Nous.new("vertex_ai:gemini-2.0-flash")
```

```elixir
# Option C: Explicit base_url (for custom endpoint or specific region)
alias Nous.Providers.VertexAI

agent = Nous.new("vertex_ai:gemini-3.1-pro-preview",
  base_url: VertexAI.endpoint("my-project", "global", "gemini-3.1-pro-preview"),
  default_settings: %{goth: MyApp.Goth}
)
```

```elixir
# Option D: Quick testing with gcloud CLI (no Goth needed)
# export VERTEX_AI_ACCESS_TOKEN="$(gcloud auth print-access-token)"
agent = Nous.new("vertex_ai:gemini-3.1-pro-preview")
```

See [`examples/providers/vertex_ai_goth_test.exs`](examples/providers/vertex_ai_goth_test.exs) for a runnable example.
#### Input Validation

The provider validates `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` at request time
and returns helpful error messages for invalid values instead of opaque DNS or HTTP errors.
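A sketch of what that validation might look like — the module, function names, region pattern, and error wording below are all hypothetical, chosen only to illustrate the behavior described:

```elixir
defmodule ValidationSketch do
  # A plausible shape for a Vertex AI region like "us-central1" or "europe-west1".
  @region_pattern ~r/^[a-z]+-[a-z]+\d+$/

  def validate_project(nil), do: {:error, "GOOGLE_CLOUD_PROJECT is not set"}
  def validate_project(""), do: {:error, "GOOGLE_CLOUD_PROJECT is empty"}
  def validate_project(project), do: {:ok, project}

  # "global" is always valid; anything else must look like a region,
  # otherwise return a readable error instead of letting DNS resolution fail.
  def validate_location("global"), do: {:ok, "global"}

  def validate_location(region) do
    if Regex.match?(@region_pattern, region) do
      {:ok, region}
    else
      {:error,
       "#{inspect(region)} does not look like a Vertex AI region " <>
         ~s(such as "us-central1", and is not "global")}
    end
  end
end
```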

#### Examples

- [`examples/providers/vertex_ai.exs`](examples/providers/vertex_ai.exs) — Basic usage with access token
- [`examples/providers/vertex_ai_goth_test.exs`](examples/providers/vertex_ai_goth_test.exs) — Service account with Goth
- [`examples/providers/vertex_ai_multi_region.exs`](examples/providers/vertex_ai_multi_region.exs) — Multi-region + v1/v1beta1 demo
- [`examples/providers/vertex_ai_integration_test.exs`](examples/providers/vertex_ai_integration_test.exs) — Full integration test (Flash + Pro, streaming + non-streaming)

## Features

6 changes: 3 additions & 3 deletions examples/providers/vertex_ai_goth_test.exs
@@ -5,7 +5,7 @@
# Prerequisites:
# export GOOGLE_CREDENTIALS='{"type":"service_account","project_id":"...","private_key":"...",...}'
# export GOOGLE_CLOUD_PROJECT="your-project-id"
# export GOOGLE_CLOUD_REGION="europe-west1" # optional, defaults to europe-west1 (Frankfurt)
# export GOOGLE_CLOUD_REGION="us-central1" # optional, defaults to us-central1
#
# Run:
# mix run examples/providers/vertex_ai_goth_test.exs
@@ -25,7 +25,7 @@ end

IO.puts("=== Vertex AI Test with Service Account ===\n")
IO.puts("Project: #{project}")
IO.puts("Region: #{System.get_env("GOOGLE_CLOUD_REGION", "europe-west1")}\n")
IO.puts("Region: #{System.get_env("GOOGLE_CLOUD_REGION", "us-central1")}\n")

# Start Goth with service account credentials from env var
credentials = Jason.decode!(credentials_json)
@@ -38,7 +38,7 @@ IO.puts("Goth started successfully.\n")
IO.puts("--- Test 1: Non-streaming ---")

agent =
Nous.new("vertex_ai:gemini-3.1-pro",
Nous.new("vertex_ai:gemini-2.0-flash",
instructions: "You are a helpful assistant. Be concise.",
default_settings: %{goth: Nous.TestGoth}
)