Merged
29 changes: 27 additions & 2 deletions CHANGELOG.md
@@ -2,7 +2,32 @@

All notable changes to this project will be documented in this file.

## [0.13.2] - 2026-03-07
## [0.12.8] - 2026-03-12

### Fixed

- **Vertex AI v1/v1beta1 bug**: `Model.parse("vertex_ai:gemini-2.5-pro-preview-06-05")` with `GOOGLE_CLOUD_PROJECT` set was storing a hardcoded `v1` URL in `model.base_url`, causing the provider's `v1beta1` selection logic to be bypassed. Preview models now correctly use `v1beta1` at request time.

### Added

- **Vertex AI input validation**: Project ID and region from environment variables are now validated with helpful error messages instead of producing opaque DNS/HTTP errors.
- **`GOOGLE_CLOUD_LOCATION` support**: Added as a fallback for `GOOGLE_CLOUD_REGION`, consistent with other Google Cloud libraries and tooling.
- Multi-region example script: `examples/providers/vertex_ai_multi_region.exs`

## [0.12.7] - 2026-03-10

### Fixed

- **Vertex AI model routing**: Fixed `build_request_params/3` not including the `"model"` key in the params map, causing `chat/2` and `chat_stream/2` to always fall back to `"gemini-2.0-flash"` regardless of the requested model.
- **Vertex AI 404 on preview models**: Use `v1beta1` API version for preview and experimental models (e.g., `gemini-3.1-pro-preview`). The `v1` endpoint returns 404 for these models.

### Added

- `Nous.Providers.VertexAI.api_version_for_model/1` — returns `"v1beta1"` for preview/experimental models, `"v1"` for stable models.
- `Nous.Providers.VertexAI.endpoint/3` now accepts an optional model name to select the correct API version.
- Debug logging for Vertex AI request URLs.

## [0.12.6] - 2026-03-07

### Added

@@ -12,7 +37,7 @@ All notable changes to this project will be documented in this file.
- New config options: `:auto_update_memory`, `:auto_update_every`, `:reflection_model`, `:reflection_max_tokens`, `:reflection_max_messages`, `:reflection_max_memories`
- New example: `examples/memory/auto_update.exs`

## [0.13.1] - 2026-03-06
## [0.12.5] - 2026-03-06

### Added

116 changes: 92 additions & 24 deletions README.md
@@ -93,7 +93,7 @@ IO.puts("Tokens: #{result.usage.total_tokens}")
| OpenAI | `openai:gpt-4` | ✅ |
| Anthropic | `anthropic:claude-sonnet-4-5-20250929` | ✅ |
| Google Gemini | `gemini:gemini-2.0-flash` | ✅ |
| Google Vertex AI | `vertex_ai:gemini-2.0-flash` | ✅ |
| Google Vertex AI | `vertex_ai:gemini-3.1-pro-preview` | ✅ |
| Groq | `groq:llama-3.1-70b-versatile` | ✅ |
| Ollama | `ollama:llama2` | ✅ |
| OpenRouter | `openrouter:anthropic/claude-3.5-sonnet` | ✅ |
@@ -108,15 +108,46 @@ All HTTP providers use pure Elixir HTTP clients (Req + Finch). LlamaCpp runs in-
agent = Nous.new("lmstudio:qwen3") # Local (free)
agent = Nous.new("openai:gpt-4") # OpenAI
agent = Nous.new("anthropic:claude-sonnet-4-5-20250929") # Anthropic
agent = Nous.new("vertex_ai:gemini-2.0-flash") # Google Vertex AI
agent = Nous.new("vertex_ai:gemini-3.1-pro-preview") # Google Vertex AI
agent = Nous.new("llamacpp:local", llamacpp_model: llm) # Local NIF
```

### Google Vertex AI Setup

Vertex AI provides enterprise access to Gemini models. To use it with a service account:
Vertex AI provides enterprise access to Gemini models via Google Cloud. It supports
VPC-SC, CMEK, IAM, regional/global endpoints, and all the latest Gemini models.

**1. Create a service account:**
#### Supported Models

| Model | Model ID | Endpoint | API Version |
|-------|----------|----------|-------------|
| Gemini 3.1 Pro (preview) | `gemini-3.1-pro-preview` | global only | v1beta1 |
| Gemini 3 Flash (preview) | `gemini-3-flash-preview` | global only | v1beta1 |
| Gemini 3.1 Flash-Lite (preview) | `gemini-3.1-flash-lite-preview` | global only | v1beta1 |
| Gemini 2.5 Pro | `gemini-2.5-pro` | regional + global | v1 |
| Gemini 2.5 Flash | `gemini-2.5-flash` | regional + global | v1 |
| Gemini 2.0 Flash | `gemini-2.0-flash` | regional + global | v1 |

> **Note:** Preview and experimental models automatically use the `v1beta1` API version.
> The Gemini 3.x preview models are **global endpoint only** — set `GOOGLE_CLOUD_LOCATION=global`.

#### Regional vs Global Endpoints

Vertex AI offers two endpoint types:

- **Regional** (e.g., `us-central1`, `europe-west1`): Low-latency, data residency guarantees
```
https://us-central1-aiplatform.googleapis.com/v1/projects/{project}/locations/us-central1
```
- **Global**: Higher availability, required for Gemini 3.x preview models
```
https://aiplatform.googleapis.com/v1beta1/projects/{project}/locations/global
```

The provider automatically selects the correct hostname and API version based on the
region and model name. Set `GOOGLE_CLOUD_LOCATION=global` for Gemini 3.x preview models.
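That selection can be sketched as follows. The `EndpointSketch` module below is illustrative only — the library's actual entry points are `Nous.Providers.VertexAI.endpoint/3` and `api_version_for_model/1`, and the preview/experimental substring check shown here is an assumption about the rule, not its exact implementation:

```elixir
defmodule EndpointSketch do
  # Preview/experimental models need v1beta1; stable models use v1.
  # The "preview" / "-exp" substring check is an assumption, not the library's exact rule.
  def api_version(model) do
    if String.contains?(model, ["preview", "-exp"]), do: "v1beta1", else: "v1"
  end

  # Global endpoint: no region prefix in the hostname.
  def endpoint(project, "global", model) do
    "https://aiplatform.googleapis.com/#{api_version(model)}" <>
      "/projects/#{project}/locations/global"
  end

  # Regional endpoint: region-prefixed hostname.
  def endpoint(project, region, model) do
    "https://#{region}-aiplatform.googleapis.com/#{api_version(model)}" <>
      "/projects/#{project}/locations/#{region}"
  end
end
```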

#### Step 1: Create a Service Account

```bash
export PROJECT_ID="your-project-id"
@@ -129,64 +160,101 @@ gcloud iam service-accounts create nous-vertex-ai \
--display-name="Nous Vertex AI" \
--project=$PROJECT_ID

# Grant permission
# Grant the Vertex AI User role
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:nous-vertex-ai@${PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"

# Download key and store as env var
gcloud iam service-accounts keys create /tmp/sa.json \
# Download the key file
gcloud iam service-accounts keys create /tmp/sa-key.json \
--iam-account="nous-vertex-ai@${PROJECT_ID}.iam.gserviceaccount.com"
```

#### Step 2: Set Environment Variables

```bash
# Load the service account JSON into an env var (recommended — no file path dependency)
export GOOGLE_CREDENTIALS="$(cat /tmp/sa-key.json)"

# Required: your GCP project ID
export GOOGLE_CLOUD_PROJECT="your-project-id"

# Set the env vars
export GOOGLE_CREDENTIALS="$(cat /tmp/sa.json)"
export GOOGLE_CLOUD_PROJECT="$PROJECT_ID"
export GOOGLE_CLOUD_REGION="us-central1"
# Required for Gemini 3.x preview models (global endpoint only)
export GOOGLE_CLOUD_LOCATION="global"

# Or use a regional endpoint for stable models:
# export GOOGLE_CLOUD_LOCATION="us-central1"
# export GOOGLE_CLOUD_LOCATION="europe-west1"
```

**2. Add Goth to your deps** (handles token refresh from the service account):
Both `GOOGLE_CLOUD_REGION` and `GOOGLE_CLOUD_LOCATION` are supported (consistent with
other Google Cloud libraries). `GOOGLE_CLOUD_REGION` takes precedence if both are set.
Defaults to `us-central1` if neither is set.
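The fallback order described above amounts to this (an illustrative sketch, not the library's actual code):

```elixir
defmodule RegionSketch do
  @default "us-central1"

  # GOOGLE_CLOUD_REGION wins, then GOOGLE_CLOUD_LOCATION, then the default.
  def resolve(env \\ System.get_env()) do
    env["GOOGLE_CLOUD_REGION"] || env["GOOGLE_CLOUD_LOCATION"] || @default
  end
end
```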

#### Step 3: Add Goth to Your Application

Goth handles OAuth2 token fetching and auto-refresh from the service account credentials.

```elixir
# mix.exs
{:goth, "~> 1.4"}
```

**3. Start Goth in your supervision tree:**

```elixir
# application.ex — start Goth in your supervision tree
credentials = System.get_env("GOOGLE_CREDENTIALS") |> Jason.decode!()

children = [
{Goth, name: MyApp.Goth, source: {:service_account, credentials}}
]
```

**4. Configure Nous to use Goth:**
#### Step 4: Configure and Use

```elixir
# Option A: Via app config (recommended for production)
# Option A: App config (recommended for production)
# config/config.exs
config :nous, :vertex_ai, goth: MyApp.Goth

# Then just use it — no extra options needed:
agent = Nous.new("vertex_ai:gemini-2.0-flash")
# Then use it — Goth handles token refresh automatically:
agent = Nous.new("vertex_ai:gemini-3.1-pro-preview")
{:ok, result} = Nous.run(agent, "Hello from Vertex AI!")
```

```elixir
# Option B: Per-model (useful for multiple projects/regions)
agent = Nous.new("vertex_ai:gemini-2.0-flash",
# Option B: Per-model Goth (useful for multiple projects)
agent = Nous.new("vertex_ai:gemini-3-flash-preview",
default_settings: %{goth: MyApp.Goth}
)
```

```elixir
# Option C: Direct access token (no Goth needed, e.g. for quick testing)
# export VERTEX_AI_ACCESS_TOKEN="$(gcloud auth print-access-token)"
agent = Nous.new("vertex_ai:gemini-2.0-flash")
```

```elixir
# Option C: Explicit base_url (for custom endpoint or specific region)
alias Nous.Providers.VertexAI

agent = Nous.new("vertex_ai:gemini-3.1-pro-preview",
  base_url: VertexAI.endpoint("my-project", "global", "gemini-3.1-pro-preview"),
  default_settings: %{goth: MyApp.Goth}
)
```

```elixir
# Option D: Quick testing with gcloud CLI (no Goth needed)
# export VERTEX_AI_ACCESS_TOKEN="$(gcloud auth print-access-token)"
agent = Nous.new("vertex_ai:gemini-3.1-pro-preview")
```

See [`examples/providers/vertex_ai_goth_test.exs`](examples/providers/vertex_ai_goth_test.exs) for a runnable example.
#### Input Validation

The provider validates `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` at request time
and returns helpful error messages for invalid values instead of opaque DNS or HTTP errors.
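A sketch of what that validation might look like — the module, function names, region pattern, and error wording below are all hypothetical, chosen only to illustrate the behavior described:

```elixir
defmodule ValidationSketch do
  # A plausible shape for a Vertex AI region like "us-central1" or "europe-west1".
  @region_pattern ~r/^[a-z]+-[a-z]+\d+$/

  def validate_project(nil), do: {:error, "GOOGLE_CLOUD_PROJECT is not set"}
  def validate_project(""), do: {:error, "GOOGLE_CLOUD_PROJECT is empty"}
  def validate_project(project), do: {:ok, project}

  # "global" is always valid; anything else must look like a region,
  # otherwise return a readable error instead of letting DNS resolution fail.
  def validate_location("global"), do: {:ok, "global"}

  def validate_location(region) do
    if Regex.match?(@region_pattern, region) do
      {:ok, region}
    else
      {:error,
       "#{inspect(region)} does not look like a Vertex AI region " <>
         ~s(such as "us-central1", and is not "global")}
    end
  end
end
```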

#### Examples

- [`examples/providers/vertex_ai.exs`](examples/providers/vertex_ai.exs) — Basic usage with access token
- [`examples/providers/vertex_ai_goth_test.exs`](examples/providers/vertex_ai_goth_test.exs) — Service account with Goth
- [`examples/providers/vertex_ai_multi_region.exs`](examples/providers/vertex_ai_multi_region.exs) — Multi-region + v1/v1beta1 demo
- [`examples/providers/vertex_ai_integration_test.exs`](examples/providers/vertex_ai_integration_test.exs) — Full integration test (Flash + Pro, streaming + non-streaming)

## Features

6 changes: 3 additions & 3 deletions examples/providers/vertex_ai_goth_test.exs
@@ -5,7 +5,7 @@
# Prerequisites:
# export GOOGLE_CREDENTIALS='{"type":"service_account","project_id":"...","private_key":"...",...}'
# export GOOGLE_CLOUD_PROJECT="your-project-id"
# export GOOGLE_CLOUD_REGION="europe-west1" # optional, defaults to europe-west1 (Frankfurt)
# export GOOGLE_CLOUD_REGION="us-central1" # optional, defaults to us-central1
#
# Run:
# mix run examples/providers/vertex_ai_goth_test.exs
@@ -25,7 +25,7 @@ end

IO.puts("=== Vertex AI Test with Service Account ===\n")
IO.puts("Project: #{project}")
IO.puts("Region: #{System.get_env("GOOGLE_CLOUD_REGION", "europe-west1")}\n")
IO.puts("Region: #{System.get_env("GOOGLE_CLOUD_REGION", "us-central1")}\n")

# Start Goth with service account credentials from env var
credentials = Jason.decode!(credentials_json)
@@ -38,7 +38,7 @@ IO.puts("Goth started successfully.\n")
IO.puts("--- Test 1: Non-streaming ---")

agent =
Nous.new("vertex_ai:gemini-3.1-pro",
Nous.new("vertex_ai:gemini-2.0-flash",
instructions: "You are a helpful assistant. Be concise.",
default_settings: %{goth: Nous.TestGoth}
)