123 changes: 123 additions & 0 deletions FirebaseAI/Documentation/Hybrid/configuration-options.md
@@ -0,0 +1,123 @@
> [!WARNING]
> **Experimental:** Using the Firebase AI SDK to build hybrid experiences on Apple platforms is an Experimental feature, which means that this feature isn't subject to any SLA or deprecation policy and could change in backwards-incompatible ways.

<br />

This page describes the following configuration options for hybrid experiences on Apple platforms:

- [Set an inference mode (model provider preference).](#set-an-inference-mode)

- [Determine whether on-device or in-cloud inference was used.](#determine-whether-on-device-or-in-cloud-inference-was-used)

- [Specify a model to use.](#specify-a-model-to-use)

- [Use model configuration to control responses (like temperature).](#use-model-configuration-to-control-responses)

**Make sure that you've completed the
[getting started guide for building hybrid experiences](get-started.md).**

## Set an inference mode

On Apple platforms, the hybrid inference behavior is controlled by how you configure the `LanguageModelProvider` when you initialize your `GenerativeModelSession`. Instead of relying on a dedicated inference mode enum, you instantiate the models and combine them using `.hybridModel(primary:secondary:)` to establish fallback priorities.

Here are the equivalent patterns for the available inference behaviors:

- **Prefer On-Device** : Attempt to use the on-device model if it's available. Otherwise, automatically *fall back to the cloud-hosted model*.

```swift
let systemModel = FirebaseAI.SystemLanguageModel.default
let geminiModel = firebaseAI.geminiModel(name: "gemini-2.5-flash-lite")
let session = firebaseAI.generativeModelSession(
  model: .hybridModel(primary: systemModel, secondary: geminiModel)
)
```

- **Only On-Device** : Attempt to use the on-device model if it's available. Otherwise, *throw an error*.

```swift
let systemModel = FirebaseAI.SystemLanguageModel.default
let session = firebaseAI.generativeModelSession(model: systemModel)
```

- **Prefer In-Cloud** : Attempt to use the cloud-hosted model. If it fails (e.g. due to lack of network connection), *fall back to the on-device model*.

```swift
let systemModel = FirebaseAI.SystemLanguageModel.default
let geminiModel = firebaseAI.geminiModel(name: "gemini-2.5-flash-lite")
let session = firebaseAI.generativeModelSession(
  model: .hybridModel(primary: geminiModel, secondary: systemModel)
)
```

- **Only In-Cloud** : Attempt to use the cloud-hosted model. If the request fails, *throw an error*.

```swift
let geminiModel = firebaseAI.geminiModel(name: "gemini-2.5-flash-lite")
let session = firebaseAI.generativeModelSession(model: geminiModel)
```
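
The primary/secondary ordering in the patterns above can be pictured with a small, SDK-independent sketch. Everything in it (`TextModel`, `StubModel`, `FallbackModel`, `ModelUnavailable`) is a hypothetical stand-in written for illustration, not a Firebase AI SDK type, and it assumes that any error from the primary triggers the fallback, which the SDK may or may not do:

```swift
// Hypothetical stand-ins for illustration only; these are NOT Firebase AI SDK types.
protocol TextModel {
  var name: String { get }
  func respond(to prompt: String) throws -> String
}

struct ModelUnavailable: Error {}

// A fake model that either answers or reports that it is unavailable.
struct StubModel: TextModel {
  let name: String
  let isAvailable: Bool

  func respond(to prompt: String) throws -> String {
    guard isAvailable else { throw ModelUnavailable() }
    return "[\(name)] reply to: \(prompt)"
  }
}

// Tries the primary model first and falls back to the secondary if the primary throws.
struct FallbackModel: TextModel {
  let primary: any TextModel
  let secondary: any TextModel

  var name: String { "\(primary.name) -> \(secondary.name)" }

  func respond(to prompt: String) throws -> String {
    do {
      return try primary.respond(to: prompt)
    } catch {
      // If the secondary also throws, that error propagates to the caller.
      return try secondary.respond(to: prompt)
    }
  }
}

let onDevice = StubModel(name: "on-device", isAvailable: false)
let cloud = StubModel(name: "cloud", isAvailable: true)

// "Prefer On-Device": the on-device stub is unavailable, so the cloud stub answers.
let preferOnDevice = FallbackModel(primary: onDevice, secondary: cloud)
let reply = (try? preferOnDevice.respond(to: "Hello")) ?? "no model available"
print(reply)
```

Swapping the `primary` and `secondary` arguments gives the "Prefer In-Cloud" ordering, and using a single stub directly corresponds to the "Only" variants, where a failure surfaces as a thrown error.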

## Determine whether on-device or in-cloud inference was used

If your hybrid strategy relies on a fallback (like Prefer On-Device or Prefer In-Cloud), it might
be helpful to know which model ultimately served the request. This information is
provided by the `rawResponse.modelVersion` property of the response.

You can inspect the `modelVersion` and match it to your model instances to verify the source:

```swift
let response = try await session.respond(to: prompt)

if response.rawResponse.modelVersion == systemModel._modelName {
  print("Inference was executed on-device.")
} else {
  print("Inference was executed in the cloud via Gemini.")
}

print(response.content)
```
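
If you need this check in several places, the comparison can be wrapped in a small helper. The sketch below is self-contained and uses plain strings; `InferenceSource` and `inferenceSource(modelVersion:onDeviceModelName:)` are hypothetical names, the model name strings are placeholders, and treating a missing version as `unknown` is an assumption rather than documented SDK behavior:

```swift
// Hypothetical helper for illustration; these names are not part of the SDK.
enum InferenceSource: String {
  case onDevice = "on-device"
  case cloud = "cloud"
  case unknown = "unknown"
}

/// Compares the model version reported by a response with the on-device model's name.
func inferenceSource(modelVersion: String?, onDeviceModelName: String) -> InferenceSource {
  guard let modelVersion else { return .unknown }
  return modelVersion == onDeviceModelName ? .onDevice : .cloud
}

// Placeholder strings stand in for the values you would read from the response and model.
let source = inferenceSource(
  modelVersion: "gemini-2.5-flash-lite",
  onDeviceModelName: "system-model-placeholder"
)
print("Inference source: \(source.rawValue)")
```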

## Specify a model to use

You can specify a model to use when you declare your `SystemLanguageModel` (for on-device) or `GeminiModel` (for cloud) instances.

- **Specify a cloud-hosted model**:
  - Provide the model name string to `firebaseAI.geminiModel(name:)`.
  - Find model names for all [supported cloud-hosted Gemini models](https://firebase.google.com/docs/ai-logic/models).

- **Specify an on-device model**:
  - The on-device `SystemLanguageModel` is automatically selected and managed by Apple's Foundation Models framework.
  - You can influence the type of tasks it excels at by specifying a `UseCase` and safety `Guardrails` during initialization:

```swift
// Example of a specialized on-device model configuration
let customSystemModel = FirebaseAI.SystemLanguageModel(
  useCase: .general,
  guardrails: .default
)
```

## Use model configuration to control responses

In each request to a model, you can send along model configurations to control
how the model generates a response. Cloud-hosted models and on-device models
offer different configuration options.

When making a request using `.respond(to:options:)`, you can specify options using the `ResponseGenerationOptions.hybrid(gemini:foundationModels:)` factory, which lets you pass independent options for Gemini and Foundation Models in the same call:

```swift
import FoundationModels

// Options for the cloud-hosted Gemini model
let geminiConfig = GenerationConfig(temperature: 0.8, topK: 10)

// Options for the on-device Apple Foundation model
let systemOptions = FirebaseAI.GenerationOptions(sampling: .greedy, temperature: 0.8)

let response = try await session.respond(
  to: prompt,
  options: .hybrid(
    gemini: geminiConfig,
    foundationModels: systemOptions
  )
)
```
124 changes: 124 additions & 0 deletions FirebaseAI/Documentation/Hybrid/function-calling.md
@@ -0,0 +1,124 @@
> [!WARNING]
> **Experimental:** Using the Firebase AI SDK to build hybrid experiences on Apple platforms is an Experimental feature, which means that this feature isn't subject to any SLA or deprecation policy and could change in backwards-incompatible ways.

<br />

Generative models are powerful at solving many types of problems. However, they
are constrained by limitations like:

- They are frozen after training, leading to stale knowledge.
- They can't query or modify external data.

Function calling can help you overcome some of these limitations.
Function calling is sometimes referred to as *tool use* because it allows a
model to use external tools such as APIs and functions to generate its final
response.

This page describes how to implement automatic function calling in your hybrid experiences for Apple apps using `FoundationModels.Tool`.

## Before you begin

Make sure that you've completed the
[getting started guide for building hybrid experiences](get-started.md).

## Automatic Function Calling via `FoundationModels.Tool`

On Apple platforms running iOS 26+, macOS 26+, or visionOS 26+, you can define tools using the `FoundationModels.Tool` protocol. The Firebase AI SDK will automatically manage the back-and-forth communication required to invoke these tools and pass the results back to the model, streamlining the function calling process.

### Step 1: Define the Tool

First, define your tool by conforming to the `FoundationModels.Tool` protocol. You can use the `@Generable` macro to easily define the expected arguments and return types.

```swift
import FirebaseAI
import FirebaseAILogic
import FoundationModels

@available(iOS 26.0, macOS 26.0, visionOS 26.0, *)
struct GetTemperatureTool: FoundationModels.Tool {
  let description = "Returns the current temperature for the specified location."

  @Generable
  struct Location {
    let city: String
    @Guide(description: "The province or state.")
    let region: String
    let country: String
  }

  @Generable
  struct Temperature {
    @Generable enum Units { case celsius, fahrenheit, kelvin }

    let temperature: Double
    let units: Units
  }

  // The call method is automatically invoked by the SDK when the model requests it
  func call(arguments: Location) async throws -> Temperature {
    // TODO(developer): Make a network request to an actual weather API here

    // For demo purposes, we return a hardcoded temperature
    return Temperature(temperature: 25.0, units: .celsius)
  }
}
```

### Step 2: Provide the Tool to the Model Session

When initializing your `GenerativeModelSession`, pass an instance of your tool in the `tools` array. The tool can be used by both the on-device `SystemLanguageModel` and cloud-hosted `GeminiModel`.

```swift
// Defining and using a `FoundationModels.Tool` requires iOS 26+ / macOS 26+ / visionOS 26+
if #available(iOS 26.0, macOS 26.0, visionOS 26.0, *) {
  let systemModel = FirebaseAI.SystemLanguageModel.default
  let geminiModel = firebaseAI.geminiModel(name: "gemini-2.5-flash-lite")

  // Initialize the tool
  let temperatureTool = GetTemperatureTool()

  // Set the hybrid fallback preference and provide the tools
  let session = firebaseAI.generativeModelSession(
    model: .hybridModel(primary: systemModel, secondary: geminiModel),
    tools: [temperatureTool],
    instructions: """
    You are a weather bot that specializes in reporting outdoor temperatures in Celsius.
    Always use the `GetTemperatureTool` function to determine the current temperature in a location.
    """
  )
}
```

### Step 3: Send a Prompt

Now you can send a standard text prompt to the session. If the model determines it needs to call your tool, it will automatically pause generation, execute your `call(arguments:)` method, incorporate the result into its context, and resume generating the final response.

```swift
let prompt = "What is the current temperature in Waterloo, Ontario, Canada?"

// The SDK automatically handles calling the tool and returning the final natural language response
let response = try await session.respond(to: prompt)

print(response.content)
// Output example:
// The current temperature in Waterloo, Ontario, Canada is 25°C.
```

## Additional features and considerations

### Parallel Tool Calling
The model can call a tool multiple times in parallel to satisfy the request, such as retrieving weather details for several cities simultaneously. Ensure that your tool implementation is thread-safe and can handle concurrent `call(arguments:)` executions.

```swift
let response = try await session.respond(
  to: "Is it hotter in Boston, Wichita, or Pittsburgh?"
)
```
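
If your tool keeps shared mutable state, such as a cache, that state needs protection when calls overlap. The following is a minimal sketch of one option, a lock-guarded class; `TemperatureCache` is a hypothetical type, and `DispatchQueue.concurrentPerform` only simulates overlapping calls (the SDK's actual scheduling of tool calls is not shown here):

```swift
import Foundation

// Hypothetical cache shared by concurrent tool calls; not part of any SDK.
final class TemperatureCache: @unchecked Sendable {
  private let lock = NSLock()
  private var readings: [String: Double] = [:]

  // Stores a reading while holding the lock so concurrent writers don't race.
  func store(_ value: Double, for city: String) {
    lock.lock()
    defer { lock.unlock() }
    readings[city] = value
  }

  // Returns a copy of the current readings, also taken under the lock.
  func snapshot() -> [String: Double] {
    lock.lock()
    defer { lock.unlock() }
    return readings
  }
}

let cache = TemperatureCache()
let cities = ["Boston", "Wichita", "Pittsburgh"]

// Simulate parallel tool invocations writing to the shared cache.
DispatchQueue.concurrentPerform(iterations: cities.count) { index in
  cache.store(20.0 + Double(index), for: cities[index])
}

print(cache.snapshot().count)
```

A Swift `actor` is an equally reasonable choice when the surrounding code is already `async`; a tool that holds no mutable state needs neither.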

### Error Handling
You can throw errors from your tools to escape calls when you detect something is wrong, such as when the person using your app hasn't provided permission for data access or a network call times out. This will abort the ongoing generation request.

Alternatively, if you want the model to recover from the failure, your tool can return a string (if your return type allows it) that briefly tells the model what didn't work. For example, returning `"Cannot access the weather database at this time."` allows the model to respond to the user with that context rather than throwing a fatal error.
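
The two strategies can be combined in one tool body. This sketch uses plain functions rather than a `FoundationModels.Tool` conformance so that it runs on its own; `WeatherError`, `lookUpTemperature(city:)`, and `temperatureToolOutput(city:)` are hypothetical names, and the failing city names are made up to trigger each path:

```swift
// Hypothetical error and lookup function for illustration; not SDK APIs.
enum WeatherError: Error {
  case permissionDenied
  case databaseUnavailable
}

// Stand-in for a real weather lookup that can fail in two different ways.
func lookUpTemperature(city: String) throws -> Double {
  switch city {
  case "Atlantis": throw WeatherError.databaseUnavailable
  case "Private": throw WeatherError.permissionDenied
  default: return 25.0
  }
}

/// Tool body with a `String` output: a recoverable failure becomes a short message
/// for the model, while any other error is rethrown to the caller.
func temperatureToolOutput(city: String) throws -> String {
  do {
    let celsius = try lookUpTemperature(city: city)
    return "The temperature in \(city) is \(celsius) degrees Celsius."
  } catch WeatherError.databaseUnavailable {
    return "Cannot access the weather database at this time."
  }
}

print((try? temperatureToolOutput(city: "Atlantis")) ?? "tool threw an error")
```

Here a database outage is reported back as text, while a permission failure still propagates as a thrown error.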

### Best Practices for Descriptions
Similar to guided generation with `@Generable`, when you provide descriptions to properties, you help the model understand the semantics of the arguments. Keep descriptions as short as possible because long descriptions take up context size and can introduce latency.
96 changes: 96 additions & 0 deletions FirebaseAI/Documentation/Hybrid/generate-structured-output.md
@@ -0,0 +1,96 @@
> [!WARNING]
> **Experimental:** Using the Firebase AI SDK to build hybrid experiences on Apple platforms is an Experimental feature, which means that this feature isn't subject to any SLA or deprecation policy and could change in backwards-incompatible ways.

<br />

Gemini and Foundation models return responses as unstructured text by default.
However, some use cases require structured output. For example, you
might be using the response for other downstream tasks that require an
established data schema.

To ensure that the model's generated output always adheres to a specific schema,
you can define a data structure. You can then directly extract data from the model's output as Swift types with less post-processing.

This page describes how to generate structured output (like custom objects)
in your hybrid experiences for Apple apps.

## Before you begin

Make sure that you've completed the
[getting started guide for building hybrid experiences](get-started.md).

## Generate Structured Output via `@Generable`

Generating structured output is supported for
inference using both cloud-hosted and on-device models via the `@Generable` macro in Swift.

The `GenerativeModelSession` lets you request structured decoding by passing `generating: MyType.self`. You can also use basic Swift types directly (e.g., `generating: Float.self` or `generating: [String].self`) if you just need a primitive value.

Here is an example for extracting a user profile. Notice the use of `@Guide` to control values and provide semantics. **Best Practice:** Keep descriptions as short as possible. Long descriptions take up additional context size and can introduce latency.

```swift
import FirebaseAI
import FirebaseAILogic

@Generable
struct UserProfile {
  @Guide(description: "A unique username for the user.")
  var username: String

  // Use constraints like `.range` to enforce limits on values
  @Guide(description: "The user's age.", .range(1...120))
  var age: Int

  @Guide(description: "A short bio about the user, no more than 100 characters.")
  var bio: String

  // Use constraints like `.count` to enforce array sizes
  @Guide(description: "A list of the user's favorite topics.", .count(3))
  var favoriteTopics: [String]
}

// Ensure you have established a `generativeModelSession`
let session = firebaseAI.generativeModelSession(
  model: .hybridModel(primary: systemModel, secondary: geminiModel)
)

let prompt = "Generate a user profile for a cat lover who enjoys hiking."

// Provide `generating: UserProfile.self` to map output to your Swift type
let response = try await session.respond(to: prompt, generating: UserProfile.self)

print("Username: \(response.content.username)")
print("Bio: \(response.content.bio)")
print("Favorite Topics: \(response.content.favoriteTopics.joined(separator: ", "))")
```

> [!NOTE]
> Properties are generated by the model in the order they are declared. You can also nest custom `@Generable` types inside other `@Generable` types, and mark enumerations with associated values as `@Generable`.

The SDK maps your `@Generable` types to JSON schemas when querying Gemini, and applies the equivalent constraints when communicating with the on-device `SystemLanguageModel`.
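
To make schema-constrained output concrete, the sketch below does by hand what structured output saves you from doing: it decodes a JSON payload with `Codable` and checks limits like the `.range` and `.count` guides in the example above. It does not show how the SDK works internally; `DecodedProfile`, `isValid(_:)`, and the JSON payload are hypothetical and exist only for this illustration:

```swift
import Foundation

// Hypothetical hand-written counterpart of a structured response; not an SDK type.
struct DecodedProfile: Codable, Equatable {
  let username: String
  let age: Int
  let favoriteTopics: [String]
}

/// Checks the age range (1...120) and the exact topic count (3) manually.
func isValid(_ profile: DecodedProfile) -> Bool {
  (1...120).contains(profile.age) && profile.favoriteTopics.count == 3
}

// A made-up payload standing in for a model's JSON output.
let json = """
{"username": "trail_cat", "age": 34, "favoriteTopics": ["cats", "hiking", "maps"]}
"""

let profile = try? JSONDecoder().decode(DecodedProfile.self, from: Data(json.utf8))
print(profile.map(isValid) ?? false)
```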

## Stream Structured Output

When working with large or complex structured responses, you may want to update your UI as the data is being generated rather than waiting for the entire response to complete.

You can achieve this by using the `streamResponse(to:generating:)` method. The `@Generable` macro automatically synthesizes a nested `PartiallyGenerated` type for your struct, where all properties become optional. As the `GenerativeModelSession` processes the stream, it yields snapshots containing these partial results.

```swift
// ... continuing from the previous example ...

let stream = session.streamResponse(to: prompt, generating: UserProfile.self)

for try await snapshot in stream {
  // `snapshot.content` is of type `UserProfile.PartiallyGenerated`
  let partialProfile = snapshot.content

  // Properties might be nil if they haven't been generated yet
  if let partialUsername = partialProfile.username {
    print("Username generated so far: \(partialUsername)")
  }
}

// Once the stream completes, you can optionally collect the final decoded object
let finalResponse = try await stream.collect()
let completeProfile = finalResponse.content
```
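
When driving a UI from partial snapshots, it helps to render only the fields that have arrived. The following is a minimal sketch of that idea; `PartialProfile` is a hand-written, hypothetical stand-in for the synthesized `UserProfile.PartiallyGenerated` type (whose exact property types may differ), and the snapshots are hardcoded instead of coming from a stream:

```swift
// Hypothetical stand-in for a partially generated value; the real
// `UserProfile.PartiallyGenerated` type is synthesized by `@Generable`.
struct PartialProfile {
  var username: String?
  var bio: String?
  var favoriteTopics: [String]?
}

/// Builds display lines from whichever fields have arrived so far.
func displayLines(for partial: PartialProfile) -> [String] {
  var lines: [String] = []
  if let username = partial.username { lines.append("Username: \(username)") }
  if let bio = partial.bio { lines.append("Bio: \(bio)") }
  if let topics = partial.favoriteTopics, !topics.isEmpty {
    lines.append("Topics: \(topics.joined(separator: ", "))")
  }
  return lines
}

// Hardcoded snapshots that mimic a response filling in over time.
let snapshots = [
  PartialProfile(username: "trail_cat"),
  PartialProfile(username: "trail_cat", bio: "Hiker with two cats."),
]

for snapshot in snapshots {
  print(displayLines(for: snapshot))
}
```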