firebase · andrewheard · May 14, 2026 · May 14, 2026 · May 14, 2026 · May 14, 2026
@@ -0,0 +1,123 @@
+> [!WARNING]
+> **Experimental:** Using the Firebase AI SDK to build hybrid experiences on Apple platforms is an Experimental feature, which means that this feature isn't subject to any SLA or deprecation policy and could change in backwards-incompatible ways.
+
+<br />
+
+This page describes the following configuration options for hybrid experiences on Apple platforms:
+
+- [Set an inference mode (Model Provider preference).](#inference-modes)
+
+- [Determine whether on-device or in-cloud inference was used.](#determine-inference-mode)
+
+- [Specify a model to use.](#specify-model)
+
+- [Use model configuration to control responses (like temperature).](#model-config)
+
+**Make sure that you've completed the
+[getting started guide for building hybrid experiences](get-started.md).**
+
+## Set an inference mode
+
+On Apple platforms, the hybrid inference behavior is controlled by how you configure the `LanguageModelProvider` when you initialize your `GenerativeModelSession`. Instead of relying on a dedicated inference mode enum, you instantiate the models and combine them using `.hybridModel(primary:secondary:)` to establish fallback priorities.
+
+Here are the equivalent patterns for the available inference behaviors:
+
+- **Prefer On-Device**: Attempt to use the on-device model if it's available. Otherwise, automatically *fall back to the cloud-hosted model*.
+
+```swift
+let systemModel = FirebaseAI.SystemLanguageModel.default
+let geminiModel = firebaseAI.geminiModel(name: "gemini-2.5-flash-lite")
+let session = firebaseAI.generativeModelSession(
+    model: .hybridModel(primary: systemModel, secondary: geminiModel)
+)
+```
+
+- **Only On-Device**: Attempt to use the on-device model if it's available. Otherwise, *throw an error*.
+
+```swift
+let systemModel = FirebaseAI.SystemLanguageModel.default
+let session = firebaseAI.generativeModelSession(model: systemModel)
+```
+
+- **Prefer In-Cloud**: Attempt to use the cloud-hosted model. If it fails (e.g. due to lack of network connection), *fall back to the on-device model*.
+
+```swift
+let systemModel = FirebaseAI.SystemLanguageModel.default
+let geminiModel = firebaseAI.geminiModel(name: "gemini-2.5-flash-lite")
+let session = firebaseAI.generativeModelSession(
+    model: .hybridModel(primary: geminiModel, secondary: systemModel)
+)
+```
+
+- **Only In-Cloud**: Attempt to use the cloud-hosted model. If it fails, *throw an error*.
+
+```swift
+let geminiModel = firebaseAI.geminiModel(name: "gemini-2.5-flash-lite")
+let session = firebaseAI.generativeModelSession(model: geminiModel)
+```
+
+## Determine whether on-device or in-cloud inference was used
+
+If your hybrid strategy relies on a fallback (like Prefer On-Device or Prefer In-Cloud), it might
+be helpful to know which model ultimately served the request. This information is
+provided by the `rawResponse.modelVersion` property of the response.
+
+You can inspect the `modelVersion` and match it to your model instances to verify the source:
+
+```swift
+let response = try await session.respond(to: prompt)
+
+if response.rawResponse.modelVersion == systemModel._modelName {
+    print("Inference was executed on-device.")
+} else {
+    print("Inference was executed in the cloud via Gemini.")
+}
+
+print(response.content)
+```
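+
+If you need this check in more than one place, the comparison can be wrapped in a small helper. The following is a minimal sketch in plain Swift: `InferenceSource` and `inferenceSource(modelVersion:onDeviceModelName:)` are hypothetical names that are not part of the Firebase AI SDK, and the helper accepts an optional string only as a defensive assumption about how `modelVersion` is exposed.
+
+```swift
+// Hypothetical helper types; not part of the Firebase AI SDK.
+enum InferenceSource: Equatable {
+    case onDevice
+    case cloud(modelVersion: String)
+    case unknown
+}
+
+// Classifies a response by comparing its reported model version
+// with the name of the on-device model instance.
+func inferenceSource(modelVersion: String?, onDeviceModelName: String) -> InferenceSource {
+    // A missing or empty version string can't be attributed to either model
+    guard let modelVersion, !modelVersion.isEmpty else { return .unknown }
+    return modelVersion == onDeviceModelName
+        ? .onDevice
+        : .cloud(modelVersion: modelVersion)
+}
+```
+
+With the session from the example above, you would pass `response.rawResponse.modelVersion` and the on-device model's name, then `switch` over the result to drive logging or analytics.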
+
+## Specify a model to use
+
+You can specify a model to use when you declare your `SystemLanguageModel` (for on-device) or `GeminiModel` (for cloud) instances.
+
+- **Specify a cloud-hosted model**:
+  - Provide the model name string to `firebaseAI.geminiModel(name:)`.
+  - Find model names for all [supported cloud-hosted Gemini models](https://firebase.google.com/docs/ai-logic/models).
+
+- **Specify an on-device model**:
+  - The on-device `SystemLanguageModel` is automatically selected and managed by Apple's Foundation Models framework.
+  - You can influence the type of tasks it excels at by specifying a `UseCase` and safety `Guardrails` during initialization:
+
+```swift
+// Example of configuring the on-device model with an explicit use case and guardrails
+let customSystemModel = FirebaseAI.SystemLanguageModel(
+    useCase: .general,
+    guardrails: .default
+)
+```
+
+## Use model configuration to control responses
+
+In each request to a model, you can send along model configurations to control
+how the model generates a response. Cloud-hosted models and on-device models
+offer different configuration options.
+
+When making a request using `.respond(to:options:)`, you can specify options using the `ResponseGenerationOptions.hybrid()` factory, allowing you to pass independent options for both Gemini and Foundation Models at the same time:
+
+```swift
+import FoundationModels
+
+// Options for the cloud-hosted Gemini model
+let geminiConfig = GenerationConfig(temperature: 0.8, topK: 10)
+
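+// Options for the on-device Apple Foundation model
+// Note: `.greedy` always picks the most likely token, so `temperature` only
+// has a visible effect when you use a random sampling mode instead
+let systemOptions = FirebaseAI.GenerationOptions(sampling: .greedy, temperature: 0.8)
+
+let response = try await session.respond(
+    to: prompt,
+    options: .hybrid(
+        gemini: geminiConfig,
+        foundationModels: systemOptions
+    )
+)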
+```
@@ -0,0 +1,124 @@
+> [!WARNING]
+> **Experimental:** Using the Firebase AI SDK to build hybrid experiences on Apple platforms is an Experimental feature, which means that this feature isn't subject to any SLA or deprecation policy and could change in backwards-incompatible ways.
+
+<br />
+
+Generative models are powerful at solving many types of problems. However, they
+are constrained by limitations like:
+
+- They are frozen after training, leading to stale knowledge.
+- They can't query or modify external data.
+
+Function calling can help you overcome some of these limitations.
+Function calling is sometimes referred to as *tool use* because it allows a
+model to use external tools such as APIs and functions to generate its final
+response.
+
+This page describes how to implement automatic function calling in your hybrid experiences for Apple apps using `FoundationModels.Tool`.
+
+## Before you begin
+
+Make sure that you've completed the
+[getting started guide for building hybrid experiences](get-started.md).
+
+## Automatic Function Calling via `FoundationModels.Tool`
+
+On Apple platforms running iOS 26+, macOS 26+, or visionOS 26+, you can define tools using the `FoundationModels.Tool` protocol. The Firebase AI SDK will automatically manage the back-and-forth communication required to invoke these tools and pass the results back to the model, streamlining the function calling process.
+
+### Step 1: Define the Tool
+
+First, define your tool by conforming to the `FoundationModels.Tool` protocol. You can use the `@Generable` macro to easily define the expected arguments and return types.
+
+```swift
+import FirebaseAI
+import FirebaseAILogic
+import FoundationModels
+
+@available(iOS 26.0, macOS 26.0, visionOS 26.0, *)
+struct GetTemperatureTool: FoundationModels.Tool {
+  let description = "Returns the current temperature for the specified location."
+
+  @Generable
+  struct Location {
+    let city: String
+    @Guide(description: "The province or state.")
+    let region: String
+    let country: String
+  }
+
+  @Generable
+  struct Temperature {
+    @Generable enum Units { case celsius, fahrenheit, kelvin }
+
+    let temperature: Double
+    let units: Units
+  }
+
+  // The call method is automatically invoked by the SDK when the model requests it
+  func call(arguments: Location) async throws -> Temperature {
+    // TODO(developer): Make a network request to an actual weather API here
+
+    // For demo purposes, we return a hardcoded temperature
+    return Temperature(temperature: 25.0, units: .celsius)
+  }
+}
+```
+
+### Step 2: Provide the Tool to the Model Session
+
+When initializing your `GenerativeModelSession`, pass an instance of your tool in the `tools` array. The tool can be used by both the on-device `SystemLanguageModel` and cloud-hosted `GeminiModel`.
+
+```swift
+// Defining and using a `FoundationModels.Tool` requires iOS 26+, macOS 26+, or visionOS 26+
+if #available(iOS 26.0, macOS 26.0, visionOS 26.0, *) {
+    let systemModel = FirebaseAI.SystemLanguageModel.default
+    let geminiModel = firebaseAI.geminiModel(name: "gemini-2.5-flash-lite")
+
+    // Initialize the tool
+    let temperatureTool = GetTemperatureTool()
+
+    // Set the hybrid fallback preference and provide the tools
+    let session = firebaseAI.generativeModelSession(
+        model: .hybridModel(primary: systemModel, secondary: geminiModel),
+        tools: [temperatureTool],
+        instructions: """
+        You are a weather bot that specializes in reporting outdoor temperatures in Celsius.
+        Always use the `GetTemperatureTool` function to determine the current temperature in a location.
+        """
+    )
+}
+```
+
+### Step 3: Send a Prompt
+
+Now you can send a standard text prompt to the session. If the model determines that it needs to call your tool, the SDK automatically pauses generation, executes your `call(arguments:)` method, adds the result to the model's context, and resumes generating the final response.
+
+```swift
+let prompt = "What is the current temperature in Waterloo, Ontario, Canada?"
+
+// The SDK automatically handles calling the tool and returning the final natural language response
+let response = try await session.respond(to: prompt)
+
+print(response.content)
+// Output example: 
+// The current temperature in Waterloo, Ontario, Canada is 25°C.
+```
+
+## Additional features and considerations
+
+### Parallel Tool Calling
+The model can call a tool multiple times in parallel to satisfy the request, such as retrieving weather details for several cities simultaneously. Ensure that your tool implementation is thread-safe and can handle concurrent `call(arguments:)` executions.
+
+```swift
+let response = try await session.respond(
+    to: "Is it hotter in Boston, Wichita, or Pittsburgh?"
+)
+```
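+
+One way to make shared state safe under concurrent calls is to isolate it in an actor. The sketch below is plain Swift and only illustrates the pattern: `TemperatureCache` and `lookUpAll(_:cache:)` are hypothetical names, the fixed `25.0` value stands in for a real lookup, and the task group merely simulates the model requesting several cities at once.
+
+```swift
+// Hypothetical cache shared by concurrent tool calls; not part of any SDK.
+// Actor isolation serializes access to `readings`, preventing data races.
+actor TemperatureCache {
+    private var readings: [String: Double] = [:]
+
+    func reading(for city: String) -> Double? { readings[city] }
+    func store(_ value: Double, for city: String) { readings[city] = value }
+}
+
+// Simulates several concurrent `call(arguments:)` executions
+func lookUpAll(_ cities: [String], cache: TemperatureCache) async -> [String: Double] {
+    await withTaskGroup(of: (String, Double).self) { group in
+        for city in cities {
+            group.addTask {
+                if let cached = await cache.reading(for: city) {
+                    return (city, cached)
+                }
+                let value = 25.0 // Placeholder for a real weather lookup
+                await cache.store(value, for: city)
+                return (city, value)
+            }
+        }
+        // Collect the child task results as they finish
+        var results: [String: Double] = [:]
+        for await (city, value) in group {
+            results[city] = value
+        }
+        return results
+    }
+}
+```
+
+A tool could hold a reference to such an actor and `await` it inside `call(arguments:)`, so overlapping calls never touch the dictionary at the same time.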
+
+### Error Handling
+Your tool can throw an error to abort a call when it detects that something is wrong, such as when the person using your app hasn't granted permission for data access or a network call times out. Throwing aborts the ongoing generation request.
+
+Alternatively, if you want the model to recover from the failure, your tool can return a string (if its return type allows it) that briefly tells the model what didn't work. For example, returning `"Cannot access the weather database at this time."` lets the model respond to the user with that context instead of aborting the request.
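+
+The two strategies can be combined in one place. The following is a minimal sketch in plain Swift of the kind of logic a tool's `call(arguments:)` could contain when its return type is `String`. `WeatherLookupError`, `fetchTemperature(city:)`, and `temperatureReport(city:)` are hypothetical names used only for illustration, and the empty-city check is just a way to trigger the failure path.
+
+```swift
+enum WeatherLookupError: Error {
+    case permissionDenied   // Unrecoverable: let it propagate
+    case serviceUnavailable // Recoverable: tell the model what went wrong
+}
+
+// Hypothetical stand-in for a real weather API call
+func fetchTemperature(city: String) async throws -> Double {
+    if city.isEmpty { throw WeatherLookupError.serviceUnavailable }
+    return 25.0
+}
+
+func temperatureReport(city: String) async throws -> String {
+    do {
+        let celsius = try await fetchTemperature(city: city)
+        return "The temperature in \(city) is \(celsius) degrees Celsius."
+    } catch WeatherLookupError.serviceUnavailable {
+        // Returning text instead of throwing gives the model context to relay
+        return "Cannot access the weather database at this time."
+    }
+    // Any other error (for example, `permissionDenied`) propagates to the caller
+}
+```
+
+Keeping the recoverable cases explicit in the `catch` clause means that unexpected errors still surface as thrown errors rather than being silently turned into text.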
+
+### Best Practices for Descriptions
+As with guided generation using `@Generable`, adding descriptions to properties helps the model understand the semantics of the arguments. Keep descriptions as short as possible, because long descriptions consume context window space and can add latency.
@@ -0,0 +1,96 @@
+> [!WARNING]
+> **Experimental:** Using the Firebase AI SDK to build hybrid experiences on Apple platforms is an Experimental feature, which means that this feature isn't subject to any SLA or deprecation policy and could change in backwards-incompatible ways.
+
+<br />
+
+Gemini models and Apple's on-device Foundation Models return responses as unstructured text by default.
+However, some use cases require structured output. For example, you
+might be using the response for other downstream tasks that require an
+established data schema.
+
+To ensure that the model's generated output always adheres to a specific schema,
+you can define a data structure. You can then directly extract data from the model's output as Swift types with less post-processing.
+
+This page describes how to generate structured output (like custom objects)
+in your hybrid experiences for Apple apps.
+
+## Before you begin
+
+Make sure that you've completed the
+[getting started guide for building hybrid experiences](get-started.md).
+
+## Generate Structured Output via `@Generable`
+
+Generating structured output is supported for
+inference using both cloud-hosted and on-device models via the `@Generable` macro in Swift. 
+
+A `GenerativeModelSession` lets you request structured decoding by passing `generating: MyType.self`. You can also use basic Swift types directly (e.g., `generating: Float.self` or `generating: [String].self`) if you just need a primitive value or a simple collection.
+
+Here is an example for extracting a user profile. Notice the use of `@Guide` to control values and provide semantics. **Best Practice:** Keep descriptions as short as possible. Long descriptions take up additional context size and can introduce latency.
+
+```swift
+import FirebaseAI
+import FirebaseAILogic
+
+@Generable
+struct UserProfile {
+  @Guide(description: "A unique username for the user.")
+  var username: String
+
+  // Use constraints like `.range` to enforce limits on values
+  @Guide(description: "The user's age.", .range(1...120))
+  var age: Int
+
+  @Guide(description: "A short bio about the user, no more than 100 characters.")
+  var bio: String
+
+  // Use constraints like `.count` to enforce array sizes
+  @Guide(description: "A list of the user's favorite topics.", .count(3))
+  var favoriteTopics: [String]
+}
+
+// Create a session; assumes `systemModel` and `geminiModel` have already been created
+let session = firebaseAI.generativeModelSession(
+    model: .hybridModel(primary: systemModel, secondary: geminiModel)
+)
+
+let prompt = "Generate a user profile for a cat lover who enjoys hiking."
+
+// Provide `generating: UserProfile.self` to map output to your Swift type
+let response = try await session.respond(to: prompt, generating: UserProfile.self)
+
+print("Username: \(response.content.username)")
+print("Bio: \(response.content.bio)")
+print("Favorite Topics: \(response.content.favoriteTopics.joined(separator: ", "))")
+```
+
+> [!NOTE]
+> Properties are generated by the model in the order they are declared. You can also nest custom `@Generable` types inside other `@Generable` types, and mark enumerations with associated values as `@Generable`.
+
+The SDK maps your `@Generable` types to JSON schemas when querying Gemini, and applies the equivalent constraints when communicating with the on-device `SystemLanguageModel`.
+
+## Stream Structured Output
+
+When working with large or complex structured responses, you may want to update your UI as the data is being generated rather than waiting for the entire response to complete.
+
+You can achieve this by using the `streamResponse(to:generating:)` method. The `@Generable` macro automatically synthesizes a nested `PartiallyGenerated` type for your struct, where all properties become optional. As the session processes the stream, it yields snapshots containing these partial results.
+
+```swift
+// ... continuing from the previous example ...
+
+let stream = session.streamResponse(to: prompt, generating: UserProfile.self)
+
+for try await snapshot in stream {
+    // `snapshot.content` is of type `UserProfile.PartiallyGenerated`
+    let partialProfile = snapshot.content
+
+    // Properties might be nil if they haven't been generated yet
+    if let partialUsername = partialProfile.username {
+        print("Username generated so far: \(partialUsername)")
+    }
+}
+
+// Once the stream completes, you can optionally collect the final decoded object
+let finalResponse = try await stream.collect()
+let completeProfile = finalResponse.content
+```