Add video generation support to Microsoft.Extensions.AI #7420

Draft
ericstj wants to merge 4 commits into dotnet:main from ericstj:videoModality

ericstj commented Mar 20, 2026

Introduces IVideoGenerator, a new modality abstraction for video generation that follows the existing patterns established by IChatClient and IImageGenerator.

Abstractions (Microsoft.Extensions.AI.Abstractions)

  • IVideoGenerator interface with GenerateAsync accepting request, options, progress, and cancellation
  • VideoGenerationRequest with Prompt and OriginalMedia (provider-neutral; image content = reference for generation, video content = source for editing)
  • VideoGenerationOptions with Count, Duration, FramesPerSecond, MediaType, ModelId, VideoSize, ResponseFormat, RawRepresentationFactory, and AdditionalProperties
  • VideoGenerationResponse with Contents, Usage, ModelId, RawRepresentation, and AdditionalProperties
  • VideoGenerationProgress for reporting async job status and percent complete via IProgress<T>
  • VideoGenerationResponseFormat enum (Uri, Data, Hosted)
  • VideoGeneratorMetadata for provider name, endpoint, and default model
  • DelegatingVideoGenerator base class for middleware
  • VideoGeneratorExtensions with GenerateAsync/EditVideoAsync/EditVideosAsync convenience overloads
  • HostedVideoGenerationTool and supporting tool content types for chat-client-driven video generation
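
To make the shape of these abstractions concrete, here is a minimal sketch of the generator interface and a delegating base class. Every type in it (`ISketchVideoGenerator`, `SketchRequest`, `SketchOptions`, `SketchProgress`, `SketchResponse`, `FakeGenerator`, `PrefixingGenerator`) is a hypothetical, trimmed-down stand-in written for illustration. The real signatures live in the PR and may differ.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, trimmed-down stand-ins for the PR's types (not the real definitions).
public sealed record SketchRequest(string Prompt);
public sealed record SketchOptions(string? ModelId = null);
public sealed record SketchProgress(string Status, int? PercentComplete);
public sealed record SketchResponse(string ModelId, string[] Contents);

// Assumed shape: request, options, progress, and cancellation, as listed above.
public interface ISketchVideoGenerator
{
    Task<SketchResponse> GenerateAsync(
        SketchRequest request,
        SketchOptions? options = null,
        IProgress<SketchProgress>? progress = null,
        CancellationToken cancellationToken = default);
}

// Middleware base class: forwards every call to the inner generator unless overridden.
public class SketchDelegatingGenerator : ISketchVideoGenerator
{
    public SketchDelegatingGenerator(ISketchVideoGenerator inner) =>
        Inner = inner ?? throw new ArgumentNullException(nameof(inner));

    protected ISketchVideoGenerator Inner { get; }

    public virtual Task<SketchResponse> GenerateAsync(
        SketchRequest request,
        SketchOptions? options = null,
        IProgress<SketchProgress>? progress = null,
        CancellationToken cancellationToken = default) =>
        Inner.GenerateAsync(request, options, progress, cancellationToken);
}

// Example middleware: rewrites the prompt, then delegates.
public sealed class PrefixingGenerator : SketchDelegatingGenerator
{
    private readonly string _prefix;

    public PrefixingGenerator(ISketchVideoGenerator inner, string prefix)
        : base(inner) => _prefix = prefix;

    public override Task<SketchResponse> GenerateAsync(
        SketchRequest request,
        SketchOptions? options = null,
        IProgress<SketchProgress>? progress = null,
        CancellationToken cancellationToken = default) =>
        base.GenerateAsync(request with { Prompt = _prefix + request.Prompt },
            options, progress, cancellationToken);
}

// Fake provider: reports one progress update and returns a placeholder URI.
public sealed class FakeGenerator : ISketchVideoGenerator
{
    public Task<SketchResponse> GenerateAsync(
        SketchRequest request,
        SketchOptions? options = null,
        IProgress<SketchProgress>? progress = null,
        CancellationToken cancellationToken = default)
    {
        cancellationToken.ThrowIfCancellationRequested();
        progress?.Report(new SketchProgress("completed", 100));
        string uri = "https://example.invalid/" + Uri.EscapeDataString(request.Prompt) + ".mp4";
        return Task.FromResult(
            new SketchResponse(options?.ModelId ?? "fake-model", new[] { uri }));
    }
}
```

Calling `new PrefixingGenerator(new FakeGenerator(), "cinematic ").GenerateAsync(new SketchRequest("cat"))` routes the rewritten prompt through the fake provider, which is the same composition idea the delegating base class is meant to enable.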

Middleware (Microsoft.Extensions.AI)

  • VideoGeneratorBuilder with DI integration (AddVideoGenerator on IServiceCollection)
  • LoggingVideoGenerator middleware with builder extension
  • OpenTelemetryVideoGenerator middleware with builder extension
  • ConfigureOptionsVideoGenerator middleware with builder extension
  • VideoGeneratingChatClient that bridges IChatClient tool calls to IVideoGenerator
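
As a rough illustration of how a builder composes middleware, the sketch below uses a hypothetical `MiniGeneratorBuilder` over a hypothetical one-method `IMiniGenerator`. It assumes the video builder follows the same convention as `ChatClientBuilder`, where the first middleware added ends up outermost. That ordering is an assumption about this PR, not something the description states.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical one-method generator used only for this sketch.
public interface IMiniGenerator
{
    Task<string> GenerateAsync(string prompt);
}

// Innermost "provider": returns a placeholder result.
public sealed class EchoGenerator : IMiniGenerator
{
    public Task<string> GenerateAsync(string prompt) =>
        Task.FromResult($"video({prompt})");
}

// Middleware that records when it runs relative to the inner generator.
public sealed class TraceGenerator : IMiniGenerator
{
    private readonly IMiniGenerator _inner;
    private readonly string _name;
    private readonly List<string> _log;

    public TraceGenerator(IMiniGenerator inner, string name, List<string> log)
    {
        _inner = inner;
        _name = name;
        _log = log;
    }

    public async Task<string> GenerateAsync(string prompt)
    {
        _log.Add($"{_name}:before");
        string result = await _inner.GenerateAsync(prompt);
        _log.Add($"{_name}:after");
        return result;
    }
}

// Hypothetical builder: collects factories, then wraps from the inside out.
public sealed class MiniGeneratorBuilder
{
    private readonly IMiniGenerator _innermost;
    private readonly List<Func<IMiniGenerator, IMiniGenerator>> _factories = new();

    public MiniGeneratorBuilder(IMiniGenerator innermost) => _innermost = innermost;

    public MiniGeneratorBuilder Use(Func<IMiniGenerator, IMiniGenerator> factory)
    {
        _factories.Add(factory);
        return this;
    }

    public IMiniGenerator Build()
    {
        // Apply factories in reverse so the first one added becomes the outermost wrapper.
        IMiniGenerator current = _innermost;
        for (int i = _factories.Count - 1; i >= 0; i--)
        {
            current = _factories[i](current);
        }

        return current;
    }
}
```

With `Use(logging)` followed by `Use(telemetry)`, a call enters the logging wrapper first, then the telemetry wrapper, then the provider, and unwinds in the opposite order.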

OpenAI Provider (Microsoft.Extensions.AI.OpenAI)

  • OpenAIVideoGenerator implementing IVideoGenerator via VideoClient
  • AsIVideoGenerator extension method on VideoClient
  • Routing based on request contents and AdditionalProperties keys:
    • Text-to-video: POST /videos (via SDK CreateVideoAsync)
    • Image-to-video: POST /videos with input_reference — URL in JSON or image bytes via multipart (via SDK CreateVideoAsync)
    • Edit by video ID: POST /videos/edits with edit_video_id key (via raw ClientPipeline)
    • Edit by upload: POST /videos/edits with video/* OriginalMedia as multipart (via raw ClientPipeline)
    • Extend: POST /videos/extensions with extend_video_id key (via raw ClientPipeline)
    • Characters: characters array forwarded in POST /videos body via AdditionalProperties
  • Edit, extend, and character endpoints not yet in the OpenAI SDK are supported by constructing PipelineMessage directly against VideoClient.Pipeline/VideoClient.Endpoint
  • Async create → poll → download pattern with IProgress<VideoGenerationProgress> reporting
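
The routing rules above can be read as a small decision function. The sketch below is one possible reading, not the provider's actual code: `VideoRoute`, `RouteSketch`, `Choose`, and `PathFor` are hypothetical names, and the precedence between rules (extend key, then edit key, then uploaded video, then image reference) is an assumption, since the description does not say which rule wins when several apply.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public enum VideoRoute
{
    TextToVideo,
    ImageToVideo,
    EditById,
    EditByUpload,
    Extend,
}

public static class RouteSketch
{
    // mediaTypes: media types of the request's OriginalMedia items (for example "image/png").
    // props: the options' AdditionalProperties, if any.
    public static VideoRoute Choose(
        IReadOnlyList<string> mediaTypes,
        IReadOnlyDictionary<string, object?>? props)
    {
        // Assumed precedence: explicit ID keys win over media-based routing.
        if (props is not null && props.ContainsKey("extend_video_id"))
        {
            return VideoRoute.Extend;
        }

        if (props is not null && props.ContainsKey("edit_video_id"))
        {
            return VideoRoute.EditById;
        }

        if (mediaTypes.Any(m => m.StartsWith("video/", StringComparison.OrdinalIgnoreCase)))
        {
            return VideoRoute.EditByUpload;
        }

        if (mediaTypes.Any(m => m.StartsWith("image/", StringComparison.OrdinalIgnoreCase)))
        {
            return VideoRoute.ImageToVideo;
        }

        return VideoRoute.TextToVideo;
    }

    // Relative endpoint paths as listed in the description.
    public static string PathFor(VideoRoute route) => route switch
    {
        VideoRoute.EditById or VideoRoute.EditByUpload => "/videos/edits",
        VideoRoute.Extend => "/videos/extensions",
        _ => "/videos",
    };
}
```

Keeping the decision separate from the HTTP call makes the routing easy to unit test without a live endpoint.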

Tests

  • Abstraction unit tests: DelegatingVideoGeneratorTests, VideoGenerationOptionsTests, VideoGenerationResponseTests, VideoGeneratorExtensionsTests, VideoGeneratorMetadataTests, VideoGeneratorTests
  • Middleware tests: ConfigureOptionsVideoGeneratorTests, LoggingVideoGeneratorTests, OpenTelemetryVideoGeneratorTests, VideoGeneratorBuilderTests, VideoGeneratorDependencyInjectionPatterns
  • OpenAI tests: OpenAIVideoGeneratorTests, OpenAIVideoGeneratorIntegrationTests
  • Shared TestVideoGenerator and VideoGeneratorIntegrationTests base class

POC Sample

  • samples/VideoGenerationPOC demonstrating all scenarios with System.CommandLine
  • Uses DataContent.LoadFromAsync for file loading with automatic media type inference
  • Uses DataContent.SaveToAsync for output
  • CLI args: --input, --edit, --extend, --character, --model, --output

Attached video: video_20260319_225333.mp4

ericstj requested review from a team as code owners March 20, 2026 06:08
Copilot AI review requested due to automatic review settings March 20, 2026 06:08
ericstj marked this pull request as draft March 20, 2026 06:09
github-actions bot added the area-ai (Microsoft.Extensions.AI libraries) label Mar 20, 2026

Copilot AI left a comment

Pull request overview

Adds an experimental video-generation modality to the Microsoft.Extensions.AI stack, aligning with existing patterns used for chat/image modalities and providing a first provider implementation via OpenAI.

Changes:

  • Introduces new video generation abstractions in Microsoft.Extensions.AI.Abstractions (IVideoGenerator, request/response/options/progress types, tool content types, and extensions).
  • Adds middleware/pipeline + DI registration support in Microsoft.Extensions.AI (builder, logging, OpenTelemetry, options configuration) and a VideoGeneratingChatClient bridge for tool-driven generation.
  • Implements an OpenAI provider (OpenAIVideoGenerator) with polling + download flow, plus unit/integration tests and a POC sample.
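
The polling + download flow can be sketched generically as create, poll until a terminal status, then download. This is a minimal sketch under stated assumptions: `PollSketch`, `JobStatus`, and `ListProgress<T>` are hypothetical names, the three delegates stand in for the provider's HTTP calls, and the status strings `"completed"` and `"failed"` are placeholders rather than confirmed API values.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical progress payload: a status string plus percent complete.
public sealed record JobStatus(string Status, int PercentComplete);

// Synchronous IProgress<T> so reports arrive in order.
// (System.Progress<T> dispatches callbacks asynchronously.)
public sealed class ListProgress<T> : IProgress<T>
{
    public List<T> Items { get; } = new();

    public void Report(T value) => Items.Add(value);
}

public static class PollSketch
{
    public static async Task<byte[]> CreatePollDownloadAsync(
        Func<CancellationToken, Task<string>> createJob,
        Func<string, CancellationToken, Task<JobStatus>> getStatus,
        Func<string, CancellationToken, Task<byte[]>> download,
        IProgress<JobStatus>? progress,
        TimeSpan pollInterval,
        CancellationToken cancellationToken = default)
    {
        // 1. Create the job and remember its ID.
        string jobId = await createJob(cancellationToken);

        // 2. Poll until the job reaches a terminal status, reporting each observation.
        while (true)
        {
            JobStatus status = await getStatus(jobId, cancellationToken);
            progress?.Report(status);

            if (status.Status == "completed")
            {
                break;
            }

            if (status.Status == "failed")
            {
                throw new InvalidOperationException($"Video job {jobId} failed.");
            }

            // Task.Delay observes the token, so cancellation interrupts the wait.
            await Task.Delay(pollInterval, cancellationToken);
        }

        // 3. Download the finished content.
        return await download(jobId, cancellationToken);
    }
}
```

Passing the delegates in keeps the loop testable with canned status sequences. A real provider would likely also want a maximum wait or retry policy, which this sketch omits.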

Reviewed changes

Copilot reviewed 45 out of 45 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| test/Libraries/Microsoft.Extensions.AI.Tests/Video/VideoGeneratorDependencyInjectionPatterns.cs | Validates DI registration patterns and lifetimes for IVideoGenerator. |
| test/Libraries/Microsoft.Extensions.AI.Tests/Video/VideoGeneratorBuilderTests.cs | Tests builder pipeline ordering, null handling, and service provider flow. |
| test/Libraries/Microsoft.Extensions.AI.Tests/Video/SingletonVideoGeneratorExtensions.cs | Test-only middleware helper for singleton pipeline behavior. |
| test/Libraries/Microsoft.Extensions.AI.Tests/Video/OpenTelemetryVideoGeneratorTests.cs | Verifies OTel tags/events emitted for video generation and exceptions. |
| test/Libraries/Microsoft.Extensions.AI.Tests/Video/LoggingVideoGeneratorTests.cs | Verifies logging behavior across log levels and scenarios. |
| test/Libraries/Microsoft.Extensions.AI.Tests/Video/ConfigureOptionsVideoGeneratorTests.cs | Verifies options cloning and configuration middleware behavior. |
| test/Libraries/Microsoft.Extensions.AI.Tests/Microsoft.Extensions.AI.Tests.csproj | Adds shared TestVideoGenerator compilation into middleware test project. |
| test/Libraries/Microsoft.Extensions.AI.OpenAI.Tests/OpenAIVideoGeneratorTests.cs | Tests OpenAI VideoClient adapter + metadata/service retrieval. |
| test/Libraries/Microsoft.Extensions.AI.OpenAI.Tests/OpenAIVideoGeneratorIntegrationTests.cs | Adds OpenAI integration coverage using shared base tests. |
| test/Libraries/Microsoft.Extensions.AI.Integration.Tests/VideoGeneratorIntegrationTests.cs | Adds common integration test suite for IVideoGenerator. |
| test/Libraries/Microsoft.Extensions.AI.Abstractions.Tests/Video/VideoGeneratorTests.cs | Tests IVideoGenerator test double and basic behavior. |
| test/Libraries/Microsoft.Extensions.AI.Abstractions.Tests/Video/VideoGeneratorMetadataTests.cs | Tests metadata construction and defaults. |
| test/Libraries/Microsoft.Extensions.AI.Abstractions.Tests/Video/VideoGeneratorExtensionsTests.cs | Tests convenience extension methods for generate/edit and service access. |
| test/Libraries/Microsoft.Extensions.AI.Abstractions.Tests/Video/VideoGenerationResponseTests.cs | Tests response defaults and JSON round-tripping. |
| test/Libraries/Microsoft.Extensions.AI.Abstractions.Tests/Video/VideoGenerationOptionsTests.cs | Tests options defaults, cloning, and JSON round-tripping. |
| test/Libraries/Microsoft.Extensions.AI.Abstractions.Tests/Video/DelegatingVideoGeneratorTests.cs | Tests delegating generator behavior and service passthrough. |
| test/Libraries/Microsoft.Extensions.AI.Abstractions.Tests/TestVideoGenerator.cs | Adds a reusable IVideoGenerator test implementation. |
| src/Shared/DiagnosticIds/DiagnosticIds.cs | Adds experiment IDs for video generation and OpenAI video client. |
| src/Libraries/Microsoft.Extensions.AI/Video/VideoGeneratorBuilderVideoGeneratorExtensions.cs | Adds AsBuilder() extension for IVideoGenerator. |
| src/Libraries/Microsoft.Extensions.AI/Video/VideoGeneratorBuilderServiceCollectionExtensions.cs | Adds IServiceCollection registration helpers for video generators. |
| src/Libraries/Microsoft.Extensions.AI/Video/VideoGeneratorBuilder.cs | Adds middleware pipeline builder for IVideoGenerator. |
| src/Libraries/Microsoft.Extensions.AI/Video/LoggingVideoGeneratorBuilderExtensions.cs | Adds logging middleware builder extension for video generation. |
| src/Libraries/Microsoft.Extensions.AI/Video/LoggingVideoGenerator.cs | Implements logging middleware for IVideoGenerator. |
| src/Libraries/Microsoft.Extensions.AI/Video/ConfigureOptionsVideoGeneratorBuilderExtensions.cs | Adds options-configuration middleware builder extension. |
| src/Libraries/Microsoft.Extensions.AI/Video/ConfigureOptionsVideoGenerator.cs | Implements options-cloning/configuration middleware. |
| src/Libraries/Microsoft.Extensions.AI/OpenTelemetryConsts.cs | Adds OTel constant for video content type. |
| src/Libraries/Microsoft.Extensions.AI/ChatCompletion/VideoGeneratingChatClientBuilderExtensions.cs | Adds chat client builder extension for tool-driven video generation. |
| src/Libraries/Microsoft.Extensions.AI/ChatCompletion/VideoGeneratingChatClient.cs | Implements chat client wrapper bridging tool calls to IVideoGenerator. |
| src/Libraries/Microsoft.Extensions.AI/ChatCompletion/OpenTelemetryVideoGeneratorBuilderExtensions.cs | Adds OpenTelemetry middleware builder extension for video generation. |
| src/Libraries/Microsoft.Extensions.AI/ChatCompletion/OpenTelemetryVideoGenerator.cs | Implements OTel semantic conventions for video generation operations. |
| src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIVideoGenerator.cs | Implements OpenAI-backed IVideoGenerator with create/poll/download + extra endpoints. |
| src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIClientExtensions.cs | Adds VideoClient.AsIVideoGenerator() adapter extension method. |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Video/VideoGeneratorMetadata.cs | Adds provider metadata type for video generators. |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Video/VideoGeneratorExtensions.cs | Adds convenience overloads + service access helpers for IVideoGenerator. |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Video/VideoGenerationToolResultContent.cs | Adds tool result content type for returning generated videos. |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Video/VideoGenerationToolCallContent.cs | Adds tool call content type for representing tool invocation. |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Video/VideoGenerationResponse.cs | Adds response type for video generation. |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Video/VideoGenerationRequest.cs | Adds request type with prompt + original media. |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Video/VideoGenerationProgress.cs | Adds progress payload for async generation status reporting. |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Video/VideoGenerationOptions.cs | Adds options type + response-format enum and clone support. |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Video/IVideoGenerator.cs | Adds the core IVideoGenerator abstraction. |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Video/HostedVideoGenerationTool.cs | Adds hosted tool marker for enabling video generation. |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Video/DelegatingVideoGenerator.cs | Adds delegating base class for composable video generator middleware. |
| samples/VideoGenerationPOC/VideoGenerationPOC.csproj | Adds a CLI POC sample project. |
| samples/VideoGenerationPOC/Program.cs | Implements the CLI POC demonstrating create/edit/extend/characters flows. |

Comment on lines +19 to +33
/// <summary>Initializes a new instance of the <see cref="VideoGenerationRequest"/> class.</summary>
/// <param name="prompt">The prompt to guide the video generation.</param>
public VideoGenerationRequest(string prompt)
{
    Prompt = prompt;
}

/// <summary>Initializes a new instance of the <see cref="VideoGenerationRequest"/> class.</summary>
/// <param name="prompt">The prompt to guide the video generation.</param>
/// <param name="originalMedia">The original media (images or videos) to base edits on.</param>
public VideoGenerationRequest(string prompt, IEnumerable<AIContent>? originalMedia)
{
    Prompt = prompt;
    OriginalMedia = originalMedia;
}
Member

Given that both properties are { get; set; }, do we need these constructors?

Member Author

We have the same pattern on ImageGenerationRequest. I'm OK with changing it, but we should change both. Maybe make Prompt required.
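
For reference, the "make Prompt required" idea could look roughly like the following. This is only a sketch: `SketchVideoRequest` is a hypothetical stand-in, `IEnumerable<object>` replaces `IEnumerable<AIContent>` to keep the snippet self-contained, and it assumes a compiler and target framework that support C# 11 `required` members.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical stand-in for the request type, with no constructors.
public sealed class SketchVideoRequest
{
    // 'required' forces callers to set Prompt in an object initializer.
    public required string Prompt { get; set; }

    // Optional reference or source media; stays null when not supplied.
    public IEnumerable<object>? OriginalMedia { get; set; }
}
```

A caller would then write `new SketchVideoRequest { Prompt = "a cat" }`, and omitting `Prompt` from the initializer becomes a compile-time error rather than a runtime null.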
