Inference Text/Image to video - TiledVAE feature #1516

@Domica

Proposal

As proposed on Discord as a feature addition: add a new Inference option to the Text/Image to Video screen that uses TiledVAE to avoid OOMs (out-of-memory errors) on low- to mid-VRAM GPUs.
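For anyone unfamiliar with the idea, here is a rough sketch of why tiling helps: the frame is split into overlapping tiles so the VAE only processes one tile at a time, and peak VRAM scales with the tile size rather than the full resolution. This is only an illustration in Python. The function names are hypothetical, and it is not necessarily how ComfyUI lays out its tiles.

```python
def tile_spans(length: int, tile: int, overlap: int) -> list[tuple[int, int]]:
    """Return (start, end) spans that cover [0, length) with overlapping tiles.

    Illustrative only: real tiled VAE implementations may place tiles differently.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    if length <= tile:
        # The whole axis fits in a single tile.
        return [(0, length)]
    stride = tile - overlap
    spans = []
    start = 0
    while start + tile < length:
        spans.append((start, start + tile))
        start += stride
    # Align the final tile to the far edge so every tile has the same size.
    spans.append((length - tile, length))
    return spans


def tile_grid(width: int, height: int, tile: int, overlap: int) -> list[tuple[int, int, int, int]]:
    """Return (x0, y0, x1, y1) boxes for every tile of a width x height frame."""
    return [
        (x0, y0, x1, y1)
        for (y0, y1) in tile_spans(height, tile, overlap)
        for (x0, x1) in tile_spans(width, tile, overlap)
    ]


if __name__ == "__main__":
    for box in tile_grid(1024, 512, tile=512, overlap=64):
        print(box)
```

The overlap is what lets the results be blended across tile borders to hide seams; a smaller tile size lowers memory use at the cost of more passes.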

This would make the application easier for newcomers to use, since they would not need to go into the ComfyUI flow/node view.

I have some AI-generated code ideas (I only know coding superficially).

Steps (assumed):

  1. New XAML UI Card
    Path: StabilityMatrix.Avalonia/Controls/Inference/TiledVAEEncodeCard.axaml

  2. Code for the card
    Path: StabilityMatrix.Avalonia/Controls/Inference/TiledVAEEncodeCard.axaml.cs

  3. ViewModel for the card (with bindings & validation)
    Path: StabilityMatrix.Avalonia/ViewModels/Inference/TiledVAEEncodeCardViewModel.cs

  4. Inference Module (core)
    Path: StabilityMatrix.Avalonia/ViewModels/Inference/Modules/TiledVAEEncodeModule.cs

  5. Typed Node Definition for ComfyUI
    Add to existing file: StabilityMatrix.Core/Models/Api/Comfy/Nodes/ComfyNodeBuilder.cs

  6. Model register

  7. JSON Serialization Support

  8. Avalonia Style Include (if needed for discovery)
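As a rough idea of what the module in step 4 would ultimately need to emit, below is a sketch of a tiled VAE node in ComfyUI's API-format workflow JSON, written in Python for brevity rather than the project's C#. Assumptions: the node class names `VAEEncodeTiled` / `VAEDecodeTiled` and the inputs `tile_size` and `overlap` are what I believe stock ComfyUI exposes, but they should be checked against the installed version (recent versions may also expect temporal inputs for video, not shown here). The helper name and the node IDs `"10"` and `"4"` are hypothetical.

```python
import json


def tiled_vae_node(kind: str, source: list, vae: list,
                   tile_size: int = 512, overlap: int = 64) -> dict:
    """Build one API-format node dict for a tiled VAE encode or decode.

    `source` and `vae` are links in the form [node_id, output_index].
    Class and input names are assumptions to verify against the installed ComfyUI.
    """
    if kind not in ("encode", "decode"):
        raise ValueError("kind must be 'encode' or 'decode'")
    if not 0 <= overlap < tile_size:
        raise ValueError("need 0 <= overlap < tile_size")
    class_type = "VAEEncodeTiled" if kind == "encode" else "VAEDecodeTiled"
    # Encoding consumes pixels (an image); decoding consumes latent samples.
    source_input = "pixels" if kind == "encode" else "samples"
    return {
        "class_type": class_type,
        "inputs": {
            source_input: source,
            "vae": vae,
            "tile_size": tile_size,
            "overlap": overlap,
        },
    }


if __name__ == "__main__":
    # Hypothetical wiring: node "10" outputs the image, node "4" outputs the VAE.
    workflow_fragment = {
        "20": tiled_vae_node("encode", source=["10", 0], vae=["4", 0], tile_size=256),
    }
    print(json.dumps(workflow_fragment, indent=2))
```

The UI card from steps 1 to 3 would then only need to expose `tile_size` and `overlap` (with validation), and the typed node definition in step 5 would mirror these inputs.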

I would appreciate it if contributors could assess the idea for implementation.

Thanks

Labels: enhancement (New feature or request)
