Is there an existing issue for this?
Contact Details
deadpix91@gmail.com
What should this feature add?
Currently, in a multi-user or shared environment, any user can access all available main models (checkpoints) and LoRAs, as well as queue an unlimited number of generation requests. This creates significant bottlenecks, especially on modest GPUs, where heavier models or massive queues from a single user can completely lock up the system for everyone else.
This feature request proposes a built-in user management system with two main capabilities:
Granular Model & LoRA Permissions: Allow administrators to grant, restrict, or blacklist access to specific checkpoints and LoRAs on a per-user (or per-role) basis. Because different models vary wildly in generation time, heavy or complex models could be restricted to specific users, while lighter models remain available to everyone.
Rate Limiting / Generation Quotas: Implement generation limits (e.g., max X generations per minute, per hour, or per day per user). This ensures fair GPU scheduling, preventing a single user from hogging the queue and allowing a smooth multi-user experience without overtaxing the hardware.
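To make the quota idea concrete, here is a minimal sketch of a per-user sliding-window limiter in Python. It is not based on any existing Invoke code: the class name `SlidingWindowLimiter`, the `try_acquire` method, and the idea of calling it before a request is enqueued are all assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical helper: allow at most `max_requests` per user in any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._history = defaultdict(deque)  # user_id -> timestamps of accepted requests

    def try_acquire(self, user_id):
        """Return True and record the request if the user is under the limit, else False."""
        now = self.clock()
        stamps = self._history[user_id]
        # Drop timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_requests:
            return False
        stamps.append(now)
        return True


if __name__ == "__main__":
    # Example: 2 generations per minute per user (values are illustrative).
    limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60)
    for attempt in range(3):
        print("alice", attempt, limiter.try_acquire("alice"))
```

A check like this would presumably sit in front of the generation queue, so a rejected request never reaches the GPU. Each user has an independent history, so one user hitting the limit does not affect anyone else.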
Alternatives
Running Multiple Instances: Setting up entirely separate Invoke.AI instances for different users. This is highly inefficient as it duplicates storage for models and heavily increases system RAM/VRAM overhead.
Custom External API Wrappers: Building a custom frontend (like a Discord bot or a custom web portal) that handles permissions and rate limits before sending API calls to Invoke.AI. This requires significant development effort and doesn't natively protect the core Web UI.
Reverse Proxy Limitations: Using tools like Nginx or Traefik to rate-limit API calls. However, this only limits network requests, not the computational weight of the generation, and it cannot easily filter which specific models a user is allowed to load.
Additional Content
Implementing this would make Invoke significantly more viable for shared homelabs and educational environments that rely on a centralized server with limited hardware.
To keep the initial implementation simple, this wouldn't necessarily require a full graphical interface right away. A simple access_config.json file where server admins could define user roles, model whitelists/blacklists, and rate limits would already be a massive improvement.
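As a rough illustration of what such a file could contain and how it might be evaluated, here is a sketch in Python. The schema (`roles`, `users`, `default_role`, `allow`, `deny`, `max_per_hour`), the model names, and the function names are all hypothetical; nothing here reflects an existing Invoke format. Deny patterns take precedence over allow patterns, and unknown users fall back to the default role.

```python
import fnmatch
import json

# Hypothetical contents of access_config.json; every key and model name is made up.
EXAMPLE_CONFIG = """
{
  "roles": {
    "admin":   {"allow": ["*"], "deny": [], "max_per_hour": null},
    "student": {"allow": ["sd15-*"], "deny": ["sd15-experimental"], "max_per_hour": 20}
  },
  "users": {"alice": "admin", "bob": "student"},
  "default_role": "student"
}
"""


def load_config(text):
    """Parse the JSON text of the access config into a dict."""
    return json.loads(text)


def role_for(config, user):
    """Return the role entry for a user, falling back to the default role."""
    role_name = config["users"].get(user, config["default_role"])
    return config["roles"][role_name]


def can_use_model(config, user, model_name):
    """Blacklist wins over whitelist; patterns use shell-style wildcards."""
    role = role_for(config, user)
    if any(fnmatch.fnmatchcase(model_name, p) for p in role["deny"]):
        return False
    return any(fnmatch.fnmatchcase(model_name, p) for p in role["allow"])


if __name__ == "__main__":
    config = load_config(EXAMPLE_CONFIG)
    for user, model in [("alice", "flux-dev"), ("bob", "sd15-base"), ("bob", "flux-dev")]:
        print(user, model, can_use_model(config, user, model))
```

Wildcard patterns keep the file short when many checkpoints share a naming prefix, and the same role entry can carry the rate limit so both capabilities are configured in one place.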