beclab · cal-weng · Oct 11, 2025 · Oct 11, 2025 · Oct 11, 2025 · Oct 27, 2025
diff --git a/docs/manual/olares/settings/gpu-resource.md b/docs/manual/olares/settings/gpu-resource.md
@@ -15,7 +15,6 @@ This guide helps you understand and configure GPU allocation modes to maximize h
 Olares supports **only Nvidia GPUs** of **Turing architecture or later** (Turing, Ampere, Ada Lovelace, and Blackwell). 
 
 - Quick check: GTX/RTX **16 series and newer** consumer cards are supported.
-- For other models, cross-check with the [compatible GPU table](https://github.com/NVIDIA/open-gpu-kernel-modules?tab=readme-ov-file#compatible-gpus).
 - Other models: Cross-check with the [compatible GPU table](https://github.com/NVIDIA/open-gpu-kernel-modules?tab=readme-ov-file#compatible-gpus).
 - Unknown model: Run `lspci | grep -i nvidia` to query the GPU architecture code and determine compatibility.  
 :::
@@ -28,26 +27,23 @@ Even if your GPU architecture is supported, **low VRAM capacity may cause AI app
 
 Olares supports three GPU allocation modes. Choosing the right mode helps optimize performance based on your needs.
 
-### Time Slicing 
+### App Exclusive
 
-In this mode, the GPU's processing power is shared among multiple applications.  
+In this mode, the GPU’s full compute capacity and VRAM are allocated to a single application to maximize performance.
 
-* Acts as a default resource pool. Any application not explicitly assigned to a specific GPU will automatically use a time-slicing GPU if available.
+### Memory Slicing
 
-* Suitable for General-purpose use and running multiple lightweight applications.
+In this mode, the GPU’s VRAM is divided among multiple applications according to the VRAM quota assigned to each:
 
-### App Exclusive
+- Applications with assigned VRAM can run concurrently on the GPU.
+- The sum of all assigned VRAM quotas must not exceed the GPU’s physical VRAM.
 
-In this mode, the entire GPU processing power and memory is dedicated to a single application. 
+### Time Slicing
 
-* Best for intensive, performance-critical applications like AI-generated imagery or high-performance gaming servers.
-* Large memory demands may limit availability for other tasks.
-
-### Memory Slicing
-In this mode, GPU memory (VRAM) is partitioned into fixed, dedicated amounts for specific applications.
+In this mode, any number of applications can be bound to the same GPU:
 
-* Ideal for running multiple GPU-intensive applications simultaneously, each with guaranteed VRAM allocation.
-* Prevents memory conflicts between applications running on the same GPU.
+- At any instant, only one application fully occupies the GPU’s compute and VRAM.
+- The VRAM contents of the other applications are temporarily swapped out to system memory.
 
 ## View GPU status
 

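The Time Slicing behavior described in the hunk above can be pictured with a toy model. This is only an illustrative sketch, not Olares code: the class `TimeSlicedGpu` and its methods `bind` and `activate` are hypothetical names, and the model assumes nothing beyond the two documented rules (one application occupies the GPU at a time; the others have their VRAM contents held in system memory).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TimeSlicedGpu:
    """Toy model of one GPU in Time Slicing mode (hypothetical, for illustration)."""

    apps: List[str] = field(default_factory=list)  # applications bound to this GPU
    active: Optional[str] = None                   # the one application on the GPU now

    def bind(self, app: str) -> None:
        # Any number of applications may be bound to the same GPU.
        if app not in self.apps:
            self.apps.append(app)

    def activate(self, app: str) -> List[str]:
        """Give `app` the whole GPU and return the other bound applications,
        whose VRAM contents (if any) are now held in system memory."""
        if app not in self.apps:
            raise KeyError(f"{app!r} is not bound to this GPU")
        self.active = app
        return [other for other in self.apps if other != app]


gpu = TimeSlicedGpu()
for name in ("chat", "image-gen", "transcoder"):
    gpu.bind(name)

swapped_out = gpu.activate("image-gen")
print(gpu.active, swapped_out)
```

The model deliberately ignores scheduling policy and swap cost, which the documentation does not specify.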
diff --git a/docs/zh/manual/olares/settings/gpu-resource.md b/docs/zh/manual/olares/settings/gpu-resource.md
@@ -28,26 +28,21 @@ Olares supports only **NVIDIA GPUs** of **Turing architecture or later** (T
 
 Olares provides three allocation modes, which you can choose from depending on your scenario.
 
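-### Time Slicing mode
-
-In this mode, the GPU's processing power is shared among multiple applications.
-
-- In this mode, the GPU provides a default VRAM resource pool. Applications that have not been assigned an exclusive GPU or dedicated VRAM automatically use a GPU in Time Slicing mode (if available).
-- Suitable for general-purpose tasks and for running multiple lightweight applications at the same time.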
-
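 ### App Exclusive mode
 
-In this mode, the entire GPU's compute capacity and VRAM are dedicated to a single application.
-
-- Best suited to high-performance, resource-intensive applications such as AI image generation or high-performance game servers.
-- Heavy memory usage may prevent other tasks from running.
+In this mode, the compute capacity and VRAM of a single GPU are allocated to one application to ensure the best performance.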
 
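 ### Memory Slicing mode
 
-In this mode, GPU memory (VRAM) is divided into fixed quotas that are assigned to specified applications.
+In this mode, GPU VRAM can be allocated to multiple applications according to specified VRAM quotas.
+- All applications that have been assigned VRAM can use the GPU concurrently.
+- The sum of the assigned VRAM must not exceed the total physical VRAM.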
+
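+### Time Slicing mode
 
-- Suitable for running multiple GPU-intensive applications (such as multiple AI models) simultaneously, each with its own VRAM quota.
-- Avoids memory conflicts when multiple applications run on the same GPU.
+In this mode, any number of applications can be bound to the same GPU:
+- At any given moment, only one application fully occupies the GPU’s compute capacity and VRAM.
+- Meanwhile, the VRAM contents of the other applications are temporarily swapped out to system memory.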
 
 ## View GPU status
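The Memory Slicing constraint stated in both language versions (the assigned VRAM quotas must not add up to more than the GPU's physical VRAM) is easy to check before applying a configuration. The following is a minimal sketch under stated assumptions: `validate_vram_quotas` is a hypothetical helper, not part of Olares, and the MiB unit and application names are chosen only for illustration.

```python
from typing import Dict


def validate_vram_quotas(quotas_mib: Dict[str, int], physical_vram_mib: int) -> int:
    """Check per-application VRAM quotas against a GPU's physical VRAM.

    Returns the unallocated VRAM in MiB, or raises ValueError if a quota is
    not positive or if the quotas together exceed the physical VRAM.
    (Hypothetical helper for illustration; not an Olares API.)
    """
    for app, quota in quotas_mib.items():
        if quota <= 0:
            raise ValueError(f"quota for {app!r} must be positive, got {quota}")

    total = sum(quotas_mib.values())
    if total > physical_vram_mib:
        raise ValueError(
            f"assigned VRAM ({total} MiB) exceeds physical VRAM "
            f"({physical_vram_mib} MiB)"
        )
    return physical_vram_mib - total


# Two applications sharing a GPU with 16384 MiB of VRAM.
remaining = validate_vram_quotas({"llm": 8192, "image-gen": 4096}, 16384)
print(remaining)  # 4096
```

Returning the remaining headroom rather than a boolean makes it straightforward to show how much VRAM is still available for another application.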