docs: fix typos in day 6 inference system overview

NJX-njx · NJX-njx · commit 5b1e1e47111c · 2026-03-03T22:05:10.000+08:00
- Fix grammar: 'is hide' -> 'is hidden', 'executed' -> 'are executed'
- Fix image caption: .png -> .jpg for H800 Node Count figure

Made-with: Cursor
diff --git a/202502OpenSourceWeek/day_6_one_more_thing_deepseekV3R1_inference_system_overview.md b/202502OpenSourceWeek/day_6_one_more_thing_deepseekV3R1_inference_system_overview.md
@@ -24,7 +24,7 @@ As we have adopted prefill-decode disaggregation architecture, we employ differe
 
 ### Computation-Communication Overlapping
 Large-scale cross-node EP introduces significant communication overhead. To mitigate this, we employ a dual-batch overlap strategy to hide communication costs and improve overall throughput by splitting a batch of requests into two microbatches. 
-During the prefilling phase, these two microbatches executed alternately and the communication cost of one microbatch is hide behind the computation of the other.
+During the prefilling phase, these two microbatches are executed alternately and the communication cost of one microbatch is hidden behind the computation of the other.
 
 ![Communication-Computation Overlapping during Prefilling Phase.png](figures/Communication-Computation%20Overlapping%20during%20Prefilling%20Phase.png)
 *Communication-Computation Overlapping during Prefilling Phase*
@@ -68,7 +68,7 @@ Over the past 24 hours (UTC+8 02/27/2025 12:00 PM to 02/28/2025 12:00 PM), the c
 Assuming the leasing cost of one H800 GPU is $2 per hour, the total daily cost amounts to $87,072.
 
 ![H800 Node Count For Inference Service.jpg](figures/H800%20Node%20Count%20For%20Inference%20Service.jpg)
-*H800 Node Count For Inference Service.png*
+*H800 Node Count For Inference Service.jpg*
 
 Within the 24-hour statistical period (UTC+8 02/27/2025 12:00 PM to 02/28/2025 12:00 PM), V3 and R1:
 - Total input tokens: 608B, of which 342B tokens (56.3%) hit the on-disk KV cache.
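
Review note: the figures in the context lines of the second hunk can be sanity-checked with a few lines of arithmetic. The sketch below is not part of the patch. It uses only the numbers quoted above ($2 per GPU-hour, $87,072 per day, 608B input tokens, 342B cache hits); the function and constant names are hypothetical, chosen for illustration.

```python
# Sanity check of the figures quoted in the diff context (not part of the patch).
# All names here are illustrative assumptions; only the numbers come from the text.

GPU_HOURLY_COST_USD = 2      # leasing cost per H800 GPU-hour, as assumed in the text
DAILY_COST_USD = 87_072      # total daily cost stated in the text
INPUT_TOKENS_B = 608         # total input tokens, in billions
CACHE_HIT_TOKENS_B = 342     # input tokens that hit the on-disk KV cache, in billions


def implied_average_gpus(daily_cost: float, hourly_cost: float, hours: int = 24) -> float:
    """Average number of GPUs that would produce the stated daily cost."""
    return daily_cost / (hourly_cost * hours)


def cache_hit_ratio(hit_tokens: float, total_tokens: float) -> float:
    """Fraction of input tokens served from the KV cache."""
    return hit_tokens / total_tokens


avg_gpus = implied_average_gpus(DAILY_COST_USD, GPU_HOURLY_COST_USD)
hit_ratio = cache_hit_ratio(CACHE_HIT_TOKENS_B, INPUT_TOKENS_B)

print(f"implied average GPUs: {avg_gpus:.0f}")
print(f"cache hit ratio: {hit_ratio:.2%}")
```

With the rounded token counts the hit ratio comes out at 56.25%, which is consistent with the 56.3% quoted in the document given that 608B and 342B are themselves rounded.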