@@ -30,17 +30,18 @@ You may join skyzh's Discord server and study with the tiny-llm community.
| 1.6 | Load the Model | ✅ | 🚧 | 🚧 |
| 1.7 | Generate Responses (aka Decoding) | ✅ | ✅ | 🚧 |
| 2.1 | KV Cache | ✅ | 🚧 | 🚧 |
- | 2.2 | Quantized Matmul and Linear (CPU) | 🚧 | 🚧 | 🚧 |
- | 2.3 | Quantized Matmul and Linear (Metal) | 🚧 | 🚧 | 🚧 |
- | 2.4 | Attention and Softmax Kernels | 🚧 | 🚧 | 🚧 |
- | 2.5 | Flash Attention | 🚧 | 🚧 | 🚧 |
- | 2.6 | Paged Attention - Part 1 | 🚧 | 🚧 | 🚧 |
- | 2.7 | Paged Attention - Part 2 | 🚧 | 🚧 | 🚧 |
- | 3.1 | Streaming API Server | 🚧 | 🚧 | 🚧 |
- | 3.2 | Continuous Batching | 🚧 | 🚧 | 🚧 |
- | 3.3 | Speculative Decoding | 🚧 | 🚧 | 🚧 |
- | 3.4 | Prefill-Decode Separation | 🚧 | 🚧 | 🚧 |
+ | 2.2 | Quantized Matmul and Linear - Part 1 | ✅ | 🚧 | 🚧 |
+ | 2.3 | Quantized Matmul and Linear - Part 2 | 🚧 | 🚧 | 🚧 |
+ | 2.4 | Flash Attention and Other Kernels | 🚧 | 🚧 | 🚧 |
+ | 2.5 | Continuous Batching | 🚧 | 🚧 | 🚧 |
+ | 2.6 | Speculative Decoding | 🚧 | 🚧 | 🚧 |
+ | 2.7 | Prompt/Prefix Cache | 🚧 | 🚧 | 🚧 |
+ | 3.1 | Paged Attention - Part 1 | 🚧 | 🚧 | 🚧 |
+ | 3.2 | Paged Attention - Part 2 | 🚧 | 🚧 | 🚧 |
+ | 3.3 | Prefill-Decode Separation | 🚧 | 🚧 | 🚧 |
+ | 3.4 | Parallelism | 🚧 | 🚧 | 🚧 |
| 3.5 | AI Agent | 🚧 | 🚧 | 🚧 |
+ | 3.6 | Streaming API Server | 🚧 | 🚧 | 🚧 |
<!--