[RFC] Add FlagOS Backend Support for Multi-Chip AI Accelerators

 ## Summary

  This RFC proposes adding support for **FlagOS** (a unified open-source AI system software stack) to Taichi, enabling Taichi programs to run on various domestic AI chips including MLU (Cambricon), Ascend (Huawei), DCU (Hygon), and GCU (Enflame) through FlagOS's unified compiler infrastructure.

  ## Motivation

  ### Current Situation
  - Taichi currently supports CPUs, NVIDIA GPUs (CUDA), AMD GPUs, and other backends such as Vulkan and Metal
  - Domestic AI chips are widely used in China but lack Taichi support
  - Each chip requires individual backend development effort

  ### Proposed Solution
  Integrate with FlagOS, which provides:
  - **Unified Compiler (FlagTree)**: Single compiler targeting multiple AI chip architectures
  - **Multi-chip Support**: MLU, Ascend, DCU, GCU, and more
  - **Mature Ecosystem**: FlagGems (operators), FlagCX (communication), FlagScale (training)

  ### Benefits
  1. **Expand Taichi's Hardware Support**: Access to domestic AI chip market
  2. **Reduce Development Effort**: One backend for multiple chips
  3. **Ecosystem Integration**: Connect Taichi with FlagOS ecosystem

  ## Proposed Design

  ### Architecture Overview

  Taichi DSL → LLVM IR → FlagTree Compiler → AI Chip Binary
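
  The flow above can be read as a chain of lowering stages. The sketch below models that chain in plain Python purely as an illustration. The `Artifact` type, both stage functions, and the string payloads are hypothetical stand-ins, not Taichi or FlagTree APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """A compilation artifact tagged with the representation it is in."""
    form: str      # "taichi_dsl", "llvm_ir", or "chip_binary"
    payload: str   # placeholder for the real contents


def lower_to_llvm_ir(a: Artifact) -> Artifact:
    # Hypothetical stand-in for Taichi's existing DSL -> LLVM IR lowering.
    if a.form != "taichi_dsl":
        raise ValueError(f"expected taichi_dsl, got {a.form}")
    return Artifact("llvm_ir", f"llvm({a.payload})")


def compile_with_flagtree(a: Artifact, chip: str) -> Artifact:
    # Hypothetical stand-in for the FlagTree step (LLVM IR -> chip binary).
    if a.form != "llvm_ir":
        raise ValueError(f"expected llvm_ir, got {a.form}")
    return Artifact("chip_binary", f"{chip}:{a.payload}")


def build(source: str, chip: str) -> Artifact:
    """Run the stages in the order shown in the diagram."""
    dsl = Artifact("taichi_dsl", source)
    ir = lower_to_llvm_ir(dsl)
    return compile_with_flagtree(ir, chip)


binary = build("compute", chip="mlu370")
print(binary.form, binary.payload)
```

  Tagging each artifact with its form makes a wrongly ordered stage fail loudly instead of silently producing garbage.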


  ### Key Components

  1. **RHI Device Layer** (`taichi/rhi/flagos/`)
     - Memory management via FlagOS runtime
     - Kernel launch interface
     - Multi-chip support (MLU370, Ascend910, DCU, GCU)

  2. **Code Generation Layer** (`taichi/codegen/flagos/`)
     - LLVM IR generation for AI chips
     - SPMD execution model
     - Chip-specific optimizations

  3. **Program Implementation** (`taichi/runtime/program_impls/flagos/`)
     - Kernel compilation via FlagTree
     - Kernel launch management
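
  To make the "SPMD execution model" item concrete: every program instance runs the same kernel body on a different slice of the index range. The RFC does not specify how the FlagOS code generator partitions work, so the strided partition below is an assumption, and `spmd_indices` and `run_spmd` are hypothetical names used only for this sketch.

```python
from typing import Callable


def spmd_indices(program_id: int, num_programs: int, n: int) -> range:
    """Indices handled by one program instance under a strided partition."""
    return range(program_id, n, num_programs)


def run_spmd(n: int, num_programs: int, body: Callable[[int], None]) -> None:
    # On hardware the instances would run in parallel; this simulation
    # runs them one after another, which gives the same result as long
    # as the iterations are independent.
    for pid in range(num_programs):
        for i in spmd_indices(pid, num_programs, n):
            body(i)


out = [0] * 10


def square_into_out(i: int) -> None:
    out[i] = i * i


run_spmd(len(out), num_programs=4, body=square_into_out)
print(out)
```

  Each index is visited exactly once regardless of `num_programs`, which is the property a backend needs when it maps a parallel range-for onto a fixed number of hardware instances.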

  ### Python API

```python
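import taichi as ti

# Initialize the FlagOS backend.
# `ti.flagos` and `flagos_chip` are the API proposed by this RFC;
# they do not exist in upstream Taichi today.
ti.init(arch=ti.flagos, flagos_chip="mlu370")

@ti.kernel
def compute():
    # Taichi parallelizes the outermost range-for in a kernel.
    for i in range(1000000):
        pass

compute()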
```
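
  Since the chip is selected by a free-form string, the backend will probably want to normalize and validate it before any compilation happens. The sketch below shows one way to do that. `resolve_flagos_chip` and the `SUPPORTED_CHIPS` table are hypothetical, and only the spelling `"mlu370"` comes from the example above; the other keys are assumed from the chip list in Key Components.

```python
# Hypothetical mapping from `flagos_chip` strings to vendor families.
# Only "mlu370" appears in the RFC example; the other spellings are assumed.
SUPPORTED_CHIPS = {
    "mlu370": "Cambricon MLU",
    "ascend910": "Huawei Ascend",
    "dcu": "Hygon DCU",
    "gcu": "Enflame GCU",
}


def resolve_flagos_chip(name: str) -> str:
    """Normalize a chip name and fail early on unknown targets."""
    key = name.strip().lower()
    if key not in SUPPORTED_CHIPS:
        known = ", ".join(sorted(SUPPORTED_CHIPS))
        raise ValueError(f"unknown flagos_chip {name!r}; expected one of: {known}")
    return key


print(resolve_flagos_chip(" MLU370 "))
```

  Failing at `ti.init` time with the list of known targets is friendlier than a compiler error from deep inside the toolchain.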


  ## Implementation Plan

  | Phase   | Timeline  | Deliverables                      |
  |---------|-----------|-----------------------------------|
  | Phase 1 | 1-2 weeks | Infrastructure (current PR ready) |
  | Phase 2 | 2-4 weeks | FlagTree SDK integration          |
  | Phase 3 | 4-8 weeks | Advanced features & optimization  |

  ## Current Status

  ✅ Completed:

  - Core architecture (27 files, +2900 lines)
  - RHI Device layer
  - Code generation layer
  - Program implementation
  - Build system integration
  - Example programs
  - Documentation

  🔄 Pending (requires FlagOS SDK):

  - FlagTree compiler integration
  - Kernel binary generation
  - Hardware-specific optimizations

  ## Testing Strategy

  1. **Stub Testing**: Use a generic stub for API testing
  2. **Mock Testing**: Mock the FlagTree compiler for unit tests
  3. **CI Integration**: GitHub Actions with `TI_WITH_FLAGOS=ON`
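
  As an illustration of the mock-testing idea, the sketch below replaces the compiler with a `unittest.mock` autospec so the surrounding logic can be exercised without the SDK or hardware. The actual backend is C++, so this only shows the pattern: `FlagTreeCompiler` and `build_kernel` are hypothetical names, not part of FlagTree or Taichi.

```python
from unittest import mock


class FlagTreeCompiler:
    """Hypothetical wrapper around the FlagTree SDK (not a real API)."""

    def compile(self, llvm_ir: str, chip: str) -> bytes:
        raise RuntimeError("FlagTree SDK is not available on this machine")


def build_kernel(compiler: FlagTreeCompiler, llvm_ir: str, chip: str) -> bytes:
    """Code under test: validates input, then delegates to the compiler."""
    if not llvm_ir:
        raise ValueError("empty LLVM IR")
    return compiler.compile(llvm_ir, chip)


def test_build_kernel_uses_compiler() -> None:
    # create_autospec keeps the real method signature, so a call with the
    # wrong arguments fails in the test instead of passing silently.
    fake = mock.create_autospec(FlagTreeCompiler, instance=True)
    fake.compile.return_value = b"\x00stub-binary"

    ir = "define void @k() { ret void }"
    result = build_kernel(fake, ir, "mlu370")

    assert result == b"\x00stub-binary"
    fake.compile.assert_called_once_with(ir, "mlu370")


test_build_kernel_uses_compiler()
```

  Tests of this kind verify the call contract between layers; they say nothing about whether generated binaries are correct, which still needs real hardware.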

  ## Questions for Discussion

  1. Should `FlagosProgramImpl` inherit from `LlvmProgramImpl` or be separate?
  2. Is stub/mock testing acceptable for an initial merge without hardware?
  3. Who will maintain this backend long-term?
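
  To frame the first question, here are the two shapes side by side. The real classes are C++; the Python below is a deliberately simplified sketch in which `LlvmProgramImpl` is a stand-in with a single hypothetical `compile` method, and the two `Flagos…` class names are invented for the comparison.

```python
class LlvmProgramImpl:
    """Stand-in for the shared LLVM pipeline (hypothetical, simplified)."""

    def compile(self, kernel: str) -> str:
        return f"llvm_ir({kernel})"


# Option A: inherit and extend the LLVM pipeline.
class FlagosProgramInherit(LlvmProgramImpl):
    def compile(self, kernel: str) -> str:
        return f"flagtree({super().compile(kernel)})"


# Option B: stay separate and hold the LLVM pipeline as a member.
class FlagosProgramSeparate:
    def __init__(self) -> None:
        self._llvm = LlvmProgramImpl()

    def compile(self, kernel: str) -> str:
        return f"flagtree({self._llvm.compile(kernel)})"


a = FlagosProgramInherit().compile("compute")
b = FlagosProgramSeparate().compile("compute")
print(a == b)
```

  Both options produce the same result here. The difference is coupling: option A picks up every future change to the base class automatically, while option B exposes only what it forwards explicitly.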

  ## Related Links

  - FlagOS: https://github.com/flagos-ai
  - FlagTree: https://github.com/flagos-ai/flagtree
  - Taichi FlagOS Fork: https://github.com/GWinfinity/taichi

  ## Checklist

  - [x] RFC created
  - [ ] Community feedback incorporated
  - [ ] Core team approval
  - [ ] PR submitted

  ---
  /cc @ailzhang @k-ye @bobcao3

  Would love to hear your thoughts on this proposal!



[RFC] Add FlagOS Backend Support for Multi-Chip AI Accelerators #8786
