Home
Welcome to the PCIe wiki!
This project aims to enhance popular open-source AI tools, such as run:ai, Parallax, vLLM, and Petals, so that they natively support AI workloads running on PCIe-attached AI accelerator cards, covering both inference and training, and enabling users to build hardware-agnostic AI platforms on their own accelerator hardware. The goal is to democratize access to large language models and prove that accessible, high-performance LLM inference is achievable beyond centralized data centers, on hardware people already own, making Software-Defined AI Factories (SDAF) ubiquitous for inference, agentic AI workflows, and other edge use cases.
Problem
Many open-source AI platforms and workflows lack streamlined support for AI acceleration via PCIe-connected hardware, limiting performance and flexibility on modern hardware setups.
Solution
Extend the codebases of tools such as run:ai (and similar FOSS projects) to integrate with PCIe-based AI accelerator cards, providing efficient inference and training capabilities through hardware- and software-level enhancements, guided by NIST's AI Risk Management Framework. A minimal device-discovery sketch follows.
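Before any integration work, a tool needs to discover which PCIe accelerators are present on the host. The sketch below is a minimal, hypothetical example that shells out to lspci on Linux; the vendor IDs and device-class strings are illustrative only and would need to be extended for the cards a deployment actually targets.

```python
import subprocess

# Illustrative PCI vendor IDs; extend for the accelerators you target.
ACCEL_VENDORS = {
    "10de": "NVIDIA",
    "1002": "AMD",
    "8086": "Intel",
}


def find_accelerators() -> list[str]:
    """Return lspci lines that look like PCIe AI accelerator devices."""
    out = subprocess.run(
        ["lspci", "-nn"], capture_output=True, text=True, check=True
    ).stdout
    hits = []
    for line in out.splitlines():
        # `lspci -nn` prints numeric "[vendor:device]" IDs; match known
        # vendors and the device classes accelerators typically report.
        vendor_match = any(f"[{vid}:" in line for vid in ACCEL_VENDORS)
        class_match = (
            "3D controller" in line
            or "Processing accelerators" in line
            or "VGA" in line
        )
        if vendor_match and class_match:
            hits.append(line)
    return hits


if __name__ == "__main__":
    for device in find_accelerators():
        print(device)
```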
Key Benefits
- Hardware-agnostic compatibility: leverages the high bandwidth and low latency of PCIe to improve performance across a variety of AI card platforms.
- Open ecosystem enhancement: enriches popular ML tools and multi-modal LLM stacks with support for cutting-edge AI accelerator integration, benefiting the broader community.
- User empowerment: enables users to build their own hardware-agnostic AI platforms on top of PCIe accelerators.
Approach
- Identify candidate tools (e.g., run:ai, drivers, MPI-based frameworks) for PCIe integration.
- Design new modules or adapters that bridge existing workflows to PCIe accelerator APIs (see the adapter sketch after this list).
- Prototype with representative workloads (inference and training), testing performance and compatibility across different accelerator cards.
- Document integration steps and best practices for end users, including monitoring and observability features (a telemetry sketch also follows).
- Provide documentation that supports transparency and accountability.
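The adapter layer mentioned above could take the shape of a small abstract interface that each accelerator backend implements. This is a hypothetical sketch, not an existing run:ai or vLLM API; all class and method names are illustrative.

```python
from abc import ABC, abstractmethod
from typing import Any


class PCIeAcceleratorAdapter(ABC):
    """Hypothetical bridge between an AI framework and a PCIe accelerator API.

    Concrete subclasses would wrap a vendor runtime (CUDA, ROCm, a
    vendor-specific SDK, etc.) behind this common surface.
    """

    @abstractmethod
    def discover(self) -> list[str]:
        """Return identifiers for the PCIe accelerator devices found."""

    @abstractmethod
    def load_model(self, model_path: str) -> Any:
        """Load a model artifact onto the accelerator and return a handle."""

    @abstractmethod
    def infer(self, model: Any, inputs: Any) -> Any:
        """Run one inference pass and return the outputs."""

    @abstractmethod
    def train_step(self, model: Any, batch: Any) -> float:
        """Run one training step and return the loss."""
```

Keeping the surface this small makes it easier to add backends for new cards without touching the host framework, which is also the modular-architecture mitigation noted under the risks below.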
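For the monitoring and observability item, one concrete (NVIDIA-specific) option is polling NVML via the nvidia-ml-py package; a vendor-neutral version would need one such backend per accelerator family, and the samples could be exported through DCGM or OpenTelemetry (both linked in the references). A minimal sampler sketch:

```python
import pynvml  # pip install nvidia-ml-py


def sample_gpu_telemetry() -> list[dict]:
    """Poll utilization and memory use for each NVML-visible device."""
    pynvml.nvmlInit()
    try:
        samples = []
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            samples.append({
                "device": i,
                "gpu_util_pct": util.gpu,
                "mem_used_mb": mem.used // (1024 * 1024),
                "mem_total_mb": mem.total // (1024 * 1024),
            })
        return samples
    finally:
        pynvml.nvmlShutdown()
```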
Resource Needs
A typical resource breakdown will include:
- Development hours for adapter design and coding
- Testing infrastructure (access to various PCIe accelerator hardware)
- Documentation and community support time
Risks and Mitigations
- Hardware diversity: PCIe AI cards differ in interfaces and requirements.
  - Mitigation: start with a few mainstream accelerator models.
- Complex integrations: code modifications may introduce instability.
  - Mitigation: employ rigorous testing and a modular architecture.
- Project scope creep: too many tool integrations could dilute focus.
  - Mitigation: prioritize integration targets based on impact and feasibility.
Frameworks
- NIST AI Risk Management Framework (AI RMF) and the AI TRiSM (Trust, Risk, and Security Management) framework.
Success Metrics
- Number of AI frameworks successfully extended for PCIe.
- Performance gains in inference/training benchmarks (a benchmark-harness sketch follows this list).
- Adoption by open-source communities (stars, forks, contributions).
- Positive feedback from early adopters.
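Performance gains only count if they are measured consistently across cards. Below is a minimal latency/throughput harness; infer_fn stands in for whatever inference call a backend exposes (a hypothetical name, matching the adapter sketch above).

```python
import statistics
import time
from typing import Any, Callable


def benchmark(infer_fn: Callable[[Any], Any], payload: Any,
              warmup: int = 5, iters: int = 50) -> dict:
    """Time repeated inference calls and report simple latency stats."""
    for _ in range(warmup):  # warm caches, JITs, and device queues
        infer_fn(payload)

    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        infer_fn(payload)
        latencies.append(time.perf_counter() - start)

    return {
        "p50_ms": statistics.median(latencies) * 1e3,
        # quantiles(n=20) yields 19 cut points; index 18 is the 95th percentile
        "p95_ms": statistics.quantiles(latencies, n=20)[18] * 1e3,
        "throughput_rps": iters / sum(latencies),
    }
```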
Next Steps
- Define supported PCIe accelerator platforms and target AI frameworks.
- Create a development roadmap and prioritize integration effort.
- Develop an initial proof-of-concept adapter for one combination (e.g., run:ai + a specific card) and build out a benchmarking suite for performance validation (a combined usage sketch follows this list).
- Expand documentation and encourage community contributions.
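Putting the pieces together, a proof of concept might wire a trivial backend into the harness like this. DummyAdapter and everything else here is a hypothetical placeholder, assuming the PCIeAcceleratorAdapter and benchmark definitions from the sketches above are in scope.

```python
# Hypothetical proof-of-concept wiring: a trivial adapter plus the
# benchmark harness from the sketch above.
class DummyAdapter(PCIeAcceleratorAdapter):
    def discover(self):
        return ["dummy:0"]

    def load_model(self, model_path):
        return {"path": model_path}  # stand-in for a device-side handle

    def infer(self, model, inputs):
        return inputs  # identity "model", useful for plumbing tests

    def train_step(self, model, batch):
        return 0.0


adapter = DummyAdapter()
model = adapter.load_model("model.onnx")
stats = benchmark(lambda x: adapter.infer(model, x), payload=[0.0] * 1024)
print(adapter.discover(), stats)
```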
Project Metadata
- Languages used: Python, C/C++, and others
- Related topics/tags: cuda, k8s, k3s, mpi4py, runai, cxl, onnxoptimizer, vllm, opentelemetry-ebpf-profiler, mpio, DisTrO, cxl-mem, photonics-computing, llamacpp, llm-d, paxos-cluster, Triton, TensorRT, Petals, Parallax, SGLang, ray, and others
- External references (Medium articles comparing AI/ML hardware performance and on DIY AI infrastructure, plus related papers, docs, and product pages):
- https://medium.com/@maneeshsharma_68969/comparing-performance-of-ai-ml-hardware-a0d18cf657a0
- https://medium.com/@maneeshsharma_68969/diy-ai-infrastructure-a7a1ecf8d688
- https://gradient.network/parallax.pdf
- https://arxiv.org/abs/2209.01188
- https://arxiv.org/abs/2509.26182
- https://arxiv.org/abs/2309.06180
- https://arxiv.org/abs/1706.01160
- https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
- https://www.jeffgeerling.com/blog/2025/all-intel-gpus-run-on-raspberry-pi-and-risc-v
- https://developer.nvidia.com/dcgm
- https://docs.ray.io/en/latest/cluster/getting-started.html
- https://github.com/NVIDIA/TensorRT
- https://catalog.ngc.nvidia.com/
- https://www.amd.com/content/dam/amd/en/documents/pensando-technical-docs/product-briefs/pollara-product-brief.pdf
- https://www.gigabyte.com/PC-Accessory/AI-TOP-CXL-R5X4
- https://csrc.nist.gov/projects/post-quantum-cryptography
- https://falcon-sign.info/falcon.pdf
- https://cdi.liqid.com/hubfs/Liqid-CXL%20HBA-102725.pdf
- https://cdi.liqid.com/hubfs/Liqid-CXL%202.0%20Fabric_072125.pdf
- https://www.broadcom.com/products/ethernet-connectivity/network-adapters/n1800go
- https://arxiv.org/pdf/2511.15950
- https://www.qualcomm.com/news/releases/2025/10/qualcomm-unveils-ai200-and-ai250-redefining-rack-scale-data-cent
- https://www.mobilint.com/aries/mla100
- https://www.qualcomm.com/internet-of-things/solutions/ai-on-prem-appliance
- https://www.qualcomm.com/developer/software/qualcomm-ai-inference-suite
- https://www.qualcomm.com/content/dam/qcomm-martech/dm-assets/documents/Prod_Brief_QCOM_Cloud_AI_100_Ultra.pdf
- https://store.axelera.ai/products/metis-pcie-card-unmatched-performance-for-edge-ai-applications
- https://www.gigabyte.com/Motherboard/TRX50-AERO-D-rev-12
- https://github.com/exo-explore/exo
- https://arxiv.org/pdf/2503.01861v3
- https://huggingface.co/blog/ibm-research/cuga-on-hugging-face
- https://huggingface.co/collections/nvidia/nvidia-nemotron-v3
- https://mikrotik.com/product/ccr2004_1g_2xs_pcie
- https://www.asus.com/networking-iot-servers/wired-networking/all-series/xg-c100c/
- https://www.tp-link.com/in/home-networking/pci-adapter/tx401/
- https://github.com/ml-explore/mlx
- https://aaif.io/
- https://www.kolosal.ai/
- https://www.foundrylocal.ai/models
- https://plugable.com/blogs/news/plugable-introduces-tbt5-ai-at-ces-secure-local-ai-powered-by-thunderbolt-5