Neural Processing Unit (NPU)

Schematic depiction of the outer matrix product AB of two matrices A and B. NPUs implement GEMMs (general matrix multiplications) by partitioning the output matrix into tiles, which are then loaded in parallel from a memory buffer, multiplied, and accumulated into the output.

A Neural Processing Unit (NPU) is a specialized hardware accelerator designed to efficiently handle the computational demands of AI and machine learning tasks, particularly neural network inference and training. NPUs are optimized for the operations most common in deep learning, such as matrix multiplications, convolutions, and activation functions. As of mid-2024, NPUs are embedded in a variety of SoCs, which allows a wider choice of AI applications.
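
The tiled GEMM scheme mentioned in the figure caption can be illustrated with a minimal pure-Python sketch. It is not how any particular NPU is programmed; the function name `tiled_matmul` and the `tile` parameter are hypothetical, chosen only for illustration. The loops that run sequentially here stand in for work that hardware would spread across parallel multiply-accumulate units.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply a (m x k) by b (k x n), producing the output one tile at a time.

    Illustrative only: `tile` is a hypothetical tile edge length, and the
    sequential loops model what an accelerator would do in parallel.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b must match")

    c = [[0] * n for _ in range(m)]

    # The two outer loops walk over tiles of the output matrix C.
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            # Step along the shared dimension in tile-sized chunks. On real
            # hardware, each step would load one tile of A and one tile of B
            # into on-chip buffers before multiplying.
            for p0 in range(0, k, tile):
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]  # partial sum from earlier chunks
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc  # accumulate into the output tile
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
result = tiled_matmul(a, b)  # [[58, 64], [139, 154]]
```

The tile size does not change the result, only the order in which partial products are accumulated. In hardware, that order is what determines how much data must be moved between memory and the compute units.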

| Feature | Google TPU (USB/M.2) | Apple Silicon | AMD | Intel (after Meteor Lake) | NVIDIA (Grace Hopper) | NVIDIA (Jetson) | Snapdragon X Elite |
|---|---|---|---|---|---|---|---|
| Product Name | Edge TPU | Apple Neural Engine | 3rd Gen Ryzen AI | VPU, GNA, AI Engine | TensorRT, DLA, Grace Hopper | Jetson Xavier, Nano, TX2 | Qualcomm AI Engine |
| Primary Use Case | Edge AI, Low Power Devices | Mobile, Desktop | GPUs with AI Capabilities | Mobile, Desktop, Edge AI | Data Center, HPC, Embedded | Embedded AI | Mobile, Edge Computing |
| Performance | Moderate | High | Moderate to High | Moderate to High | Very High | Moderate to High | Moderate |
| Efficiency | High | High | Moderate | High | Moderate to High | High | High |
| Special Features | Google Cloud Compatible, Tensor Operations | Unified Memory, Tight OS Integration | APUs, ROCm | Low Power, Vision Processing, Integrated AI | CUDA Integration, Tensor Cores | Low Power, Integrated AI | Integrated 5G, AI on Device |
| Flexibility | Specialized for TensorFlow | General Purpose | AI with General Compute | Specialized for AI and Vision | Highly Specialized | General Purpose | General Purpose |
| Compatibility | TensorFlow Lite | macOS | Windows, Linux | Windows, Linux | Windows, Linux | Linux | Android, Windows |
| Scalability | High | Moderate | Moderate | Moderate | High | Moderate | Moderate |
| Integration | Edge Devices | Mobile, Desktop | GPUs | Mobile, Desktop, Edge Devices | HPC, Cloud, Embedded | Embedded Systems | Mobile SoCs |
| Availability | USB, M.2 Modules | Built-in (A-series, M-series) | Radeon Instinct GPUs | Integrated in Meteor Lake CPUs | Available in GPUs, Servers | Available in Embedded Modules | Snapdragon SoCs |

External Reading: