ComfyUI custom nodes for TencentARC/Pixal3D: image-to-3D generation, textured GLB export, FlashAttention 2/3 selection, manual camera control, and ComfyUI model unload support.
Compatibility | Windows Wheels | Build NATTEN On Windows | Troubleshooting | Chinese README
ComfyUI Manager:
- Open ComfyUI Manager.
- Search for Pixal3D by Saganaki22.
- Install the normal/stable entry. Do not install the Nightly node.
- Restart ComfyUI, then run Pixal3D Environment Check.
Manual install:
cd ComfyUI\custom_nodes
git clone https://github.com/Saganaki22/Pixal3D-ComfyUI.git
cd Pixal3D-ComfyUI
python -m pip install -r requirements.txt
python install.py --checkRestart ComfyUI, then run Pixal3D Environment Check before loading the model.
requirements.txt installs only safe Python packages. It deliberately does not install Torch, FlashAttention, Triton, Pixal3D CUDA kernels, or renderer kernels because those wheels must match your exact Python, PyTorch, CUDA, OS, and GPU.
A working generation environment needs these imports inside the same Python that launches ComfyUI:
| Piece | Required import or file | Notes |
|---|---|---|
| PyTorch CUDA | torch.cuda.is_available() == True |
CPU-only is not supported |
| Attention | flash_attn or flash_attn_interface |
FlashAttention 2 or 3 |
| Sparse GEMM | flex_gemm_ap or flex_gemm |
Pixal3D CUDA kernel |
| Mesh ops | cumesh_vb or cumesh |
Pixal3D CUDA kernel |
| Voxel/export ops | o_voxel_vb_ap or o_voxel |
Pixal3D CUDA kernel |
| DRTK | drtk |
UV/export helper |
| Pixal3D model | ComfyUI/models/Pixal3D/TencentARC_Pixal3D/pipeline.json |
Download manually or use download_if_missing=true |
| DINOv3 helper | ComfyUI/models/Pixal3D/camenduru_dinov3-vitl16-pretrain-lvd1689m/ |
Needed by the image encoder |
| MoGe | ComfyUI/models/geometry_estimation/moge_2_vitl_normal_fp16.safetensors |
Only needed for camera_mode=moge |
| RMBG-2.0 | ComfyUI/models/Pixal3D/briaai_RMBG-2.0/ |
Gated model; only needed for background_mode=auto_remove |
| NATTEN/libnatten | natten.HAS_LIBNATTEN == True |
Only needed for strict NAF |
If Environment Check says a CUDA package is missing, install a wheel that exactly matches your stack. Do not let pip replace a working Torch install while testing random wheels; use --no-deps for manual CUDA wheels.
On Windows, install wheels in this order:
- PyTorch with CUDA for your GPU/driver.
- Required Pixal3D CUDA wheels:
flex_gemm_ap,cumesh_vb,o_voxel_vb_ap, anddrtk. - One attention wheel: FlashAttention 2 (
flash_attn) or FlashAttention 3 (flash_attn_interface). - Optional strict NAF wheel: NATTEN with CUDA
libnatten.
The required Pixal3D CUDA wheels are separate from NATTEN. A working NATTEN install does not mean flex_gemm, cumesh, o_voxel, or drtk are installed.
For Python 3.12, PyTorch 2.10, CUDA 13.0 on Blackwell sm120, install the required Pixal3D CUDA wheels plus the prebuilt NATTEN/libnatten wheel with:
venv\Scripts\python.exe -m pip install --no-deps ^
"https://github.com/PozzettiAndrea/cuda-wheels/releases/download/flex_gemm_ap-latest/flex_gemm_ap-1.0.0%2Bcu130torch2.10-cp312-cp312-win_amd64.whl" ^
"https://github.com/PozzettiAndrea/cuda-wheels/releases/download/cumesh_vb-latest/cumesh_vb-1.0%2Bcu130torch2.10-cp312-cp312-win_amd64.whl" ^
"https://github.com/PozzettiAndrea/cuda-wheels/releases/download/o_voxel_vb_ap-latest/o_voxel_vb_ap-0.0.1%2Bcu130torch2.10-cp312-cp312-win_amd64.whl" ^
"https://github.com/PozzettiAndrea/cuda-wheels/releases/download/drtk-latest/drtk-0.1.0%2Bcu130torch2.10-cp312-cp312-win_amd64.whl" ^
"https://huggingface.co/drbaph/NATTEN-0.21.6-torch2100cu130-cp312-cp312-win_amd64/resolve/main/natten-0.21.6+torch2100cu130-cp312-cp312-win_amd64.whl"If your Python, PyTorch, CUDA, or GPU architecture does not match that NATTEN wheel, omit the final NATTEN URL and use naf_mode=fallback_if_missing, preload_naf=false.
For Python 3.12, PyTorch 2.8, CUDA 12.8 on Blackwell sm100/sm120, use the matching Pixal3D CUDA wheels plus the naxneri NATTEN/libnatten wheel:
venv\Scripts\python.exe -m pip install --no-deps ^
"https://github.com/PozzettiAndrea/cuda-wheels/releases/download/flex_gemm_ap-latest/flex_gemm_ap-1.0.0%2Bcu128torch2.8-cp312-cp312-win_amd64.whl" ^
"https://github.com/PozzettiAndrea/cuda-wheels/releases/download/cumesh_vb-latest/cumesh_vb-1.0%2Bcu128torch2.8-cp312-cp312-win_amd64.whl" ^
"https://github.com/PozzettiAndrea/cuda-wheels/releases/download/o_voxel_vb_ap-latest/o_voxel_vb_ap-0.0.1%2Bcu128torch2.8-cp312-cp312-win_amd64.whl" ^
"https://github.com/PozzettiAndrea/cuda-wheels/releases/download/drtk-latest/drtk-0.1.0%2Bcu128torch2.8-cp312-cp312-win_amd64.whl" ^
"https://huggingface.co/naxneri/natten-0.21.6-blackwell-cu128-cp312-cp312-win_amd64/resolve/main/natten-0.21.6-blackwell-cu128-cp312-cp312-win_amd64.whl"For PyTorch 2.9 or another CUDA 12.8 stack, change the four Pozzetti URLs to wheels built for that exact Torch version. Keep the NATTEN URL only when it matches your Python, CUDA, and GPU.
More detail: Windows wheel guide.
Pixal3D uses NAF as a feature refinement step for the shape and texture stages. NAF uses NATTEN. Strict upstream NAF only works when NATTEN includes CUDA libnatten:
python -c "import natten; print(natten.__version__, natten.HAS_LIBNATTEN)"If that prints False, you have normal NATTEN without CUDA libnatten. The node can still run, but you must use:
Pixal3D Model Loader naf_mode=fallback_if_missing
Pixal3D Model Loader preload_naf=false
Fallback mode avoids loading NAF and keeps the expected tensor shape by using DINO projection features. It is usually slower and may use more RAM/VRAM than a proper CUDA NATTEN/libnatten build, and quality can be lower than strict upstream NAF.
On Windows, a NATTEN wheel must match all of these:
Python ABI, for example cp312
PyTorch build, for example torch2.10
CUDA build, for example cu130
GPU architecture, for example sm120
OS tag, win_amd64
If you cannot find a matching Windows wheel, use fallback mode or build NATTEN from source.
Known community Windows NATTEN wheels:
| Python | PyTorch | CUDA | GPU | Wheel |
|---|---|---|---|---|
| 3.12.10 / 3.13.12 | 2.10 | 13.0 | Ampere sm86, RTX 3050-3090 Ti | NeilsMabet/Natten-0.21.6-Amphere-wheel-windows |
| 3.12 | 2.10 | 13.0 | Blackwell sm120 | drbaph/NATTEN-0.21.6-torch2100cu130-cp312-cp312-win_amd64 |
| 3.12 | 2.8+ | 12.8 | Blackwell sm100/sm120 | naxneri/natten-0.21.6-blackwell-cu128-cp312-cp312-win_amd64 |
More detail: Windows wheel guide and Build NATTEN on Windows.
If download_if_missing=false, download the model files yourself and place them in these folders. Download the full snapshots, not single random files.
| Model | Download link | Local folder | Needed when |
|---|---|---|---|
| Pixal3D | TencentARC/Pixal3D | ComfyUI/models/Pixal3D/TencentARC_Pixal3D/ |
Always |
| DINOv3 helper | camenduru/dinov3-vitl16-pretrain-lvd1689m | ComfyUI/models/Pixal3D/camenduru_dinov3-vitl16-pretrain-lvd1689m/ |
Always |
| MoGe | Comfy-Org/MoGe | ComfyUI/models/geometry_estimation/ |
camera_mode=moge |
| RMBG-2.0 | briaai/RMBG-2.0 | ComfyUI/models/Pixal3D/briaai_RMBG-2.0/ |
background_mode=auto_remove |
| NAF upsampler | valeoai/NAF | ComfyUI/models/Pixal3D/torch_hub/ cache |
Strict NAF only |
RMBG-2.0 is gated on Hugging Face. Accept the model terms and log in before downloading it. If you do not want RMBG, use a transparent PNG/WebP and set background_mode=keep_alpha, or use background_mode=none.
Expected model layout:
ComfyUI/models/
├── Pixal3D/
│ ├── TencentARC_Pixal3D/
│ │ ├── pipeline.json
│ │ └── ckpts/
│ │ ├── *.json
│ │ └── *.safetensors
│ ├── camenduru_dinov3-vitl16-pretrain-lvd1689m/
│ │ ├── config.json
│ │ ├── model.safetensors
│ │ └── preprocessor_config.json
│ └── briaai_RMBG-2.0/
│ ├── config.json
│ ├── BiRefNet_config.py
│ ├── birefnet.py
│ ├── model.safetensors
│ └── preprocessor_config.json
└── geometry_estimation/
├── moge_1_vitl_fp16.safetensors
└── moge_2_vitl_normal_fp16.safetensors
MoGe files from Comfy-Org/MoGe are stored directly in ComfyUI/models/geometry_estimation/, not in a nested Comfy-Org/MoGe folder. hf_endpoint can be changed to a Hugging Face mirror if needed.
General Windows baseline:
| Node | Setting |
|---|---|
| Pixal3D Model Loader | attention_backend=auto |
| Pixal3D Model Loader | vram_mode=dynamic_vram |
| Pixal3D Model Loader | naf_mode=fallback_if_missing unless natten.HAS_LIBNATTEN=True |
| Pixal3D Model Loader | preload_naf=false unless strict NAF works |
| Pixal3D Image To 3D | pipeline_type=1024_cascade for lower VRAM, 1536_cascade for quality |
| Pixal3D Export GLB | decimation_target=1000000, texture_size=4096 |
Lowest-VRAM/manual path:
| Node | Setting |
|---|---|
| Pixal3D Model Loader | vram_mode=hybrid_low_vram, or native_low_vram if hybrid has issues |
| Pixal3D Model Loader | load_moge=false |
| Pixal3D Model Loader | load_rembg=false |
| Pixal3D Image To 3D | camera_mode=manual |
| Pixal3D Image To 3D | background_mode=keep_alpha with transparent PNG/WebP |
| Pixal3D Camera Control | Connect manual_fov to Pixal3D Image To 3D.manual_fov |
hybrid_low_vram keeps native stage-by-stage CPU/GPU offload, but builds modules with Comfy/Aimdo-aware ops. native_low_vram keeps the older pure native staging path. Both trade speed and system RAM for lower VRAM pressure.
| Node | Purpose |
|---|---|
| Pixal3D Environment Check | Prints installed/missing dependencies |
| Pixal3D Model Loader | Loads Pixal3D and helper models |
| Pixal3D Camera Control | Manual FOV, distance, and mesh scale with Scene/POV preview |
| Pixal3D Image To 3D | Runs image-to-3D generation |
| Pixal3D Export GLB | Exports the result to textured .glb |
| Pixal3D Unload Model | Clears the Pixal3D pipeline cache and releases the model handle |
Basic workflow:
Load Image -> Pixal3D Image To 3D image
Pixal3D Model Loader -> Pixal3D Image To 3D model
Pixal3D Image To 3D -> Pixal3D Export GLB
Pixal3D Export GLB glb_path -> Preview 3D & Animation model_file
Connect Pixal3D Image To 3D rembg_image to Preview Image to inspect the image Pixal3D used after background preprocessing.
Non-square inputs are padded to square automatically before Pixal3D's square image encoder, so 9:16 or 16:9 images are not stretched. Padding happens after background handling:
auto_remove: input -> RMBG/alpha crop -> pad to square -> RGB image sent to Pixal3D
keep_alpha: transparent input -> alpha crop -> pad to square -> RGB image sent to Pixal3D
none: input -> convert to RGB -> pad to square -> RGB image sent to Pixal3D
If the input is transparent and you do not want RMBG, use background_mode=keep_alpha. background_mode=none ignores alpha by design.
For lower-poly exports, reduce Pixal3D Export GLB decimation_target. The default is 1000000; values around 5000 are allowed but can lose detail on complex geometry.
Manual camera workflow:
|
|
Load Image -> Pixal3D Camera Control image
Pixal3D Camera Control manual_fov -> Pixal3D Image To 3D manual_fov
Pixal3D Image To 3D camera_mode=manual
| Symptom | Fix |
|---|---|
No module named flash_attn |
Install a matching FlashAttention 2 wheel, or FlashAttention 3 with flash_attn_interface |
flex_gemm, cumesh, o_voxel, or drtk missing |
Install matching Pixal3D CUDA wheels for your Python/PyTorch/CUDA/OS |
natten.HAS_LIBNATTEN=False |
Use naf_mode=fallback_if_missing, preload_naf=false, or install/build CUDA NATTEN |
| Strict NAF OOM on 12 GB | Try vram_mode=hybrid_low_vram, lower naf_target_size to 256 or 128, or use naf_mode=fallback_if_missing |
| RMBG download fails | Accept gated model terms, log in, set HF_TOKEN, or use transparent input with keep_alpha |
| MoGe missing | Download Comfy-Org/MoGe files to ComfyUI/models/geometry_estimation/ or use manual camera mode |
| GLB looks fragmented | Try remesh=true; keep decimation_target=1000000 or higher |
| RAM stays high after unload | Use Pixal3D Unload Model; restart ComfyUI to return all reserved Python/PyTorch memory to the OS |
See Troubleshooting for longer explanations.
- Windows wheel guide
- Build NATTEN on Windows
- Linux/WSL CUDA guide
- Portable/standalone install
- Compatibility matrix
- Related repositories
This nodepack builds on TencentARC/Pixal3D, Trellis.2, Trellis, and Direct3D-S2.
If Pixal3D is useful in your work, please cite the upstream project:
@article{li2026pixal3d,
title={Pixal3D: Pixel-Aligned 3D Generation from Images},
author={Li, Dong-Yang and Zhao, Wang and Chen, Yuxin and Hu, Wenbo and Guo, Meng-Hao and Zhang, Fang-Lue and Shan, Ying and Hu, Shi-Min},
journal={arXiv preprint arXiv:2605.10922},
year={2026}
}

