# Feature Splatting

<h4>Language-Driven Physics-Based Scene Synthesis and Editing via Feature Splatting</h4>

```{button-link} https://feature-splatting.github.io/
:color: primary
:outline:
Paper Website
```

```{button-link} https://github.com/vuer-ai/feature-splatting/
:color: primary
:outline:
Code
```

<video id="teaser" muted autoplay playsinline loop controls width="100%">
  <source id="mp4" src="https://feature-splatting.github.io/resources/basic_ns_demo_feature_only.mp4" type="video/mp4">
</video>

**Feature Splatting distills SAM-enhanced CLIP features into 3DGS for segmentation and editing**

## Installation

First install nerfstudio dependencies. Then run:

```bash
pip install git+https://github.com/vuer-ai/feature-splatting
```

## Running Feature Splatting

Details for running Feature Splatting (built with Nerfstudio!) can be found [here](https://github.com/vuer-ai/feature-splatting).
Once installed, run:

```bash
ns-train feature-splatting --help
```
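
For example, to train on a scene that has already been converted into a nerfstudio dataset with `ns-process-data` (the paths below are illustrative):

```bash
# Convert a captured video into a nerfstudio dataset (path is illustrative)
ns-process-data video --data my_video.mp4 --output-dir data/my-scene

# Train Feature Splatting on the processed scene
ns-train feature-splatting --data data/my-scene
```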

Currently, we provide the following variant:

| Method              | Description                                                    | Memory | Quality |
| ------------------- | -------------------------------------------------------------- | ------ | ------- |
| `feature-splatting` | Feature Splatting with MaskCLIP ViT-L/14@336px and MobileSAMv2 | ~8 GB  | Good    |

Note that the reference features used in this version differ from those used in the paper in two ways:

- The SAM-enhanced CLIP features are computed using MobileSAMv2, which is much faster than the original SAM but slightly less accurate.
- The CLIP features are computed only at the image level.

## Method

Feature Splatting distills CLIP features into 3DGS via view-independent rasterization, which enables both open-vocabulary 2D segmentation and open-vocabulary 3D segmentation of Gaussians directly in 3D space. This implementation supports simple editing applications by directly manipulating Gaussians.
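
To make the 3D segmentation step concrete, here is a minimal sketch of a text query against distilled per-Gaussian features; the tensors, dimensions, and threshold are illustrative placeholders rather than the repository's actual API:

```python
import torch
import torch.nn.functional as F

# Placeholder inputs: in practice the per-Gaussian features come from distillation
# and the query embedding from a CLIP text encoder (e.g., MaskCLIP ViT-L/14@336px).
num_gaussians, feat_dim = 100_000, 64
gaussian_feats = torch.randn(num_gaussians, feat_dim)  # distilled per-Gaussian features
text_embed = torch.randn(feat_dim)                     # text embedding projected to feat_dim

# Cosine similarity between every Gaussian's feature and the text query.
sim = F.cosine_similarity(gaussian_feats, text_embed.unsqueeze(0), dim=-1)

# Gaussians whose similarity clears a (hypothetical) threshold form the queried group.
selected = sim > 0.5
print(f"selected {int(selected.sum())} / {num_gaussians} Gaussians")
```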

### Reference feature computation and joint supervision

Feature Splatting computes high-quality SAM-enhanced CLIP features as reference features. Compared to coarse CLIP features (such as those used in LERF), Feature Splatting performs object-level masked average pooling of the features to refine object boundaries. While the original ECCV'24 paper uses SAM for part-level masks, this implementation uses MobileSAMv2 for much faster reference feature computation, which we hope will encourage downstream applications that require real-time performance.
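
A minimal sketch of the masked average pooling step, assuming a dense CLIP feature map and boolean object masks from MobileSAMv2; all shapes and names here are illustrative:

```python
import torch

def masked_average_pool(feat_map: torch.Tensor, masks: torch.Tensor) -> torch.Tensor:
    """Replace the features inside each object mask with that mask's mean feature.

    feat_map: (C, H, W) dense CLIP feature map
    masks:    (M, H, W) boolean object masks (e.g., from MobileSAMv2)
    returns:  (C, H, W) refined feature map with sharper object boundaries
    """
    refined = feat_map.clone()
    for mask in masks:
        if mask.any():
            mean_feat = feat_map[:, mask].mean(dim=1)   # (C,) mean over masked pixels
            refined[:, mask] = mean_feat.unsqueeze(1)   # broadcast back onto the object
    return refined

# Toy usage with random tensors standing in for real CLIP features and SAM masks.
feat_map = torch.randn(64, 32, 32)
masks = torch.rand(4, 32, 32) > 0.8
refined = masked_average_pool(feat_map, masks)
```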

In addition to SAM-enhanced features, we also found that using DINOv2 features as joint supervision helps regularize the internal structure of objects, which is consistent with findings in existing work.
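
As a sketch of what this joint supervision could look like as a loss, here is one plausible formulation; the cosine losses and the balancing weight are assumptions for illustration, not the exact objective used in the codebase:

```python
import torch.nn.functional as F

def joint_feature_loss(rendered_clip, ref_clip, rendered_dino, ref_dino, dino_weight=0.1):
    """Feature distillation loss on rendered feature maps of shape (C, H, W).

    The SAM-enhanced CLIP branch supervises open-vocabulary semantics, while the
    DINOv2 branch regularizes object-internal structure; dino_weight is a
    hypothetical balancing coefficient.
    """
    clip_loss = 1.0 - F.cosine_similarity(rendered_clip, ref_clip, dim=0).mean()
    dino_loss = 1.0 - F.cosine_similarity(rendered_dino, ref_dino, dim=0).mean()
    return clip_loss + dino_weight * dino_loss
```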

### Scene Editing

Thanks to the explicit representation of 3DGS, grouped Gaussians can be easily manipulated. While the original ECCV'24 paper proposes a series of editing primitives, this implementation supports a subset of them to avoid introducing excessive dependencies or hacks (see the sketch after the lists below):

Rigid operations
- Floor estimation (for intuitive rotation and gravity estimation)
- Translation
- Transparency (highlights the segmented object and turns background Gaussians transparent)
- Rotation (yaw only, w.r.t. the estimated ground)

Non-rigid operations
- Sand-like melting (based on the Taichi MPM method)
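
To illustrate why such edits are cheap with an explicit representation, here is a minimal sketch of a rigid translation and a transparency edit applied to a segmented group of Gaussians; the arrays and the selection mask are placeholders, not the repository's API:

```python
import torch

# Placeholder scene state: Gaussian centers and opacities, plus a boolean mask
# of the text-queried object group (e.g., from the segmentation sketch above).
means = torch.randn(100_000, 3)        # (N, 3) Gaussian centers
opacities = torch.rand(100_000)        # (N,) per-Gaussian opacity
selected = torch.rand(100_000) > 0.99  # hypothetical segmentation mask

# Translation: rigidly shift only the selected Gaussians along +x.
means[selected] += torch.tensor([0.5, 0.0, 0.0])

# Transparency: keep the segmented object and hide the background.
opacities[~selected] = 0.0
```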

<video id="editing" muted autoplay playsinline loop controls width="100%">
  <source id="editing-mp4" src="https://feature-splatting.github.io/resources/ns_editing_compressed.mp4" type="video/mp4">
</video>

If you find our work helpful for your research, please consider citing:

```none
@inproceedings{qiu-2024-featuresplatting,
    title={Language-Driven Physics-Based Scene Synthesis and Editing via Feature Splatting},
    author={Ri-Zhao Qiu and Ge Yang and Weijia Zeng and Xiaolong Wang},
    booktitle={European Conference on Computer Vision (ECCV)},
    year={2024}
}
```