Skip to content

Commit bb05ba4

Browse files
committed
new landing page
1 parent f0679c8 commit bb05ba4

File tree

3 files changed

+91
-71
lines changed

3 files changed

+91
-71
lines changed

README.md

Lines changed: 13 additions & 71 deletions
Original file line numberDiff line numberDiff line change
@@ -1,78 +1,20 @@
1+
<p align="center">
2+
<img src="assets/nvidia-cosmos-header.png" alt="NVIDIA Cosmos Header">
3+
</p>
14

2-
![Cosmos Logo](assets/cosmos-logo.png)
5+
## GitHub project for NVIDIA Cosmos: https://github.com/nvidia-cosmos
36

4-
--------------------------------------------------------------------------------
5-
### [Website](https://www.nvidia.com/en-us/ai/cosmos/) | [HuggingFace](https://huggingface.co/collections/nvidia/cosmos-6751e884dc10e013a0a0d8e6) | [GPU-free Preview](https://build.nvidia.com/explore/discover) | [Paper](https://arxiv.org/abs/2501.03575) | [Paper Website](https://research.nvidia.com/labs/dir/cosmos1/)
7+
NVIDIA Cosmos now includes three subprojects:
68

7-
[NVIDIA Cosmos](https://www.nvidia.com/cosmos/) is a developer-first world foundation model platform designed to help Physical AI developers build their Physical AI systems better and faster. Cosmos contains
9+
### Cosmos-Predict1: https://github.com/nvidia-cosmos/cosmos-predict1
10+
- Cosmos-Predict1 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.
811

9-
1. pre-trained models, available via [Hugging Face](https://huggingface.co/collections/nvidia/cosmos-6751e884dc10e013a0a0d8e6) under the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/) that allows commercial use of the models for free
10-
2. training scripts under the [Apache 2 License](https://www.apache.org/licenses/LICENSE-2.0), offered through [NVIDIA Nemo Framework](https://github.com/NVIDIA/NeMo) for post-training the models for various downstream Physical AI applications
12+
### Cosmos-Transfer1: https://github.com/nvidia-cosmos/cosmos-transfer1
13+
- Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environments.
1114

12-
Details of the platform is described in the [Cosmos paper](https://research.nvidia.com/publication/2025-01_cosmos-world-foundation-model-platform-physical-ai). Preview access is avaiable at [build.nvidia.com](https://build.nvidia.com).
15+
### Cosmos-Reason1: https://github.com/nvidia-cosmos/cosmos-reason1
16+
- Cosmos-Reason1 models can understand the physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning processes.
1317

14-
## Key Features
18+
-----------------------------------------------------------
1519

16-
- [Pre-trained Diffusion-based world foundation models](cosmos1/models/diffusion/README.md) for Text2World and Video2World generation where a user can generate visual simulation based on text prompts and video prompts.
17-
- [Pre-trained Autoregressive-based world foundation models](cosmos1/models/autoregressive/README.md) for Video2World generation where a user can generate visual simulation based on video prompts and optional text prompts.
18-
- [Video tokenizers](cosmos1/models/tokenizer) for tokenizing videos into continuous tokens (latent vectors) and discrete tokens (integers) efficiently and effectively.
19-
- Video curation pipeline for building your own video dataset. [Coming soon]
20-
- [Post-training scripts](cosmos1/models/POST_TRAINING.md) via NeMo Framework to post-train the pre-trained world foundation models for various Physical AI setup.
21-
- Pre-training scripts via NeMo Framework for building your own world foundation model. [[Diffusion](https://github.com/NVIDIA/NeMo/tree/main/nemo/collections/diffusion)] [[Autoregressive](https://github.com/NVIDIA/NeMo/tree/main/nemo/collections/multimodal_autoregressive)] [[Tokenizer](cosmos1/models/tokenizer/nemo/README.md)].
22-
23-
## Model Family
24-
25-
| Model name | Description | Try it out |
26-
|------------|----------|----------|
27-
| [Cosmos-1.0-Diffusion-7B-Text2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Text2World) | Text to visual world generation | [Inference](cosmos1/models/diffusion/README.md) |
28-
| [Cosmos-1.0-Diffusion-14B-Text2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Text2World) | Text to visual world generation | [Inference](cosmos1/models/diffusion/README.md) |
29-
| [Cosmos-1.0-Diffusion-7B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Video2World) | Video + Text based future visual world generation | [Inference](cosmos1/models/diffusion/README.md) |
30-
| [Cosmos-1.0-Diffusion-14B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Video2World) | Video + Text based future visual world generation | [Inference](cosmos1/models/diffusion/README.md) |
31-
| [Cosmos-1.0-Autoregressive-4B](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-4B) | Future visual world generation | [Inference](cosmos1/models/autoregressive/README.md) |
32-
| [Cosmos-1.0-Autoregressive-12B](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-12B) | Future visual world generation | [Inference](cosmos1/models/autoregressive/README.md) |
33-
| [Cosmos-1.0-Autoregressive-5B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-5B-Video2World) | Video + Text based future visual world generation | [Inference](cosmos1/models/autoregressive/README.md) |
34-
| [Cosmos-1.0-Autoregressive-13B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-13B-Video2World) | Video + Text based future visual world generation | [Inference](cosmos1/models/autoregressive/README.md) |
35-
| [Cosmos-1.0-Guardrail](https://huggingface.co/nvidia/Cosmos-1.0-Guardrail) | Guardrail contains pre-Guard and post-Guard for safe use | Embedded in model inference scripts |
36-
37-
## Example Usage
38-
39-
### Inference
40-
41-
Follow the [Cosmos Installation Guide](INSTALL.md) to setup the docker. For inference with the pretrained models, please refer to [Cosmos Diffusion Inference](cosmos1/models/diffusion/README.md) and [Cosmos Autoregressive Inference](cosmos1/models/autoregressive/README.md).
42-
43-
The code snippet below provides a gist of the inference usage.
44-
45-
```bash
46-
PROMPT="A sleek, humanoid robot stands in a vast warehouse filled with neatly stacked cardboard boxes on industrial shelves. \
47-
The robot's metallic body gleams under the bright, even lighting, highlighting its futuristic design and intricate joints. \
48-
A glowing blue light emanates from its chest, adding a touch of advanced technology. The background is dominated by rows of boxes, \
49-
suggesting a highly organized storage system. The floor is lined with wooden pallets, enhancing the industrial setting. \
50-
The camera remains static, capturing the robot's poised stance amidst the orderly environment, with a shallow depth of \
51-
field that keeps the focus on the robot while subtly blurring the background for a cinematic effect."
52-
53-
# Example using 7B model
54-
PYTHONPATH=$(pwd) python cosmos1/models/diffusion/inference/text2world.py \
55-
--checkpoint_dir checkpoints \
56-
--diffusion_transformer_dir Cosmos-1.0-Diffusion-7B-Text2World \
57-
--prompt "$PROMPT" \
58-
--offload_prompt_upsampler \
59-
--video_save_name Cosmos-1.0-Diffusion-7B-Text2World
60-
```
61-
62-
<video src="https://github.com/user-attachments/assets/db7bebfe-5314-40a6-b045-4f6ce0a87f2a">
63-
Your browser does not support the video tag.
64-
</video>
65-
66-
We also offer [multi-GPU inference](cosmos1/models/diffusion/nemo/inference/README.md) support for Diffusion Text2World WFM models through NeMo Framework.
67-
68-
### Post-training
69-
70-
NeMo Framework provides GPU accelerated post-training with general post-training for both [diffusion](cosmos1/models/diffusion/nemo/post_training/README.md) and [autoregressive](cosmos1/models/autoregressive/nemo/post_training/README.md) models, with other types of post-training coming soon.
71-
72-
## License and Contact
73-
74-
This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.
75-
76-
NVIDIA Cosmos source code is released under the [Apache 2 License](https://www.apache.org/licenses/LICENSE-2.0).
77-
78-
NVIDIA Cosmos models are released under the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license). For a custom license, please contact [cosmos-license@nvidia.com](mailto:cosmos-license@nvidia.com).
20+
This repository will be archived soon. To check out the initial release of NVIDIA Cosmos, please follow [README_CES2025.md](README_CES2025.md).

README_CES2025.md

Lines changed: 78 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,78 @@
1+
2+
![Cosmos Logo](assets/cosmos-logo.png)
3+
4+
--------------------------------------------------------------------------------
5+
### [Website](https://www.nvidia.com/en-us/ai/cosmos/) | [HuggingFace](https://huggingface.co/collections/nvidia/cosmos-6751e884dc10e013a0a0d8e6) | [GPU-free Preview](https://build.nvidia.com/explore/discover) | [Paper](https://arxiv.org/abs/2501.03575) | [Paper Website](https://research.nvidia.com/labs/dir/cosmos1/)
6+
7+
[NVIDIA Cosmos](https://www.nvidia.com/cosmos/) is a developer-first world foundation model platform designed to help Physical AI developers build their Physical AI systems better and faster. Cosmos contains
8+
9+
1. pre-trained models, available via [Hugging Face](https://huggingface.co/collections/nvidia/cosmos-6751e884dc10e013a0a0d8e6) under the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/) that allows commercial use of the models for free
10+
2. training scripts under the [Apache 2 License](https://www.apache.org/licenses/LICENSE-2.0), offered through [NVIDIA Nemo Framework](https://github.com/NVIDIA/NeMo) for post-training the models for various downstream Physical AI applications
11+
12+
Details of the platform is described in the [Cosmos paper](https://research.nvidia.com/publication/2025-01_cosmos-world-foundation-model-platform-physical-ai). Preview access is avaiable at [build.nvidia.com](https://build.nvidia.com).
13+
14+
## Key Features
15+
16+
- [Pre-trained Diffusion-based world foundation models](cosmos1/models/diffusion/README.md) for Text2World and Video2World generation where a user can generate visual simulation based on text prompts and video prompts.
17+
- [Pre-trained Autoregressive-based world foundation models](cosmos1/models/autoregressive/README.md) for Video2World generation where a user can generate visual simulation based on video prompts and optional text prompts.
18+
- [Video tokenizers](cosmos1/models/tokenizer) for tokenizing videos into continuous tokens (latent vectors) and discrete tokens (integers) efficiently and effectively.
19+
- Video curation pipeline for building your own video dataset. [Coming soon]
20+
- [Post-training scripts](cosmos1/models/POST_TRAINING.md) via NeMo Framework to post-train the pre-trained world foundation models for various Physical AI setup.
21+
- Pre-training scripts via NeMo Framework for building your own world foundation model. [[Diffusion](https://github.com/NVIDIA/NeMo/tree/main/nemo/collections/diffusion)] [[Autoregressive](https://github.com/NVIDIA/NeMo/tree/main/nemo/collections/multimodal_autoregressive)] [[Tokenizer](cosmos1/models/tokenizer/nemo/README.md)].
22+
23+
## Model Family
24+
25+
| Model name | Description | Try it out |
26+
|------------|----------|----------|
27+
| [Cosmos-1.0-Diffusion-7B-Text2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Text2World) | Text to visual world generation | [Inference](cosmos1/models/diffusion/README.md) |
28+
| [Cosmos-1.0-Diffusion-14B-Text2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Text2World) | Text to visual world generation | [Inference](cosmos1/models/diffusion/README.md) |
29+
| [Cosmos-1.0-Diffusion-7B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Video2World) | Video + Text based future visual world generation | [Inference](cosmos1/models/diffusion/README.md) |
30+
| [Cosmos-1.0-Diffusion-14B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Video2World) | Video + Text based future visual world generation | [Inference](cosmos1/models/diffusion/README.md) |
31+
| [Cosmos-1.0-Autoregressive-4B](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-4B) | Future visual world generation | [Inference](cosmos1/models/autoregressive/README.md) |
32+
| [Cosmos-1.0-Autoregressive-12B](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-12B) | Future visual world generation | [Inference](cosmos1/models/autoregressive/README.md) |
33+
| [Cosmos-1.0-Autoregressive-5B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-5B-Video2World) | Video + Text based future visual world generation | [Inference](cosmos1/models/autoregressive/README.md) |
34+
| [Cosmos-1.0-Autoregressive-13B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-13B-Video2World) | Video + Text based future visual world generation | [Inference](cosmos1/models/autoregressive/README.md) |
35+
| [Cosmos-1.0-Guardrail](https://huggingface.co/nvidia/Cosmos-1.0-Guardrail) | Guardrail contains pre-Guard and post-Guard for safe use | Embedded in model inference scripts |
36+
37+
## Example Usage
38+
39+
### Inference
40+
41+
Follow the [Cosmos Installation Guide](INSTALL.md) to setup the docker. For inference with the pretrained models, please refer to [Cosmos Diffusion Inference](cosmos1/models/diffusion/README.md) and [Cosmos Autoregressive Inference](cosmos1/models/autoregressive/README.md).
42+
43+
The code snippet below provides a gist of the inference usage.
44+
45+
```bash
46+
PROMPT="A sleek, humanoid robot stands in a vast warehouse filled with neatly stacked cardboard boxes on industrial shelves. \
47+
The robot's metallic body gleams under the bright, even lighting, highlighting its futuristic design and intricate joints. \
48+
A glowing blue light emanates from its chest, adding a touch of advanced technology. The background is dominated by rows of boxes, \
49+
suggesting a highly organized storage system. The floor is lined with wooden pallets, enhancing the industrial setting. \
50+
The camera remains static, capturing the robot's poised stance amidst the orderly environment, with a shallow depth of \
51+
field that keeps the focus on the robot while subtly blurring the background for a cinematic effect."
52+
53+
# Example using 7B model
54+
PYTHONPATH=$(pwd) python cosmos1/models/diffusion/inference/text2world.py \
55+
--checkpoint_dir checkpoints \
56+
--diffusion_transformer_dir Cosmos-1.0-Diffusion-7B-Text2World \
57+
--prompt "$PROMPT" \
58+
--offload_prompt_upsampler \
59+
--video_save_name Cosmos-1.0-Diffusion-7B-Text2World
60+
```
61+
62+
<video src="https://github.com/user-attachments/assets/db7bebfe-5314-40a6-b045-4f6ce0a87f2a">
63+
Your browser does not support the video tag.
64+
</video>
65+
66+
We also offer [multi-GPU inference](cosmos1/models/diffusion/nemo/inference/README.md) support for Diffusion Text2World WFM models through NeMo Framework.
67+
68+
### Post-training
69+
70+
NeMo Framework provides GPU accelerated post-training with general post-training for both [diffusion](cosmos1/models/diffusion/nemo/post_training/README.md) and [autoregressive](cosmos1/models/autoregressive/nemo/post_training/README.md) models, with other types of post-training coming soon.
71+
72+
## License and Contact
73+
74+
This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.
75+
76+
NVIDIA Cosmos source code is released under the [Apache 2 License](https://www.apache.org/licenses/LICENSE-2.0).
77+
78+
NVIDIA Cosmos models are released under the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license). For a custom license, please contact [cosmos-license@nvidia.com](mailto:cosmos-license@nvidia.com).

assets/nvidia-cosmos-header.png

1.3 MB
Loading

0 commit comments

Comments
 (0)