|
2 | 2 |
|
3 | 3 | ## Introduction |
4 | 4 |
|
5 | | -**CogKit** is an open-source initiative by Zhipu AI that provides a user-friendly interface, enabling researchers and developers to access and manipulate the Cog family of models. |
6 | | -You can check [here](docs/05-Model%20Card.md) to view support models. The project aims to streamline the application of Cog models across multimodal generation tasks such as **text-to-image (t2i)**, **text-to-video (t2v)**, **image-to-video (i2v)**. |
7 | | -It should be noted that utilization of CogKit and associated Cog models must adhere to relevant legal frameworks and ethical guidelines to ensure responsible and ethical implementation. |
| 5 | +**CogKit** is an open-source project that provides a user-friendly interface for researchers and developers to utilize ZhipuAI's [**CogView**](https://huggingface.co/collections/THUDM/cogview-67ac3f241eefad2af015669b) (image generation) and [**CogVideoX**](https://huggingface.co/collections/THUDM/cogvideo-66c08e62f1685a3ade464cce) (video generation) models. It streamlines multimodal tasks such as **text-to-image (T2I)**, **text-to-video (T2V)**, and **image-to-video (I2V)**. Users must comply with legal and ethical guidelines to ensure responsible implementation. |
| 6 | + |
| 7 | +Visit our [**Docs**](https://thudum.github.io/CogKit) to start. |
8 | 8 |
|
9 | 9 | ## Features |
10 | 10 |
|
11 | | -- Multiple models: CogVideoX, CogVideoX1.5, CogView4. |
12 | | -- Ensemble methods: (incremental) pre-training, (multimodal) instruction. |
13 | | -- Multiple precisions: 16-bit full parameter fine-tuning, frozen fine-tuning, LoRA fine-tuning. |
14 | | -- Fine-tuning methods: single machine single card, single machine multiple cards, multiple machines multiple cards. |
15 | | -- Wide range of tasks: multi-round dialogue, image generation, video generation, etc. |
16 | | -- Extreme reasoning: based on OpenAI style API, browser interface and command line interface. |
17 | | -- Embed Cache: Reduce GPU memory usage. |
| 11 | +- **Fine-tuning Methods**: Supports **LoRA** and **full-parameter fine-tuning** across various setups, including **single-machine single-GPU**, **single-machine multi-GPU**, and **multi-machine multi-GPU** configurations. |
| 12 | +- **Inference**: Provides an **OpenAI-style API** (T2I Only), a **GUI**, and a **command-line interface** for seamless model deployment. |
| 13 | +- **Embed Cache**: Optimizes GPU memory usage to enhance efficiency during inference. |
18 | 14 |
|
19 | 15 | ## Roadmap |
20 | 16 |
|
21 | 17 | - [ ] Add support for CogView4 ControlNet model |
22 | | -- [ ] Docker Image for easy deployment |
| 18 | +- [ ] Docker for easy deployment |
23 | 19 |
|
24 | 20 | ## License |
25 | 21 |
|
26 | | -This project is licensed under the Apache License, Version 2.0. See [LICENSE](LICENSE) for more details. |
| 22 | +This project is licensed under the [Apache 2.0 License](LICENSE). |