README.md: 2 additions & 2 deletions
@@ -2,9 +2,9 @@
## Introduction
-**CogKit** is an open-source project that provides a user-friendly interface for researchers and developers to utilize ZhipuAI's [**CogView**](https://huggingface.co/collections/THUDM/cogview-67ac3f241eefad2af015669b) (image generation) and [**CogVideoX**](https://huggingface.co/collections/THUDM/cogvideo-66c08e62f1685a3ade464cce) (video generation) models. It streamlines multimodal tasks such as **text-to-image (T2I)**, **text-to-video (T2V)**, and **image-to-video (I2V)**. Users must comply with legal and ethical guidelines to ensure responsible implementation.
+**`cogkit`** is an open-source project that provides a user-friendly interface for researchers and developers to utilize ZhipuAI's [**CogView**](https://huggingface.co/collections/THUDM/cogview-67ac3f241eefad2af015669b) (image generation) and [**CogVideoX**](https://huggingface.co/collections/THUDM/cogvideo-66c08e62f1685a3ade464cce) (video generation) models. It streamlines multimodal tasks such as **text-to-image (T2I)**, **text-to-video (T2V)**, and **image-to-video (I2V)**. Users must comply with legal and ethical guidelines to ensure responsible implementation.

-Visit our [**Docs**](https://thudum.github.io/CogKit) to start.
+Visit our [**Docs**](https://thudm.github.io/CogKit) to start.
docs/01-Intro.md: 1 addition & 7 deletions
@@ -4,13 +4,7 @@ slug: /
# Introduction
-CogKit is a powerful framework for working with ZhipuAI Cog Series models, focusing on multimodal generation and fine-tuning capabilities.
-It provides a unified interface for various AI tasks including text-to-image, text-to-video, and image-to-video generation.
-
-## Key Features
-
-- **Command-line Interface**: Easy-to-use CLI and Python API for both inference and fine-tuning
-- **Fine-tuning Support**: With LoRA or full model fine-tuning support to customize models with your own data
+`cogkit` is a powerful framework for working with cognitive AI models, focusing on multi-modal generation and fine-tuning capabilities. It provides a unified interface for various AI tasks including text-to-image, text-to-video, and image-to-video generation.
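For readers new to the project, the "unified interface" mentioned above amounts to a small set of generation entry points. The sketch below is a minimal illustration built on the `cogkit.generation` imports shown later in this diff (`generate_image`, `generate_video`); the keyword names and the placeholder model ids are assumptions, not confirmed signatures.

```python
# Minimal sketch of the unified interface, assuming keyword names such as
# "prompt" and "model"; check cogkit.generation for the actual signatures.
from cogkit.generation import generate_image, generate_video

# Text-to-image (T2I); the save() call on the result is an assumption.
image = generate_image(prompt="A serene lake at sunset", model="<image-model-id-or-path>")
image.save("lake.png")

# Text-to-video (T2V); mirrors the example in docs/03-Inference/02-API.md.
video = generate_video(prompt="A cat playing with a ball of yarn", model="<video-model-id-or-path>")
video.save("cat_video.mp4")
```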
docs/02-Installation.md: 9 additions & 18 deletions
@@ -3,37 +3,29 @@
# Installation
-CogKit can be installed using pip. We recommend using a virtual environment to avoid conflicts with other packages.
+`cogkit` can be installed using pip. We recommend using a virtual environment to avoid conflicts with other packages.
## Requirements
- Python 3.10 or higher
-- CUDA-compatible GPU (for optimal performance)
-- At least 8GB of GPU memory for inference, 16GB+ recommended for fine-tuning
+- OpenCV and PyTorch
## Installation Steps
-### Create a virtual environment (recommended)
+### OpenCV

-```bash
-# Using venv
-python -m venv cogkit-env
-source cogkit-env/bin/activate
-
-# Or using conda
-conda create -n cogkit-env python=3.10
-conda activate cogkit-env
-```
+Please refer to the [opencv-python installation guide](https://github.com/opencv/opencv-python?tab=readme-ov-file#installation-and-usage) for instructions on installing OpenCV according to your system.

-### Install PyTorch
+### PyTorch
Please refer to the [PyTorch installation guide](https://pytorch.org/get-started/locally/) for instructions on installing PyTorch according to your system.
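Since the requirements now list OpenCV and PyTorch as separate, system-dependent installs, a quick sanity check after following both guides can save debugging time later. The snippet below uses only standard `cv2` and `torch` calls and does not depend on `cogkit`; the CUDA check simply reports whether a GPU is visible.

```python
# Post-install sanity check: verifies that OpenCV and PyTorch import cleanly
# and reports whether a CUDA device is visible to PyTorch.
import cv2
import torch

print("OpenCV version:", cv2.__version__)
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```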
docs/03-Inference/01-CLI.md: 16 additions & 39 deletions
@@ -4,7 +4,7 @@
<!-- TODO: check this doc -->
# Command-Line Interface
-CogKit provides a powerful command-line interface (CLI) that allows you to perform various tasks without writing Python code. This guide covers the available commands and their usage.
+`cogkit` provides a powerful command-line interface (CLI) that allows you to perform various tasks without writing Python code. This guide covers the available commands and their usage.
docs/03-Inference/02-API.md: 6 additions & 2 deletions
@@ -3,11 +3,11 @@
# API
-CogKit provides a powerful inference API for generating images and videos using various AI models. This document covers both the Python API and API server.
+`cogkit` provides a powerful inference API for generating images and videos using various AI models. This document covers both the Python API and API server.
## Python API
-You can also use Cogkit programmatically in your Python code:
+You can also use `cogkit` programmatically in your Python code:
```python
from cogkit.generation import generate_image, generate_video
@@ -32,6 +32,10 @@ video = generate_video(
)
video.save("cat_video.mp4")
```
+<!-- TODO: add examples for i2v -->
+
+<!-- FIXME: correct url -->
+See function signatures in [generation.py](...) for more details.
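The TODO above asks for an image-to-video example. As a stopgap illustration only: the sketch below assumes `generate_video` accepts a conditioning image via an `image` keyword, which is an assumption about the signature rather than something confirmed by this diff.

```python
# Hypothetical image-to-video (I2V) sketch. The "image" keyword and the
# PIL-based input are assumptions; verify against generate_video's real
# signature in cogkit.generation before relying on it.
from PIL import Image
from cogkit.generation import generate_video

first_frame = Image.open("cat.png")
video = generate_video(
    prompt="The cat starts chasing a ball of yarn",
    image=first_frame,                 # assumed parameter name for the input image
    model="<video-model-id-or-path>",  # placeholder; assumed keyword
)
video.save("cat_video.mp4")
```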
docs/04-Finetune/03-Data Format.md: 5 additions & 5 deletions
@@ -7,6 +7,7 @@
`src/cogkit/finetune/data` directory contains various dataset templates for fine-tuning different models, please refer to the corresponding dataset template based on your task type:
## Text-to-Image Conversion Dataset (t2i)
+
- Each directory contains a set of image files (`.png`)
- The `metadata.jsonl` file contains text descriptions for each image
@@ -34,12 +35,11 @@
```
:::info
-- Image files are optional; if not provided, the system will default to using the first frame of the video as the input image
-- When image files are provided, they are associated with the video file of the same name through the id field
+- Image files are optional; if not provided, the system will default to using the first frame of the video as the input image
+- When image files are provided, they are associated with the video file of the same name through the id field
:::
## Notes
-- Training sets (`train/`) are used for model training
-- Test sets (`test/`) are used for evaluating model performance
-- Each dataset will generate a `.cache/` directory during training, used to store preprocessed cache data. If the dataset changes, you need to **manually delete this directory** and retrain.
+- Training sets (`train/`) are used for model training, test sets (`test/`) are used for evaluating model performance
+- Each dataset will generate a `.cache/` directory during training, used to store preprocessed data. If the dataset changes, you need to **manually delete this directory** and retrain.
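Before kicking off a fine-tuning run, it can help to eyeball the dataset metadata directly. The snippet below iterates over a `train/metadata.jsonl` file and prints each record's keys; the dataset path is hypothetical, and the exact fields per record (an id, a caption, and so on) depend on the task-specific templates under `src/cogkit/finetune/data`.

```python
# Inspect a fine-tuning dataset's metadata.jsonl before training.
# The dataset path is hypothetical; field names per record depend on the
# task template (t2i, t2v, i2v) under src/cogkit/finetune/data.
import json
from pathlib import Path

dataset_dir = Path("path/to/your-dataset/train")
with open(dataset_dir / "metadata.jsonl", encoding="utf-8") as f:
    for line_no, line in enumerate(f, start=1):
        record = json.loads(line)
        print(f"record {line_no}: keys={sorted(record)}")
```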