
Commit 6c77619

[docs] Pre-release
1 parent b40581f commit 6c77619

9 files changed: +31 / -140 lines changed


docs/01-Intro.md

Lines changed: 1 addition & 9 deletions
@@ -5,6 +5,7 @@ slug: /
 # Introduction
 Cogkit is a powerful framework for working with cognitive AI models, focusing on multimodal generation and fine-tuning capabilities. It provides a unified interface for various AI tasks including text-to-image, text-to-video, and image-to-video generation.
 
+<!-- TODO: key features? -->
 ## Key Features
 
 - **Command-line Interface**: Easy-to-use CLI and Python API for both inference and fine-tuning
@@ -19,12 +20,3 @@ Please refer to the [Model Card](./05-Model%20Card.mdx) for more details.
 
 <!-- FIXME: add link to the issues pages -->
 For more detailed troubleshooting (bug related issues), please refer to our GitHub issues page.
-
-<!-- FIXME: add link to wechat? discord? or github discussions? -->
-For general discussions and support, please join our [Discord server](https://discord.gg/cogmodels).
-
-## License
-
-<!-- FIXME: LICENSE file is not present in the repo -->
-
-<!-- Cogkit is licensed under the [MIT License](./LICENSE). -->

docs/02-Installation.md

Lines changed: 6 additions & 11 deletions
@@ -32,20 +32,14 @@ Please refer to the [PyTorch installation guide](https://pytorch.org/get-started
 ### Install Cogkit
 <!-- FIXME: Install via pip install cogkit or via clone&local install? -->
 
-1. Clone the repository:
+1. Install Cogkit:
 
-<!-- FIXME: add link to the repo -->
+<!-- TODO: add github link -->
 ```bash
-git clone https://github.com/yourusername/cogkit.git
+pip install cogkit@git+https:
 ```
 
-2. Install Cogkit:
-
-```bash
-pip install -e .
-```
-
-3. Optional: for video tasks (e.g. text-to-video), install additional dependencies:
+2. Optional: for video tasks (e.g. text-to-video), install additional dependencies:
 
 ```bash
 pip install -e .[video]
@@ -57,9 +51,10 @@ Please refer to the [PyTorch installation guide](https://pytorch.org/get-started
 You can verify that cogkit is installed correctly by running:
 
 ```bash
-python -c "import cogkit; print(cogkit.__version__)"
+python -c "import cogkit"
 ```
 
+<!-- TODO: add in roadmap -->
 ## [Optional] Install via docker
 
 If you have any issues with the installation, you can install Cogkit via Docker. We provide a Docker image that includes all dependencies. You can pull the image from Docker Hub:

docs/03-Inference/01-CLI.md

Lines changed: 5 additions & 34 deletions
@@ -34,15 +34,15 @@ cogmodels inference [OPTIONS] PROMPT MODEL_ID_OR_PATH
 
 ### Examples
 
+<!-- FIXME: Add example for i2v -->
+
 ```bash
 # Generate an image from text
-cogmodels inference "a beautiful sunset over mountains" runwayml/stable-diffusion-v1-5 --task t2i
+cogmodels inference "a beautiful sunset over mountains" "THUDM/CogView4-6B"
 
 # Generate a video from text
-cogmodels inference "a cat playing with a ball" stabilityai/stable-video-diffusion-img2vid --task t2v
+cogmodels inference "a cat playing with a ball" "THUDM/CogVideoX1.5-5B"
 
-# Generate a video from an image
-cogmodels inference "extend this image into a video" stabilityai/stable-video-diffusion-img2vid --task i2v --image_file input.png
 ```
 
 <!-- FIXME: remove this? -->
@@ -56,6 +56,7 @@ cogmodels finetune [OPTIONS]
 
 > Note: The fine-tuning command is currently under development. Please check back for updates.
 
+<!-- TODO: add docs for launch server -->
 ## Launch Command
 
 The `launch` command starts a web UI for interactive use:
@@ -84,34 +85,4 @@ This launches a web interface where you can:
 # Launch the web UI on the default port
 cogmodels launch
 
-# Launch the web UI with a public URL
-cogmodels launch --share
-```
-
-## Logging and Debugging
-
-CogModels CLI provides different verbosity levels for logging:
-
-```bash
-# Normal output
-cogmodels inference "prompt" model_id
-
-# Verbose output (info level)
-cogmodels -v inference "prompt" model_id
-
-# Very verbose output (debug level)
-cogmodels -vv inference "prompt" model_id
-```
-
-## Environment Variables
-
-The CLI behavior can be modified with environment variables:
-
-- `COGMODELS_CACHE_DIR`: Directory to store cached models and data
-- `COGMODELS_OFFLINE`: Set to "1" to run in offline mode
-- `COGMODELS_VERBOSE`: Set verbosity level (0-2)
-
-Example:
-```bash
-COGMODELS_CACHE_DIR=/path/to/cache cogmodels inference "prompt" model_id
 ```

docs/03-Inference/02-API.md

Lines changed: 8 additions & 6 deletions
@@ -15,24 +15,26 @@ from cogkit.generation import generate_image, generate_video
 # Text-to-Image generation
 image = generate_image(
     prompt="a beautiful sunset over mountains",
-    model_id_or_path="runwayml/stable-diffusion-v1-5",
-    num_inference_steps=50,
-    seed=42
+    model_id_or_path="THUDM/CogView4-6B",
+    lora_model_id_or_path=None,
+    transformer_path=None,
 )
 image.save("sunset.png")
 
 # Text-to-Video generation
 video = generate_video(
     prompt="a cat playing with a ball",
-    model_id_or_path="stabilityai/stable-video-diffusion-img2vid",
+    model_id_or_path="THUDM/CogVideoX1.5-5B",
+    lora_model_id_or_path=None,
+    transformer_path=None,
     num_frames=81,
     fps=16,
-    num_inference_steps=50,
-    seed=42
 )
 video.save("cat_video.mp4")
 ```
 
 ## API Server
 
 <!-- FIXME: add docs for the API server -->
+
+<!-- TODO: add examples -->
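
The hunk above adds `lora_model_id_or_path` and `transformer_path` parameters to the generation calls without showing how they are used. As a rough illustration only (not part of the commit; the checkpoint path is hypothetical, and the parameters are assumed to accept local paths as described in the fine-tuning quick start):

```python
# Illustrative sketch: load fine-tuned weights through the parameters shown above.
# "path/to/lora_checkpoint" is a hypothetical local directory produced by LoRA fine-tuning.
from cogkit.generation import generate_image

image = generate_image(
    prompt="a beautiful sunset over mountains",
    model_id_or_path="THUDM/CogView4-6B",
    lora_model_id_or_path="path/to/lora_checkpoint",  # defaults to None (base weights only)
)
image.save("sunset_lora.png")
```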

docs/04-Finetune/01-Prerequisites.mdx

Lines changed: 0 additions & 1 deletion
@@ -129,7 +129,6 @@ Before starting fine-tuning, please ensure your machine meets the minimum hardwa
 
 ## CogView Series
 
-{/* <!-- TODO: add table for Cogview Series --> */}
 <table style={{ textAlign: "center" }}>
 <thead>
 <tr>

docs/04-Finetune/02-Quick Start.md

Lines changed: 11 additions & 8 deletions
@@ -4,14 +4,12 @@
 
 Please refer to the [installation guide](../02-Installation.md) to setup your environment
 
-<!-- TODO: clone the repo to finetune? -->
+<!-- TODO: clone the repo to finetune? clone -->
 
 ## Data
 
 Before fine-tuning, you need to prepare your dataset according to the expected format. See the [data format](./03-Data%20Format.md) documentation for details on how to structure your data
 
-<!-- TODO: add link to data format-->
-
 ## Training
 
 :::info
@@ -39,12 +37,17 @@ We recommend that you read the corresponding [model card](../05-Model%20Card.mdx
 
 ## Load Fine-tuned Model
 
-<!-- TODO: A script for merging ZeRO weights is missing (after merging there is only a single transformer weight; should users replace this weight in the pipeline files themselves, or should the CLI/API directly accept a transformer weight path?) -->
-
 ### LoRA
 
-After fine-tuning with LoRA, you can load your trained weights during inference using the `--lora_model_id_or_path` parameter. For more details, please refer to the inference guide.
-
+After fine-tuning with LoRA, you can load your trained weights during inference using the `--lora_model_id_or_path` option or parameter. For more details, please refer to the inference guide.
 
 ### ZeRO
+
+After fine-tuning with ZeRO strategy, you need to use the `zero_to_fp32.py` script provided in the `scripts` directory to convert the ZeRO checkpoint weights into Diffusers format. For example:
+
+<!-- FIXME: path to zero2diffusers.py? -->
+```bash
+python zero2diffusers.py checkpoint_dir/ output_dir/ --bfloat16
+```
+
+During inference, pass the `output_dir/` to the `--transformer_path` option or parameter. For more details, please refer to the inference guide.
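
To illustrate the loading step described in this hunk, a minimal sketch of the inference side (not part of the commit), assuming the `generate_image` parameters documented in the inference API doc; `output_dir/` is the converted ZeRO output from the command above, and `path/to/lora_checkpoint` is a hypothetical LoRA checkpoint directory:

```python
# Illustrative sketch: pass fine-tuned weights to the generation API
# described in docs/03-Inference/02-API.md.
from cogkit.generation import generate_image

# LoRA fine-tuning: point lora_model_id_or_path at the trained checkpoint (hypothetical path).
image_lora = generate_image(
    prompt="a beautiful sunset over mountains",
    model_id_or_path="THUDM/CogView4-6B",
    lora_model_id_or_path="path/to/lora_checkpoint",
)

# ZeRO fine-tuning: point transformer_path at the converted output_dir/ produced above.
image_zero = generate_image(
    prompt="a beautiful sunset over mountains",
    model_id_or_path="THUDM/CogView4-6B",
    transformer_path="output_dir/",
)
image_zero.save("finetuned_sample.png")
```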

src/cogmodels/finetune/data/README.md

Lines changed: 0 additions & 71 deletions
This file was deleted.

src/cogmodels/finetune/diffusion/models/cogvideo/README.md

Whitespace-only changes.

src/cogmodels/finetune/diffusion/models/cogview/README.md

Whitespace-only changes.
