Releases · hpcaitech/ColossalAI
Version v0.3.0 Release Today!
What's Changed
Nfc
- [nfc] fix typo colossalai/ applications/ (#3831) by digger yu
- [NFC] fix typo colossalai/auto_parallel nn utils etc. (#3779) by digger yu
- [NFC] fix typo colossalai/amp auto_parallel autochunk (#3756) by digger yu
- [NFC] fix typo with colossalai/auto_parallel/tensor_shard (#3742) by digger yu
- [NFC] fix typo applications/ and colossalai/ (#3735) by digger-yu
- [NFC] polish colossalai/engine/gradient_handler/init.py code style (#3329) by Ofey Chan
- [NFC] polish colossalai/context/random/init.py code style (#3327) by yuxuan-lou
- [NFC] polish colossalai/fx/tracer/_tracer_utils.py (#3323) by Michelle
- [NFC] polish colossalai/gemini/paramhooks/_param_hookmgr.py code style by Xu Kai
- [NFC] polish initializer_data.py code style (#3287) by RichardoLuo
- [NFC] polish colossalai/cli/benchmark/models.py code style (#3290) by Ziheng Qin
- [NFC] polish initializer_3d.py code style (#3279) by Kai Wang (Victor Kai)
- [NFC] polish colossalai/engine/gradient_accumulation/_gradient_accumulation.py code style (#3277) by Sze-qq
- [NFC] polish colossalai/context/parallel_context.py code style (#3276) by Arsmart1
- [NFC] polish colossalai/engine/schedule/_pipeline_schedule_v2.py code style (#3275) by Zirui Zhu
- [NFC] polish colossalai/nn/_ops/addmm.py code style (#3274) by Tong Li
- [NFC] polish colossalai/amp/init.py code style (#3272) by lucasliunju
- [NFC] polish code style (#3273) by Xuanlei Zhao
- [NFC] polish colossalai/fx/proxy.py code style (#3269) by CZYCW
- [NFC] polish code style (#3268) by Yuanchen
- [NFC] polish tensor_placement_policy.py code style (#3265) by Camille Zhong
- [NFC] polish colossalai/fx/passes/split_module.py code style (#3263) by CsRic
- [NFC] polish colossalai/global_variables.py code style (#3259) by jiangmingyan
- [NFC] polish colossalai/engine/gradient_handler/_moe_gradient_handler.py (#3260) by LuGY
- [NFC] polish colossalai/fx/profiler/experimental/profiler_module/embedding.py code style (#3256) by dayellow
Doc
- [doc] update the Gemini instruction document (#3842) by jiangmingyan
- Merge pull request #3810 from jiangmingyan/amp by jiangmingyan
- [doc] fix by jiangmingyan
- [doc] fix by jiangmingyan
- [doc] add warning about fsdp plugin (#3813) by Hongxin Liu
- [doc] add removed change of config.py by jiangmingyan
- [doc] add removed warning by jiangmingyan
- [doc] update amp document by Mingyan Jiang
- [doc] update amp document by Mingyan Jiang
- [doc] update amp document by Mingyan Jiang
- [doc] update gradient accumulation (#3771) by jiangmingyan
- [doc] update gradient clipping document (#3778) by jiangmingyan
- [doc] add deprecation warning to the doc Basics section (#3754) by Yanjia0
- [doc] add booster docstring and fix autodoc (#3789) by Hongxin Liu
- [doc] add tutorial for booster checkpoint (#3785) by Hongxin Liu
- [doc] add tutorial for booster plugins (#3758) by Hongxin Liu
- [doc] add tutorial for cluster utils (#3763) by Hongxin Liu
- [doc] update hybrid parallelism doc (#3770) by jiangmingyan
- [doc] update booster tutorials (#3718) by jiangmingyan
- [doc] fix chat spelling error (#3671) by digger-yu
- [Doc] enhancement on README.md for chat examples (#3646) by Camille Zhong
- [doc] Fix typo under colossalai and doc (#3618) by digger-yu
- [doc] .github/workflows/README.md (#3605) by digger-yu
- [doc] fix setup.py typo (#3603) by digger-yu
- [doc] fix op_builder/README.md (#3597) by digger-yu
- [doc] Update .github/workflows/README.md (#3577) by digger-yu
- [doc] Update 1D_tensor_parallel.md (#3573) by digger-yu
- [doc] Update 1D_tensor_parallel.md (#3563) by digger-yu
- [doc] Update README.md (#3549) by digger-yu
- [doc] Update README-zh-Hans.md (#3541) by digger-yu
- [doc] hide diffusion in application path (#3519) by binmakeswell
- [doc] add requirement and highlight application (#3516) by binmakeswell
- [doc] Add docs for clip args in zero optim (#3504) by YH
- [doc] updated contributor list (#3474) by Frank Lee
- [doc] polish diffusion example (#3386) by Jan Roudaut
- [doc] add Intel cooperation news (#3333) by binmakeswell
- [doc] added authors to the chat application (#3307) by Fazzie-Maqianli
Workflow
- [workflow] supported test on CUDA 10.2 (#3841) by Frank Lee
- [workflow] fixed testmon cache in build CI (#3806) by Frank Lee
- [workflow] changed doc build to be on schedule and release (#3825) by Frank Lee
- [workflow] enabled doc build from a forked repo (#3815) by Frank Lee
- [workflow] enable testing for develop & feature branch (#3801) by Frank Lee
- [workflow] fixed the docker build workflow (#3794) by Frank Lee
Booster
- [booster] add warning for torch fsdp plugin doc (#3833) by wukong1992
- [booster] torch fsdp fix ckpt (#3788) by wukong1992
- [booster] removed models that don't support fsdp (#3744) by wukong1992
- [booster] support torch fsdp plugin in booster (#3697) by wukong1992
- [booster] add tests for ddp and low level zero's checkpointio (#3715) by jiangmingyan
- [booster] fix no_sync method (#3709) by Hongxin Liu
- [booster] update prepare dataloader method for plugin (#3706) by Hongxin Liu
- [booster] refactor all dp fashion plugins (#3684) by Hongxin Liu
- [booster] gemini plugin support shard checkpoint (#3610) by jiangmingyan
- [booster] add low level zero plugin (#3594) by Hongxin Liu
- [booster] fixed the torch ddp plugin with the new checkpoint api (#3442) by Frank Lee
- [booster] implement Gemini plugin (#3352) by ver217
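Taken together, the booster entries above define the v0.3.0 training entry point: pick a plugin, prepare the dataloader through it, then boost the training objects. A minimal sketch under those assumptions (signatures inferred from the v0.3.0 booster tutorials, not verbatim; run under torchrun):

```python
import torch
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import TorchDDPPlugin  # or GeminiPlugin / LowLevelZeroPlugin / TorchFSDPPlugin

colossalai.launch_from_torch(config={})  # expects torchrun-provided env vars

plugin = TorchDDPPlugin()
booster = Booster(plugin=plugin)

model = torch.nn.Linear(32, 32)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = torch.nn.MSELoss()
dataset = torch.utils.data.TensorDataset(torch.randn(64, 32), torch.randn(64, 32))
dataloader = plugin.prepare_dataloader(dataset, batch_size=8, shuffle=True)

model, optimizer, criterion, dataloader, _ = booster.boost(
    model, optimizer, criterion=criterion, dataloader=dataloader
)

for x, y in dataloader:
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    booster.backward(loss, optimizer)  # plugin-aware backward
    optimizer.step()

booster.save_model(model, "model.pt")  # checkpoint through the plugin's checkpoint IO
```

The no_sync fix (#3709) concerns the gradient-accumulation context the booster exposes for DDP-style plugins; the loop above shows only the plain path.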
Api
- [API] add docstrings and initialization to apex amp, naive amp (#3783) by jiangmingyan
Test
- [test] fixed lazy init test import error (#3799) by Frank Lee
- Update test_ci.sh by Camille Zhong
- [test] refactor tests with spawn (#3452) by Frank Lee
- [test] reorganize zero/gem...
Version v0.2.8 Release Today!
What's Changed
Format
- [format] applied code formatting on changed files in pull request 3300 (#3302) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 3296 (#3298) by github-actions[bot]
Doc
- [doc] add ColossalChat news (#3304) by binmakeswell
- [doc] add ColossalChat (#3297) by binmakeswell
- [doc] fix typo (#3222) by binmakeswell
- [doc] update chatgpt doc paper link (#3229) by Camille Zhong
- [doc] add community contribution guide (#3153) by binmakeswell
- [doc] add Intel cooperation for biomedicine (#3108) by binmakeswell
Coati
- [coati] fix inference profanity check (#3299) by ver217
- [coati] inference supports profanity check (#3295) by ver217
- [coati] add repetition_penalty for inference (#3294) by ver217
- [coati] fix inference output (#3285) by ver217
- [Coati] first commit (#3283) by Fazzie-Maqianli
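The repetition_penalty added for inference in #3294 is the standard Hugging Face generation knob; an illustrative, generic example (gpt2 as a stand-in model, not the Coati API itself):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The quick brown fox", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=32,
    do_sample=True,
    repetition_penalty=1.2,  # values > 1.0 down-weight tokens already generated
)
print(tok.decode(out[0], skip_special_tokens=True))
```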
Colossalchat
- [ColossalChat] add citation for datasets (#3292) by Fazzie-Maqianli
Examples
- [examples] polish AutoParallel readme (#3270) by YuliangLiu0306
- [examples] solve the diffusion incompatibility issue #3169 (#3170) by NatalieC323
Fx
- [fx] meta registration compatibility (#3253) by HELSON
- [FX] refactor experimental tracer and adapt it with hf models (#3157) by YuliangLiu0306
Booster
- [booster] implemented the torch ddp + resnet example (#3232) by Frank Lee
- [booster] implemented the cluster module (#3191) by Frank Lee
- [booster] added the plugin base and torch ddp plugin (#3180) by Frank Lee
- [booster] added the accelerator implementation (#3159) by Frank Lee
- [booster] implemented mixed precision class (#3151) by Frank Lee
Ci
- [CI] Fix pre-commit workflow (#3238) by Hakjin Lee
Api
- [API] implement device mesh manager (#3221) by YuliangLiu0306
- [api] implemented the checkpoint io module (#3205) by Frank Lee
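A hedged sketch of what the checkpoint IO module (#3205) exposes; the GeneralCheckpointIO name and method signatures are assumptions based on the public colossalai.checkpoint_io package, not verbatim from this release:

```python
import torch
from colossalai.checkpoint_io import GeneralCheckpointIO  # assumed name

model = torch.nn.Linear(16, 16)
ckpt_io = GeneralCheckpointIO()

ckpt_io.save_model(model, "model.pt")   # single-file, unsharded checkpoint
ckpt_io.load_model(model, "model.pt")   # load weights back in place
```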
Hotfix
- [hotfix] skip torchaudio tracing test (#3211) by YuliangLiu0306
- [hotfix] layout converting issue (#3188) by YuliangLiu0306
Chatgpt
- [chatgpt] add precision option for colossalai (#3233) by ver217
- [chatgpt] unify datasets (#3218) by Fazzie-Maqianli
- [chatgpt] support instruct training (#3216) by Fazzie-Maqianli
- [chatgpt] add reward model code for deberta (#3199) by Yuanchen
- [chatgpt] support llama (#3070) by Fazzie-Maqianli
- [chatgpt] add supervised learning fine-tune code (#3183) by pgzhang
- [chatgpt] Reward Model Training Process update (#3133) by BlueRum
- [chatgpt] fix trainer generate kwargs (#3166) by ver217
- [chatgpt] fix ppo training hanging problem with gemini (#3162) by ver217
- [chatgpt] update ci (#3087) by BlueRum
- [chatgpt] Fix examples (#3116) by BlueRum
- [chatgpt] fix lora support for gpt (#3113) by BlueRum
- [chatgpt] fix missing kwargs type (#3107) by hiko2MSP
- [chatgpt] fix lora save bug (#3099) by BlueRum
Lazyinit
- [lazyinit] combine lazy tensor with dtensor (#3204) by ver217
- [lazyinit] add correctness verification (#3147) by ver217
- [lazyinit] refactor lazy tensor and lazy init ctx (#3131) by ver217
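A hedged sketch of the lazy-init flow these entries refactor: record the model without allocating real storage, then materialize it. The colossalai.lazy.LazyInitContext path and materialize() call are assumptions drawn from the later public API:

```python
import torch
from colossalai.lazy import LazyInitContext  # module path assumed (later API)

with LazyInitContext():
    # parameters are recorded as lazy/meta tensors; no real memory is allocated
    model = torch.nn.Sequential(torch.nn.Linear(1024, 1024), torch.nn.GELU())

model = LazyInitContext.materialize(model)  # allocate and initialize for real
```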
Analyzer
- [Analyzer] fix analyzer tests (#3197) by YuliangLiu0306
Dreambooth
- [dreambooth] fix the incompatibility in requirements.txt (#3190) by NatalieC323
Test
- [test] fixed torchrec registration in model zoo (#3177) by Frank Lee
- [test] fixed torchrec model test (#3167) by Frank Lee
- [test] add torchrec models to test model zoo (#3139) by YuliangLiu0306
- [test] added transformers models to test model zoo (#3135) by Frank Lee
- [test] added torchvision models to test model zoo (#3132) by Frank Lee
- [test] added timm models to test model zoo (#3129) by Frank Lee
Refactor
- [refactor] update docs (#3174) by Saurav Maheshkar
Tests
- [tests] model zoo add torchaudio models (#3138) by ver217
- [tests] diffuser models in model zoo (#3136) by HELSON
Docker
- [docker] Add opencontainers image-spec to Dockerfile (#3006) by Saurav Maheshkar
Dtensor
- [DTensor] refactor dtensor with new components (#3089) by YuliangLiu0306
Autochunk
- [autochunk] support complete benchmark (#3121) by Xuanlei Zhao
Tutorial
- [tutorial] update notes for TransformerEngine (#3098) by binmakeswell
Nvidia
- [NVIDIA] Add FP8 example using TE (#3080) by Kirthi Shankar Sivamani
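For context on #3080, a minimal standalone Transformer Engine FP8 sketch (requires an FP8-capable GPU and the transformer_engine package; the recipe values here are illustrative, not the example's exact settings):

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common.recipe import DelayedScaling, Format

model = te.Linear(768, 768, bias=True).cuda()
fp8_recipe = DelayedScaling(fp8_format=Format.HYBRID, amax_history_len=16)

x = torch.randn(32, 768, device="cuda")
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = model(x)  # matmuls run in FP8; weights stay in higher precision
```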
Full Changelog: v0.2.8...v0.2.7
Version v0.2.7 Release Today!
What's Changed
Chatgpt
- [chatgpt] add flag of action mask in critic (#3086) by Fazzie-Maqianli
- [chatgpt] change critic input as state (#3042) by wenjunyang
- [chatgpt] fix readme (#3025) by BlueRum
- [chatgpt] Add saving ckpt callback for PPO (#2880) by LuGY
- [chatgpt] fix inference model load (#2988) by BlueRum
- [chatgpt] allow shard init and display warning (#2986) by ver217
- [chatgpt] fix lora gemini conflict in RM training (#2984) by BlueRum
- [chatgpt] making experience support dp (#2971) by ver217
- [chatgpt] fix lora bug (#2974) by BlueRum
- [chatgpt] fix inference demo loading bug (#2969) by BlueRum
- [ChatGPT] fix README (#2966) by Fazzie-Maqianli
- [chatgpt] add inference example (#2944) by BlueRum
- [chatgpt] support opt & gpt for rm training (#2876) by BlueRum
- [chatgpt] Support saving ckpt in examples (#2846) by BlueRum
- [chatgpt] fix rm eval (#2829) by BlueRum
- [chatgpt] add test checkpoint (#2797) by ver217
- [chatgpt] update readme about checkpoint (#2792) by ver217
- [chatgpt] add prepare method to strategy (#2766) by ver217
- [chatgpt] disable shard init for colossalai (#2767) by ver217
- [chatgpt] support colossalai strategy to train rm (#2742) by BlueRum
- [chatgpt] fix train_rm bug with lora (#2741) by BlueRum
Kernel
- [kernel] added kernel loader to softmax autograd function (#3093) by Frank Lee
- [kernel] cached the op kernel and fixed version check (#2886) by Frank Lee
Analyzer
- [analyzer] a minimal implementation of static graph analyzer (#2852) by Super Daniel
Diffusers
- [diffusers] fix ci and docker (#3085) by Fazzie-Maqianli
Doc
- [doc] fixed typos in docs/README.md (#3082) by Frank Lee
- [doc] moved doc test command to bottom (#3075) by Frank Lee
- [doc] specified operating system requirement (#3019) by Frank Lee
- [doc] update nvme offload doc (#3014) by ver217
- [doc] add ISC tutorial (#2997) by binmakeswell
- [doc] add deepspeed citation and copyright (#2996) by ver217
- [doc] added reference to related works (#2994) by Frank Lee
- [doc] update news (#2983) by binmakeswell
- [doc] fix chatgpt inference typo (#2964) by binmakeswell
- [doc] add env scope (#2933) by binmakeswell
- [doc] added readme for documentation (#2935) by Frank Lee
- [doc] removed read-the-docs (#2932) by Frank Lee
- [doc] update installation for GPT (#2922) by binmakeswell
- [doc] add os scope, update tutorial install and tips (#2914) by binmakeswell
- [doc] fix GPT tutorial (#2860) by dawei-wang
- [doc] fix typo in opt inference tutorial (#2849) by Zheng Zeng
- [doc] update OPT serving (#2804) by binmakeswell
- [doc] update example and OPT serving link (#2769) by binmakeswell
- [doc] add opt service doc (#2747) by Frank Lee
- [doc] fixed a typo in GPT readme (#2736) by cloudhuang
- [doc] updated documentation version list (#2730) by Frank Lee
Autochunk
- [autochunk] support vit (#3084) by Xuanlei Zhao
- [autochunk] refactor chunk memory estimation (#2762) by Xuanlei Zhao
Dtensor
- [DTensor] implement layout converter (#3055) by YuliangLiu0306
- [DTensor] refactor CommSpec (#3034) by YuliangLiu0306
- [DTensor] refactor sharding spec (#2987) by YuliangLiu0306
- [DTensor] implementation of dtensor (#2946) by YuliangLiu0306
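To make the sharding-spec entries concrete: a hypothetical helper (plain Python, not ColossalAI API) showing how a spec that maps tensor dimensions to device-mesh axes determines each rank's local shard shape:

```python
def local_shard_shape(global_shape, mesh_shape, dim_partition):
    """dim_partition maps tensor dim -> list of mesh axes it is sharded over.

    e.g. global (8, 12) on a 2x2 mesh with {0: [0], 1: [1]} -> each rank holds (4, 6).
    """
    shape = list(global_shape)
    for tensor_dim, mesh_axes in dim_partition.items():
        for axis in mesh_axes:
            assert shape[tensor_dim] % mesh_shape[axis] == 0, "uneven shard"
            shape[tensor_dim] //= mesh_shape[axis]
    return tuple(shape)

print(local_shard_shape((8, 12), (2, 2), {0: [0], 1: [1]}))  # (4, 6)
print(local_shard_shape((8, 12), (2, 2), {0: [0, 1]}))       # (2, 12): dim 0 split over both axes
```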
Workflow
- [workflow] fixed doc build trigger condition (#3072) by Frank Lee
- [workflow] supported conda package installation in doc test (#3028) by Frank Lee
- [workflow] fixed the post-commit failure when no formatting needed (#3020) by Frank Lee
- [workflow] added auto doc test on PR (#2929) by Frank Lee
- [workflow] moved pre-commit to post-commit (#2895) by Frank Lee
Example
- [example] fix redundant note (#3065) by binmakeswell
- [example] fixed opt model downloading from huggingface by Tomek
- [example] add LoRA support (#2821) by Haofan Wang
Hotfix
- [hotfix] skip auto checkpointing tests (#3029) by YuliangLiu0306
- [hotfix] add shard dim to avoid backward communication error (#2954) by YuliangLiu0306
- [hotfix]: Remove math.prod dependency (#2837) by Jiatong (Julius) Han
- [hotfix] fix autoparallel compatibility test issues (#2754) by YuliangLiu0306
- [hotfix] fix chunk size can not be divided (#2867) by HELSON
- Hotfix/auto parallel zh doc (#2820) by YuliangLiu0306
- [hotfix] add copyright for solver and device mesh (#2803) by YuliangLiu0306
- [hotfix] add correct device for fake_param (#2796) by HELSON
Format
- [format] applied code formatting on changed files in pull request 3025 (#3026) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 2997 (#3008) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 2933 (#2939) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 2922 (#2923) by github-actions[bot]
Pipeline
- [pipeline] Add Simplified Alpa DP Partition (#2507) by Ziyue Jiang
Fx
- [fx] remove deprecated algorithms. (#2312) (#2313) by Super Daniel
Refactor
- [refactor] restructure configuration files (#2977) by Saurav Maheshkar
Autoparallel
- [autoparallel] apply repeat block to reduce solving time (#2912) by YuliangLiu0306
- [autoparallel] find repeat blocks (#2854) by YuliangLiu0306
- [autoparallel] Patch meta information for nodes that will not be handled by SPMD solver (#2823) by Boyuan Yao
- [autoparallel] Patch meta information of torch.where (#2822) by Boyuan Yao
- [autoparallel] Patch meta information of torch.tanh() and torch.nn.Dropout (#2773) by Boyuan Yao
- [autoparallel] Patch tensor related operations meta information (#2789) by Boyuan Yao
- [autoparallel] rotor solver refactor (#2813) by Boyuan Yao
- [autoparallel] Patch meta information of torch.nn.Embedding (#2760) by Boyuan Ya...
Version v0.2.6 Release Today!
What's Changed
Doc
- [doc] moved doc test command to bottom (#3075) by Frank Lee
- [doc] specified operating system requirement (#3019) by Frank Lee
- [doc] update nvme offload doc (#3014) by ver217
- [doc] add ISC tutorial (#2997) by binmakeswell
- [doc] add deepspeed citation and copyright (#2996) by ver217
- [doc] added reference to related works (#2994) by Frank Lee
- [doc] update news (#2983) by binmakeswell
- [doc] fix chatgpt inference typo (#2964) by binmakeswell
- [doc] add env scope (#2933) by binmakeswell
- [doc] added readme for documentation (#2935) by Frank Lee
- [doc] removed read-the-docs (#2932) by Frank Lee
- [doc] update installation for GPT (#2922) by binmakeswell
- [doc] add os scope, update tutorial install and tips (#2914) by binmakeswell
- [doc] fix GPT tutorial (#2860) by dawei-wang
- [doc] fix typo in opt inference tutorial (#2849) by Zheng Zeng
- [doc] update OPT serving (#2804) by binmakeswell
- [doc] update example and OPT serving link (#2769) by binmakeswell
- [doc] add opt service doc (#2747) by Frank Lee
- [doc] fixed a typo in GPT readme (#2736) by cloudhuang
- [doc] updated documentation version list (#2730) by Frank Lee
Workflow
- [workflow] fixed doc build trigger condition (#3072) by Frank Lee
- [workflow] supported conda package installation in doc test (#3028) by Frank Lee
- [workflow] fixed the post-commit failure when no formatting needed (#3020) by Frank Lee
- [workflow] added auto doc test on PR (#2929) by Frank Lee
- [workflow] moved pre-commit to post-commit (#2895) by Frank Lee
Example
- [example] fix redundant note (#3065) by binmakeswell
- [example] fixed opt model downloading from huggingface by Tomek
- [example] add LoRA support (#2821) by Haofan Wang
Autochunk
- [autochunk] refactor chunk memory estimation (#2762) by Xuanlei Zhao
Chatgpt
- [chatgpt] change critic input as state (#3042) by wenjunyang
- [chatgpt] fix readme (#3025) by BlueRum
- [chatgpt] Add saving ckpt callback for PPO (#2880) by LuGY
- [chatgpt] fix inference model load (#2988) by BlueRum
- [chatgpt] allow shard init and display warning (#2986) by ver217
- [chatgpt] fix lora gemini conflict in RM training (#2984) by BlueRum
- [chatgpt] making experience support dp (#2971) by ver217
- [chatgpt] fix lora bug (#2974) by BlueRum
- [chatgpt] fix inference demo loading bug (#2969) by BlueRum
- [ChatGPT] fix README (#2966) by Fazzie-Maqianli
- [chatgpt] add inference example (#2944) by BlueRum
- [chatgpt] support opt & gpt for rm training (#2876) by BlueRum
- [chatgpt] Support saving ckpt in examples (#2846) by BlueRum
- [chatgpt] fix rm eval (#2829) by BlueRum
- [chatgpt] add test checkpoint (#2797) by ver217
- [chatgpt] update readme about checkpoint (#2792) by ver217
- [chatgpt] add prepare method to strategy (#2766) by ver217
- [chatgpt] disable shard init for colossalai (#2767) by ver217
- [chatgpt] support colossalai strategy to train rm (#2742) by BlueRum
- [chatgpt] fix train_rm bug with lora (#2741) by BlueRum
Dtensor
- [DTensor] refactor CommSpec (#3034) by YuliangLiu0306
- [DTensor] refactor sharding spec (#2987) by YuliangLiu0306
- [DTensor] implementation of dtensor (#2946) by YuliangLiu0306
Hotfix
- [hotfix] skip auto checkpointing tests (#3029) by YuliangLiu0306
- [hotfix] add shard dim to avoid backward communication error (#2954) by YuliangLiu0306
- [hotfix]: Remove math.prod dependency (#2837) by Jiatong (Julius) Han
- [hotfix] fix autoparallel compatibility test issues (#2754) by YuliangLiu0306
- [hotfix] fix chunk size can not be divided (#2867) by HELSON
- Hotfix/auto parallel zh doc (#2820) by YuliangLiu0306
- [hotfix] add copyright for solver and device mesh (#2803) by YuliangLiu0306
- [hotfix] add correct device for fake_param (#2796) by HELSON
Format
- [format] applied code formatting on changed files in pull request 3025 (#3026) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 2997 (#3008) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 2933 (#2939) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 2922 (#2923) by github-actions[bot]
Pipeline
- [pipeline] Add Simplified Alpa DP Partition (#2507) by Ziyue Jiang
Fx
- [fx] remove deprecated algorithms. (#2312) (#2313) by Super Daniel
Refactor
- [refactor] restructure configuration files (#2977) by Saurav Maheshkar
Autoparallel
- [autoparallel] apply repeat block to reduce solving time (#2912) by YuliangLiu0306
- [autoparallel] find repeat blocks (#2854) by YuliangLiu0306
- [autoparallel] Patch meta information for nodes that will not be handled by SPMD solver (#2823) by Boyuan Yao
- [autoparallel] Patch meta information of torch.where (#2822) by Boyuan Yao
- [autoparallel] Patch meta information of torch.tanh() and torch.nn.Dropout (#2773) by Boyuan Yao
- [autoparallel] Patch tensor related operations meta information (#2789) by Boyuan Yao
- [autoparallel] rotor solver refactor (#2813) by Boyuan Yao
- [autoparallel] Patch meta information of torch.nn.Embedding (#2760) by Boyuan Yao
- [autoparallel] distinguish different parallel strategies (#2699) by YuliangLiu0306
Zero
- [zero] trivial zero optimizer refactoring (#2869) by YH
- [zero] fix wrong import (#2777) by Boyuan Yao
Nfc
- [NFC] polish colossalai/engine/schedule/_pipeline_schedule.py code style (#2744) by Michelle
- [NFC] polish code format by binmakeswell
- [NFC] polish colossala...
Version v0.2.5 Release Today!
What's Changed
Autoparallel
- [autoparallel] add shard option (#2696) by YuliangLiu0306
- [autoparallel] fix parameters sharding bug (#2716) by YuliangLiu0306
- [autoparallel] refactor runtime pass (#2644) by YuliangLiu0306
- [autoparallel] remove deprecated codes (#2664) by YuliangLiu0306
- [autoparallel] test compatibility for gemini and auto parallel (#2700) by YuliangLiu0306
Doc
- [doc] updated documentation version list (#2715) by Frank Lee
- [doc] add open-source contribution invitation (#2714) by binmakeswell
- [doc] add Quick Preview (#2706) by binmakeswell
- [doc] resize figure (#2705) by binmakeswell
- [doc] add ChatGPT (#2703) by binmakeswell
App
- [app] fix ChatGPT requirements (#2704) by binmakeswell
- [app] add chatgpt application (#2698) by ver217
Full Changelog: v0.2.5...v0.2.4
Version v0.2.4 Release Today!
What's Changed
Doc
- [doc] update auto parallel paper link (#2686) by binmakeswell
- [doc] added documentation sidebar translation (#2670) by Frank Lee
Gemini
- [gemini] fix colo_init_context (#2683) by ver217
- [gemini] add fake_release_chunk for keep-gathered chunk in the inference mode (#2671) by HELSON
Workflow
- [workflow] fixed community report ranking (#2680) by Frank Lee
- [workflow] added trigger to build doc upon release (#2678) by Frank Lee
- [workflow] added doc build test (#2675) by Frank Lee
Autoparallel
- [autoparallel] Patch meta information of torch.nn.functional.softmax and torch.nn.Softmax (#2674) by Boyuan Yao
Full Changelog: v0.2.4...v0.2.3
Version v0.2.3 Release Today!
What's Changed
Doc
- [doc] add CVPR tutorial (#2666) by binmakeswell
Docs
- [Docs] layout converting management (#2665) by YuliangLiu0306
Autoparallel
- [autoparallel] Patch meta information of torch.nn.LayerNorm (#2647) by Boyuan Yao
Full Changelog: v0.2.3...v0.2.2
Version v0.2.2 Release Today!
What's Changed
Workflow
- [workflow] fixed gpu memory check condition (#2659) by Frank Lee
- [workflow] fixed the test coverage report (#2614) by Frank Lee
- [workflow] fixed test coverage report (#2611) by Frank Lee
Example
- [example] Polish README.md (#2658) by Jiatong (Julius) Han
Doc
- [doc] fixed compatibility with docusaurus (#2657) by Frank Lee
- [doc] added docusaurus-based version control (#2656) by Frank Lee
- [doc] migrate the markdown files (#2652) by Frank Lee
- [doc] fix typo of BLOOM (#2643) by binmakeswell
- [doc] removed pre-built wheel installation from readme (#2637) by Frank Lee
- [doc] updated the sphinx theme (#2635) by Frank Lee
- [doc] fixed broken badge (#2623) by Frank Lee
Autoparallel
- [autoparallel] refactor handlers which reshape input tensors (#2615) by YuliangLiu0306
- [autoparallel] adapt autoparallel tests with latest api (#2626) by YuliangLiu0306
- [autoparallel] Patch meta information of torch.matmul (#2584) by Boyuan Yao
Tutorial
- [tutorial] added energonai to opt inference requirements (#2625) by Frank Lee
- [tutorial] add video link (#2619) by binmakeswell
Full Changelog: v0.2.2...v0.2.1
Version v0.2.1 Release Today!
What's Changed
Workflow
- [workflow] fixed broken release workflows (#2604) by Frank Lee
- [workflow] added cuda extension build test before release (#2598) by Frank Lee
- [workflow] hooked pypi release with lark (#2596) by Frank Lee
- [workflow] hooked docker release with lark (#2594) by Frank Lee
- [workflow] added test-pypi check before release (#2591) by Frank Lee
- [workflow] fixed the typo in the example check workflow (#2589) by Frank Lee
- [workflow] hook compatibility test failure to lark (#2586) by Frank Lee
- [workflow] hook example test alert with lark (#2585) by Frank Lee
- [workflow] added notification if scheduled build fails (#2574) by Frank Lee
- [workflow] added discussion stats to community report (#2572) by Frank Lee
- [workflow] refactored compatibility test workflow for maintainability (#2560) by Frank Lee
- [workflow] adjust the GPU memory threshold for scheduled unit test (#2558) by Frank Lee
- [workflow] fixed example check workflow (#2554) by Frank Lee
- [workflow] fixed typos in the leaderboard workflow (#2567) by Frank Lee
- [workflow] added contributor and user-engagement report (#2564) by Frank Lee
- [workflow] only report coverage for changed files (#2524) by Frank Lee
- [workflow] fixed the precommit CI (#2525) by Frank Lee
- [workflow] fixed changed file detection (#2515) by Frank Lee
- [workflow] fixed the skip condition of example weekly check workflow (#2481) by Frank Lee
- [workflow] automated bdist wheel build (#2459) by Frank Lee
- [workflow] automated the compatibility test (#2453) by Frank Lee
- [workflow] fixed the on-merge condition check (#2452) by Frank Lee
- [workflow] make test coverage report collapsable (#2436) by Frank Lee
- [workflow] report test coverage even if below threshold (#2431) by Frank Lee
- [workflow] auto comment with test coverage report (#2419) by Frank Lee
- [workflow] auto comment if precommit check fails (#2417) by Frank Lee
- [workflow] added translation for non-english comments (#2414) by Frank Lee
- [workflow] added precommit check for code consistency (#2401) by Frank Lee
- [workflow] refactored the example check workflow (#2411) by Frank Lee
- [workflow] added nightly release to pypi (#2403) by Frank Lee
- [workflow] added missing file change detection output (#2387) by Frank Lee
- [workflow] New version: create workflow files for examples' auto check (#2298) by ziyuhuang123
- [workflow] fixed pypi release workflow error (#2328) by Frank Lee
- [workflow] fixed pypi release workflow error (#2327) by Frank Lee
- [workflow] added workflow to release to pypi upon version change (#2320) by Frank Lee
- [workflow] removed unused assign reviewer workflow (#2318) by Frank Lee
- [workflow] rebuild cuda kernels when kernel-related files change (#2317) by Frank Lee
Doc
- [doc] updated readme for CI/CD (#2600) by Frank Lee
- [doc] fixed issue link in pr template (#2577) by Frank Lee
- [doc] updated the CHANGE_LOG.md for github release page (#2552) by Frank Lee
- [doc] fixed the typo in pr template (#2556) by Frank Lee
- [doc] added pull request template (#2550) by Frank Lee
- [doc] update example link (#2520) by binmakeswell
- [doc] update opt and tutorial links (#2509) by binmakeswell
- [doc] added documentation for CI/CD (#2420) by Frank Lee
- [doc] updated kernel-related optimisers' docstring (#2385) by Frank Lee
- [doc] updated readme regarding pypi installation (#2406) by Frank Lee
- [doc] hotfix #2377 by Jiarui Fang
- [doc] hotfix #2377 by jiaruifang
- [doc] update stable diffusion link (#2322) by binmakeswell
- [doc] update diffusion doc (#2296) by binmakeswell
- [doc] update news (#2295) by binmakeswell
- [doc] update news by binmakeswell
Setup
- [setup] fixed inconsistent version meta (#2578) by Frank Lee
- [setup] refactored setup.py for dependency graph (#2413) by Frank Lee
- [setup] support pre-build and jit-build of cuda kernels (#2374) by Frank Lee
- [setup] make cuda extension build optional (#2336) by Frank Lee
- [setup] remove torch dependency (#2333) by Frank Lee
- [setup] removed the build dependency on colossalai (#2307) by Frank Lee
Tutorial
- [tutorial] polish README (#2568) by binmakeswell
- [tutorial] update fastfold tutorial (#2565) by oahzxl
Polish
- [polish] polish ColoTensor and its submodules (#2537) by HELSON
- [polish] polish code for get_static_torch_model (#2405) by HELSON
Hotfix
- [hotfix] fix zero ddp warmup check (#2545) by ver217
- [hotfix] fix autoparallel demo (#2533) by YuliangLiu0306
- [hotfix] fix lightning error (#2529) by HELSON
- [hotfix] meta tensor default device. (#2510) by Super Daniel
- [hotfix] gpt example titans bug #2493 (#2494) by Jiarui Fang
- [hotfix] gpt example titans bug #2493 by jiaruifang
- [hotfix] add norm clearing for the overflow step (#2416) by HELSON
- [hotfix] add DISTPAN argument for benchmark (#2412) by HELSON
- [hotfix] fix gpt gemini example (#2404) by HELSON
- [hotfix] issue #2388 by Jiarui Fang
- [hotfix] issue #2388 by jiaruifang
- [hotfix] fix implementation error in diffusers by Jiarui Fang
- [hotfix] fix implementation error in diffusers by 1SAA
Autochunk
- [autochunk] add benchmark for transformer and alphafold (#2543) by oahzxl
- [autochunk] support multi outputs chunk search (#2538) by oahzxl
- [autochunk] support transformer (#2526) by oahzxl
- [autochunk] support parsing blocks (#2506) by oahzxl
- [autochunk] support autochunk on evoformer (#2497) by oahzxl
- [autochunk] support evoformer tracer (#2485) by oahzxl
- [autochunk] add autochunk feature by Jiarui Fang
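The idea behind the autochunk entries, shown with plain PyTorch (illustrative only, not the autochunk API): evaluate a memory-hungry op chunk by chunk along one dimension so peak activation memory drops while the result stays identical:

```python
import torch

def pairwise_dist_full(x):
    # materializes the full (n, n, d) difference tensor at once
    return (x[:, None, :] - x[None, :, :]).norm(dim=-1)

def pairwise_dist_chunked(x, chunk=128):
    # peak intermediate is only (chunk, n, d); same result, less memory
    rows = [(xc[:, None, :] - x[None, :, :]).norm(dim=-1) for xc in x.split(chunk, dim=0)]
    return torch.cat(rows, dim=0)

x = torch.randn(512, 64)
assert torch.allclose(pairwise_dist_full(x), pairwise_dist_chunked(x), atol=1e-6)
```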
Git
- [git] remove invalid submodule (#2540) by binmakeswell
Gemini
- [gemini] add profiler in the demo (#2534) by HELSON
- [gemini] update the gpt example (#2527) by HELSON
- [gemini] update ddp strict mode (#2518) by HELSON
- [gemini] add get static torch model (#2356) by HELSON
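A heavily hedged sketch of the Gemini wrapper these entries tune; the class path and arguments are assumptions drawn from the GPT Gemini examples of this era, not verified signatures:

```python
import torch
import colossalai
from colossalai.nn.parallel import GeminiDDP      # path assumed from 0.2.x examples
from colossalai.utils import get_current_device

colossalai.launch_from_torch(config={})           # run under torchrun
model = torch.nn.Linear(1024, 1024)
# "auto" lets Gemini move parameter/gradient chunks between GPU and CPU memory
model = GeminiDDP(model, device=get_current_device(), placement_policy="auto")
```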
Example
- [example] Add fastfold tutorial (#2528) by LuGY...
Version v0.2.0 Release Today!
What's Changed
Version
- [version] 0.1.14 -> 0.2.0 (#2286) by Jiarui Fang
Examples
- [examples] using args and combining two versions for PaLM (#2284) by ZijianYY
- [examples] replace einsum with matmul (#2210) by ZijianYY
Doc
- [doc] add feature diffusion v2, bloom, auto-parallel (#2282) by binmakeswell
- [doc] updated the stable diffusion docker usage (#2244) by Frank Lee
Zero
- [zero] polish low level zero optimizer (#2275) by HELSON
- [zero] fix error for BEiT models (#2169) by HELSON
Example
- [example] add benchmark (#2276) by Ziyue Jiang
- [example] fix save_load bug for dreambooth (#2280) by BlueRum
- [example] GPT polish readme (#2274) by Jiarui Fang
- [example] fix gpt example with 0.1.10 (#2265) by HELSON
- [example] clear diffuser image (#2262) by Fazzie-Maqianli
- [example] diffusion install from docker (#2239) by Jiarui Fang
- [example] fix benchmark.sh for gpt example (#2229) by HELSON
- [example] make palm + GeminiDDP work (#2227) by Jiarui Fang
- [example] Palm adding gemini, still has bugs (#2221) by ZijianYY
- [example] update gpt example (#2225) by HELSON
- [example] add benchmark.sh for gpt (#2226) by Jiarui Fang
- [example] update gpt benchmark (#2219) by HELSON
- [example] update GPT example benchmark results (#2212) by Jiarui Fang
- [example] update gpt example for larger model scale (#2211) by Jiarui Fang
- [example] update gpt readme with performance (#2206) by Jiarui Fang
- [example] polish doc (#2201) by ziyuhuang123
- [example] Change some training settings for diffusion (#2195) by BlueRum
- [example] support Dreambooth (#2188) by Fazzie-Maqianli
- [example] gpt demo more accuracy tflops (#2178) by Jiarui Fang
- [example] add palm pytorch version (#2172) by Jiarui Fang
- [example] update vit readme (#2155) by Jiarui Fang
- [example] add zero1, zero2 example in GPT examples (#2146) by HELSON
Hotfix
- [hotfix] fix fp16 optimizer bug (#2273) by YuliangLiu0306
- [hotfix] fix error for torch 2.0 (#2243) by xcnick
- [hotfix] Fixing the bug related to ipv6 support by Tongping Liu
- [hotfix] correct cpu_optim runtime compilation (#2197) by Jiarui Fang
- [hotfix] add kwargs for colo_addmm (#2171) by Tongping Liu
- [hotfix] Jit type hint #2161 (#2164) by アマデウス
- [hotfix] fix auto policy of test_sharded_optim_v2 (#2157) by Jiarui Fang
- [hotfix] fix aten default bug (#2158) by YuliangLiu0306
Autoparallel
- [autoparallel] fix spelling error (#2270) by YuliangLiu0306
- [autoparallel] gpt2 autoparallel examples (#2267) by YuliangLiu0306
- [autoparallel] patch torch.flatten metainfo for autoparallel (#2247) by Boyuan Yao
- [autoparallel] autoparallel initialize (#2238) by YuliangLiu0306
- [autoparallel] fix construct meta info. (#2245) by Super Daniel
- [autoparallel] record parameter attribute in colotracer (#2217) by YuliangLiu0306
- [autoparallel] Attach input, buffer and output tensor to MetaInfo class (#2162) by Boyuan Yao
- [autoparallel] new metainfoprop based on metainfo class (#2179) by Boyuan Yao
- [autoparallel] update getitem handler (#2207) by YuliangLiu0306
- [autoparallel] update_getattr_handler (#2193) by YuliangLiu0306
- [autoparallel] add gpt2 performance test code (#2194) by YuliangLiu0306
- [autoparallel] integrate_gpt_related_tests (#2134) by YuliangLiu0306
- [autoparallel] memory estimation for shape consistency (#2144) by Boyuan Yao
- [autoparallel] use metainfo in handler (#2149) by YuliangLiu0306
Gemini
- [Gemini] fix the convert_to_torch_module bug (#2269) by Jiarui Fang
Pipeline middleware
- [Pipeline Middleware] Reduce comm redundancy by getting accurate output (#2232) by Ziyue Jiang
Builder
- [builder] builder for scaled_upper_triang_masked_softmax (#2234) by Jiarui Fang
- [builder] polish builder with better base class (#2216) by Jiarui Fang
- [builder] raise Error when CUDA_HOME is not set (#2213) by Jiarui Fang
- [builder] multihead attn runtime building (#2203) by Jiarui Fang
- [builder] unified cpu_optim fused_optim interface (#2190) by Jiarui Fang
- [builder] use runtime builder for fused_optim (#2189) by Jiarui Fang
- [builder] runtime adam and fused_optim builder (#2184) by Jiarui Fang
- [builder] use builder() for cpu adam and fused optim in setup.py (#2187) by Jiarui Fang
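The pattern the builder entries describe, as an illustrative sketch (module and source paths are hypothetical): prefer a kernel pre-built at install time, and fall back to PyTorch JIT compilation on first use:

```python
def load_fused_optim_kernel():
    try:
        # pre-built at `pip install` time (module name assumed for illustration)
        from colossalai._C import fused_optim
        return fused_optim
    except ImportError:
        # fall back to JIT compilation on first use via PyTorch's cpp_extension
        from torch.utils.cpp_extension import load
        return load(
            name="fused_optim",
            sources=["csrc/fused_optim.cpp", "csrc/fused_optim_cuda.cu"],  # hypothetical paths
            extra_cuda_cflags=["-O3"],
            verbose=True,
        )
```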
Logger
- [logger] hotfix, missing _FORMAT (#2231) by Super Daniel
NFC
- [NFC] fix some typos (#2175) by ziyuhuang123
- [NFC] update news link (#2191) by binmakeswell
- [NFC] fix a typo 'stable-diffusion-typo-fine-tune' by Arsmart1
Example
- [example] diffuser, support quant inference for stable diffusion (#2186) by BlueRum
- [example] add vit missing functions (#2154) by Jiarui Fang
Pipeline middleware
- [Pipeline Middleware] Fix deadlock when num_microbatch=num_stage (#2156) by Ziyue Jiang
Full Changelog: v0.2.0...v0.1.13