249 commits
8236e60
fix requirements, add allow resume_download
xiegeo Jan 27, 2024
8ed51a7
reduce memory usage
xiegeo Jan 29, 2024
be00021
decrease memory usage and default rounds
xiegeo Jan 31, 2024
6283f33
change defaults to limit memory usage, fix/debug log from breaking code.
xiegeo Feb 7, 2024
b0c04f0
improve dataloader
xiegeo Feb 16, 2024
376f5ec
ignore model.pt
xiegeo Feb 19, 2024
4c4f5ad
fix missing max_size
xiegeo Feb 19, 2024
66025b8
fix change num_*_clients support, fix change pub_data_num support, ad…
xiegeo Feb 21, 2024
b65be1d
update readme
xiegeo Feb 21, 2024
f3f18a3
fix distill bug when missing image or text clients.
xiegeo Feb 22, 2024
c8d31bc
add max_size to f30k loader
xiegeo Feb 24, 2024
54a3c29
wip: federation networking code
xiegeo Feb 28, 2024
157ec66
wip: federation network
xiegeo Mar 4, 2024
bed6b78
wip: wip: federation network
xiegeo Mar 6, 2024
7dbbfd4
wip: federation network
xiegeo Mar 6, 2024
77115ff
wip: federation network
xiegeo Mar 6, 2024
01f7b75
feat: federation network
xiegeo Mar 7, 2024
fb29180
fix bugs
xiegeo Mar 7, 2024
8cdcc99
fix bugs
xiegeo Mar 7, 2024
758562c
add log
xiegeo Mar 7, 2024
3ffd60c
debugging
xiegeo Mar 7, 2024
a117caa
fix bugs
xiegeo Mar 8, 2024
9fe748d
fix bugs
xiegeo Mar 9, 2024
62926ef
add non-deterministic mode
xiegeo Mar 9, 2024
4668f38
fix: seed was always set by ClientTrainer to 2021
xiegeo Mar 9, 2024
50a249d
working on report
xiegeo Mar 16, 2024
f845117
poc report
xiegeo Mar 16, 2024
3c1db8a
fix up gitignore
xiegeo Mar 16, 2024
13208ea
fix typo
xiegeo Mar 29, 2024
a92b621
finished api
Apr 19, 2024
fce698b
finished api
Apr 19, 2024
0e3f899
bug fixed
Apr 19, 2024
abeb2f5
bug fixed
Apr 19, 2024
90604e1
bug fixed
Apr 20, 2024
06d2d51
bug fixed
Apr 22, 2024
aeaab11
bug fixed
Apr 22, 2024
4b07b0a
bug fixed
Apr 22, 2024
4f22351
bug fixed
Apr 23, 2024
a48362c
bug fixed
Apr 23, 2024
4cf7c88
bug fixed
Apr 23, 2024
68dd7bf
bug fixed
Apr 23, 2024
ded19dd
batch inference
May 10, 2024
1983f2e
batch inference
May 11, 2024
021113c
batch inference
May 11, 2024
c6b119c
batch inference
May 11, 2024
faf502a
batch inference
May 15, 2024
192f13d
feat: add support for img and mm client
xiegeo May 15, 2024
fe0dc7f
merge
May 16, 2024
fb1bf02
merge
May 20, 2024
af88248
add port parameters
May 29, 2024
5d9f8c7
clean up after merge
xiegeo May 30, 2024
011173f
clean up
xiegeo Jun 15, 2024
487ecd6
working on vqa
xiegeo Jun 15, 2024
87720bf
working on vqa
xiegeo Jun 16, 2024
a640f6f
working on vqa
xiegeo Jun 22, 2024
0620214
bug fixes
LvjianLu Jun 23, 2024
f27886e
working on vqa
xiegeo Jun 23, 2024
d64e199
clean up __pyache__
xiegeo Jun 23, 2024
86a25da
fix imports
xiegeo Jun 23, 2024
a34f358
rename datasets to custom_datasets
xiegeo Jun 23, 2024
078ddf4
rename datasets to custom_datasets part 2
xiegeo Jun 23, 2024
a134ed2
clean up __pychache__
LvjianLu Jun 23, 2024
fa4d9e6
working on vqa, change eval flag
xiegeo Jun 24, 2024
37242aa
working on vqa: training
xiegeo Jun 25, 2024
21a22c0
fix import path
xiegeo Jun 25, 2024
918d80e
working on vqa: transfrom and collate
xiegeo Jun 25, 2024
c386e9e
vqa: transform fix grayscal
xiegeo Jun 25, 2024
08d7fd1
vqa: debug mixed device
xiegeo Jun 25, 2024
3f2949e
vqa: debug mixed device
xiegeo Jun 25, 2024
0a2c258
vqa: debug mixed device
xiegeo Jun 25, 2024
f9a2ef7
vqa: fix get_text_features
xiegeo Jun 25, 2024
56976a8
vqa fix cosine_similarity input
xiegeo Jun 25, 2024
e362191
vqa: fix get_text_features
xiegeo Jun 25, 2024
252f247
vqa fix loss
xiegeo Jun 25, 2024
36e93af
vqa add optimizer
xiegeo Jun 25, 2024
8cc2f33
vqa: fix loss
xiegeo Jun 25, 2024
b739bc7
vqa: add test accuracy
xiegeo Jun 25, 2024
f5b892c
vqa fix test accuracy
xiegeo Jun 25, 2024
7037d90
vqa report less test accuracy during training
xiegeo Jun 25, 2024
5fcae3d
vqa print matches
xiegeo Jun 25, 2024
c6339cb
vqa fix print expected answers
xiegeo Jun 25, 2024
b8a8f24
change test set to validation
xiegeo Jun 26, 2024
404e094
fix increment n
xiegeo Jun 26, 2024
3c15371
fix get_matching_text
xiegeo Jun 26, 2024
d1d0877
fix get_matching_text
xiegeo Jun 26, 2024
c2c4a2d
switch to LinearFusionModelCategorical
xiegeo Jun 26, 2024
4dfaebc
faster version of set_category_from_dataset(vqa2_train)
xiegeo Jun 26, 2024
2e320c2
fix set_category_from_dataset(vqa2_train)
xiegeo Jun 26, 2024
91e1be3
add tqdm to set_category_from_dataset
xiegeo Jun 26, 2024
4fc59e7
try to make set_category_from_dataset faster
xiegeo Jun 26, 2024
7a0388c
fix collate_fn in set_category_from_dataset
xiegeo Jun 26, 2024
120f566
cache categories
xiegeo Jun 26, 2024
a089bef
only learn categories that exits in both training and validation set,…
xiegeo Jun 26, 2024
1e48767
add hidden layers to fusion model
xiegeo Jun 26, 2024
e4560ff
fix add hidden layers to fusion model
xiegeo Jun 26, 2024
9d2fa2f
add weights to cross entropy loss
xiegeo Jun 26, 2024
5d880a0
fix global category_counts
xiegeo Jun 26, 2024
aa2661e
fix divide by zero
xiegeo Jun 26, 2024
0faa2b0
add reset_category_list
xiegeo Jun 26, 2024
9bd19c0
copy category_counts from training counts
xiegeo Jun 26, 2024
08a51fb
add debugging for unknown category
xiegeo Jun 26, 2024
6d732e7
fix unknowns count
xiegeo Jun 26, 2024
da66348
fix typo
xiegeo Jun 26, 2024
83334f9
simplify set_category_from_*
xiegeo Jun 26, 2024
e44e80e
fix enumerate(train_list)
xiegeo Jun 26, 2024
14c7cb3
Revert "simplify set_category_from_*"
xiegeo Jun 26, 2024
0c53e16
improve fusion learning speed by precalculating the base model
xiegeo Jun 26, 2024
970682c
add transform to image handling
xiegeo Jun 26, 2024
b06fca1
process_retrieval_batch
xiegeo Jun 26, 2024
bf3001c
remove NoneType
xiegeo Jun 26, 2024
02e30fb
only save image_features and caption_features
xiegeo Jun 26, 2024
0bca525
add no_grad to def process_retrieval_batch(batch):
xiegeo Jun 26, 2024
a88ba60
set num_proc to 16 in map
xiegeo Jun 26, 2024
c147ed0
revert set num_proc to 16 in map
xiegeo Jun 26, 2024
832c65f
change batch_size for map
xiegeo Jun 26, 2024
7495fe6
disable vqa2_train.map
xiegeo Jun 26, 2024
8902c03
refactor validation
xiegeo Jun 26, 2024
1675427
add lr and loss_avg
xiegeo Jun 26, 2024
6e466b9
experiment with epsilon
xiegeo Jun 26, 2024
e9342b9
add unfreeze_base_model
xiegeo Jun 27, 2024
09b7c36
add regularization
xiegeo Jun 27, 2024
9beb35e
add regularization setting
xiegeo Jun 27, 2024
5b2f9d4
use_f16
xiegeo Jun 27, 2024
2c7a6b6
add dropout
xiegeo Jun 27, 2024
7d2f033
fix answer_id, optional dropout
xiegeo Jun 28, 2024
b9ae4af
fix use_f15
xiegeo Jun 28, 2024
dc90459
fix use_f16
xiegeo Jun 28, 2024
eb8118e
fix use_f16
xiegeo Jun 28, 2024
9d64799
more workers for f16
xiegeo Jun 28, 2024
3e1d18a
increase batch size
xiegeo Jun 28, 2024
00c28e6
change loss_avg
xiegeo Jun 28, 2024
a1b7600
add random transforms for data argumentation
xiegeo Jun 28, 2024
274a7da
disable use_f16
xiegeo Jun 29, 2024
99f1b4a
convert image to tensor first
xiegeo Jun 29, 2024
2891ebf
fix random_transform
xiegeo Jun 29, 2024
408955a
change batch size
xiegeo Jun 29, 2024
3d1a4d6
try caption drop and autocast
xiegeo Jun 29, 2024
57e0c5e
fix prepare_question
xiegeo Jun 29, 2024
792c95c
fix prepare_question, change replacement word
xiegeo Jun 29, 2024
c5d914d
make prepare question less distractive
xiegeo Jun 29, 2024
4fed8f5
disable prepare_question, increase batch size and number of workers a…
xiegeo Jun 29, 2024
1d88e4c
try bfloat16
xiegeo Jun 29, 2024
0bbf94b
disable autocast
xiegeo Jun 29, 2024
64201ff
try caption drop in prepare question again
xiegeo Jun 29, 2024
960a908
add AxB input type
xiegeo Jun 29, 2024
4cda6f0
fix AxB input type
xiegeo Jun 29, 2024
f5e5fb1
fix AxB input type
xiegeo Jun 29, 2024
a842640
fix forward_fusion
xiegeo Jun 29, 2024
3c8b596
fix decode InputType
xiegeo Jun 29, 2024
4909273
fix InputType
xiegeo Jun 29, 2024
aa0b2ad
fix forward_fusion AxB
xiegeo Jun 29, 2024
b3493cb
debug the shapes
xiegeo Jun 29, 2024
4e8f512
debug the shapes v2
xiegeo Jun 29, 2024
c56ed75
fix the shapes
xiegeo Jun 29, 2024
00e68c0
fix forward_fusion
xiegeo Jun 29, 2024
dd2f42b
settable batch size
xiegeo Jun 29, 2024
df8be76
add sub images
xiegeo Jun 30, 2024
955d07c
fix syntax error
xiegeo Jun 30, 2024
cda2226
fix extend returning none
xiegeo Jun 30, 2024
2f9577d
fix use of image_forward
xiegeo Jun 30, 2024
81d86d2
debug shaps
xiegeo Jun 30, 2024
5792aa8
debug shaps v2
xiegeo Jun 30, 2024
ff67837
debug shaps v3
xiegeo Jun 30, 2024
bced6af
fix shapes
xiegeo Jun 30, 2024
5811a34
fixing shapes
xiegeo Jun 30, 2024
8606f59
split question
xiegeo Jun 30, 2024
deb4145
fix input size
xiegeo Jun 30, 2024
3dee6c2
add embed feature loss
xiegeo Jun 30, 2024
e946dca
implement VQAFusionModel
xiegeo Jul 2, 2024
4f71499
fix elif
xiegeo Jul 2, 2024
2f1ff87
fix forward_fusion
xiegeo Jul 2, 2024
ba12dc0
fix vqa1
xiegeo Jul 2, 2024
ce7725c
intergrating vqa trainer
xiegeo Jul 10, 2024
b828def
vqa trainer
xiegeo Jul 10, 2024
c63a751
fix path to custom_datasets
xiegeo Jul 10, 2024
692cddd
fix self.vqa_optimizer
xiegeo Jul 10, 2024
be75ddb
comment out sub images and sub questions
xiegeo Jul 10, 2024
2a6673e
debug device
xiegeo Jul 10, 2024
3728320
fixing device
xiegeo Jul 10, 2024
c208e08
fix to_half
xiegeo Jul 10, 2024
0c91242
try disabling fp16
xiegeo Jul 10, 2024
3ff1925
fix config
xiegeo Jul 10, 2024
da32fcf
fix shap mismatch
xiegeo Jul 10, 2024
aa0bfab
fix missing input
xiegeo Jul 10, 2024
a128308
fix number of categories less than topk
xiegeo Jul 10, 2024
07ca66e
freeze_base_image_model when training for vqa
xiegeo Jul 10, 2024
193ee4f
freeze the pretrained cnn backbone
xiegeo Jul 10, 2024
5edcf98
return epsilon to common value
xiegeo Jul 10, 2024
7df3e03
use 1024 as the hidden size
xiegeo Jul 10, 2024
4b30448
fix wandb
xiegeo Jul 10, 2024
7ba8161
Revert "return epsilon to common value"
xiegeo Jul 10, 2024
92cf717
do not clip the gradient in the first epoch
xiegeo Jul 10, 2024
2673904
fix calling vqa_lr_scheduler
xiegeo Jul 10, 2024
3aeeeff
evaluate both scores for vqa
xiegeo Jul 10, 2024
7c31ac5
fix vqa_validation
xiegeo Jul 10, 2024
ed2732b
fix test_loader
xiegeo Jul 11, 2024
53d7442
move vqa traing to after distilation
xiegeo Jul 11, 2024
cda41a5
run more epochs by default
xiegeo Jul 11, 2024
c68a5af
return default values to orignial
xiegeo Jul 14, 2024
f738a9d
no vqa dropout, modify freeze base model and hidden sizes
xiegeo Jul 16, 2024
f6b6ce6
add vqa_cat_weight
xiegeo Jul 16, 2024
1b05d0d
fix args for get_weights
xiegeo Jul 16, 2024
9631cb5
fix git_weights
xiegeo Jul 16, 2024
995a7c5
fix weights_tensor to device for None
xiegeo Jul 16, 2024
9a24449
add print_model_tree for debugging
xiegeo Jul 16, 2024
acf4c17
fix train_vqa_lr
xiegeo Jul 16, 2024
7425cc2
remove train_vqa_lr
xiegeo Jul 17, 2024
34d640b
only freeze base model in the first 5 epoches, also aways use grad clip
xiegeo Jul 17, 2024
111a81c
add pretrained_model, set full_training_epoch to 10
xiegeo Jul 18, 2024
fd8188e
fix file name
xiegeo Jul 18, 2024
691c7a0
fix when pretained model is loaded
xiegeo Jul 18, 2024
f2d9d6b
add args for no_retrieval_training and vqa_full_training_epoch, save …
xiegeo Jul 18, 2024
c1b1fab
fix no_retrieval_training
xiegeo Jul 18, 2024
7327540
do not save both vqa and net models
xiegeo Jul 18, 2024
ff6cf34
add tqdm for client training
xiegeo Jul 19, 2024
d5d5356
increase number of workers
xiegeo Jul 19, 2024
06f1069
report more vqa validation scores
xiegeo Jul 20, 2024
7dd47a1
add vqa_data_size_per_epoch
xiegeo Jul 21, 2024
47816b8
fix progress bar
xiegeo Jul 21, 2024
cfc1889
make debug log more readable
xiegeo Jul 21, 2024
f3b6009
add client_init_local_epoches
xiegeo Jul 21, 2024
9d9fb0c
fix typos
xiegeo Jul 21, 2024
c41a40f
set dropout using args
xiegeo Jul 23, 2024
199556f
fix loss_avg_rate
xiegeo Jul 23, 2024
6314356
lest vqa_validation runs
xiegeo Jul 23, 2024
7deb797
commands
xiegeo Jul 25, 2024
3175611
add vqa_filter_unknown option
xiegeo Aug 15, 2024
526bd2f
resolve circular import
xiegeo Aug 15, 2024
d55280b
fix typo
xiegeo Aug 15, 2024
ccfc891
use answers instead of multiple_choice_anser for vqa training
xiegeo Aug 15, 2024
35d7fcc
bug fix
xiegeo Aug 15, 2024
5640f94
fix vqa_validation
xiegeo Aug 15, 2024
76adfe9
fix commands
xiegeo Aug 15, 2024
653b9e5
fix overwiting i
xiegeo Aug 15, 2024
9229ac0
debug validation
xiegeo Aug 15, 2024
6ae5ce6
fix vqa_validation
xiegeo Aug 15, 2024
d10f4ca
fix import
xiegeo Aug 16, 2024
4f1b50a
fix import
xiegeo Aug 16, 2024
3d0a788
fix imports
xiegeo Aug 16, 2024
226c50e
return client to orignal settings
xiegeo Aug 19, 2024
8584746
fix vqa only properties
xiegeo Aug 20, 2024
bee5979
pass in the total number of mm clients to f30k client split generator
xiegeo Aug 21, 2024
4e739ee
bug fix
xiegeo Aug 21, 2024
c09c522
fix assert for num_users
xiegeo Aug 21, 2024
b952be6
diff the two versions of EngineBase
xiegeo Aug 26, 2024
20d7ab3
try adding init to MMClientTrainer
xiegeo Aug 26, 2024
71f0ab3
rewrite how args.num_mm_clients in passed in
xiegeo Aug 26, 2024
40d2add
add parallel MMFL
xujiangyu Sep 13, 2024
b7a884f
add examples for mmfl-dist
xujiangyu Sep 14, 2024
8 changes: 8 additions & 0 deletions .gitignore
@@ -0,0 +1,8 @@
__pycache__
wandb/
*.log
*-best_model.pt
*-last_model.pt
coco_subset_idx_*
*.bbl
*.synctex.gz
52 changes: 52 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,52 @@
{
"cSpell.words": [
"adamp",
"allimages",
"batchidx",
"Bstdv",
"cifa",
"cifar",
"CIFAR",
"crossfold",
"crossfolds",
"cudnn",
"CVPR",
"dset",
"flickr",
"gpuid",
"Graphcore",
"idxs",
"imagenet",
"inferencing",
"interintra",
"Jingjing",
"keepdim",
"mmdata",
"MMFL",
"MSCOCO",
"multimodal",
"Multimodal",
"multistep",
"noniid",
"optim",
"PCME",
"pycocotools",
"Qiying",
"rsum",
"subconfig",
"svhn",
"testclasses",
"tqdm",
"trainclasses",
"trainloader",
"trainloaders",
"trainval",
"trainvalclasses",
"ujson",
"unsqueeze",
"valclasses",
"vocabs",
"wandb",
"Yimu"
]
}
98 changes: 98 additions & 0 deletions README.md
@@ -1,3 +1,83 @@
# Networking for CreamFL

## Tasks

* get code base running locally.
* figure out how to run through the entire code quickly.
  * quick test run: `python src/main.py --name quick --contrast_local_inter --contrast_local_intra --interintra_weight 0.5 --max_size 64 --pub_data_num 2 --feature_dim 2 --num_img_clients 2 --num_txt_clients 2 --num_mm_clients 3 --client_num_per_round 2 --local_epochs 2 --comm_rounds 2 --not_bert`
    * `--contrast_local_inter --contrast_local_intra --interintra_weight 0.5` Cream options.
    * `--max_size` added by xiegeo: client training data count, per client; 0 or 10000 for old behavior.
    * `--pub_data_num` public training data size (default 50000), proportional to communication cost (memory cost in local simulation).
    * `--feature_dim` number of public features (default 256), proportional to communication cost.
    * `--num_img_clients 2 --num_txt_clients 2 --num_mm_clients 3 --client_num_per_round 2` maximum number of clients of each type, and total number of clients per round.
    * `--local_epochs 2 --comm_rounds 2` local and global rounds.
    * `--not_bert` use a simpler model.

* get code to run in a network
  * see the "How to run the network" section.
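
As a rough illustration of why `--pub_data_num` and `--feature_dim` drive communication cost, the sketch below estimates the size of one set of public-data features. It assumes one float32 vector of length `feature_dim` per public sample; the function name `feature_payload_bytes` and the 4-byte element size are assumptions for illustration and are not taken from the code base.

```python
def feature_payload_bytes(pub_data_num: int, feature_dim: int, bytes_per_value: int = 4) -> int:
    """Estimate the bytes needed for one feature matrix of shape (pub_data_num, feature_dim).

    bytes_per_value=4 assumes float32 features (an assumption, not confirmed by the code).
    """
    return pub_data_num * feature_dim * bytes_per_value


default_size = feature_payload_bytes(50000, 256)  # defaults listed above
quick_size = feature_payload_bytes(2, 2)          # quick test run settings
print(f"default: {default_size / 1e6:.1f} MB, quick test: {quick_size} bytes")
# prints "default: 51.2 MB, quick test: 16 bytes"
```

Under these assumptions the quick test run shrinks each shared feature set from tens of megabytes to a few bytes, which is why it is convenient for checking the whole pipeline end to end.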

## Goals

* Learn:
  * Transformers
    * Transformer [Attention Is All You Need 2017 v7(2023)](https://arxiv.org/abs/1706.03762)
    * [An Introduction to Transformers 2023 v5(2024)](https://arxiv.org/abs/2304.10557)
  * Multimodal
    * [DeViSE: A Deep Visual-Semantic Embedding Model 2013](https://research.google.com/pubs/archive/41473.pdf)
    * PCME [Probabilistic Embeddings for Cross-Modal Retrieval 2021 v2](https://arxiv.org/abs/2101.05068) <https://github.com/naver-ai/pcme>
  * Federated Learning
    * Federated Averaging [Communication-Efficient Learning of Deep Networks from Decentralized Data 2016 v4(2023)](https://arxiv.org/abs/1602.05629)

* Implement networking
  * try FedML? (too much rewriting to use FedML properly; otherwise too hacky.)
  * try custom network? (do a quick demo version)

## How to run the network

### Configuration

* flags: the same as local version.
* fed_config: setup server and client options.

### Run

A network requires n + 2 processes: n clients, plus a command server over HTTP
and a global round computation provider.

#### Command server

```bash
python src/federation/server.py --name test
```

#### Global round computation provider

```bash
python src/federation/global.py --name test --contrast_local_inter --contrast_local_intra --interintra_weight 0.5 --max_size 64 --pub_data_num 2 --feature_dim 2 --not_bert
```

#### Clients

Replace `txt_0` with the name of the client to start.

```bash
python src/federation/client.py --name test --client_name txt_0 --max_size 64 --pub_data_num 2 --feature_dim 2 --not_bert
```
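
The three commands above can be collected into a small helper that lists all n + 2 processes for a run. This is only a sketch: the script paths and flags are copied from the commands in this section, while `build_commands`, `COMMON_FLAGS`, and `CREAM_FLAGS` are hypothetical names introduced here. It only builds and prints the command lines; it does not start anything.

```python
import shlex

# Flags copied from the example commands in this README.
COMMON_FLAGS = ["--max_size", "64", "--pub_data_num", "2", "--feature_dim", "2", "--not_bert"]
CREAM_FLAGS = ["--contrast_local_inter", "--contrast_local_intra", "--interintra_weight", "0.5"]


def build_commands(run_name, client_names):
    """Return one argv list per process: command server, global provider, then each client."""
    commands = [
        ["python", "src/federation/server.py", "--name", run_name],
        ["python", "src/federation/global.py", "--name", run_name] + CREAM_FLAGS + COMMON_FLAGS,
    ]
    for client_name in client_names:
        commands.append(
            ["python", "src/federation/client.py", "--name", run_name,
             "--client_name", client_name] + COMMON_FLAGS
        )
    return commands


if __name__ == "__main__":
    # Three clients give 3 + 2 = 5 processes.
    for argv in build_commands("test", ["txt_0", "img_0", "mm_0"]):
        print(shlex.join(argv))
```

Each printed line can be run in its own terminal, or the argv lists could be passed to `subprocess.Popen` if launching from a single script is preferred.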

### File sharing

The network has to share the learned features. This could be through a file server,
a CDN, or shared network storage, for example. Directly accessing the same files is the
easiest to implement and easily extends to shared network storage, so this is implemented
first for ease of local testing without loss of generality.
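
A minimal sketch of sharing features through a common directory is shown below. It is not the implementation in this repository: the function names, the `round_<id>/<client>.json` layout, and the use of JSON are all assumptions made for illustration. The write goes to a temporary file first and is then renamed, so a reader never sees a half-written file.

```python
import json
import os
import tempfile
from pathlib import Path


def publish_features(store, round_id, client_name, features):
    """Write one client's features for a round into the shared directory (hypothetical layout)."""
    round_dir = Path(store) / f"round_{round_id}"
    round_dir.mkdir(parents=True, exist_ok=True)
    target = round_dir / f"{client_name}.json"
    tmp = round_dir / f".{client_name}.json.tmp"
    tmp.write_text(json.dumps(features))
    os.replace(tmp, target)  # atomic rename on the same filesystem
    return target


def collect_features(store, round_id):
    """Read every client's published features for a round, keyed by client name."""
    round_dir = Path(store) / f"round_{round_id}"
    return {p.stem: json.loads(p.read_text()) for p in sorted(round_dir.glob("*.json"))}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as store:
        publish_features(store, 0, "txt_0", [[0.1, 0.2], [0.3, 0.4]])
        publish_features(store, 0, "img_0", [[0.5, 0.6], [0.7, 0.8]])
        print(sorted(collect_features(store, 0)))
```

Because both functions only take a directory path, pointing `store` at a mounted network share would work the same way as a local directory.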

## Proof of Concept

See [report/poc.pdf](report/poc.pdf)

------------------------
Begin original readme

# Multimodal Federated Learning via Contrastive Representation Ensemble

This repo contains a PyTorch implementation of the paper [Multimodal Federated Learning via Contrastive Representation Ensemble](https://arxiv.org/abs/2302.08888) (ICLR 2023).
@@ -42,6 +122,24 @@ To reproduce CreamFL with BERT and ResNet101 as server models, run the following
python src/main.py --name CreamFL --server_lr 1e-5 --agg_method con_w --contrast_local_inter --contrast_local_intra --interintra_weight 0.5
```

## Run CreamFL retrieval in parallel
[1] Run global server
```shell
bash retri_center.sh
```
[2] Run txt client
```shell
bash client_txt_retri.sh
```
[3] Run img client
```shell
bash client_img_retri.sh
```
[4] Run mm client
```shell
bash client_mm_retri.sh
```

## Citation

If you find the paper provides some insights into multimodal FL or our code useful 🤗, please consider citing:
4 changes: 4 additions & 0 deletions client_img_retri.sh
@@ -0,0 +1,4 @@
export HF_ENDPOINT=https://hf-mirror.com
export HF_DATASETS_CACHE="/shared/.cache/huggingface/datasets"

nohup python src/retri_client_img.py --name retri_client_img --server_lr 1e-5 --seed 0 --feature_dim 256 --pub_data_num 50000 --agg_method con_w --contrast_local_inter --contrast_local_intra --interintra_weight 0.5 --local_epochs 5 --client_num_per_round 1 --num_img_clients 1 --num_txt_clients 0 --num_mm_clients 0 > retri_client_img.log 2>&1 &
4 changes: 4 additions & 0 deletions client_mm_retri.sh
@@ -0,0 +1,4 @@
export HF_ENDPOINT=https://hf-mirror.com
export HF_DATASETS_CACHE="/shared/.cache/huggingface/datasets"

nohup python src/retri_client_mm.py --name retri_client_mm --server_lr 1e-5 --seed 0 --feature_dim 256 --pub_data_num 50000 --agg_method con_w --contrast_local_inter --contrast_local_intra --interintra_weight 0.5 --local_epochs 5 --client_num_per_round 1 --num_img_clients 0 --num_txt_clients 0 --num_mm_clients 1 > retri_client_mm.log 2>&1 &
4 changes: 4 additions & 0 deletions client_txt_retri.sh
@@ -0,0 +1,4 @@
export HF_ENDPOINT=https://hf-mirror.com
export HF_DATASETS_CACHE="/shared/.cache/huggingface/datasets"

nohup python src/retri_client_txt.py --name retri_client_txt --server_lr 1e-5 --seed 0 --feature_dim 256 --pub_data_num 50000 --agg_method con_w --contrast_local_inter --contrast_local_intra --interintra_weight 0.5 --local_epochs 5 --client_num_per_round 1 --num_img_clients 0 --num_txt_clients 1 --num_mm_clients 0 > retri_client_txt.log 2>&1 &
Binary file removed coco_subset_idx_file
Binary file not shown.
Binary file removed data_partition/client_cifar100_noniid.pkl
Binary file not shown.
45 changes: 45 additions & 0 deletions fed_config.yaml
@@ -0,0 +1,45 @@
wandb:
  name: "cream_api"

feature_store: "/tmp/cream_api" # the path to the feature store where client and global features are shared.

# server configuration
server:
  api_url: "http://localhost:2323/cream_api"
  min_clients: 3 # the number of required clients reporting to start global distillation.
  max_clients: 3 # the number of clients reached to start global distillation immediately.
  wait_duration: 10m # the duration to wait for clients to report before starting global distillation.


# clients configuration
clients:
  - name: "txt_0"
    data_type: txt # img, txt, or mm: the type of data the client is handling.
    local_epochs: 5
    data_partition: "client_AG_NEWS_10_nets_120000_samples_hetero_0.1.pkl"
    data_partition_index: 0 # This is only for testing purposes. In a real-world scenario, the data will not be loaded from the same dataset
  - name: "txt_1"
    data_type: txt # img, txt, or mm: the type of data the client is handling.
    local_epochs: 5
    data_partition: "client_AG_NEWS_10_nets_120000_samples_hetero_0.1.pkl"
    data_partition_index: 1 # This is only for testing purposes. In a real-world scenario, the data will not be loaded from the same dataset
  - name: "img_0"
    data_type: img # img, txt, or mm: the type of data the client is handling.
    local_epochs: 5
    data_partition_index: 0 # This is only for testing purposes. In a real-world scenario, the data will not be loaded from the same dataset
  - name: "img_1"
    data_type: img # img, txt, or mm: the type of data the client is handling.
    local_epochs: 5
    data_partition_index: 1 # This is only for testing purposes. In a real-world scenario, the data will not be loaded from the same dataset
  - name: "mm_0"
    data_type: mm
    local_epochs: 5
    data_partition_index: 0
  - name: "mm_1"
    data_type: mm
    local_epochs: 5
    data_partition_index: 1
  - name: "mm_2"
    data_type: mm
    local_epochs: 5
    data_partition_index: 2
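
One possible reading of the `min_clients`, `max_clients`, and `wait_duration` comments in the server section is sketched below. The server code is not part of this view, so the decision rule, the supported duration suffixes, and the names `parse_duration` and `should_start_distillation` are assumptions for illustration only.

```python
import re

# Seconds per unit for durations such as "10m" (assumed format: integer plus s, m, or h).
_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text):
    """Convert a duration string like "10m" into seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def should_start_distillation(reported, elapsed_seconds, min_clients, max_clients, wait_seconds):
    """Decide whether global distillation can start (assumed rule).

    Start immediately once max_clients have reported; otherwise start only after
    the wait has elapsed and at least min_clients have reported.
    """
    if reported >= max_clients:
        return True
    return elapsed_seconds >= wait_seconds and reported >= min_clients


if __name__ == "__main__":
    wait = parse_duration("10m")
    # With min_clients == max_clients == 3, as in the config, only three reports start a round.
    print(should_start_distillation(3, 0, 3, 3, wait))
    print(should_start_distillation(2, wait, 3, 3, wait))
```

With `min_clients` equal to `max_clients`, as configured here, the wait duration never triggers a round on its own under this rule; lowering `min_clients` would let a round start with fewer clients once the wait has passed.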