This repository was archived by the owner on Dec 1, 2024. It is now read-only.
### Data wrangling

Run the tests of the data wrangling tasks in the [fm_data_tasks](https://github.com/HazyResearch/fm_data_tasks) repo from [HazyResearch](https://github.com/HazyResearch).
Check [more details](./data_wrangle/README.md).

```
cd data_wrangle
bash install
bash test_batch_query_all_opt6.7b.sh
bash test_batch_query_all_opt30b.sh
bash test_batch_query_all_opt175b.sh
```

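Since the three batch scripts above differ only in the model size, they can be driven from a single loop. A convenience sketch, not part of the repo: the `logs/` directory, the existence check, and the `tee` logging are our additions.

```shell
# Run each model-size benchmark in turn, keeping one log file per model.
# Assumes the working directory is data_wrangle/ after `bash install`.
mkdir -p logs
for model in opt6.7b opt30b opt175b; do
  script="test_batch_query_all_${model}.sh"
  if [ -f "${script}" ]; then
    bash "${script}" 2>&1 | tee "logs/${model}.log"
  else
    echo "skipping ${script} (not found)" | tee "logs/${model}.log"
  fi
done
```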
### HELM benchmark

Run the Massive Multitask Language Understanding (MMLU) scenario.
Here we show how to use FlexGen for the data wrangling tasks, including entity matching (EM), data imputation (DI), and error detection (ED). The implementation follows the [fm_data_tasks](https://github.com/HazyResearch/fm_data_tasks) repo from [HazyResearch](https://github.com/HazyResearch).

## Install
## Examples

- To check the outcome and verify the result of a data imputation task (e.g., Restaurant on OPT-6.7B), run:

```
bash test_single_query_case.sh
```

- To test the throughput of FlexGen for a data imputation task (e.g., Restaurant on OPT-6.7B), run:

```
bash test_batch_query_case.sh
```

- To run the complete tests of all tasks on OPT-6.7B:

```
bash test_batch_query_all_opt6.7b.sh
```

- To run the complete tests of all tasks on OPT-30B:

```
bash test_batch_query_all_opt30b.sh
```

- To run the complete tests of all tasks on OPT-175B:

```
bash test_batch_query_all_opt175b.sh
```

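Each script reports its own numbers; for a quick wall-clock sanity check of any run, plain `date +%s` arithmetic is enough. The `sleep 2` below is only a stand-in for one of the scripts above.

```shell
# Rough wall-clock timing of a benchmark run.
start=$(date +%s)
sleep 2               # stand-in for, e.g., `bash test_batch_query_case.sh`
end=$(date +%s)
elapsed=$(( end - start ))
echo "elapsed: ${elapsed}s"
```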
## Benchmark Results

- Note that in these data wrangling tasks (entity matching, data imputation, and error detection), the input sequences are **very long** (from 123 to 1274 tokens) while the outputs are **very short** (e.g., 3, 5, or 10 tokens). Most of the inference time is therefore spent in the prefill phase, so we report a throughput that counts both input and output tokens.
- We run the experiments in the same setting as the HELM benchmark: a single T4 (16 GB) GPU, 200 GB of DRAM, and a 1.5 TB SSD connected via NVMe.

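To make the throughput definition above concrete, here is the arithmetic with made-up example numbers (20 samples, 674 input tokens and 10 output tokens each, finishing in 120 s; these are illustrative values, not measured results):

```shell
# Throughput = total (input + output) tokens / total time.
samples=20; input_len=674; output_len=10; time_s=120   # hypothetical values
awk -v n="$samples" -v i="$input_len" -v o="$output_len" -v t="$time_s" \
    'BEGIN { printf "%.1f token/s\n", n * (i + o) / t }'
# prints: 114.0 token/s
```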
### OPT-6.7B

| Task | Tested Samples | Input Length | Output Length | Time (s) | Input + Output Throughput (token/s) |
|------|----------------|--------------|---------------|----------|-------------------------------------|