Commit d3b604a
update llm government benchmark implementation
Signed-off-by: IcyFeather <mengzhuo.happy@gmail.com>
1 parent 67ecaa7 commit d3b604a
994 files changed
Lines changed: 0 additions & 44460 deletions
File tree
- core/opencompass
  - datasets
    - ARC_c
    - ARC_e
    - CHARM
      - few-shot-examples_Translate-EN
      - few-shot-examples
    - CIBench
    - CLUE_C3
    - CLUE_CMRC
    - CLUE_DRCD
    - CLUE_afqmc
    - CLUE_cmnli
    - CLUE_ocnli
    - ChemBench
    - FewCLUE_bustm
    - FewCLUE_chid
    - FewCLUE_cluewsc
    - FewCLUE_csl
    - FewCLUE_eprstmt
    - FewCLUE_ocnli_fc
    - FewCLUE_tnews
    - FinanceIQ
    - GLUE_CoLA
    - GLUE_MRPC
    - GLUE_QQP
    - GaokaoBench
    - IFEval
    - MMLUArabic
    - MathBench
    - MedBench
    - NPHardEval
    - OpenFinData
    - PJExam
    - QuALITY
    - SVAMP
    - SuperGLUE_AX_b
    - SuperGLUE_AX_g
    - SuperGLUE_BoolQ
    - SuperGLUE_CB
    - SuperGLUE_COPA
    - SuperGLUE_MultiRC
    - SuperGLUE_RTE
    - SuperGLUE_ReCoRD
    - SuperGLUE_WSC
    - SuperGLUE_WiC
    - TabMWP
    - TheoremQA
    - XCOPA
    - XLSum
    - Xsum
    - adv_glue
      - adv_glue_mnli_mm
      - adv_glue_mnli
      - adv_glue_qnli
      - adv_glue_qqp
      - adv_glue_rte
      - adv_glue_sst2
    - agieval
    - anli
    - anthropics_evals
    - apps
    - bbh
      - lib_prompt
    - ceval
    - civilcomments
    - clozeTest_maxmin
    - cmb
    - cmmlu
    - collections
      - leaderboard
    - commonsenseqa_cn
    - commonsenseqa
    - compassbench_20_v1_1_public
      - agent
      - code
      - knowledge
      - language
      - math
      - reason
    - compassbench_20_v1_1
      - agent
      - code
      - knowledge
      - language
      - math
      - reason
    - contamination
    - crowspairs_cn
    - crowspairs
    - cvalues
    - drop
    - ds1000
    - flames
    - flores
    - game24
    - govrepcrs
    - gpqa
    - gsm8k_contamination
    - gsm8k
    - gsm_hard
    - hellaswag
    - humaneval_cn
    - humaneval_multi
    - humaneval_plus
    - humanevalx
    - humaneval
    - hungarian_exam
    - infinitebench
      - infinitebenchcodedebug
      - infinitebenchcoderun
      - infinitebenchendia
      - infinitebenchenmc
      - infinitebenchenqa
      - infinitebenchensum
      - infinitebenchmathcalc
      - infinitebenchmathfind
      - infinitebenchretrievekv
      - infinitebenchretrievenumber
      - infinitebenchretrievepasskey
      - infinitebenchzhqa
    - iwslt2017
    - jigsawmultilingual
    - kaoshi
    - lambada
    - lawbench
    - lcsts
    - leval
      - levalcoursera
      - levalfinancialqa
      - levalgovreportsumm
      - levalgsm100
      - levallegalcontractqa
      - levalmeetingsumm
      - levalmultidocqa
      - levalnarrativeqa
      - levalnaturalquestion
      - levalnewssumm
      - levalpaperassistant
      - levalpatentsumm
      - levalquality
      - levalreviewsumm
      - levalscientificqa
      - levaltopicretrieval
      - levaltpo
      - levaltvshowsumm
    - llm_compression
    - longbench
      - longbench2wikimqa
      - longbenchdureader
      - longbenchgov_report
      - longbenchhotpotqa
      - longbenchlcc
      - longbenchlsht
      - longbenchmulti_news
      - longbenchmultifieldqa_en
      - longbenchmultifieldqa_zh
      - longbenchmusique
      - longbenchnarrativeqa
      - longbenchpassage_count
      - longbenchpassage_retrieval_en
      - longbenchpassage_retrieval_zh
      - longbenchqasper
      - longbenchqmsum
      - longbenchrepobench
      - longbenchsamsum
      - longbenchtrec
      - longbenchtriviaqa
      - longbenchvcsum
    - lveval
      - lvevalcmrc_mixup
      - lvevaldureader_mixup
      - lvevalfactrecall_en
      - lvevalfactrecall_zh
      - lvevalhotpotwikiqa_mixup
      - lvevallic_mixup
      - lvevalloogle_CR_mixup
      - lvevalloogle_MIR_mixup
      - lvevalloogle_SD_mixup
      - lvevalmultifieldqa_en_mixup
      - lvevalmultifieldqa_zh_mixup
    - mastermath2024v1
    - math401
    - math
    - mbpp_cn
    - mbpp_plus
    - mbpp
    - mgsm
    - mmlu_pro
    - mmlu
    - narrativeqa
    - needlebench
      - atc
      - needlebench_1000k
      - needlebench_128k
      - needlebench_200k
      - needlebench_256k
      - needlebench_32k
      - needlebench_4k
      - needlebench_8k
    - nq_cn
    - nq
    - obqa
    - piqa
    - promptbench
    - py150
    - qabench
    - qaspercut
    - qasper
    - race
    - realtoxicprompts
    - rolebench
    - s3eval
    - safety
    - scibench
      - lib_prompt
    - siqa
    - squad20
    - storycloze
    - strategyqa
    - subjective
      - alignbench
      - alpaca_eval
      - arena_hard
      - compassarena
      - compassbench
      - creationbench
      - fofo
      - multiround
      - subjective_cmp
      - wildbench
    - summedits
    - summscreen
    - taco
    - teval
    - triviaqarc
    - triviaqa
    - truthfulqa
    - tydiqa
    - wikibench
    - wikitext
    - winograd
    - winogrande
    - xiezhi
    - z_bench
This file was deleted.