Designing resilient toolkits and scalable RL environments for CAMEL terminal agents
# Clone the repository
git clone https://github.com/camel-ai/seta.git
cd seta
bash setup.sh

#=========================================
# Run single developer agent / workforce
#=========================================
cd evaluation/terminal_bench_run/
bash run_agent.sh \
-a <attempt,0..n> \
-n <total_attempts> \
-e <conda env name> \
-w <use_workforce> # can have a try, focus on single chat agent now.

play-zork
└── play-zork.1-of-1.test_run                # trial name
    ├── CAMEL_WORKDIR                        # not used at the moment
    ├── agent-logs                           # not used at the moment
    ├── commands.txt                         # not used at the moment
    ├── chatagent.log                        # ✔️✔️ full history of running agent including test results
    ├── eigent_logs.json                     # ⚠️ exists only when running workforce
    ├── panes                                # not used at the moment
    ├── sessions                             # session logs
    ├── agent.cast                           # not used at the moment
    ├── agent.log                            # not used at the moment
    ├── session_logs                         # ✔️✔️ session logs for terminal toolkit
    │   ├── blocking_commands.log            # ✔️✔️ all block mode commands + output
    │   ├── session_run_zork_1_correct_path.log  # ✔️✔️ non-block mode single session command + output
    │   ├── session_zork-1.log               # ✔️✔️ same as above, session_{id}.log
    │   └── session_zork_start.log           # ✔️✔️ same as above, session_{id}.log
    ├── tests.cast                           # not used at the moment
    ├── tests.log                            # ✔️✔️ test log
    └── tests.log.strip                      # ✔️✔️ test log with ANSI control characters removed
cd evaluation/terminal_bench_eval/
# terminal bench 1.0
bash run_eval.sh
# terminal bench 2.0
bash run_tb2.sh
- The agent class is implemented in `tbench_camel_agent.py`; final results will be in `evaluation/terminal_bench_eval/run/{run_id}/results.json`
- Task-specific terminal session logs will be in `evaluation/terminal_bench_eval/logs/camel_logs/{task_id}/`
Everything is under the `training` folder.
Please refer to Training Setup for detailed instructions.
Note: for the new TerminalToolkit design, see the "Terminal Toolkit Design" document.
