camel-ai/seta

SETA: Scaling Environments for Terminal Agents

Designing resilient toolkits and scalable RL environments for CAMEL terminal agents


Getting Started 🎯

Installation

# Clone the repository
git clone https://github.com/camel-ai/seta.git
cd seta
bash setup.sh

Run task by task

#=========================================
# Run single developer agent / workforce
#=========================================
cd evaluation/terminal_bench_run/
bash run_agent.sh \
        -a <attempt,0..n> \
        -n <total_attempts> \
        -e <conda env name> \
        -w <use_workforce>  # optional to try; the single chat agent is the current focus

Log folder explanation

└── play-zork
    └── play-zork.1-of-1.test_run       # trial name
        ├── CAMEL_WORKDIR               # not used at the moment
        ├── agent-logs                  # not used at the moment
        ├── commands.txt                # not used at the moment
        ├── chatagent.log               # ❗️❗️ full history of running agent including test results
        ├── eigent_logs.json            # ⚠️ exists only when running workforce
        ├── panes                       # not used at the moment
        └── sessions                    # session logs
            ├── agent.cast              # not used at the moment
            ├── agent.log               # not used at the moment
            ├── session_logs            # ❗️❗️session logs for terminal toolkit
            │   ├── blocking_commands.log                   # ❗️❗️all block mode commands + output
            │   ├── session_run_zork_1_correct_path.log     # ❗️❗️non-block mode single session command + output
            │   ├── session_zork-1.log                      # ❗️❗️same as above session_{id}.log
            │   └── session_zork_start.log                  # ❗️❗️same as above session_{id}.log
            ├── tests.cast              # not used at the moment
            ├── tests.log               # ❗️❗️test log
            └── tests.log.strip         # ❗️❗️test log with ansi control characters removed
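
When inspecting many trials it can help to pull out only the files flagged as important in the tree above. The following is a minimal sketch, not part of the repository: the function names (`collect_trial_logs`, `read_test_log`, `strip_ansi`) and the dictionary keys are hypothetical, and the ANSI regex is a common CSI pattern that may differ from however `tests.log.strip` is actually produced.

```python
import re
from pathlib import Path

# Common pattern for ANSI CSI sequences such as "\x1b[31m" (assumption: this
# is not necessarily the rule the repository uses to build tests.log.strip).
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text):
    """Remove ANSI CSI control sequences from a string."""
    return ANSI_CSI.sub("", text)


def collect_trial_logs(trial_dir):
    """Return (named_files, session_logs) for one trial directory.

    named_files maps a hypothetical label to each flagged file that exists;
    session_logs is the sorted list of session_{id}.log files.
    """
    trial = Path(trial_dir)
    sessions = trial / "sessions"
    session_dir = sessions / "session_logs"
    candidates = {
        "agent_history": trial / "chatagent.log",
        "workforce": trial / "eigent_logs.json",  # only present for workforce runs
        "blocking_commands": session_dir / "blocking_commands.log",
        "tests": sessions / "tests.log",
        "tests_stripped": sessions / "tests.log.strip",
    }
    named_files = {name: path for name, path in candidates.items() if path.exists()}
    session_logs = sorted(session_dir.glob("session_*.log")) if session_dir.is_dir() else []
    return named_files, session_logs


def read_test_log(trial_dir):
    """Return the test log as plain text, preferring the pre-stripped file."""
    sessions = Path(trial_dir) / "sessions"
    stripped = sessions / "tests.log.strip"
    if stripped.exists():
        return stripped.read_text(encoding="utf-8", errors="replace")
    raw = (sessions / "tests.log").read_text(encoding="utf-8", errors="replace")
    return strip_ansi(raw)
```

For example, `collect_trial_logs("play-zork/play-zork.1-of-1.test_run")` would return the existing flagged files plus the per-session logs, assuming the path is relative to the folder that holds the trial logs.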

Run terminal bench official evaluation

cd evaluation/terminal_bench_eval/

# terminal bench 1.0
bash run_eval.sh

# terminal bench 2.0
bash run_tb2.sh

## The agent class is implemented in tbench_camel_agent.py 

❗️Note: Results of evaluation

- Final results are written to `evaluation/terminal_bench_eval/run/{run_id}/results.json`.

- Task-specific terminal session logs are written to `evaluation/terminal_bench_eval/logs/camel_logs/{task_id}/`.
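
The two output locations above can be resolved programmatically. This is a hedged sketch: the helper names (`results_path`, `task_log_dir`, `load_results`) are hypothetical, the paths are assumed to be relative to the repository root, and because the schema of `results.json` is not described here the loader returns the parsed JSON unchanged.

```python
import json
from pathlib import Path

# Assumption: scripts are run from the repository root.
EVAL_ROOT = Path("evaluation/terminal_bench_eval")


def results_path(run_id, root=EVAL_ROOT):
    """Path of the final results file for a given run."""
    return Path(root) / "run" / run_id / "results.json"


def task_log_dir(task_id, root=EVAL_ROOT):
    """Directory holding the terminal session logs for a given task."""
    return Path(root) / "logs" / "camel_logs" / task_id


def load_results(run_id, root=EVAL_ROOT):
    """Parse results.json for a run and return it as-is (schema not assumed)."""
    with results_path(run_id, root).open(encoding="utf-8") as handle:
        return json.load(handle)
```

Printing `sorted(load_results(run_id))` on a dictionary-shaped result is one quick way to discover the top-level keys before writing any further analysis.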

Train terminal agent

Everything related to training is in the `training` folder.

Please refer to Training Setup for detailed instructions.

Note: the new TerminalToolkit design is described in the Terminal Toolkit Design document.
