Add Few-Shot Learning Support to Balrog #4

BartekCupial · 2024-12-02T11:15:31Z

Description

Implements Few-Shot Learning capabilities to enable model evaluation with expert trajectories in the context.

To download demonstrations

pip install gdown
gdown "https://drive.google.com/uc?export=download&id=1TQbrqMSC5K_SNx9tta1Tlhtg8flSIGaJ"
unzip records.zip

To run Few-Shot Learning

python -m eval agent.type=few_shot eval.icl_episodes=5

Prompt formatting

Example prompt for an agent starting a game with eval.icl_episodes=1

00 = Message(role=user, content=System Prompt: [], attachment=None)
01 = Message(role=user, content=****** START OF DEMONSTRATION EPISODE 1 ******, attachment=None)
02 = Message(role=user, content=Observation: [], attachment=None)
03 = Message(role=assistant, content=None, attachment=None)
04 = Message(role=user, content=Observation: [], attachment=None)
05 = Message(role=assistant, content=go forward, attachment=None)
06 = Message(role=user, content=Observation: [], attachment=None)
07 = Message(role=assistant, content=go forward, attachment=None)
08 = Message(role=user, content=Observation: [], attachment=None)
09 = Message(role=assistant, content=turn left, attachment=None)
10 = Message(role=user, content=****** END OF DEMONSTRATION EPISODE 1 ******, attachment=None)
11 = Message(role=user, content=****** Now it's your turn to play the game! ******, attachment=None)
12 = Message(role=user, content=Current Observation:
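
The layout above can be reproduced with a small builder. This is a minimal sketch, not the code from this PR: the `Message` dataclass and the `build_icl_prompt` function are hypothetical stand-ins, and each demonstration is assumed to be a list of (observation, action) pairs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical stand-in for the message type printed in the listing above.
@dataclass
class Message:
    role: str
    content: Optional[str]
    attachment: Optional[object] = None

# One demonstration episode: a list of (observation, action) pairs.
# The action may be None, as in the first assistant message of the listing.
Episode = List[Tuple[str, Optional[str]]]


def build_icl_prompt(system_prompt: str, demos: List[Episode], current_obs: str) -> List[Message]:
    """Assemble the few-shot message list in the order shown in the listing."""
    # The listing shows the system prompt with role=user, so the sketch mirrors that.
    messages = [Message("user", f"System Prompt: {system_prompt}")]
    for i, episode in enumerate(demos, start=1):
        messages.append(Message("user", f"****** START OF DEMONSTRATION EPISODE {i} ******"))
        for obs, action in episode:
            messages.append(Message("user", f"Observation: {obs}"))
            messages.append(Message("assistant", action))
        messages.append(Message("user", f"****** END OF DEMONSTRATION EPISODE {i} ******"))
    messages.append(Message("user", "****** Now it's your turn to play the game! ******"))
    messages.append(Message("user", f"Current Observation: {current_obs}"))
    return messages


if __name__ == "__main__":
    demo = [("[]", None), ("[]", "go forward"), ("[]", "go forward"), ("[]", "turn left")]
    for idx, msg in enumerate(build_icl_prompt("[]", [demo], "[]")):
        print(f"{idx:02d} = {msg}")
```

With one four-step demonstration this yields 13 messages, indexed 00 through 12 as in the example.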

Features

Each demonstration has a corresponding mp4 file, which allows for quick inspection.
FewShotAgent supports context caching, which can be enabled with agent.cache_icl=True.

Additional Notes:

All trajectories are loaded into the context, which can increase the cost of evaluation, especially for environments like NetHack.
For TextWorld environments, we avoid putting the solution for the evaluated game into the context.
In principle, a similar strategy could be applied to other environments; for example, in NLE we could load only trajectories corresponding to the same character.
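
The notes above can be illustrated with a small selection routine. This is a sketch under stated assumptions, not the code from this PR: `select_demonstrations`, `exclude_task`, and `max_steps` are hypothetical names, records are assumed to be a mapping from task name to a list of (observation, action) steps, and the context limit is assumed to count steps across all selected demonstrations.

```python
import random
from typing import Dict, List, Optional, Tuple

Step = Tuple[str, str]  # (observation, action)


def select_demonstrations(
    records: Dict[str, List[Step]],
    num_episodes: int,
    max_steps: int,
    exclude_task: Optional[str] = None,
    seed: Optional[int] = None,
) -> List[List[Step]]:
    """Randomly pick demonstrations, skipping the evaluated task and capping total steps."""
    rng = random.Random(seed)
    # Skip the task being evaluated so its solution never enters the context.
    candidates = [name for name in sorted(records) if name != exclude_task]
    chosen = rng.sample(candidates, k=min(num_episodes, len(candidates)))

    selected: List[List[Step]] = []
    budget = max_steps  # assumed to be a step budget shared by all demonstrations
    for name in chosen:
        if budget <= 0:
            break
        steps = records[name][:budget]  # truncate the last demonstration if needed
        selected.append(steps)
        budget -= len(steps)
    return selected


if __name__ == "__main__":
    records = {
        "task_a": [("obs", "go forward")] * 5,
        "task_b": [("obs", "turn left")] * 5,
        "task_c": [("obs", "turn right")] * 5,
    }
    demos = select_demonstrations(records, num_episodes=2, max_steps=7, exclude_task="task_b", seed=0)
    print([len(d) for d in demos])
```

Capping the step count bounds how much the demonstrations can add to the prompt, which matters most in long-horizon environments.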

DavidePaglieri

Looks good to me.

BartekCupial added 8 commits December 4, 2024 08:53

Few Shot Learning

4e984fd

turn off dummy actions

21133b1

simplify dataset

ab32408

add docs for few shot learning

388de8c

update docs

e4a310b

fix download link

8783bd1

fix loading dataset

bbdaea5

add parameter to limit the size of icl context

542d224

BartekCupial force-pushed the feat/few_shot_learning branch from ee82177 to 542d224 Compare December 4, 2024 07:54

BartekCupial added 3 commits December 4, 2024 10:35

sample demonstrations randomly

649f45e

quick fix

3f1c09e

set default max_icl_history to 1000

2615e58

DavidePaglieri self-requested a review December 4, 2024 15:58

DavidePaglieri approved these changes Dec 5, 2024


BartekCupial merged commit 67a8d26 into balrog-ai:main Dec 5, 2024
4 checks passed

This was referenced Jan 7, 2025

fix: double system prompt #21

Merged

System prompt added twice with "user" instead of "system" label? #20

Closed
