Commit e466b81

default temperature (#43)
1 parent 7913bf8 commit e466b81

File tree

2 files changed: +3 −3 lines changed

balrog/config/config.yaml

Lines changed: 1 addition & 1 deletion

@@ -30,7 +30,7 @@ client:
   model_id: gpt-4o # Model identifier (e.g., 'gpt-4', 'gpt-3.5-turbo')
   base_url: http://localhost:8080/v1 # Base URL for the API (if using a local server)
   generate_kwargs:
-    temperature: null # Sampling temperature. If null the API default temperature is used instead
+    temperature: 1.0 # Sampling temperature. If null the API default temperature is used instead
     max_tokens: 4096 # Max tokens to generate in the response
     timeout: 60 # Timeout for API requests in seconds
     max_retries: 5 # Max number of retries for failed API calls
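The `null` semantics in `generate_kwargs` above imply that an unset temperature should be omitted from the request rather than sent as an explicit value. A minimal sketch of that behavior (a hypothetical helper, not BALROG's actual client code — `build_payload` and its parameters are assumptions for illustration):

```python
# Hypothetical sketch: translate generate_kwargs from the YAML config
# into an OpenAI-style request payload. A temperature of None (YAML
# null) is omitted entirely, so the API's own default applies.
def build_payload(model_id, messages, temperature=1.0, max_tokens=4096):
    payload = {
        "model": model_id,
        "messages": messages,
        "max_tokens": max_tokens,
    }
    if temperature is not None:
        # Only send temperature when the config set an explicit value.
        payload["temperature"] = temperature
    return payload
```

With `temperature: null` in the config, the resulting payload carries no `temperature` key; with the new default of `1.0`, the value is sent explicitly.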

docs/evaluation.md

Lines changed: 2 additions & 2 deletions

@@ -93,7 +93,7 @@ python eval.py \
 | **client.is_chat_model** | Indicates if the model follows a chat-based interface. | `True` |
 | **client.generate_kwargs.temperature** | Temperature for model response randomness. | `0.0` |
 | **client.alternate_roles** | If True the instruction prompt will be fused with first observation. Required by some LLMs. | `False` |
-| **client.temperature** | If set to null will default to the API default temperature. Use a float from 0.0 to 1.0. otherwise. | `null` |
+| **client.temperature** | If set to null will default to the API default temperature. Use a float from 0.0 to 2.0. otherwise. | `1.0` |
 | **envs.names** | Dash-separated list of environments to evaluate, e.g., `nle-minihack`. | `babyai-babaisai-textworld-crafter-nle-minihack`|

@@ -105,4 +105,4 @@ python eval.py \
 - Alternate roles:
   Some LLMs/VLMs require alternating roles. You can fuse the instruction prompt with the first observation to comply with this with the following: `client.alternate_roles=True`
 - Temperature:
-  We recommend running models with temperature ranges around 0.5-0.7, or to use the default temperature of the model APIs. Too low temperatures can cause some of the more brittle models to endlessly repeat actions or create incoherent outputs.
+  We recommend running models with temperature ranges around 0.7-1.0, or to use the default temperature of the model APIs. Too low temperatures can cause some of the more brittle models to endlessly repeat actions or create incoherent outputs.
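The updated docs state that an explicit temperature must be a float from 0.0 to 2.0, with null deferring to the API default. A small validation sketch of that rule (the `resolve_temperature` helper is a hypothetical illustration, not part of BALROG):

```python
# Hypothetical validator for the documented temperature rule:
# None (YAML null) defers to the API default; explicit values must
# fall within the 0.0-2.0 range the docs describe.
def resolve_temperature(value):
    if value is None:
        return None  # let the API apply its default temperature
    t = float(value)
    if not 0.0 <= t <= 2.0:
        raise ValueError(f"temperature {t} outside supported range 0.0-2.0")
    return t
```

Under this rule, the old upper bound of 1.0 from the previous docs revision would have rejected valid OpenAI-style values between 1.0 and 2.0.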
