
Commit 16d468e

docs: temperature
1 parent 32cfcde

File tree

1 file changed (+3, -1 lines changed)

docs/evaluation.md

Lines changed: 3 additions & 1 deletion
@@ -93,7 +93,7 @@ python eval.py \
  | **client.is_chat_model** | Indicates if the model follows a chat-based interface. | `True` |
  | **client.generate_kwargs.temperature** | Temperature for model response randomness. | `0.0` |
  | **client.alternate_roles** | If True, the instruction prompt will be fused with the first observation. Required by some LLMs. | `False` |
- | **client.temperature** | If set to null, defaults to the API's default temperature; otherwise, use a float from 0.0 to 1.0. | `null` |
+ | **client.temperature** | If set to null, defaults to the API's default temperature; otherwise, use a float from 0.0 to 1.0. | `null` |
  | **envs.names** | Dash-separated list of environments to evaluate, e.g., `nle-minihack`. | `babyai-babaisai-textworld-crafter-nle-minihack` |

@@ -104,3 +104,5 @@ python eval.py \
  Mac systems might complain about fork when evaluating in multiprocessing mode (`eval.num_workers > 1`). To fix this, export the following before running eval: `export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES`
  - Alternate roles:
  Some LLMs/VLMs require alternating roles. You can comply by fusing the instruction prompt with the first observation: `client.alternate_roles=True`
+ - Temperature:
+ We recommend running models with temperatures in the 0.5-0.7 range, or with the model API's default temperature. Temperatures that are too low can cause some of the more brittle models to endlessly repeat actions or produce incoherent outputs.
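
For reference, a minimal invocation sketch combining the options discussed above, assuming `eval.py` accepts the `key=value` overrides shown in the hunk headers; the worker count and environment list are illustrative values, not defaults being prescribed:

```bash
# Sketch only: option names come from the table and notes above;
# eval.num_workers=4 is an assumed illustrative value.
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES  # macOS fork workaround

python eval.py \
  client.alternate_roles=True \
  client.temperature=0.7 \
  envs.names=babyai-babaisai-textworld-crafter-nle-minihack \
  eval.num_workers=4
```

Leaving `client.temperature` unset (`null`) falls back to the API's default temperature, which the note above also recommends as a safe choice.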
