
Hardcoded OpenAI API calls #242

@saujasv

Description

response = generate_from_openai_chat_completion(
    model="gpt-4-1106-preview",  # hard-coded model name
    messages=messages,
    temperature=0,  # hard-coded temperature
    max_tokens=768,
    top_p=1.0,
    context_length=0,
).lower()

The model name and temperature are hard-coded. Moreover, the current design cannot use different providers for evaluation (say, OpenAI) and for the acting model (say, vLLM or another provider), since both paths read the shared OPENAI_API_KEY:

openai.api_key = os.environ["OPENAI_API_KEY"]
openai.organization = os.environ.get("OPENAI_ORGANIZATION", "")
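One way to decouple the two endpoints would be role-prefixed environment variables that fall back to the shared OPENAI_* settings. A minimal sketch (the EVAL_/ACTION_ variable names and the helper are hypothetical, not part of the current codebase):

```python
import os

def resolve_endpoint(role: str) -> dict:
    """Resolve API key, base URL, and model for a role ("EVAL" or "ACTION"),
    falling back to the shared OPENAI_* variables when no override is set."""
    return {
        "api_key": os.environ.get(
            f"{role}_OPENAI_API_KEY", os.environ.get("OPENAI_API_KEY", "")
        ),
        # Pointing ACTION_OPENAI_BASE_URL at a vLLM server would let the
        # acting model come from a different provider than the evaluator.
        "base_url": os.environ.get(
            f"{role}_OPENAI_BASE_URL", "https://api.openai.com/v1"
        ),
        "model": os.environ.get(f"{role}_MODEL", "gpt-4-1106-preview"),
    }
```

Each call site would then build its client from resolve_endpoint("EVAL") or resolve_endpoint("ACTION") instead of the module-level openai.api_key.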

It would be good to support separate OpenAI endpoints for WebArena's internal evaluation and for the model under test (allowing different providers), and to allow setting model names through environment variables (gpt-4-1106-preview is a legacy model), so that things don't break when models are deprecated. Additionally, GPT-5 series models do not accept non-default temperatures, which also needs to be accounted for.
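The temperature restriction could be handled by dropping the parameter for affected models before building the request. A sketch, assuming detection by model-name prefix (the helper name is hypothetical):

```python
def sampling_params(model: str, temperature: float = 0.0) -> dict:
    """Build request kwargs, omitting temperature for models that only
    accept the default value (assumed here to be the gpt-5 series)."""
    params = {"model": model, "temperature": temperature}
    if model.startswith("gpt-5"):
        # GPT-5 series models reject non-default temperatures, so leave
        # the parameter out entirely and let the API use its default.
        params.pop("temperature")
    return params
```

The hard-coded call above would then become generate_from_openai_chat_completion(**sampling_params(model), ...) with the model name read from the environment.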
