Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 8 additions & 7 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -173,9 +173,9 @@ class Container(Protocol):
#### 4. LLM Clients (`src/ares/llms/`)

**Core Abstractions:**
- `LLMRequest` - Dataclass with messages and optional temperature
- `LLMResponse` - Dataclass with ChatCompletion and cost tracking
- `LLMClient` Protocol - `async def __call__(request: LLMRequest) -> LLMResponse`
- `lft.OpenResponsesRequest` - Canonical Open Responses request (from linguafranca) used for observations and client inputs
- `InferenceResult` - Dataclass wrapping `lft.OpenResponsesResponse` with cost tracking
- `LLMClient` Protocol - `async def __call__(request: lft.OpenResponsesRequest) -> InferenceResult`

**Key Pattern: Queue-Mediated LLM Client (`queue_mediated_client.py`):**

Expand Down Expand Up @@ -281,12 +281,13 @@ Follow Google-style isort configuration:
- **Always import modules, not classes or functions**
- **External consumers** (examples, docs):
- ✅ Good: `import ares` → use `ares.make(...)`
- ✅ Good: `from ares import llms` → use `llms.LLMRequest`, `llms.TextData`
- ❌ Avoid: `from ares.llms import LLMRequest, TextData`
- ✅ Good: `from ares.llms import open_responses` → use `open_responses.make_request(...)`
- ✅ Good: `from ares import llms` → use `llms.TextData`, `llms.Usage`
- ❌ Avoid: `from ares.llms import OpenResponsesRequest, TextData`
- **Internal code**:
- ✅ Good: `from ares.llms import request` → use `request.LLMRequest`
- ✅ Good: `from ares.llms import open_responses` → use `open_responses.make_request(...)`
- ✅ Good: `from ares.llms import response` → use `response.TextData`, `response.Usage`
- ❌ Avoid: `from ares.llms.request import LLMRequest`
- ❌ Avoid: `from ares.llms.open_responses import Request`
- ❌ Avoid: `from ares.llms.response import TextData, Usage`
Comment thread
coderabbitai[bot] marked this conversation as resolved.
- Rationale: Makes code more readable and explicit about where objects come from

Expand Down
14 changes: 7 additions & 7 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -103,21 +103,21 @@ Follow **Google-style imports**: always import modules, not individual classes o
```python
# Good ✅
import ares
from ares import llms
from ares.llms import open_responses

request = llms.LLMRequest(messages=[...])
request = open_responses.make_request([open_responses.user_message("Hello")])
env = ares.make("sbv-mswea")

# Good for internal code ✅
from ares.llms import request
from ares.llms import open_responses
from ares.llms import response

req = request.LLMRequest(messages=[...])
resp = response.LLMResponse(data=[...], cost=0.0, usage=...)
req = open_responses.make_request([open_responses.user_message("Hello")])
resp = response.InferenceResult(response=response.make_response("Hello!"), cost=0.0)
Comment on lines +115 to +116

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the signature of make_response to see if tokens are required
ast-grep --pattern $'def make_response($$$) -> $_:
    $$$'

Repository: withmartian/ares

Length of output: 3038


🏁 Script executed:

#!/bin/bash
# First, let's examine the actual CONTRIBUTING.md at lines 115-116
head -120 CONTRIBUTING.md | tail -10

Repository: withmartian/ares

Length of output: 423


🏁 Script executed:

#!/bin/bash
# Search for make_response usage patterns in the codebase to see how it's typically called
rg "make_response\(" --max-count 20 -B 1 -A 1

Repository: withmartian/ares

Length of output: 3674


Consider explicitly showing token parameters in the documentation example.

While response.make_response("Hello!") is syntactically valid (both input_tokens and output_tokens default to 0), the example in CONTRIBUTING.md would be more instructive if it explicitly shows these parameters, consistent with how tests use the API:

resp = response.InferenceResult(
    response=response.make_response("Hello!", input_tokens=10, output_tokens=5),
    cost=0.0
)

This makes the full API signature clearer for readers learning the framework.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CONTRIBUTING.md` around lines 115 - 116, Update the CONTRIBUTING.md example
to explicitly pass token counts to the response factory so readers see the full
API signature: when constructing response.InferenceResult, call
response.make_response with input_tokens and output_tokens (e.g.,
response.make_response("Hello!", input_tokens=..., output_tokens=...)) and keep
cost=0.0; this clarifies the parameters used by response.make_response and
matches how tests exercise the API.


# Avoid ❌
from ares.llms import LLMRequest, TextData
from ares.llms.request import LLMRequest
from ares.llms import OpenResponsesRequest, TextData
from ares.llms.open_responses import Request
```

**Rationale:** Makes code more readable and explicit about where objects come from.
Expand Down
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ ARES is an RL-first framework for training and evaluating LLM agents, especially

It is a modern [gym](https://github.com/Farama-Foundation/Gymnasium): the environment layer powering RL research.

ARES treats LLMRequests as observations and LLMResponses as actions within the environment, so you can focus on training just the LLM - not the Code Agent surrounding it. The interface is entirely async, and supports scaling up to hundreds or thousands of parallel environments easily - check out [example 3](https://github.com/withmartian/ares/tree/main/examples/03_parallel_eval_with_api.py) to run this yourself.
ARES treats Open Responses requests as observations and LLMResponses as actions within the environment, so you can focus on training just the LLM - not the Code Agent surrounding it. The interface is entirely async, and supports scaling up to hundreds or thousands of parallel environments easily - check out [example 3](https://github.com/withmartian/ares/tree/main/examples/03_parallel_eval_with_api.py) to run this yourself.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should give an actual class name - same comment as way above



## Quick Start
Expand Down
41 changes: 19 additions & 22 deletions docs/source/core-concepts.rst
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ It's important to understand two different concepts in ARES:
The orchestration logic that uses a Container and LLM to solve tasks (e.g., MiniSWECodeAgent). This is **part of the environment** and remains fixed during training. Think of it as the scaffold that defines how an LLM interacts with code.

* Agent/Policy (Trained)
The component you're actually training - a function that maps ``LLMRequestLLMResponse``. This could be a fine-tuned LLM, a prompt optimizer, or any policy that produces better responses. This is what improves through reinforcement learning.
The component you're actually training - a function that maps ``OpenResponsesRequestInferenceResult``. This could be a fine-tuned LLM, a prompt optimizer, or any policy that produces better responses. This is what improves through reinforcement learning.

System Architecture
-------------------
Expand All @@ -30,13 +30,13 @@ Here's how the components fit together:
| generates response | │ │
└──────────┬─────────────┘ │ ┌────────────────────────────────┐ │
^ │ │ │ QueueMediatedLLMClient │ │
| │ LLMResponse (action) │ │ │ │
| │ InferenceResult (action) │ │ │ │
| └──────────────────────────┼─>│ Intercepts LLM calls │ │
| │ │ from code agent via │ │
└─────────────────────────────────┼──│ QueueMediatedLLMClient │ │
LLMRequest (observation) │ └──────────────────┬─────────────┘ │
Open Responses observation │ └──────────────────┬─────────────┘ │

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "OpenResponsesRequest (observation)" instead to show the type -> RL concept

│ ^ │ │
LLMRequest │ │ LLMResponse
Open Responses │ │ InferenceResult

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same nit as above

│ │ v │
│ ┌──────────────└─────────────────┐ │
│ │ CodeAgent │ │
Expand Down Expand Up @@ -87,7 +87,7 @@ The key abstraction is ``CodeEnvironment``, which:
* **Exposes LLM requests as observations** - Intercepts calls from the code agent
* **Treats LLM responses as actions** - Your trainable agent/policy provides responses

Crucially, the **CodeAgent is part of the environment**, not what you're training. Your training loop optimizes an agent/policy that produces better ``LLMResponse`` outputs given ``LLMRequest`` observations.
Crucially, the **CodeAgent is part of the environment**, not what you're training. Your training loop optimizes an agent/policy that produces better ``InferenceResult`` outputs given canonical Open Responses observations.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"canonical" is confusing here, I would remove


Standard RL Loop
~~~~~~~~~~~~~~~~
Expand All @@ -101,10 +101,10 @@ Every environment follows the standard RL pattern:
timestep = await env.reset()

while not timestep.last():
# timestep.observation is an LLMRequest from the code agent
# timestep.observation is an Open Responses request from the code agent
action = await your_policy(timestep.observation)

# action is an LLMResponse that continues the agent's execution
# action is an InferenceResult that continues the agent's execution
timestep = await env.step(action)

# timestep.reward contains the reward for the final step
Expand All @@ -116,7 +116,7 @@ TimeStep Structure
Each call to ``reset()`` or ``step()`` returns a ``TimeStep`` with:

* ``step_type``: One of ``"FIRST"``, ``"MID"``, or ``"LAST"``
* ``observation``: An ``LLMRequest`` object (or ``None`` on termination)
* ``observation``: An Open Responses request object (or ``None`` on termination)
* ``reward``: A float reward for each step
* ``discount``: A float discount factor for RL algorithms

Expand Down Expand Up @@ -160,7 +160,7 @@ Example structure:
async def run(self, task: str) -> None:
while not self.is_done():
# Ask LLM what to do next
request = LLMRequest(messages=[...])
request = open_responses.make_request([open_responses.user_message(...)])
response = await self._llm_client(request)

# Parse and execute commands from LLM response
Expand Down Expand Up @@ -234,8 +234,8 @@ Which you will need to rewrite into something like:
# Decide what to ask LLM next
...
llm_response = await self.llm_client(
LLMRequest(
messages=[...],
open_responses.make_request(
[open_responses.user_message(...)],
... # Other request params
)
)
Expand Down Expand Up @@ -293,30 +293,27 @@ Core Interface

.. code-block:: python

from linguafranca import types as lft

class LLMClient(Protocol):
async def __call__(self, request: LLMRequest) -> LLMResponse:
async def __call__(self, request: lft.OpenResponsesRequest) -> InferenceResult:
...

@dataclass(frozen=True)
class LLMRequest:
messages: Iterable[ChatCompletionMessageParam]
temperature: float | None = None

@dataclass(frozen=True)
class LLMResponse:
chat_completion_response: ChatCompletion
class InferenceResult:
response: lft.OpenResponsesResponse
cost: float

This simple interface wraps OpenAI-style chat completion APIs. The ``messages`` field follows the OpenAI format with ``role`` (system/user/assistant) and ``content``.
ARES uses linguafranca's ``OpenResponsesRequest`` as the canonical request type for observations and client inputs. Edge adapters convert to Chat/Responses/Anthropic formats only when needed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change this to a note like

ARES leverages linguafranca for request and response types - we use the OpenResponsesRequest as our base request object returned from the environment for observations, and encourage users to use linguafranca.convert_* methods for translating between different provider and local formats.


Why LLMClient?
~~~~~~~~~~~~~~

The ``LLMClient`` abstraction serves two purposes:

1. **Observations = LLM Requests**: In the RL loop, ``timestep.observation`` is an ``LLMRequest`` containing the messages the code agent wants to send to the LLM. This is the "state" your policy observes.
1. **Observations = Open Responses requests**: In the RL loop, ``timestep.observation`` is a canonical Open Responses request containing what the code agent wants to send to the LLM. This is the "state" your policy observes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same "canonical" comment


2. **Actions = LLM Responses**: In the RL loop, the ``action`` you pass to ``env.step()`` is an ``LLMResponse`` containing the LLM's reply. This is how your policy controls the agent's behavior.
2. **Actions = LLM Responses**: In the RL loop, the ``action`` you pass to ``env.step()`` is an ``InferenceResult`` containing the LLM's reply. This is how your policy controls the agent's behavior.

This framing makes it natural to think about code agent training as an RL problem: you're learning a policy that maps agent requests to helpful responses.

Expand Down
12 changes: 7 additions & 5 deletions docs/source/how-it-works.rst
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ The ``QueueMediatedLLMClient`` implements the ``LLMClient`` protocol, but instea

Meanwhile, the environment:

1. **Watches the queue**: Extracts ``LLMRequest`` objects as they arrive
1. **Watches the queue**: Extracts canonical Open Responses requests as they arrive

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"canonical"

2. **Exposes them as observations**: Returns them from ``reset()`` and ``step()``
3. **Provides responses**: When you call ``step(action)``, sets the Future's result

Expand All @@ -39,12 +39,14 @@ The core implementation is simple:

.. code-block:: python

from linguafranca import types as lft

@dataclass(frozen=True)
class QueueMediatedLLMClient(LLMClient):
q: asyncio.Queue[ValueAndFuture[LLMRequest, LLMResponse]]
q: asyncio.Queue[ValueAndFuture[lft.OpenResponsesRequest, InferenceResult]]

async def __call__(self, request: LLMRequest) -> LLMResponse:
future = asyncio.Future[LLMResponse]()
async def __call__(self, request: lft.OpenResponsesRequest) -> InferenceResult:
future = asyncio.Future[InferenceResult]()
await self.q.put(ValueAndFuture(value=request, future=future))
return await future # Blocks until env provides response

Expand All @@ -65,7 +67,7 @@ The environment side:
self._llm_req_future = value_and_future.future
return TimeStep(step_type="MID", observation=value_and_future.value, ...)

async def step(self, action: LLMResponse) -> TimeStep:
async def step(self, action: InferenceResult) -> TimeStep:
# Unblock the code agent by providing response
self._llm_req_future.set_result(action)
return await self._get_time_step()
Expand Down
4 changes: 2 additions & 2 deletions docs/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -20,10 +20,10 @@ See the main `README <https://github.com/withmartian/ares>`_ for installation in
Key Features
------------

* **RL-First Design**: Built around the reinforcement learning loop with observations (LLM requests) and actions (LLM responses)
* **RL-First Design**: Built around the reinforcement learning loop with observations (Open Responses requests) and actions (LLM responses)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should leave this as "LLM requests" here for explanatory purposes

* **LLM-Level Optimization**: Train the LLM within code agents, not just the agent as a whole
* **Distributed Workloads**: Support for high-volume, distributed training and evaluation
* **Mechanistic Interpretability**: Raw access to LLM requests and responses for deep analysis
* **Mechanistic Interpretability**: Raw access to canonical LLM requests and responses for deep analysis

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"canonical"

* **Async Gym/dm_env like Spec**: Close to Gym/dm_env spec, but incorporating async methods for performance

Indices and tables
Expand Down
14 changes: 8 additions & 6 deletions examples/04_rl_training_with_skyrl.py
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,9 @@

import ares
from ares import llms
from ares.llms import open_responses
import hydra
from linguafranca import types as lft
import omegaconf
Comment on lines 49 to 54

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ARES imports appear before third-party imports; should we reorder to stdlib → third-party → local/ARES with blank lines and ARES last per CLAUDE.md?

Finding type: AI Coding Guidelines | Severity: 🟢 Low


Want Baz to fix this for you? Activate Fixer

Other fix methods

Fix in Cursor

Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
examples/04_rl_training_with_skyrl.py around lines 49 to 54, the ARES imports (import
ares; from ares import llms; from ares.llms import open_responses) are placed before
third-party imports, violating the project's import ordering. Reorder imports to follow:
stdlib imports first (unchanged), then a single blank line, then all third-party imports
(hydra, linguafranca/types, omegaconf, ray, skyrl_gym) grouped together, then a single
blank line, and finally the ARES/local imports grouped together (import ares; from ares
import llms; from ares.llms import open_responses). Ensure there is exactly one blank
line separating each group and update any import lines or spacing accordingly.

import ray
import skyrl_gym
Expand Down Expand Up @@ -91,7 +93,7 @@ def __init__(self, env_config: dict | None = None, extras: dict | None = None, *
self.preset_name = extras.get("preset_name", kwargs.get("preset_name"))
if not self.preset_name:
raise ValueError("preset_name must be provided in extras or kwargs")
self.env: ares.Environment[llms.LLMResponse, llms.LLMRequest, float, float] | None = None
self.env: ares.Environment[llms.InferenceResult, lft.OpenResponsesRequest, float, float] | None = None

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q for @rsmith49
There is a little bit of asymmetry here, wdyt of it? Inputs to step are LLM responses (ARES type, though it wraps linguafranca), outputs of step are LLM requests (linguafranca type).

Is this confusing? We could alias linguafranca types so people can use ARES aliases instead, but I'm not sure if that's even more confusing.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree it feels a little off. The right approach is probably to wrap lft.OpenResponsesRequest with an ARES-specific wrapper as well to future proof in case there are other top level fields we need (like cost for the response), and have it be a single attribute dataclass?

If we want to do this approach long-term, I think aliasing the type within ARES makes sense for now

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I ended up using an alias to the type called InferenceRequest
Now there's a slightly weird matchup, InferenceRequest and InferenceResult. I think this is better, but agree Result is weird, but it feels like we can't do Response


async def init(
self, prompt: base_text_env.ConversationType
Expand All @@ -104,7 +106,8 @@ async def init(
await self.env.__aenter__()
ts = await self.env.reset()

return ts.observation.messages, {} # type: ignore
assert ts.observation is not None
return open_responses.to_chat_messages(ts.observation, strict=True), {}

async def step(self, action: str) -> base_text_env.BaseTextEnvStepOutput:
"""Runs one environment step.
Expand All @@ -119,18 +122,17 @@ async def step(self, action: str) -> base_text_env.BaseTextEnvStepOutput:
"""
assert self.env is not None

llm_resp = llms.LLMResponse(
data=[llms.TextData(content=action)],
llm_resp = llms.InferenceResult(
response=llms.make_response(action),
cost=0.0,
usage=llms.Usage(prompt_tokens=-1, generated_tokens=-1),
)
ts = await self.env.step(llm_resp)

if ts.last():
# Hack to approximate a context manager
await self.env.__aexit__(None, None, None)

msgs = [] if ts.last() else ts.observation.messages
msgs = [] if ts.last() else open_responses.to_chat_messages(ts.observation, strict=True)
return base_text_env.BaseTextEnvStepOutput(
observations=msgs,
reward=ts.reward or 0.0,
Expand Down
19 changes: 10 additions & 9 deletions examples/05_tinker_train.py
Original file line number Diff line number Diff line change
Expand Up @@ -49,8 +49,10 @@
import ares
from ares import containers
from ares import llms
from ares.llms import open_responses
import chz
import frozendict
from linguafranca import types as lft
import numpy as np
Comment on lines 49 to 56

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we move from ares.llms import open_responses and from linguafranca import types as lft after the third-party imports so all third-party imports appear before ARES/local imports per CLAUDE.md and CONTRIBUTING.md?

Finding type: AI Coding Guidelines | Severity: 🟢 Low


Want Baz to fix this for you? Activate Fixer

Other fix methods

Fix in Cursor

Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
examples/05_tinker_train.py around lines 49-56, the local ARES imports (import ares,
from ares import containers, from ares import llms, from ares.llms import
open_responses) are placed among third-party imports. Reorder the imports so all
third-party imports (chz, frozendict, from linguafranca import types as lft, numpy,
tinker, from tinker_cookbook import cli_utils) appear together first, then add a blank
line, and place the ARES/local imports after them. Preserve existing import names and
aliases and ensure the file follows stdlib → third-party → local grouping with a
blank line separating groups.

import tinker
from tinker_cookbook import cli_utils
Expand Down Expand Up @@ -109,8 +111,8 @@ class TinkerCompatibleEnv(tinker_types.Env):
"""Adapter wrapping ARES environments to work with Tinker's RL training loop.

Handles bidirectional conversion:
- ARES LLMRequest -> Tinker ModelInput (tokenized prompts)
- Tinker Action (text) -> ARES LLMResponse
- ARES Open Responses request -> Tinker ModelInput (tokenized prompts)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Include actual class name here

- Tinker Action (text) -> ARES InferenceResult
- ARES TimeStep -> Tinker StepResult

This enables using any ARES environment with Tinker's training infrastructure.
Expand All @@ -121,7 +123,7 @@ class TinkerCompatibleEnv(tinker_types.Env):

def __init__(
self,
env: ares.Environment[llms.LLMResponse, llms.LLMRequest, float, float],
env: ares.Environment[llms.InferenceResult, lft.OpenResponsesRequest, float, float],

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly looking at the change, I do kind of like llms|ares.LLMResponse, llms|ares.LLMRequest better than the new names. If we do the alias from above, IMO LLMResponse and LLMRequest as type/class names keep things semantically simpler (and less verbose)

renderer: renderers.Renderer,
convo_prefix: list[renderers.Message] | None,
max_tokens: int,
Expand All @@ -132,14 +134,14 @@ def __init__(
self.max_tokens = max_tokens

def _get_tinker_observation(
self, ts: ares.TimeStep[llms.LLMRequest | None, float, float]
self, ts: ares.TimeStep[lft.OpenResponsesRequest | None, float, float]
) -> tinker_types.Observation:
if ts.observation is None:
return tinker.ModelInput.empty()

messages = self.convo_prefix + [
renderers.Message(role=message["role"], content=message["content"]) # type: ignore
for message in ts.observation.messages
for message in open_responses.to_chat_messages(ts.observation, strict=True)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove strict=True from all the examples? Feels like it makes it slightly harder to follow, and it is already the default behavior

]
model_input = self.renderer.build_generation_prompt(messages)

Expand All @@ -149,15 +151,14 @@ def _get_tinker_observation(

return model_input

def _get_ares_action(self, action: tinker_types.Action) -> llms.LLMResponse:
def _get_ares_action(self, action: tinker_types.Action) -> llms.InferenceResult:
message, parse_success = self.renderer.parse_response(action)
if not parse_success:
_LOGGER.warning("Failed to parse response: %s", message)

return llms.LLMResponse(
data=[llms.TextData(content=_get_text_content(message))],
return llms.InferenceResult(
response=llms.make_response(_get_text_content(message)),
cost=0.0,
usage=llms.Usage(prompt_tokens=-1, generated_tokens=-1),
)

@property
Expand Down
Loading
Loading