Agent Traces Pipeline #565

Draft pull request: wants to merge 37 commits into base: main.

Commits (37)
e7df036 Start agent traces (aymeric-roucher, Feb 24, 2025)
6d0963e Working local version with o1 (aymeric-roucher, Feb 25, 2025)
a6f5a15 Update api addr (aymeric-roucher, Feb 26, 2025)
38bfa93 Increase concurrent requests (aymeric-roucher, Feb 26, 2025)
7d9fc6e Update sbatch params (aymeric-roucher, Feb 26, 2025)
7a1fb98 Add conda activation (aymeric-roucher, Feb 26, 2025)
1a7becf Use local model (aymeric-roucher, Feb 26, 2025)
f35337e 128 concurrent (aymeric-roucher, Feb 26, 2025)
28bc464 Log (aymeric-roucher, Feb 26, 2025)
319ae52 Add conda init (aymeric-roucher, Feb 26, 2025)
69d55f6 Fix slurm script (aymeric-roucher, Feb 26, 2025)
c8aa2c4 Add await (aymeric-roucher, Feb 26, 2025)
6df6161 Try fixing async func (aymeric-roucher, Feb 26, 2025)
b402450 Add stop sequences (aymeric-roucher, Feb 26, 2025)
b2996c1 Add port (aymeric-roucher, Feb 27, 2025)
f6f138b Make synchronous (aymeric-roucher, Feb 28, 2025)
23c2128 Small adapts to script (aymeric-roucher, Feb 28, 2025)
52ac4e2 More detailed error logging (aymeric-roucher, Feb 28, 2025)
0adc082 Even more detailed request error logging (aymeric-roucher, Feb 28, 2025)
884c8e9 Reduce context length (aymeric-roucher, Feb 28, 2025)
64ae551 Add token counting (aymeric-roucher, Feb 28, 2025)
2e7d1da Fix message roles an add token counting (aymeric-roucher, Feb 28, 2025)
7bcb96e Add dummy completion (aymeric-roucher, Feb 28, 2025)
28afbef Test (aymeric-roucher, Feb 28, 2025)
5ed2005 Running with gpt-4o (aymeric-roucher, Feb 28, 2025)
ce7d8bd Update timeouts (aymeric-roucher, Feb 28, 2025)
6a9db1b Adjust (aymeric-roucher, Feb 28, 2025)
e245aa0 Flatten messages (aymeric-roucher, Feb 28, 2025)
b6de9cb Prompt more around testing the function (aymeric-roucher, Feb 28, 2025)
9cdf0d9 Improve explanations in prompt (aymeric-roucher, Feb 28, 2025)
ef3f888 Also store final outputs (aymeric-roucher, Mar 13, 2025)
91e4dc1 wip(generate + eda): working generation + add initial eda (baptistecolle, Mar 31, 2025)
5d7205d feat(eda): uploaded dataset for training (baptistecolle, Mar 31, 2025)
c1cea15 feat(train): added training recipe for agentic traces (baptistecolle, Mar 31, 2025)
8cc3983 fix(deps): fix smolagent dep (baptistecolle, Mar 31, 2025)
3b021de fix(deps): fix smolagent dep (baptistecolle, Mar 31, 2025)
2fbac03 fix: remove uncessary changes (baptistecolle, Mar 31, 2025)
Changes from 1 commit: Also store final outputs
aymeric-roucher authored and baptistecolle committed Mar 31, 2025
commit ef3f888a99c1212020316ef350cd107c5b42557c
scripts/generate_agent_traces.py (30 changes: 15 additions & 15 deletions)
@@ -36,8 +36,9 @@ class ModifiedFinalAnswerTool(Tool):
     output_type = "string"

     def forward(self, answer_function: Any) -> str:
-        print("USING MODIFIED FINAL ANSWER TOOL")
-        return inspect.getsource(answer_function)
+        source_code = inspect.getsource(answer_function)
+        print("USING MODIFIED FINAL ANSWER TOOL, got source code:\n", source_code)
+        return source_code

     def __init__(self, *args, **kwargs):
         self.is_initialized = False
@@ -110,19 +111,18 @@ def model(messages, stop_sequences = None):
         max_steps=10,
         verbosity_level=2
     )

     try:
         output = agent.run(task)
         print("GOT OUTPUT:", output)
-        return agent.write_memory_to_messages()
+        return agent.write_memory_to_messages(), output
     except Exception as e:
         print(f"Error when generating agentic trace: {e}")
         return None

 def process_example(example, session, args, output_file, pbar=None):
     prompt = f"""Here is a task to solve using a function:
 {example[args.prompt_column]}

 Now write a function that solves the problem, test it and return it using final_answer(your_function).
 The function should take the inputs described in the task above, using them in this way: the function will be passed the 'lines' described in the task as different arguments.
 For instance:
@@ -132,29 +132,28 @@ def process_example(example, session, args, output_file, pbar=None):
 ALWAYS RUN THE FUNCTION IN A CODE SNIPPET WITH TEST CASES BEFORE RETURNING IT.
 """
     try:
-        agent_runs = []
+        agent_outputs, agent_memories = [], []
         for _ in range(args.num_generations):
-            agent_run = get_agent_run(session, prompt, args)
-            agent_runs.append(agent_run)
+            agent_output, agent_memory = get_agent_run(session, prompt, args)
+            agent_outputs.append(agent_output)
+            agent_memories.append(agent_memory)

-        if any(agent_run is None for agent_run in agent_runs):
+        if any(agent_output is None for agent_output in agent_outputs):
             print("Error processing example")
             if pbar:
                 pbar.update(1)
             return None

-        generations = []
         finish_reasons = []
         api_metadata = []

-        for agent_run in agent_runs:
-            generations.append(agent_run)
+        for agent_run in agent_output:
             finish_reasons.append(None)
             api_metadata.append(None)

         # Convert agent_run to a serializable format
         serializable_generations = []
-        for generation in generations:
+        for generation in agent_memories:
             if generation is not None:
                 # Convert to a simple list of dictionaries if it's not already
                 if isinstance(generation, list):
@@ -167,11 +166,12 @@ def process_example(example, session, args, output_file, pbar=None):
                     serializable_generations.append(str(generation))
             else:
                 serializable_generations.append(None)

         # Combine original dataset fields with generations
         result = {
             **example,  # Preserve all original dataset fields
             "generations": serializable_generations,
+            "final_outputs": agent_outputs,
             "finish_reasons": finish_reasons,
             "api_metadata": api_metadata,
         }
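
In short, this commit changes get_agent_run to return a (memory_messages, final_output) pair instead of only the memory, and process_example now collects both per generation. Below is a minimal sketch of that contract; the wrapper function names, the agent and task arguments, and num_generations are assumptions for illustration, not code taken from the PR.

# Sketch only: mirrors the return shape introduced in this commit, with
# hypothetical wrappers around it. `agent` is assumed to be a smolagents
# CodeAgent configured as in scripts/generate_agent_traces.py.
def get_agent_run_sketch(agent, task):
    try:
        output = agent.run(task)
        # New in this commit: hand back the final output next to the memory messages.
        return agent.write_memory_to_messages(), output
    except Exception as e:
        print(f"Error when generating agentic trace: {e}")
        return None


def collect_generations(agent, task, num_generations):
    # Mirrors process_example: one (memory, output) pair per generation,
    # keeping None as a sentinel for failed runs.
    agent_memories, agent_outputs = [], []
    for _ in range(num_generations):
        run = get_agent_run_sketch(agent, task)
        memory, output = run if run is not None else (None, None)
        agent_memories.append(memory)
        agent_outputs.append(output)
    return agent_memories, agent_outputs

Downstream, per the diff, the memories are serialized into the existing "generations" field while the raw outputs go into the new "final_outputs" field of the result dict.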