How should I structure data for fine-tuning a model in a groupchat setting? #729
Unanswered
PyroGenesis
asked this question in Q&A
Replies: 2 comments
- @afourney fyi
- I am also looking for an answer to this.
I'm trying to fine-tune some LLM agents so that they perform a specialized task. However, I am unsure how best to convert an AutoGen groupchat conversation into a training format.
For example, let's say I'm working with the following conversation:
Now let's say I'm trying to fine-tune Agent 3 here, which is backed by a Llama 2 model.
Llama 2 model format:
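The exact template wasn't captured above, but Llama 2's chat models expect each user/assistant exchange wrapped in `[INST]`/`[/INST]` tokens, with an optional system prompt inside `<<SYS>>` tags. A minimal sketch (the helper name is mine, not part of any library):

```python
def format_llama2_turn(user_msg: str, assistant_msg: str, system: str = "") -> str:
    """Render one user/assistant exchange in the Llama 2 chat format.

    A system prompt, if given, goes inside <<SYS>> tags at the start of the
    first [INST] block; the assistant reply follows [/INST] and ends with </s>.
    """
    sys_block = f"<<SYS>>\n{system}\n<</SYS>>\n\n" if system else ""
    return f"<s>[INST] {sys_block}{user_msg} [/INST] {assistant_msg} </s>"
```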
How should I go about structuring this into training data? The 3 ways I've thought about are:
Option 1: Treat other Agent responses as separate user responses
Option 2: Combine other Agent responses into single User response
Option 3: Treat other Agent responses as AI responses
Do any of these look viable? Or should I try to roll my own format, which would be more compatible with groupchat conversations?
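For anyone comparing these, here is a rough sketch of the three options side by side. The transcript is invented for illustration; I'm only assuming that, as in AutoGen, each groupchat message is a dict with `name` and `content` keys:

```python
# Hypothetical groupchat transcript (AutoGen messages carry "name"/"content").
chat = [
    {"name": "Agent1", "content": "Plan the task."},
    {"name": "Agent2", "content": "Here is a draft plan."},
    {"name": "Agent3", "content": "I refined the plan."},  # the agent being tuned
]

TARGET = "Agent3"

def option1(messages):
    """Option 1: every other agent's message is its own separate user turn."""
    return [
        {"role": "assistant" if m["name"] == TARGET else "user",
         "content": m["content"]}
        for m in messages
    ]

def option2(messages):
    """Option 2: merge consecutive non-target messages into one user turn,
    prefixing each with the speaker's name so attribution isn't lost."""
    out = []
    for m in messages:
        if m["name"] == TARGET:
            out.append({"role": "assistant", "content": m["content"]})
        else:
            text = f'{m["name"]}: {m["content"]}'
            if out and out[-1]["role"] == "user":
                out[-1]["content"] += "\n" + text
            else:
                out.append({"role": "user", "content": text})
    return out

def option3(messages):
    """Option 3: treat every message, regardless of speaker, as an AI turn."""
    return [{"role": "assistant", "content": m["content"]} for m in messages]
```

Under Option 2 the example collapses to a single user turn ("Agent1: ...\nAgent2: ...") followed by Agent 3's assistant turn, which maps cleanly onto the single `[INST] ... [/INST]` pair Llama 2 expects per exchange; Options 1 and 3 require the template to tolerate consecutive same-role turns.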