Replies: 1 comment 4 replies
-
Your understanding is correct that this setup deals with large contexts by iterating over text chunks and refining the response iteratively.
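In rough pseudocode, that pattern looks something like this (a simplified sketch rather than the actual Refine implementation; `complete` stands in for an LLM call that maps a prompt string to a completion string):

```python
def iterate_and_refine(query_str, text_chunks, complete):
    """Simplified sketch: answer from the first chunk, then refine the
    answer with each additional chunk."""
    answer = None
    for chunk in text_chunks:
        if answer is None:
            # First chunk: produce an initial answer (QA-style prompt).
            prompt = f"Context:\n{chunk}\n\nAnswer the question: {query_str}"
        else:
            # Later chunks: ask the model to update the existing answer.
            prompt = (
                f"Existing answer:\n{answer}\n\n"
                f"New context:\n{chunk}\n\n"
                f"Refine the answer to {query_str!r} if the new context "
                "helps; otherwise repeat the existing answer."
            )
        answer = complete(prompt)
    return answer
```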
-
I am using a slightly customized CondensePlusContextChatEngine and am trying to understand how it works once I call the "chat" method.
I am trying to follow all the classes and subclasses being initialized, but to be honest there is a high level of abstraction which makes it hard for me to follow.
Could you support me in confirming my understanding and maybe answer a few additional questions?
If I understand correctly, the first thing the chat method does is call the run_c3 method, which takes care of condensation and retrieval and ultimately leads to getting the response synthesizer, where things get a bit murky for me...
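Put differently, my mental model of what chat does is roughly the following (hypothetical pseudocode; the structure and helper names are my own, not the real internals):

```python
def chat_flow_sketch(message, chat_history, retriever, synthesizer, complete):
    """My mental model of the chat flow, not the actual implementation.
    `complete` stands in for an LLM call, `retriever` and `synthesizer`
    for the engine's retriever and response synthesizer."""
    # 1. Condense the new message plus the chat history into a
    #    standalone question.
    standalone_question = complete(
        f"Chat history:\n{chat_history}\n\n"
        f"Follow-up message: {message}\n\n"
        "Rewrite the follow-up message as a standalone question."
    )
    # 2. Retrieve nodes relevant to the standalone question.
    nodes = retriever.retrieve(standalone_question)
    # 3. Hand the question and the retrieved nodes to the response
    #    synthesizer, which produces the final answer.
    return synthesizer.synthesize(standalone_question, nodes)
```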
The function get_prefix_messages_with_context seems to be called twice, once for context and once for refinement, and I do not clearly understand the need for refinement here.
This call leads to the initialization of the CompactAndRefine class with the above-mentioned templates.
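My current guess is that the two calls simply build two different prompt templates, one for the initial answer and one for refinement, along these lines (the wording below is made up for illustration and is not the actual default prompts):

```python
# Illustrative only: my guess at why two separate templates are needed.
text_qa_template = (
    "Context information is below.\n"
    "{context_str}\n"
    "Given the context information, answer the question: {query_str}\n"
)

refine_template = (
    "The original question is: {query_str}\n"
    "We have an existing answer: {existing_answer}\n"
    "Here is some additional context:\n{context_msg}\n"
    "Refine the existing answer if the new context is useful; "
    "otherwise return the existing answer unchanged.\n"
)
```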
Ultimately CompactAndRefine's synthesize method is called, which is inherited through the chain CompactAndRefine -> Refine -> BaseSynthesizer.
And if I understand correctly this is where the magic happens, with get_response, which is declared as an abstract method on the base synthesizer class but whose main implementation here comes from the Refine class.
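To picture the hierarchy, this is roughly how I think the classes relate (a stripped-down sketch with made-up names, not the real source):

```python
from abc import ABC, abstractmethod

class BaseSynthesizerSketch(ABC):
    """Stand-in for BaseSynthesizer: owns the public synthesize entry point."""

    def synthesize(self, query, text_chunks):
        # Common bookkeeping would live here; the interesting work is
        # delegated to the subclass-specific get_response.
        return self.get_response(query, text_chunks)

    @abstractmethod
    def get_response(self, query, text_chunks): ...

class RefineSketch(BaseSynthesizerSketch):
    """Stand-in for Refine: answer on the first chunk, refine on the rest."""

    def get_response(self, query, text_chunks):
        answer = None
        for chunk in text_chunks:
            # placeholder for _give_response_single / _refine_response_single
            answer = f"answer to {query!r} after seeing: {chunk[:30]}..."
        return answer

class CompactAndRefineSketch(RefineSketch):
    """Stand-in for CompactAndRefine: pack chunks first, then refine."""

    def get_response(self, query, text_chunks):
        compacted = text_chunks  # placeholder for the repacking step
        return super().get_response(query, compacted)
```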
In this method we start going over some "text chunks", and if I understand correctly these text chunks are the nodes passed to the chat method with only their content taken.
Now it seems we iterate over every text chunk, calling _give_response_single for the first chunk and _refine_response_single for each subsequent one.
This whole setup is quite confusing to me, and I am not fully clear on why this iteration is there, how it helps, how I can change it, and how to control it. My assumption is that it deals with large contexts: if the entire context cannot fit when generating an answer, the engine goes over it iteratively and updates the answer based on each additional chunk (node).
However, I do not see where it checks what my context window is and whether the context can actually fit. Can someone please help shed some light on the logic, especially the last step, _refine_response_single, which seems to do most of the work :)
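To make that last part concrete, this is the kind of packing step I would have expected to see somewhere before the refine loop (purely hypothetical pseudocode with my own names, using a character budget as a stand-in for a token budget):

```python
def pack_chunks_sketch(text_chunks, max_chars=4000):
    """Hypothetical packing step: merge small chunks into larger ones so
    fewer LLM calls are needed, while keeping each packed chunk under a
    budget (a real implementation would budget tokens, not characters)."""
    packed, current = [], ""
    for chunk in text_chunks:
        candidate = f"{current}\n\n{chunk}" if current else chunk
        if len(candidate) > max_chars and current:
            packed.append(current)
            current = chunk
        else:
            current = candidate
    if current:
        packed.append(current)
    return packed
```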