Commit 3c1dc51

update documentation

1 parent 0b19cf4 commit 3c1dc51

File tree

4 files changed: +16 additions, -3 deletions

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -165,7 +165,7 @@ cython_debug/
 *.pdf
 *.svg
 # *.jpeg
-*.png
+# *.png
 *.bmp

 ### VirtualEnv template

README.md

Lines changed: 2 additions & 2 deletions
@@ -249,8 +249,8 @@ If you encounter issues, follow these steps:
 - _Chain of Thought_ prompting techniques are a linear problem-solving approach where each step builds upon the previous one. Google's approach in [arXiv:2201.11903](https://arxiv.org/pdf/2201.11903) is to augment each prompt with an additional example and chain of thought for an associated answer. (See the paper for multiple examples.)
 - **Dynamic resource allocation and Semantic Filters**:
   - An immediate improvement to the current approach would be to use dynamically adjusted parameters. Namely, the number of iterations and number of models used in the algorithm could be tuned to the input prompt: _e.g._ simple prompts do not require many resources. For this, a centralized model could be used to assess the complexity of the task before sending the prompt to the other LLMs.
-  - On a similar note, the number of iterations for making progress could adjusted according to how _different_ are the model responses. Semantic entailment for LLM outputs is an active field of research, but a rather quick solution is to rely on _embeddings_. [TBC]
-the use of [LLM-as-a-Judge](https://arxiv.org/pdf/2306.05685) for evaluating other LLM outputs has shown good progress -- see also this [Confident AI blogpost](https://www.confident-ai.com/blog/why-llm-as-a-judge-is-the-best-llm-evaluation-method).
+  - On a similar note, the number of iterations for making progress could be adjusted according to how _different_ the model responses are. Semantic entailment for LLM outputs is an active field of research, but a rather quick solution is to rely on _embeddings_. These are commonly used in RAG pipelines, and could also be used here with _e.g._ cosine similarity. You can get started with [GCloud's text embeddings](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings) -- see [flare-ai-rag](https://github.com/flare-foundation/flare-ai-rag/tree/main) for more details.
+  - The use of [LLM-as-a-Judge](https://arxiv.org/pdf/2306.05685) for evaluating other LLM outputs has shown good progress -- see also this [Confident AI blogpost](https://www.confident-ai.com/blog/why-llm-as-a-judge-is-the-best-llm-evaluation-method).
 - In line with the previously mentioned LLM-as-a-Judge, a model could potentially be used for filtering _bad_ responses. LLM-Blender, for instance, introduced in [arXiv:2306.02561](https://arxiv.org/abs/2306.02561), uses a PairRanker that achieves a ranking of outputs through pairwise comparisons via a _cross-attention encoder_.
 - **AI Agent Swarm**:
   - The structure of the reference CL implementation can be changed to adopt _swarm_-type algorithms, where tasks are broken down and distributed among specialized agents for parallel processing. In this case a centralized LLM would act as an orchestrator for managing distribution of tasks -- see _e.g._ [swarms repo](https://github.com/kyegomez/swarms).
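The embedding-based convergence check described in the README changes above can be sketched as follows. This is a minimal sketch, not the repo's implementation: the plain float vectors stand in for embeddings returned by a real API (e.g. GCloud's text embeddings), and the `0.9` threshold is an arbitrary assumption.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def responses_converged(embeddings: list[list[float]], threshold: float = 0.9) -> bool:
    """Return True once every pair of model-response embeddings is at least
    `threshold`-similar, signalling that further improvement rounds can stop."""
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine_similarity(embeddings[i], embeddings[j]) < threshold:
                return False
    return True
```

The consensus loop could call `responses_converged` after each improvement round and exit early, instead of always running a fixed number of iterations.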

src/README.md

Lines changed: 13 additions & 0 deletions
@@ -3,6 +3,19 @@

 # Flare AI Consensus

+## flare-ai-consensus Pipeline
+
+The flare-ai-consensus template consists of the following components:
+
+* **Router:** The primary interface that receives user requests, distributes them to the various AI models, and collects their intermediate responses.
+* **Aggregator:** Synthesizes multiple model responses into a single, coherent output.
+* **Consensus Layer:** Defines the logic for the consensus algorithm. The reference implementation is set up in the following steps:
+  * The initial prompt is sent to a set of models, with additional system instructions.
+  * Initial responses are aggregated by the Aggregator.
+  * Improvement rounds follow, where aggregated responses are sent as additional context or system instructions to the models.
+
+<img width="500" alt="flare-ai-consensus" src="./cl_pipeline.png" />
+
 ## OpenRouter Clients

 We implement two OpenRouter clients for interacting with the OpenRouter API: a standard sync client and an asynchronous client.
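The pipeline steps added in this commit (initial prompts, aggregation, improvement rounds) can be sketched as a plain loop. This is a hypothetical outline, not the template's actual code: `query_model` and `aggregate` are stand-in callables for the OpenRouter client calls and the Aggregator component.

```python
from typing import Callable, List


def run_consensus(
    prompt: str,
    models: List[str],
    query_model: Callable[[str, str], str],  # (model, prompt) -> response
    aggregate: Callable[[List[str]], str],   # responses -> aggregated answer
    rounds: int = 2,
) -> str:
    """Send the prompt to each model, aggregate the responses, then run
    improvement rounds where the aggregated answer is fed back as context."""
    responses = [query_model(m, prompt) for m in models]
    aggregated = aggregate(responses)
    for _ in range(rounds):
        followup = f"{prompt}\n\nAggregated answer so far:\n{aggregated}"
        responses = [query_model(m, followup) for m in models]
        aggregated = aggregate(responses)
    return aggregated
```

In the actual template the Router plays the role of the fan-out step and the Consensus Layer decides how many rounds to run and how the aggregated answer is injected (as extra context or as system instructions).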

src/cl_pipeline.png

23.8 KB
