Commit c324f5d

update readme
1 parent 04c62a7 commit c324f5d

1 file changed: +14 -15 lines

README.md (14 additions & 15 deletions)
```diff
@@ -5,10 +5,10 @@
 
 Flare AI SDK for Consensus Learning.
 
-### 🚀 Key Features
+## 🚀 Key Features
 
 - **Consensus Learning Implementation**
-  A Python implementation of single-node, multi-model Consensus Learning (CL). CL is a decentralized ensemble learning paradigm introduced in [arXiv:2402.16157](https://arxiv.org/abs/2402.16157).
+  A Python implementation of single-node, multi-model Consensus Learning (CL). CL is a decentralized ensemble learning paradigm introduced in [arXiv:2402.16157](https://arxiv.org/abs/2402.16157), which is now being generalized to large language models (LLMs).
 
 - **300+ LLM Support**
   Leverages OpenRouter to access over 300 models via a unified interface.
```
````diff
@@ -122,18 +122,18 @@ Deploy on a [Confidential Space](https://cloud.google.com/confidential-computing
 
 ### Prerequisites
 
-- **Google Cloud Platform Account:**
+- **Google Cloud Platform Account:**
   Access to the [`verifiable-ai-hackathon`](https://console.cloud.google.com/welcome?project=verifiable-ai-hackathon) project is required.
 
-- **OpenRouter API Key:**
+- **OpenRouter API Key:**
   Ensure your [OpenRouter API key](https://openrouter.ai/settings/keys) is in your `.env`.
 
-- **gcloud CLI:**
+- **gcloud CLI:**
   Install and authenticate the [gcloud CLI](https://cloud.google.com/sdk/docs/install).
 
 ### Environment Configuration
 
-1. **Set Environment Variables:**
+1. **Set Environment Variables:**
   Update your `.env` file with:
 
   ```bash
````
````diff
@@ -217,28 +217,27 @@ If you encounter issues, follow these steps:
   gcloud compute instances get-serial-port-output $INSTANCE_NAME --project=verifiable-ai-hackathon
   ```
 
-2. **Verify API Key(s):**
+2. **Verify API Key(s):**
   Ensure that all API Keys are set correctly (e.g. `OPEN_ROUTER_API_KEY`).
 
-3. **Check Firewall Settings:**
+3. **Check Firewall Settings:**
   Confirm that your instance is publicly accessible on port `80`.
 
 ## 💡 Next Steps
 
 - **Security & TEE Integration:**
   - Ensure execution within a Trusted Execution Environment (TEE) to maintain confidentiality and integrity.
-- **Factual correctness**:
+- **Factual Correctness**:
   - In line with the main theme of the hackathon, one important aspect of the outputs generated by the LLMs is their accuracy. In this regard, producing sources/citations with the answers would lead to higher trust in the setup. Sample prompts that can be used for this purpose can be found in the appendices of [arXiv:2305.14627](https://arxiv.org/pdf/2305.14627), or in [James' Coffee Blog](https://jamesg.blog/2023/04/02/llm-prompts-source-attribution).
   - _Note_: only certain models may be suitable for this purpose, as references generated by LLMs are often inaccurate or not even real!
-- **Prompt engineering**:
+- **Prompt Engineering**:
   - Our approach is very similar to the **Mixture-of-Agents (MoA)** introduced in [arXiv:2406.04692](https://arxiv.org/abs/2406.04692), which uses iterative aggregations of model responses. Their [GitHub repository](https://github.com/togethercomputer/MoA) does include other examples of prompts that can be used for additional context for the LLMs.
   - New iterations of the consensus learning algorithm could have different prompts for improving the previous responses. In this regard, the _few-shot_ prompting techniques introduced by OpenAI in [arXiv:2005.14165](https://arxiv.org/pdf/2005.14165) work by providing models with a _few_ examples of similar queries and responses in addition to the initial prompt. (See also previous work by [Radford et al.](https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf).)
   - _Chain of Thought_ prompting techniques are a linear problem solving approach where each step builds upon the previous one. Google's approach in [arXiv:2201.11903](https://arxiv.org/pdf/2201.11903) is to augment each prompt with an additional example and chain of thought for an associated answer. (See the paper for multiple examples.)
-- **Dynamic resource allocation**:
+- **Dynamic resource allocation and Semantic Filters**:
   - An immediate improvement to the current approach would be to use dynamically-adjusted parameters. Namely, the number of iterations and number of models used in the algorithm could be adjusted to the input prompt: _e.g._ simple prompts do not require too many resources. For this, a centralized model could be used to decide the complexity of the task, prior to sending the prompt to the other LLMs.
-  - On a similar note, the number of iterations for making progress could be adjusted according to how _different_ the model responses are. While semantic entailment for LLM outputs is a notoriously difficult topic, the use of [LLM-as-a-Judge](https://arxiv.org/pdf/2306.05685) for evaluating other LLM outputs has shown good progress -- see also this [Confident AI blogpost](https://www.confident-ai.com/blog/why-llm-as-a-judge-is-the-best-llm-evaluation-method).
-- **Semantic filters**:
-  - In line with the previously mentioned LLM-as-a-Judge, a model could potentially be used for filtering _bad_ responses.
-  - LLM-Blender, for instance, introduced in [arXiv:2306.02561](https://arxiv.org/abs/2306.02561), uses a PairRanker that achieves a ranking of outputs through pairwise comparisons via a _cross-attention encoder_.
+  - On a similar note, the number of iterations for making progress could be adjusted according to how _different_ the model responses are. Semantic entailment for LLM outputs is an active field of research, but a rather quick solution is to rely on _embeddings_. [TBC]
+    The use of [LLM-as-a-Judge](https://arxiv.org/pdf/2306.05685) for evaluating other LLM outputs has shown good progress -- see also this [Confident AI blogpost](https://www.confident-ai.com/blog/why-llm-as-a-judge-is-the-best-llm-evaluation-method).
+  - In line with the previously mentioned LLM-as-a-Judge, a model could potentially be used for filtering _bad_ responses. LLM-Blender, for instance, introduced in [arXiv:2306.02561](https://arxiv.org/abs/2306.02561), uses a PairRanker that achieves a ranking of outputs through pairwise comparisons via a _cross-attention encoder_.
 - **AI Agent Swarm**:
   - The structure of the reference CL implementation can be changed to adopt _swarm_-type algorithms, where tasks are broken down and distributed among specialized agents for parallel processing. In this case a centralized LLM would act as an orchestrator for managing distribution of tasks -- see _e.g._ [swarms repo](https://github.com/kyegomez/swarms).
````
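The embeddings-based stopping idea in the diff above (run further CL iterations only while model responses still disagree) can be sketched in a few lines. This is a toy illustration, not part of the SDK: `embed`, `needs_another_iteration`, and the `0.8` threshold are hypothetical names and values chosen here, and the bag-of-words vectors merely stand in for a real embedding model.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: bag-of-words token counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def needs_another_iteration(responses: list[str], threshold: float = 0.8) -> bool:
    # Trigger another CL round only if some pair of model responses
    # is still too dissimilar (any pairwise similarity below threshold).
    vecs = [embed(r) for r in responses]
    return any(cosine(x, y) < threshold for x, y in combinations(vecs, 2))

answers = [
    "Paris is the capital of France.",
    "The capital of France is Paris.",
    "Berlin is the capital of Germany.",
]
print(needs_another_iteration(answers))  # prints True: the divergent third answer forces another round
```

In a real deployment the `embed` function would call an embedding endpoint (e.g. one of the models exposed via OpenRouter), and the threshold would need tuning per task.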
