
Commit b2d8b0b

chore: versions v0.0.20

1 parent aac10d2 commit b2d8b0b

File tree: 2 files changed (+237, −149 lines)

README-zh-CN.md: 119 additions & 75 deletions
- [llama-node](#llama-node)
  - [Introduction](#introduction)
  - [Install](#install)
  - [Getting the model](#getting-the-model)
    - [Model versions](#model-versions)
  - [Usage (llama.cpp backend)](#usage-llamacpp-backend)
  - [Usage (llama-rs backend)](#usage-llama-rs-backend)
    - [Inference](#inference)
    - [Tokenization](#tokenization)
    - [Embedding](#embedding)
  - [Performance](#performance)
## Introduction

This is a Node.js client library for the Llama LLM (and some related models), built on [llama-rs](https://github.com/rustformers/llama-rs) and [llm-chain-llama-sys](https://github.com/sobelio/llm-chain/tree/main/llm-chain-llama/sys). It uses [napi-rs](https://github.com/napi-rs/napi-rs) to pass messages between node.js and the llama threads.

Starting from v0.0.20, both the llama-rs and llama.cpp backends are supported.

Currently supported platforms:

- darwin-x64
---

## Install

- Install the core package

```bash
npm install llama-node
```

- Install the llama-rs backend

```bash
npm install @llama-node/core
```

- Install the llama.cpp backend

```bash
npm install @llama-node/llama-cpp
```
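As a quick sanity check that all three packages resolved, a script along the following lines (a sketch; the import paths are the ones used in the usage examples below) should run without throwing:

```typescript
// sanity-check.ts (hypothetical file name): verify that the core
// package and both backend packages can be imported.
import { LLama } from "llama-node";
import { LLamaRS } from "llama-node/dist/llm/llama-rs";
import { LLamaCpp } from "llama-node/dist/llm/llama-cpp";

console.log(typeof LLama, typeof LLamaRS, typeof LLamaCpp);
```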
---
## Getting the model

Under the hood, llama-node calls llama-rs, whose model format is derived from llama.cpp. Since Meta released the model for research testing only, this project does not provide model downloads. If you have obtained the original **.pth** weights, please read the [Getting the weights](https://github.com/rustformers/llama-rs#getting-the-weights) documentation and convert them with the convert tool provided by llama-rs.
- GGMF: also a legacy format, newer than GGML but older than GGJT.
- GGJT: the format that supports mmap.

The llama-rs backend currently supports only GGML/GGMF models, while the llama.cpp backend supports only GGJT models.
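If it is unclear which format a given file uses, the first four bytes of the file identify it. The following standalone sketch is not part of llama-node; the magic values are an assumption taken from the upstream ggml/llama.cpp file formats (read as a little-endian uint32):

```typescript
import fs from "fs";

// Magic values of the ggml model-file family, read as a little-endian
// uint32 from the first four bytes of the file (an assumption based on
// the upstream llama.cpp loader, not a llama-node API).
const MAGICS: Record<number, string> = {
    0x67676d6c: "GGML (legacy, not mmap-able)",
    0x67676d66: "GGMF (legacy, versioned)",
    0x67676a74: "GGJT (mmap-able)",
};

const detectFormat = (file: string): string => {
    const fd = fs.openSync(file, "r");
    const buf = Buffer.alloc(4);
    fs.readSync(fd, buf, 0, 4, 0);
    fs.closeSync(fd);
    return MAGICS[buf.readUInt32LE(0)] ?? "unknown";
};

console.log(detectFormat("./ggml-alpaca-7b-q4.bin"));
```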
---

## Usage (llama.cpp backend)

The current version supports only a single inference session per LLama instance at a time.

If you want to run multiple inference sessions concurrently, you need to create multiple LLama instances.

The llama.cpp backend currently supports inference only; embedding and tokenization will come in a later update.

```typescript
import { LLama } from "llama-node";
import { LLamaCpp, LoadConfig } from "llama-node/dist/llm/llama-cpp";
import path from "path";

const model = path.resolve(process.cwd(), "./ggml-vicuna-7b-4bit-rev1.bin");

const llama = new LLama(LLamaCpp);

const config: LoadConfig = {
    path: model,
    enableLogging: true,
    nCtx: 1024,
    nParts: -1,
    seed: 0,
    f16Kv: false,
    logitsAll: false,
    vocabOnly: false,
    useMlock: false,
    embedding: false,
};

llama.load(config);

const template = `How are you`;

const prompt = `### Human:

${template}

### Assistant:`;

llama.createCompletion(
    {
        nThreads: 4,
        nTokPredict: 2048,
        topK: 40,
        topP: 0.1,
        temp: 0.2,
        repeatPenalty: 1,
        stopSequence: "### Human",
        prompt,
    },
    (response) => {
        process.stdout.write(response.token);
    }
);
```
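As noted above, each LLama instance handles a single inference session, so concurrent sessions require one instance per session. A minimal sketch, reusing the LoadConfig shape from the example above:

```typescript
import { LLama } from "llama-node";
import { LLamaCpp, LoadConfig } from "llama-node/dist/llm/llama-cpp";
import path from "path";

const config: LoadConfig = {
    path: path.resolve(process.cwd(), "./ggml-vicuna-7b-4bit-rev1.bin"),
    enableLogging: true,
    nCtx: 1024,
    nParts: -1,
    seed: 0,
    f16Kv: false,
    logitsAll: false,
    vocabOnly: false,
    useMlock: false,
    embedding: false,
};

// One instance per concurrent session; both may point at the same model file.
const sessionA = new LLama(LLamaCpp);
const sessionB = new LLama(LLamaCpp);

sessionA.load(config);
sessionB.load(config);
```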

---

## Usage (llama-rs backend)

The current version supports only a single inference session per LLama instance at a time.

If you want to run multiple inference sessions concurrently, you need to create multiple LLama instances.

### Inference

```typescript
import { LLama } from "llama-node";
import { LLamaRS } from "llama-node/dist/llm/llama-rs";
import path from "path";

const model = path.resolve(process.cwd(), "./ggml-alpaca-7b-q4.bin");

const llama = new LLama(LLamaRS);

llama.load({ path: model });

const template = `how are you`;

const prompt = `Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:

${template}

### Response:`;

llama.createCompletion(
    {
        prompt,
        numPredict: 128,
        temp: 0.2,
        topP: 1,
        topK: 40,
        repeatPenalty: 1,
        repeatLastN: 64,
        seed: 0,
        feedPrompt: true,
    },
    (response) => {
        process.stdout.write(response.token);
    }
);
```
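The callback receives one streamed token per invocation. To collect the whole completion as a single string, a small wrapper can accumulate the tokens; this sketch continues from the example above and assumes the response object carries a completed flag marking the final callback, as in earlier revisions of this example:

```typescript
// Sketch: accumulate streamed tokens into one string. The `completed`
// flag on the final response is an assumption, not a confirmed API.
const complete = (params: object): Promise<string> =>
    new Promise((resolve) => {
        let text = "";
        llama.createCompletion(params, (response: { token: string; completed?: boolean }) => {
            text += response.token;
            if (response.completed) resolve(text);
        });
    });
```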

### Tokenization

Get the tokenization of a given text from llama-rs:

```typescript
import { LLama } from "llama-node";
import { LLamaRS } from "llama-node/dist/llm/llama-rs";
import path from "path";

const model = path.resolve(process.cwd(), "./ggml-alpaca-7b-q4.bin");

const llama = new LLama(LLamaRS);

llama.load({ path: model });

const content = "how are you?";

llama.tokenize(content).then(console.log);
```

### Embedding

This is preview code: the end token used for the embedding may change in future versions. Do not use it in production!

```typescript
import { LLama } from "llama-node";
import { LLamaRS } from "llama-node/dist/llm/llama-rs";
import path from "path";
import fs from "fs";

const model = path.resolve(process.cwd(), "./ggml-alpaca-7b-q4.bin");

const llama = new LLama(LLamaRS);

llama.load({ path: model });

const getWordEmbeddings = async (prompt: string, file: string) => {
    const data = await llama.getEmbedding({
        prompt,
        numPredict: 128,
        temp: 0.2,
        topP: 1,
        topK: 40,
        repeatPenalty: 1,
        repeatLastN: 64,
        seed: 0,
    });

    console.log(prompt, data);

    await fs.promises.writeFile(
        path.resolve(process.cwd(), file),
        JSON.stringify(data)
    );
};

const run = async () => {
    const dog1 = `My favourite animal is the dog`;
    await getWordEmbeddings(dog1, "./example/semantic-compare/dog1.json");

    const dog2 = `I have just adopted a cute dog`;
    await getWordEmbeddings(dog2, "./example/semantic-compare/dog2.json");

    const cat1 = `My favourite animal is the cat`;
    await getWordEmbeddings(cat1, "./example/semantic-compare/cat1.json");
};

run();
```
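The three JSON files written by run() can then be compared offline. A standalone sketch (hypothetical file name compare.ts) that computes cosine similarity over the saved vectors, assuming getEmbedding resolved to a plain array of numbers:

```typescript
import fs from "fs";
import path from "path";

const load = (file: string): number[] =>
    JSON.parse(fs.readFileSync(path.resolve(process.cwd(), file), "utf8"));

// Cosine similarity: dot(a, b) / (|a| * |b|).
const cosine = (a: number[], b: number[]): number => {
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
};

const dog1 = load("./example/semantic-compare/dog1.json");
const dog2 = load("./example/semantic-compare/dog2.json");
const cat1 = load("./example/semantic-compare/cat1.json");

// The two dog sentences should score closer to each other than to the cat one.
console.log("dog1 vs dog2:", cosine(dog1, dog2));
console.log("dog1 vs cat1:", cosine(dog1, cat1));
```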

---

- [ ] prompt extensions
- [ ] more platforms and processor architectures (with the best possible performance)
- [ ] improve the embedding API, with an option to configure the end token
- [ ] CLI tool
- [ ] update llama-rs to support more models https://github.com/rustformers/llama-rs/pull/85 https://github.com/rustformers/llama-rs/issues/75
- [ ] support more native inference backends!
