Commit 209419b

sejori and claude committed
fix: use correct model names, async batch examples, and cost savings
- Replace gpt-4o with Qwen/Qwen3-30B-A3B and text-embedding-3-small with Qwen/Qwen3-Embedding-8B (models available through Doubleword)
- Fix batch class examples to use async methods (ainvoke, aembed_documents) since ChatDoublewordBatch and DoublewordEmbeddingsBatch are async-only
- Update cost savings messaging to "up to 90% cost savings"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 76e5871 commit 209419b

3 files changed: 32 additions and 12 deletions

src/oss/python/integrations/chat/doubleword.mdx

Lines changed: 15 additions & 5 deletions
@@ -71,7 +71,7 @@ Now we can instantiate our model object and generate chat completions:
 from langchain_doubleword import ChatDoubleword
 
 model = ChatDoubleword(
-    model="gpt-4o",
+    model="Qwen/Qwen3-30B-A3B",
     temperature=0,
     max_tokens=1024,
     max_retries=2,
@@ -142,19 +142,29 @@ For more on binding tools and tool call outputs, head to the [tool calling](/oss
 
 ## Batch processing
 
-`ChatDoublewordBatch` uses Doubleword's batch API to transparently collect concurrent calls into batch submissions at reduced cost. This is useful for high-throughput workloads where real-time responses are not required.
+`ChatDoublewordBatch` uses Doubleword's batch API to transparently collect concurrent calls into batch submissions with up to 90% cost savings. This is useful for high-throughput workloads where real-time responses are not required.
+
+**Note:** `ChatDoublewordBatch` is async-only. Sync methods like `invoke()` will raise `NotImplementedError`. Use `ainvoke()` instead.
 
 ```python
+import asyncio
 from langchain_doubleword import ChatDoublewordBatch
 
 batch_model = ChatDoublewordBatch(
-    model="gpt-4o",
+    model="Qwen/Qwen3-30B-A3B",
     temperature=0,
 )
 
+
 # Calls are automatically batched behind the scenes
-result = batch_model.invoke("Summarize the theory of relativity in one sentence.")
-result.content
+async def main():
+    result = await batch_model.ainvoke(
+        "Summarize the theory of relativity in one sentence."
+    )
+    print(result.content)
+
+
+asyncio.run(main())
 ```
 
 ---
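
Reviewer note on the hunk above: the description says concurrent calls are collected into batch submissions, which is presumably why the class is async-only. The following is a minimal sketch of that idea, assuming a short collection window. `MicroBatcher`, `window_s`, `submitted_batches`, and the echo responses are hypothetical stand-ins for illustration; this is not how `ChatDoublewordBatch` is actually implemented.

```python
import asyncio


class MicroBatcher:
    """Hypothetical stand-in; NOT the langchain_doubleword implementation."""

    def __init__(self, window_s: float = 0.01):
        self.window_s = window_s      # assumed collection window, in seconds
        self._pending = []            # (prompt, future) pairs awaiting submission
        self._flush_task = None
        self.submitted_batches = []   # sizes of each submitted batch, for inspection

    def invoke(self, prompt):
        # Mirrors the async-only contract described in the docs note.
        raise NotImplementedError("async-only: use ainvoke()")

    async def ainvoke(self, prompt: str) -> str:
        future = asyncio.get_running_loop().create_future()
        self._pending.append((prompt, future))
        # The first caller in a window schedules the flush; later callers join it.
        if self._flush_task is None:
            self._flush_task = asyncio.create_task(self._flush_later())
        return await future

    async def _flush_later(self):
        await asyncio.sleep(self.window_s)
        batch, self._pending = self._pending, []
        self._flush_task = None
        self.submitted_batches.append(len(batch))
        # A real client would submit the batch and wait for results here;
        # this sketch just echoes each prompt back.
        for prompt, future in batch:
            future.set_result(f"echo: {prompt}")


async def main():
    model = MicroBatcher()
    prompts = ["alpha", "beta", "gamma"]
    # Issuing the calls concurrently is what lets them share one submission.
    results = await asyncio.gather(*(model.ainvoke(p) for p in prompts))
    return results, model.submitted_batches


results, batch_sizes = asyncio.run(main())
print(results, batch_sizes)
```

In this sketch the three concurrent calls end up in a single batch. If the real class behaves similarly, issuing `ainvoke()` calls through `asyncio.gather` rather than awaiting them one at a time is likely what gives the batcher something to collect.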

src/oss/python/integrations/embeddings/doubleword.mdx

Lines changed: 16 additions & 6 deletions
@@ -34,7 +34,7 @@ if not os.getenv("DOUBLEWORD_API_KEY"):
 ```python
 from langchain_doubleword import DoublewordEmbeddings
 
-embeddings = DoublewordEmbeddings(model="text-embedding-3-small")
+embeddings = DoublewordEmbeddings(model="Qwen/Qwen3-Embedding-8B")
 
 # Embed a single query
 query_embedding = embeddings.embed_query("What is the meaning of life?")
@@ -47,15 +47,25 @@ doc_embeddings = embeddings.embed_documents(
 
 ## Batch embeddings
 
-For high-throughput workloads, use `DoublewordEmbeddingsBatch` to automatically batch concurrent embedding requests at reduced cost:
+For high-throughput workloads, use `DoublewordEmbeddingsBatch` to automatically batch concurrent embedding requests with up to 90% cost savings.
+
+**Note:** `DoublewordEmbeddingsBatch` is async-only. Sync methods like `embed_documents()` will raise `NotImplementedError`. Use `aembed_documents()` instead.
 
 ```python
+import asyncio
 from langchain_doubleword import DoublewordEmbeddingsBatch
 
-batch_embeddings = DoublewordEmbeddingsBatch(model="text-embedding-3-small")
-doc_embeddings = batch_embeddings.embed_documents(
-    ["Document one.", "Document two.", "Document three."]
-)
+batch_embeddings = DoublewordEmbeddingsBatch(model="Qwen/Qwen3-Embedding-8B")
+
+
+async def main():
+    doc_embeddings = await batch_embeddings.aembed_documents(
+        ["Document one.", "Document two.", "Document three."]
+    )
+    print(f"Generated {len(doc_embeddings)} embeddings")
+
+
+asyncio.run(main())
 ```
 
 ## API reference
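
Reviewer note on the hunk above: because the batch embeddings class is documented as async-only, callers in plain synchronous code need a bridge into an event loop. The sketch below shows one way to do that with `asyncio.run`. `AsyncOnlyEmbeddings` and `embed_from_sync_code` are hypothetical names, and the toy vectors stand in for real embeddings; only the method names `embed_documents` and `aembed_documents` come from the diff.

```python
import asyncio


class AsyncOnlyEmbeddings:
    """Hypothetical stand-in that mirrors the async-only contract in the note."""

    def embed_documents(self, texts):
        raise NotImplementedError("async-only: use aembed_documents()")

    async def aembed_documents(self, texts):
        await asyncio.sleep(0)  # placeholder for the network round trip
        # Toy one-dimensional vectors: the character count of each text.
        return [[float(len(t))] for t in texts]


def embed_from_sync_code(embeddings, texts):
    """Run the async method to completion from a synchronous caller.

    asyncio.run() starts a fresh event loop, so this helper must not be
    called from code that is already running inside one.
    """
    return asyncio.run(embeddings.aembed_documents(texts))


stub = AsyncOnlyEmbeddings()
vectors = embed_from_sync_code(stub, ["Document one.", "Document two."])
print(f"Generated {len(vectors)} embeddings")
```

Inside an application that already runs an event loop (a notebook or an async web handler, for example), awaiting `aembed_documents()` directly is the safer choice, since `asyncio.run` raises a `RuntimeError` when a loop is already running in the same thread.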

src/oss/python/integrations/providers/doubleword.mdx

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ description: "Route AI inference through Doubleword's unified gateway using Lang
 sidebarTitle: "Doubleword"
 ---
 
-[Doubleword](https://doubleword.ai/) is an AI model gateway and control layer that provides unified routing, management, and security for inference across multiple model providers. It exposes an OpenAI-compatible API with features like per-key rate limiting, request logging, and cost-optimized batch processing.
+[Doubleword](https://doubleword.ai/) is an AI model gateway and control layer that provides unified routing, management, and security for inference across multiple model providers. It exposes an OpenAI-compatible API with features like per-key rate limiting, request logging, and cost-optimized batch processing with up to 90% cost savings.
 
 ## Chat models
 