README.md: 55 additions & 41 deletions
@@ -79,44 +79,76 @@ Monitor usage and manage spend via the model dashboard on [Lightning AI](https:/
 ✅ Bring your own model (connect your API keys, coming soon...)
 ✅ Chat logs (coming soon...)

-# Advanced features
+<br/>

-## Concurrency with async
+# Advanced features

-LitAI supports asynchronous execution, allowing you to handle multiple requests concurrently without blocking. This is especially useful in high-throughput applications like chatbots, APIs, or agent loops.
+### Auto fallbacks and retries

-To enable async behavior, set `enable_async=True` when initializing the `LLM` class. Then use `await llm.chat(...)` inside an `async` function.
+Model APIs can be flaky or suffer outages. LitAI automatically retries failed requests. After repeated failures, it can fall back to other models in case the provider is down.
+- Fallback models are tried in the order provided.
+- Each model gets up to `max_retries` attempts independently.
+- The first successful response is returned immediately.
+- If all models fail after their retry limits, LitAI raises an error.
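
The retry and fallback rules listed above can be sketched in plain Python. This is a minimal illustration of the described behavior, not LitAI's implementation: `chat_with_fallback`, `call_model`, `flaky_backend`, and the model names are all hypothetical, and only `max_retries` comes from the text.

```python
from typing import Callable, Sequence


def chat_with_fallback(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
    max_retries: int = 3,
) -> str:
    """Try each model in order, giving each up to `max_retries` attempts."""
    errors = []
    for model in models:  # main model first, then fallbacks in the order given
        for attempt in range(1, max_retries + 1):
            try:
                # The first successful response is returned immediately.
                return call_model(model, prompt)
            except Exception as exc:  # a real client would catch narrower errors
                errors.append(f"{model} attempt {attempt}: {exc}")
    # Every model has used up its own retry budget.
    raise RuntimeError("all models failed:\n" + "\n".join(errors))


def flaky_backend(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a provider call: only the backup model works."""
    if model != "backup/model-b":
        raise ConnectionError("provider unavailable")
    return f"[{model}] reply to {prompt!r}"


if __name__ == "__main__":
    print(chat_with_fallback("hello", ["main/model-a", "backup/model-b"], flaky_backend))
```

Passing the provider call in as `call_model` keeps the sketch testable without network access; in a real application that role would be played by the client library.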

-if __name__ == "__main__":
-    asyncio.run(main())
-```

-## Streaming
+<details>
+<summary>Streaming</summary>

-Stream the model response as it's being generated.
+Real-time chat applications benefit from showing words as they are generated, which makes responses feel faster to the user. Streaming
+is the mechanism that lets you do this.

 ```python
 from litai import LLM

 llm = LLM(model="openai/gpt-4")
 for chunk in llm.chat("hello", stream=True):
     print(chunk, end="", flush=True)
+```
+</details>
+
+<details>
+<summary>Concurrency with async</summary>
+
+Advanced Python programs that process multiple requests at once rely on `async` to do so. LitAI works with async libraries without making blocking calls. This is especially useful in high-throughput applications like chatbots, APIs, or agent loops.
+
+To enable async behavior, set `enable_async=True` when initializing the `LLM` class. Then use `await llm.chat(...)` inside an `async` function.
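
To illustrate why awaiting `chat` helps with throughput, here is a small sketch that runs several requests concurrently with `asyncio.gather`. It assumes only that `chat` is awaitable, as described above; `StubLLM` and `ask_all` are hypothetical stand-ins so the example runs without API keys, and they are not part of LitAI.

```python
import asyncio


class StubLLM:
    """Hypothetical stand-in for an async-enabled client; not the LitAI class."""

    async def chat(self, prompt: str) -> str:
        await asyncio.sleep(0.01)  # simulates network latency without blocking
        return f"reply to {prompt!r}"


async def ask_all(llm: StubLLM, prompts: list[str]) -> list[str]:
    # gather() runs all chat() coroutines concurrently and returns
    # their results in the same order as the prompts.
    return await asyncio.gather(*(llm.chat(p) for p in prompts))


if __name__ == "__main__":
    for reply in asyncio.run(ask_all(StubLLM(), ["hello", "how are you?"])):
        print(reply)
```

With a real client, the stub would presumably be replaced by an `LLM` instance created with `enable_async=True`, and the rest of the pattern would stay the same.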