ColiVara is a suite of services that lets you store, search, and retrieve documents based on their **_visual_** embedding. ColiVara has state-of-the-art retrieval performance on both text and visual documents, offering superior multimodal understanding and control.
It is a web-first implementation of the **ColPali** paper using ColQwen2 as the LLM. From the end-user standpoint it works exactly like RAG, but it uses vision models instead of chunking and text processing for documents.
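As background on the ColPali approach mentioned above: the paper scores a query against a page with late interaction, where each query token embedding is matched to its most similar page patch embedding and those best-match similarities are summed. The following is a toy sketch of that scoring in plain Python with made-up two-dimensional vectors. The function names are hypothetical, and this is an illustration of the idea, not ColiVara's actual code.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def late_interaction_score(query_vecs: List[Vector], page_vecs: List[Vector]) -> float:
    """Sum, over query token vectors, of the best similarity to any page patch vector."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(
    query_vecs: List[Vector], pages: Dict[str, List[Vector]]
) -> List[Tuple[str, float]]:
    """Return (page name, score) pairs sorted from best to worst match."""
    scored = [
        (name, late_interaction_score(query_vecs, vecs))
        for name, vecs in pages.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Illustrative embeddings only; real models produce much larger vectors.
    query = [[1.0, 0.0], [0.0, 1.0]]
    pages = {
        "page_a": [[1.0, 0.0], [0.0, 1.0]],
        "page_b": [[0.5, 0.5]],
    }
    for name, score in rank_pages(query, pages):
        print(name, score)
```

Because every page is embedded as an image, a scheme like this needs no OCR or text chunking step before scoring.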
-**No OCR, no text extraction, no broken tables, or missing images. What you see, is what you get.**
-### Cloud Quickstart:
+### Quickstart:
1. Get a free API Key from the [ColiVara Website](https://colivara.com).
@@ -21,31 +18,30 @@ It is a web-first implementation of the **ColPali** paper using ColQwen2 as the
```bash
pip install colivara-py
```
-or in Typescript
+or
```bash
npm install colivara-ts
```
-3. Index a document. Colivara accepts a file url, or base64 encoded file, or a file path. We support over 100 file formats including PDF, DOCX, PPTX, and more. We will also automatically take a screenshot of URLs (webpages) and index them.
+3. Index a document (a file url, base64 encoded file, or path). It supports over 100 file formats including PDF, DOCX, PPTX, and more.
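Step 3 lists a base64 encoded file as one of the accepted inputs. A minimal sketch of producing such a string with the Python standard library follows. The helper name `encode_file_base64` is hypothetical, and this diff does not show which `upsert_document` parameter receives the string, so check the client documentation before wiring it in.

```python
import base64
from pathlib import Path


def encode_file_base64(path: str) -> str:
    """Read a local file and return its contents as a base64 ASCII string."""
    raw_bytes = Path(path).read_bytes()
    return base64.b64encode(raw_bytes).decode("ascii")
```

Encoding locally is mainly useful when the file is not reachable by URL from the service.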
```python
from colivara_py import ColiVara
-client = ColiVara(api_key="your_api_key")
+client = ColiVara(api_key=os.environ.get("COLIVARA_API_KEY"), # default and can be omitted
+    base_url="https://api.colivara.com" # default and can be omitted
+)
# Upload a document to the default_collection
document = client.upsert_document(
-    name="sample_document",
-    # You can use a file path, base64 encoded file, or a URL
-    document_url="https://example.com/sample.pdf",
-    # optional - add metadata
-    metadata={"author": "John Doe"},
-    # optional - specify a collection
-    collection_name="user_1_collection",
-    # optional - wait for the document to index. Webhooks are also supported.
-    wait=True
+    name="sample_document",
+    document_url="https://example.com/sample.pdf", # You can use a file path, base64 encoded file, or a URL