Very poor compression performance

I tested **Memvid** with three text files totaling **94 KB**:

<img width="723" height="254" alt="Image" src="https://github.com/user-attachments/assets/2763a08f-a46c-47b3-8f75-336189c0ed7a" />

After processing, the outputs were:

- `docs.mp4` (video file)  
- `docs_index.json` (JSON index)  

Together, these files are roughly **2 MB**, which is about **20× the size of the original text**.

See also [issue #49](https://github.com/Olow304/memvid/issues/49)

---

#### Why the JSON Index Exists
The JSON file is required because video codecs are typically **lossy**, which risks corrupting the data embedded in the QR code frames. The JSON index serves as a fallback that preserves the exact chunk data.

See [issue #39](https://github.com/Olow304/memvid/issues/39) for a deeper analysis.

---

#### Problem with Current Approach
Encoding via a video compression pipeline introduces:
- Significant **processing overhead**  
- Substantial **storage inflation** compared to traditional compression  

Also discussed in [issue #63](https://github.com/Olow304/memvid/issues/63)

---

#### Strong Recommendation
Instead of relying on a video-based workflow, use **any general-purpose lossless compressor from the LZ family** (e.g., `gzip`, `zstd`, or `xz`/LZMA).

This would:
- Eliminate the need for lossy video encoding and the JSON fallback
- Achieve **far better compression ratios**  
- Reduce both **processing complexity** and **storage footprint**

In short: an LZ compressor directly on the text or embeddings would be vastly more efficient than the current pipeline.
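
As a rough way to check this on your own files, here is a minimal sketch that uses only Python's standard library (`gzip`, `zlib`, `lzma`). It is not part of Memvid; the helper names `compressed_sizes` and `report` are made up for illustration, and `zstd` is left out only because it needs a third-party package on most Python versions. Actual ratios depend on the input, so no numbers are claimed here.

```python
import gzip
import lzma
import zlib
from pathlib import Path


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Compress `data` with each stdlib codec and return the output sizes in bytes.

    Each result is decompressed and compared to the input, so a returned size
    always corresponds to a lossless round trip.
    """
    codecs = {
        "gzip": (gzip.compress, gzip.decompress),
        "zlib": (zlib.compress, zlib.decompress),
        "lzma": (lzma.compress, lzma.decompress),
    }
    sizes = {}
    for name, (compress, decompress) in codecs.items():
        packed = compress(data)
        if decompress(packed) != data:
            raise ValueError(f"{name} round trip did not reproduce the input")
        sizes[name] = len(packed)
    return sizes


def report(paths: list[str]) -> None:
    """Print original and compressed sizes for the concatenated contents of `paths`."""
    data = b"".join(Path(p).read_bytes() for p in paths)
    print(f"original: {len(data)} bytes")
    for name, size in compressed_sizes(data).items():
        print(f"{name}: {size} bytes")
```

Calling `report([...])` with the paths of the same three text files gives sizes that can be compared directly against the combined size of `docs.mp4` and `docs_index.json`.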
