.env
Generic
TEXT_EMBEDDINGS_MODEL=sentence-transformers/all-MiniLM-L6-v2
TEXT_EMBEDDINGS_MODEL_TYPE=HF # LlamaCpp or HF
USE_MLOCK=false
Ingestion
PERSIST_DIRECTORY=db
DOCUMENTS_DIRECTORY=source_documents
INGEST_CHUNK_SIZE=500
INGEST_CHUNK_OVERLAP=50
INGEST_N_THREADS=1
Generation
MODEL_TYPE=LlamaCpp # GPT4All or LlamaCpp
MODEL_PATH=eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
MODEL_TEMP=0.8
MODEL_N_CTX=2048 # Max total size of prompt+answer
MODEL_MAX_TOKENS=1024 # Max size of answer
MODEL_STOP=[STOP]
CHAIN_TYPE=betterstuff
N_RETRIEVE_DOCUMENTS=100 # How many documents to retrieve from the db
N_FORWARD_DOCUMENTS=100 # How many documents to forward to the LLM, chosen among those retrieved
N_GPU_LAYERS=32
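The values above are plain KEY=VALUE pairs with inline comments. For anyone reproducing this, a minimal stdlib-only sketch of parsing them (an assumption for illustration — CASALIOY's actual loader is not shown in this report):

```python
# Hypothetical helper: parse KEY=VALUE lines like the .env above,
# dropping inline "#" comments. Standard library only.
def parse_env(text: str) -> dict:
    config = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip inline comments
        if "=" in line:
            key, _, value = line.partition("=")
            config[key.strip()] = value.strip()
    return config

sample = """MODEL_N_CTX=2048 # Max total size of prompt+answer
MODEL_MAX_TOKENS=1024 # Max size of answer
N_RETRIEVE_DOCUMENTS=100
"""
cfg = parse_env(sample)
print(cfg["MODEL_N_CTX"])  # 2048
```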
Python version
Python 3.10.10
System
Description: Ubuntu 22.04.2 LTS Release: 22.04 Codename: jammy
CASALIOY version
Latest Commit - ee9a4e5
Information
Related Components
Reproduction
I fed the system a 5,000-line CSV file with 30 columns, then asked for overall insights from the data.
I can see in the terminal that only the top 5 to 7 documents are being used, and each document is just a single CSV row. The answer is therefore based on only 5-7 rows, so no real insight comes out.
Point to be noted: I kept only one document in the source_documents folder, to avoid overlapping information.
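A back-of-the-envelope sketch of why only a handful of chunks reach the LLM despite N_FORWARD_DOCUMENTS=100 (assumption: roughly 4 characters per token, a common heuristic; real tokenizers vary):

```python
# Values taken from the .env above.
MODEL_N_CTX = 2048        # max tokens for prompt + answer
MODEL_MAX_TOKENS = 1024   # tokens reserved for the answer
INGEST_CHUNK_SIZE = 500   # characters per ingested chunk
CHARS_PER_TOKEN = 4       # heuristic assumption, not from the report

prompt_budget = MODEL_N_CTX - MODEL_MAX_TOKENS      # ~1024 tokens for context
tokens_per_chunk = INGEST_CHUNK_SIZE / CHARS_PER_TOKEN  # ~125 tokens per chunk
max_chunks = int(prompt_budget // tokens_per_chunk)

print(max_chunks)  # 8
```

Under these assumptions at most ~8 chunks fit before counting the question and prompt template, which is consistent with the 5-7 documents observed in the terminal; forwarding 100 row-sized chunks would require a far larger MODEL_N_CTX.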
Expected behavior
The system should be able to recognize patterns across the whole dataset and suggest insights based on them.