[Bug]: URL ingestion of page links to sub-pages (crawl depth) needs reconciling with product requirements #1644

### OpenRAG Version

0.5.0

### Deployment Method

Local development (make dev)

### Operating System

Ubuntu 24.04.4 LTS

### Python Version

3.13.13

### Affected Area

Ingestion (document processing, upload, Docling)

### Bug Description

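When a URL is ingested through Chat, the agent uses a crawl depth of 1 instead of the expected default of 2, so no content from linked sub-pages can be found. The crawl-depth behavior for URL ingestion needs reconciling with product requirements.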

### Steps to Reproduce

1. Go to Chat
2. Enter prompt: "Ingest this URL: https://crawler-test.com/"
   - The URL is ingested successfully
   - The agent uses a crawl depth of 1 (instead of 2)
   - BUG(?): No content from sub-pages can be found
   - See the screenshot below

### Expected Behavior

- `Verify`: Only pages up to the configured crawl depth (default 2) are ingested, with no runaway crawl

### Actual Behavior

- The agent uses a crawl depth of 1 (instead of 2), and no content from sub-pages can be found

### Relevant Logs

```shell
N/A
```

### Screenshots

<img width="1056" height="625" alt="Image" src="https://github.com/user-attachments/assets/dbb6c31b-3bce-490e-995b-8a3ace223b4b" />

### Additional Context

ℹ️ **Feedback** from @lucaseduoli

- Crawl depth should be based on the length of the page rather than on sub-pages
- The decision of which sub-pages (if any) to crawl should be delegated to the agent
- No known competitor RAG tools automatically crawl to a depth of 2 (sub-pages); they crawl only to a depth of 1 (same page)
- The default crawl depth should be 1, which is the current behavior
- The Product team should be consulted to verify this
- This test scenario is really valid
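
To make the depth convention used in this issue concrete (depth 1 = the submitted page only, depth 2 = that page plus the pages it links to), here is a minimal sketch of a depth-limited crawl over an in-memory link graph. It is not OpenRAG's implementation: the function name `pages_within_depth` and the `links` mapping are hypothetical, and the graph does not reflect the real structure of crawler-test.com.

```python
from collections import deque


def pages_within_depth(start, links, max_depth):
    """Return pages reachable from `start` within `max_depth`.

    Assumed convention (taken from this issue): depth 1 is the start
    page itself, depth 2 adds the pages it links to, and so on.
    """
    seen = {start}
    order = [start]
    queue = deque([(start, 1)])
    while queue:
        page, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not follow links beyond the configured depth
        for target in links.get(page, []):
            if target not in seen:  # skip already-visited pages (avoids loops)
                seen.add(target)
                order.append(target)
                queue.append((target, depth + 1))
    return order


# Hypothetical link graph for illustration only.
links = {
    "/": ["/a", "/b"],
    "/a": ["/a/deep"],
    "/b": ["/"],  # link back to the start page
}

depth_1 = pages_within_depth("/", links, 1)  # start page only
depth_2 = pages_within_depth("/", links, 2)  # start page plus direct sub-pages
```

Under this convention, the behavior reported above (no sub-page content found) is what a depth of 1 would produce, while the `Verify` expectation corresponds to a depth of 2. The `seen` set is what keeps the back-link from `/b` to `/` from causing a runaway crawl.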

### Checklist

- [x] I have searched existing issues to ensure this bug hasn't been reported before.
- [x] I have provided all the requested information.
