Open
Is your feature request related to a problem?
When you call `.read_csv()` on the duckdb backend, this makes duckdb actually go fetch [some] of the data in order to sniff the schema. Then, when you call `.cache()` on the created view, it actually goes and fetches the full data.
This is related to #9931.
What is the motivation behind your request?
I am working on relatively large tables on a slow internet connection. Each fetch takes about 30 seconds. I would like to avoid this double fetch.
Describe the solution you'd like
Since the result of `.read_csv()` needs to be a Table with a known schema, some data has to be fetched during that function call. So I think we need to either add an optional argument to the function or create an entirely new function. I would vote for adding a parameter if we can come up with something sane. Maybe `cache: bool`?
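Until something like that exists, one possible workaround is to download the file once and point `.read_csv()` at the local copy, so both the schema sniff and the later `.cache()` read from disk instead of the network. The sketch below is only an illustration under stated assumptions: `download_once` and `read_csv_single_fetch` are hypothetical helper names, not part of ibis, and it assumes the source is reachable with Python's `urllib` (for example an `http`, `https`, or `file` URL) and fits on local disk.

```python
import shutil
import tempfile
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Optional


def download_once(url: str, dest_dir: Optional[str] = None) -> Path:
    """Copy the CSV at `url` to a local file, skipping the copy if it already exists.

    Hypothetical helper (not an ibis API).
    """
    target_dir = Path(dest_dir) if dest_dir else Path(tempfile.mkdtemp(prefix="csv_cache_"))
    target_dir.mkdir(parents=True, exist_ok=True)

    # Reuse the file name from the URL path, falling back to a generic name.
    file_name = Path(urllib.parse.urlparse(url).path).name or "data.csv"
    local_path = target_dir / file_name

    if not local_path.exists():
        # Single pass over the remote data: stream it straight to disk.
        with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
            shutil.copyfileobj(response, out)
    return local_path


def read_csv_single_fetch(con, url: str, dest_dir: Optional[str] = None, **kwargs):
    """Fetch `url` once, then let the backend read the local copy.

    `con` is assumed to be a backend connection exposing `.read_csv(path, **kwargs)`,
    such as the one returned by `ibis.duckdb.connect()`.
    """
    local_path = download_once(url, dest_dir)
    return con.read_csv(str(local_path), **kwargs)
```

With a duckdb connection this would be used as `t = read_csv_single_fetch(con, url)`; calling `.cache()` on `t` afterwards would then only touch the local file. This trades disk space for network time and does not replace a built-in option such as the `cache: bool` idea above.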
What version of ibis are you running?
main
What backend(s) are you using, if any?
duckdb
Code of Conduct
- I agree to follow this project's Code of Conduct
Status: backlog