URL
https://python.langchain.com/v0.2/docs/how_to/semantic-chunker/

Checklist
I included a link to the documentation page I am referring to (if applicable).

Issue with current documentation:
It seems that the units for breakpoint_threshold_type = "percentile" are out of a hundred, i.e., 85.0 rather than 0.85, and the units are also unclear for the other threshold types, "gradient" and "interquartile".

Idea or request for content:
Also, SemanticChunker really needs a min and max chunk size. I am getting chunks of a single word, and chunks that exceed the OpenAI limit. Thanks for all the great work on LangChain.
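For reference, a minimal sketch of the call in question, following the linked how-to page (the 85.0 value and the input file are illustrative only):

```python
from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai import OpenAIEmbeddings

# "percentile" thresholds appear to be on a 0-100 scale: 85.0 means "split at
# the 85th percentile of sentence distances", not a 0.85 fraction.
splitter = SemanticChunker(
    OpenAIEmbeddings(),
    breakpoint_threshold_type="percentile",
    breakpoint_threshold_amount=85.0,
)

with open("state_of_the_union.txt") as f:  # any long document
    text = f.read()

chunks = splitter.split_text(text)
print([len(c) for c in chunks])  # chunk sizes vary widely; no min/max bound is enforced
```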
dosubot (bot) added the 🤖:docs label (Changes to documentation and examples, like .md, .rst, .ipynb files. Changes to the docs/ folder) on Sep 6, 2024
@mvirag2000 What do you think about the linked PR?
Re your idea/request: I only introduced min_chunk_size, because the maximum chunk size can already be adjusted by tuning breakpoint_threshold_amount to a reasonable value.
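A rough sketch of that tuning, assuming the min_chunk_size parameter from the linked PR (the numeric values are illustrative, and the exact semantics of min_chunk_size are as defined in that PR):

```python
from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai import OpenAIEmbeddings

splitter = SemanticChunker(
    OpenAIEmbeddings(),
    breakpoint_threshold_type="percentile",
    # A lower percentile produces more breakpoints, which indirectly caps how
    # large any single chunk can grow.
    breakpoint_threshold_amount=70.0,
    # From the linked PR: avoid emitting very small (e.g. single-word) chunks
    # by enforcing a lower bound on chunk size.
    min_chunk_size=100,
)
```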