Open
Description
Hey, I have some code that uses the partition_via_api() function from unstructured to extract images and text. When I run it from the terminal (python3 extractor.py), the whole script completes within 1 minute. I have now put the same function in a FastAPI endpoint. When the endpoint is called via a request, the same function is triggered, but the unstructured API call takes drastically longer. Any ideas on why this happens or how to solve it?
Sample function used:
from unstructured.partition.api import partition_via_api


async def pdf_extracted_images(file_path, filename):
    # Send the PDF to the hosted Unstructured API and return the chunked elements.
    chunks = partition_via_api(
        filename=str(file_path),
        api_key="api_key",  # placeholder, real key omitted
        api_url="https://api.unstructuredapp.io/general/v0/general",
        strategy="hi_res",
        split_pdf_page=True,
        split_pdf_concurrency_level=15,
        infer_table_structure=True,
        extract_image_block_types=["Image"],
        extract_image_block_to_payload=True,
        chunking_strategy="basic",
        max_characters=20000,
        combine_text_under_n_chars=6000,
        new_after_n_chars=6000,
    )
    return chunks
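
One thing that might be worth ruling out (an assumption, not a confirmed diagnosis): partition_via_api() is called synchronously inside an async def, so under FastAPI it runs on the server's event loop instead of in a plain script. A minimal sketch of moving a blocking call to a worker thread with asyncio.to_thread is below. blocking_partition, pdf_extracted_images_offloaded, heartbeat and demo are hypothetical names used only for illustration, and time.sleep stands in for the real API call so the snippet runs without a key.

```python
import asyncio
import time


def blocking_partition(file_path: str) -> list[str]:
    # Hypothetical stand-in for the synchronous partition_via_api(...) call.
    time.sleep(0.2)
    return [f"chunk from {file_path}"]


async def pdf_extracted_images_offloaded(file_path: str) -> list[str]:
    # asyncio.to_thread runs the blocking call in a worker thread,
    # so the event loop stays free while the call is in progress.
    return await asyncio.to_thread(blocking_partition, file_path)


async def heartbeat(ticks: list[float], stop: asyncio.Event) -> None:
    # Appends a timestamp roughly every 20 ms for as long as the loop is responsive.
    while not stop.is_set():
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.02)


async def demo() -> tuple[list[str], int]:
    # Runs the offloaded call next to a heartbeat task and counts the ticks.
    ticks: list[float] = []
    stop = asyncio.Event()
    hb = asyncio.create_task(heartbeat(ticks, stop))
    chunks = await pdf_extracted_images_offloaded("sample.pdf")
    stop.set()
    await hb
    return chunks, len(ticks)


if __name__ == "__main__":
    chunks, tick_count = asyncio.run(demo())
    print(chunks, tick_count)
```

If the heartbeat keeps ticking while the call runs, the loop is not being blocked. Declaring the endpoint with a plain def instead of async def should have a similar effect, since FastAPI runs such endpoints in a thread pool. Whether either change restores the 1 minute runtime here is untested.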