fix(infra): remove setfit dependency from api server #5449

Merged
Weves merged 2 commits into main from edwin/dan-2552
Sep 18, 2025

@edwin-onyx edwin-onyx commented Sep 18, 2025

Description

SetFit is a machine learning library for few-shot text classification.
We use it in the model server but not in the API server, so it's not clear why it is in the API server's requirements txt right now.

docker stats with the current requirements:
onyx-stack-background-1 6.838GiB
onyx-stack-api_server-1 805MiB

docker stats without the setfit package:
onyx-stack-background-1 5.323GiB
onyx-stack-api_server-1 678.8MiB
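
For reference, the difference between the two measurements can be worked out with a few lines of Python. This is only a sketch: the numbers are copied from the docker stats output above, the `savings` helper is a name made up for illustration, and 1 GiB is taken as 1024 MiB.

```python
# Memory figures copied from the docker stats output in the description.
GIB_IN_MIB = 1024  # 1 GiB = 1024 MiB

before_mib = {
    "onyx-stack-background-1": 6.838 * GIB_IN_MIB,
    "onyx-stack-api_server-1": 805.0,
}
after_mib = {
    "onyx-stack-background-1": 5.323 * GIB_IN_MIB,
    "onyx-stack-api_server-1": 678.8,
}


def savings(before, after):
    """Return {container: (MiB saved, percent saved)} (hypothetical helper)."""
    result = {}
    for name, old in before.items():
        saved = old - after[name]
        result[name] = (saved, 100.0 * saved / old)
    return result


for name, (mib, pct) in savings(before_mib, after_mib).items():
    print(f"{name}: {mib:.1f} MiB saved ({pct:.1f}%)")
```

That works out to roughly 1.5 GiB (about 22%) for the background container and about 126 MiB (about 16%) for the api server. These are single docker stats readings, so treat them as indicative rather than exact.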

How Has This Been Tested?

[Describe the tests you ran to verify your changes]

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

@edwin-onyx edwin-onyx requested a review from a team as a code owner September 18, 2025 00:06
vercel bot commented Sep 18, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| internal-search | Ready | Preview | Comment | Sep 18, 2025 1:03am |

@greptile-apps greptile-apps bot left a comment

Greptile Summary

This PR removes the setfit==1.1.1 dependency from the API server requirements file (backend/requirements/default.txt). SetFit is a machine learning library for few-shot text classification that is currently used in the model server component for content classification during document indexing, specifically for handling short chunks (< 10 words) to calculate aggregated boost factors.

The change implements proper architectural separation between the API server and model server components. SetFit remains in backend/requirements/model_server.txt where the actual ML inference happens, while being removed from the API server dependencies. This follows the principle that the API server should handle HTTP requests, business logic, and orchestration, while ML dependencies should be isolated to the dedicated model server component.

This separation reduces the API server's attack surface, memory footprint, and deployment complexity by removing unnecessary ML library dependencies that aren't directly used by the API server code.
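
One way to check the separation described above is to confirm that the package is pinned in the model server requirements but not in the API server requirements. The sketch below is an assumption-laden illustration, not part of the PR: `pinned_packages` and `check_separation` are hypothetical helpers, and the parsing only handles simple `name==version` style lines with optional trailing comments. The two file paths are the ones named in the summary.

```python
import re
from pathlib import Path

# Matches the package name at the start of a requirement line.
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def pinned_packages(requirements_text: str) -> set[str]:
    """Return normalized package names found in a requirements file (hypothetical helper)."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options such as -r
            continue
        match = _NAME_RE.match(line)
        if match:
            # Normalize runs of -, _ and . to a single dash, lowercase (PEP 503 style).
            names.add(re.sub(r"[-_.]+", "-", match.group(1)).lower())
    return names


def check_separation(api_text: str, model_text: str, package: str = "setfit") -> bool:
    """True if `package` is absent from the API requirements and present in the model server ones."""
    return package not in pinned_packages(api_text) and package in pinned_packages(model_text)


if __name__ == "__main__":
    api = Path("backend/requirements/default.txt")
    model = Path("backend/requirements/model_server.txt")
    if api.exists() and model.exists():
        print(check_separation(api.read_text(), model.read_text()))
```

Run from the repository root, this should print True once the dependency is removed from default.txt and kept in model_server.txt. It does not look at transitive dependencies, which would need something like a resolved lock file.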

Confidence score: 4/5

  • This PR is safe to merge with minimal risk as it removes an unused dependency from the correct location
  • Score reflects proper architectural separation and the fact that SetFit remains available where it's actually needed (model server)
  • Pay close attention to ensuring no API server code directly imports or uses SetFit functionality
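
The last point can be checked mechanically. Below is a minimal sketch that walks a directory of Python files and reports any that import `setfit`, using the standard library `ast` module. The helper names `imports_module` and `find_importers` are made up for illustration, the directory to scan is up to the caller, and only static `import` / `from ... import` statements are detected (dynamic imports via `importlib` or `__import__` would be missed).

```python
import ast
from pathlib import Path


def imports_module(source: str, module: str) -> bool:
    """True if the source statically imports `module` or one of its submodules."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) refer to local packages, so skip them.
            names = [node.module] if node.module and node.level == 0 else []
        else:
            continue
        if any(n == module or n.startswith(module + ".") for n in names):
            return True
    return False


def find_importers(root: Path, module: str = "setfit") -> list[Path]:
    """Return the .py files under `root` that import `module` (hypothetical helper)."""
    hits = []
    for path in sorted(root.rglob("*.py")):
        try:
            if imports_module(path.read_text(encoding="utf-8"), module):
                hits.append(path)
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be read or parsed
    return hits
```

Calling `find_importers(Path("backend"))` would list every file under that directory that imports setfit. Hits inside the model server code are expected; any hit in code that runs in the API server would mean the dependency is still needed there.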

1 file reviewed, no comments

@cubic-dev-ai cubic-dev-ai bot left a comment

No issues found across 1 file

@Weves Weves left a comment

lgtm!

@Weves Weves merged commit f6a0e69 into main Sep 18, 2025
53 of 54 checks passed
@Weves Weves deleted the edwin/dan-2552 branch September 18, 2025 06:48

@waseembahralaseel-cell waseembahralaseel-cell left a comment

Fix

@waseembahralaseel-cell waseembahralaseel-cell left a comment

Jamal

requests-oauthlib==1.3.1
retry==0.9.2 # This pulls in py which is in CVE-2022-42969, must remove py from image
rfc3986==1.5.0
setfit==1.1.1

Fb

Fix

@waseembahralaseel-cell waseembahralaseel-cell left a comment

Hahn

@waseembahralaseel-cell waseembahralaseel-cell left a comment

Fix

@waseembahralaseel-cell waseembahralaseel-cell left a comment

Fix

Fix

3 participants