aish-warya-iyer commented Oct 11, 2025

This example adds a short, self-contained demonstration of using Optuna to tune a TF-IDF + LinearSVC pipeline for sentiment classification on the TweetEval sentiment dataset (three labels: negative, neutral, positive).

What it does

Loads the TweetEval sentiment dataset using the datasets library.

Tunes both the TF-IDF vectorizer (feature count, n-gram range, etc.) and LinearSVC parameters (C, loss, class_weight) using Optuna.

Scores each trial by macro-F1 on the validation split; the study minimizes 1 - macro-F1.

Retrains the best configuration on train + validation and prints a test report.

Files added

examples/sklearn/svm_tfidf_tweeteval_sentiment.py – main script

examples/sklearn/svm_tfidf_tweeteval_sentiment.md – short usage notes

tests/test_svm_tfidf_tweeteval.py – quick smoke test for CI

How to run
python examples/sklearn/svm_tfidf_tweeteval_sentiment.py --n-trials 20 --max-train 20000
pytest -q # optional quick test

Notes

Keeps runtime light by allowing the --max-train argument to limit samples.

Demonstrates how Optuna can help search SVM + text-feature spaces efficiently.

No external dependencies beyond datasets, scikit-learn, and optuna.
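As a rough illustration of the --max-train cap mentioned in these notes, a sketch along the following lines would limit the training set before tuning. The flag names come from the command above; the defaults, the seeded random subsampling, and the `subsample` helper are assumptions, and the actual script may simply truncate the split instead.

```python
import argparse
import random


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Tune TF-IDF + LinearSVC with Optuna (sketch).")
    # Defaults here are assumptions for illustration.
    parser.add_argument("--n-trials", type=int, default=20)
    parser.add_argument("--max-train", type=int, default=None,
                        help="Cap on training samples; no cap if omitted.")
    return parser.parse_args(argv)


def subsample(texts, labels, max_train, seed=0):
    """Hypothetical helper: keep at most max_train (text, label) pairs, chosen reproducibly."""
    if max_train is None or max_train >= len(texts):
        return list(texts), list(labels)
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(texts)), max_train))
    return [texts[i] for i in keep], [labels[i] for i in keep]


# Example with placeholder data instead of the real training split.
args = parse_args(["--n-trials", "5", "--max-train", "3"])
texts = [f"tweet {i}" for i in range(10)]
labels = [i % 3 for i in range(10)]
small_texts, small_labels = subsample(texts, labels, args.max_train)
print(len(small_texts), "training samples kept")
```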

github-actions bot commented

This pull request has not seen any recent activity.

github-actions bot added the stale label Oct 19, 2025

github-actions bot commented Nov 3, 2025

This pull request was closed automatically because it had not seen any recent activity. If you want to discuss it, you can reopen it freely.

github-actions bot closed this Nov 3, 2025
c-bata (Member) commented Nov 4, 2025

@aish-warya-iyer Please feel free to reopen this after fixing all CI checks 🙏
