
[LFX Term 1 2026] Restoring Ianvs LLM-Agent setup and usage #407

Open
NishantSinghhhhh wants to merge 1 commit into kubeedge:main from NishantSinghhhhh:restoration-llm-agent

Conversation

@NishantSinghhhhh (Contributor)

feat: add requirements.txt for dependencies
fix: refactor basemodel.py for improved readability and functionality
refactor: enhance rouge.py to utilize RougeScorer for metric calculations

What type of PR is this?

/kind feature
/kind cleanup

What this PR does / why we need it:

This PR fixes and refactors the llm-agent singletask learning benchmark to make it fully functional end-to-end. The original example code had several issues that prevented it from running: broken relative paths, a missing dataset, deprecated HuggingFace API arguments, a name collision with the Ianvs framework lifecycle hook, and a broken ROUGE metric script.

Changes included:

  • requirements.txt: Added a requirements.txt listing all dependencies needed to run the LLM-agent benchmark (torch, transformers, peft, datasets, evaluate, rouge_score), which were previously undocumented and missing from the environment.
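
    The dependency list can be captured in a requirements.txt along these lines (package names taken from the bullet above; no versions are pinned here, since the PR does not state any):

    ```
    torch
    transformers
    peft
    datasets
    evaluate
    rouge_score
    ```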

  • basemodel.py:

    • Replaced deprecated use_auth_token= argument with token= to match current HuggingFace transformers API
    • Added empty preprocess(self, **kwargs) lifecycle hook required by the Ianvs singletask learning framework
    • Renamed the internal helper preprocess() to _preprocess_sample() to avoid collision with the framework hook
    • Updated _preprocess_sample() signature to accept plain strings instead of a samples object
    • Flattened the return value of _preprocess_sample() (removed erroneous [None] wrapper)
    • Added explicit str() cast in train() loop when iterating train_data.x / train_data.y to handle numpy.str_ types that caused tokenizer failures
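
  The hook, rename, and casting changes above can be sketched as follows. This is a minimal illustration, not the PR's actual basemodel.py: the class shape, the _preprocess_sample parameters, and the prompt format are assumptions for demonstration only.

  ```python
  class BaseModel:
      def preprocess(self, **kwargs):
          """Empty lifecycle hook called by the Ianvs singletask learning
          framework. It must exist even though it does nothing; a helper
          with the same name previously shadowed it, hence the rename."""

      def _preprocess_sample(self, question: str, answer: str) -> str:
          # Accepts plain strings (not a samples object) and returns a
          # flat string, with no [None] wrapper around the result.
          return f"Question: {question}\nAnswer: {answer}"

      def train(self, train_data):
          prompts = []
          for x, y in zip(train_data.x, train_data.y):
              # Explicit str() casts: dataset values may arrive as
              # numpy.str_, which reportedly caused tokenizer failures.
              prompts.append(self._preprocess_sample(str(x), str(y)))
          return prompts
  ```
  
  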
  • rouge.py:

    • Removed bare EOF token at end of file (invalid Python causing NameError on import)
    • Replaced evaluate.load() (which required a local metrics folder that did not exist) with direct rouge_score.rouge_scorer.RougeScorer calls
    • Updated y_pred handling to use str() cast instead of ["generated_text"] dict access, matching the plain-string output of basemodel.predict()
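
A metric along the lines described for rouge.py can be sketched like this. The function name, the rougeL choice, and the calculate_mean helper are illustrative assumptions (the helper mirrors the division-by-zero guard suggested in review); the RougeScorer call follows the documented rouge_score API.

```python
def calculate_mean(values):
    # Guard against division by zero when the score list is empty.
    return sum(values) / len(values) if values else 0.0

def rouge_l_score(y_true, y_pred):
    # Direct RougeScorer use in place of evaluate.load(); the import is
    # kept local so the metric file fails lazily if rouge_score is absent.
    from rouge_score import rouge_scorer
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    scores = [
        # str() casts match the plain-string output of basemodel.predict()
        scorer.score(str(ref), str(pred))["rougeL"].fmeasure
        for ref, pred in zip(y_true, y_pred)
    ]
    return calculate_mean(scores)
```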

Which issue(s) this PR fixes:

Fixes #


Signed-off-by: NishantSinghhhhh <nishantsingh_230137@aitpune.edu.in>
@kubeedge-bot kubeedge-bot added kind/feature Categorizes issue or PR as related to a new feature. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. labels Apr 23, 2026
@kubeedge-bot (Collaborator)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: NishantSinghhhhh
To complete the pull request process, please assign jaypume after the PR has been reviewed.
You can assign the PR to them by writing /assign @jaypume in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubeedge-bot kubeedge-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 23, 2026
@NishantSinghhhhh (Contributor, Author)

Screencast.from.2026-04-23.13-42-27.webm

@MooreZheng sir,

After making all these changes, I was able to restore the LLM-Agent benchmark and run it successfully.

@gemini-code-assist (Bot) left a comment

Code Review

This pull request significantly updates the Ianvs LLM-Agent benchmark by providing a comprehensive reproduction guide, adding a requirements file, and refactoring the core model and evaluation logic. Key changes include a rewritten predict method that correctly slices prompt tokens from the output and an updated ROUGE scoring implementation using the rouge_score library. Review feedback focuses on ensuring input tensors are moved to the correct device, removing redundant imports, adopting idiomatic boolean checks, and utilizing the internal calculate_mean function to prevent potential division-by-zero errors in metric calculations.

Three comment threads on examples/llm-agent/singletask_learning_bench/testenv/rouge.py
@NishantSinghhhhh NishantSinghhhhh changed the title [lFX Term 1 2026 ] Restoring Ianvs LLM-Agent Benchmark setup and usage [lFX Term 1 2026 ] Restoring Ianvs LLM-Agent setup and usage Apr 23, 2026
