
Inbatchsample #128

Merged
1985312383 merged 4 commits into datawhalechina:main from zerolovesea:inbatchsample
Feb 5, 2026

@zerolovesea

Pull Request

What does this PR do?

Adds support for in-batch negative sampling.

Briefly describe your changes

Added in-batch negative sampling, adapted the trainer to it, and added unit tests.

Type of Change

  • 🐛 Bug fix
  • ✨ New model/feature
  • 📝 Documentation
  • 🔧 Maintenance

Related Issues

Fixes #(issue number)

#111

How to Test

python tests/test_inbatch_sampling.py

Checklist

  • Code follows project style (ran python config/format_code.py)
  • Added tests for new functionality
  • Updated documentation if needed
  • All tests pass locally

Additional Notes

Any other information for reviewers

- add in_batch_negative_sampling helper
- cover in-batch sampling with unit tests
- ensure Matching tutorial runs with the new sampler
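
The helper itself is not shown in this conversation, so the following is only a minimal sketch of what drawing negatives from within a batch can look like. The function name `sample_in_batch_negatives`, its parameters, and the choice to skip rows that share the positive's item ID are assumptions for illustration, not the signature of the PR's `in_batch_negative_sampling`. It is written in plain Python to stay dependency-free.

```python
import random


def sample_in_batch_negatives(item_ids, num_neg, seed=None):
    """For each row, pick `num_neg` other rows of the batch to act as negatives.

    Hypothetical helper: rows holding the same item ID as the positive are
    excluded so that a true positive is never treated as a negative.
    Returns a list of row-index lists, one per batch row.
    """
    rng = random.Random(seed)
    negatives = []
    for pos in item_ids:
        # Candidate rows are those whose item differs from this row's positive.
        candidates = [j for j, other in enumerate(item_ids) if other != pos]
        if len(candidates) < num_neg:
            raise ValueError("batch has too few distinct items to draw negatives from")
        negatives.append(rng.sample(candidates, num_neg))
    return negatives


batch_item_ids = [10, 20, 10, 30]
neg_rows = sample_in_batch_negatives(batch_item_ids, num_neg=2, seed=0)
```

With item IDs `[10, 20, 10, 30]`, row 0 can only draw from rows 1 and 3, because row 2 holds the same item as row 0.
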
@codecov-commenter

codecov-commenter commented Nov 27, 2025


Codecov Report

❌ Patch coverage is 40.11628% with 103 lines in your changes missing coverage. Please review.
✅ Project coverage is 40.36%. Comparing base (c30a1ab) to head (5684437).
⚠️ Report is 131 commits behind head on main.

Files with missing lines                  Patch %   Lines
torch_rechub/models/matching/sasrec.py     5.55%    34 Missing ⚠️
torch_rechub/models/matching/narm.py      15.15%    28 Missing ⚠️
torch_rechub/models/matching/stamp.py     15.15%    28 Missing ⚠️
torch_rechub/basic/loss_func.py           16.66%     5 Missing ⚠️
torch_rechub/trainers/match_trainer.py    86.48%     5 Missing ⚠️
torch_rechub/utils/match.py               88.88%     3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #128      +/-   ##
==========================================
+ Coverage   36.39%   40.36%   +3.96%     
==========================================
  Files          52       66      +14     
  Lines        3283     4402    +1119     
==========================================
+ Hits         1195     1777     +582     
- Misses       2088     2625     +537     
Flag        Coverage Δ
unittests   40.36% <40.11%> (+3.96%) ⬆️


@1985312383 added the enhancement label (New feature or request) on Nov 28, 2025

Add validation in MatchTrainer.__init__ to check if the model supports
in-batch negative sampling by verifying it has user_tower() and item_tower()
methods. This prevents confusing AttributeError at training time for
unsupported models like SASRec, STAMP, NARM.
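
The check described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code in `MatchTrainer.__init__`: the function name `check_in_batch_support`, the error type, the message wording, and the two stub classes are all hypothetical.

```python
def check_in_batch_support(model):
    """Fail fast if `model` lacks the tower methods in-batch sampling relies on."""
    required = ("user_tower", "item_tower")
    missing = [name for name in required if not callable(getattr(model, name, None))]
    if missing:
        raise ValueError(
            f"{type(model).__name__} cannot be used with in-batch negative sampling: "
            f"missing {', '.join(name + '()' for name in missing)}"
        )


class TwoTowerStub:
    """Hypothetical model that exposes both towers."""

    def user_tower(self, x):
        return x

    def item_tower(self, x):
        return x


class SequenceStub:
    """Hypothetical model with no tower methods."""

    def forward(self, x):
        return x


check_in_batch_support(TwoTowerStub())  # passes silently
```

Calling `check_in_batch_support(SequenceStub())` raises at construction time with a message naming the missing methods, rather than surfacing as an `AttributeError` in the middle of training.
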
Enable in-batch negative sampling / two-tower inference for matching models. Updated NARM, SASRec and STAMP to accept an optional item_feature, expose user_tower/item_tower and mode flags for 'user'/'item' inference, and return dot-product scores in in-batch mode. Refactored session/user representation extraction (e.g. _compute_session_repr/_compute_user_repr) and preserved original full-item scoring behavior when item_feature is not provided. Also updated MatchTrainer error message to list the newly-supported models (SASRec, NARM, STAMP) for clarity.
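
To make the behaviour described above concrete, here is a toy sketch of a model with `user_tower`/`item_tower`, a `mode` flag, and dot-product scores in in-batch mode. It is an assumption-laden illustration in plain Python: the class `TinyTwoTower`, the embedding-table lookups, and the batch layout are hypothetical and are not how NARM, SASRec or STAMP are implemented in the PR.

```python
def dot(u, v):
    """Dot product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


class TinyTwoTower:
    """Toy stand-in: embedding lookups play the role of the two towers."""

    def __init__(self, user_table, item_table):
        self.user_table = user_table
        self.item_table = item_table
        self.mode = None  # None (training), "user", or "item"

    def user_tower(self, batch):
        return [self.user_table[u] for u in batch["user_id"]]

    def item_tower(self, batch):
        return [self.item_table[i] for i in batch["item_id"]]

    def forward(self, batch):
        if self.mode == "user":
            return self.user_tower(batch)  # user embeddings only
        if self.mode == "item":
            return self.item_tower(batch)  # item embeddings only
        users = self.user_tower(batch)
        items = self.item_tower(batch)
        # scores[r][c] = <user r, item c>; the diagonal holds each row's positive.
        return [[dot(u, v) for v in items] for u in users]


model = TinyTwoTower(
    user_table={1: [1.0, 0.0], 2: [0.0, 1.0]},
    item_table={7: [2.0, 0.0], 8: [0.0, 3.0]},
)
batch = {"user_id": [1, 2], "item_id": [7, 8]}
scores = model.forward(batch)  # 2 x 2 matrix of in-batch scores
```

In this layout, every off-diagonal entry of `scores` serves as a negative for its row, so no separate negative samples need to be drawn.
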
@1985312383 merged commit 3295fee into datawhalechina:main on Feb 5, 2026
11 checks passed
@1985312383
Collaborator

Hi @zerolovesea 👋

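We noticed that your commits have been merged, but you do not appear in the repository's Contributors list.
This usually happens because the email address used for the commits is not linked to a GitHub account.

Could you check the email address your commits currently use:

git config user.email

Then add that address under GitHub → Settings → Emails and verify it.
GitHub will automatically re-associate the historical commits.

Thanks for your contribution 🙌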

@zerolovesea
Author

zerolovesea commented Feb 12, 2026 via email

Thanks for letting me know! I have now added the email address.

@1985312383
Collaborator

OK, the Contributors list has now been updated.


Development

Successfully merging this pull request may close these issues.

[HELP WANTED] Refactor and Support In-Batch Sampling
