Issues: modelscope/data-juicer

Issues list

Add correlation_analysis in the analysis module
Labels: dj:core (issues/PRs about the core functions of Data-Juicer), enhancement (New feature or request)
#663 opened May 7, 2025 by HYLcool

[WIP] Adds gpu minhash support for RayBTSMinhashDeduplicator
Labels: dj:efficiency (regarding to efficiency issues and enhancements), dj:op (issues/PRs about some specific OPs), enhancement (New feature or request)
#644 opened Apr 17, 2025 by cyruszhang · Draft

Installation progress could be optimzed. (Cmake error during installation)
Labels: dj:lite (making DJ more accessible), enhancement (New feature or request), environment (related to third-party dependency, DJ-pypi, DJ-docker, etc.)
#576 opened Feb 14, 2025 by zhenqincn · 2 tasks done

Optimize dedup to avoid oom
Labels: dj:dist (issues/PRs about distributed data processing), dj:efficiency (regarding to efficiency issues and enhancements), dj:tools (issues/PRs about specific tools), enhancement (New feature or request), good first issue (Good for newcomers)
#568 opened Feb 7, 2025 by coolderli

Support others LLMs & APIs for the OP generate_qa_from_text_mapper
Labels: dj:op (issues/PRs about some specific OPs), enhancement (New feature or request)
#535 opened Jan 9, 2025 by yxdyc · 2 tasks done

Checkpointer support for Ray-Mode
Labels: enhancement (New feature or request)
#487 opened Nov 12, 2024 by yxdyc · 2 tasks done · Distributed processing

[Feat]: Unified LLM Calling Management
Labels: enhancement (New feature or request)
#451 opened Oct 16, 2024 by drcege · 2 tasks done

[Feat]: Automatic Version Matching During Installation
Labels: enhancement (New feature or request)
#450 opened Oct 16, 2024 by drcege · 2 tasks done

Require fps filter and mapper for videos
Labels: dj:op (issues/PRs about some specific OPs), enhancement (New feature or request)
#433 opened Sep 23, 2024 by BeachWang

Guidance for OP with multiple data fields to be processed
Labels: enhancement (New feature or request)
#411 opened Sep 2, 2024 by yxdyc · 2 tasks done

[Feat]: Add Ray actor support
Labels: dj:dist (issues/PRs about distributed data processing), enhancement (New feature or request), stale-issue
#371 opened Jul 29, 2024 by drcege

Add GPT-4V as evaluator
Labels: dj:multimodal (issues/PRs about multimodal data processing), enhancement (New feature or request), stale-pr
#276 opened Mar 22, 2024 by drcege · Draft · DJ-SORA

support panda's student captioner model in our captioning mapper
Labels: dj:multimodal (issues/PRs about multimodal data processing), dj:op (issues/PRs about some specific OPs), enhancement (New feature or request), stale-issue
#251 opened Mar 14, 2024 by yxdyc