Fix: Unpack dataset_config dictionary when calling load_dataset #706

wenzhaoabc · 2025-10-20T06:52:36Z

Description

This PR fixes the dataset loading logic: when args.dataset_config is a dictionary, it is passed to datasets.load_dataset as a single positional argument instead of being unpacked into keyword arguments.

The datasets.load_dataset function expects configuration parameters (such as name, split, and data_files) to be passed as keyword arguments. When these parameters are grouped into a dictionary, the dictionary must be unpacked with ** so that each key is passed as its own keyword argument.
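The difference between passing the dictionary positionally and unpacking it can be seen in a minimal sketch. The function fake_load_dataset below is a hypothetical stand-in, not the real library function; it only assumes that the second positional parameter is name, as in datasets.load_dataset, and it simply echoes back what it received.

```python
def fake_load_dataset(path, name=None, split=None, data_files=None):
    """Hypothetical stand-in for datasets.load_dataset.

    It only echoes its arguments so the binding of each parameter is visible.
    """
    return {"path": path, "name": name, "split": split, "data_files": data_files}


config = {"name": "en", "split": "train"}

# Without unpacking: the whole dict is bound to the second parameter, `name`.
positional = fake_load_dataset("my_dataset", config)

# With unpacking: each key becomes its own keyword argument.
unpacked = fake_load_dataset("my_dataset", **config)

print(positional)
print(unpacked)
```

In the first call, split stays at its default and name holds the entire dictionary, which is the behavior this PR removes.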

This change modifies the function call to use **args.dataset_config, ensuring that dictionary-based configurations are properly applied.

Change Details

if args.dataset_name and not args.dataset_mixture:
    logger.info(f"Loading dataset: {args.dataset_name}")
-   return datasets.load_dataset(args.dataset_name, args.dataset_config)
+   return datasets.load_dataset(args.dataset_name, **args.dataset_config)
elif args.dataset_mixture:
    logger.info(f"Creating dataset mixture with {len(args.dataset_mixture.datasets)} datasets")
    seed = args.dataset_mixture.seed

This ensures that if args.dataset_config is, for example, {'name': 'en', 'split': 'train'}, the call becomes load_dataset(dataset_name, name='en', split='train') instead of the incorrect load_dataset(dataset_name, {'name': 'en', 'split': 'train'}), which passes the whole dictionary as the second positional argument (name).
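One caveat worth checking: ** only works on a mapping, so **args.dataset_config raises a TypeError if the value is None or a plain string. The PR does not say whether those cases occur, so the following is only a sketch under that assumption. The helper build_load_args is hypothetical and not part of the PR; it normalizes the three cases into positional and keyword arguments.

```python
from collections.abc import Mapping


def build_load_args(dataset_name, dataset_config=None):
    """Hypothetical helper: return (args, kwargs) for a load_dataset-style call.

    Assumes dataset_config may be None, a config name string, or a mapping.
    """
    if dataset_config is None:
        # No configuration given: pass only the dataset name.
        return (dataset_name,), {}
    if isinstance(dataset_config, Mapping):
        # Dictionary configuration: its keys become keyword arguments.
        return (dataset_name,), dict(dataset_config)
    if isinstance(dataset_config, str):
        # Plain config name: keep it as the second positional argument.
        return (dataset_name, dataset_config), {}
    raise TypeError(f"Unsupported dataset_config type: {type(dataset_config).__name__}")


# Intended usage (not executed here):
#   args_, kwargs_ = build_load_args(args.dataset_name, args.dataset_config)
#   return datasets.load_dataset(*args_, **kwargs_)
```

If dataset_config is always a dictionary in practice, the one-line change in the diff is sufficient and this helper is unnecessary.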


wenzhaoabc added 4 commits October 15, 2025 16:45

Fix dataset loading by unpacking dataset_config

77ae933

add traffic reward

5428777

Merge branch 'main' of github.com:wenzhaoabc/open-r1

c229c2f

solve

adapt for local machine

f84325f
