This repository was archived by the owner on Sep 18, 2024. It is now read-only.
Problem of choosing specific gpu card for running trials #4242
Description
Describe the issue:
I have a machine with multiple GPU cards and want to run a different search task on each of them: each task should run on exactly one GPU card, and no two tasks should share a card. To do this I modified `config.yml`. First, I added `gpuIndices` under `localConfig`, but every trial launched by the tuner then stayed in the WAITING state forever. Second, I reverted `config.yml` and added `gpuIndices` under `tuner` instead; trials then ran fine, but they may run on a different GPU card than the one I specified. Is there a bug in the current implementation, or is something wrong with my configuration?
Environment:
- NNI version: v2.4
- Training service (local|remote|pai|aml|etc): local
- Client OS: ubuntu 16.04
- Python version: 3.8.5
- Is conda/virtualenv/venv used?: conda
- Is running in Docker?: no
Configuration:
- one of Experiment configs (remember to remove secrets!):
- Search space:
Log message:
- nnimanager.log:
- dispatcher.log:
- nnictl stdout and stderr:
How to reproduce it?:
Add `gpuIndices` under `localConfig` or under `tuner` in `config.yml`, then start the experiment.
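For reference, in the NNI 2.x config schema the analogous placement would be under `trainingService`; this is a sketch under the assumption of the v2 schema, with placeholder file names and commands:

```yaml
experimentName: gpu-indices-repro   # hypothetical name
trialConcurrency: 2
trialCommand: python3 train.py      # placeholder trial command
trialCodeDirectory: .
trialGpuNumber: 1
searchSpaceFile: search_space.json
tuner:
  name: TPE
trainingService:
  platform: local
  useActiveGpu: true
  gpuIndices: [0]
```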