damp-lion-88352
02/25/2025, 6:23 AM
/etc/slurm/slurm.conf
NodeName=localhost Gres=gpu:1 CPUs=4 RealMemory=15006 Sockets=1 CoresPerSocket=2 ThreadsPerCore=2 State=UNKNOWN
PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP
This is the /etc/slurm/gres.conf:
AutoDetect=nvml
NodeName=localhost Name=gpu Type=tesla File=/dev/nvidia0 COREs=0
After changing the config, I restarted my Slurm cluster and ran slurmd -C, but it doesn't show that I have a GPU.
CC @rich-application-44533 @red-school-96573 @fierce-oil-47448
red-school-96573
02/25/2025, 7:15 AM
GresTypes=gpu
See https://slurm.schedmd.com/slurm.conf.html#OPT_GresTypes
damp-lion-88352
02/25/2025, 7:16 AM
red-school-96573
02/25/2025, 7:38 AM
slurmd -C. What do you see when using slurmd -G?
damp-lion-88352
02/25/2025, 7:39 AM
(base) ubuntu@ip-10-0-0-4:~$ sudo slurmd -G
slurmd: _read_slurm_cgroup_conf: No cgroup.conf file (/etc/slurm/cgroup.conf), using defaults
slurmd: A line in gres.conf for GRES gpu:tesla has 1 more configured than expected in slurm.conf. Ignoring extra GRES.
slurmd: gpu/nvml: _get_system_gpu_list_nvml: 1 GPU system device(s) detected
slurmd: gres/gpu: _normalize_sys_gres_types: Could not find an unused configuration record with a GRES type that is a substring of system device `tesla_t4`. Setting system GRES type to NULL
slurmd: error: This GPU specified in [slurm|gres].conf has mismatching Cores or Links from the device found on the system. Ignoring it.
slurmd: error: [slurm|gres].conf:
slurmd: error: GRES[gpu] Type:(null) Count:1 Cores(4):0 Links:(null) Flags:HAS_FILE,ENV_NVML,ENV_RSMI,ENV_ONEAPI,ENV_OPENCL,ENV_DEFAULT File:/dev/nvidia0 UniqueId:(null)
slurmd: error: system:
slurmd: error: GRES[gpu] Type:(null) Count:1 Cores(4):0-1 Links:-1 Flags:HAS_FILE,ENV_NVML File:/dev/nvidia0 UniqueId:(null)
slurmd: The following autodetected GPUs are being ignored:
slurmd: GRES[gpu] Type:(null) Count:1 Cores(4):0-1 Links:-1 Flags:HAS_FILE,ENV_NVML File:/dev/nvidia0 UniqueId:(null)
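The error above says the configured GPU has "mismatching Cores or Links" compared with the detected device. A minimal Python sketch, assuming the two `GRES[gpu] ...` records keep the `Key:value` layout seen in this output, shows which fields actually differ. The `parse_gres` helper is hypothetical and not part of Slurm.

```python
import re

# The two GRES records copied from the slurmd -G output above.
configured = ("GRES[gpu] Type:(null) Count:1 Cores(4):0 Links:(null) "
              "Flags:HAS_FILE,ENV_NVML,ENV_RSMI,ENV_ONEAPI,ENV_OPENCL,ENV_DEFAULT "
              "File:/dev/nvidia0 UniqueId:(null)")
detected = ("GRES[gpu] Type:(null) Count:1 Cores(4):0-1 Links:-1 "
            "Flags:HAS_FILE,ENV_NVML File:/dev/nvidia0 UniqueId:(null)")

def parse_gres(line):
    """Hypothetical helper: split a record into {field: value}.

    A key such as 'Cores(4)' is stored as plain 'Cores'.
    """
    return dict(re.findall(r"(\w+)(?:\(\d+\))?:(\S+)", line))

cfg = parse_gres(configured)
det = parse_gres(detected)

# Keep only the fields whose values differ between the two records.
diff = {key: (cfg[key], det.get(key)) for key in cfg if cfg[key] != det.get(key)}

for key in ("Cores", "Links"):
    if key in diff:
        print(f"{key}: configured={diff[key][0]} detected={diff[key][1]}")
```

Of the fields that differ, Cores is the one set explicitly in the gres.conf shown earlier (COREs=0), while the detected device reports cores 0-1.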
red-school-96573
02/25/2025, 7:50 AM
COREs=0 in gres.conf. Could you please remove it?
damp-lion-88352
02/25/2025, 8:25 AM
damp-lion-88352
02/25/2025, 8:25 AM
damp-lion-88352
02/25/2025, 8:25 AM
red-school-96573
02/25/2025, 8:31 AM
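For reference, a sketch of how the two files might look with the suggestions from this thread applied: GresTypes=gpu added to slurm.conf and COREs=0 removed from gres.conf. One further change is an assumption, not something suggested in the thread: the node line uses Gres=gpu:tesla:1 so that the type matches Type=tesla in gres.conf, since the first slurmd -G message reports that gres.conf declares gpu:tesla while slurm.conf does not. This has not been tested on the machine in question.

```text
# /etc/slurm/slurm.conf (only the relevant lines)
GresTypes=gpu
# Assumption: typed GRES so that it matches Type=tesla in gres.conf
NodeName=localhost Gres=gpu:tesla:1 CPUs=4 RealMemory=15006 Sockets=1 CoresPerSocket=2 ThreadsPerCore=2 State=UNKNOWN
PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP

# /etc/slurm/gres.conf
AutoDetect=nvml
# COREs=0 removed so that the autodetected core list (0-1) is used
NodeName=localhost Name=gpu Type=tesla File=/dev/nvidia0
```

After restarting the Slurm daemons, running sudo slurmd -G again should show whether the mismatch messages are gone.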