When I execute the following lm_eval command, a ValueError is thrown:
```
lm_eval --model vllm --model_args pretrained=/dfs/data/zhaozx10/benchmark-eval/Qwen3-4B,gpu_memory_utilization=0.8 --tasks aime25 --output_path /dfs/data/zhaozx10/benchmark-eval/results
```

```
ValueError: No tasks specified, or no tasks found. Please verify the task names.
```
Other task names resolve correctly; for example, mmlu works:
```
(eval) root@workspace:/dfs/data/zhaozx10/benchmark-eval/lm-evaluation-harness-main# lm_eval --model vllm --model_args pretrained=/dfs/data/zhaozx10/benchmark-eval/Qwen3-4B --tasks mmlu --output_path /dfs/data/zhaozx10/benchmark-eval/results
[2025-11-27 07:49:43] INFO __main__.py:450: Selected Tasks: ['mmlu']
[2025-11-27 07:49:43] INFO evaluator.py:202: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 | Setting fewshot manual seed to 1234
[2025-11-27 07:49:43] INFO evaluator.py:240: Initializing vllm model,
```
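A quick way to check whether the task is registered at all (a sketch, assuming a harness version that supports the `list` argument to `--tasks`):

```bash
# If aime25 is absent from this listing, task discovery itself is failing,
# not the task name lookup
lm_eval --tasks list | grep -i aime
```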
Temporary solution: the error goes away when I leave the lm-evaluation-harness-main directory and run the same command from outside it.
Suspected root cause: I had previously changed the dataset_path field in aime25.yaml from the original math-ai/aime25 to a local file path. Even though I reverted that change after hitting the error, the ValueError persists.
This problem has been bothering me for two hours.
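If the local checkout is the culprit, one way to confirm it (a sketch; the grep pattern and paths are assumptions about the tree layout) is to check which copy of lm_eval Python imports from inside the repo, and whether the local task config still carries leftover edits:

```bash
# From inside the repo root, Python likely resolves lm_eval to the local
# source tree instead of the installed package
python -c "import lm_eval; print(lm_eval.__file__)"

# Look for leftover edits or stray backup copies of the task config in the
# local tree (filename pattern assumed; adjust to your checkout)
grep -rn "dataset_path" lm_eval/tasks --include="aime25*.yaml"
find lm_eval/tasks -name "aime25*"
```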
Failing run, from inside the repo:

```
(eval) root@workspace:/dfs/data/zhaozx10/benchmark-eval/lm-evaluation-harness-main# lm_eval --model vllm --model_args pretrained=/dfs/data/zhaozx10/benchmark-eval/Qwen3-4B,gpu_memory_utilization=0.8 --tasks aime25 --output_path /dfs/data/zhaozx10/benchmark-eval/results
[2025-11-27 08:03:49] INFO __main__.py:450: Selected Tasks: []
[2025-11-27 08:03:49] INFO evaluator.py:202: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 | Setting fewshot manual seed to 1234
Traceback (most recent call last):
  File "/dfs/data/conda/envs/eval/bin/lm_eval", line 7, in <module>
    sys.exit(cli_evaluate())
  File "/dfs/data/zhaozx10/benchmark-eval/lm-evaluation-harness-main/lm_eval/__main__.py", line 459, in cli_evaluate
    results = evaluator.simple_evaluate(
  File "/dfs/data/zhaozx10/benchmark-eval/lm-evaluation-harness-main/lm_eval/utils.py", line 458, in _wrapper
    return fn(*args, **kwargs)
  File "/dfs/data/zhaozx10/benchmark-eval/lm-evaluation-harness-main/lm_eval/evaluator.py", line 207, in simple_evaluate
    raise ValueError(
ValueError: No tasks specified, or no tasks found. Please verify the task names.
```
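Note that Selected Tasks: [] shows the name never resolves during discovery. The registry can also be probed directly (a sketch, assuming the TaskManager API present in recent harness versions):

```bash
# Ask the task registry which aime tasks it can see; run this both inside
# and outside the repo root to compare what each environment registers
python -c "
from lm_eval.tasks import TaskManager
print([t for t in TaskManager().all_tasks if 'aime' in t])
"
```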
Succeeding run, immediately after cd .. out of the repo:

```
(eval) root@workspace:/dfs/data/zhaozx10/benchmark-eval/lm-evaluation-harness-main# cd ..
(eval) root@workspace:/dfs/data/zhaozx10/benchmark-eval# lm_eval --model vllm --model_args pretrained=/dfs/data/zhaozx10/benchmark-eval/Qwen3-4B,gpu_memory_utilization=0.8 --tasks aime25 --output_path /dfs/data/zhaozx10/benchmark-eval/results
[2025-11-27 08:04:16] INFO __main__.py:450: Selected Tasks: ['aime25']
[2025-11-27 08:04:16] INFO evaluator.py:202: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 | Setting fewshot manual seed to 1234
[2025-11-27 08:04:16] INFO evaluator.py:240: Initializing vllm model, with arguments: {'pretrained': '/dfs/data/zhaozx10/benchmark-eval/Qwen3-4B', 'gpu_memory_utilization': 0.8}
INFO 11-27 08:04:16 [utils.py:253] non-default args: {'seed': 1234, 'gpu_memory_utilization': 0.8, 'disable_log_stats': True, 'model': '/dfs/data/zhaozx10/benchmark-eval/Qwen3-4B'}
INFO 11-27 08:04:16 [model.py:631] Resolved architecture: Qwen3ForCausalLM
INFO 11-27 08:04:16 [model.py:1745] Using max model len 40960
INFO 11-27 08:04:16 [scheduler.py:216] Chunked prefill is enabled with max_num_batched_tokens=16384.
(EngineCore_DP0 pid=19582) INFO 11-27 08:04:16 [core.py:93] Initializing a V1 LLM engine (v0.11.2) with config:
```