No tasks specified, or no tasks found. #3434

@mao1333

When I execute the following lm_eval command, a ValueError is thrown:

lm_eval --model vllm --model_args pretrained=/dfs/data/zhaozx10/benchmark-eval/Qwen3-4B,gpu_memory_utilization=0.8  --tasks aime25 --output_path /dfs/data/zhaozx10/benchmark-eval/results

 ValueError: No tasks specified, or no tasks found. Please verify the task names.

Other tasks, such as mmlu, are found and start normally:

(eval) root@workspace:/dfs/data/zhaozx10/benchmark-eval/lm-evaluation-harness-main# lm_eval --model vllm --model_args pretrained=/dfs/data/zhaozx10/benchmark-eval/Qwen3-4B --tasks mmlu --output_path /dfs/data/zhaozx10/benchmark-eval/results
[2025-11-27 07:49:43] INFO __main__.py:450: Selected Tasks: ['mmlu']
[2025-11-27 07:49:43] INFO evaluator.py:202: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 | Setting fewshot manual seed to 1234
[2025-11-27 07:49:43] INFO evaluator.py:240: Initializing vllm model,

Temporary workaround: The error goes away when I leave the lm-evaluation-harness-main directory and run the same command from its parent directory.
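
Since the result depends on the working directory, one thing worth checking is which copy of the lm_eval package Python resolves in each case. The following is only a diagnostic sketch using the standard library; `locate_package` is a hypothetical helper name and not part of lm_eval, and it does not by itself explain the error.

```python
import importlib.util
import os
from typing import Optional


def locate_package(name: str) -> Optional[str]:
    """Return the file Python would load for the top-level package `name`, or None."""
    # find_spec on a top-level name looks the package up without importing it.
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    print("cwd:    ", os.getcwd())
    print("lm_eval:", locate_package("lm_eval"))
```

If the reported path differs between the two directories, the two runs are using different copies of the harness. How the check is launched matters: `python script.py` puts the script's own directory first on `sys.path`, whereas `python -c` and `python -m` put the current directory first.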

Suspected root cause: I previously changed the dataset_path field in aime25.yaml from the original math-ai/aime25 to a local file path. I reverted it to the original value after hitting the error, but the ValueError still persists.
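
To confirm what the task file currently contains, a rough sketch like the one below can list every YAML file under a directory that declares a given task name, together with its dataset_path. It is a plain-text scan, not the harness's own task loader. It assumes the task file has top-level `task:` and `dataset_path:` keys; `find_task_configs` and the directory passed to it are hypothetical.

```python
import os
from typing import Dict, List


def find_task_configs(root: str, task_name: str) -> List[Dict[str, str]]:
    """Scan `root` for YAML files declaring `task: <task_name>` (plain text match)."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith((".yaml", ".yml")):
                continue
            path = os.path.join(dirpath, filename)
            fields = {}
            with open(path, encoding="utf-8", errors="replace") as handle:
                for line in handle:
                    key, sep, value = line.partition(":")
                    # Only unindented (top-level) keys match here.
                    if sep and key in ("task", "dataset_path"):
                        fields[key] = value.strip().strip("'\"")
            if fields.get("task") == task_name:
                matches.append(
                    {"file": path, "dataset_path": fields.get("dataset_path", "")}
                )
    return matches


if __name__ == "__main__":
    # "." is a placeholder; point it at the directory holding the task YAML files.
    for match in find_task_configs(".", "aime25"):
        print(match["file"], "->", match["dataset_path"])
```

An empty result, more than one match, or a dataset_path other than math-ai/aime25 would each be worth reporting alongside the traceback.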

This problem has been bothering me for two hours.

(eval) root@workspace:/dfs/data/zhaozx10/benchmark-eval/lm-evaluation-harness-main# lm_eval --model vllm --model_args pretrained=/dfs/data/zhaozx10/benchmark-eval/Qwen3-4B,gpu_memory_utilization=0.8  --tasks aime25 --output_path /dfs/data/zhaozx10/benchmark-eval/results
[2025-11-27 08:03:49] INFO __main__.py:450: Selected Tasks: []
[2025-11-27 08:03:49] INFO evaluator.py:202: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 | Setting fewshot manual seed to 1234
Traceback (most recent call last):
  File "/dfs/data/conda/envs/eval/bin/lm_eval", line 7, in <module>
    sys.exit(cli_evaluate())
  File "/dfs/data/zhaozx10/benchmark-eval/lm-evaluation-harness-main/lm_eval/__main__.py", line 459, in cli_evaluate
    results = evaluator.simple_evaluate(
  File "/dfs/data/zhaozx10/benchmark-eval/lm-evaluation-harness-main/lm_eval/utils.py", line 458, in _wrapper
    return fn(*args, **kwargs)
  File "/dfs/data/zhaozx10/benchmark-eval/lm-evaluation-harness-main/lm_eval/evaluator.py", line 207, in simple_evaluate
    raise ValueError(
ValueError: No tasks specified, or no tasks found. Please verify the task names.
(eval) root@workspace:/dfs/data/zhaozx10/benchmark-eval/lm-evaluation-harness-main# cd ..
(eval) root@workspace:/dfs/data/zhaozx10/benchmark-eval# lm_eval --model vllm --model_args pretrained=/dfs/data/zhaozx10/benchmark-eval/Qwen3-4B,gpu_memory_utilization=0.8  --tasks aime25 --output_path /dfs/data/zhaozx10/benchmark-eval/results
[2025-11-27 08:04:16] INFO __main__.py:450: Selected Tasks: ['aime25']
[2025-11-27 08:04:16] INFO evaluator.py:202: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 | Setting fewshot manual seed to 1234
[2025-11-27 08:04:16] INFO evaluator.py:240: Initializing vllm model, with arguments: {'pretrained': '/dfs/data/zhaozx10/benchmark-eval/Qwen3-4B', 'gpu_memory_utilization': 0.8}
INFO 11-27 08:04:16 [utils.py:253] non-default args: {'seed': 1234, 'gpu_memory_utilization': 0.8, 'disable_log_stats': True, 'model': '/dfs/data/zhaozx10/benchmark-eval/Qwen3-4B'}
INFO 11-27 08:04:16 [model.py:631] Resolved architecture: Qwen3ForCausalLM
INFO 11-27 08:04:16 [model.py:1745] Using max model len 40960
INFO 11-27 08:04:16 [scheduler.py:216] Chunked prefill is enabled with max_num_batched_tokens=16384.
(EngineCore_DP0 pid=19582) INFO 11-27 08:04:16 [core.py:93] Initializing a V1 LLM engine (v0.11.2) with config: 
