I'm trying to load a custom config file by passing its path as the `--tasks` argument, as described in the New Task Guide, but lm_eval cannot find the task file even though the given path is correct.
$ ls -al
total 12
drwxrwxr-x 2 zbw zbw 4096 Nov 24 06:53 .
drwxr-x--- 23 zbw zbw 4096 Nov 24 07:27 ..
-rw-rw-r-- 1 zbw zbw 1037 Nov 24 06:53 gsm8k.yaml
$ lm-eval --tasks ./gsm8k.yaml
2025-11-24:07:24:55 ERROR [__main__:419] Tasks were not found: ./gsm8k.yaml
Try `lm-eval --tasks list` for list of available tasks
Traceback (most recent call last):
File "/data/zbw/miniconda3/envs/dllm/bin/lm-eval", line 7, in <module>
sys.exit(cli_evaluate())
^^^^^^^^^^^^^^
File "/data/zbw/miniconda3/envs/dllm/lib/python3.12/site-packages/lm_eval/__main__.py", line 423, in cli_evaluate
raise ValueError(
ValueError: Tasks not found: ./gsm8k.yaml. Try `lm-eval --tasks {list_groups,list_subtasks,list_tags,list}` to list out all available names for task groupings; only (sub)tasks; tags; or all of the above, or pass '--verbosity DEBUG' to troubleshoot task registration issues.

However, if the path of its parent directory (in the above case, ~/demo or `.`) is passed, lm_eval can load the custom task.
$ lm-eval --tasks .
2025-11-24:07:45:35 INFO [__main__:446] Selected Tasks: [{'tag': ['math_word_problems'], 'task': 'gsm8k', 'dataset_path': '/data/zbw/szj/datasets/openai/gsm8k', 'dataset_name': 'main', 'output_type': 'generate_until', 'training_split': 'train', 'fewshot_split': 'train', 'test_split': 'test', 'doc_to_text': 'Question: {{question}}\nAnswer:', 'doc_to_target': '{{answer}}', 'metric_list': [{'metric': 'exact_match', 'aggregation': 'mean', 'higher_is_better': True, 'ignore_case': True, 'ignore_punctuation': False, 'regexes_to_ignore': [',', '\\$', '(?s).*#### ', '\\.$']}], 'generation_kwargs': {'until': ['Question:', '</s>', '<|im_end|>'], 'do_sample': False, 'temperature': 0.0}, 'repeats': 1, 'num_fewshot': 5, 'filter_list': [{'name': 'strict-match', 'filter': [{'function': 'regex', 'regex_pattern': '#### (\\-?[0-9\\.\\,]+)'}, {'function': 'take_first'}]}, {'name': 'flexible-extract', 'filter': [{'function': 'regex', 'group_select': -1, 'regex_pattern': '(-?[$0-9.,]{2,})|(-?[0-9]+)'}, {'function': 'take_first'}]}], 'metadata': {'version': 3.0}}]
# irrelevant remaining output omitted

Following the error stack trace, the problem lies in the check for missing tasks: when an external file is loaded, lm_eval appends the parsed config (a dict) to task_names, but then checks whether the original path (a str) is in task_names. The str can never match the dict, so the external file is reported as missing.
lm-evaluation-harness/lm_eval/__main__.py
Lines 409 to 415 in 7ddb2b1
for task in [task for task in task_list if task not in task_names]:
    if os.path.isfile(task):
        config = utils.load_yaml_config(task)
        task_names.append(config)
task_missing = [
    task for task in task_list if task not in task_names and "*" not in task
]  # we don't want errors if a wildcard ("*") task name was used
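A minimal sketch of one possible fix (my own guess, not a maintainer patch): remember which task_list entries were successfully resolved as file paths, and exclude those paths from the missing-task check, so a path string is never compared against a parsed config dict. `resolved_files` is a name I introduce here for illustration; it does not exist in the codebase.

# Hypothetical patch sketch for lm_eval/__main__.py
resolved_files = set()
for task in [task for task in task_list if task not in task_names]:
    if os.path.isfile(task):
        config = utils.load_yaml_config(task)
        task_names.append(config)
        resolved_files.add(task)  # remember the path that produced this config
task_missing = [
    task
    for task in task_list
    if task not in task_names
    and task not in resolved_files
    and "*" not in task
]  # we don't want errors if a wildcard ("*") task name was used

With a change along these lines, `lm-eval --tasks ./gsm8k.yaml` would no longer be flagged as missing, while genuinely unknown task names would still raise the ValueError.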