@lingfengren
Contributor

MM-Vet v2 is an extension of MM-Vet that adds a new vision-language (VL) capability called "image-text sequence understanding", which evaluates a model's ability to process VL sequences.
paper: https://arxiv.org/abs/2408.00765
github: https://github.com/yuweihao/MM-Vet/tree/main/v2

Since a sample may contain multiple images, mmvetv2_group_img.yaml is provided for models that cannot handle multiple image inputs.
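
As a rough sketch of what combining several images into one could involve, the snippet below computes a canvas size and paste offsets for a left-to-right layout. The side-by-side arrangement and the helper name `side_by_side_layout` are assumptions for illustration only; the grouping logic actually used by mmvetv2_group_img.yaml may differ.

```python
def side_by_side_layout(sizes):
    """Return (canvas_size, offsets) for placing images left to right.

    `sizes` is a list of (width, height) tuples, one per image.
    Hypothetical helper: the layout is an assumption, not the PR's method.
    """
    canvas_width = sum(width for width, _ in sizes)
    canvas_height = max(height for _, height in sizes)

    offsets = []
    x = 0
    for width, _ in sizes:
        # Each image starts where the previous one ended, aligned to the top.
        offsets.append((x, 0))
        x += width
    return (canvas_width, canvas_height), offsets


canvas_size, offsets = side_by_side_layout([(640, 480), (320, 240)])
print(canvas_size, offsets)
```

With Pillow, the returned canvas size could be passed to `Image.new("RGB", canvas_size)` and each image pasted at its offset with `Image.paste`.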

@pufanyi
Collaborator

pufanyi commented Dec 11, 2024

Hi! Thank you so much for your contribution!!!!

It looks good to me! However, is it possible to add a simple comment before mmvetv2_group_img.yaml, something like (if I understood correctly):

# This task differs from mmvet2 in that it combines multiple images into one, to accommodate models that cannot accept multiple images as input.

Thanks!!!

@lingfengren
Contributor Author

> Hi! Thank you so much for your contribution!!!!
>
> It looks good to me! However, is it possible to add a simple comment before mmvetv2_group_img.yaml, something like (if I understood correctly):
>
> # This task differs from mmvet2 in that it combines multiple images into one, to accommodate models that cannot accept multiple images as input.
>
> Thanks!!!

Ok, I'll add this comment later

@pufanyi
Collaborator

pufanyi commented Dec 11, 2024

Thanks!!!

@pufanyi pufanyi merged commit 1f30374 into EvolvingLMMs-Lab:main Dec 11, 2024
1 check passed
@Luodian
Contributor

Luodian commented Dec 13, 2024

image

Hi I think the evaluation still has some errors?

@lingfengren
Contributor Author

lingfengren commented Dec 13, 2024

> Hi I think the evaluation still has some errors?

It seems that the length of image_list is zero. The images should be stored in doc, with each image name as the key.
The following works fine for me.

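from PIL import Image

# Assumes this runs in the module that defines get_images_tokens,
# mmvet_doc_to_visual, and mmvet_group_img_doc_to_visual.
if __name__ == "__main__":
    # doc maps "question" to its text and each image name to a PIL image.
    doc: dict = {
        "question": "Describe the two pointed objects in the image.<IMG>v2_355_0.png<IMG>image<IMG>v2_9_0.png",
    }
    img_names = get_images_tokens(doc["question"])
    for img_name in img_names:
        # Store each image in doc under its file name.
        doc[img_name] = Image.open(
            "path2mmvetv2/mm-vet-v2/images/" + img_name
        ).convert("RGB")
    print(mmvet_doc_to_visual(doc))
    print(mmvet_group_img_doc_to_visual(doc))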
image
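
To make the point about image names as keys concrete, here is a minimal sketch of how names might be pulled out of a question string and looked up in doc. The helpers `extract_image_names` and `collect_images` are hypothetical stand-ins, not the repository's get_images_tokens or mmvet_doc_to_visual, and the rule "split on <IMG>, keep segments that end in an image extension" is an assumption based on the example question above. Strings stand in for PIL images so the snippet runs without any image files.

```python
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def extract_image_names(question):
    """Hypothetical parser: return the <IMG>-delimited segments that look like image files."""
    segments = [segment.strip() for segment in question.split("<IMG>")]
    return [s for s in segments if s.lower().endswith(IMAGE_EXTENSIONS)]


def collect_images(doc):
    """Look up each image name in doc; names missing from doc are skipped."""
    names = extract_image_names(doc["question"])
    return [doc[name] for name in names if name in doc]


question = "Describe the two pointed objects in the image.<IMG>v2_355_0.png<IMG>image<IMG>v2_9_0.png"

# Without images stored under their names, the resulting list is empty.
empty_doc = {"question": question}
print(len(collect_images(empty_doc)))

# With placeholder values stored under the image names, both are found.
full_doc = {"question": question, "v2_355_0.png": "img-a", "v2_9_0.png": "img-b"}
print(collect_images(full_doc))
```

Under these assumptions, a zero-length image list points to a doc that does not carry the images under the expected keys, rather than to the parsing step.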

@pufanyi
Collaborator

pufanyi commented Dec 13, 2024

It seems this is because utils.py defines mmvet_doc_to_visual twice:

def mmvet_doc_to_visual(doc):

def mmvet_doc_to_visual(doc):

I have fixed it, along with some other errors, in #457.
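
For readers unfamiliar with why a duplicated function name causes trouble, here is a small self-contained illustration. In Python, a second `def` with the same name rebinds that name, so every later call uses the last definition. The function bodies below are invented for the demonstration and are not the code from utils.py.

```python
def doc_to_visual(doc):
    # First definition: returns the images stored under the listed keys.
    return [doc[key] for key in doc["image_keys"]]


def doc_to_visual(doc):  # same name: silently replaces the definition above
    # Second definition (hypothetical): ignores the stored images.
    return []


doc = {"image_keys": ["a.png"], "a.png": "img-a"}
result = doc_to_visual(doc)
print(result)  # the second definition is the one that runs
```

No error or warning is raised at import time, which is why this kind of duplication is easy to miss until the output looks wrong.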

MichalCiesiolka pushed a commit to MichalCiesiolka/lmms-eval-llmzszl that referenced this pull request Apr 3, 2025
dadwadw233 pushed a commit to dadwadw233/lmms-eval that referenced this pull request Apr 28, 2025