Conversation

@CircleRadon
Contributor

This PR introduces support for VideoLLaMA3, a frontier multimodal foundation model for video understanding.

Below is a screenshot of the evaluation results on VideoMME:
image

Collaborator

@kcz358 kcz358 left a comment

LGTM! Thanks for the contribution

@kcz358 kcz358 merged commit 7761876 into EvolvingLMMs-Lab:main Mar 21, 2025
1 check passed
MichalCiesiolka pushed a commit to MichalCiesiolka/lmms-eval-llmzszl that referenced this pull request Apr 3, 2025
@choucaicai

choucaicai commented Apr 11, 2025

Do your results on mlvudev and longvideobench match those in the paper? When I reproduce VideoLLaMA3-2B, I only get 'mlvu_perception_score, none': 40.11 and 'lvb_acc, none': 0.51458. When I reproduce VideoLLaMA3-7B on VideoMME, I get the same result: "videomme_perception_score,none": 65.29629629629629

dadwadw233 pushed a commit to dadwadw233/lmms-eval that referenced this pull request Apr 28, 2025

3 participants