-
Notifications
You must be signed in to change notification settings - Fork 452
Open
Description
Environment
lmms-eval=0.3.3
transformers=4.51.3
Context
I used the following commands to test the performance of the Qwen2.5VL-3B model on the vqav2_val_lite task, with batch sizes of 1, 2, and 4 respectively.
task="vqav2_val_lite"
CUDA_VISIBLE_DEVICES=0 \
accelerate launch --num_processes=1 --main_process_port=12345 -m lmms_eval \
--model qwen2_5_vl \
--model_args=pretrained=./Qwen/Qwen2.5-VL-3B-Instruct,use_flash_attention_2=True \
--tasks $task \
--batch_size 1 \
--output_path ./result/Qwen2.5-VL-3B-Instruct/lmms_eval_$task \
--log_samples
CUDA_VISIBLE_DEVICES=0 \
accelerate launch --num_processes=1 --main_process_port=12345 -m lmms_eval \
--model qwen2_5_vl \
--model_args=pretrained=./Qwen/Qwen2.5-VL-3B-Instruct,use_flash_attention_2=True \
--tasks $task \
--batch_size 2 \
--output_path ./result/Qwen2.5-VL-3B-Instruct/lmms_eval_${task}_bs2 \
--log_samples
CUDA_VISIBLE_DEVICES=0 \
accelerate launch --num_processes=1 --main_process_port=12345 -m lmms_eval \
--model qwen2_5_vl \
--model_args=pretrained=./Qwen/Qwen2.5-VL-3B-Instruct,use_flash_attention_2=True \
--tasks $task \
--batch_size 4 \
--output_path ./result/Qwen2.5-VL-3B-Instruct/lmms_eval_${task}_bs4 \
--log_samplesThe results are as follows:
| Batch size | Performance |
|---|---|
| 1 | 0.7478 |
| 2 | 0.7234 |
| 4 | 0.6810 |
Observation
Based on my current observations, the issue might be located here
Simply put, the situation in the code seems to be like this:
# `contexts` is a batch of context messages.
# `visual_list` is a batch of image list.
visual_list = self.flatten(visual_list) # Why do this?
# ...
batched_messages = []
for context in contexts:
# ...
processed_visuals = []
for visual in visual_list:
# ...
processed_visuals.append(...) # This seems to cause every message within the batch to include all images from that entire batch.I hope to get clarification from the contributors or the community.
BBBBchan
Metadata
Metadata
Assignees
Labels
No labels