@kcz358 (Collaborator) commented Apr 29, 2025

🚀 Introducing Aero-1-Audio, a compact yet mighty audio model.

🧠 Built on Qwen-2.5-1.5B
⚡ Trained in <24h on just 16×H100
🎧 Handles 15+ min audio seamlessly
💡 Outperforms bigger models like Whisper, Qwen-2-Audio & commercial services from ElevenLabs/Scribe

Aero shows: smart data > massive scale.

GitHub Repo: https://github.com/EvolvingLMMs-Lab/Aero-1
Model Checkpoints: https://huggingface.co/lmms-lab/Aero-1-Audio-1.5B
Evaluation Results: https://github.com/EvolvingLMMs-Lab/lmms-eval/tree/dev/aero
Cookbook: https://www.lmms-lab.com/posts/lmms-lab-docs/aero_audio/

Evaluation Results

20250424_092927_results.json
20250421_203304_results.json
20250421_202840_results.json
20250421_170326_results.json

*Note: for some benchmarks, we use gpt-4o-2024-11-20 as the judge model.

Examples

We support batch evaluation for faster inference. Note that results may differ slightly across batch sizes.

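# Task(s) to evaluate; separate multiple tasks with commas
TASK=open_asr_tedlium
CKPT_PATH=lmms-lab/Aero-1-Audio
echo "$TASK"
# Replace commas with underscores so the task list can be used as a log suffix
TASK_SUFFIX="${TASK//,/_}"
echo "$TASK_SUFFIX"

# Launch the evaluation on 8 processes with a batch size of 32
accelerate launch --num_processes 8 --main_process_port 30000 -m lmms_eval \
    --model aero \
    --model_args pretrained=$CKPT_PATH,attn_implementation="flash_attention_2" \
    --tasks "$TASK" \
    --batch_size 32 \
    --log_samples_suffix "$TASK_SUFFIX" \
    --output_path ./logs/ --verbosity DEBUG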
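Since results can vary slightly with batch size, it may help to run the same task at several batch sizes and compare the outputs. The following is a minimal dry-run sketch, not part of this PR: it only prints one command per batch size, using the same flags as the command above. The helper name `build_eval_cmd`, the batch sizes `1 8 32`, the `_bs<N>` suffix, and the per-batch-size output directories are all assumptions made for illustration.

```shell
# Dry-run sketch: print one lmms_eval command per batch size for comparison.
# Only flags from the command above are used; the sweep, the helper name,
# and the per-batch-size log locations are hypothetical.

TASK="open_asr_tedlium"
CKPT_PATH="lmms-lab/Aero-1-Audio"
# Portable equivalent of ${TASK//,/_}: commas become underscores ("a,b" -> "a_b")
TASK_SUFFIX=$(printf '%s' "$TASK" | tr ',' '_')

# Hypothetical helper: prints the evaluation command for the batch size in $1.
build_eval_cmd() {
    echo accelerate launch --num_processes 8 --main_process_port 30000 -m lmms_eval \
        --model aero \
        --model_args "pretrained=${CKPT_PATH},attn_implementation=flash_attention_2" \
        --tasks "$TASK" \
        --batch_size "$1" \
        --log_samples_suffix "${TASK_SUFFIX}_bs$1" \
        --output_path "./logs/bs$1/"
}

# Print the commands only; nothing is launched here.
for bs in 1 8 32; do
    build_eval_cmd "$bs"
done
```

To actually launch the runs, remove the `echo` inside the helper. Writing each batch size to its own directory keeps the result files from different runs apart so they can be compared side by side.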

@Luodian merged commit 819f67e into main on Apr 30, 2025
2 checks passed