Commit 1087bd2

chore(model gallery): add qwen3-4b-ra-sft (#6458)
Signed-off-by: Ettore Di Giacinto <[email protected]>
1 parent 7ed3666 commit 1087bd2

1 file changed: +20 -0 lines changed

gallery/index.yaml

Lines changed: 20 additions & 0 deletions
@@ -3034,6 +3034,26 @@
     - filename: gustavecortal_Beck-4B-Q4_K_M.gguf
       sha256: f4af0cf3e6adedabb79c16d8d5d6d23a3996f626d7866ddc27fa80011ce695af
       uri: huggingface://bartowski/gustavecortal_Beck-4B-GGUF/gustavecortal_Beck-4B-Q4_K_M.gguf
+- !!merge <<: *qwen3
+  name: "qwen3-4b-ra-sft"
+  icon: https://cdn-avatars.huggingface.co/v1/production/uploads/64fde4e252e82dd432b74ce9/TAEScS71YX5NPRM4TXZc8.png
+  urls:
+    - https://huggingface.co/Gen-Verse/Qwen3-4B-RA-SFT
+    - https://huggingface.co/mradermacher/Qwen3-4B-RA-SFT-GGUF
+  description: |
+    a 4B-sized agentic reasoning model that is finetuned with our 3k Agentic SFT dataset, based on Qwen3-4B-Instruct-2507.
+    In our work, we systematically investigate three dimensions of agentic RL: data, algorithms, and reasoning modes. Our findings reveal
+
+    🎯 Data Quality Matters: Real end-to-end trajectories and high-diversity datasets significantly outperform synthetic alternatives
+    ⚡ Training Efficiency: Exploration-friendly techniques like reward clipping and entropy maintenance boost training efficiency
+    🧠 Reasoning Strategy: Deliberative reasoning with selective tool calls surpasses frequent invocation or verbose self-reasoning We contribute high-quality SFT and RL datasets, demonstrating that simple recipes enable even 4B models to outperform 32B models on the most challenging reasoning benchmarks.
+  overrides:
+    parameters:
+      model: Qwen3-4B-RA-SFT.Q4_K_M.gguf
+  files:
+    - filename: Qwen3-4B-RA-SFT.Q4_K_M.gguf
+      sha256: 49147b917f431d6c42cc514558c7ce3bcdcc6fdfba937bbb6f964702dc77e532
+      uri: huggingface://mradermacher/Qwen3-4B-RA-SFT-GGUF/Qwen3-4B-RA-SFT.Q4_K_M.gguf
 - &gemma3
   url: "github:mudler/LocalAI/gallery/gemma.yaml@master"
   name: "gemma-3-27b-it"
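
The new `files` entry pairs the GGUF filename with a `sha256` digest. As a minimal sketch of how a reviewer might check a locally downloaded copy against that digest, the Python below hashes a file in chunks and compares the result. The helper names `sha256_of_file` and `matches_gallery_checksum` are hypothetical and are not part of LocalAI; only the digest value is taken from the diff above.

```python
import hashlib

# Digest copied from the `sha256` field of the qwen3-4b-ra-sft entry in this commit.
EXPECTED_SHA256 = "49147b917f431d6c42cc514558c7ce3bcdcc6fdfba937bbb6f964702dc77e532"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so multi-gigabyte GGUF files never sit fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_gallery_checksum(path, expected=EXPECTED_SHA256):
    """Hypothetical helper: True if the file's digest equals the expected one."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_gallery_checksum("Qwen3-4B-RA-SFT.Q4_K_M.gguf")` on a downloaded copy should return `True` only if the file matches the digest recorded in the gallery entry; the local path here is an assumption about where the file was saved.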