Commit b2b6080 (parent: 1087bd2)

chore(model gallery): add demyagent-4b-i1

Signed-off-by: Ettore Di Giacinto <[email protected]>

1 file changed: gallery/index.yaml (22 additions, 0 deletions)
@@ -3054,6 +3054,28 @@
     - filename: Qwen3-4B-RA-SFT.Q4_K_M.gguf
       sha256: 49147b917f431d6c42cc514558c7ce3bcdcc6fdfba937bbb6f964702dc77e532
       uri: huggingface://mradermacher/Qwen3-4B-RA-SFT-GGUF/Qwen3-4B-RA-SFT.Q4_K_M.gguf
+- !!merge <<: *qwen3
+  name: "demyagent-4b-i1"
+  icon: https://cdn-avatars.huggingface.co/v1/production/uploads/64fde4e252e82dd432b74ce9/TAEScS71YX5NPRM4TXZc8.png
+  urls:
+    - https://huggingface.co/Gen-Verse/DemyAgent-4B
+    - https://huggingface.co/mradermacher/DemyAgent-4B-i1-GGUF
+  description: |
+    This repository contains the DemyAgent-4B model weights, a 4B-sized agentic reasoning model that achieves state-of-the-art performance on challenging benchmarks including AIME2024/2025, GPQA-Diamond, and LiveCodeBench-v6. DemyAgent-4B is trained using our GRPO-TCR recipe with 30K high-quality agentic RL data, demonstrating that small models can outperform much larger alternatives (14B/32B) through effective RL training strategies.
+    🌟 Introduction
+
+    In our work, we systematically investigate three dimensions of agentic RL: data, algorithms, and reasoning modes. Our findings reveal:
+
+    🎯 Data Quality Matters: Real end-to-end trajectories and high-diversity datasets significantly outperform synthetic alternatives.
+    ⚡ Training Efficiency: Exploration-friendly techniques like reward clipping and entropy maintenance boost training efficiency.
+    🧠 Reasoning Strategy: Deliberative reasoning with selective tool calls surpasses frequent invocation or verbose self-reasoning. We contribute high-quality SFT and RL datasets, demonstrating that simple recipes enable even 4B models to outperform 32B models on the most challenging reasoning benchmarks.
+  overrides:
+    parameters:
+      model: DemyAgent-4B.i1-Q4_K_M.gguf
+  files:
+    - filename: DemyAgent-4B.i1-Q4_K_M.gguf
+      sha256: be619b23510debc492ddba73b6764382a8e0c4e97e5c206e0e2ee86d117c0878
+      uri: huggingface://mradermacher/DemyAgent-4B-i1-GGUF/DemyAgent-4B.i1-Q4_K_M.gguf
 - &gemma3
   url: "github:mudler/LocalAI/gallery/gemma.yaml@master"
   name: "gemma-3-27b-it"
