Tired of keeping your LLaMA.cpp launch commands in text files? This tool gives you one self-contained directory that handles everything LLaMA.cpp: saving configurations, benchmarking models, and managing binaries.
```
pip3 install requests
python3 setup_wizard.py
```

The wizard will download LLaMA.cpp binaries and set up defaults for your system.
- Save configurations - Store your preferred launch commands instead of copy-pasting from text files
- One-click startup - Launch models by name instead of typing long commands
- Benchmark & compare - Test different models/settings and generate reports
- Auto-update LLaMA.cpp - Downloads latest releases automatically
- Manage everything locally - Self-contained in one directory
I got tired of:
- Keeping launch commands in random text files
- Retyping the same long `./llama-server` commands
- Manually downloading LLaMA.cpp updates
- Having to benchmark the server manually
So I built a tool that handles all of this in one place.
1. Start model server # Launch saved configurations
2. Manage configurations # Add/edit model setups
3. Benchmark models # Test and compare performance
4. Update llama.cpp # Download latest releases
5. Settings # Configure defaults
Configuration Management
- Save model launch commands with names
- Test configurations automatically
- Multiple configs per model for different use cases
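A saved entry in `model_configs.json` might look roughly like this (the field names and values here are illustrative guesses, not the tool's actual schema; save a config and inspect the file to see the real format):

```json
{
  "llama3-8b-fast": {
    "command": "./llama-server-local -m models/llama3-8b.gguf -ngl 99 -c 4096 --port 8080",
    "notes": "High GPU offload, short context, for quick chats"
  }
}
```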
Benchmarking
- Compare model performance with standardized prompts
- Uses Einstein's riddle (logic puzzle) to test reasoning ability
- Generate markdown reports with timing data
- Track tokens/second across different settings
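The core of a tokens/second measurement is simple to sketch. In the snippet below, `generate` is a stand-in for whatever actually produces tokens (in the real tool, a request to the running llama-server), so only the timing logic is shown:

```python
import time

def tokens_per_second(generate, prompt):
    """Time one generation call and return token throughput.

    `generate` is any callable that takes a prompt and returns a
    list of tokens -- here a placeholder for an HTTP request to
    the llama-server completion endpoint.
    """
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very fast dummy calls
    return len(tokens) / elapsed if elapsed > 0 else 0.0

# Example with a dummy generator (real use would call the server):
rate = tokens_per_second(lambda p: p.split(), "who owns the fish")
```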
Auto-Updates
- Downloads LLaMA.cpp binaries from GitHub releases
- Platform detection (Linux/Windows/macOS)
- GPU package selection (CUDA, Vulkan, CPU-only)
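Platform detection can be as little as mapping `platform.system()` plus the chosen GPU backend to a release asset. A minimal sketch, with the caveat that the asset filenames below are illustrative placeholders and not the exact names on the llama.cpp releases page:

```python
import platform

def pick_asset(gpu="cpu"):
    """Choose a release asset name for this OS and GPU backend.

    Asset names here are made-up placeholders; the real tool
    matches against actual filenames in the GitHub release.
    """
    system = platform.system()  # 'Linux', 'Windows', or 'Darwin'
    if system == "Darwin":
        return "llama-bin-macos.zip"  # macOS builds bundle Metal
    suffix = {"cuda": "cuda", "vulkan": "vulkan", "cpu": "cpu"}[gpu]
    os_tag = "win" if system == "Windows" else "ubuntu"
    return f"llama-bin-{os_tag}-{suffix}.zip"
```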
Multi-GPU Support
- Tensor split configuration with examples
- Binary mode: `1,0` (GPU 0 only), `0,1` (GPU 1 only)
- Percentage mode: `85,15` (85% GPU 0, 15% GPU 1)
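Both modes ultimately boil down to the comma-separated value passed to llama.cpp's `--tensor-split` flag, which treats the numbers as relative proportions. A tiny helper to illustrate (this helper is my own sketch, not part of the tool):

```python
def tensor_split_arg(shares):
    """Format per-GPU shares for llama.cpp's --tensor-split flag.

    llama.cpp interprets the values as relative proportions, so
    [1, 0] pins everything to GPU 0 and [85, 15] splits 85/15.
    """
    return ",".join(str(s) for s in shares)

# tensor_split_arg([85, 15]) -> "85,15" (85% GPU 0, 15% GPU 1)
```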
- Python 3
- `requests` library (for downloading LLaMA.cpp releases)
- `settings.json` - Your preferences
- `model_configs.json` - Saved configurations
- `prompt.txt` - Einstein's riddle puzzle used for benchmarking
- `llama-server-local` - Downloaded LLaMA.cpp binary
The tool uses Einstein's riddle (the classic "Who owns the fish?" logic puzzle) to benchmark models. This tests reasoning ability rather than just raw text generation speed. It's a good middle ground - complex enough to be meaningful, simple enough to run quickly.
You can edit `prompt.txt` to use your own benchmark prompt if you prefer.
That's it. Everything is self-contained in one directory.