llama_cpp_manager

Tired of keeping your LLaMA.cpp launch commands in text files? This tool gives you one directory that handles everything LLaMA.cpp-related: save configurations, benchmark models, and manage binaries.

Quick Start

pip3 install requests
python3 setup_wizard.py

The wizard will download LLaMA.cpp binaries and set up defaults for your system.

What It Does

  • Save configurations - Store your preferred launch commands instead of copy-pasting from text files
  • One-click startup - Launch models by name instead of typing long commands
  • Benchmark & compare - Test different models/settings and generate reports
  • Auto-update LLaMA.cpp - Download the latest releases automatically
  • Manage everything locally - Self-contained in one directory

Why This Exists

I got tired of:

  • Keeping launch commands in random text files
  • Retyping the same long ./llama-server commands
  • Manually downloading LLaMA.cpp updates
  • Having to benchmark the server manually

So I built a tool that handles all of this in one place.

Menu

1. Start model server        # Launch saved configurations  
2. Manage configurations     # Add/edit model setups
3. Benchmark models          # Test and compare performance
4. Update llama.cpp          # Download latest releases
5. Settings                  # Configure defaults

Features

Configuration Management

  • Save model launch commands with names (see the sketch after this list)
  • Test configurations automatically
  • Multiple configs per model for different use cases
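
To give the flavor of the idea, here is a minimal sketch of named configurations stored as JSON and replayed later. The schema, model path, and config names below are hypothetical illustrations, not the tool's actual model_configs.json format:

import json

# Hypothetical schema: one named entry per use case, each holding the
# llama-server arguments to replay later.
configs = {
    "coder-fast": {
        "model": "models/example-7b-q4_k_m.gguf",
        "args": ["--ctx-size", "8192", "--n-gpu-layers", "99"],
    },
    "coder-long-context": {
        "model": "models/example-7b-q4_k_m.gguf",
        "args": ["--ctx-size", "32768", "--n-gpu-layers", "99"],
    },
}

with open("model_configs.json", "w") as f:
    json.dump(configs, f, indent=2)

# Reload and show the command each config would produce.
with open("model_configs.json") as f:
    for name, cfg in json.load(f).items():
        cmd = ["./llama-server-local", "-m", cfg["model"], *cfg["args"]]
        print(name, "->", " ".join(cmd))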

Benchmarking

  • Compare model performance with standardized prompts
  • Use Einstein's riddle (a logic puzzle) to test reasoning ability
  • Generate markdown reports with timing data
  • Track tokens/second across different settings (see the sketch below)
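
As an illustration of the measurement itself (a minimal sketch, not the tool's actual benchmarking code), a tokens-per-second check against a running llama-server can be done with the same requests library the tool depends on. It assumes the server is listening on the default port 8080 and that prompt.txt holds the benchmark prompt:

import time
import requests

with open("prompt.txt") as f:
    prompt = f.read()

start = time.time()
resp = requests.post(
    "http://127.0.0.1:8080/completion",
    json={"prompt": prompt, "n_predict": 512},
    timeout=600,
)
resp.raise_for_status()
elapsed = time.time() - start

# llama-server reports its own timings; fall back to wall-clock time
# if the field is missing in your build.
tps = resp.json().get("timings", {}).get("predicted_per_second")
print(f"wall time: {elapsed:.1f}s")
if tps is not None:
    print(f"generation speed: {tps:.1f} tokens/s")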

Auto-Updates

  • Downloads LLaMA.cpp binaries from GitHub releases (sketched below)
  • Platform detection (Linux/Windows/macOS)
  • GPU package selection (CUDA, Vulkan, CPU-only)
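
The update step boils down to querying the GitHub releases API and matching an asset name against your platform and backend. A minimal sketch of the idea; the asset-name substrings are assumptions, since release naming changes between versions, so inspect the actual assets:

import platform
import requests

api = "https://api.github.com/repos/ggml-org/llama.cpp/releases/latest"
release = requests.get(api, timeout=30).json()

# Illustrative substrings seen in release asset names; verify against
# the current release before relying on them.
os_tag = {"Linux": "ubuntu", "Windows": "win", "Darwin": "macos"}[platform.system()]
backend = "vulkan"  # or "cuda", "cpu"

for asset in release["assets"]:
    name = asset["name"].lower()
    if os_tag in name and backend in name:
        print("candidate:", asset["name"], "->", asset["browser_download_url"])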

Multi-GPU Support

  • Tensor split configuration with examples (see the launch sketch below)
  • Binary mode: 1,0 (GPU 0 only), 0,1 (GPU 1 only)
  • Percentage mode: 85,15 (85% GPU 0, 15% GPU 1)
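
These values correspond to llama-server's --tensor-split flag. A hypothetical launch sketch; the model path and layer count are placeholders:

import subprocess

# --tensor-split distributes the model across GPUs in device order:
# "1,0" pins everything to GPU 0, "85,15" puts roughly 85% on GPU 0
# and 15% on GPU 1 (values are treated as proportions).
subprocess.run(
    [
        "./llama-server-local",
        "-m", "models/your-model.gguf",  # placeholder path
        "--n-gpu-layers", "99",          # offload all layers
        "--tensor-split", "85,15",
    ],
    check=True,
)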

Requirements

  • Python 3
  • requests library (for downloading LLaMA.cpp releases)

Files Created

  • settings.json - Your preferences
  • model_configs.json - Saved configurations
  • prompt.txt - Einstein's riddle puzzle used for benchmarking
  • llama-server-local - Downloaded LLaMA.cpp binary

About the Benchmark

The tool uses Einstein's riddle (the classic "Who owns the fish?" logic puzzle) to benchmark models. This tests reasoning ability rather than just raw text generation speed. It's a good middle ground - complex enough to be meaningful, simple enough to run quickly.

You can edit prompt.txt to use your own benchmark prompt if you prefer.

That's it. Everything is self-contained in one directory.
