ggml-org / llama.cpp Public

Fork 14k
Star 90.9k

Pull requests: ggml-org/llama.cpp

Labels 86 Milestones 0

608 Open 7,977 Closed

Pull requests list

fix: disable regex nosubs|optimize flags on MSVC

#17831 opened Dec 6, 2025 by dranger003

common : change --color to accept on/off/auto, default to auto

#17827 opened Dec 6, 2025 by CISC

[SYCL] Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai

Labels: documentation (Improvements or additions to documentation); ggml (changes relating to the ggml tensor library for machine learning); SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language)

#17826 opened Dec 6, 2025 by NeoZhangJianyu

cann : fix ops broken by circular padding guard

Labels: Ascend NPU (issues specific to Ascend NPUs); ggml

#17825 opened Dec 6, 2025 by CISC

cli: new CLI experience

Labels: devops (improvements to build systems and github actions); examples; script (Script related); server; testing (Everything test related)

#17824 opened Dec 6, 2025 by ngxson • Draft

1 of 5 tasks

llama : add token matching support to llama-grammar

Labels: testing

#17816 opened Dec 6, 2025 by aldehir • Draft

3 tasks done

CANN: support gated linear attn

Labels: Ascend NPU, ggml

#17814 opened Dec 6, 2025 by YushengZhao

vulkan: faster q6_k matmul

Labels: ggml; Vulkan (Issues specific to the Vulkan backend)

#17813 opened Dec 6, 2025 by netrunnereve

model: support Rnj-1

Labels: model (Model specific); python (python script changes)

#17811 opened Dec 6, 2025 by philip-essential

webui: Fix parsing non-LaTeX occurrencies of \( or \)

Labels: examples, server

#17810 opened Dec 6, 2025 by allozaur

webui: Add setting to always show sidebar on Desktop

Labels: examples, server

#17809 opened Dec 6, 2025 by allozaur

server: improve speed of speculative decoding

Labels: examples, server

#17808 opened Dec 5, 2025 by ngxson

(CUDA-only) Efficient inference using llama-mtmd-cli for high resolution images with reduced GPU VRAM usage (#17801)

Labels: examples

#17802 opened Dec 5, 2025 by deepshnv • Draft

[DRAFT] CUDA: Improve performance via less synchronizations between token

Labels: ggml; Nvidia GPU (Issues specific to Nvidia GPUs)

#17795 opened Dec 5, 2025 by aendk • Draft

Make graph_max_nodes vary by ubatch size

#17794 opened Dec 5, 2025 by pwilkin

SOLVE_TRI extension to more dimensions

Labels: ggml, Nvidia GPU, testing

#17793 opened Dec 5, 2025 by pwilkin

ggml-cpu: add repack GEMM and GEMV for floating-point

Labels: ggml

#17791 opened Dec 5, 2025 by taimur-10x • Draft

ggml-cpu: add ggml_thread_cpu_relax with Zihintpause support

Labels: ggml

#17784 opened Dec 5, 2025 by ixgbe

CANN : Optimize mul_mat_id quantization

Labels: Ascend NPU, ggml

#17782 opened Dec 5, 2025 by jjjxp03

sycl: add missing BF16 conversion support for Intel oneAPI

Labels: ggml, SYCL

#17780 opened Dec 5, 2025 by yingying0906

Add link to AshkanYarmoradi/go-llama.cpp

#17776 opened Dec 5, 2025 by AshkanYarmoradi

Move common_chat_parse and common_chat_peg_parse to chat-parser.h

Labels: examples, server, testing

#17772 opened Dec 4, 2025 by sheldonrobinson

docs(server): clarify that --ctx-size is total context divided among parallel slots

Labels: examples, server

#17767 opened Dec 4, 2025 by kitaekatt

Add a search field on model selector / improve mobile display

Labels: examples, server

#17765 opened Dec 4, 2025 by ServeurpersoCom

server : add development documentation

Labels: examples, server

#17760 opened Dec 4, 2025 by ngxson
