Pull requests: ml-explore/mlx-lm
feat(nemotron_h): add Multi-Token Prediction (MTP) module (#1161, opened Apr 16, 2026 by Thump604)
Fix Gemma 4 KV-shared layers creating unused projections (#1158, opened Apr 15, 2026 by glyphVault; 5 tasks done)
Fix stale token data in logits processors due to lazy evaluation (#1157, opened Apr 15, 2026 by Thump604)
Fix empty tool_call_end breaking Mistral tool calls (#1151, opened Apr 14, 2026 by eyupcanakman, Contributor)
Fix Gemma4 tool parser: support hyphenated function names and braces in string args (#1150, opened Apr 14, 2026 by AkashKhamkar)
Add TurboQuantKVCache: data-oblivious 2-4 bit KV cache compression (#1144, opened Apr 12, 2026 by Smilefounder; Draft, 3 tasks done)
fix(gemma4): return [] instead of raising on empty tool-call match (#1142, opened Apr 10, 2026 by gofastercloud)
[Fix] Register Basename short_name in default_model_map (#1140, opened Apr 9, 2026 by austin362667)
Add pipeline parallel support for Qwen3 MoE and MiniMax models (#1138, opened Apr 9, 2026 by qubitcontracting)
Pipeline parallel: memory-proportional splitting and inference sync (#1137, opened Apr 9, 2026 by qubitcontracting)
Add RAG example using mlx-lm hidden state embeddings (#1130, opened Apr 8, 2026 by ManjushaMotamarry)
fix(cache): fix batch-size inconsistency crashes in ArraysCache, BatchKVCache, and BatchRotatingKVCache under concurrent generation (#1129, opened Apr 8, 2026 by jarvisxyz)
[Bugfix] Fix Gemma 4 tool call regex failing on unbalanced braces in string arguments (#1127, opened Apr 8, 2026 by Rih0z)
feat(tuner): support loading PEFT/Unsloth LoRA adapters in load_adapters() (#1120, opened Apr 7, 2026 by YUGOROU)
fix: honor --prompt-cache-bytes in sequential serve mode (#1118, opened Apr 7, 2026 by Jw983cam)
fix: unconditionally pop prefix entries in LRUPromptCache.insert_cache (#1117, opened Apr 7, 2026 by Jw983cam)
fix: BatchRotatingKVCache.merge() shape mismatch with different fill levels (#1116, opened Apr 7, 2026 by Jw983cam)
fix: enable speculative decoding for hybrid models (Qwen3.5, fixes #846) (#1111, opened Apr 5, 2026 by alexlee2046)
perf: reduce peak memory during model quantization (#1102, opened Apr 3, 2026 by matteocelani, Contributor; 5 tasks done)
feat: memory-aware auto-config + BatchQuantizedKVCache for batched quantized KV (#1101, opened Apr 3, 2026 by deceptech-packet-ninja; 8 tasks done)