
Feat/mlx inference #767

Open
RacineAI-comp wants to merge 29 commits into roboflow:develop from RacineAI-comp:feat/mlx-inference

Conversation

@RacineAI-comp

What does this PR do?

Adds native MLX segmentation inference so RFDETRSegNano/Small/Medium/Large work with optimize_for_inference(backend="mlx") on Apple Silicon.

  • New MLXSegInferenceModel with compiled FP16 forward pass (backbone → decoder with intermediates → seg head → masks)
  • New SegHead module (depthwise conv blocks + einsum mask generation)
  • Seg head weight conversion (convert_seg_weights) with key remapping
  • Decoder return_intermediate support to expose spatial features and per-layer hidden states
  • Automatic routing: segmentation_head=True in config → MLXSegInferenceModel, otherwise → MLXInferenceModel
  • Masks are resized to original image dimensions and returned as sv.Detections(mask=...)
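
The einsum mask generation mentioned in the list above can be sketched in NumPy as follows. This is only an illustration, not the PR's `SegHead`: the function name, the `(Q, C)` and `(H, W, C)` shapes, and the zero-logit threshold are all assumptions.

```python
import numpy as np

def masks_from_queries(query_embed, pixel_feats):
    """Dot each query embedding with every pixel feature vector.

    query_embed: (Q, C) array, one embedding per object query (assumed shape).
    pixel_feats: (H, W, C) array, channels-last spatial features (assumed shape).
    Returns mask logits of shape (Q, H, W).
    """
    return np.einsum("qc,hwc->qhw", query_embed, pixel_feats)

rng = np.random.default_rng(0)
queries = rng.standard_normal((5, 16)).astype(np.float32)
feats = rng.standard_normal((8, 8, 16)).astype(np.float32)

logits = masks_from_queries(queries, feats)
# A logit of 0 corresponds to a sigmoid probability of 0.5.
binary_masks = logits > 0.0
```

Each output pixel is a plain dot product between one query and one spatial feature vector, so a single einsum call covers all queries without a Python loop.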

Related Issue(s): None

Type of Change

  • New feature (non-breaking change that adds functionality)

Testing

  • I have tested this change locally
  • I have added/updated tests for this change

Test details:

  • 14 new tests in test_mlx_seg_inference.py: weight conversion key remapping, conv transposition, num_blocks auto-detection, decoder return_intermediate shapes, SegHead output shapes, build_seg_head from dict, postprocess output keys/shapes/values
  • 2 new routing tests in test_mlx_inference.py: verifies seg config routes to MLXSegInferenceModel and det config routes to MLXInferenceModel
  • Full suite: 331 pass, 2 skipped, 0 failures (pytest src/ tests/ -n 2 -m "not gpu")
  • pre-commit run --all-files clean
  • Verified on real image with RFDETRSegLarge: correct person/couch/remote detections with masks via MLX backend
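
For readers unfamiliar with the conv transposition covered by the tests above, here is a rough sketch. It assumes PyTorch's `(out, in, kH, kW)` conv weight layout and a channels-last `(out, kH, kW, in)` target layout. The helper name is hypothetical and this is not the PR's `convert_seg_weights`.

```python
import numpy as np

def to_channels_last_conv_weight(weight):
    """Move the input-channel axis of a 4D conv weight to the end.

    (out, in, kH, kW) -> (out, kH, kW, in). Non-4D arrays (biases, norms)
    are returned unchanged.
    """
    if weight.ndim != 4:
        return weight
    return np.transpose(weight, (0, 2, 3, 1))

# Depthwise 3x3 conv with 8 channels: PyTorch stores it as (8, 1, 3, 3).
w = np.arange(8 * 1 * 3 * 3, dtype=np.float32).reshape(8, 1, 3, 3)
converted = to_channels_last_conv_weight(w)

bias = np.zeros(8, dtype=np.float32)
bias_converted = to_channels_last_conv_weight(bias)
```

A shape assertion plus one element comparison, as in the conversion tests described above, is usually enough to catch a wrong axis order.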

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code where necessary, particularly in hard-to-understand areas
  • My changes generate no new warnings or errors
  • I have updated the documentation accordingly (if applicable)

Additional Context

Tested on Apple M4 Pro. The segmentation pipeline reuses the same backbone and decoder as the detection MLX backend, adding only the seg head and return_intermediate plumbing. No changes to existing detection inference behavior.

@CLAassistant

CLAassistant commented Mar 2, 2026

CLA assistant check
All committers have signed the CLA.

@Borda Borda self-assigned this Mar 3, 2026
@Borda Borda requested a review from Copilot March 3, 2026 16:52
@Borda Borda added the enhancement New feature or request label Mar 3, 2026
Contributor

Copilot AI left a comment


Pull request overview

Adds native MLX-based segmentation inference so RF-DETR segmentation models can run via optimize_for_inference(backend="mlx") on Apple Silicon, extending the existing MLX detection backend.

Changes:

  • Introduces an MLX segmentation inference path (MLXSegInferenceModel) including an MLX SegHead, seg-weight conversion, and decoder intermediate outputs.
  • Adds MLX routing + prediction support in RFDETR.optimize_for_inference() / RFDETR.predict().
  • Adds MLX-only test coverage and a new mlx optional dependency extra.
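
The routing rule summarized above (seg config to `MLXSegInferenceModel`, otherwise `MLXInferenceModel`) could look roughly like this. `ModelConfig` and `select_mlx_model_name` are hypothetical stand-ins. Only the `segmentation_head` flag and the two class names come from the PR description.

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    """Hypothetical stand-in for the real model config."""
    segmentation_head: bool = False

def select_mlx_model_name(config):
    """Pick the inference model class name from the config flag."""
    # getattr with a default keeps detection configs that lack the flag working.
    if getattr(config, "segmentation_head", False):
        return "MLXSegInferenceModel"
    return "MLXInferenceModel"

seg_choice = select_mlx_model_name(ModelConfig(segmentation_head=True))
det_choice = select_mlx_model_name(ModelConfig())
```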

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/rfdetr/detr.py | Adds backend="mlx" routing, MLX model caching, and an MLX prediction path. |
| src/rfdetr/mlx/__init__.py | Exposes MLX builders for detection vs segmentation inference models. |
| src/rfdetr/mlx/inference.py | Implements compiled MLX detection + segmentation inference and postprocessing. |
| src/rfdetr/mlx/decoder.py | Adds return_intermediate support for segmentation features. |
| src/rfdetr/mlx/seg_head.py | Adds MLX segmentation head implementation + builder from converted weights. |
| src/rfdetr/mlx/convert_weights.py | Adds segmentation-head weight extraction/remapping and conv transposition support. |
| src/rfdetr/mlx/backbone.py | Adds MLX DINOv2 backbone implementation used by the MLX inference pipeline. |
| tests/models/test_mlx_inference.py | Adds MLX detection backend tests and routing tests for seg vs det builds. |
| tests/models/test_mlx_seg_inference.py | Adds MLX segmentation backend tests (seg weights, intermediates, seg head, postprocess). |
| pyproject.toml | Adds mlx optional extra and registers an mlx pytest marker. |
| .gitignore | Ignores additional local artifacts (e.g., .pth, scratch/demo outputs). |

@codecov

codecov bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 2.58718% with 866 lines in your changes missing coverage. Please review.
✅ Project coverage is 69%. Comparing base (7450c16) to head (b0a789e).
⚠️ Report is 9 commits behind head on develop.

❌ Your patch check has failed because the patch coverage (3%) is below the target coverage (95%). You can increase the patch coverage or adjust the target coverage.
❌ Your project check has failed because the head coverage (69%) is below the target coverage (95%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@           Coverage Diff           @@
##           develop   #767    +/-   ##
=======================================
- Coverage       77%    69%    -8%     
=======================================
  Files           97    103     +6     
  Lines         7538   8426   +888     
=======================================
+ Hits          5801   5830    +29     
- Misses        1737   2596   +859     

RacineAI-comp and others added 15 commits March 24, 2026 16:44
… postprocess

Addresses review comment by @Copilot (PR roboflow#767)
`np.argpartition(-flat, num_select)` raises ValueError when num_select
equals flat.size; changed kth to num_select-1 to match the detection
postprocess. Added num_select==0 early-exit (with correctly-shaped empty
masks array) for parity with the detection path.
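
A small NumPy sketch of the off-by-one described in this commit message. The function name and the example mask dimensions are illustrative assumptions, not the PR's postprocess code.

```python
import numpy as np

def top_k_flat_indices(scores, num_select):
    """Return flat indices of the num_select highest scores, best first."""
    flat = np.asarray(scores).reshape(-1)
    num_select = min(num_select, flat.size)
    if num_select == 0:
        # Early exit: nothing to select, so return an empty index array.
        return np.empty(0, dtype=np.intp)
    # kth is a 0-based position and must be < flat.size, hence num_select - 1.
    candidates = np.argpartition(-flat, num_select - 1)[:num_select]
    # argpartition does not order the selected block, so sort it explicitly.
    return candidates[np.argsort(-flat[candidates])]

scores = np.array([[0.1, 0.9], [0.5, 0.3]])
# Selecting every element works because kth == size - 1 is in bounds.
all_sorted = top_k_flat_indices(scores, scores.size)

# Empty result that keeps the trailing mask dimensions (H=4, W=6 assumed).
empty_masks = np.zeros((0, 4, 6), dtype=bool)
```

Passing `kth=num_select` directly would fail exactly when every element is selected, which is why the bug only appears when `num_select` equals the flattened score count.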

…dcoding

Addresses review finding [HIGH] by @maintainer-review (PR roboflow#767)
set(range(3,depth,3)) = {3,6,9} is correct for Nano/Small/Medium/Large
(out_feature_indexes=[3,6,9,12]) but silently wrong for RFDETRBaseConfig
and RFDETRLargeDeprecatedConfig (out_feature_indexes=[2,5,8,11]) where
the PyTorch backbone runs full attention at {2,5,8,11}.
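
The mismatch can be reproduced with plain sets. This sketch assumes a 12-block backbone and derives the full-attention layers by keeping the `out_feature_indexes` that fall inside the block range. That rule reproduces both cases stated in the commit message but is otherwise an assumption, not the PR's `backbone.py`.

```python
def full_attention_layers(out_feature_indexes, depth):
    """Keep the configured feature indexes that are valid block indexes."""
    return {i for i in out_feature_indexes if 0 <= i < depth}

DEPTH = 12  # assumed number of transformer blocks

# The previously hardcoded schedule.
hardcoded = set(range(3, DEPTH, 3))

# Nano/Small/Medium/Large style config: derived set matches the hardcoded one.
derived_new = full_attention_layers([3, 6, 9, 12], DEPTH)

# Base / deprecated Large style config: the hardcoded set is wrong here.
derived_old = full_attention_layers([2, 5, 8, 11], DEPTH)
```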

Addresses review finding [MEDIUM] by @maintainer-review (PR roboflow#767)
backbone.py:interpolate_pos_embed() imports scipy.ndimage.zoom; scipy
was only in [train], so pip install 'rfdetr[mlx]' users got a confusing
ModuleNotFoundError at inference time when pos-embed interpolation ran.

Addresses review findings [MEDIUM] by @maintainer-review (PR roboflow#767)
- optimize_for_inference(backend='typo') previously fell through to PyTorch
  silently; now raises ValueError with the supported options listed
- predict(shape=...) was silently ignored when backend='mlx'; now raises
  NotImplementedError with a workaround hint
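
The two guards described above might be sketched like this. The backend names in `SUPPORTED_BACKENDS`, the helper names, and the error wording are hypothetical. Only the exception types and the `backend` / `shape` parameters come from the commit message.

```python
SUPPORTED_BACKENDS = ("pytorch", "mlx")  # hypothetical list of valid values

def validate_backend(backend):
    """Fail loudly on unknown backends instead of falling through."""
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unsupported backend {backend!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    return backend

def check_predict_args(backend, shape=None):
    """Reject arguments the MLX path cannot honor rather than ignoring them."""
    if backend == "mlx" and shape is not None:
        raise NotImplementedError(
            "shape is not supported with the MLX backend"
        )

def error_type(fn, *args, **kwargs):
    """Return the exception class raised by fn, or None if it succeeds."""
    try:
        fn(*args, **kwargs)
    except Exception as exc:
        return type(exc)
    return None

typo_error = error_type(validate_backend, "typo")
shape_error = error_type(check_predict_args, "mlx", shape=(512, 512))
```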

…ection

Addresses review finding [MEDIUM] by @maintainer-review (PR roboflow#767)
MLX source files import mlx.core at module level, which is unavailable
on non-Darwin hosts. Previously --doctest-modules was globally disabled
as a workaround, silently killing doctest coverage for all other modules.

Add a root conftest.py with collect_ignore_glob to skip src/rfdetr/mlx/*.py
during collection and restore the --doctest-modules flag.
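
A root `conftest.py` along these lines would keep doctest collection away from the MLX modules. Making the ignore conditional on whether `mlx` is importable is an assumption added for illustration; the commit message only states that `collect_ignore_glob` is used.

```python
# conftest.py at the repository root (sketch)
import importlib.util

# pytest reads this module-level list and skips matching paths at collection.
collect_ignore_glob = []

# Assumption: only skip the MLX sources when the mlx package is unavailable,
# so their doctests still run on Apple Silicon machines that have it.
if importlib.util.find_spec("mlx") is None:
    collect_ignore_glob.append("src/rfdetr/mlx/*.py")
```

Patterns in `collect_ignore_glob` are resolved relative to the directory containing the `conftest.py`, which is why the file sits at the repository root.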

Addresses review finding [LOW] by @maintainer-review (PR roboflow#767)
_optimized_half was set to False in remove_optimized_model() but never
read anywhere in the codebase; dead code.

Addresses review finding [LOW] by @maintainer-review (PR roboflow#767)
pytest.mark.mlx was registered in pyproject.toml but never applied to
the test files, so 'pytest -m mlx' silently matched nothing. Add a
module-level pytestmark so '-m mlx' correctly selects all MLX tests.

…mage.zoom

Addresses review finding [HIGH] by @maintainer-review (PR roboflow#767)
The original 100-iteration sequential cv2.resize loop dominated inference
time (~2-5ms per mask on CPU). scipy.ndimage.zoom operates on the full
(N, H, W) array in one call (order=1 = bilinear), cutting the resize
step to a single operation. Also drops the undeclared cv2 dependency
(available transitively via supervision, but not listed in [mlx] extras).
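
A rough sketch of the batched resize described above, assuming float masks of shape `(N, H, W)`. The helper name and the empty-input early exit are illustrative additions, not the PR's postprocess code.

```python
import numpy as np
from scipy.ndimage import zoom

def resize_masks(masks, out_h, out_w):
    """Resize a stack of masks in one call with linear interpolation.

    masks: (N, H, W) float array. Returns an (N, out_h, out_w) array.
    """
    n, h, w = masks.shape
    if n == 0:
        # Nothing to interpolate; return a correctly shaped empty array.
        return np.zeros((0, out_h, out_w), dtype=masks.dtype)
    # Zoom factor 1.0 on the first axis leaves the mask count unchanged.
    return zoom(masks, (1.0, out_h / h, out_w / w), order=1)

rng = np.random.default_rng(0)
small = rng.random((3, 4, 4)).astype(np.float32)
resized = resize_masks(small, 8, 6)
empty = resize_masks(np.zeros((0, 4, 4), dtype=np.float32), 8, 6)
```

`zoom` rounds each output dimension from the input size times the zoom factor, so for target sizes that are not clean multiples it is worth asserting the output shape after the call.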

…attn_layers

Addresses review findings [HIGH]/[MEDIUM] by @maintainer-review (PR roboflow#767)

- test_returns_false_when_mlx_not_installed: assertion was
  'check() is False or sys.platform != "darwin"' which is vacuously True
  on every non-Darwin CI runner; removed the short-circuit so the mock
  is actually validated on all platforms

- backbone.py: add comment explaining why full_attn_layers = set(feature_indices)
  matches PyTorch's global-attention schedule (out_feature_indexes excludes
  those layers from windowed attention; feature_indices are the 0-indexed form)
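
To see why the original assertion in the first bullet was vacuous, consider this standalone sketch. The helper names are hypothetical, and the platform is passed in as a parameter so the behavior can be shown on any host.

```python
def vacuous_assertion_passes(check_result, platform):
    """Original form: short-circuits to True on every non-Darwin platform."""
    return check_result is False or platform != "darwin"

def strict_assertion_passes(check_result):
    """Fixed form: only passes when the mocked check really returned False."""
    return check_result is False

# Simulate a broken check() that wrongly returns True with MLX "missing".
broken_result = True

passes_on_linux = vacuous_assertion_passes(broken_result, "linux")
passes_on_darwin = vacuous_assertion_passes(broken_result, "darwin")
passes_strict = strict_assertion_passes(broken_result)
```

On a Linux runner the original form accepts the broken result, so the test could never fail there. The strict form rejects it on every platform.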

Consolidates segmentation inference tests into `test_mlx_inference.py` for unified MLX test coverage. Removes the now-redundant `test_mlx_seg_inference.py`. Updates file-level docstrings to include segmentation tests.

Labels

enhancement New feature or request has conflicts

Projects

None yet


5 participants