Feat/mlx inference by RacineAI-comp · Pull Request #767 · roboflow/rf-detr

RacineAI-comp · 2026-03-02T16:04:05Z

What does this PR do?

Adds native MLX segmentation inference so RFDETRSegNano/Small/Medium/Large work with optimize_for_inference(backend="mlx") on Apple Silicon.

- New MLXSegInferenceModel with compiled FP16 forward pass (backbone → decoder with intermediates → seg head → masks)
- New SegHead module (depthwise conv blocks + einsum mask generation)
- Seg head weight conversion (convert_seg_weights) with key remapping
- Decoder return_intermediate support to expose spatial features and per-layer hidden states
- Automatic routing: segmentation_head=True in config → MLXSegInferenceModel, otherwise → MLXInferenceModel
- Masks are resized to original image dimensions and returned as sv.Detections(mask=...)
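The automatic routing rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's code: `ModelConfig`, `build_mlx_model`, and both stub classes are hypothetical stand-ins, and only the `segmentation_head` flag and the two class names are taken from the description.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    """Hypothetical stand-in for the model config; only the flag matters here."""

    segmentation_head: bool = False


class MLXInferenceModel:
    """Stub for the detection inference model (name taken from the PR text)."""

    def __init__(self, config: ModelConfig) -> None:
        self.config = config


class MLXSegInferenceModel:
    """Stub for the segmentation inference model (name taken from the PR text)."""

    def __init__(self, config: ModelConfig) -> None:
        self.config = config


def build_mlx_model(config: ModelConfig):
    """Assumed helper: choose the inference class from the config flag."""
    if config.segmentation_head:
        return MLXSegInferenceModel(config)
    return MLXInferenceModel(config)


seg_model = build_mlx_model(ModelConfig(segmentation_head=True))
det_model = build_mlx_model(ModelConfig())
print(type(seg_model).__name__, type(det_model).__name__)
```

Keying the choice on a single config flag keeps the caller's entry point unchanged for both model families.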

Related Issue(s): None

Type of Change

New feature (non-breaking change that adds functionality)

Testing

I have tested this change locally
I have added/updated tests for this change

Test details:

- 14 new tests in test_mlx_seg_inference.py: weight conversion key remapping, conv transposition, num_blocks auto-detection, decoder return_intermediate shapes, SegHead output shapes, build_seg_head from dict, postprocess output keys/shapes/values
- 2 new routing tests in test_mlx_inference.py: verifies seg config routes to MLXSegInferenceModel and det config routes to MLXInferenceModel
- Full suite: 331 pass, 2 skipped, 0 failures (pytest src/ tests/ -n 2 -m "not gpu")
- pre-commit run --all-files clean
- Verified on real image with RFDETRSegLarge: correct person/couch/remote detections with masks via MLX backend

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code where necessary, particularly in hard-to-understand areas
My changes generate no new warnings or errors
I have updated the documentation accordingly (if applicable)

Additional Context

Tested on Apple M4 Pro. The segmentation pipeline reuses the same backbone and decoder as the detection MLX backend, adding only the seg head and return_intermediate plumbing. No changes to existing detection inference behavior.

CLAassistant · 2026-03-02T16:04:12Z

All committers have signed the CLA.

Copilot

Pull request overview

Adds native MLX-based segmentation inference so RF-DETR segmentation models can run via optimize_for_inference(backend="mlx") on Apple Silicon, extending the existing MLX detection backend.

Changes:

- Introduces an MLX segmentation inference path (MLXSegInferenceModel) including an MLX SegHead, seg-weight conversion, and decoder intermediate outputs.
- Adds MLX routing + prediction support in RFDETR.optimize_for_inference() / RFDETR.predict().
- Adds MLX-only test coverage and a new mlx optional dependency extra.

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `src/rfdetr/detr.py` | Adds `backend="mlx"` routing, MLX model caching, and an MLX prediction path. |
| `src/rfdetr/mlx/__init__.py` | Exposes MLX builders for detection vs segmentation inference models. |
| `src/rfdetr/mlx/inference.py` | Implements compiled MLX detection + segmentation inference and postprocessing. |
| `src/rfdetr/mlx/decoder.py` | Adds `return_intermediate` support for segmentation features. |
| `src/rfdetr/mlx/seg_head.py` | Adds MLX segmentation head implementation + builder from converted weights. |
| `src/rfdetr/mlx/convert_weights.py` | Adds segmentation-head weight extraction/remapping and conv transposition support. |
| `src/rfdetr/mlx/backbone.py` | Adds MLX DINOv2 backbone implementation used by the MLX inference pipeline. |
| `tests/models/test_mlx_inference.py` | Adds MLX detection backend tests and routing tests for seg vs det builds. |
| `tests/models/test_mlx_seg_inference.py` | Adds MLX segmentation backend tests (seg weights, intermediates, seg head, postprocess). |
| `pyproject.toml` | Adds `mlx` optional extra and registers an `mlx` pytest marker. |
| `.gitignore` | Ignores additional local artifacts (e.g., `.pth`, scratch/demo outputs). |

codecov · 2026-03-13T17:25:23Z

Codecov Report

❌ Patch coverage is 2.58718% with 866 lines in your changes missing coverage. Please review.
✅ Project coverage is 69%. Comparing base (7450c16) to head (b0a789e).
⚠️ Report is 9 commits behind head on develop.

❌ Your patch check has failed because the patch coverage (3%) is below the target coverage (95%). You can increase the patch coverage or adjust the target coverage.
❌ Your project check has failed because the head coverage (69%) is below the target coverage (95%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@           Coverage Diff           @@
##           develop   #767    +/-   ##
=======================================
- Coverage       77%    69%    -8%     
=======================================
  Files           97    103     +6     
  Lines         7538   8426   +888     
=======================================
+ Hits          5801   5830    +29     
- Misses        1737   2596   +859

… postprocess

Addresses review comment by @Copilot (PR roboflow#767)

`np.argpartition(-flat, num_select)` raises ValueError when num_select equals flat.size; changed kth to num_select-1 to match the detection postprocess. Added num_select==0 early-exit (with correctly-shaped empty masks array) for parity with the detection path.
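The `kth` bound described in this commit message can be reproduced in isolation. The sketch below is illustrative only: `top_k_indices` and `empty_masks` are hypothetical helpers, not the PR's postprocess code, and the mask shape is an assumed `(N, H, W)` boolean layout.

```python
import numpy as np


def top_k_indices(scores: np.ndarray, num_select: int) -> np.ndarray:
    """Return indices of the num_select largest scores, best first."""
    flat = scores.ravel()
    num_select = min(num_select, flat.size)
    if num_select == 0:
        # Early exit: nothing to select, return an empty index array.
        return np.empty((0,), dtype=np.int64)
    # kth must lie in [0, flat.size - 1]. Using kth=num_select is out of
    # bounds when num_select == flat.size, so partition around num_select - 1.
    part = np.argpartition(-flat, num_select - 1)[:num_select]
    # argpartition does not order the selected block, so sort it explicitly.
    return part[np.argsort(-flat[part])]


def empty_masks(height: int, width: int) -> np.ndarray:
    """Empty mask stack that still carries the spatial dimensions (assumed layout)."""
    return np.zeros((0, height, width), dtype=bool)


scores = np.array([0.1, 0.9, 0.4, 0.7])

# The original call pattern fails when every element is requested.
try:
    np.argpartition(-scores, scores.size)
    raised = False
except ValueError:
    raised = True

top_all = top_k_indices(scores, 4)
top_two = top_k_indices(scores, 2)
none_selected = top_k_indices(scores, 0)
```

Returning a `(0, H, W)` array rather than a bare empty array keeps downstream code that indexes the spatial dimensions working when nothing is selected.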

…dcoding

Addresses review finding [HIGH] by @maintainer-review (PR roboflow#767)

set(range(3,depth,3)) = {3,6,9} is correct for Nano/Small/Medium/Large (out_feature_indexes=[3,6,9,12]) but silently wrong for RFDETRBaseConfig and RFDETRLargeDeprecatedConfig (out_feature_indexes=[2,5,8,11]) where the PyTorch backbone runs full attention at {2,5,8,11}.
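The mismatch is easy to check by hand. In the sketch below, `hardcoded_full_attn_layers` reproduces the expression quoted in the commit message, while `config_full_attn_layers` is an assumed derivation chosen only because it reproduces both schedules stated above; the PR's actual fix may be written differently. A depth of 12 layers is also an assumption.

```python
from __future__ import annotations


def hardcoded_full_attn_layers(depth: int) -> set[int]:
    """The hard-coded schedule quoted in the commit message."""
    return set(range(3, depth, 3))


def config_full_attn_layers(out_feature_indexes: list[int], depth: int) -> set[int]:
    """Assumed alternative: every configured index that is a valid layer number."""
    return {i for i in out_feature_indexes if i < depth}


depth = 12  # assumed backbone depth

new_style = config_full_attn_layers([3, 6, 9, 12], depth)
old_style = config_full_attn_layers([2, 5, 8, 11], depth)
hardcoded = hardcoded_full_attn_layers(depth)

print(hardcoded == new_style)  # schedules agree for [3, 6, 9, 12]
print(hardcoded == old_style)  # schedules disagree for [2, 5, 8, 11]
```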

Addresses review finding [MEDIUM] by @maintainer-review (PR roboflow#767)

backbone.py:interpolate_pos_embed() imports scipy.ndimage.zoom; scipy was only in [train], so pip install 'rfdetr[mlx]' users got a confusing ModuleNotFoundError at inference time when pos-embed interpolation ran.

Addresses review findings [MEDIUM] by @maintainer-review (PR roboflow#767)
- optimize_for_inference(backend='typo') previously fell through to PyTorch silently; now raises ValueError with the supported options listed
- predict(shape=...) was silently ignored when backend='mlx'; now raises NotImplementedError with a workaround hint
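A fail-fast version of both checks might look like the sketch below. The function names, the `"torch"` backend label, and the error wording are all assumptions made for illustration; only `backend="mlx"`, the exception types, and the `shape` argument come from the commit message.

```python
from typing import Optional, Tuple

# "mlx" comes from the PR; "torch" is an assumed label for the default path.
SUPPORTED_BACKENDS = ("torch", "mlx")


def validate_backend(backend: str) -> str:
    """Reject unknown backends instead of silently falling back."""
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unsupported backend {backend!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    return backend


def check_predict_args(backend: str, shape: Optional[Tuple[int, int]] = None) -> None:
    """Refuse an argument the MLX path cannot honour rather than ignoring it."""
    if backend == "mlx" and shape is not None:
        # The real message reportedly includes a workaround hint; it is not
        # reproduced here because the source does not quote it.
        raise NotImplementedError("shape=... is not supported with backend='mlx'")


try:
    validate_backend("typo")
    typo_error = None
except ValueError as exc:
    typo_error = str(exc)

try:
    check_predict_args("mlx", shape=(384, 384))
    shape_rejected = False
except NotImplementedError:
    shape_rejected = True
```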

…ection

Addresses review finding [MEDIUM] by @maintainer-review (PR roboflow#767)

MLX source files import mlx.core at module level, which is unavailable on non-Darwin hosts. Previously --doctest-modules was globally disabled as a workaround, silently killing doctest coverage for all other modules. Add a root conftest.py with collect_ignore_glob to skip src/rfdetr/mlx/*.py during collection and restore the --doctest-modules flag.
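A root `conftest.py` along these lines would keep pytest from importing the MLX modules during collection. `collect_ignore_glob` is a standard pytest hook variable; making the exclusion conditional on MLX being importable is an assumption of this sketch, since the commit message does not say whether the PR's version is conditional.

```python
# conftest.py at the repository root (sketch)
import importlib.util

# Glob patterns listed here are skipped by pytest during collection,
# including --doctest-modules collection of source files.
collect_ignore_glob = []

# Assumption: only skip the MLX sources when mlx itself cannot be imported,
# so their doctests still run on hosts where MLX is installed.
if importlib.util.find_spec("mlx") is None:
    collect_ignore_glob.append("src/rfdetr/mlx/*.py")
```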

Addresses review finding [LOW] by @maintainer-review (PR roboflow#767)

pytest.mark.mlx was registered in pyproject.toml but never applied to the test files, so 'pytest -m mlx' silently matched nothing. Add a module-level pytestmark so '-m mlx' correctly selects all MLX tests.

…mage.zoom

Addresses review finding [HIGH] by @maintainer-review (PR roboflow#767)

The original 100-iteration sequential cv2.resize loop dominated inference time (~2-5ms per mask on CPU). scipy.ndimage.zoom operates on the full (N, H, W) array in one call (order=1 = bilinear), cutting the resize step to a single operation. Also drops the undeclared cv2 dependency (available transitively via supervision, but not listed in [mlx] extras).
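The single-call resize can be sketched as below. `resize_masks` is a hypothetical helper, not the PR's function, and the input is an assumed float `(N, H, W)` stack of mask logits or probabilities.

```python
import numpy as np
from scipy.ndimage import zoom


def resize_masks(masks: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Resize a (N, H, W) mask stack to (N, out_h, out_w) with linear interpolation."""
    n, h, w = masks.shape
    if n == 0:
        # Nothing to interpolate; keep the target spatial shape.
        return np.zeros((0, out_h, out_w), dtype=masks.dtype)
    # A zoom factor of 1 on the first axis leaves the mask count untouched,
    # so all masks are resized in one vectorised call instead of a loop.
    return zoom(masks, (1, out_h / h, out_w / w), order=1)


masks = np.random.rand(3, 8, 8).astype(np.float32)
resized = resize_masks(masks, 16, 24)
print(resized.shape)
```

One caveat worth keeping in mind: with its default settings `zoom` uses a different pixel-coordinate convention from `cv2.resize`, so values near mask borders may differ slightly between the two approaches.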

…attn_layers

Addresses review findings [HIGH]/[MEDIUM] by @maintainer-review (PR roboflow#767)
- test_returns_false_when_mlx_not_installed: assertion was 'check() is False or sys.platform != "darwin"' which is vacuously True on every non-Darwin CI runner; removed the short-circuit so the mock is actually validated on all platforms
- backbone.py: add comment explaining why full_attn_layers = set(feature_indices) matches PyTorch's global-attention schedule (out_feature_indexes excludes those layers from windowed attention; feature_indices are the 0-indexed form)
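The difference between the vacuous and the strict assertion can be shown with the standard library alone. `mlx_available` is a hypothetical stand-in for the check under test; hiding the package through `sys.modules` is one common way to simulate a missing dependency and is not necessarily how the PR's test mocks it.

```python
import sys
from unittest import mock


def mlx_available() -> bool:
    """Hypothetical availability check: True only if `mlx` imports cleanly."""
    try:
        import mlx  # noqa: F401
    except ImportError:
        return False
    return True


def check_with_mlx_hidden() -> bool:
    # A None entry in sys.modules makes `import mlx` raise ImportError,
    # simulating a host without MLX regardless of the real platform.
    with mock.patch.dict(sys.modules, {"mlx": None}):
        return mlx_available()


result = check_with_mlx_hidden()

# Vacuous form: off macOS the right-hand side is already True, so this
# passes even if mlx_available() returned the wrong answer.
vacuous_ok = (result is False) or (sys.platform != "darwin")

# Strict form: the mocked behaviour is validated on every platform.
strict_ok = result is False
```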

Consolidates segmentation inference tests into `test_mlx_inference.py` for unified MLX test coverage. Removes the now-redundant `test_mlx_seg_inference.py`. Updates file-level docstrings to include segmentation tests.

RacineAI-comp requested review from Borda, SkalskiP, isaacrob and probicheaux as code owners March 2, 2026 16:04

github-actions bot added the has conflicts label Mar 3, 2026

Borda self-assigned this Mar 3, 2026

Borda requested a review from Copilot March 3, 2026 16:52

Copilot started reviewing on behalf of Borda March 3, 2026 16:53

github-actions bot removed the has conflicts label Mar 3, 2026

Borda added the enhancement New feature or request label Mar 3, 2026

Copilot AI reviewed Mar 3, 2026

src/rfdetr/mlx/decoder.py (outdated, resolved)

src/rfdetr/mlx/inference.py (outdated, resolved)

src/rfdetr/mlx/inference.py (resolved)

github-actions bot added has conflicts and removed has conflicts labels Mar 7, 2026

Borda force-pushed the develop branch from 7b535b4 to 34b3e3d Compare March 13, 2026 15:24

Borda requested a review from Matvezy as a code owner March 13, 2026 15:24

Borda force-pushed the develop branch from a39bebd to a6e6ca0 Compare March 13, 2026 16:41

github-actions bot removed the has conflicts label Mar 13, 2026

Borda force-pushed the develop branch from a6e6ca0 to 0485141 Compare March 13, 2026 17:07

github-actions bot added the has conflicts label Mar 13, 2026

RacineAI-comp force-pushed the feat/mlx-inference branch from a9f2ed9 to 7d4a6dc Compare March 14, 2026 10:11

RacineAI-comp and others added 5 commits March 14, 2026 11:16

Add native Apple Silicon inference via MLX backend

d73824a

Add MLX segmentation backend (RFDETRSeg* via optimize_for_inference)

cffe7a0

Add MLX segmentation backend (RFDETRSeg* via optimize_for_inference)

e5bd7a1

fix test warnings (roboflow#761)

65a5d7a

fix: revert --doctest-modules (breaks CI on non-MLX platforms)

82f67e4

RacineAI-comp force-pushed the feat/mlx-inference branch from 7d4a6dc to 82f67e4 Compare March 14, 2026 10:21

github-actions bot removed the has conflicts label Mar 14, 2026

fix: remove stale train_from_config from rebase, add TYPE_CHECKING for torch annotation

ede91f3

RacineAI-comp force-pushed the feat/mlx-inference branch from 40a7690 to ede91f3 Compare March 14, 2026 10:36

github-actions bot added the has conflicts label Mar 14, 2026

Merge branch 'develop' into feat/mlx-inference

849eeac

github-actions bot removed the has conflicts label Mar 16, 2026

RacineAI-comp and others added 6 commits March 17, 2026 09:52

Merge branch 'develop' into feat/mlx-inference

63f30df

Merge branch 'develop' into feat/mlx-inference

15f660d

Merge branch 'develop' into feat/mlx-inference

37f0687

Apply suggestions from code review

d3daf3c

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Merge branch 'develop' into feat/mlx-inference

7916b0f

Update decoder.py

167ab24

github-actions bot added the has conflicts label Mar 19, 2026

Merge branch 'develop' into feat/mlx-inference

505a009

github-actions bot removed the has conflicts label Mar 23, 2026

RacineAI-comp and others added 15 commits March 24, 2026 16:44

Merge branch 'develop' into feat/mlx-inference

47d77a9

Merge branch 'develop' into feat/mlx-inference

8be5694

Merge branch 'develop' into feat/mlx-inference

b47a72b

refactor(detr): remove dead _optimized_half attribute

0b2401d

Addresses review finding [LOW] by @maintainer-review (PR roboflow#767) _optimized_half was set to False in remove_optimized_model() but never read anywhere in the codebase; dead code.

lint: auto-fix violations after resolve cycle

aa34d81

Merge branch 'feat/mlx-inference' of https://github.com/RacineAI-comp/rf-detr into feat/mlx-inference

397c14c

github-actions bot added the has conflicts label Mar 30, 2026

RacineAI-comp wants to merge 29 commits into roboflow:develop from RacineAI-comp:feat/mlx-inference

5 participants
