
Fix ONNX export for dynamic batch dimensions#871

Merged
Borda merged 10 commits into roboflow:develop from svengoluza:fix-onnx-dynamic-export
Mar 25, 2026

Conversation

@svengoluza
Contributor

What does this PR do?

ONNX export silently baked spatial dimensions and batch size into the graph as fixed constants. When users tried to run the exported model with a different batch size at inference time, it would fail because the ONNX graph contained hardcoded values derived from the export-time batch size.

Three root causes:

  1. LayerNorm.forward used x.size(3) instead of self.normalized_shape — The ONNX tracer captured the concrete integer from the export-time tensor shape and embedded it as a constant node. Switching to self.normalized_shape (already stored on the module) gives the tracer a static attribute it can reference symbolically.

  2. spatial_shapes was built as a Python list, then converted with torch.as_tensor() — The tracer never saw the h, w values flow through tensor operations, so it treated them as constants. Building spatial_shapes directly as a tensor with index assignment (spatial_shapes[lvl, 0] = h) lets the tracer track the symbolic relationship between the input spatial dims and downstream uses.

  3. gen_encoder_output_proposals created valid_H/valid_W via Python list comprehensions: torch.tensor([H_ for _ in range(N_)]) bakes the batch-dependent value N_ in as a fixed list length. Replacing this with H_.expand(N_) keeps the batch dimension symbolic in the graph.

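The three fixes above can be sketched as follows; the class and function names are illustrative stand-ins, not RF-DETR's actual code, and fix 3 shows the form originally proposed in this PR (a later commit in the same PR swaps it for torch.full):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TraceSafeLayerNorm(nn.Module):
    """Fix 1: normalize over a stored shape attribute, not x.size(3)."""
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.bias = nn.Parameter(torch.zeros(dim))
        self.eps = eps
        self.normalized_shape = (dim,)

    def forward(self, x):
        # x.size(3) would be traced as a concrete int constant;
        # self.normalized_shape is a static module attribute instead.
        return F.layer_norm(x, self.normalized_shape,
                            self.weight, self.bias, self.eps)

def build_spatial_shapes(srcs):
    """Fix 2: build the tensor by index assignment, not via a Python list."""
    spatial_shapes = torch.zeros(len(srcs), 2, dtype=torch.long)
    for lvl, src in enumerate(srcs):
        _, _, h, w = src.shape
        spatial_shapes[lvl, 0] = h  # tracked as tensor ops by the tracer
        spatial_shapes[lvl, 1] = w
    return spatial_shapes

def expand_valid_dim(H_, N_):
    """Fix 3 (as proposed here): broadcast a scalar tensor over the batch."""
    # Avoids materializing a batch-length Python list whose length would
    # be frozen into the graph.
    return H_.expand(N_)
```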
Additionally, a dynamic_batch parameter is added to RFDETR.export_onnx() and the standalone export CLI. When enabled, it marks the batch axis (dim 0) as dynamic on all input and output names, allowing the exported model to accept variable batch sizes at runtime.
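A minimal sketch of how such a flag can translate into the dynamic_axes mapping that torch.onnx.export accepts; the helper name and I/O names here are assumptions, not RF-DETR's actual API:

```python
def build_dynamic_axes(input_names, output_names, dynamic_batch):
    """Map a dynamic_batch flag to a torch.onnx.export dynamic_axes dict."""
    if not dynamic_batch:
        return None  # static export: every dim stays a fixed constant
    # Mark dim 0 ("batch") as dynamic on every input and output name.
    return {name: {0: "batch"}
            for name in list(input_names) + list(output_names)}
```

Passing the result as torch.onnx.export(..., input_names=..., output_names=..., dynamic_axes=build_dynamic_axes(...)) yields a graph whose batch axis is symbolic rather than fixed at the export-time batch size.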

Related Issue(s): #376 , #79

Type of Change

  • Bug fix (non-breaking change that fixes an issue)

Testing

  • I have tested this change locally

@CLAassistant

CLAassistant commented Mar 25, 2026

CLA assistant check
All committers have signed the CLA.

Contributor

Copilot AI left a comment


Pull request overview

Fixes RF-DETR’s ONNX export to better support dynamic batch sizes by removing shape-dependent Python constructs that get constant-folded into the exported graph.

Changes:

  • Replace shape-derived Python constructs with tensor/attribute-based equivalents to avoid baking batch/spatial dims into ONNX graphs.
  • Add dynamic_batch option to RFDETR.export() and the export CLI path to emit dynamic_axes for batch dimension.
  • Update projector LayerNorm to use self.normalized_shape instead of x.size(3) to prevent tracing fixed constants.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Reviewed files:

  • src/rfdetr/models/transformer.py: Reworks spatial_shapes construction and proposal generation to be more ONNX-trace-friendly.
  • src/rfdetr/models/backbone/projector.py: Prevents LayerNorm from embedding the export-time channel size as a constant.
  • src/rfdetr/export/main.py: Adds CLI support for dynamic batch via dynamic_axes.
  • src/rfdetr/detr.py: Adds the dynamic_batch parameter and wires dynamic_axes through the high-level export API.

@codecov

codecov bot commented Mar 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77%. Comparing base (c0fb944) to head (cacdd70).
⚠️ Report is 1 commit behind head on develop.

Additional details and impacted files
@@          Coverage Diff           @@
##           develop   #871   +/-   ##
======================================
  Coverage       77%    77%           
======================================
  Files           97     97           
  Lines         7530   7538    +8     
======================================
+ Hits          5793   5801    +8     
  Misses        1737   1737           

Borda added 5 commits March 25, 2026 13:53
The PR switched spatial_shapes from a Python list to a tensor for
ONNX tracing. But gen_encoder_output_proposals iterates over it and
uses H_/W_ as slice indices and torch.linspace steps, both of which
require Python ints, not scalar tensors — causing TypeError at runtime.

Reintroduce spatial_shapes_hw as a list[tuple[int, int]] built
alongside the tensor during the loop (h/w come from src.shape, so
they are already Python ints). Pass spatial_shapes_hw to
gen_encoder_output_proposals while the tensor form is still used
for MSDeformAttn and level_start_index.

Addresses review comment by @Copilot (PR roboflow#871)
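The dual representation this commit describes can be sketched like so (names follow the commit message; the loop context is illustrative):

```python
import torch

def build_shape_forms(srcs):
    """Keep both a tensor and a list[tuple[int, int]] of level shapes."""
    spatial_shapes = torch.zeros(len(srcs), 2, dtype=torch.long)
    spatial_shapes_hw = []  # int tuples for code that needs Python ints
    for lvl, src in enumerate(srcs):
        _, _, h, w = src.shape           # h, w are already Python ints
        spatial_shapes[lvl, 0] = h       # tensor form: MSDeformAttn,
        spatial_shapes[lvl, 1] = w       # level_start_index
        spatial_shapes_hw.append((h, w)) # int form: slice indices, linspace
    return spatial_shapes, spatial_shapes_hw
```

The int form matters because, for example, torch.linspace requires its steps argument to be a Python int; passing a scalar tensor raises TypeError, which is the failure the commit fixes.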
Add two test suites recommended by Copilot review:

1. TestCliExportMain.test_dynamic_batch_forwards_dynamic_axes — verifies
   that CLI main() passes dynamic_axes={name: {0: 'batch'}} to export_onnx
   for every I/O name when --dynamic_batch=True, and None when False.
   Also updates _make_args() to accept dynamic_batch and fake_export_onnx
   to capture dynamic_axes.

2. test_rfdetr_export_dynamic_batch_forwards_dynamic_axes — verifies that
   RFDETR.export(..., dynamic_batch=True) forwards a correctly keyed
   dynamic_axes dict into export_onnx, covering detection and segmentation
   model configs, plus static (False) baseline.

Addresses review comments by @Copilot (PR roboflow#871)
H_.expand(N_) assumed H_ is a tensor, but gen_encoder_output_proposals
now receives spatial_shapes as list[tuple[int, int]] from Transformer.forward().
Python ints have no .expand() method, causing AttributeError on the export
path (masks=None, two_stage=True).

Replace with torch.full((N_,), H_, ...) which accepts Python ints.

Also add regression test with list[tuple[int, int]] + masks=None to
lock in the int-tuple path and catch any future regressions.
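A sketch of the replacement: torch.full accepts a plain Python int as its fill value, whereas .expand() exists only on tensors (the variable values here are illustrative):

```python
import torch

H_, N_ = 16, 4  # Python ints on the export path (masks=None, two_stage=True)
valid_H = torch.full((N_,), H_, dtype=torch.long)  # was: H_.expand(N_)
```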
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@Borda Borda merged commit 7450c16 into roboflow:develop Mar 25, 2026
23 checks passed
@Borda Borda added the bug Something isn't working label Mar 25, 2026
@Borda Borda mentioned this pull request Mar 26, 2026
2 tasks
Borda added a commit that referenced this pull request Mar 27, 2026
* Fix ONNX export for dynamic batch dimensions
* fix: restore Python int pairs for gen_encoder_output_proposals
* test: add dynamic_batch coverage for CLI and RFDETR.export()
* fix: replace H_.expand(N_) with torch.full for Python int spatial dims
* Apply suggestions from code review

---------

Co-authored-by: jirka <6035284+Borda@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
