feat: Kornia GPU augmentation backend for detection training#874
Conversation
- Add `augmentation_backend` field to `TrainConfig` (cpu/auto/gpu); cpu is the default
- New `src/rfdetr/datasets/kornia_transforms.py`: registry of 8 transform factories, `build_kornia_pipeline`, `build_normalize`, and `collate_boxes`/`unpack_boxes` box utilities
- Wire a `gpu_postprocess` flag through `coco.py` and `yolo.py` so CPU Albumentations augmentation and normalization are skipped when the GPU path is active
- Add `_setup_kornia_pipeline` + `on_after_batch_transfer` to `RFDETRDataModule`; segmentation models skip GPU augmentation (phase 2) with a one-time warning
- Add `kornia>=0.7,<1` optional dependency group in `pyproject.toml`
- 12 new tests across `test_module_data.py` and `test_kornia_transforms.py`

---
Co-authored-by: Claude Code <noreply@anthropic.com>
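The `collate_boxes`/`unpack_boxes` pair suggests a pad-then-trim pattern for batching variable-length box lists. The actual rfdetr signatures are not shown in this PR, so the sketch below is a hypothetical illustration using plain lists instead of tensors:

```python
def collate_boxes(boxes_per_image):
    """Pad variable-length per-image box lists into one rectangular batch.

    Hypothetical sketch: the real rfdetr utilities likely operate on
    tensors, but the pad-and-record-counts idea is the same.
    """
    counts = [len(b) for b in boxes_per_image]
    max_n = max(counts) if counts else 0
    padded = [b + [[0.0, 0.0, 0.0, 0.0]] * (max_n - len(b)) for b in boxes_per_image]
    return padded, counts


def unpack_boxes(padded, counts):
    """Recover the original per-image box lists by trimming the padding."""
    return [row[:n] for row, n in zip(padded, counts)]


batch, counts = collate_boxes([[[0.1, 0.1, 0.5, 0.5]], []])
assert unpack_boxes(batch, counts) == [[[0.1, 0.1, 0.5, 0.5]], []]
```

Padding is what lets a whole batch of boxes ride through a single batched Kornia call; the counts make the round-trip lossless even for images with zero boxes.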
Pull request overview
Adds an opt-in GPU-side augmentation path for detection training by introducing an augmentation_backend switch and routing normalization/augmentation to run after the batch is transferred to device (via RFDETRDataModule.on_after_batch_transfer), while keeping the existing CPU Albumentations pipeline as the default.
Changes:
- Add `TrainConfig.augmentation_backend` (`"cpu" | "auto" | "gpu"`) and a new optional dependency group `kornia`.
- Thread a `gpu_postprocess` flag through the COCO/YOLO dataset builders so CPU Albumentations + Normalize can be skipped when GPU postprocessing is active.
- Add DataModule logic + tests for backend resolution and the `on_after_batch_transfer` hook.
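Lightning invokes `on_after_batch_transfer` once the batch is already on the target device, which is what makes GPU-side augmentation possible here. A minimal sketch of the hook pattern (the `_kornia_pipeline` name comes from the PR summary; the body is illustrative, not the actual implementation, and the class stands in for a `LightningDataModule`):

```python
class GpuAugDataModule:
    """Minimal stand-in for a LightningDataModule with GPU postprocessing.

    In real code this would subclass lightning.LightningDataModule and
    _kornia_pipeline would be a Kornia augmentation container.
    """

    def __init__(self, pipeline=None):
        # None means the CPU backend is active and the hook is a no-op.
        self._kornia_pipeline = pipeline

    def on_after_batch_transfer(self, batch, dataloader_idx):
        # Called by Lightning after the batch has been moved to the device.
        if self._kornia_pipeline is None:
            return batch
        batch["img"] = self._kornia_pipeline(batch["img"])
        return batch


dm = GpuAugDataModule(pipeline=lambda img: [x * 2 for x in img])
out = dm.on_after_batch_transfer({"img": [1, 2]}, 0)
assert out["img"] == [2, 4]
```

The key property is that the default (CPU) path leaves batches untouched, so opting out of the GPU backend costs nothing.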
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| `src/rfdetr/training/module_data.py` | Adds Kornia pipeline setup and the `on_after_batch_transfer` GPU postprocessing hook. |
| `src/rfdetr/datasets/coco.py` | Adds a `gpu_postprocess` option to training transforms and wires the backend flag through dataset builders. |
| `src/rfdetr/datasets/yolo.py` | Wires the backend flag through the Roboflow-from-YOLO builder. |
| `src/rfdetr/config.py` | Introduces `augmentation_backend` on `TrainConfig`. |
| `src/rfdetr/datasets/aug_config.py` | Documents the Kornia GPU backend and Phase 1 limitations. |
| `pyproject.toml` | Adds the optional dependency group `kornia`. |
| `tests/training/test_module_data.py` | Adds tests for backend resolution and `on_after_batch_transfer`. |
| `tests/training/conftest.py` | Adds an autouse fixture to restore the `RFDETRDataModule.trainer` property after tests. |
| `CHANGELOG.md` | Documents the new `augmentation_backend` feature. |
Comments suppressed due to low confidence (1)
src/rfdetr/datasets/coco.py:356
- The `make_coco_transforms()` docstring and Args list no longer match behavior now that `gpu_postprocess` can skip Albumentations and `Normalize()` for the train split. Please document the new `gpu_postprocess` parameter and clarify that normalization is deferred to the DataModule GPU path when it's enabled.
"""Build the standard COCO transform pipeline for a given dataset split.
Returns a composed transform that resizes images to the target ``resolution``
(with optional multi-scale jitter), applies Albumentations-based augmentations
during training, and normalises pixel values with ImageNet statistics.
For the ``"train"`` split the pipeline uses a two-branch ``OneOf`` between a
direct resize and a resize → random-crop → resize sequence (built via
:func:`_build_train_resize_config`), followed by the augmentation stack and
normalisation. For ``"val"``, ``"test"``, and ``"val_speed"`` only resize and
normalisation are applied — no augmentation.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
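The two-branch `OneOf` structure described in the docstring can be sketched as a branch selection: either a direct resize, or an upsize, random crop, and final resize. This is a hypothetical illustration of the branch logic only; the real pipeline composes Albumentations transforms via `_build_train_resize_config`, and the `crop_frac` parameter is invented for the sketch:

```python
import random


def resize(img, size):
    # Stand-in for the real Resize transform.
    return ("resized", size)


def random_crop(img, size):
    # Stand-in for the real RandomCrop transform.
    return ("cropped", size)


def train_resize(img, resolution, crop_frac=0.9):
    """Two-branch OneOf: direct resize, or resize -> random-crop -> resize.

    Hypothetical sketch of the branch structure; both branches end at the
    target resolution, which is why downstream code sees a fixed size.
    """
    if random.random() < 0.5:
        return resize(img, resolution)              # branch 1: direct resize
    img = resize(img, int(resolution / crop_frac))  # branch 2: upsize,
    img = random_crop(img, resolution)              # random-crop,
    return resize(img, resolution)                  # then final resize


assert train_resize(None, 640) == ("resized", 640)
```

Whichever branch fires, the output size is fixed, so the rest of the pipeline never has to care which branch was taken.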
…rmalize

- Import get_logger and add a module-level logger to o365.py
- Detect augmentation_backend from args; emit a WARNING when non-cpu (Phase 1 limitation: no aug_config support for O365)
- Compute the gpu_postprocess flag and pass it to both make_coco_transforms / make_coco_transforms_square_div_64 calls

Addresses review comment — HIGH blocking: double normalize for O365 users with augmentation_backend != 'cpu' (PR #874)

---
Co-authored-by: Claude Code <noreply@anthropic.com>
…RNING
- Add _kornia_setup_done: bool = False in __init__ to prevent _setup_kornia_pipeline re-running on every setup('fit') call when the auto+no-CUDA/no-kornia fallback leaves _kornia_pipeline as None
- Switch the auto+no-CUDA fallback from logger.info to logger.warning (consistent with auto+no-kornia WARNING)
Addresses review comments — MEDIUM: setup guard re-runs in fallback path; inconsistent log levels (PR #874)
---
Co-authored-by: Claude Code <noreply@anthropic.com>
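The `_kornia_setup_done` sentinel is a standard run-once guard: because the auto fallback intentionally leaves `_kornia_pipeline` as `None`, the pipeline attribute alone cannot distinguish "fell back to CPU" from "never ran", so `setup('fit')` would retry every call. A sketch of the idea (class and attribute bodies are illustrative, not the actual implementation):

```python
class DataModuleSketch:
    """Run-once guard for a setup step whose result may legitimately be None."""

    def __init__(self):
        self._kornia_pipeline = None
        self._kornia_setup_done = False  # separates "fallback" from "never ran"
        self.setup_calls = 0

    def _setup_kornia_pipeline(self):
        self.setup_calls += 1
        # auto + no-CUDA / no-kornia fallback: pipeline stays None on purpose
        self._kornia_pipeline = None

    def setup(self, stage):
        if stage == "fit" and not self._kornia_setup_done:
            self._setup_kornia_pipeline()
            self._kornia_setup_done = True  # set even when the pipeline is None


dm = DataModuleSketch()
dm.setup("fit")
dm.setup("fit")
assert dm.setup_calls == 1
```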
- _make_gaussian_blur: enforce blur_limit >= 3 after odd-rounding (Kornia requires kernel_size >= 3)
- make_coco_transforms, make_coco_transforms_square_div_64: add gpu_postprocess to the Args docstring
- unpack_boxes: correct docstring claiming in-place mutation (the function returns shallow copies)
- conftest.py: fix docstring wording LightningModule → LightningDataModule

Addresses review comments from @Copilot and @review on PR #874

---
Co-authored-by: Claude Code <noreply@anthropic.com>
…pipeline forward pass
- TestGaussianBlurMinKernel: parametrized test for blur_limit=1,2 producing valid kernel_size >= 3
- TestKorniaPipelineForwardPass: shape/dtype check and empty-bbox batch through built pipeline (kornia skip guard)
- TestBuildO365RawGpuBackend: warning emitted for non-cpu backend; gpu_postprocess wired correctly; square-resize delegate
- TestKorniaSetupDoneSentinel: sentinel starts False, set after fit, _setup_kornia_pipeline called exactly once across repeated setup('fit') calls
Closes review test-coverage gaps from PR #874
---
Co-authored-by: Claude Code <noreply@anthropic.com>
When augmentation_backend != 'cpu' and aug_config is not explicitly set,
build_kornia_pipeline was receiving {} (empty dict) while the CPU path
correctly fell back to AUG_CONFIG. This caused GPU training to have zero
augmentation by default — a silent regression.
- Import AUG_CONFIG in module_data.py
- Use `train_config.aug_config if ... is not None else AUG_CONFIG` in _setup_kornia_pipeline
- Add test_gpu_path_uses_aug_config_fallback to TestBackendResolution
Addresses QA finding B1 (blocking) from post-commit review of PR #874
---
Co-authored-by: Claude Code <noreply@anthropic.com>
Codecov Report

❌ Your patch check has failed because the patch coverage (77%) is below the target coverage (95%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@ Coverage Diff @@
## develop #874 +/- ##
=======================================
Coverage 79% 79%
=======================================
Files 97 98 +1
Lines 7829 8070 +241
=======================================
+ Hits 6179 6370 +191
- Misses 1650 1700 +50
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…auto backend
- _make_gaussian_blur: kernel_size=(blur_limit, blur_limit) instead of (3, blur_limit) — square kernel per Albumentations semantics (Copilot #2991808605)
- build_normalize: pass plain Python tuples instead of torch.tensor() so Kornia handles device placement (Copilot #2991808573)
- on_after_batch_transfer: call .to(img.device) on pipeline and normalize before use to prevent CPU/GPU device mismatch (Copilot #2991808540)
- setup("fit"): resolve 'auto' backend via _resolve_augmentation_backend() before dataset build so gpu_postprocess matches actual runtime behavior — fixes silent CPU-normalize stripping on machines without CUDA/kornia (Copilot #2991808618, #2991808669)
- 4 new tests covering _resolve_augmentation_backend and namespace pre-resolution
---
Co-authored-by: Claude Code <noreply@anthropic.com>
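Resolving `'auto'` once, before the datasets are built, is what keeps the CPU transform pipeline and the GPU hook consistent: both sides must agree on whether normalization happens on CPU or GPU. A hypothetical sketch of the ordering (resolver internals simplified; availability flags are passed in explicitly here instead of probed):

```python
def resolve_augmentation_backend(backend, has_cuda, has_kornia):
    """Collapse 'auto' to a concrete backend before any dataset is built."""
    if backend == "auto":
        return "gpu" if (has_cuda and has_kornia) else "cpu"
    return backend


def setup_fit(args, has_cuda, has_kornia):
    # Resolve FIRST so gpu_postprocess matches actual runtime behavior;
    # resolving after dataset build is the bug this commit fixes.
    resolved = resolve_augmentation_backend(
        args["augmentation_backend"], has_cuda, has_kornia
    )
    gpu_postprocess = resolved != "cpu"
    # Datasets built after this point see the already-resolved flag.
    return resolved, gpu_postprocess


assert setup_fit({"augmentation_backend": "auto"}, False, True) == ("cpu", False)
assert setup_fit({"augmentation_backend": "auto"}, True, True) == ("gpu", True)
```

Without the pre-resolution step, a machine without CUDA could strip CPU normalization for a GPU path that never activates, producing unnormalized inputs.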
The try block unconditionally set has_kornia=True with no import, making the except unreachable; on machines with CUDA but without kornia, auto would incorrectly resolve to gpu — causing ImportError or unnormalized training inputs (review HIGH finding). Also update test_auto_backend_emits_warning and test_gpu_postprocess_true_for_auto_backend to mock CUDA+kornia availability so the GPU path is actually exercised; add complementary no-CUDA tests. --- Co-authored-by: Claude Code <noreply@anthropic.com>
…kornia

- Annotated kornia imports with `# type: ignore[import-not-found]` to suppress import errors in type-checking.
- Updated `pyproject.toml` to include kornia in `ignore_missing_imports` for mypy.
- Removed a redundant `num_workers` assignment in `module_data.py`.
…cross modules and tests

- Introduced a `_has_cuda_device` helper in `module_data.py` for fork-safe CUDA checks.
- Updated all instances of `torch.cuda.is_available` in datasets, training modules, and tests to use `_has_cuda_device`.
- Refactored runtime backend resolution for consistent error handling and modularity.
- Enhanced various tests to mock `_has_cuda_device` for deterministic behavior.
- Standardized normalization and backend-resolution behavior across the GPU and CPU paths.
…rt usage

- Replaced ambiguous variables (`H`, `W`, `B`) with descriptive names (`image_height`, `image_width`, `batch
- Add `_has_cuda_device()` and `resolve_augmentation_backend()` to `kornia_transforms.py` as single canonical implementations
- `_has_cuda_device` uses `rfdetr.config.DEVICE` (fork-safe) instead of `torch.cuda.is_available()` — mirrors module_data.py
- `_resolve_runtime_augmentation_backend` in coco.py now delegates to the shared resolver; the yolo.py import is unchanged
- `build_o365_raw` inline resolution replaced with a single `resolve_augmentation_backend()` call, removing duplicate logic and two direct `torch.cuda.is_available()` calls

[resolve #1] /review finding by sw-engineer (report: .temp/output-review-aug-kornia-2026-04-09.md): "Duplicated backend-resolution logic across three modules"
[resolve #2] /review finding by linting-expert (report: .temp/output-review-aug-kornia-2026-04-09.md): "Inconsistent CUDA detection: fork-safe vs direct"

---
Co-authored-by: Claude Code <noreply@anthropic.com>
The `setattr(args, "augmentation_backend", "cpu")` calls mutated the shared namespace when resolving the "auto" or segmentation-forced-cpu paths. The local `resolved_augmentation_backend` variable already controls `gpu_postprocess` correctly; the mutations were redundant side effects that could surprise the DataModule's own `_setup_kornia_pipeline` resolution path.

[resolve #3] /review finding by sw-engineer (report: .temp/output-review-aug-kornia-2026-04-09.md): "Mutable namespace mutation in build_coco()"

---
Co-authored-by: Claude Code <noreply@anthropic.com>
…ata.py

The bare `import kornia.augmentation` statements in `_setup_kornia_pipeline` lacked the `# type: ignore[import-not-found]` annotation used consistently elsewhere in the PR (coco.py, o365.py, kornia_transforms.py).

[resolve #4] /review finding by linting-expert (report: .temp/output-review-aug-kornia-2026-04-09.md): "kornia imports without type: ignore in module_data.py"

---
Co-authored-by: Claude Code <noreply@anthropic.com>
…strings

Both `make_coco_transforms` and `make_coco_transforms_square_div_64` described the train pipeline as always applying augmentation + normalization. Added a paragraph explaining that `gpu_postprocess=True` omits both, deferring them to `RFDETRDataModule.on_after_batch_transfer`.

[resolve #5] /review finding by doc-scribe (report: .temp/output-review-aug-kornia-2026-04-09.md): "make_coco_transforms() docstring not updated for gpu_postprocess"

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- Upgrade typing to built-in generics (Dict→dict, List→list, Tuple→tuple, Optional→X|None) across kornia_transforms.py and coco.py
- Remove unused `typing` imports; use `collections.abc.Callable`
- Add `# noqa: F401` to all unused kornia guard imports in module_data.py and kornia_transforms.py
- Restore `# noqa: N806` on PATHS dicts in coco.py and o365.py (removed by a linter pass, re-flagged by ruff)
- Minor: `super(Cls, self)` → `super()`, `setattr(self.coco, …)` → direct attribute assignment, f-strings instead of `.format()`

---
Co-authored-by: Claude Code <noreply@anthropic.com>
`_has_cuda_device` now reads `rfdetr.config.DEVICE` (set once at import time) instead of calling `torch.cuda.is_available`. Patching `torch.cuda.is_available` had no effect, causing three TestBuildO365RawGpuBackend tests to fail on non-CUDA machines and `TestBuildRoboflowFromCocoBackendResolution.test_auto_no_cuda_keeps_cpu_normalize` to be fragile on CUDA hosts. Updated all six affected patch calls to target `rfdetr.datasets.kornia_transforms._has_cuda_device` directly. --- Co-authored-by: Claude Code <noreply@anthropic.com>
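This is the classic "patch where the name is looked up" rule: since backend resolution reads `_has_cuda_device` from `rfdetr.datasets.kornia_transforms`, patching `torch.cuda.is_available` is a no-op. Illustrated with a self-contained stand-in module (the `helpers`/`resolve` names are hypothetical):

```python
import types
from unittest import mock

# Stand-in for the module under test, which looks its helper up at call time.
helpers = types.SimpleNamespace(_has_cuda_device=lambda: False)


def resolve():
    # Reads the helper through `helpers`, so that is the only valid patch target;
    # patching some other module's function would never be observed here.
    return "gpu" if helpers._has_cuda_device() else "cpu"


with mock.patch.object(helpers, "_has_cuda_device", return_value=True):
    assert resolve() == "gpu"   # patching the looked-up name takes effect
assert resolve() == "cpu"       # and is reverted outside the context
```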
…entation_backend Previously build_coco() only forwarded 'auto' to the resolver, so an explicit augmentation_backend='gpu' with no CUDA would not fail at dataset-build time (the check only happened later in on_after_batch_transfer). Changed the condition from `== "auto"` to `!= "cpu"` so both 'auto' and 'gpu' pass through the shared resolver — matching the behaviour of build_roboflow_from_coco. --- Co-authored-by: Claude Code <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- resolve_augmentation_backend() now raises ValueError for unrecognised backend strings instead of silently returning the raw value (which would leave gpu_postprocess=True while no Kornia pipeline is built, producing unnormalised training inputs)
- _make_affine() converts Albumentations translate_percent=(min, max) to Kornia RandomAffine(translate=(tx, ty)) by taking the max absolute value of the range, matching the intended symmetric translation bound

[resolve #5] Review comment by @Copilot (PR #874): "resolve_augmentation_backend() falls through to `return backend` for any unexpected string..."
[resolve #7] Review comment by @Copilot (PR #874): "_make_affine() forwards Albumentations-style translate_percent directly to Kornia RandomAffine..."

---
Co-authored-by: Claude Code <noreply@anthropic.com>
The test was patching torch.cuda.is_available, which has no effect because backend resolution uses rfdetr.datasets.kornia_transforms._has_cuda_device (which reads the fork-safe DEVICE constant). Also the args namespace was missing augmentation_backend='auto', so the assertion passed trivially against the 'cpu' default rather than the auto path. [resolve #8] Review comment by @Copilot (PR #874): "This test claims to validate augmentation_backend='auto' no-CUDA path, but the constructed args never sets augmentation_backend..." --- Co-authored-by: Claude Code <noreply@anthropic.com>
What does this PR do?
- Add `augmentation_backend` field to `TrainConfig` (cpu/auto/gpu); cpu is the default
- New `src/rfdetr/datasets/kornia_transforms.py`: registry of 8 transform factories, `build_kornia_pipeline`, `build_normalize`, `collate_boxes`/`unpack_boxes` box utilities
- Wire a `gpu_postprocess` flag through `coco.py` and `yolo.py` so CPU Albumentations augmentation and normalize are skipped when the GPU path is active
- Add `_setup_kornia_pipeline` + `on_after_batch_transfer` to `RFDETRDataModule`; segmentation models skip GPU aug (phase 2) with a one-time warning
- Add `kornia>=0.7,<1` optional dep group in `pyproject.toml`
- New tests in `test_module_data.py` and `test_kornia_transforms.py`

Closes #862
Type of Change
Testing
Additional Context