fix: slice numpy array values in custom_data per row in CSVSink #2199

farukalamai wants to merge 1 commit into roboflow:develop

Conversation
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```
@@           Coverage Diff            @@
##           develop    #2199   +/-   ##
=======================================
- Coverage       77%      77%      -0%
=======================================
  Files           62       62
  Lines         7640     7650      +10
=======================================
+ Hits          5919     5926       +7
- Misses        1721     1724       +3
```
Pull request overview
Fixes incorrect serialization of per-detection custom_data in CSVSink/JSONSink when users pass numpy arrays (previously the full array was written on every row), aligning output with expected “one value per detection row” behavior.
Changes:
- Update `CSVSink.parse_detection_data()` to slice `custom_data` numpy arrays per detection row.
- Update `JSONSink.parse_detection_data()` to slice `custom_data` numpy arrays per detection row.
- Add a unit test ensuring `CSVSink` slices numpy-array `custom_data` per row.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `src/supervision/detection/tools/csv_sink.py` | Slice numpy-array `custom_data` per detection row when producing CSV rows. |
| `src/supervision/detection/tools/json_sink.py` | Apply analogous per-row slicing for numpy-array `custom_data` when producing JSON rows. |
| `tests/detection/test_csv.py` | Add regression test covering numpy-array `custom_data` in `CSVSink`. |
```python
if custom_data:
    row.update(custom_data)
    for key, value in custom_data.items():
        if isinstance(value, np.ndarray) and value.ndim == 0:
            row[key] = str(value)
        elif isinstance(value, np.ndarray):
            row[key] = str(value[i])
        else:
            row[key] = value
```
`custom_data` numpy arrays are serialized using `str(...)`, which turns numeric values into JSON strings (while other built-in fields like `confidence` are numbers). Consider converting numpy values to native Python scalars (e.g., via `.item()` for 0-d arrays and elements) so JSON output preserves numeric types and remains consistently typed.
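A minimal sketch of the conversion this comment suggests, using a hypothetical `to_native` helper (the name is illustrative, not part of the PR):

```python
import numpy as np

def to_native(value, i):
    # Hypothetical helper sketching the suggestion above: return native
    # Python scalars via .item() instead of stringifying with str(...).
    if isinstance(value, np.ndarray) and value.ndim == 0:
        return value.item()       # 0-d array -> native scalar
    if isinstance(value, np.ndarray):
        return value[i].item()    # element for detection row i -> native scalar
    return value                  # plain Python values pass through unchanged
```

With this, `json.dumps({"area": to_native(np.array([1.5, 2.5]), 0)})` emits `{"area": 1.5}` rather than a quoted string.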
```python
for key, value in custom_data.items():
    if isinstance(value, np.ndarray) and value.ndim == 0:
        row[key] = str(value)
    elif isinstance(value, np.ndarray):
        row[key] = str(value[i])
    else:
```
Indexing `custom_data` numpy arrays with `value[i]` will raise `IndexError` if the array length doesn't match the number of detections (including the common case of a 1-element array intended as a constant). It would be safer to validate lengths and either broadcast length-1 arrays or raise a clear `ValueError` explaining the expected shape.
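One way that validation could look, sketched with a hypothetical `slice_custom_value` helper that broadcasts length-1 arrays and fails loudly on a mismatch:

```python
import numpy as np

def slice_custom_value(value, i, n_detections):
    # Hypothetical helper illustrating the length check suggested above.
    if not isinstance(value, np.ndarray) or value.ndim == 0:
        return value
    if len(value) == 1:
        return value[0]  # broadcast a single value to every detection row
    if len(value) != n_detections:
        raise ValueError(
            f"custom_data array of length {len(value)} does not match "
            f"{n_detections} detections; expected length 1 or {n_detections}"
        )
    return value[i]
```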
```python
if custom_data:
    row.update(custom_data)
    for key, value in custom_data.items():
        if isinstance(value, np.ndarray) and value.ndim == 0:
            row[key] = str(value)
        elif isinstance(value, np.ndarray):
            row[key] = str(value[i])
        else:
            row[key] = value
parsed_rows.append(row)
```
The PR changes `JSONSink` behavior, but there's no unit test covering `custom_data` passed as a numpy array (similar to the new `CSVSink` test). Adding a test that asserts per-row slicing and JSON-serializable output would prevent regressions and confirm the fix end-to-end.
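A sketch of what such a test could pin down. To stay self-contained it checks the row-building logic through a stand-in `build_rows` function (a hypothetical name, not the library's API); a real regression test would go through `JSONSink` itself, as the new `CSVSink` test does:

```python
import json
import numpy as np

def build_rows(n_detections, custom_data):
    # Stand-in for the per-detection row construction: each row receives
    # its own element of any numpy-array custom_data value.
    rows = []
    for i in range(n_detections):
        row = {"detection_index": i}
        for key, value in custom_data.items():
            if isinstance(value, np.ndarray) and value.ndim == 0:
                row[key] = value.item()
            elif isinstance(value, np.ndarray):
                row[key] = value[i].item()
            else:
                row[key] = value
        rows.append(row)
    return rows

rows = build_rows(2, {"area": np.array([100.0, 200.0]), "frame": 7})
assert rows[0]["area"] == 100.0 and rows[1]["area"] == 200.0  # per-row slicing
json.dumps(rows)  # output stays JSON-serializable
```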
```python
elif isinstance(value, np.ndarray):
    row[key] = value[i]
else:
    row[key] = value
```
`custom_data` slicing currently only handles `np.ndarray`. If a caller passes a per-detection Python sequence (e.g., list/tuple) it will still be written as the full sequence on every row. Consider mirroring the `detections.data` logic here (slice values that are indexable and match detection length) or explicitly documenting that only numpy arrays are supported for per-row custom values.
Suggested change:

```diff
-row[key] = value
+row[key] = value[i] if hasattr(value, "__getitem__") else value
```
```python
for key, value in custom_data.items():
    if isinstance(value, np.ndarray) and value.ndim == 0:
        row[key] = value
    elif isinstance(value, np.ndarray):
        row[key] = value[i]
    else:
```
Indexing `custom_data` numpy arrays with `value[i]` can raise `IndexError` when the provided array length doesn't match the number of detections (including a 1-element array intended to broadcast). Consider validating lengths and either broadcasting or raising a clear `ValueError` describing the expected shape to make failures easier to debug.
Before submitting
Description

Fixes a bug in `CSVSink` and `JSONSink` where passing a numpy array as a `custom_data` value wrote the entire array on every row instead of the per-detection scalar value.
Type of Change
Motivation and Context
When users pass computed per-detection values like `detections.area` via `custom_data`, each row should receive its own scalar, not the whole array. The root cause was `row.update(custom_data)` inside the per-detection loop, which blindly wrote the whole value. The fix applies the same per-index slicing logic that `detections.data` already uses correctly.

Closes #1397
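To illustrate the before/after behavior described above (the array values are illustrative, not taken from the PR):

```python
import numpy as np

areas = np.array([1200.0, 850.0, 430.0])  # e.g. detections.area

# Before the fix: row.update(custom_data) inside the per-detection loop
# copied the whole array into every row.
buggy_row = {"class_name": "person"}
buggy_row.update({"area": areas})
# buggy_row["area"] is the full 3-element array on every row

# After the fix: each row receives the element for its detection index i.
i = 1
fixed_row = {"class_name": "person", "area": areas[i]}
# fixed_row["area"] == 850.0
```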
Changes Made

- `src/supervision/detection/tools/csv_sink.py`: slice numpy array values in `custom_data` per detection row
- `src/supervision/detection/tools/json_sink.py`: same fix
- `tests/detection/test_csv.py`: added test case for numpy array in `custom_data`

Testing
Google Colab (optional)
Colab link:
Screenshots/Videos (optional)
Additional Notes
The fix is backward compatible: scalar values in `custom_data` (e.g. `{"frame_number": 42}`) continue to work as before, written as-is on every row.