fix: add area column to CSVSink BASE_HEADER#2205
fix: add area column to CSVSink BASE_HEADER#2205Zeesejo wants to merge 1 commit intoroboflow:developfrom
Conversation
CSVSink iterated over detections using __iter__, which yields (xyxy, mask, confidence, class_id, tracker_id, data) tuples. The .area property is not part of that tuple, so it was never written to the CSV — accessing it via custom_data raised an AttributeError (issue roboflow#1397). Fix: - Add 'area' to BASE_HEADER so it appears as a standard column. - Compute detections.area once in parse_detection_data() and include area[i] for each row. Fixes roboflow#1397
There was a problem hiding this comment.
Pull request overview
This PR fixes CSVSink serialization to include each detection’s computed area in the emitted CSV, addressing the gap reported in issue #1397.
Changes:
- Add
"area"toCSVSink’sBASE_HEADERso it’s always written. - Compute
detections.areaonce and include the per-detection value in each serialized row.
| "confidence", | ||
| "tracker_id", | ||
| "area", | ||
| ] |
There was a problem hiding this comment.
Adding area to BASE_HEADER can produce duplicate area columns when callers still pass custom_data={'area': ...} (or if detections.data contains an area key). parse_field_names() currently appends all dynamic keys without excluding base fields, so the header can contain duplicate names and the resulting CSV becomes ambiguous. Consider filtering dynamic_header to remove any keys already in BASE_HEADER (and/or treating base-field overrides as an error).
| ] | |
| ] | |
| BASE_HEADER_SET = set(BASE_HEADER) |
| BASE_HEADER = [ | ||
| "x_min", | ||
| "y_min", | ||
| "x_max", | ||
| "y_max", | ||
| "class_id", | ||
| "confidence", | ||
| "tracker_id", | ||
| "area", | ||
| ] |
There was a problem hiding this comment.
This change updates the CSV schema (new area column) and will require updating existing CSVSink tests/fixtures that assert the header and row layouts (e.g., tests/detection/test_csv.py expected headers/rows). Please add/adjust test expectations to include area in the correct position and with the computed values so CI stays green.
This comment was marked as outdated.
This comment was marked as outdated.
|
@Zeesejo, it seems I cannot push to your branch to finish, so could you pls enable it or perform all the requested suggestions? |
Problem
CSVSinkwrites bounding-box coordinates,confidence,class_id, andtracker_idto CSV but silently omits theareaof each detection, even thoughDetections.areais a standard computed property. Users trying to log area either get an empty column or anAttributeErrorwhen passing it viacustom_data.Reported in #1397.
Root Cause
BASE_HEADER(the static list of always-written columns) never included"area":parse_detection_data()built each row from thexyxyarray + optional fields, but never calleddetections.area.Fix
Two small, self-contained changes to
csv_sink.py:1.
BASE_HEADER— add"area"2.
parse_detection_data()— compute and include area per rowdetections.areais computed once before the loop (not per-iteration) to avoid redundant work on large batches.Verification
Fixes #1397