
Add autotune — autonomous MOT tracker optimization loop [rebase&merge]#346

Draft
Borda wants to merge 53 commits into develop from bemch/auto-research


@Borda Borda commented Apr 3, 2026

Summary

MOT tracker quality depends on two largely independent axes: algorithm design and hyperparameter tuning. Most published improvements conflate them — a well-tuned weaker algorithm routinely beats a poorly-tuned stronger one, making it hard to isolate what actually matters. This PR separates the axes by adding autotrack/, an autonomous optimization loop for SORT, ByteTrack, and OC-SORT on MOT17.

The goal is both practical (better trackers, reproducible tuning) and scientific (the experiment log — including every reverted change — is itself a research artifact).

Approach

Three progressive layers build on each other:

Layer 1 — SOTA trackers with solid defaults. The existing trackers/core/ implementations of SORT, ByteTrack, and OC-SORT are already competitive out of the box. This layer is the foundation; autotrack/ does not replace it.

Layer 2 — Optuna extracts the best from the existing parameter surface. optimize_tracking.py runs an Optuna study over the tracker's exposed hyperparameters (Kalman noise scales, confidence thresholds, buffer sizes). No code changes, pure tuning. In the benchmark tables below, FRCNN results gain 1.4–2.5 HOTA points and SDP results gain 2.2–4.4 points. This layer is useful on its own as a standalone tuning tool and can be adopted without running the agent loop.

Layer 3 — autotrack goes beyond tuning by making algorithmic improvements. This is the novel contribution. An autonomous agent iterates over structural code changes (state representation, association strategy, camera motion compensation, Kalman mechanics), measures HOTA at fixed default parameters after each change, keeps improvements, and reverts regressions. Optuna acts as a second-pass validator after each kept change to confirm the improvement is real and not a tuning artifact. The iteration log is JSONL and captures every attempt, kept or reverted.

Human defines:  research question · metric · hard boundaries
Agent decides:  what to change · what to try next
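
The keep/revert bookkeeping described in Layer 3 can be sketched roughly as below. This is a minimal illustration, not the code in autotrack/: the function name `record_iteration`, the log field names, and the `min_gain` margin are all hypothetical, and the actual revert of the working tree (for example via git) is left out.

```python
import json
import tempfile
from pathlib import Path


def record_iteration(log_path, iteration, description, hota, best_hota, min_gain=0.0):
    """Append one attempt to a JSONL log and return the updated best HOTA.

    A change counts as kept only if it beats the current best by more than
    min_gain; otherwise it is logged as reverted. Field names are assumptions.
    """
    kept = hota > best_hota + min_gain
    entry = {
        "iteration": iteration,
        "description": description,
        "hota": hota,
        "best_before": best_hota,
        "status": "kept" if kept else "reverted",
    }
    # One JSON object per line, so reverted attempts stay in the record too.
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return hota if kept else best_hota


# Demo: the first pair of values is quoted in the commit log; the second
# attempt is a made-up regression used only to show the reverted path.
log_path = Path(tempfile.mkdtemp()) / "iterations.jsonl"
best = 56.781
best = record_iteration(log_path, 1, "promote tuned defaults", 57.424, best)
best = record_iteration(log_path, 2, "hypothetical regression", 57.100, best)
entries = [json.loads(line) for line in log_path.read_text(encoding="utf-8").splitlines()]
print(best, [e["status"] for e in entries])
```

Appending rather than rewriting the file keeps every attempt, which matches the stated goal of treating the log itself as a research artifact.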

Two tools govern the loop:

| Tool | Role |
| --- | --- |
| `optimize_tracking.py --n-trials 1` | Campaign metric: default params, clean code-change signal |
| `optimize_tracking.py --n-trials N` | Optuna study: warm-starts from best_config.json, validates tuned ceiling |

The agent is explicitly permitted to update optimize_tracking.py as the tracker architecture evolves — adding parameters that newly exist, removing ones absorbed into the implementation, tightening search ranges as knowledge accumulates.

Benchmarks

MOT17-val, full 7-sequence eval. Defaults = fixed params from default_config.json, no tuning. +Optuna = n=500 trials. +autotrack + Optuna = in progress.

FRCNN public detections (bundled, no GPU)

| Config | ByteTrack | OC-SORT | SORT |
| --- | --- | --- | --- |
| Defaults (HOTA) | 50.36 | 49.69 | 49.95 |
| + Optuna (HOTA) | 51.76 | 52.22 | 51.49 |
| + autotrack + Optuna (HOTA) | (pending) | (pending) | (pending) |

SDP public detections (bundled, no GPU)

| Config | ByteTrack | OC-SORT | SORT |
| --- | --- | --- | --- |
| Defaults (HOTA) | 53.94 | 53.35 | 53.22 |
| + Optuna (HOTA) | 56.12 | 57.75 | 56.08 |
| + autotrack + Optuna (HOTA) | (pending) | (pending) | (pending) |

Estimated ceiling with code improvements + Optuna on FRCNN: ~61.9 HOTA (vs ~56.0 for tuning alone), derived from the DetA/AssA decomposition — DetA is bounded by the detector (~0.57–0.62 for FRCNN), but AssA has substantial headroom from ~0.55 to ~0.65 via better association logic.
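
The arithmetic behind these two figures can be checked with a short sketch. It assumes the common approximation HOTA ≈ sqrt(DetA · AssA); strictly, HOTA averages that geometric mean over localization thresholds, so this is only an estimate. The DetA values picked below (0.57 and 0.59) are assumptions chosen from inside the stated range because they reproduce the quoted numbers; the PR does not say which values were used.

```python
import math


def hota_estimate(det_a, ass_a):
    """Approximate HOTA on a 0-100 scale as the geometric mean of DetA and AssA."""
    return 100.0 * math.sqrt(det_a * ass_a)


# Tuning alone: lower end of the DetA range, current AssA (~0.55).
tuning_only = hota_estimate(0.57, 0.55)

# Code improvements + tuning: assumed DetA of 0.59, AssA raised to ~0.65.
with_code_changes = hota_estimate(0.59, 0.65)

print(round(tuning_only, 1), round(with_code_changes, 1))  # 56.0 61.9
```

Because DetA is capped by the detector, almost all of the gap between the two estimates comes from the AssA term.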

Hard guarantees

Three invariants are enforced by program.md and cannot be relaxed by the agent:

  • No GT leakage. The tracker sees only det/det.txt. gt/gt.txt is never accessed at inference time.
  • Reproducible detections. FRCNN and SDP detections are bundled with the MOT17 benchmark. Generated detections (RF-DETR, YOLO World X) are written to content-addressed sibling directories before any agent run — they are frozen inputs, not live inference.
  • Metrics via trackers.eval only. trackers/eval/ is out of scope for agent edits. The metric computation is identical across all iterations; the agent cannot move the goalposts.

Quick start

# 1. Install the optimize dependency group
uv sync --group optimize

# 2. Download MOT17-val (bundled detections, no GPU needed)
trackers download mot17 --split val --asset annotations,detections

# 3. Baseline: measure defaults
cd autotrack
uv run python optimize_tracking.py bytetrack frcnn --n-trials 1   # ~50 HOTA

# 4. Tune: Optuna over the parameter surface
uv run python optimize_tracking.py bytetrack frcnn --n-trials 500  # ~52 HOTA

To run the autonomous agent loop, point any coding agent at program.md:

claude
> Read program.md and start the experiment loop.

References

  • Bewley et al., SORT, ICIP 2016
  • Zhang et al., ByteTrack, ECCV 2022
  • Cao et al., OC-SORT, CVPR 2023
  • Luiten et al., HOTA, IJCV 2021
  • Akiba et al., Optuna, KDD 2019

Borda and others added 13 commits April 2, 2026 13:51
- experiments/program.md: autoresearch contract — research question, HOTA≥60 target, hard boundaries, 7 research starting points (Kalman P/R init, two-threshold association, velocity attenuation, etc.)
- experiments/optimize_tracking.py: Optuna-based metric runner; n_trials=1 evaluates defaults; multi-core via multiprocessing+SQLite; agent updates search space as architecture evolves
- experiments/README.md: motivation, approach, target analysis (HOTA ceiling derivation), pre-flight checks, references
- pyproject.toml: add `optimize` dependency group (optuna[rdb], fire)

---
Co-authored-by: Claude Code <noreply@anthropic.com>
 autotrack/optimize_tracking.py
  - --det-tag TAG CLI arg: overrides the directory suffix for any custom detector without touching _DET_SOURCE_TO_TAG; _validate_args and
  _resolve_sequences both accept it
  - Multiprocessing progress bar: replaced pool.starmap with starmap_async + a polling loop that loads the SQLite study every 2 s and feeds a
  Rich Progress bar showing completed trials and live best HOTA (mirrors the existing single-worker callback approach)
  - Module docstring updated with --det-tag usage example

  autotrack/README.md
  - Fixed cd experiments → cd autotrack; old --tracker sort --fast → positional syntax
  - YOLO section replaced with YOLOX section (correct weights filename)
  - RF-DETR section added as a standalone step
  - New Custom detections section: dir layout, MOT format, --det-tag usage
  - Pre-flight checks table updated (removed API key row, fixed commands)
  - Fixed /optimize campaign experiments/ → autotrack/
  - Fixed broken Files table row for optimize_tracking.py

  autotrack/program.md
  - generate_detections.py added to scope_files
  - Weights filename corrected (yolox_x.pth → bytetrack_x_mot17.pth.tar)
  - RF-DETR and custom detector quickstart notes added below pre-flight table

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- generate_detections.py: remove YOLOX backend (loader, predictor, frame processing); add YOLO-World via inference-models with center→top-left coord conversion; rename rfdetr-l → rfdetr/l to match yolo_world/l slash notation
- optimize_tracking.py: swap yolox→yoloworld in _DET_SOURCE_TO_TAG; extract _run_parallel_study; fix multiline ternaries to if/else; use setattr() for dynamic Kalman attrs (mypy); pass >3 args as kwargs
- best_config.json: drop broken yolox entry (HOTA=7.7); add real Optuna results for yoloworld, rfdetr, dpm across all three trackers
- pyproject.toml: remove YOLOX git source + no-build-isolation; add inference-models>=0.19.0

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- search_space.json: expand 16 boundary-hugging parameters across all three trackers (lost_track_buffer, track_activation_threshold, minimum_iou_threshold, high_conf_det_threshold, q_scale/r_scale/p_scale, velocity_decay, q_miss_alpha, max_interpolation_gap, p_reset_threshold, direction_consistency_weight); add log=true to lost_track_buffer (all trackers) and minimum_iou_threshold (all trackers)
- optimize_tracking.py: pass log= to suggest_int so log-scale int parameters are respected
- best_config.json: bytetrack/rfdetr updated to HOTA 45.08 from new run
- uv.lock: regenerated after yolox removal

---
Co-authored-by: Claude Code <noreply@anthropic.com>
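
A rough sketch of how a search_space.json entry could be forwarded to Optuna with the log flag respected. The entry schema (`type`, `low`, `high`, `log`) and the helper name `suggest_from_spec` are assumptions; only `suggest_int` / `suggest_float` with a `log=` keyword are real Optuna `Trial` methods. A recording stand-in replaces the real trial so the example runs without Optuna installed, and the bounds for lost_track_buffer are illustrative.

```python
class RecordingTrial:
    """Stand-in for optuna.trial.Trial: records calls and returns the lower bound."""

    def __init__(self):
        self.calls = []

    def suggest_int(self, name, low, high, *, step=1, log=False):
        self.calls.append(("int", name, low, high, log))
        return low

    def suggest_float(self, name, low, high, *, step=None, log=False):
        self.calls.append(("float", name, low, high, log))
        return low


def suggest_from_spec(trial, name, spec):
    """Forward one search-space entry to the trial, passing log= for both types."""
    log = bool(spec.get("log", False))
    if spec["type"] == "int":
        return trial.suggest_int(name, spec["low"], spec["high"], log=log)
    return trial.suggest_float(name, spec["low"], spec["high"], log=log)


# Hypothetical entries in the assumed schema.
search_space = {
    "lost_track_buffer": {"type": "int", "low": 10, "high": 120, "log": True},
    "velocity_decay": {"type": "float", "low": 0.5, "high": 1.0},
}
trial = RecordingTrial()
params = {name: suggest_from_spec(trial, name, spec) for name, spec in search_space.items()}
print(params, trial.calls)
```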
…mation (ORU)

- Add oru_enabled parameter to ByteTrackKalmanBoxTracker: on re-detection after occlusion, replay virtual predict+update cycles along linearly interpolated trajectory to re-estimate velocity
- Expose oru_enabled in optimize_tracking.py _build_tracker and _define_search_space
- Add oru_enabled to default_config.json and search_space.json

---
Co-authored-by: Claude Code <noreply@anthropic.com>
…0.05)

- Add stage2_iou_threshold=0.05 param to ByteTrackTracker; stage-1 keeps minimum_iou_threshold=0.1
- Lower stage-2 threshold recovers more low-confidence detections without breaking high-conf stage
- Expose to Optuna via search_space.json; add to default_config.json and optimize_tracking.py

---
Co-authored-by: OpenAI Codex <codex@openai.com>
…larity

- Add iou_age_weight=0.03: scale stage-1 IoU similarity by 1/(1+w*lost_frames) for each track
- Biases Hungarian assignment toward recently-seen tracks; reduces stale-prediction false matches
- iou_age_weight=0.03 is active at default params; Optuna range [0.0, 0.2] log-scale

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- Apply age discount only to cost matrix (not threshold check): raw IoU used for min-threshold gate, discount only biases solver assignment toward active tracks
- Tighten Optuna search range [0.0, 0.2] -> [0.0, 0.1]
- Fix pre-existing bug: optimize_tracking.py final re-eval now applies _apply_kalman_patch

---
Co-authored-by: Claude Code <noreply@anthropic.com>
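
The split between the gate and the solver matrix in the commit above can be sketched as follows. Plain nested lists stand in for the real arrays, the function names are hypothetical, and the `>=` comparison in the gate is an assumption about the boundary case.

```python
def build_solver_iou(iou, lost_frames, iou_age_weight):
    """Discount each track's row by 1 / (1 + w * lost_frames) for the solver only."""
    return [
        [value / (1.0 + iou_age_weight * lost_frames[track]) for value in row]
        for track, row in enumerate(iou)
    ]


def passes_gate(iou, track, det, minimum_iou_threshold):
    """Gate on the raw IoU, so the age discount never rejects a valid match."""
    return iou[track][det] >= minimum_iou_threshold


# Two tracks compete for one detection. Track 1 has slightly higher raw IoU
# but has been lost for 10 frames; values are made up for illustration.
raw_iou = [[0.50], [0.52]]
lost_frames = [0, 10]
solver_iou = build_solver_iou(raw_iou, lost_frames, iou_age_weight=0.03)

print(solver_iou)                       # track 1 drops to about 0.40
print(passes_gate(raw_iou, 1, 0, 0.1))  # the stale track still passes the raw gate
```

With the discount applied to the ranking only, the active track wins the assignment, while the stale track remains eligible whenever it is the only candidate.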
Apply Optuna-found parameter values as new defaults: lost_track_buffer 30→62,
track_activation_threshold 0.7→0.314, q_scale 0.01→0.00246, r_scale 0.1→0.292,
p_scale 1.0→7.34, velocity_decay 0.95→0.817, q_miss_alpha 0.1→0.461,
max_interpolation_gap 20→30, p_reset_threshold 5→13; HOTA 56.781→57.424 (+1.13%)

---
Co-authored-by: Claude Code <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 3, 2026 07:14
@Borda Borda marked this pull request as draft April 3, 2026 07:14
@Borda Borda added the enhancement New feature or request label Apr 3, 2026

Copilot AI left a comment


Pull request overview

This PR introduces the new autotrack/ workflow for autonomous + Optuna-based optimization of MOT17 trackers, and updates core tracker internals to support additional post-processing and association/Kalman behaviors that the optimization loop can tune and validate.

Changes:

  • Added autotrack/ tooling: Optuna runner (optimize_tracking.py), detection generation (generate_detections.py), visualization utilities, and configuration/artifact files (default_config.json, search_space.json, best_config.json, program.md).
  • Extended ByteTrack and SORT utilities with new association / Kalman mechanics and MOT-gap interpolation.
  • Added an optimize dependency group and adjusted repo formatting/ignore configs to support the new workflow.

Reviewed changes

Copilot reviewed 17 out of 19 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| trackers/core/sort/utils.py | Adds MOT-format short-gap interpolation helper used by autotrack evaluation output. |
| trackers/core/bytetrack/tracker.py | Adds stage-2 IoU threshold and IoU age discount for stage-1 ranking; updates association gating logic. |
| trackers/core/bytetrack/kalman.py | Adds velocity decay, miss-noise inflation, P-reset, and ORU mechanics to ByteTrack Kalman tracker. |
| README.md | Badge formatting change (single-line). |
| pyproject.toml | Adds optimize dependency group and uv git source for onnx-simplifier. |
| docs/trackers/ocsort.md | Reflowed paragraph formatting. |
| docs/trackers/comparison.md | Reflowed admonition formatting. |
| CODE_OF_CONDUCT.md | Reflowed paragraph formatting. |
| autotrack/visualize_detections.py | New utility to render MOT detections on frames. |
| autotrack/search_space.json | New Optuna parameter search space definitions per tracker. |
| autotrack/README.md | New documentation for the autotrack workflow and benchmarks. |
| autotrack/program.md | New campaign contract/spec for the autonomous optimization loop. |
| autotrack/optimize_tracking.py | New Optuna study runner + evaluation harness using trackers.eval. |
| autotrack/generate_detections.py | New script to generate MOT17 detections via RF-DETR / YOLO-World backends. |
| autotrack/default_config.json | New baseline/default parameter set for --n-trials 1 runs. |
| autotrack/best_config.json | New committed “best known” tuned configs used for warm-starting/guarding. |
| .pre-commit-config.yaml | mdformat configured with --wrap=no (drives markdown reflow behavior). |
| .gitignore | Adjusts ignores (including .python-version) and adds autotrack output/cache patterns. |


@Borda Borda force-pushed the bemch/auto-research branch from 07f7488 to a62024a Compare April 3, 2026 08:36
Borda and others added 5 commits April 3, 2026 12:54
…ecovery

Short occlusions (1-4 frames) are handled well by velocity decay alone; ORU
trajectory replay is beneficial only for longer gaps where velocity has drifted.
HOTA 57.424→57.813 (+0.686%), IDF1 69.573→70.009

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- bytetrack/sdp Optuna result: 58.753 (was 56.115 before i10-i11)
- New optimal params include oru_threshold=14, q_scale/r_scale/p_scale all ~10x lower

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- q_scale 0.00246→0.000202, r_scale 0.292→0.0441, p_scale 7.34→0.731 (tighter Kalman — trust measurements more)
- oru_threshold 5→14, velocity_decay 0.817→0.774, q_miss_alpha 0.461→0.282
- stage2_iou_threshold 0.05→0.233, lost_track_buffer 62→52, p_reset_threshold 13→26
- HOTA 57.813→58.753 (+1.30%)

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- Confidence boost in Hungarian cost: solver_iou *= (1 + w * conf[det])
- Neutral at all tested defaults (0.0–0.5); added to Optuna search space [0.0, 1.0]
- IDSW improved 297→293 at w=0.3 but HOTA regressed; w=0.1 exactly neutral

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- Mature-track-only stage-2: only tracks with >= N updates participate in low-conf recovery
- Neutral at N=0,1; regresses at N>=2 — ghost exclusion hurts legitimate young tracks
- Added to Optuna search space [0, 5] for future joint optimisation

---
Co-authored-by: Claude Code <noreply@anthropic.com>
@Borda Borda force-pushed the bemch/auto-research branch from 699e62f to 1bc1138 Compare April 3, 2026 21:50
Borda and others added 5 commits April 4, 2026 00:22
…disabled)

- Add _giou_matrix() helper and giou_blend param to ByteTrackTracker stage-1 cost
- giou_blend=0.0 default keeps metric at 58.753 (best found 0.32 gave +0.092%, below 0.1% threshold)
- Add giou_blend to search_space.json [0.0, 1.0] and optimize_tracking.py wiring
- Fix best_config.json trailing newline

---
Co-authored-by: Claude Code <noreply@anthropic.com>
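
For reference, a single-pair sketch of GIoU and one plausible way to blend it with IoU. GIoU itself is the standard definition (IoU minus the fraction of the enclosing box not covered by the union). The linear blend `(1 - giou_blend) * IoU + giou_blend * GIoU` is an assumption; the commit only states that giou_blend=0.0 leaves the metric unchanged, which this form satisfies. The function names are hypothetical and this is not the `_giou_matrix()` helper itself.

```python
def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two boxes in (x1, y1, x2, y2) format."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # Smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    giou = iou - (hull - union) / hull if hull > 0 else iou
    return iou, giou


def blended_similarity(a, b, giou_blend):
    """Assumed linear blend; giou_blend=0.0 reduces to plain IoU."""
    iou, giou = iou_and_giou(a, b)
    return (1.0 - giou_blend) * iou + giou_blend * giou


same = iou_and_giou((0, 0, 1, 1), (0, 0, 1, 1))
apart = iou_and_giou((0, 0, 1, 1), (2, 0, 3, 1))
print(same, apart, blended_similarity((0, 0, 1, 1), (2, 0, 3, 1), 0.0))
```

Unlike IoU, GIoU keeps a gradient of "how far apart" for non-overlapping boxes, which is why blending it in can change the ranking of near misses.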
…earch)

- 1000-trial Optuna search over expanded search space (new: conf_cost_weight, stage2_min_updates, giou_blend)
- HOTA 58.753→58.862 (+0.185%), IDSW 297→269 (-9.4%)
- Key changes: high_conf_det_threshold 0.608→0.795, oru_threshold 14→0, Kalman looser (q_scale/r_scale ~14x), minimum_consecutive_frames 2→1, stage2_min_updates 5, giou_blend 0.396, conf_cost_weight 0.170

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- HOTA 58.862→58.961 (+0.168%), IDSW 269→266, IDF1 71.365→71.730
- Optuna search was capped at stage2_min_updates≤5; manual scan found peak at 12 (cliff at 14+)
- Widen search_space.json high: 5→15 so future guard runs can explore the full range

---
Co-authored-by: Claude Code <noreply@anthropic.com>
---
Co-authored-by: Claude Code <noreply@anthropic.com>
- HOTA 58.961→59.031 (+0.119%), IDSW 266→262, IDF1 71.730→71.852
- max_interpolation_gap 45→48 (Optuna undershoot, true peak at 48)
- giou_blend 0.3963→0.42 (refined from 0.396 Optuna result)
- velocity_decay 0.827→0.82 (slight tightening of decay)

---
Co-authored-by: Claude Code <noreply@anthropic.com>
@Borda Borda force-pushed the bemch/auto-research branch from 15fce38 to cf03dd4 Compare April 4, 2026 22:35
Borda and others added 2 commits April 5, 2026 01:16
- Runs tracker over all 21 MOT17 test sequences (7 IDs × 3 public detectors)
- Produces a flat ZIP ready for CodaBench upload (21 files, MOT 10-col format)
- Default: --det-source all (DPM/FRCNN/SDP run separately); single source replicates to all 3 DET slots
- Uses tracker constructor defaults; dataset path resolved relative to script
- Gitignore: ignore autotrack/*.zip to keep submission ZIPs out of the repo

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- Rename directory via git mv
- Update all internal references (cd paths, study name, README tables, CLI hints)

---
Co-authored-by: Claude Code <noreply@anthropic.com>
@Borda Borda force-pushed the bemch/auto-research branch from 34d440c to 3653d30 Compare April 5, 2026 12:20
Borda and others added 24 commits April 6, 2026 09:17
… & guard script to `guard.py`

- Replace hardcoded bytetrack/sdp in metric_cmd, notes, and optuna commands with {algo} and {det_source} template tokens resolved at run time by the /optimize skill
- Config defaults: algo=bytetrack, det_source=sdp (no behaviour change for existing runs)
- Remove fixed target (varies per tracker); campaign now runs to max_iterations
- Consolidate Phase 1 findings as ByteTrack-specific history; flag which improvements are not yet in SORT/OC-SORT
- Drop hardcoded class names and per-tracker baselines from hard boundaries; add tuned HOTA reference table
- Replaced inline guard logic in `program.md` with a dedicated `guard.py` script.
- Ensures modularity and improves maintainability for HOTA regression checks.
- Provides isolated execution environment for `/optimize` runs
- Includes dependencies for OpenCV, git, and build tools
- Configures Python environment with prebuilt venv for faster runtime setup

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- Add velocity_decay to SORTKalmanBoxTracker: shrinks velocity components
  each missed frame (default 0.82), prevents runaway linear extrapolation
- Add q_miss_alpha: inflates Q proportionally to time_since_update for lost
  tracks (default 0.8), widens uncertainty so re-detection gets higher gain
- Add p_reset_threshold: resets P to identity after gaps >= threshold frames
  (default 10), discards stale accumulated uncertainty on re-detection
- Wire all three params through SORTTracker.__init__ and _spawn_new_trackers
- Add to search_space.json and default_config.json for sort
- HOTA: 53.217 → 53.738 (+0.521, +0.98%) at default params on sdp

---
Co-authored-by: Claude Code <noreply@anthropic.com>
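
The three lost-track mechanics in the commit above can be sketched in isolation. The exact inflation formula is not given in the commit; `Q * (1 + q_miss_alpha * time_since_update)` is an assumed reading of "inflates Q proportionally to time_since_update". Treating a threshold of 0 as "disabled" is also an assumption, and the function names are hypothetical. Scalars and lists stand in for the real Kalman state and matrices.

```python
def apply_miss_dynamics(velocity, q_base, time_since_update, velocity_decay, q_miss_alpha):
    """Return (decayed velocity, inflated process-noise scale) for one missed frame."""
    decayed = [v * velocity_decay for v in velocity]
    # Assumed form: noise grows linearly with the number of frames without an update.
    q_scaled = q_base * (1.0 + q_miss_alpha * time_since_update)
    return decayed, q_scaled


def should_reset_covariance(gap_frames, p_reset_threshold):
    """Reset P on re-detection once the gap reaches the threshold (0 = disabled, assumed)."""
    return p_reset_threshold > 0 and gap_frames >= p_reset_threshold


# Defaults quoted in the commit: velocity_decay=0.82, q_miss_alpha=0.8, p_reset_threshold=10.
velocity, q = [10.0, -4.0], 0.01
for missed in range(1, 4):
    velocity, q_now = apply_miss_dynamics(velocity, q, missed, 0.82, 0.8)

print(velocity, q_now)
print(should_reset_covariance(12, 10), should_reset_covariance(4, 10))
```

After three missed frames the velocity has shrunk to roughly 55% of its original value, which is what limits runaway linear extrapolation.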
…ion for SORT

- On re-detection after >= oru_threshold missed frames, override Kalman velocity
  with virtual trajectory: (current_bbox - last_observed_bbox) / gap_frames
- Store _last_observed_bbox at each update for ORU computation
- Add oru_threshold param (default 3) — technique from OC-SORT paper
- Register in search_space.json [0, 15] and default_config.json

---
Co-authored-by: Claude Code <noreply@anthropic.com>
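
The velocity override in the commit above amounts to the following, shown here per box coordinate. The function name is hypothetical, returning `None` to mean "keep the Kalman velocity" is an illustrative convention, and treating `oru_threshold <= 0` as disabled is an assumption based on the [0, 15] search range.

```python
def oru_velocity(last_observed_bbox, current_bbox, gap_frames, oru_threshold):
    """Virtual-trajectory velocity: (current_bbox - last_observed_bbox) / gap_frames.

    Returns None when the gap is shorter than the threshold, in which case the
    caller would keep the velocity already held in the Kalman state.
    """
    if oru_threshold <= 0 or gap_frames < oru_threshold:
        return None
    return [
        (current - last) / gap_frames
        for last, current in zip(last_observed_bbox, current_bbox)
    ]


# A box that moved 30 px to the right over a 6-frame occlusion.
long_gap = oru_velocity((100, 50, 140, 130), (130, 50, 170, 130), gap_frames=6, oru_threshold=3)
short_gap = oru_velocity((100, 50, 140, 130), (104, 50, 144, 130), gap_frames=2, oru_threshold=3)
print(long_gap, short_gap)  # [5.0, 0.0, 5.0, 0.0] None
```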
…lation for SORT

- Add conf_cost_weight param to SORTTracker: boosts Hungarian solver matrix
  by detection confidence (tiebreaker only — IoU gate uses raw IoU), default 0.2
- Add conf_cost_weight to sort search_space.json [0.0, 1.0] and default_config.json
- Set max_interpolation_gap default from 0 → 30 in default_config.json (activates
  existing interpolate_mot_gaps post-processing already wired in optimize_tracking.py)
- HOTA: 53.217 → 54.506 (+1.289, +2.4%) at default params on sdp

---
Co-authored-by: Claude Code <noreply@anthropic.com>
…nment to SORT

- Scale DIoU cost matrix by detection confidence (1 + conf_cost_weight * conf) to break ties in favor of higher-confidence detections
- Threshold gate uses raw DIoU so the boost only affects ranking, not filtering
- Add conf_cost_weight param (default 0.0 = disabled) to SORTTracker constructor
- Update search_space.json, default_config.json, optimize_tracking.py

Co-authored-by: Claude Code <noreply@anthropic.com>
- track_activation_threshold: 0.25 → 0.9725 (strict initiation, fewer FP tracks)
- minimum_consecutive_frames: 2 → 1 (immediate confirmation, safe with high threshold)
- max_interpolation_gap: 30 → 57 (more gap bridging, fewer ID switches)
- minimum_iou_threshold: 0.3 → 0.275 (closer to optimal)
- lost_track_buffer: 30 → 26 (tuned optimal)
HOTA: 54.959 → 56.136 (+1.177, +2.14%)

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- Discount DIoU similarity for lost tracks by 1/(1 + iou_age_weight * lost_frames)
- Biases solver to prefer active tracks over stale Kalman predictions, reducing ID switches
- Threshold gate uses raw DIoU so valid matches are never rejected by the discount
- Add iou_age_weight param (default 0.0 = disabled) to SORTTracker constructor
- Update search_space.json, default_config.json, optimize_tracking.py

Co-authored-by: Claude Code <noreply@anthropic.com>
…ion to SORT

- Split detections by confidence threshold into high (stage 1) and low (stage 2) groups
- Stage 1 matches high-confidence dets to all tracks using DIoU + age discount + conf boost
- Stage 2 matches low-confidence dets to unmatched tracks using a lower IoU threshold
- New tracks only spawned from unmatched high-confidence detections
- Refactor _get_associated_indices into static _match method for reuse across stages
- Extract _build_solver_iou helper for age discount + confidence boost application
- Add high_conf_det_threshold (default 0.0 = disabled) and stage2_iou_threshold params
- Update search_space.json, default_config.json, optimize_tracking.py

Co-authored-by: Claude Code <noreply@anthropic.com>
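
A compact sketch of the two-stage flow described in the commit above. To stay dependency-free it uses greedy matching in place of the Hungarian solver the tracker uses, and it skips the age discount and confidence boost; all function names are hypothetical.

```python
def greedy_match(similarity, track_ids, det_ids, threshold):
    """Pair tracks and detections by descending similarity (stand-in for Hungarian)."""
    pairs = sorted(
        ((similarity[t][d], t, d) for t in track_ids for d in det_ids),
        reverse=True,
    )
    matches, used_tracks, used_dets = [], set(), set()
    for score, t, d in pairs:
        if score < threshold:
            break  # everything after this is below the gate
        if t in used_tracks or d in used_dets:
            continue
        matches.append((t, d))
        used_tracks.add(t)
        used_dets.add(d)
    return matches


def two_stage_associate(similarity, confidences, high_conf_det_threshold,
                        minimum_iou_threshold, stage2_iou_threshold):
    """Stage 1: high-conf dets vs all tracks. Stage 2: low-conf dets vs leftovers."""
    tracks = list(range(len(similarity)))
    high = [d for d, c in enumerate(confidences) if c >= high_conf_det_threshold]
    low = [d for d, c in enumerate(confidences) if c < high_conf_det_threshold]

    stage1 = greedy_match(similarity, tracks, high, minimum_iou_threshold)
    matched_tracks = {t for t, _ in stage1}
    leftovers = [t for t in tracks if t not in matched_tracks]
    stage2 = greedy_match(similarity, leftovers, low, stage2_iou_threshold)

    # Only unmatched high-confidence detections may spawn new tracks.
    matched_high = {d for _, d in stage1}
    new_track_dets = [d for d in high if d not in matched_high]
    return stage1, stage2, new_track_dets


# 2 tracks x 3 detections; detection 1 is low-confidence. Values are made up.
similarity = [[0.8, 0.1, 0.0], [0.0, 0.2, 0.0]]
confidences = [0.9, 0.3, 0.95]
result = two_stage_associate(similarity, confidences, 0.6, 0.3, 0.15)
print(result)  # ([(0, 0)], [(1, 1)], [2])
```

Track 1 would be lost under a single-stage scheme at threshold 0.3; the lower stage-2 threshold recovers it from the low-confidence detection.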
- DIoU/two-stage/Kalman code changes make old 56.129 threshold stale
- Measured new default-param HOTA=55.656 with consolidated branch code

---
Co-authored-by: Claude Code <noreply@anthropic.com>
DIoU returns ≤ 0 (not 0.0) for non-overlapping boxes due to the
centre-distance penalty term; old assertion assumed pure IoU semantics.

---
Co-authored-by: Claude Code <noreply@anthropic.com>
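
A single-pair sketch of why the old assertion no longer holds. It uses the standard DIoU definition (IoU minus squared centre distance over squared enclosing-box diagonal); the function name is hypothetical and this is not the repo's `_compute_diou_matrix`.

```python
def diou(a, b):
    """Distance-IoU for two boxes in (x1, y1, x2, y2) format."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between box centres.
    centre_dist_sq = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + \
                     ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    diag_sq = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2
    penalty = centre_dist_sq / diag_sq if diag_sq > 0 else 0.0
    return iou - penalty


identical = diou((0, 0, 1, 1), (0, 0, 1, 1))
disjoint = diou((0, 0, 1, 1), (2, 0, 3, 1))
print(identical, disjoint)  # 1.0 and about -0.4
```

For the disjoint pair IoU is 0, the centres are 2 apart, and the enclosing box is 3 x 1, giving 0 - 4/10 = -0.4, so a test expecting exactly 0.0 fails.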
New code (DIoU + Kalman dynamics + conf-weight + age-discount + two-stage)
tuned by Optuna achieves 57.675 vs 53.217 baseline (+8.4%).

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- README Algorithms table: add MOT17 HOTA (tuned) column; SORT 55.7→57.7
- autotune/program.md: SORT Phase 1 findings section with 9 kept changes,
  5 reverted, tuned best config (HOTA=57.7, +8.4% from 53.217 baseline)

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- Set ocsort.max_interpolation_gap from 0 to 20 in default_config.json
- Infrastructure already wired: optimize_tracking.py calls interpolate_mot_gaps() when max_gap > 0
- ByteTrack Phase 1 precedent: +1.666% HOTA at same max_gap value

---
Co-authored-by: Claude Code <noreply@anthropic.com>
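
What max_interpolation_gap controls can be sketched for one track as below. This is an illustration, not the repo's `interpolate_mot_gaps()`: the function name and signature are hypothetical, and counting the gap as the number of missing frames (rather than the frame difference) is an assumption.

```python
def interpolate_gap(frame_a, box_a, frame_b, box_b, max_gap):
    """Linearly fill the frames strictly between two observations of one track.

    Boxes are MOT-style (x, y, w, h). Returns [] when there is nothing to fill
    or when the number of missing frames exceeds max_gap.
    """
    missing = frame_b - frame_a - 1
    if missing <= 0 or missing > max_gap:
        return []
    filled = []
    for frame in range(frame_a + 1, frame_b):
        t = (frame - frame_a) / (frame_b - frame_a)
        box = [a + t * (b - a) for a, b in zip(box_a, box_b)]
        filled.append((frame, box))
    return filled


# Track seen at frames 10 and 14; frames 11-13 are missing.
bridged = interpolate_gap(10, (0.0, 0.0, 10.0, 10.0), 14, (40.0, 0.0, 10.0, 10.0), max_gap=20)
too_long = interpolate_gap(10, (0.0, 0.0, 10.0, 10.0), 14, (40.0, 0.0, 10.0, 10.0), max_gap=2)
print(bridged, too_long)
```

Setting the default from 0 to 20, as in the commit above, is what switches this post-processing on for short occlusions while leaving long disappearances untouched.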
…for ocsort

- Extend _apply_kalman_patch with ocsort elif branch: monkey-patches XCYCSRStateEstimator._create_filter to multiply paper-default Q/R/P by q_scale/r_scale/p_scale scalars after original init; defaults 1.0 preserve baseline HOTA
- Add q_scale (0.001–10), r_scale (0.01–100), p_scale (0.01–100) to ocsort section of search_space.json (all log-scale)
- Add q_scale=1.0, r_scale=1.0, p_scale=1.0 defaults to ocsort section of default_config.json

---
Co-authored-by: Claude Code <noreply@anthropic.com>
… association

- _get_iou_matrix in ocsort/utils.py now uses _compute_diou_matrix from sort/utils.py (center-distance penalty improves near-miss association)
- OCR stage sv.box_iou_batch replaced with _compute_diou_matrix (consistent DIoU across both association stages)

---
Co-authored-by: Claude Code <noreply@anthropic.com>
…ation

- Promote Optuna-found best ocsort config (DIoU-calibrated): min_iou_thr 0.095→0.061, q_scale 1.214→0.0072, r_scale 14.31→0.136, p_scale 47.16→9.74
- Set conf_cost_weight=0.258 (Optuna best); together with DIoU yields HOTA=58.652
- Add conf_cost_weight param (default 0.0) to OCSORTTracker; boosts high-confidence detections in solver cost matrix while keeping raw IoU gate unchanged
- Apply confidence boost in both primary (OCM) and OCR association stages
- Wire conf_cost_weight into _build_tracker ocsort block and search_space.json

---
Co-authored-by: Claude Code <noreply@anthropic.com>
…tale lost tracks

- Add iou_age_weight param (default 0.0) to OCSORTTracker; discounts stale tracks' solver cost by 1/(1+iou_age_weight*(tsu-1)) pushing them to OCR stage; gate check unaffected
- Wire into _build_tracker ocsort block; add to search_space.json ocsort (range 0-0.5); default_config.json default 0.0

---
Co-authored-by: Claude Code <noreply@anthropic.com>
…on in ocsort

- Add p_reset_threshold to OCSORTTracklet.update(): after gap >= threshold frames, reset kf.P to identity on re-detection, discarding stale accumulated uncertainty
- Thread p_reset_threshold through OCSORTTracker.__init__ and _spawn_new_tracklets
- Wire into _build_tracker ocsort; add to search_space.json (range 0-30); default_config.json p_reset_threshold=0

---
Co-authored-by: Claude Code <noreply@anthropic.com>
…e dynamics

- Add velocity_decay and q_miss_alpha params to OCSORTTracklet.predict(): attenuate
  velocity components and inflate Q each missed frame to reduce drift on lost tracks
- Thread both params from OCSORTTracker.__init__ through _spawn_new_tracklets to tracklet
- Expose both in autotune search_space.json (velocity_decay 0.5–1.0; q_miss_alpha 0.0–2.0)
- Set neutral defaults in default_config.json (velocity_decay=1.0, q_miss_alpha=0.0)

---
Co-authored-by: Claude Code <noreply@anthropic.com>
…earch

- Promote velocity_decay=0.926, q_miss_alpha=0.512, p_reset=8, iou_age_weight=0.428
- Also: conf_cost_weight=0.970, q_scale=0.720, r_scale=1.189, p_scale=0.095
- New SDP baseline 58.905 (+0.43% over previous best 58.652)

---
Co-authored-by: Claude Code <noreply@anthropic.com>
Co-authored-by: OpenAI Codex <codex@openai.com>
- SDP table: fill OC-SORT autotune+Optuna row (HOTA=58.905, IDF1=71.636,
  MOTA=66.396, IDSW=291) and SORT row (HOTA=58.026)
- Journal: add SORT Phase 1 section (9 kept, 5 reverted, +8.4%)
- Journal: add OC-SORT Phase 1 section (7 iters + Codex, +10.4%)
  with positive experiments table, code features, and key lesson
- README Algorithms table: OC-SORT MOT17 HOTA (tuned) 57.9→58.9
- autotune/program.md: OC-SORT Phase 1 findings section — 7 kept changes,
  tuned best config (HOTA=58.905, +10.4% from 53.351 baseline), Optuna insights

---
Co-authored-by: Claude Code <noreply@anthropic.com>
Keep "already in the code" tables (prevent duplicate work) but strip
dataset-specific verdicts that anchor future agents against valid hypotheses:
- ByteTrack Phase 1 "tried and reverted" block removed
- SORT Phase 1 "tried and reverted" sub-section removed
- OC-SORT Phase 1 "Optuna findings" paragraph removed

Full campaign history remains in autotune/README.md Journal section.

---
Co-authored-by: Claude Code <noreply@anthropic.com>
@Borda Borda force-pushed the bemch/auto-research branch from 3cccd8e to 5d61be1 Compare April 7, 2026 10:06
@Borda Borda changed the title Add autotrack — autonomous MOT tracker optimization loop [rebase&merge] Add autotune — autonomous MOT tracker optimization loop [rebase&merge] Apr 7, 2026