From c48c74779a6530367f143cff4be5d667b764a039 Mon Sep 17 00:00:00 2001
From: Josh Hawkins <32435876+hawkeye217@users.noreply.github.com>
Date: Wed, 27 May 2026 10:38:45 -0500
Subject: [PATCH] add caveat

---
 testing-scripts/object_dataset.py | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

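Reviewer note: the size cut and the rollback described in the docstring
change in this patch can be sketched roughly as follows. This is a
minimal illustration only, not what object_dataset.py does. It assumes
one flat directory of PNG crops per class, and the names `png_size`,
`quarantine_small`, and `restore_quarantined` are hypothetical helpers
invented for this note.

```python
import shutil
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Return (width, height) read from a PNG file's IHDR chunk."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"not a PNG file: {path}")
    return struct.unpack(">II", header[16:24])


def quarantine_small(class_dir, quarantine_dir, min_side=80, get_size=png_size):
    """Move images whose shorter side is below min_side into quarantine_dir."""
    class_dir, quarantine_dir = Path(class_dir), Path(quarantine_dir)
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the listing before any file is moved.
    for path in sorted(class_dir.glob("*.png")):
        w, h = get_size(path)
        if min(w, h) < min_side:
            shutil.move(str(path), str(quarantine_dir / path.name))
            moved.append(path.name)
    return moved


def restore_quarantined(class_dir, quarantine_dir):
    """Undo quarantine_small by moving every quarantined file back."""
    class_dir, quarantine_dir = Path(class_dir), Path(quarantine_dir)
    restored = []
    for path in sorted(quarantine_dir.glob("*.png")):
        shutil.move(str(path), str(class_dir / path.name))
        restored.append(path.name)
    return restored
```

Moving files instead of deleting them keeps the cut reversible, which
matters for the caveat: if named-class accuracy collapses after
retraining, `restore_quarantined` puts the small crops back so they can
be reviewed visually instead.
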
diff --git a/testing-scripts/object_dataset.py b/testing-scripts/object_dataset.py
index 3a30ebe3a4..dd4071c441 100644
--- a/testing-scripts/object_dataset.py
+++ b/testing-scripts/object_dataset.py
@@ -42,8 +42,22 @@ Recommended workflow when troubleshooting misclassifications:
      them to whichever class has the most of them.
 
      Fix: quarantine every image where min(w, h) < 80 (or 100 for a
-     stricter cut) and retrain. This single step often resolves most
-     misclassifications in datasets collected from distant cameras.
+     stricter cut) and retrain. This works when the named class has
+     plenty of non-small examples to fall back on AND the small crops
+     are mostly degenerate blobs (target unrecognizable at that size).
+
+     CAVEAT — sometimes small crops ARE the signal, not the noise: if
+     your target naturally appears small at the camera distance (cats
+     indoors, distant subjects, wide-FOV setups), the small crops in
+     the named class ARE the typical inference-time input. Removing them
+     leaves the model unable to recognize the target at its natural
+     detection size, and accuracy on the named class collapses after
+     retraining. If named-class accuracy drops sharply after the size
+     cut and retrain, restore the quarantined images and switch to
+     visual review of just the misclassified small crops instead of
+     bulk size filtering. The size threshold is a tool for "tons of
+     accidental tiny blobs polluting a class with otherwise large
+     examples," not a universal cleanup.
 
   3. Verify the "none" class exists and is healthy. Without a strong
      "none" class, every unknown crop at inference gets forced into one of