mirror of
https://github.com/blakeblackshear/frigate.git
synced 2026-05-03 06:50:58 +00:00
Compare commits
No commits in common. "68382d89b4167ae4e0f1b245a8af3b09de03287e" and "8270967cdc1ffaed7e2fc35eb7d045edd8be71f0" have entirely different histories.
68382d89b4
...
8270967cdc
@ -184,7 +184,7 @@ cameras:
|
|||||||
ffmpeg: ... # add your streams
|
ffmpeg: ... # add your streams
|
||||||
detect:
|
detect:
|
||||||
enabled: True
|
enabled: True
|
||||||
fps: 5 # increase to 10 if vehicles move quickly across your frame. Higher than 15 is unnecessary and is not recommended.
|
fps: 5 # increase to 10 if vehicles move quickly across your frame. Higher than 10 is unnecessary and is not recommended.
|
||||||
min_initialized: 2
|
min_initialized: 2
|
||||||
width: 1920
|
width: 1920
|
||||||
height: 1080
|
height: 1080
|
||||||
@ -267,7 +267,7 @@ With this setup:
|
|||||||
- Review items will always be classified as a `detection`.
|
- Review items will always be classified as a `detection`.
|
||||||
- Snapshots will always be saved.
|
- Snapshots will always be saved.
|
||||||
- Zones and object masks are **not** used.
|
- Zones and object masks are **not** used.
|
||||||
- The `frigate/events` MQTT topic will **not** publish tracked object updates with the license plate bounding box and score, though `frigate/reviews` will publish if recordings are enabled. If a plate is recognized as a known plate, publishing will occur with an updated `sub_label` field. If characters are recognized, publishing will occur with an updated `recognized_license_plate` field.
|
- The `frigate/events` MQTT topic will **not** publish tracked object updates, though `frigate/reviews` will if recordings are enabled.
|
||||||
- License plate snapshots are saved at the highest-scoring moment and appear in Explore.
|
- License plate snapshots are saved at the highest-scoring moment and appear in Explore.
|
||||||
- Debug view will not show `license_plate` bounding boxes.
|
- Debug view will not show `license_plate` bounding boxes.
|
||||||
|
|
||||||
@ -280,7 +280,7 @@ With this setup:
|
|||||||
| Object Detection | Standard Frigate+ detection applies | Bypasses standard object detection |
|
| Object Detection | Standard Frigate+ detection applies | Bypasses standard object detection |
|
||||||
| Zones & Object Masks | Supported | Not supported |
|
| Zones & Object Masks | Supported | Not supported |
|
||||||
| Debug View | May show `license_plate` bounding boxes | May **not** show `license_plate` bounding boxes |
|
| Debug View | May show `license_plate` bounding boxes | May **not** show `license_plate` bounding boxes |
|
||||||
| MQTT `frigate/events` | Publishes tracked object updates | Publishes limited updates |
|
| MQTT `frigate/events` | Publishes tracked object updates | Does **not** publish tracked object updates |
|
||||||
| Explore | Recognized plates available in More Filters | Recognized plates available in More Filters |
|
| Explore | Recognized plates available in More Filters | Recognized plates available in More Filters |
|
||||||
|
|
||||||
By selecting the appropriate configuration, users can optimize their dedicated LPR cameras based on whether they are using a Frigate+ model or the secondary LPR pipeline.
|
By selecting the appropriate configuration, users can optimize their dedicated LPR cameras based on whether they are using a Frigate+ model or the secondary LPR pipeline.
|
||||||
|
|||||||
@ -659,7 +659,7 @@ YOLOv3, YOLOv4, YOLOv7, and [YOLOv9](https://github.com/WongKinYiu/yolov9) model
|
|||||||
|
|
||||||
:::tip
|
:::tip
|
||||||
|
|
||||||
The YOLO detector has been designed to support YOLOv3, YOLOv4, YOLOv7, and YOLOv9 models, but may support other YOLO model architectures as well. See [the models section](#downloading-yolo-models) for more information on downloading YOLO models for use in Frigate.
|
The YOLO detector has been designed to support YOLOv3, YOLOv4, YOLOv7, and YOLOv9 models, but may support other YOLO model architectures as well.
|
||||||
|
|
||||||
:::
|
:::
|
||||||
|
|
||||||
@ -682,29 +682,6 @@ model:
|
|||||||
|
|
||||||
Note that the labelmap uses a subset of the complete COCO label set that has only 80 objects.
|
Note that the labelmap uses a subset of the complete COCO label set that has only 80 objects.
|
||||||
|
|
||||||
#### YOLOx
|
|
||||||
|
|
||||||
[YOLOx](https://github.com/Megvii-BaseDetection/YOLOX) models are supported, but not included by default. See [the models section](#downloading-yolo-models) for more information on downloading the YOLOx model for use in Frigate.
|
|
||||||
|
|
||||||
After placing the downloaded onnx model in your config folder, you can use the following configuration:
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
detectors:
|
|
||||||
onnx:
|
|
||||||
type: onnx
|
|
||||||
|
|
||||||
model:
|
|
||||||
model_type: yolox
|
|
||||||
width: 416 # <--- should match the imgsize set during model export
|
|
||||||
height: 416 # <--- should match the imgsize set during model export
|
|
||||||
input_tensor: nchw_denorm
|
|
||||||
input_dtype: float
|
|
||||||
path: /config/model_cache/yolox_tiny.onnx
|
|
||||||
labelmap_path: /labelmap/coco-80.txt
|
|
||||||
```
|
|
||||||
|
|
||||||
Note that the labelmap uses a subset of the complete COCO label set that has only 80 objects.
|
|
||||||
|
|
||||||
#### RF-DETR
|
#### RF-DETR
|
||||||
|
|
||||||
[RF-DETR](https://github.com/roboflow/rf-detr) is a DETR based model. The ONNX exported models are supported, but not included by default. See [the models section](#downloading-rf-detr-model) for more information on downloading the RF-DETR model for use in Frigate.
|
[RF-DETR](https://github.com/roboflow/rf-detr) is a DETR based model. The ONNX exported models are supported, but not included by default. See [the models section](#downloading-rf-detr-model) for more information on downloading the RF-DETR model for use in Frigate.
|
||||||
@ -985,10 +962,6 @@ The input image size in this notebook is set to 320x320. This results in lower C
|
|||||||
|
|
||||||
### Downloading YOLO Models
|
### Downloading YOLO Models
|
||||||
|
|
||||||
#### YOLOx
|
|
||||||
|
|
||||||
YOLOx models can be downloaded [from the YOLOx repo](https://github.com/Megvii-BaseDetection/YOLOX/tree/main/demo/ONNXRuntime).
|
|
||||||
|
|
||||||
#### YOLOv3, YOLOv4, and YOLOv7
|
#### YOLOv3, YOLOv4, and YOLOv7
|
||||||
|
|
||||||
To export as ONNX:
|
To export as ONNX:
|
||||||
|
|||||||
@ -513,14 +513,10 @@ class FrigateConfig(FrigateBaseModel):
|
|||||||
)
|
)
|
||||||
|
|
||||||
# Warn if detect fps > 10
|
# Warn if detect fps > 10
|
||||||
if camera_config.detect.fps > 10 and camera_config.type != "lpr":
|
if camera_config.detect.fps > 10:
|
||||||
logger.warning(
|
logger.warning(
|
||||||
f"{camera_config.name} detect fps is set to {camera_config.detect.fps}. This does NOT need to match your camera's frame rate. High values could lead to reduced performance. Recommended value is 5."
|
f"{camera_config.name} detect fps is set to {camera_config.detect.fps}. This does NOT need to match your camera's frame rate. High values could lead to reduced performance. Recommended value is 5."
|
||||||
)
|
)
|
||||||
if camera_config.detect.fps > 15 and camera_config.type == "lpr":
|
|
||||||
logger.warning(
|
|
||||||
f"{camera_config.name} detect fps is set to {camera_config.detect.fps}. This does NOT need to match your camera's frame rate. High values could lead to reduced performance. Recommended value for LPR cameras are between 5-15."
|
|
||||||
)
|
|
||||||
|
|
||||||
# Default min_initialized configuration
|
# Default min_initialized configuration
|
||||||
min_initialized = int(camera_config.detect.fps / 2)
|
min_initialized = int(camera_config.detect.fps / 2)
|
||||||
|
|||||||
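The fps warning branches on the left-hand side of the hunk above can be summarized in a small standalone sketch. The function names, the `"generic"` camera type string, and the returned message text are illustrative assumptions and not part of Frigate. Only the limits (10 for regular cameras, 15 for `lpr` cameras) and the `int(fps / 2)` default come from the diff.

```python
from typing import Optional

# Limits taken from the two warning branches shown in the hunk above.
GENERIC_FPS_LIMIT = 10
LPR_FPS_LIMIT = 15


def detect_fps_warning(name: str, fps: int, camera_type: str = "generic") -> Optional[str]:
    """Return a warning string when detect fps exceeds the limit for the camera type.

    Hypothetical helper: Frigate logs the warning inline instead of returning it.
    """
    limit = LPR_FPS_LIMIT if camera_type == "lpr" else GENERIC_FPS_LIMIT
    if fps <= limit:
        return None
    return f"{name} detect fps is set to {fps}, above the suggested limit of {limit}."


def default_min_initialized(fps: int) -> int:
    """Same arithmetic as `int(camera_config.detect.fps / 2)` in the hunk."""
    return int(fps / 2)
```

With `fps: 5`, `default_min_initialized(5)` evaluates to 2, which agrees with the `min_initialized: 2` value in the documented LPR camera example.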
@ -490,6 +490,10 @@ class LicensePlateProcessingMixin:
|
|||||||
merged_boxes.append(current_box)
|
merged_boxes.append(current_box)
|
||||||
current_box = next_box
|
current_box = next_box
|
||||||
|
|
||||||
|
logger.debug(
|
||||||
|
f"Provided plate_width: {plate_width}, max_gap: {max_gap}, horizontal_gap: {horizontal_gap}"
|
||||||
|
)
|
||||||
|
|
||||||
# Add the last box
|
# Add the last box
|
||||||
merged_boxes.append(current_box)
|
merged_boxes.append(current_box)
|
||||||
|
|
||||||
@ -1129,7 +1133,7 @@ class LicensePlateProcessingMixin:
|
|||||||
# 4. Log the comparison
|
# 4. Log the comparison
|
||||||
logger.debug(
|
logger.debug(
|
||||||
f"Plate comparison - Current: {top_plate} (score: {curr_score:.3f}, min_conf: {curr_min_conf:.2f}) vs "
|
f"Plate comparison - Current: {top_plate} (score: {curr_score:.3f}, min_conf: {curr_min_conf:.2f}) vs "
|
||||||
f"Previous: {prev_plate} (score: {prev_score:.3f}, min_conf: {prev_min_conf:.2f}) "
|
f"Previous: {prev_plate} (score: {prev_score:.3f}, min_conf: {prev_min_conf:.2f})\n"
|
||||||
f"Metrics - Length: {len(top_plate)} vs {len(prev_plate)} (scores: {curr_length_score:.2f} vs {prev_length_score:.2f}), "
|
f"Metrics - Length: {len(top_plate)} vs {len(prev_plate)} (scores: {curr_length_score:.2f} vs {prev_length_score:.2f}), "
|
||||||
f"Area: {top_area} vs {prev_area}, "
|
f"Area: {top_area} vs {prev_area}, "
|
||||||
f"Avg Conf: {avg_confidence:.2f} vs {prev_avg_confidence:.2f}, "
|
f"Avg Conf: {avg_confidence:.2f} vs {prev_avg_confidence:.2f}, "
|
||||||
@ -1259,15 +1263,6 @@ class LicensePlateProcessingMixin:
|
|||||||
)
|
)
|
||||||
return
|
return
|
||||||
|
|
||||||
# don't run for objects with no position changes
|
|
||||||
# this is the initial state after registering a new tracked object
|
|
||||||
# LPR will run 2 frames after detect.min_initialized is reached
|
|
||||||
if obj_data.get("position_changes", 0) == 0:
|
|
||||||
logger.debug(
|
|
||||||
f"{camera}: Plate detected in {self.config.cameras[camera].detect.min_initialized + 1} concurrent frames, LPR frame threshold ({self.config.cameras[camera].detect.min_initialized + 2})"
|
|
||||||
)
|
|
||||||
return
|
|
||||||
|
|
||||||
license_plate: Optional[dict[str, any]] = None
|
license_plate: Optional[dict[str, any]] = None
|
||||||
|
|
||||||
if "license_plate" not in self.config.cameras[camera].objects.track:
|
if "license_plate" not in self.config.cameras[camera].objects.track:
|
||||||
@ -1406,8 +1401,6 @@ class LicensePlateProcessingMixin:
|
|||||||
license_plate_frame,
|
license_plate_frame,
|
||||||
)
|
)
|
||||||
|
|
||||||
logger.debug(f"{camera}: Running plate recognition.")
|
|
||||||
|
|
||||||
# run detection, returns results sorted by confidence, best first
|
# run detection, returns results sorted by confidence, best first
|
||||||
start = datetime.datetime.now().timestamp()
|
start = datetime.datetime.now().timestamp()
|
||||||
license_plates, confidences, areas = self._process_license_plate(
|
license_plates, confidences, areas = self._process_license_plate(
|
||||||
|
|||||||
@ -54,9 +54,6 @@ class LicensePlatePostProcessor(LicensePlateProcessingMixin, PostProcessorApi):
|
|||||||
Returns:
|
Returns:
|
||||||
None.
|
None.
|
||||||
"""
|
"""
|
||||||
# don't run LPR post processing for now
|
|
||||||
return
|
|
||||||
|
|
||||||
event_id = data["event_id"]
|
event_id = data["event_id"]
|
||||||
camera_name = data["camera"]
|
camera_name = data["camera"]
|
||||||
|
|
||||||
|
|||||||
@ -16,7 +16,7 @@ class DetectionApi(ABC):
|
|||||||
@abstractmethod
|
@abstractmethod
|
||||||
def __init__(self, detector_config: BaseDetectorConfig):
|
def __init__(self, detector_config: BaseDetectorConfig):
|
||||||
self.detector_config = detector_config
|
self.detector_config = detector_config
|
||||||
self.thresh = 0.4
|
self.thresh = 0.5
|
||||||
self.height = detector_config.model.height
|
self.height = detector_config.model.height
|
||||||
self.width = detector_config.model.width
|
self.width = detector_config.model.width
|
||||||
|
|
||||||
@ -24,21 +24,58 @@ class DetectionApi(ABC):
|
|||||||
def detect_raw(self, tensor_input):
|
def detect_raw(self, tensor_input):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
def calculate_grids_strides(self) -> None:
|
def post_process_yolonas(self, output):
|
||||||
grids = []
|
"""
|
||||||
expanded_strides = []
|
@param output: output of inference
|
||||||
|
expected shape: [np.array(1, N, 4), np.array(1, N, 80)]
|
||||||
|
where N depends on the input size e.g. N=2100 for 320x320 images
|
||||||
|
|
||||||
# decode and orient predictions
|
@return: best results: np.array(20, 6) where each row is
|
||||||
strides = [8, 16, 32]
|
in this order (class_id, score, y1/height, x1/width, y2/height, x2/width)
|
||||||
hsizes = [self.height // stride for stride in strides]
|
"""
|
||||||
wsizes = [self.width // stride for stride in strides]
|
|
||||||
|
|
||||||
for hsize, wsize, stride in zip(hsizes, wsizes, strides):
|
N = output[0].shape[1]
|
||||||
xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))
|
|
||||||
grid = np.stack((xv, yv), 2).reshape(1, -1, 2)
|
|
||||||
grids.append(grid)
|
|
||||||
shape = grid.shape[:2]
|
|
||||||
expanded_strides.append(np.full((*shape, 1), stride))
|
|
||||||
|
|
||||||
self.grids = np.concatenate(grids, 1)
|
boxes = output[0].reshape(N, 4)
|
||||||
self.expanded_strides = np.concatenate(expanded_strides, 1)
|
scores = output[1].reshape(N, 80)
|
||||||
|
|
||||||
|
class_ids = np.argmax(scores, axis=1)
|
||||||
|
scores = scores[np.arange(N), class_ids]
|
||||||
|
|
||||||
|
args_best = np.argwhere(scores > self.thresh)[:, 0]
|
||||||
|
|
||||||
|
num_matches = len(args_best)
|
||||||
|
if num_matches == 0:
|
||||||
|
return np.zeros((20, 6), np.float32)
|
||||||
|
elif num_matches > 20:
|
||||||
|
args_best20 = np.argpartition(scores[args_best], -20)[-20:]
|
||||||
|
args_best = args_best[args_best20]
|
||||||
|
|
||||||
|
boxes = boxes[args_best]
|
||||||
|
class_ids = class_ids[args_best]
|
||||||
|
scores = scores[args_best]
|
||||||
|
|
||||||
|
boxes = np.transpose(
|
||||||
|
np.vstack(
|
||||||
|
(
|
||||||
|
boxes[:, 1] / self.height,
|
||||||
|
boxes[:, 0] / self.width,
|
||||||
|
boxes[:, 3] / self.height,
|
||||||
|
boxes[:, 2] / self.width,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
results = np.hstack(
|
||||||
|
(class_ids[..., np.newaxis], scores[..., np.newaxis], boxes)
|
||||||
|
)
|
||||||
|
|
||||||
|
return np.resize(results, (20, 6))
|
||||||
|
|
||||||
|
def post_process(self, output):
|
||||||
|
if self.detector_config.model.model_type == ModelTypeEnum.yolonas:
|
||||||
|
return self.post_process_yolonas(output)
|
||||||
|
else:
|
||||||
|
raise ValueError(
|
||||||
|
f'Model type "{self.detector_config.model.model_type}" is currently not supported.'
|
||||||
|
)
|
||||||
|
|||||||
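The selection step in `post_process_yolonas` (keep scores above the threshold, cap at 20, emit rows ordered as class_id, score, y1/height, x1/width, y2/height, x2/width) can be illustrated without NumPy. This is a minimal sketch under stated assumptions: it uses `heapq.nlargest` in place of `np.argpartition`, returns rows sorted best first, and zero-pads the result. The helper names are hypothetical, and the default threshold of 0.5 is taken from the right-hand side of the diff.

```python
import heapq

MAX_DETECTIONS = 20  # fixed number of output rows used throughout the diff


def select_top_detections(scores, thresh=0.5, limit=MAX_DETECTIONS):
    """Indices of scores strictly above thresh, at most `limit`, best first."""
    candidates = [i for i, score in enumerate(scores) if score > thresh]
    return heapq.nlargest(limit, candidates, key=lambda i: scores[i])


def to_detection_rows(class_ids, scores, boxes, width, height,
                      thresh=0.5, limit=MAX_DETECTIONS):
    """Build `limit` rows of [class_id, score, y1/h, x1/w, y2/h, x2/w].

    `boxes` holds (x1, y1, x2, y2) in pixels. Unused rows are zero-filled,
    which is an assumption of this sketch.
    """
    rows = []
    for i in select_top_detections(scores, thresh, limit):
        x1, y1, x2, y2 = boxes[i]
        rows.append([
            float(class_ids[i]),
            float(scores[i]),
            y1 / height,
            x1 / width,
            y2 / height,
            x2 / width,
        ])
    rows.extend([0.0] * 6 for _ in range(limit - len(rows)))
    return rows
```

One difference worth noting: the diff finishes with `np.resize(results, (20, 6))`, and `np.resize` fills a larger target by repeating the input rather than by zero-padding. The sketch zero-pads instead.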
@ -31,7 +31,6 @@ class InputTensorEnum(str, Enum):
|
|||||||
|
|
||||||
class InputDTypeEnum(str, Enum):
|
class InputDTypeEnum(str, Enum):
|
||||||
float = "float"
|
float = "float"
|
||||||
float_denorm = "float_denorm" # non-normalized float
|
|
||||||
int = "int"
|
int = "int"
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
@ -14,7 +14,6 @@ from frigate.util.model import (
|
|||||||
post_process_dfine,
|
post_process_dfine,
|
||||||
post_process_rfdetr,
|
post_process_rfdetr,
|
||||||
post_process_yolo,
|
post_process_yolo,
|
||||||
post_process_yolox,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
@ -31,8 +30,6 @@ class ONNXDetector(DetectionApi):
|
|||||||
type_key = DETECTOR_KEY
|
type_key = DETECTOR_KEY
|
||||||
|
|
||||||
def __init__(self, detector_config: ONNXDetectorConfig):
|
def __init__(self, detector_config: ONNXDetectorConfig):
|
||||||
super().__init__(detector_config)
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
import onnxruntime as ort
|
import onnxruntime as ort
|
||||||
|
|
||||||
@ -54,14 +51,13 @@ class ONNXDetector(DetectionApi):
|
|||||||
path, providers=providers, provider_options=options
|
path, providers=providers, provider_options=options
|
||||||
)
|
)
|
||||||
|
|
||||||
|
self.h = detector_config.model.height
|
||||||
|
self.w = detector_config.model.width
|
||||||
self.onnx_model_type = detector_config.model.model_type
|
self.onnx_model_type = detector_config.model.model_type
|
||||||
self.onnx_model_px = detector_config.model.input_pixel_format
|
self.onnx_model_px = detector_config.model.input_pixel_format
|
||||||
self.onnx_model_shape = detector_config.model.input_tensor
|
self.onnx_model_shape = detector_config.model.input_tensor
|
||||||
path = detector_config.model.path
|
path = detector_config.model.path
|
||||||
|
|
||||||
if self.onnx_model_type == ModelTypeEnum.yolox:
|
|
||||||
self.calculate_grids_strides()
|
|
||||||
|
|
||||||
logger.info(f"ONNX: {path} loaded")
|
logger.info(f"ONNX: {path} loaded")
|
||||||
|
|
||||||
def detect_raw(self, tensor_input: np.ndarray):
|
def detect_raw(self, tensor_input: np.ndarray):
|
||||||
@ -70,12 +66,10 @@ class ONNXDetector(DetectionApi):
|
|||||||
None,
|
None,
|
||||||
{
|
{
|
||||||
"images": tensor_input,
|
"images": tensor_input,
|
||||||
"orig_target_sizes": np.array(
|
"orig_target_sizes": np.array([[self.h, self.w]], dtype=np.int64),
|
||||||
[[self.height, self.width]], dtype=np.int64
|
|
||||||
),
|
|
||||||
},
|
},
|
||||||
)
|
)
|
||||||
return post_process_dfine(tensor_output, self.width, self.height)
|
return post_process_dfine(tensor_output, self.w, self.h)
|
||||||
|
|
||||||
model_input_name = self.model.get_inputs()[0].name
|
model_input_name = self.model.get_inputs()[0].name
|
||||||
tensor_output = self.model.run(None, {model_input_name: tensor_input})
|
tensor_output = self.model.run(None, {model_input_name: tensor_input})
|
||||||
@ -97,22 +91,14 @@ class ONNXDetector(DetectionApi):
|
|||||||
detections[i] = [
|
detections[i] = [
|
||||||
class_id,
|
class_id,
|
||||||
confidence,
|
confidence,
|
||||||
y_min / self.height,
|
y_min / self.h,
|
||||||
x_min / self.width,
|
x_min / self.w,
|
||||||
y_max / self.height,
|
y_max / self.h,
|
||||||
x_max / self.width,
|
x_max / self.w,
|
||||||
]
|
]
|
||||||
return detections
|
return detections
|
||||||
elif self.onnx_model_type == ModelTypeEnum.yologeneric:
|
elif self.onnx_model_type == ModelTypeEnum.yologeneric:
|
||||||
return post_process_yolo(tensor_output, self.width, self.height)
|
return post_process_yolo(tensor_output, self.w, self.h)
|
||||||
elif self.onnx_model_type == ModelTypeEnum.yolox:
|
|
||||||
return post_process_yolox(
|
|
||||||
tensor_output[0],
|
|
||||||
self.width,
|
|
||||||
self.height,
|
|
||||||
self.grids,
|
|
||||||
self.expanded_strides,
|
|
||||||
)
|
|
||||||
else:
|
else:
|
||||||
raise Exception(
|
raise Exception(
|
||||||
f"{self.onnx_model_type} is currently not supported for onnx. See the docs for more info on supported models."
|
f"{self.onnx_model_type} is currently not supported for onnx. See the docs for more info on supported models."
|
||||||
|
|||||||
@ -38,7 +38,6 @@ class OvDetector(DetectionApi):
|
|||||||
]
|
]
|
||||||
|
|
||||||
def __init__(self, detector_config: OvDetectorConfig):
|
def __init__(self, detector_config: OvDetectorConfig):
|
||||||
super().__init__(detector_config)
|
|
||||||
self.ov_core = ov.Core()
|
self.ov_core = ov.Core()
|
||||||
self.ov_model_type = detector_config.model.model_type
|
self.ov_model_type = detector_config.model.model_type
|
||||||
|
|
||||||
@ -134,7 +133,25 @@ class OvDetector(DetectionApi):
|
|||||||
break
|
break
|
||||||
self.num_classes = tensor_shape[2] - 5
|
self.num_classes = tensor_shape[2] - 5
|
||||||
logger.info(f"YOLOX model has {self.num_classes} classes")
|
logger.info(f"YOLOX model has {self.num_classes} classes")
|
||||||
self.calculate_grids_strides()
|
self.set_strides_grids()
|
||||||
|
|
||||||
|
def set_strides_grids(self):
|
||||||
|
grids = []
|
||||||
|
expanded_strides = []
|
||||||
|
|
||||||
|
strides = [8, 16, 32]
|
||||||
|
|
||||||
|
hsize_list = [self.h // stride for stride in strides]
|
||||||
|
wsize_list = [self.w // stride for stride in strides]
|
||||||
|
|
||||||
|
for hsize, wsize, stride in zip(hsize_list, wsize_list, strides):
|
||||||
|
xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))
|
||||||
|
grid = np.stack((xv, yv), 2).reshape(1, -1, 2)
|
||||||
|
grids.append(grid)
|
||||||
|
shape = grid.shape[:2]
|
||||||
|
expanded_strides.append(np.full((*shape, 1), stride))
|
||||||
|
self.grids = np.concatenate(grids, 1)
|
||||||
|
self.expanded_strides = np.concatenate(expanded_strides, 1)
|
||||||
|
|
||||||
## Takes in class ID, confidence score, and array of [x, y, w, h] that describes detection position,
|
## Takes in class ID, confidence score, and array of [x, y, w, h] that describes detection position,
|
||||||
## returns an array that's easily passable back to Frigate.
|
## returns an array that's easily passable back to Frigate.
|
||||||
|
|||||||
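The grid and stride setup shown above (`calculate_grids_strides` on the left, `set_strides_grids` on the right) builds one grid cell per stride step across the input. A minimal pure-Python sketch of the same bookkeeping follows. The function names are hypothetical, and the sketch returns flat lists rather than the NumPy arrays used in the diff.

```python
STRIDES = (8, 16, 32)  # stride values used in both versions of the method


def grid_sizes(width, height, strides=STRIDES):
    """(rows, cols) of the prediction grid at each stride."""
    return [(height // stride, width // stride) for stride in strides]


def anchor_count(width, height, strides=STRIDES):
    """Total number of grid cells across all strides."""
    return sum(rows * cols for rows, cols in grid_sizes(width, height, strides))


def expanded_strides(width, height, strides=STRIDES):
    """One stride value per grid cell, in the same order the grids are concatenated."""
    flat = []
    for (rows, cols), stride in zip(grid_sizes(width, height, strides), strides):
        flat.extend([stride] * (rows * cols))
    return flat


def grid_cells(width, height, strides=STRIDES):
    """(x, y) cell coordinates, row-major per stride, matching meshgrid + reshape."""
    cells = []
    for rows, cols in grid_sizes(width, height, strides):
        cells.extend((x, y) for y in range(rows) for x in range(cols))
    return cells
```

For a 320x320 input this gives 40x40 + 20x20 + 10x10 = 2100 cells, and for the 416x416 size used in the removed YOLOx config example it gives 3549.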
@ -4,7 +4,6 @@ import re
|
|||||||
import urllib.request
|
import urllib.request
|
||||||
from typing import Literal
|
from typing import Literal
|
||||||
|
|
||||||
import numpy as np
|
|
||||||
from pydantic import Field
|
from pydantic import Field
|
||||||
|
|
||||||
from frigate.const import MODEL_CACHE_DIR
|
from frigate.const import MODEL_CACHE_DIR
|
||||||
@ -151,62 +150,6 @@ class Rknn(DetectionApi):
|
|||||||
'Make sure to set the model input_tensor to "nhwc" in your config.'
|
'Make sure to set the model input_tensor to "nhwc" in your config.'
|
||||||
)
|
)
|
||||||
|
|
||||||
def post_process_yolonas(self, output: list[np.ndarray]):
|
|
||||||
"""
|
|
||||||
@param output: output of inference
|
|
||||||
expected shape: [np.array(1, N, 4), np.array(1, N, 80)]
|
|
||||||
where N depends on the input size e.g. N=2100 for 320x320 images
|
|
||||||
|
|
||||||
@return: best results: np.array(20, 6) where each row is
|
|
||||||
in this order (class_id, score, y1/height, x1/width, y2/height, x2/width)
|
|
||||||
"""
|
|
||||||
|
|
||||||
N = output[0].shape[1]
|
|
||||||
|
|
||||||
boxes = output[0].reshape(N, 4)
|
|
||||||
scores = output[1].reshape(N, 80)
|
|
||||||
|
|
||||||
class_ids = np.argmax(scores, axis=1)
|
|
||||||
scores = scores[np.arange(N), class_ids]
|
|
||||||
|
|
||||||
args_best = np.argwhere(scores > self.thresh)[:, 0]
|
|
||||||
|
|
||||||
num_matches = len(args_best)
|
|
||||||
if num_matches == 0:
|
|
||||||
return np.zeros((20, 6), np.float32)
|
|
||||||
elif num_matches > 20:
|
|
||||||
args_best20 = np.argpartition(scores[args_best], -20)[-20:]
|
|
||||||
args_best = args_best[args_best20]
|
|
||||||
|
|
||||||
boxes = boxes[args_best]
|
|
||||||
class_ids = class_ids[args_best]
|
|
||||||
scores = scores[args_best]
|
|
||||||
|
|
||||||
boxes = np.transpose(
|
|
||||||
np.vstack(
|
|
||||||
(
|
|
||||||
boxes[:, 1] / self.height,
|
|
||||||
boxes[:, 0] / self.width,
|
|
||||||
boxes[:, 3] / self.height,
|
|
||||||
boxes[:, 2] / self.width,
|
|
||||||
)
|
|
||||||
)
|
|
||||||
)
|
|
||||||
|
|
||||||
results = np.hstack(
|
|
||||||
(class_ids[..., np.newaxis], scores[..., np.newaxis], boxes)
|
|
||||||
)
|
|
||||||
|
|
||||||
return np.resize(results, (20, 6))
|
|
||||||
|
|
||||||
def post_process(self, output):
|
|
||||||
if self.detector_config.model.model_type == ModelTypeEnum.yolonas:
|
|
||||||
return self.post_process_yolonas(output)
|
|
||||||
else:
|
|
||||||
raise ValueError(
|
|
||||||
f'Model type "{self.detector_config.model.model_type}" is currently not supported.'
|
|
||||||
)
|
|
||||||
|
|
||||||
def detect_raw(self, tensor_input):
|
def detect_raw(self, tensor_input):
|
||||||
output = self.rknn.inference(
|
output = self.rknn.inference(
|
||||||
[
|
[
|
||||||
|
|||||||
@ -77,8 +77,6 @@ class LocalObjectDetector(ObjectDetector):
|
|||||||
if self.dtype == InputDTypeEnum.float:
|
if self.dtype == InputDTypeEnum.float:
|
||||||
tensor_input = tensor_input.astype(np.float32)
|
tensor_input = tensor_input.astype(np.float32)
|
||||||
tensor_input /= 255
|
tensor_input /= 255
|
||||||
elif self.dtype == InputDTypeEnum.float_denorm:
|
|
||||||
tensor_input = tensor_input.astype(np.float32)
|
|
||||||
|
|
||||||
return self.detect_api.detect_raw(tensor_input=tensor_input)
|
return self.detect_api.detect_raw(tensor_input=tensor_input)
|
||||||
|
|
||||||
|
|||||||
@ -138,13 +138,11 @@ class TrackedObject:
|
|||||||
|
|
||||||
if not self.false_positive and has_valid_frame:
|
if not self.false_positive and has_valid_frame:
|
||||||
# determine if this frame is a better thumbnail
|
# determine if this frame is a better thumbnail
|
||||||
if self.thumbnail_data is None or (
|
if self.thumbnail_data is None or is_better_thumbnail(
|
||||||
better_thumb := is_better_thumbnail(
|
self.obj_data["label"],
|
||||||
self.obj_data["label"],
|
self.thumbnail_data,
|
||||||
self.thumbnail_data,
|
obj_data,
|
||||||
obj_data,
|
self.camera_config.frame_shape,
|
||||||
self.camera_config.frame_shape,
|
|
||||||
)
|
|
||||||
):
|
):
|
||||||
# use the current frame time if the object's frame time isn't in the frame cache
|
# use the current frame time if the object's frame time isn't in the frame cache
|
||||||
selected_frame_time = (
|
selected_frame_time = (
|
||||||
@ -152,13 +150,6 @@ class TrackedObject:
|
|||||||
if obj_data["frame_time"] not in self.frame_cache.keys()
|
if obj_data["frame_time"] not in self.frame_cache.keys()
|
||||||
else obj_data["frame_time"]
|
else obj_data["frame_time"]
|
||||||
)
|
)
|
||||||
if (
|
|
||||||
obj_data["frame_time"] not in self.frame_cache.keys()
|
|
||||||
and not better_thumb
|
|
||||||
):
|
|
||||||
logger.warning(
|
|
||||||
f"Frame time {obj_data['frame_time']} not in frame cache, using current frame time {selected_frame_time}"
|
|
||||||
)
|
|
||||||
self.thumbnail_data = {
|
self.thumbnail_data = {
|
||||||
"frame_time": selected_frame_time,
|
"frame_time": selected_frame_time,
|
||||||
"box": obj_data["box"],
|
"box": obj_data["box"],
|
||||||
|
|||||||
@ -148,17 +148,27 @@ def __post_process_multipart_yolo(
|
|||||||
bw = ((dw * 2.0) ** 2) * anchor_w
|
bw = ((dw * 2.0) ** 2) * anchor_w
|
||||||
bh = ((dh * 2.0) ** 2) * anchor_h
|
bh = ((dh * 2.0) ** 2) * anchor_h
|
||||||
|
|
||||||
x1 = max(0, bx - bw / 2)
|
x1 = max(0, bx - bw / 2) / width
|
||||||
y1 = max(0, by - bh / 2)
|
y1 = max(0, by - bh / 2) / height
|
||||||
x2 = min(width, bx + bw / 2)
|
x2 = min(width, bx + bw / 2) / width
|
||||||
y2 = min(height, by + bh / 2)
|
y2 = min(height, by + bh / 2) / height
|
||||||
|
|
||||||
all_boxes.append([x1, y1, x2, y2])
|
all_boxes.append([x1, y1, x2, y2])
|
||||||
all_scores.append(conf)
|
all_scores.append(conf)
|
||||||
all_class_ids.append(class_id)
|
all_class_ids.append(class_id)
|
||||||
|
|
||||||
|
formatted_boxes = [
|
||||||
|
[
|
||||||
|
int(x1 * width),
|
||||||
|
int(y1 * height),
|
||||||
|
int((x2 - x1) * width),
|
||||||
|
int((y2 - y1) * height),
|
||||||
|
]
|
||||||
|
for x1, y1, x2, y2 in all_boxes
|
||||||
|
]
|
||||||
|
|
||||||
indices = cv2.dnn.NMSBoxes(
|
indices = cv2.dnn.NMSBoxes(
|
||||||
bboxes=all_boxes,
|
bboxes=formatted_boxes,
|
||||||
scores=all_scores,
|
scores=all_scores,
|
||||||
score_threshold=0.4,
|
score_threshold=0.4,
|
||||||
nms_threshold=0.4,
|
nms_threshold=0.4,
|
||||||
@ -171,25 +181,13 @@ def __post_process_multipart_yolo(
|
|||||||
class_id = all_class_ids[idx]
|
class_id = all_class_ids[idx]
|
||||||
conf = all_scores[idx]
|
conf = all_scores[idx]
|
||||||
x1, y1, x2, y2 = all_boxes[idx]
|
x1, y1, x2, y2 = all_boxes[idx]
|
||||||
results[i] = [
|
results[i] = [class_id, conf, y1, x1, y2, x2]
|
||||||
class_id,
|
|
||||||
conf,
|
|
||||||
y1 / height,
|
|
||||||
x1 / width,
|
|
||||||
y2 / height,
|
|
||||||
x2 / width,
|
|
||||||
]
|
|
||||||
|
|
||||||
return np.array(results, dtype=np.float32)
|
return np.array(results, dtype=np.float32)
|
||||||
|
|
||||||
|
|
||||||
def __post_process_nms_yolo(predictions: np.ndarray, width, height) -> np.ndarray:
|
def __post_process_nms_yolo(predictions: np.ndarray, width, height) -> np.ndarray:
|
||||||
predictions = np.squeeze(predictions)
|
predictions = np.squeeze(predictions).T
|
||||||
|
|
||||||
# transpose the output so it has order (inferences, class_ids)
|
|
||||||
if predictions.shape[0] < predictions.shape[1]:
|
|
||||||
predictions = predictions.T
|
|
||||||
|
|
||||||
scores = np.max(predictions[:, 4:], axis=1)
|
scores = np.max(predictions[:, 4:], axis=1)
|
||||||
predictions = predictions[scores > 0.4, :]
|
predictions = predictions[scores > 0.4, :]
|
||||||
scores = scores[scores > 0.4]
|
scores = scores[scores > 0.4]
|
||||||
@ -197,14 +195,9 @@ def __post_process_nms_yolo(predictions: np.ndarray, width, height) -> np.ndarra
|
|||||||
|
|
||||||
# Rescale box
|
# Rescale box
|
||||||
boxes = predictions[:, :4]
|
boxes = predictions[:, :4]
|
||||||
boxes_xyxy = np.ones_like(boxes)
|
|
||||||
boxes_xyxy[:, 0] = boxes[:, 0] - boxes[:, 2] / 2
|
|
||||||
boxes_xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2
|
|
||||||
boxes_xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2
|
|
||||||
boxes_xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2
|
|
||||||
boxes = boxes_xyxy
|
|
||||||
|
|
||||||
# run NMS
|
input_shape = np.array([width, height, width, height])
|
||||||
|
boxes = np.divide(boxes, input_shape, dtype=np.float32)
|
||||||
indices = cv2.dnn.NMSBoxes(boxes, scores, score_threshold=0.4, nms_threshold=0.4)
|
indices = cv2.dnn.NMSBoxes(boxes, scores, score_threshold=0.4, nms_threshold=0.4)
|
||||||
detections = np.zeros((20, 6), np.float32)
|
detections = np.zeros((20, 6), np.float32)
|
||||||
for i, (bbox, confidence, class_id) in enumerate(
|
for i, (bbox, confidence, class_id) in enumerate(
|
||||||
@ -216,10 +209,10 @@ def __post_process_nms_yolo(predictions: np.ndarray, width, height) -> np.ndarra
|
|||||||
detections[i] = [
|
detections[i] = [
|
||||||
class_id,
|
class_id,
|
||||||
confidence,
|
confidence,
|
||||||
bbox[1] / height,
|
bbox[1] - bbox[3] / 2,
|
||||||
bbox[0] / width,
|
bbox[0] - bbox[2] / 2,
|
||||||
bbox[3] / height,
|
bbox[1] + bbox[3] / 2,
|
||||||
bbox[2] / width,
|
bbox[0] + bbox[2] / 2,
|
||||||
]
|
]
|
||||||
|
|
||||||
return detections
|
return detections
|
||||||
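The two sides of `__post_process_nms_yolo` differ mainly in when the center-format box (cx, cy, w, h) is turned into corners and when it is normalized. The left side converts to corners in pixels and divides by the frame size at the end. The right side divides by the input shape first and derives corners from the normalized center and size. A small sketch of both orders, with hypothetical function names, shows they produce the same (y1, x1, y2, x2) row values up to floating-point rounding.

```python
def corners_then_normalize(cx, cy, w, h, width, height):
    """Left-hand order: pixel corners first, then divide by frame size."""
    x1, y1 = cx - w / 2, cy - h / 2
    x2, y2 = cx + w / 2, cy + h / 2
    return (y1 / height, x1 / width, y2 / height, x2 / width)


def normalize_then_corners(cx, cy, w, h, width, height):
    """Right-hand order: normalize center and size, then derive corners."""
    ncx, ncy = cx / width, cy / height
    nw, nh = w / width, h / height
    return (ncy - nh / 2, ncx - nw / 2, ncy + nh / 2, ncx + nw / 2)
```

The sketch covers only the coordinate arithmetic. It does not model which box format is handed to `cv2.dnn.NMSBoxes` on each side, which is the other behavioral difference visible in the hunk.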
@ -232,53 +225,6 @@ def post_process_yolo(output: list[np.ndarray], width: int, height: int) -> np.n
|
|||||||
return __post_process_nms_yolo(output[0], width, height)
|
return __post_process_nms_yolo(output[0], width, height)
|
||||||
|
|
||||||
|
|
||||||
def post_process_yolox(
|
|
||||||
predictions: np.ndarray,
|
|
||||||
width: int,
|
|
||||||
height: int,
|
|
||||||
grids: np.ndarray,
|
|
||||||
expanded_strides: np.ndarray,
|
|
||||||
) -> np.ndarray:
|
|
||||||
predictions[..., :2] = (predictions[..., :2] + grids) * expanded_strides
|
|
||||||
predictions[..., 2:4] = np.exp(predictions[..., 2:4]) * expanded_strides
|
|
||||||
|
|
||||||
# process organized predictions
|
|
||||||
predictions = predictions[0]
|
|
||||||
boxes = predictions[:, :4]
|
|
||||||
scores = predictions[:, 4:5] * predictions[:, 5:]
|
|
||||||
|
|
||||||
boxes_xyxy = np.ones_like(boxes)
|
|
||||||
boxes_xyxy[:, 0] = boxes[:, 0] - boxes[:, 2] / 2
|
|
||||||
boxes_xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2
|
|
||||||
boxes_xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2
|
|
||||||
boxes_xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2
|
|
||||||
|
|
||||||
cls_inds = scores.argmax(1)
|
|
||||||
scores = scores[np.arange(len(cls_inds)), cls_inds]
|
|
||||||
|
|
||||||
indices = cv2.dnn.NMSBoxes(
|
|
||||||
boxes_xyxy, scores, score_threshold=0.4, nms_threshold=0.4
|
|
||||||
)
|
|
||||||
|
|
||||||
detections = np.zeros((20, 6), np.float32)
|
|
||||||
for i, (bbox, confidence, class_id) in enumerate(
|
|
||||||
zip(boxes_xyxy[indices], scores[indices], cls_inds[indices])
|
|
||||||
):
|
|
||||||
if i == 20:
|
|
||||||
break
|
|
||||||
|
|
||||||
detections[i] = [
|
|
||||||
class_id,
|
|
||||||
confidence,
|
|
||||||
bbox[1] / height,
|
|
||||||
bbox[0] / width,
|
|
||||||
bbox[3] / height,
|
|
||||||
bbox[2] / width,
|
|
||||||
]
|
|
||||||
|
|
||||||
return detections
|
|
||||||
|
|
||||||
|
|
||||||
### ONNX Utilities
|
### ONNX Utilities
|
||||||
|
|
||||||
|
|
||||||
|
|||||||