Compare commits

...

3 Commits

Author SHA1 Message Date
Nicolas Mowen
89b54f19c8
Add YOLOv9 support to RKNN (#17791)
* Add yolov9

* Undo

* Update docs for rknn yolov9

* Update docs notes

* Add infernece times table
2025-04-18 16:51:04 -06:00
Nicolas Mowen
e8883a2a2e
Fix yolox docs (#17789) 2025-04-18 16:15:55 -05:00
Nicolas Mowen
1cdc9b6097
Implement YOLOx for RKNN (#17788)
* Implement yolox rknn inference and post processing

* rework docs
2025-04-18 14:44:02 -06:00
5 changed files with 203 additions and 67 deletions

View File

@ -697,8 +697,8 @@ model:
model_type: yolox
width: 416 # <--- should match the imgsize set during model export
height: 416 # <--- should match the imgsize set during model export
input_tensor: nchw_denorm
input_dtype: float
input_tensor: nchw
input_dtype: float_denorm
path: /config/model_cache/yolox_tiny.onnx
labelmap_path: /labelmap/coco-80.txt
```
@ -815,62 +815,7 @@ This implementation uses the [Rockchip's RKNN-Toolkit2](https://github.com/airoc
### Prerequisites
Make sure to follow the [Rockchip specific installation instrucitions](/frigate/installation#rockchip-platform).
### Configuration
This `config.yml` shows all relevant options to configure the detector and explains them. All values shown are the default values (except for two). Lines that are required at least to use the detector are labeled as required, all other lines are optional.
```yaml
detectors: # required
rknn: # required
type: rknn # required
# number of NPU cores to use
# 0 means choose automatically
# increase for better performance if you have a multicore NPU e.g. set to 3 on rk3588
num_cores: 0
model: # required
# name of model (will be automatically downloaded) or path to your own .rknn model file
# possible values are:
# - deci-fp16-yolonas_s
# - deci-fp16-yolonas_m
# - deci-fp16-yolonas_l
# - /config/model_cache/your_custom_model.rknn
path: deci-fp16-yolonas_s
# width and height of detection frames
width: 320
height: 320
# pixel format of detection frame
# default value is rgb but yolo models usually use bgr format
input_pixel_format: bgr # required
# shape of detection frame
input_tensor: nhwc
# needs to be adjusted to model, see below
labelmap_path: /labelmap.txt # required
```
The correct labelmap must be loaded for each model. If you use a custom model (see notes below), you must make sure to provide the correct labelmap. The table below lists the correct paths for the bundled models:
| `path` | `labelmap_path` |
| --------------------- | --------------------- |
| deci-fp16-yolonas\_\* | /labelmap/coco-80.txt |
### Choosing a model
:::warning
The pre-trained YOLO-NAS weights from DeciAI are subject to their license and can't be used commercially. For more information, see: https://docs.deci.ai/super-gradients/latest/LICENSE.YOLONAS.html
:::
The inference time was determined on a rk3588 with 3 NPU cores.
| Model | Size in mb | Inference time in ms |
| ------------------- | ---------- | -------------------- |
| deci-fp16-yolonas_s | 24 | 25 |
| deci-fp16-yolonas_m | 62 | 35 |
| deci-fp16-yolonas_l | 81 | 45 |
Make sure to follow the [Rockchip specific installation instructions](/frigate/installation#rockchip-platform).
:::tip
@ -883,9 +828,94 @@ $ cat /sys/kernel/debug/rknpu/load
:::
### Supported Models
This `config.yml` shows all relevant options to configure the detector and explains them. All values shown are the default values (except for two). Lines that are required at least to use the detector are labeled as required, all other lines are optional.
```yaml
detectors: # required
rknn: # required
type: rknn # required
# number of NPU cores to use
# 0 means choose automatically
# increase for better performance if you have a multicore NPU e.g. set to 3 on rk3588
num_cores: 0
```
The inference time was determined on a rk3588 with 3 NPU cores.
| Model | Size in mb | Inference time in ms |
| ------------------- | ---------- | -------------------- |
| deci-fp16-yolonas_s | 24 | 25 |
| deci-fp16-yolonas_m | 62 | 35 |
| deci-fp16-yolonas_l | 81 | 45 |
| yolov9_tiny | 8 | 35 |
| yolox_nano | 3 | 16 |
| yolox_tiny | 6 | 20 |
- All models are automatically downloaded and stored in the folder `config/model_cache/rknn_cache`. After upgrading Frigate, you should remove older models to free up space.
- You can also provide your own `.rknn` model. You should not save your own models in the `rknn_cache` folder, store them directly in the `model_cache` folder or another subfolder. To convert a model to `.rknn` format see the `rknn-toolkit2` (requires a x86 machine). Note, that there is only post-processing for the supported models.
#### YOLO-NAS
```yaml
model: # required
# name of model (will be automatically downloaded) or path to your own .rknn model file
# possible values are:
# - deci-fp16-yolonas_s
# - deci-fp16-yolonas_m
# - deci-fp16-yolonas_l
# your yolonas_model.rknn
path: deci-fp16-yolonas_s
model_type: yolonas
width: 320
height: 320
input_pixel_format: bgr
input_tensor: nhwc
labelmap_path: /labelmap/coco-80.txt
```
:::warning
The pre-trained YOLO-NAS weights from DeciAI are subject to their license and can't be used commercially. For more information, see: https://docs.deci.ai/super-gradients/latest/LICENSE.YOLONAS.html
:::
#### YOLO (v9)
```yaml
model: # required
# name of model (will be automatically downloaded) or path to your own .rknn model file
# possible values are:
# - yolov9-t
# - yolov9-s
# your yolo_model.rknn
path: /config/model_cache/rknn_cache/yolov9-t.rknn
model_type: yolo-generic
width: 320
height: 320
input_tensor: nhwc
input_dtype: float
labelmap_path: /labelmap/coco-80.txt
```
#### YOLOx
```yaml
model: # required
# name of model (will be automatically downloaded) or path to your own .rknn model file
# possible values are:
# - yolox_nano
# - yolox_tiny
# your yolox_model.rknn
path: yolox_tiny
model_type: yolox
width: 416
height: 416
input_tensor: nhwc
labelmap_path: /labelmap/coco-80.txt
```
### Converting your own onnx model to rknn format
To convert a onnx model to the rknn format using the [rknn-toolkit2](https://github.com/airockchip/rknn-toolkit2/) you have to:

View File

@ -165,6 +165,12 @@ Frigate supports hardware video processing on all Rockchip boards. However, hard
- RK3576
- RK3588
| Name | YOLOv9 Inference Time | YOLO-NAS Inference Time | YOLOx Inference Time |
| --------------- | --------------------- | --------------------------- | ------------------------- |
| rk3588 3 cores | ~ 35 ms | small: ~ 20 ms med: ~ 30 ms | nano: 18 ms tiny: 20 ms |
| rk3566 1 core | | small: ~ 96 ms | |
The inference time of a rk3588 with all 3 cores enabled is typically 25-30 ms for yolo-nas s.
## What does Frigate use the CPU for and what does it use a detector for? (ELI5 Version)

View File

@ -24,7 +24,7 @@ class DetectionApi(ABC):
def detect_raw(self, tensor_input):
pass
def calculate_grids_strides(self) -> None:
def calculate_grids_strides(self, expanded=True) -> None:
grids = []
expanded_strides = []
@ -35,10 +35,23 @@ class DetectionApi(ABC):
for hsize, wsize, stride in zip(hsizes, wsizes, strides):
xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))
grid = np.stack((xv, yv), 2).reshape(1, -1, 2)
grids.append(grid)
shape = grid.shape[:2]
expanded_strides.append(np.full((*shape, 1), stride))
self.grids = np.concatenate(grids, 1)
self.expanded_strides = np.concatenate(expanded_strides, 1)
if expanded:
grid = np.stack((xv, yv), 2).reshape(1, -1, 2)
grids.append(grid)
shape = grid.shape[:2]
expanded_strides.append(np.full((*shape, 1), stride))
else:
xv = xv.reshape(1, 1, hsize, wsize)
yv = yv.reshape(1, 1, hsize, wsize)
grids.extend(np.concatenate((xv, yv), axis=1).tolist())
expanded_strides.extend(
np.array([stride, stride]).reshape(1, 2, 1, 1).tolist()
)
if expanded:
self.grids = np.concatenate(grids, 1)
self.expanded_strides = np.concatenate(expanded_strides, 1)
else:
self.grids = grids
self.expanded_strides = expanded_strides

View File

@ -4,12 +4,14 @@ import re
import urllib.request
from typing import Literal
import cv2
import numpy as np
from pydantic import Field
from frigate.const import MODEL_CACHE_DIR
from frigate.detectors.detection_api import DetectionApi
from frigate.detectors.detector_config import BaseDetectorConfig, ModelTypeEnum
from frigate.util.model import post_process_yolo
logger = logging.getLogger(__name__)
@ -17,7 +19,10 @@ DETECTOR_KEY = "rknn"
supported_socs = ["rk3562", "rk3566", "rk3568", "rk3576", "rk3588"]
supported_models = {ModelTypeEnum.yolonas: "^deci-fp16-yolonas_[sml]$"}
supported_models = {
ModelTypeEnum.yolonas: "^deci-fp16-yolonas_[sml]$",
ModelTypeEnum.yolox: None,
}
model_cache_dir = os.path.join(MODEL_CACHE_DIR, "rknn_cache/")
@ -41,6 +46,9 @@ class Rknn(DetectionApi):
model_props = self.parse_model_input(model_path, soc)
if self.detector_config.model.model_type == ModelTypeEnum.yolox:
self.calculate_grids_strides(expanded=False)
if model_props["preset"]:
config.model.model_type = model_props["model_type"]
@ -199,9 +207,88 @@ class Rknn(DetectionApi):
return np.resize(results, (20, 6))
def post_process_yolox(
self,
predictions: list[np.ndarray],
grids: np.ndarray,
expanded_strides: np.ndarray,
) -> np.ndarray:
def sp_flatten(_in: np.ndarray):
ch = _in.shape[1]
_in = _in.transpose(0, 2, 3, 1)
return _in.reshape(-1, ch)
boxes, scores, classes_conf = [], [], []
input_data = [
_in.reshape([1, -1] + list(_in.shape[-2:])) for _in in predictions
]
for i in range(len(input_data)):
unprocessed_box = input_data[i][:, :4, :, :]
box_xy = unprocessed_box[:, :2, :, :]
box_wh = np.exp(unprocessed_box[:, 2:4, :, :]) * expanded_strides[i]
box_xy += grids[i]
box_xy *= expanded_strides[i]
box = np.concatenate((box_xy, box_wh), axis=1)
# Convert [c_x, c_y, w, h] to [x1, y1, x2, y2]
xyxy = np.copy(box)
xyxy[:, 0, :, :] = box[:, 0, :, :] - box[:, 2, :, :] / 2 # top left x
xyxy[:, 1, :, :] = box[:, 1, :, :] - box[:, 3, :, :] / 2 # top left y
xyxy[:, 2, :, :] = box[:, 0, :, :] + box[:, 2, :, :] / 2 # bottom right x
xyxy[:, 3, :, :] = box[:, 1, :, :] + box[:, 3, :, :] / 2 # bottom right y
boxes.append(xyxy)
scores.append(input_data[i][:, 4:5, :, :])
classes_conf.append(input_data[i][:, 5:, :, :])
# flatten data
boxes = np.concatenate([sp_flatten(_v) for _v in boxes])
classes_conf = np.concatenate([sp_flatten(_v) for _v in classes_conf])
scores = np.concatenate([sp_flatten(_v) for _v in scores])
# reshape and filter boxes
box_confidences = scores.reshape(-1)
class_max_score = np.max(classes_conf, axis=-1)
classes = np.argmax(classes_conf, axis=-1)
_class_pos = np.where(class_max_score * box_confidences >= 0.4)
scores = (class_max_score * box_confidences)[_class_pos]
boxes = boxes[_class_pos]
classes = classes[_class_pos]
# run nms
indices = cv2.dnn.NMSBoxes(
bboxes=boxes,
scores=scores,
score_threshold=0.4,
nms_threshold=0.4,
)
results = np.zeros((20, 6), np.float32)
if len(indices) > 0:
for i, idx in enumerate(indices.flatten()[:20]):
box = boxes[idx]
results[i] = [
classes[idx],
scores[idx],
box[1] / self.height,
box[0] / self.width,
box[3] / self.height,
box[2] / self.width,
]
return results
def post_process(self, output):
if self.detector_config.model.model_type == ModelTypeEnum.yolonas:
return self.post_process_yolonas(output)
elif self.detector_config.model.model_type == ModelTypeEnum.yologeneric:
return post_process_yolo(output, self.width, self.height)
elif self.detector_config.model.model_type == ModelTypeEnum.yolox:
return self.post_process_yolox(output, self.grids, self.expanded_strides)
else:
raise ValueError(
f'Model type "{self.detector_config.model.model_type}" is currently not supported.'

View File

@ -180,7 +180,7 @@ def __post_process_multipart_yolo(
x2 / width,
]
return np.array(results, dtype=np.float32)
return results
def __post_process_nms_yolo(predictions: np.ndarray, width, height) -> np.ndarray: