RK Platform NPU SDK

From ESS-WIKI
Revision as of 06:59, 8 August 2023 by Yunjin.jiang (talk | contribs)

Preface

NPU Introduction

RK3568

  • Neural network acceleration engine with processing performance up to 0.8 TOPS
  • Supports INT4, INT8, INT16, FP16, BF16 and TF32 operations
  • Supports deep learning frameworks: TensorFlow, Caffe, TFLite, PyTorch, ONNX, Android NN, etc.
  • One isolated voltage domain to support DVFS

RK3588

  • Neural network acceleration engine with processing performance up to 6 TOPS
  • Includes three NPU cores, which can work together as three cores, as two cores, or independently
  • Supports INT4, INT8, INT16, FP16, BF16 and TF32 operations
  • Embedded 384KBx3 internal buffer
  • Multi-task, multi-scenario in parallel
  • Supports deep learning frameworks: TensorFlow, Caffe, TFLite, PyTorch, ONNX, Android NN, etc.
  • One isolated voltage domain to support DVFS


RKNN SDK

The RKNN SDK (Password: a887) includes two parts:

  • rknn-toolkit2
  • rknpu2

├── rknn-toolkit2
│   ├── doc
│   ├── examples
│   ├── packages
│   └── rknn_toolkit_lite2
└── rknpu2
    ├── doc
    ├── examples
    └── runtime

rknpu2

'rknpu2' includes documents (rknpu2/doc) and examples (rknpu2/examples) to help you quickly develop AI applications that use RKNN models (*.rknn).

Models from other frameworks (e.g. Caffe, TensorFlow) can be converted to RKNN models with 'rknn-toolkit2'.
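
The conversion step could look roughly like the sketch below. It assumes the rknn-toolkit2 Python package from rknn-toolkit2/packages is installed on a PC and that it exposes the RKNN class with config, load_onnx, build, export_rknn and release. The helper name convert_onnx_to_rknn, the rknn_cls parameter and all file names are hypothetical and only serve the illustration. ONNX is used here as one of the supported frameworks; check rknn-toolkit2/doc for the exact options of your toolkit version.

```python
def convert_onnx_to_rknn(onnx_path, rknn_path, dataset_path,
                         target_platform="rk3568", rknn_cls=None):
    """Convert an ONNX model to a quantized *.rknn model (illustrative sketch).

    rknn_cls is a hypothetical hook that lets a caller pass in the RKNN class;
    when it is None the class is imported from the rknn-toolkit2 package.
    """
    if rknn_cls is None:
        from rknn.api import RKNN  # assumption: rknn-toolkit2 is installed
        rknn_cls = RKNN

    rknn = rknn_cls()
    try:
        # Select the target SoC, e.g. "rk3568" or "rk3588".
        rknn.config(target_platform=target_platform)

        # Each step is expected to return 0 on success.
        steps = [
            ("load_onnx", lambda: rknn.load_onnx(model=onnx_path)),
            # dataset_path points to a text file listing calibration images.
            ("build", lambda: rknn.build(do_quantization=True,
                                         dataset=dataset_path)),
            ("export_rknn", lambda: rknn.export_rknn(rknn_path)),
        ]
        for name, step in steps:
            if step() != 0:
                raise RuntimeError(f"{name} failed")
    finally:
        # Always free the toolkit's resources, even after a failure.
        rknn.release()
    return rknn_path


# Example call (hypothetical file names):
# convert_onnx_to_rknn("model.onnx", "model.rknn", "dataset.txt", "rk3588")
```

The resulting *.rknn file is what the rknpu2 examples load on the board.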

The RKNN API library file librknnrt.so and the header file rknn_api.h can be found in rknpu2/runtime.


The released BSP and images already include the NPU driver and runtime libraries.

Here are two examples built into the released images:

1. rknn_ssd_demo

cd /tools/test/adv/npu2/rknn_ssd_demo
./rknn_ssd_demo model/ssd_inception_v2.rknn model/bus.jpg


resize 640 640 to 300 300
Loading model ...
rknn_init ...
model input num: 1, output num: 2
input tensors:
  index=0, name=Preprocessor/sub:0, n_dims=4, dims=[1, 300, 300, 3], n_elems=270000, size=270000, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=0, scale=0.007843
output tensors:
  index=0, name=concat:0, n_dims=4, dims=[1, 1917, 1, 4], n_elems=7668, size=7668, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=50, scale=0.090787
  index=1, name=concat_1:0, n_dims=4, dims=[1, 1917, 91, 1], n_elems=174447, size=174447, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=58, scale=0.140090
rknn_run
loadLabelName
ssd - loadLabelName ./model/coco_labels_list.txt
loadBoxPriors
person @ (106 245 216 535) 0.994422
bus @ (87 132 568 432) 0.991533
person @ (213 231 288 511) 0.843047
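
The tensor attributes in the log above can be read as follows. This is a minimal sketch, assuming the usual affine quantization scheme in which a real value is recovered as (q - zp) * scale. The tensor values are copied from the log; the helper names dequantize and quantize are made up for this illustration and are not part of the RKNN API.

```python
from math import prod

# Tensor attributes copied from the rknn_ssd_demo log.
TENSORS = {
    "Preprocessor/sub:0": {"dims": [1, 300, 300, 3], "n_elems": 270000,
                           "zp": 0, "scale": 0.007843},
    "concat:0": {"dims": [1, 1917, 1, 4], "n_elems": 7668,
                 "zp": 50, "scale": 0.090787},
    "concat_1:0": {"dims": [1, 1917, 91, 1], "n_elems": 174447,
                   "zp": 58, "scale": 0.140090},
}


def dequantize(q, zp, scale):
    """Map a quantized INT8 value back to a real number (affine scheme)."""
    return (q - zp) * scale


def quantize(x, zp, scale):
    """Map a real number to INT8, rounding and clamping to [-128, 127]."""
    q = round(x / scale) + zp
    return max(-128, min(127, q))


for name, t in TENSORS.items():
    # n_elems is the product of dims; size equals n_elems in the log
    # because each INT8 element occupies one byte.
    assert prod(t["dims"]) == t["n_elems"]
    lo = dequantize(-128, t["zp"], t["scale"])
    hi = dequantize(127, t["zp"], t["scale"])
    print(f"{name}: representable range [{lo:.3f}, {hi:.3f}]")
```

The input scale of 0.007843 is close to 1/127.5, which is consistent with pixel values normalized to roughly [-1, 1] before quantization.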

2. rknn_mobilenet_demo

cd /tools/test/adv/npu2/rknn_mobilenet_demo
./rknn_mobilenet_demo model/mobilenet_v1.rknn model/cat_224x224.jpg



rknn-toolkit2

TBD