ROM-6881 NPU
Preface
NPU SDK Introduction
RK3568
- Neural network acceleration engine with processing performance up to 0.8 TOPS
- Supports integer 4, integer 8, integer 16, float 16, Bfloat 16 and tf32 operations
- Supports deep learning frameworks: TensorFlow, Caffe, TFLite, PyTorch, ONNX, Android NN, etc.
- One isolated voltage domain to support DVFS
RK3588
- Neural network acceleration engine with processing performance up to 6 TOPS
- Includes three NPU cores, which can work together as a triple-core or dual-core group, or work independently
- Supports integer 4, integer 8, integer 16, float 16, Bfloat 16 and tf32 operations
- Embedded 384KBx3 internal buffer
- Multi-task, multi-scenario in parallel
- Supports deep learning frameworks: TensorFlow, Caffe, TFLite, PyTorch, ONNX, Android NN, etc.
- One isolated voltage domain to support DVFS
AI Program Development
AI program development for the Rockchip platform has two main stages:
- AI model conversion.
- Program development.
We provide a PC tool (rknn-toolkit2) for AI model conversion, and an AI API (rknpu2) together with the Qt and GCC toolchains for developing AI applications.
The GCC toolchain can be obtained from the released image.
The Qt toolchain can be obtained from the Advantech WIKI.
For more details about the RKNN C API, see "rknpu2/doc/Rockchip_RKNPU_UserGuide_RKNN_API_V*.pdf".
For more details about the RKNN Python API, see "rknn-toolkit2/rknn_toolkit_lite2/docs/Rockchip_UserGuide_RKNN_Toolkit_Lite2_V*.pdf".
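The RKNN Python API referred to above (RKNN Toolkit Lite2) runs an already converted *.rknn model on the board. The following is a minimal sketch of the usual load, init, infer, release flow, not a copy of any shipped example. The function name run_on_npu and the lite_cls parameter are hypothetical; lite_cls exists only so that the flow can be exercised on a machine without the NPU runtime.

```python
def run_on_npu(rknn_path, input_tensor, lite_cls=None):
    """Run one inference with RKNN Toolkit Lite2 and return the output list.

    rknn_path    : path to a converted *.rknn model
    input_tensor : input data already in the layout the model expects
    lite_cls     : hypothetical hook for testing; defaults to RKNNLite
    """
    if lite_cls is None:
        # Only available where rknn_toolkit_lite2 is installed (the board).
        from rknnlite.api import RKNNLite as lite_cls

    lite = lite_cls()
    try:
        if lite.load_rknn(rknn_path) != 0:
            raise RuntimeError("load_rknn failed for " + rknn_path)
        if lite.init_runtime() != 0:
            raise RuntimeError("init_runtime failed")
        # inference() takes a list of inputs and returns a list of outputs.
        return lite.inference(inputs=[input_tensor])
    finally:
        # Always free NPU resources, even when a step above fails.
        lite.release()
```

On the board, a call would look like run_on_npu("mobilenet_v1.rknn", img), where img is an image array prepared by the application. Consult the Toolkit Lite2 user guide for the options accepted by init_runtime on multi-core devices.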
RKNN SDK
The RKNN SDK (Baidu password: a887) includes two parts:
- rknn-toolkit2
- rknpu2
├── rknn-toolkit2
│   ├── doc
│   ├── examples
│   ├── packages
│   └── rknn_toolkit_lite2
└── rknpu2
    ├── doc
    ├── examples
    └── runtime
rknpu2
'rknpu2' includes documents (rknpu2/doc) and examples (rknpu2/examples) that help you quickly develop AI applications using RKNN models (*.rknn).
Other models (e.g. Caffe, TensorFlow) can be converted to RKNN models with 'rknn-toolkit2'.
The RKNN API library file librknnrt.so and the header file rknn_api.h can be found in rknpu2/runtime.
The released BSP and images already include the NPU driver and runtime libraries.
Two examples are built into the released images:
rknn_ssd_demo
cd /tools/test/adv/npu2/rknn_ssd_demo
./rknn_ssd_demo model/ssd_inception_v2.rknn model/bus.jpg
resize 640 640 to 300 300
Loading model ...
rknn_init ...
model input num: 1, output num: 2
input tensors:
  index=0, name=Preprocessor/sub:0, n_dims=4, dims=[1, 300, 300, 3], n_elems=270000, size=270000, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=0, scale=0.007843
output tensors:
  index=0, name=concat:0, n_dims=4, dims=[1, 1917, 1, 4], n_elems=7668, size=7668, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=50, scale=0.090787
  index=1, name=concat_1:0, n_dims=4, dims=[1, 1917, 91, 1], n_elems=174447, size=174447, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=58, scale=0.140090
rknn_run
loadLabelName
ssd - loadLabelName ./model/coco_labels_list.txt
loadBoxPriors
person @ (106 245 216 535) 0.994422
bus @ (87 132 568 432) 0.991533
person @ (213 231 288 511) 0.843047
rknn_mobilenet_demo
cd /tools/test/adv/npu2/rknn_mobilenet_demo
./rknn_mobilenet_demo model/mobilenet_v1.rknn model/cat_224x224.jpg
model input num: 1, output num: 1
input tensors:
  index=0, name=input, n_dims=4, dims=[1, 224, 224, 3], n_elems=150528, size=150528, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=0, scale=0.007812
output tensors:
  index=0, name=MobilenetV1/Predictions/Reshape_1, n_dims=2, dims=[1, 1001, 0, 0], n_elems=1001, size=1001, fmt=UNDEFINED, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003906
rknn_run
 --- Top5 ---
283: 0.468750
282: 0.242188
286: 0.105469
464: 0.089844
264: 0.019531
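Both demo logs list each tensor with qnt_type=AFFINE, a zero point (zp) and a scale. As a minimal sketch, assuming the usual affine mapping real = (q - zp) * scale, raw INT8 values can be converted to floats and back as shown below. The helper names and the raw value -8 are illustrative and are not taken from the demo source code.

```python
def dequantize_affine(q, zero_point, scale):
    """Map a raw integer tensor value to a float: real = (q - zp) * scale."""
    return (q - zero_point) * scale


def quantize_affine(real, zero_point, scale, qmin=-128, qmax=127):
    """Inverse mapping, rounded and clamped to the INT8 range."""
    q = round(real / scale) + zero_point
    return max(qmin, min(qmax, q))


# Output tensor of rknn_mobilenet_demo: zp=-128, scale=0.003906
# (the scale is the rounded value printed in the log).
ZP, SCALE = -128, 0.003906

# Hypothetical raw INT8 score. -8 is chosen because it maps close to the
# top-1 probability printed in the log (0.468750).
raw_score = -8
prob = dequantize_affine(raw_score, ZP, SCALE)
print("raw", raw_score, "-> probability", round(prob, 4))
```

With zp=-128 the most negative INT8 value maps to 0.0, which suits a softmax output whose values are never negative.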
rknn-toolkit2
Tool Introduction
RKNN-Toolkit2 is a development kit for model conversion, inference, and performance evaluation on a PC. Its Python interface provides the following functions:
- Model conversion: converts Caffe, TensorFlow, TensorFlow Lite, ONNX, Darknet, and PyTorch models to RKNN models, and supports RKNN model import and export, so that the models can later be used on Rockchip NPU platforms.
- Quantization: converts a float model to a quantized model. The currently supported method is asymmetric quantization (asymmetric_quantized-8); hybrid quantization is also supported.
- Model inference: simulates the NPU on the PC to run an RKNN model and get the inference results, or distributes the RKNN model to a specified NPU device and gets the inference results from it.
- Performance and memory evaluation: distributes the RKNN model to a specified NPU device and evaluates the model's performance and memory consumption on the actual device.
- Quantization error analysis: reports the Euclidean or cosine distance between each layer's inference results before and after quantization. This helps to analyze where quantization error arises and suggests ways to improve the accuracy of the quantized model.
- Model encryption: encrypts the whole RKNN model with a specified encryption method.
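To make the quantization and error-analysis items above more concrete, here is a small, self-contained sketch of 8-bit asymmetric quantization followed by the two distance measures. It is not RKNN-Toolkit2's implementation; the function names, the sample layer values, and the exact metric definitions are assumptions chosen for illustration.

```python
import math


def asymmetric_quant_params(values, qmin=-128, qmax=127):
    """Derive scale and zero point so that [min, max] spans the INT8 range."""
    lo = min(min(values), 0.0)  # the representable range must contain 0
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point


def fake_quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Quantize to INT8 and dequantize again, exposing the rounding error."""
    out = []
    for v in values:
        q = max(qmin, min(qmax, round(v / scale) + zero_point))
        out.append((q - zero_point) * scale)
    return out


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 0.0


float_layer = [-1.0, -0.25, 0.0, 0.4, 1.55]  # made-up layer output
scale, zp = asymmetric_quant_params(float_layer)
quant_layer = fake_quantize(float_layer, scale, zp)
print("scale:", scale, "zero point:", zp)
print("euclidean:", euclidean_distance(float_layer, quant_layer))
print("cosine:", cosine_distance(float_layer, quant_layer))
```

A layer whose distances are much larger than those of its neighbours is a natural candidate for being kept in higher precision, which is the idea behind hybrid quantization.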
System Dependency
OS version: Ubuntu 18.04 (x64) / Ubuntu 20.04 (x64)
Python version: 3.6 / 3.8
Python library dependencies:
Python 3.6
cat rknn-toolkit2/doc/requirements_cp36*.txt
# if install failed, please change the pip source to 'https://mirror.baidu.com/pypi/simple'

# base deps
numpy==1.19.5
protobuf==3.12.2
flatbuffers==1.12

# utils
requests==2.27.1
psutil==5.9.0
ruamel.yaml==0.17.4
scipy==1.5.4
tqdm==4.64.0
bfloat16==1.1
opencv-python==4.5.5.64

# base
onnx==1.9.0
onnxoptimizer==0.2.7
onnxruntime==1.10.0
torch==1.10.1
torchvision==0.11.2
tensorflow==2.6.2
Python 3.8
cat rknn-toolkit2/doc/requirements_cp38*.txt
# if install failed, please change the pip source to 'https://mirror.baidu.com/pypi/simple'

# base deps
numpy==1.19.5
protobuf==3.12.2
flatbuffers==1.12

# utils
requests==2.27.1
psutil==5.9.0
ruamel.yaml==0.17.4
scipy==1.5.4
tqdm==4.64.0
bfloat16==1.1
opencv-python==4.5.5.64

# base
onnx==1.9.0
onnxoptimizer==0.2.7
onnxruntime==1.10.0
torch==1.10.1
torchvision==0.11.2
Installation
Create a virtualenv environment. If several Python versions are installed on the system, using virtualenv to manage the Python environment is recommended. Taking Python 3.6 as an example:
- Install virtualenv, Python 3.6 and pip3
sudo apt-get install virtualenv
sudo apt-get install python3 python3-dev python3-pip
- Install dependencies
sudo apt-get install libxslt1-dev zlib1g zlib1g-dev libglib2.0-0 libsm6 libgl1-mesa-glx libprotobuf-dev gcc
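- Install the packages listed in requirements_cp36-*.txt. The bfloat16 entry is commented out for the first pass and restored for the second, so that the remaining packages are installed before bfloat16.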
virtualenv -p /usr/bin/python3 venv
source venv/bin/activate
sed -i 's|bfloat16==|#bfloat16==|g' rknn-toolkit2/doc/requirements_cp36-*.txt
pip3 install -r rknn-toolkit2/doc/requirements_cp36-*.txt
sed -i 's|#bfloat16==|bfloat16==|g' rknn-toolkit2/doc/requirements_cp36-*.txt
pip3 install -r rknn-toolkit2/doc/requirements_cp36-*.txt
- Install RKNN-Toolkit2
pip3 install rknn-toolkit2/packages/rknn_toolkit2*cp36*.whl
- Check whether RKNN-Toolkit2 was installed successfully (press Ctrl+D to exit the Python shell)
python3
>>> from rknn.api import RKNN
>>>
If the installation succeeded, no error message is printed.
Here is an example of the output when the import fails:
>>> from rknn.api import RKNN
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'rknn'
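The same check can be scripted instead of typed into the interactive shell. The sketch below uses only the Python standard library; the helper name is_module_installed is hypothetical. It tests the top-level package name rknn, which is the name reported in the ImportError above.

```python
import importlib.util


def is_module_installed(name):
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


# Run inside the activated virtualenv so that the right site-packages is used.
for module in ("rknn",):
    status = "found" if is_module_installed(module) else "NOT found"
    print(module + ": " + status)
```

If the module is reported as not found, check that the virtualenv is activated and that the wheel matching the Python version (cp36 or cp38) was installed.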
Model Conversion Demo
The following example shows how to convert a TFLite model (mobilenet_v1_1.0_224.tflite) to an RKNN model (mobilenet_v1.rknn) on a PC.
cd rknn-toolkit2/examples/tflite/mobilenet_v1
python3 test.py
W __init__: rknn-toolkit2 version: 1.4.0-22dcfef4
--> Config model
W config: 'target_platform' is None, use rk3566 as default, Please set according to the actual platform!
done
--> Loading model
... ...
done
--> Building model
Analysing : 100%|█████████████████████████████████████████████████| 58/58 [00:00<00:00, 2739.15it/s]
Quantizating : 100%|████████████████████████████████████████████████| 58/58 [00:00<00:00, 95.72it/s]
... ...
I rknn buiding done
done
--> Export rknn model
done
--> Init runtime environment
W init_runtime: Target is None, use simulator!
done
--> Running model
Analysing : 100%|█████████████████████████████████████████████████| 60/60 [00:00<00:00, 2740.54it/s]
Preparing : 100%|██████████████████████████████████████████████████| 60/60 [00:00<00:00, 136.33it/s]
mobilenet_v1
-----TOP 5-----
[156]: 0.9345703125
[155]: 0.0570068359375
[205]: 0.00429534912109375
[284]: 0.003116607666015625
[285]: 0.00017178058624267578
done
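The log above follows a config, load, build, export sequence. As a rough sketch of that sequence with the rknn.api.RKNN class, and not a copy of the shipped test.py, the conversion could be wrapped as shown below. The function name, the rknn_cls test hook, the mean/std values of 128, and the dataset file argument are assumptions. Passing target_platform explicitly addresses the warning in the log that rk3566 is used by default.

```python
def convert_tflite_to_rknn(tflite_path, rknn_path, dataset_path,
                           target_platform="rk3568", rknn_cls=None):
    """Convert a TFLite model to an RKNN model and return the output path.

    dataset_path : text file listing calibration images for quantization
    rknn_cls     : hypothetical hook for testing; defaults to rknn.api.RKNN
    """
    if rknn_cls is None:
        # Requires rknn-toolkit2 to be installed on the PC.
        from rknn.api import RKNN as rknn_cls

    rknn = rknn_cls()
    try:
        # Assumed preprocessing values; use the ones your model was trained with.
        rknn.config(mean_values=[128, 128, 128],
                    std_values=[128, 128, 128],
                    target_platform=target_platform)
        if rknn.load_tflite(model=tflite_path) != 0:
            raise RuntimeError("load_tflite failed")
        if rknn.build(do_quantization=True, dataset=dataset_path) != 0:
            raise RuntimeError("build failed")
        if rknn.export_rknn(rknn_path) != 0:
            raise RuntimeError("export_rknn failed")
    finally:
        # Release toolkit resources whether or not the conversion succeeded.
        rknn.release()
    return rknn_path
```

For an RK3588 board, the call would be convert_tflite_to_rknn("mobilenet_v1_1.0_224.tflite", "mobilenet_v1.rknn", "dataset.txt", target_platform="rk3588"), where dataset.txt is a placeholder name for the calibration list.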