Commit 2cdf5fe

* review
1 parent f1b369f commit 2cdf5fe

5 files changed

+81
-73
lines changed

docs/how_to/deploy/adreno.rst

Lines changed: 17 additions & 58 deletions
@@ -127,7 +127,12 @@ Deploying the compiled model here require use some tools on host as well as on t
 TVM has simplified user friendly command line based tools as well as
 developer centric python API interface for various steps like auto tuning, building and deploying.
 
-TVM compilation process for remote devices has multiple stages listed below.
+
+|Android deployment pipeline|
+
+*Fig.2 Build and Deployment pipeline on Adreno devices*
+
+The figure above demonstrates a generalized pipeline for various stages listed below.
 
 **Model import:**
 At this stage we import a model from well known frameworks like Tensorflow, PyTorch, ONNX ...etc.
@@ -150,7 +155,7 @@ At this stage we run the TVM compilation output on the target. Deployment is pos
 environment using RPC Setup and also using TVM's native tool which is native binary cross compiled for Android.
 At this stage we can run the compiled model on Android target and unit test output correctness and performance aspects.
 
-**Aplication Integration:**
+**Application Integration:**
 This stage is all about integrating TVM compiled model in applications. Here we discuss about
 interfacing tvm runtime from Android (cpp native environment or from JNI) for setting input and getting output.
 
@@ -234,7 +239,6 @@ Below command will configure the build the host compiler
 cd build
 cp ../cmake/config.cmake .
 
-echo set\(USE_OPENCL ON\) >> config.cmake
 echo set\(USE_RPC ON\) >> config.cmake
 echo set\(USE_GRAPH_EXECUTOR ON\) >> config.cmake
 echo set\(USE_LIBBACKTRACE AUTO\) >> config.cmake
@@ -258,7 +262,7 @@ Finally we can export python path as
 
 ::
 
-export PYTHONPATH=$PWD:/python
+export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}
 python3 -c "import tvm" # Verify tvm python package
 
 
@@ -274,7 +278,6 @@ Target build require Android NDK to be installed.
 mkdir -p build-adreno
 cd build-adreno
 cp ../cmake/config.cmake .
-echo set\(USE_MICRO OFF\) >> config.cmake
 echo set\(USE_OPENCL ON\) >> config.cmake
 echo set\(USE_RPC ON\) >> config.cmake
 echo set\(USE_CPP_RPC ON\) >> config.cmake
@@ -342,73 +345,29 @@ manually and also inside docker using automated tools.
 **Automated RPC Setup:**
 Here we will explain how to setup RPC in docker environment.
 
-Below command launches tracker in docker environment, where docker listens on port 9120.
+Below command launches tracker in docker environment, where tracker listens on port 9190.
 
 ::
 
 ./tests/scripts/ci.py adreno -i # Launch a new shell on the anreno docker
-source tests/scripts/setup-adreno-env.sh -e tracker -p 9120
+source tests/scripts/setup-adreno-env.sh -e tracker -p 9190
 
 Now, the below comand can run TVM RPC on remote android device with id "abcdefgh".
 
 
 ::
 
 ./tests/scripts/ci.py adreno -i # Launch a new shell on adreno docker.
-source tests/scripts/setup-adreno-env.sh -e device -p 9120 -d abcdefgh
+source tests/scripts/setup-adreno-env.sh -e device -p 9190 -d abcdefgh
 
 
 **Manual RPC Setup:**
 
-Below command in manual setup starts the tracker on port 9120
-
-::
-
-python3 -m tvm.exec.rpc_tracker --host "0.0.0.0" --port "9120"
-
-TVM RPC launch on Android device require some environment setup due to Android device is connected via ADB interface and we need to re-route
-TCP/IP communication over ADB interface. Below commands will do necessary setup and run tvm_rpc on remote device.
-
-::
-
-# Set android device to use
-export ANDROID_SERIAL=abcdefgh
-# Create a temporary folder on remote device.
-adb shell "mkdir -p /data/local/tmp/tvm_ci"
-# Copy tvm_rpc and it's dependency to remote device
-adb push build-adreno-target/tvm_rpc /data/local/tmp/tvm_test/tvm_rpc
-adb push build-adreno-target/libtvm_runtime.so /data/local/tmp/tvm_test
-# Forward port 9120 from target to host
-adb reverse tcp:9210 tcp:9120
-# tvm_rpc by default listens on ports starting from 5000 for incoming connections.
-# Hence, reroute connections to these ports on host to remore device.
-adb forward tcp:5000 tcp:5000
-adb forward tcp:5001 tcp:5001
-adb forward tcp:5002 tcp:5002
-# Finally launch rpc_daemon on remote device with identity key as "android"
-adb shell "cd /data/local/tmp/tvm_test; killall -9 tvm_rpc; sleep 2; LD_LIBRARY_PATH=/data/local/tmp/tvm_test/ ./tvm_rpc server --host=0.0.0.0 --port=5000 --port-end=5010 --tracker=127.0.0.1:9120 --key=android"
-
-Upon successfull running this remote device will be available on tracker which can be queried as below.
-
-::
-
-python3 -m tvm.exec.query_rpc_tracker --port 9120
-Tracker address 127.0.0.1:9120
-Server List
-------------------------------
-server-address key
-------------------------------
-127.0.0.1:5000 server:android
-------------------------------
-
-Queue Status
--------------------------------
-key total free pending
--------------------------------
-android 1 1 0
--------------------------------
+Please refer to the tutorial
+`How To Deploy model on Adreno using TVMC <https://tvm.apache.org/docs/how_to/deploy_models/deploy_model_on_adreno.html>`_
+for manual RPC environment setup.
 
-This concludes RPC Setup and we have rpc-tracker available on host 127.0.0.1 (rpc-tracker) and port 9120 (rpc-port).
+This concludes RPC Setup and we have rpc-tracker available on host 127.0.0.1 (rpc-tracker) and port 9190 (rpc-port).
 
 
 .. _commandline_interface:
@@ -431,7 +390,7 @@ Here we use a model from Keras and it uses RPC setup for tuning and finally gene
 resnet50.h5 -o \
 keras-resnet50.log \
 --early-stopping 0 --repeat 30 --rpc-key android \
---rpc-tracker 127.0.0.1:9120 --trials 1024 \
+--rpc-tracker 127.0.0.1:9190 --trials 1024 \
 --tuning-records keras-resnet50-records.log --tuner xgb
 
 **Model Compilation:**
@@ -466,7 +425,7 @@ We can use below tvmc command to deploy on remore target via RPC based setup.
 ::
 
 python3 -m tvm.driver.tvmc run --device="cl" keras-resnet50.tar \
---rpc-key android --rpc-tracker 127.0.0.1:9120 --print-time
+--rpc-key android --rpc-tracker 127.0.0.1:9190 --print-time
 
 tvmc based run has more option to initialize the input in various modes line fill, random ..etc.
 
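
Note on the tracker address: the commands in this file now pass `--rpc-tracker 127.0.0.1:9190`, which matches the default port the gallery scripts in this commit read from `TVM_TRACKER_PORT`. A minimal Python sketch of composing that address from the environment is shown here. The helper name `tracker_address` and the `127.0.0.1` default host are illustrative assumptions, not part of TVM.

```python
import os


def tracker_address(env=None, default_host="127.0.0.1", default_port=9190):
    """Return (host, port, "host:port") for the RPC tracker.

    Hypothetical helper: reads TVM_TRACKER_HOST / TVM_TRACKER_PORT and
    falls back to the defaults used in the documentation commands.
    """
    env = os.environ if env is None else env
    host = env.get("TVM_TRACKER_HOST", default_host)
    port = int(env.get("TVM_TRACKER_PORT", default_port))
    return host, port, f"{host}:{port}"


if __name__ == "__main__":
    # The third element is the string form expected by --rpc-tracker.
    print(tracker_address()[2])
```

Keeping the port in one place avoids the kind of 9120/9190 mismatch this commit cleans up.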

gallery/how_to/deploy_models/deploy_model_on_adreno.py

Lines changed: 49 additions & 12 deletions
@@ -53,11 +53,17 @@
 #
 # adb devices
 #
+# Set the android device to use
+#
+# .. code-block:: bash
+#
+# export ANDROID_SERIAL=<device-hash>
+#
 # Then to upload these two files to the device you should use:
 #
 # .. code-block:: bash
 #
-# adb -s <device_hash> push {libtvm_runtime.so,tvm_rpc} /data/local/tmp
+# adb push {libtvm_runtime.so,tvm_rpc} /data/local/tmp
 #
 # At this moment you will have «libtvm_runtime.so» and «tvm_rpc» on path /data/local/tmp on your device.
 # Sometimes cmake can’t find «libc++_shared.so». Use:
@@ -70,7 +76,7 @@
 #
 # .. code-block:: bash
 #
-# adb -s <device_hash> push libc++_shared.so /data/local/tmp
+# adb push libc++_shared.so /data/local/tmp
 #
 # We are now ready to run the TVM RPC Server.
 # Launch rpc_tracker with following line in 1st console:
@@ -83,12 +89,12 @@
 #
 # .. code-block:: bash
 #
-# adb -s <device_hash> reverse tcp:9190 tcp:9190
-# adb -s <device_hash> forward tcp:9090 tcp:9090
-# adb -s <device_hash> forward tcp:9091 tcp:9091
-# adb -s <device_hash> forward tcp:9092 tcp:9092
-# adb -s <device_hash> forward tcp:9093 tcp:9093
-# adb -s <device_hash> shell LD_LIBRARY_PATH=/data/local/tmp /data/local/tmp/tvm_rpc server --host=0.0.0.0 --port=9090 --tracker=127.0.0.1:9190 --key=android --port-end=9190
+# adb reverse tcp:9190 tcp:9190
+# adb forward tcp:5000 tcp:5000
+# adb forward tcp:5002 tcp:5001
+# adb forward tcp:5003 tcp:5002
+# adb forward tcp:5004 tcp:5003
+# adb shell LD_LIBRARY_PATH=/data/local/tmp /data/local/tmp/tvm_rpc server --host=0.0.0.0 --port=5000 --tracker=127.0.0.1:9190 --key=android --port-end=5100
 #
 # Before proceeding to compile and infer model, specify TVM_TRACKER_HOST and TVM_TRACKER_PORT
 #
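
Note on the forwarding lines in this hunk: `adb forward tcp:<host> tcp:<device>` maps a host port to a device port, so `tcp:5002 tcp:5001` sends host port 5002 to device port 5001. The sketch below generates the command list instead of typing it by hand. It is a hypothetical helper (not part of TVM or adb) and assumes each host port maps to the same port number on the device, which differs from the shifted pairs in the added lines; adjust it if the shifted mapping is intended.

```python
def adb_port_commands(tracker_port=9190, first_port=5000, count=4):
    """Build the adb commands for tracker and RPC server port routing."""
    # Device -> host: lets tvm_rpc on the device reach the tracker on the host.
    commands = [f"adb reverse tcp:{tracker_port} tcp:{tracker_port}"]
    # Host -> device: one identity mapping per RPC server port (assumption).
    for port in range(first_port, first_port + count):
        commands.append(f"adb forward tcp:{port} tcp:{port}")
    return commands


if __name__ == "__main__":
    print("\n".join(adb_port_commands()))
```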
@@ -130,6 +136,10 @@
 from tvm.relay.op.contrib import clml
 from tvm import autotvm
 
+# Below are set of configuration that controls the behaviour of this script like
+# local run or device run, target definitions, dtype setting and auto tuning enablement.
+# Change these settings as needed if required.
+
 # Adreno devices are efficient with float16 compared to float32
 # Given the expected output doesn't effect by lowering precision
 # it's advisable to use lower precision.
@@ -156,7 +166,8 @@
 arch = "arm64"
 target = tvm.target.Target("llvm -mtriple=%s-linux-android" % arch)
 
-# Auto tuning is compute and time taking task, hence disabling for default run. Please enable it if required.
+# Auto tuning is compute intensive and time taking task,
+# hence disabling for default run. Please enable it if required.
 is_tuning = False
 tune_log = "adreno-resnet18.log"
 
@@ -220,6 +231,19 @@
 #################################################################
 # Precisions
 # ----------
+
+# Adreno devices are efficient with float16 compared to float32
+# Given the expected output doesn't effect by lowering precision
+# it's advisable to use lower precision.
+
+# TVM support Mixed Precision through ToMixedPrecision transformation pass.
+# We may need to register precision rules like precision type, accumultation
+# datatype ...etc. for the required operators to override the default settings.
+# The below helper api simplifies the precision conversions across the module.
+# Now it supports dtypes "float16" and "float16_acc32".
+
+# dtype is set to "float16_acc32" in configuration section above.
+
 from tvm.relay.op.contrib import adreno
 
 adreno.convert_to_dtype(mod["main"], dtype)
@@ -236,6 +260,12 @@
 # Prepare TVM Target
 # ------------------
 
+# This generated example running on our x86 server for demonstration.
+
+# To deply and tun on real target over RPC please set :code:`local_demo` to False in above configuration sestion.
+# Also, :code:`test_target` is set to :code:`llvm` as this example to make compatible for x86 demonstration.
+# Please change it to :code:`opencl` or :code:`opencl -device=adreno` for RPC target in configuration above.
+
 if local_demo:
     target = tvm.target.Target("llvm")
 elif test_target.find("opencl"):
@@ -254,6 +284,10 @@
 rpc_tracker_port = int(os.environ.get("TVM_TRACKER_PORT", 9190))
 key = "android"
 
+# Auto tuning is compute intensive and time taking task.
+# It is set to False in above configuration as this script runs in x86 for demonstration.
+# Please to set :code:`is_tuning` to True to enable auto tuning.
+
 if is_tuning:
     # Auto Tuning Stage 1: Extract tunable tasks
     tasks = autotvm.task.extract_from_program(
@@ -275,9 +309,9 @@
 ),
 )
 n_trial = 1024 # Number of iteration of training before choosing the best kernel config
-early_stopping = False # Do we apply early stopping when the loss is not minimizing
+early_stopping = False # Can be enabled to stop tuning while the loss is not minimizing.
 
-# Iterate through each task and call the tuner
+# Auto Tuning Stage 3: Iterate through the tasks and tune.
 from tvm.autotvm.tuner import XGBTuner
 
 for i, tsk in enumerate(reversed(tasks[:3])):
@@ -295,14 +329,17 @@
 autotvm.callback.log_to_file(tmp_log_file),
 ],
 )
-# Pick the best performing kerl configurations from the overall log.
+# Auto Tuning Stage 4: Pick the best performing configurations from the overall log.
 autotvm.record.pick_best(tmp_log_file, tune_log)
 
 #################################################################
 # Enable OpenCLML Offloading
 # --------------------------
 # OpenCLML offloading will try to accelerate supported operators
 # by using OpenCLML proprietory operator library.
+
+# By default :code:`enable_clml` is set to False in above configuration section.
+
 if not local_demo and enable_clml:
     mod = clml.partition_for_clml(mod, params)
 
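
Note on the unchanged context line `elif test_target.find("opencl"):` in this file: `str.find` returns the index of the match, so it yields 0 (falsy) when `test_target` starts with `opencl` and -1 (truthy) when the substring is absent. A membership test expresses the likely intent more directly. The sketch below uses a hypothetical helper name, `is_opencl_target`, purely for illustration; it is not code from this commit.

```python
def is_opencl_target(test_target: str) -> bool:
    """Return True when the target string mentions OpenCL."""
    # `in` gives a real boolean, unlike str.find which returns an index or -1.
    return "opencl" in test_target


if __name__ == "__main__":
    for candidate in ("opencl", "opencl -device=adreno", "llvm"):
        print(candidate, "->", is_opencl_target(candidate))
```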

gallery/how_to/deploy_models/deploy_model_on_adreno_tvmc.py

Lines changed: 15 additions & 1 deletion
@@ -65,7 +65,10 @@
 
 # To enable OpenCLML accelerated operator library.
 enable_clml = False
-cross_compiler = "/opt/android-sdk-linux/ndk/21.3.6528147/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang"
+cross_compiler = (
+os.environ["ANDROID_NDK_HOME"]
++ "/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang"
+)
 
 #######################################################################
 # Make a Keras Resnet50 Model
@@ -104,6 +107,12 @@
 rpc_key = "android"
 rpc_tracker = rpc_tracker_host + ":" + str(rpc_tracker_port)
 
+# Auto tuning is compute intensive and time taking task.
+# It is set to False in above configuration as this script runs in x86 for demonstration.
+# Please to set :code:`is_tuning` to True to enable auto tuning.
+
+# Also, :code:`test_target` is set to :code:`llvm` as this example to make compatible for x86 demonstration.
+# Please change it to :code:`opencl` or :code:`opencl -device=adreno` for RPC target in configuration above.
 
 if is_tuning:
     tvmc.tune(
@@ -125,6 +134,11 @@
 # -----------
 # Compilation to produce tvm artifacts
 
+# This generated example running on our x86 server for demonstration.
+# To deply and tun on real target over RPC please set :code:`local_demo` to False in above configuration sestion.
+
+# OpenCLML offloading will try to accelerate supported operators by using OpenCLML proprietory operator library.
+# By default :code:`enable_clml` is set to False in above configuration section.
 
 if not enable_clml:
     if local_demo:
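
Note on the `cross_compiler` change in this file: indexing `os.environ["ANDROID_NDK_HOME"]` raises a bare `KeyError` when the variable is unset. The sketch below shows one way to build the same path with a clearer error message. The helper name `ndk_cross_compiler` and its parameters are assumptions for illustration; the `linux-x86_64` host tag and API level 28 are taken from the path in the diff.

```python
import os


def ndk_cross_compiler(env=None, api_level=28, host_tag="linux-x86_64"):
    """Return the aarch64 clang path inside the Android NDK.

    Hypothetical helper: mirrors the path layout used in the diff and
    fails with an explicit message when ANDROID_NDK_HOME is missing.
    """
    env = os.environ if env is None else env
    ndk_home = env.get("ANDROID_NDK_HOME")
    if not ndk_home:
        raise RuntimeError("ANDROID_NDK_HOME is not set; point it at the NDK root")
    return os.path.join(
        ndk_home,
        "toolchains",
        "llvm",
        "prebuilt",
        host_tag,
        "bin",
        f"aarch64-linux-android{api_level}-clang",
    )
```

Passing a dictionary as `env` makes the helper easy to exercise without touching the real environment.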

tests/scripts/task_build_adreno_bins.sh

Lines changed: 0 additions & 1 deletion
@@ -28,7 +28,6 @@ cd ${output_directory}
 
 cp ../cmake/config.cmake .
 
-echo set\(USE_MICRO OFF\) >> config.cmake
 if [ -f "${ADRENO_OPENCL}/CL/cl_qcom_ml_ops.h" ] ; then
 echo set\(USE_CLML "${ADRENO_OPENCL}"\) >> config.cmake
 echo set\(USE_CLML_GRAPH_EXECUTOR "${ADRENO_OPENCL}"\) >> config.cmake

tests/scripts/task_config_build_adreno.sh

Lines changed: 0 additions & 1 deletion
@@ -23,7 +23,6 @@ mkdir -p "$BUILD_DIR"
 cd "$BUILD_DIR"
 cp ../cmake/config.cmake .
 
-echo set\(USE_OPENCL ON\) >> config.cmake
 if [ -f "${ADRENO_OPENCL}/CL/cl_qcom_ml_ops.h" ] ; then
 echo set\(USE_CLML ${ADRENO_OPENCL}\) >> config.cmake
 fi
