<!--
# Copyright 2021-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
-->

# Triton Inference Server Support for Jetson and JetPack

A release of Triton for [JetPack 4.6.1](https://developer.nvidia.com/embedded/jetpack)
is provided in the attached tar file in the [release notes](https://github.com/triton-inference-server/server/releases).



Triton Inference Server support on JetPack includes:

* Running models on GPU and NVDLA
* [Concurrent model execution](architecture.md#concurrent-model-execution)
* [Dynamic batching](architecture.md#models-and-schedulers)
* [Model pipelines](architecture.md#ensemble-models)
* [Extensible backends](https://github.com/triton-inference-server/backend)
* [HTTP/REST and GRPC inference protocols](inference_protocols.md)
* [C API](inference_protocols.md#c-api)

Limitations on Jetson/JetPack:

* The ONNX Runtime backend does not support the OpenVINO execution provider. The TensorRT execution provider, however, is supported.
* The Python backend does not support GPU tensors or async BLS.
* CUDA IPC (shared memory) is not supported. System shared memory, however, is supported.
* GPU metrics, GCS storage, S3 storage, and Azure storage are not supported.

On JetPack, although HTTP/REST and GRPC inference protocols are supported, for edge
use cases, direct [C API integration](inference_protocols.md#c-api) is recommended.

You can download the `.tar` files for Jetson from the Triton Inference Server
[release page](https://github.com/triton-inference-server/server/releases) in the
_"Jetson JetPack Support"_ section.

The `.tar` file contains the Triton server executable and shared libraries,
as well as the C++ and Python client libraries and examples.
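
For reference, a minimal sketch of unpacking the release follows; the file name below is illustrative and changes with each release, so substitute the name of the tar file you actually downloaded.

```
# Illustrative file name; use the tar file attached to the release you downloaded.
mkdir -p tritonserver_jetpack
tar -xzf tritonserver-jetpack.tgz -C tritonserver_jetpack
ls tritonserver_jetpack   # inspect the extracted contents
```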

## Installation and Usage

The following dependencies must be installed before building / running Triton server:

```
apt-get update && \
    apt-get install -y --no-install-recommends \
        software-properties-common \
        autoconf \
        automake \
        build-essential \
        git \
        libb64-dev \
        libre2-dev \
        libssl-dev \
        libtool \
        libboost-dev \
        rapidjson-dev \
        patchelf \
        pkg-config \
        libopenblas-dev \
        libarchive-dev \
        zlib1g-dev \
        python3 \
        python3-pip \
        python3-dev
```

Additional PyTorch dependencies:

```
apt-get -y install autoconf \
    bc \
    g++-8 \
    gcc-8 \
    clang-8 \
    lld-8

pip3 install --upgrade expecttest xmlrunner hypothesis aiohttp pyyaml scipy ninja typing_extensions protobuf
```

In addition to the above PyTorch dependencies, the PyTorch wheel corresponding to this release must also be installed:

```
pip3 install --upgrade https://developer.download.nvidia.com/compute/redist/jp/v461/pytorch/torch-1.11.0a0+17540c5-cp36-cp36m-linux_aarch64.whl
```

**Note**: The PyTorch backend depends on libomp.so, which is not loaded automatically.
If using the PyTorch backend in Triton, you need to set LD_LIBRARY_PATH so that
libomp.so can be loaded as needed before launching Triton.

```
LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/lib/llvm-8/lib"
```
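
For example, a minimal sketch of launching Triton with the PyTorch backend after setting the variable; the install path `/opt/tritonserver` and the model repository path are assumptions, so adjust them to your setup.

```
# Paths are illustrative; point them at your extracted release and model repository.
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/lib/llvm-8/lib"
/opt/tritonserver/bin/tritonserver --model-repository=/path/to/model_repo \
    --backend-directory=/opt/tritonserver/backends
```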

**Note**: When building Triton on Jetson, you will need a recent version of CMake.
We recommend CMake 3.21.0; the script below upgrades CMake to that version.

```
apt remove cmake
wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | \
    gpg --dearmor - | \
    tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null && \
    apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main' && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        cmake-data=3.21.0-0kitware1ubuntu18.04.1 cmake=3.21.0-0kitware1ubuntu18.04.1
```

**Note**: Using numpy 1.19.5 on Jetson can cause a core dump; this is a [known issue](https://github.com/numpy/numpy/issues/18131).
We recommend using numpy version 1.19.4 or earlier to work around this issue.
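
As a quick, optional check (a sketch, not part of the official instructions), you can confirm which numpy version is installed and pin it if necessary:

```
python3 -c "import numpy; print(numpy.__version__)"
pip3 install "numpy==1.19.4"   # pin to a version unaffected by the issue
```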

To build / run the Triton client libraries and examples on Jetson, the following dependencies must also be installed:

```
apt-get install -y --no-install-recommends \
    curl \
    jq

pip3 install --upgrade wheel setuptools cython && \
    pip3 install --upgrade grpcio-tools numpy==1.19.4 future attrdict
pip3 install --upgrade six requests flake8 flatbuffers pillow
```

**Note**: OpenCV 4.1.1 is installed as a part of JetPack. It is one of the dependencies for the client build.
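
If you want to confirm that the JetPack-provided OpenCV is present before building the clients (an optional check; package names can vary between JetPack versions):

```
dpkg -l | grep -i opencv
```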

**Note**: On Jetson, the backend directory must be explicitly specified using the
`--backend-directory` flag. Triton defaults to using TensorFlow 1.x, and a version string
is required to use TensorFlow 2.x.

```
tritonserver --model-repository=/path/to/model_repo --backend-directory=/path/to/tritonserver/backends \
    --backend-config=tensorflow,version=2
```
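
Once the server is running, a quick way to confirm it is ready is the standard HTTP/REST health endpoint (this assumes the default HTTP port 8000 on the local machine):

```
curl -v localhost:8000/v2/health/ready
```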

**Note**: [perf_analyzer](perf_analyzer.md) is supported on Jetson, while the [model_analyzer](model_analyzer.md)
is currently not available for Jetson. To execute `perf_analyzer` for C API, use
the CLI flag `--service-kind=triton_c_api`:

```shell
perf_analyzer -m graphdef_int32_int32_int32 --service-kind=triton_c_api \
    --triton-server-directory=/opt/tritonserver \
    --model-repository=/workspace/qa/L0_perf_analyzer_capi/models
```

Refer to these [examples](examples/jetson) that demonstrate how to use Triton Inference Server on Jetson.