I’m back. I think I’m close, but I’m running into a few last troublesome issues. A lot has changed and improved, so I’ve readjusted my build steps and (hopefully) documented them well below. Very long error logs are included as pastebin links.
@lissyx @elpimous_robot – anything jump out at you in what’s below?
Goal: compile deepspeech native_client for ARMv8 (aarch64) with GPU support
Configuration
All work was performed on an NVIDIA Jetson TX1 running JetPack 3.1. The
kernel was recompiled to support swap files, and an 8GB swap file was
created.
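For reference, creating such a swap file can be sketched as follows (not necessarily the exact commands used here; requires root and assumes the stock ext4 rootfs):

```shell
# Sketch: create and enable an 8GB swap file (run as root)
sudo fallocate -l 8G /swapfile     # reserve the space
sudo chmod 600 /swapfile           # swap files must not be world-readable
sudo mkswap /swapfile              # write the swap signature
sudo swapon /swapfile              # enable it for this boot
swapon --show                      # confirm the new swap area is active
```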
Prep Work
First, just set up the repos.
Clone Mozilla’s DeepSpeech and TensorFlow repositories at the right versions:
mkdir $HOME/deepspeech
cd $HOME/deepspeech
git clone https://github.com/mozilla/DeepSpeech
git clone --branch r1.4 https://github.com/mozilla/tensorflow
cd $HOME
ln -s deepspeech/tensorflow ./
ln -s deepspeech/DeepSpeech ./
Update tc-vars.sh
I adjusted the CUDA paths to match what’s actually on the TX1; diff below.
git diff 23d3d54b3cbc9099678e9f01e45ea2627c835fc1:tc-vars.sh HEAD:tc-vars.sh
diff --git a/23d3d54b3cbc9099678e9f01e45ea2627c835fc1:tc-vars.sh b/HEAD:tc-vars.sh
index dec1ad7..f372dc4 100755
--- a/23d3d54b3cbc9099678e9f01e45ea2627c835fc1:tc-vars.sh
+++ b/HEAD:tc-vars.sh
@@ -95,7 +95,9 @@ if [ "${OS}" = "Darwin" ]; then
fi;
### Define build parameters/env variables that we will re-ues in sourcing scripts.
-TF_CUDA_FLAGS="TF_CUDA_CLANG=0 TF_CUDA_VERSION=9.0 TF_CUDNN_VERSION=7 CUDA_TOOLKIT_PATH=${DS_ROOT_TASK}/DeepSpeech/CUDA CUDNN_INSTALL_PATH=${DS_ROOT_TASK}/DeepSpeech/CUDA TF_CUDA_COMPUTE_CAPABILITIES=\"3.0,3.5,3.7,5.2,6.0,6.1\""
+GV_CUDA_PATH='/usr/local/cuda'
+GV_CUDNN_PATH='/usr/lib/aarch64-linux-gnu/'
+TF_CUDA_FLAGS="TF_CUDA_CLANG=0 TF_CUDA_VERSION=8.0 TF_CUDNN_VERSION=6 CUDA_TOOLKIT_PATH=${GV_CUDA_PATH} CUDNN_INSTALL_PATH=${GV_CUDNN_PATH} TF_CUDA_COMPUTE_CAPABILITIES=\"3.0,3.5,3.7,5.2,5.3,6.0,6.1\""
BAZEL_ARM_FLAGS="--config=rpi3"
BAZEL_CUDA_FLAGS="--config=cuda"
BAZEL_EXTRA_FLAGS="--copt=-fvisibility=hidden"
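One subtlety worth flagging: TF_CUDA_FLAGS is a single string of KEY=VALUE pairs that tc-build.sh later applies with `eval "export ${TF_CUDA_FLAGS}"`. A minimal standalone illustration of that pattern (the values here are just examples):

```shell
# Minimal illustration of the eval-export pattern tc-build.sh uses on TF_CUDA_FLAGS
TF_CUDA_FLAGS="TF_CUDA_VERSION=8.0 TF_CUDNN_VERSION=6"
eval "export ${TF_CUDA_FLAGS}"   # each KEY=VALUE pair becomes an exported variable
echo "${TF_CUDA_VERSION}"        # prints 8.0
echo "${TF_CUDNN_VERSION}"       # prints 6
```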
Update tc-build.sh
Update tc-build to add a new option for building TensorFlow natively on ARMv8 with CUDA support (using the vars set in tc-vars.sh).
git diff 23d3d54b3cbc9099678e9f01e45ea2627c835fc1:tc-build.sh tc-build.sh
diff --git a/tc-build.sh b/tc-build.sh
index 31c4d69..a7d432e 100755
--- a/tc-build.sh
+++ b/tc-build.sh
@@ -11,14 +11,18 @@ if [ "$1" = "--gpu" ]; then
build_gpu=yes
fi
-if [ "$1" = "--arm" ]; then
- build_gpu=no
+if [ "$2" = "--arm" ]; then
build_arm=yes
fi
-pushd ${DS_ROOT_TASK}/DeepSpeech/tf/
+pushd ${DS_ROOT_TASK}/tensorflow
BAZEL_BUILD="bazel ${BAZEL_OUTPUT_USER_ROOT} build -s --explain bazel_monolithic_tf.log --verbose_explanations --experimental_strict_action_env --config=monolithic"
+ # experimental aarch64 GPU build (NVIDIA Jetson-class devices)
+ if [ "${build_gpu}" = "yes" -a "${build_arm}" = "yes" ]; then
+ eval "export ${TF_CUDA_FLAGS}" && (echo "" | TF_NEED_CUDA=1 ./configure) && ${BAZEL_BUILD} -c opt ${BAZEL_CUDA_FLAGS} ${BAZEL_EXTRA_FLAGS} ${BUILD_TARGET_LIB_CPP_API} ${BUILD_TARGET_GRAPH_TRANSFORMS} ${BUILD_TARGET_GRAPH_SUMMARIZE} ${BUILD_TARGET_GRAPH_BENCHMARK} ${BUILD_TARGET_CONVERT_MMAP} ${BUILD_TARGET_FRAMEWORK} ${BUILD_TARGET_DEEPSPEECH} ${BUILD_TARGET_DEEPSPEECH_UTILS} ${BUILD_TARGET_KENLM} ${BUILD_TARGET_TRIE}
+ fi
+
# Pure amd64 CPU-only build
Build tensorflow 1.4
By running `tc-build.sh --gpu --arm`, we obtain this tree in `bazel-bin` (very long paste), which contains the build targets we specified. In particular, `libdeepspeech.so`, `libtensorflow_cc.so`, etc. are all built and of reasonable sizes.
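To sanity-check those outputs without eyeballing the whole tree, a small helper of the sort I use (the function name and file list are my own, not from the upstream scripts):

```shell
# Hypothetical helper: check that expected build artifacts exist and are non-empty
check_artifacts() {
  local dir="$1"; shift
  local status=0
  local f
  for f in "$@"; do
    if [ -s "${dir}/${f}" ]; then
      echo "${f}: OK ($(stat -c '%s' "${dir}/${f}") bytes)"
    else
      echo "${f}: MISSING or empty"
      status=1
    fi
  done
  return ${status}
}

# e.g., on the TX1:
# check_artifacts "${HOME}/tensorflow/bazel-bin/native_client" libdeepspeech.so generate_trie
```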
Attempt to build native-client
Next, I adapted the flow from taskcluster (as suggested on discourse) and created $HOME/deepspeech/DeepSpeech/taskcluster/cuda-arm-build.sh:
#!/bin/bash
set -xe
source $(dirname "$0")/../tc-tests-utils.sh
source ${DS_ROOT_TASK}/tensorflow/tc-vars.sh
BAZEL_TARGETS="
//native_client:libdeepspeech.so
//native_client:deepspeech_utils
//native_client:generate_trie
"
BAZEL_ENV_FLAGS="TF_NEED_CUDA=1 ${TF_CUDA_FLAGS}"
BAZEL_BUILD_FLAGS="${BAZEL_CUDA_FLAGS} ${BAZEL_EXTRA_FLAGS} ${BAZEL_OPT_FLAGS}"
SYSTEM_TARGET=host
EXTRA_LOCAL_CFLAGS="-march=armv8-a"
EXTRA_LOCAL_LDFLAGS="-L/usr/local/cuda/targets/aarch64-linux/lib/ -L/usr/local/cuda/targets/aarch64-linux/lib/stubs -lcudart -lcuda"
#do_bazel_build
deepspeech_python_build()
{
rename_to_gpu=$1
# unset PYTHON_BIN_PATH
# unset PYTHONPATH
# export PYENV_ROOT="${DS_ROOT_TASK}/DeepSpeech/.pyenv"
# export PATH="${PYENV_ROOT}/bin:$PATH"
# install_pyenv "${PYENV_ROOT}"
# install_pyenv_virtualenv "$(pyenv root)/plugins/pyenv-virtualenv"
mkdir -p wheels
SETUP_FLAGS=""
if [ "${rename_to_gpu}" ]; then
SETUP_FLAGS="--project_name deepspeech-gpu"
fi
# for pyver in ${SUPPORTED_PYTHON_VERSIONS}; do
# pyenv install ${pyver}
# pyenv virtualenv ${pyver} deepspeech
# source ${PYENV_ROOT}/versions/${pyver}/envs/deepspeech/bin/activate
# RASPBIAN=/tmp/multistrap-raspbian-jessie is only needed when cross-building
EXTRA_CFLAGS="${EXTRA_LOCAL_CFLAGS}" EXTRA_LDFLAGS="${EXTRA_LOCAL_LDFLAGS}" EXTRA_LIBS="${EXTRA_LOCAL_LIBS}" make -C native_client/ \
TARGET=${SYSTEM_TARGET} \
TFDIR=${DS_TFDIR} \
SETUP_FLAGS="${SETUP_FLAGS}" \
bindings-clean bindings
cp native_client/dist/*.whl wheels
make -C native_client/ bindings-clean
# deactivate
# pyenv uninstall --force deepspeech
# done;
}
#do_deepspeech_binary_build
deepspeech_python_build rename_to_gpu
#do_deepspeech_nodejs_build rename_to_gpu
$(dirname "$0")/decoder-build.sh
Running this failed quickly, with `ld` unable to find `Model::Model` in `libdeepspeech.so`.
With `nm`, we can inspect `libdeepspeech.so` and see that the symbols are indeed missing. They are present, however, in `libdeepspeech.a`:
ubuntu@nvidia:~/tensorflow/bazel-bin/native_client$ nm -gC libdeepspeech.so | grep Model::Model
ubuntu@nvidia:~/tensorflow/bazel-bin/native_client$ nm -gC libdeepspeech.a | grep Model::Model
0000000000000000 T DeepSpeech::Model::Model(char const*, int, int, char const*, int)
0000000000000000 T DeepSpeech::Model::Model(char const*, int, int, char const*, int)
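One thing worth checking here: tc-vars.sh sets `BAZEL_EXTRA_FLAGS="--copt=-fvisibility=hidden"`, and `-fvisibility=hidden` produces exactly this pattern, i.e. symbols present in the static archive but absent from the shared object’s dynamic symbol table (unless they are explicitly marked default-visibility). A tiny standalone demonstration (file and symbol names are made up):

```shell
# Demonstrate that -fvisibility=hidden removes symbols from a .so's dynamic table
cat > vis_demo.c <<'EOF'
int demo_symbol(void) { return 1; }
EOF
gcc -fPIC -shared -o libvis_default.so vis_demo.c
gcc -fPIC -shared -fvisibility=hidden -o libvis_hidden.so vis_demo.c
nm -gD libvis_default.so | grep demo_symbol                       # exported
nm -gD libvis_hidden.so | grep demo_symbol || echo "not exported"
```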
We move past this point with some trepidation, but changing DeepSpeech/native_client/definitions.mk as follows yielded some success:
ubuntu@nvidia:~/deepspeech/DeepSpeech/native_client$ git diff 52adc2b2ddfb70eebfea84ada44f74af29336f2b:native_client/definitions.mk definitions.mk
diff --git a/native_client/definitions.mk b/native_client/definitions.mk
index 32a4d80..622e88d 100644
--- a/native_client/definitions.mk
+++ b/native_client/definitions.mk
@@ -48,8 +48,8 @@ LDFLAGS_RPATH := -Wl,-rpath,@executable_path
endif
CFLAGS += $(EXTRA_CFLAGS)
-LIBS := -ldeepspeech -ldeepspeech_utils $(EXTRA_LIBS)
-LDFLAGS_DIRS := -L${TFDIR}/bazel-bin/native_client $(EXTRA_LDFLAGS)
+LIBS := -ltensorflow_so -l:libdeepspeech.a -l:libdeepspeech_utils.a $(EXTRA_LIBS)
+LDFLAGS_DIRS := -L${TFDIR}/bazel-bin/tensorflow -L${TFDIR}/bazel-bin/native_client $(EXTRA_LDFLAGS)
LDFLAGS += $(LDFLAGS_NEEDED) $(LDFLAGS_RPATH) $(LDFLAGS_DIRS) $(LIBS)
With the linker now pointed at `libdeepspeech.a`, those symbols resolve. However, rerunning the native client build yields this very long error, which makes it past the missing `Model::Model`, only to fail in a very similar fashion on symbols in `libtensorflow.so`.
At this point I started to worry that my bazel build step was wrong in some way that broke the linker.
Questions
- How can I make `*.so` files only, and skip building `.a` archives entirely?
- What could cause symbols to be stripped from the `.so` in this situation?
- How close am I to home base?