I have a library my_lib.so which links to several CUDA 10.1 libraries, including libnppicc.so.
Running ldd on the library outputs the following; all dependencies are resolved correctly:
    12:51:45 ~/ $ ldd my_lib.so
        linux-vdso.so.1 (0x00007fffc5183000)
        libopenblas.so.0 => /usr/lib/x86_64-linux-gnu/libopenblas.so.0 (0x00007f8bdbb00000)
        librt.so.1 => /usr/lib/x86_64-linux-gnu/librt.so.1 (0x00007f8bdbaf6000)
        libomp.so => /usr/lib/llvm-7/lib/libomp.so (0x00007f8bdba0d000)
        libpthread.so.0 => /usr/lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f8bdb9ec000)
        libcudnn.so.7 => /usr/lib/x86_64-linux-gnu/libcudnn.so.7 (0x00007f8bc5100000)
        libdl.so.2 => /usr/lib/x86_64-linux-gnu/libdl.so.2 (0x00007f8bc50f9000)
        libcudart.so.10.1 => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudart.so.10.1 (0x00007f8bc4e33000)
        libcublas.so.10 => /usr/lib/x86_64-linux-gnu/libcublas.so.10 (0x00007f8bc1098000)
        libcufft.so.10 => /usr/local/cuda/lib64/libcufft.so.10 (0x00007f8bb2d34000)
        libcusolver.so.10 => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcusolver.so.10 (0x00007f8ba8229000)
        libcurand.so.10 => /usr/local/cuda/lib64/libcurand.so.10 (0x00007f8ba32f9000)
        libnppicc.so.10 => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libnppicc.so.10 (0x00007f8ba2cba000)
        libnppial.so.10 => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libnppial.so.10 (0x00007f8ba1f67000)
        libnppist.so.10 => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libnppist.so.10 (0x00007f8ba0b11000)
        libnppidei.so.10 => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libnppidei.so.10 (0x00007f8ba0121000)
        libnppig.so.10 => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libnppig.so.10 (0x00007f8b9e64f000)
        libnppitc.so.10 => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libnppitc.so.10 (0x00007f8b9e165000)
        libnpps.so.10 => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libnpps.so.10 (0x00007f8b9d6de000)
        libnvToolsExt.so.1 => /usr/local/cuda/lib64/libnvToolsExt.so.1 (0x00007f8b9d4d5000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f8b9d351000)
        libm.so.6 => /usr/lib/x86_64-linux-gnu/libm.so.6 (0x00007f8b9d1ce000)
        libmvec.so.1 => /usr/lib/x86_64-linux-gnu/libmvec.so.1 (0x00007f8b9d1a2000)
        libgcc_s.so.1 => /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f8b9d188000)
        libc.so.6 => /usr/lib/x86_64-linux-gnu/libc.so.6 (0x00007f8b9cfc5000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f8c3990d000)
        libgfortran.so.5 => /usr/lib/x86_64-linux-gnu/libgfortran.so.5 (0x00007f8b9cd57000)
        libcublasLt.so.10 => /usr/lib/x86_64-linux-gnu/libcublasLt.so.10 (0x00007f8b9aeb3000)
        libnppc.so.10 => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libnppc.so.10 (0x00007f8b9ac38000)
        libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007f8b9abf4000)
        libz.so.1 => /usr/lib/x86_64-linux-gnu/libz.so.1 (0x00007f8b9a9d6000)
Next, I have a Python bindings library which links against this shared library (my_lib.so).
When I try to run a simple python program which imports the python module, I get the following error:
    Traceback (most recent call last):
      File "test.py", line 8, in <module>
        import myLib
    ImportError: /home/Jim/my_python_bindings_lib.cpython-37m-x86_64-linux-gnu.so: undefined symbol: nppiGammaInv_8u_C3IR
So we are getting an undefined symbol error for nppiGammaInv_8u_C3IR.
The strange thing is that this symbol is defined in libnppicc.so, which is being linked.
We can confirm this is the case by running nm:
    12:51:53 ~/$ nm -D /usr/local/cuda-10.1/targets/x86_64-linux/lib/libnppicc.so.10 | grep nppiGammaInv_8u_C3IR
    0000000000090590 T nppiGammaInv_8u_C3IR
    00000000000907b0 T nppiGammaInv_8u_C3IR_Ctx
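The same check can be scripted. The following is a minimal sketch, not part of the original question, that parses `nm -D` output into a symbol-to-type map: `T` marks a symbol defined in the library's text section, `U` marks one the library expects some other library to provide. The helper names `parse_nm` and `dynamic_symbols` are hypothetical, and the `U` line in `consumer_out` is an illustrative guess at what nm would print for the bindings library, inferred from the ImportError rather than taken from real output.

```python
import subprocess


def parse_nm(output):
    """Map symbol name -> nm type letter for each line of `nm -D` output."""
    symbols = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 3:
            # Defined symbol: address, type letter, name.
            _, kind, name = fields
        elif len(fields) == 2:
            # Undefined symbol: nm prints no address, only type and name.
            kind, name = fields
        else:
            continue
        symbols[name] = kind
    return symbols


def dynamic_symbols(path):
    """Run `nm -D` on a shared object and parse the result (requires binutils)."""
    result = subprocess.run(
        ["nm", "-D", path], capture_output=True, text=True, check=True
    )
    return parse_nm(result.stdout)


# Output copied from the nm run on libnppicc.so.10 in the question.
provider_out = """\
0000000000090590 T nppiGammaInv_8u_C3IR
00000000000907b0 T nppiGammaInv_8u_C3IR_Ctx
"""

# Illustrative only: the form an undefined reference takes in `nm -D` output.
consumer_out = """\
                 U nppiGammaInv_8u_C3IR
"""

print(parse_nm(provider_out))
print(parse_nm(consumer_out))
```

On a real machine, `dynamic_symbols(path)` could be called on both the bindings library and libnppicc.so.10 to compare which side defines the symbol and which side merely references it.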
Why am I getting this error when the symbol has a definition? What’s stranger is that when I run the same test script and libraries on other machines with CUDA 10.1 installed, it works fine. So something is wrong with this specific machine, but I can’t figure out what. I also have CUDA 11.1 installed on this machine; I'm not sure whether that’s relevant.
Edit
Someone suggested I also run ldd on the Python bindings library, so here it is:
    09:49:10 ~/ $ ldd my_python_bindings_lib.cpython-37m-x86_64-linux-gnu.so
        linux-vdso.so.1 (0x00007ffd3f79c000)
        my_lib.so => /home/Jim/my_lib.so (0x00007f5a522f4000)
        libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f5a522c3000)
        libpthread.so.0 => /usr/lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f5a522a2000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f5a5211e000)
        libm.so.6 => /usr/lib/x86_64-linux-gnu/libm.so.6 (0x00007f5a51f9b000)
        libgcc_s.so.1 => /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f5a51f7f000)
        libc.so.6 => /usr/lib/x86_64-linux-gnu/libc.so.6 (0x00007f5a51dbe000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f5ab0bd0000)
        libopenblas.so.0 => /usr/lib/x86_64-linux-gnu/libopenblas.so.0 (0x00007f5a4fbda000)
        librt.so.1 => /usr/lib/x86_64-linux-gnu/librt.so.1 (0x00007f5a4fbd0000)
        libomp.so => /usr/lib/llvm-7/lib/libomp.so (0x00007f5a4fae7000)
        libcudnn.so.7 => /usr/lib/x86_64-linux-gnu/libcudnn.so.7 (0x00007f5a391fb000)
        libdl.so.2 => /usr/lib/x86_64-linux-gnu/libdl.so.2 (0x00007f5a391f4000)
        libcudart.so.10.1 => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudart.so.10.1 (0x00007f5a38f2e000)
        libcublas.so.10 => /usr/lib/x86_64-linux-gnu/libcublas.so.10 (0x00007f5a35193000)
        libcufft.so.10 => /usr/local/cuda/lib64/libcufft.so.10 (0x00007f5a26e2f000)
        libcusolver.so.10 => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcusolver.so.10 (0x00007f5a1c324000)
        libcurand.so.10 => /usr/local/cuda/lib64/libcurand.so.10 (0x00007f5a173f2000)
        libnvToolsExt.so.1 => /usr/local/cuda/lib64/libnvToolsExt.so.1 (0x00007f5a171e9000)
        libgfortran.so.5 => /usr/lib/x86_64-linux-gnu/libgfortran.so.5 (0x00007f5a16f7b000)
        libcublasLt.so.10 => /usr/lib/x86_64-linux-gnu/libcublasLt.so.10 (0x00007f5a150d7000)
        libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007f5a15093000)
        libz.so.1 => /usr/lib/x86_64-linux-gnu/libz.so.1 (0x00007f5a14e75000)
Answer
You are importing a Python module, which depends on my_python_bindings_lib.cpython-37m-x86_64-linux-gnu.so.
That library:
- has an unresolved symbol nppiGammaInv_8u_C3IR (defined in libnppicc), and
- does not depend on libnppicc.so.10, where the symbol is defined.
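The second point can be checked mechanically. Below is a rough sketch, assuming ldd output in the format shown in the question; the helper names `parse_ldd` and `has_dependency` are hypothetical, and `bindings_ldd` is a shortened excerpt of the bindings library's ldd output from the question, not the full listing.

```python
def parse_ldd(output):
    """Map each dependency name to its resolved path (None if not found)."""
    deps = {}
    for line in output.splitlines():
        if "=>" not in line:
            # Skips the prompt, linux-vdso and the dynamic loader line.
            continue
        soname, _, rest = line.partition("=>")
        rest = rest.strip()
        if not rest or rest.startswith("not found"):
            path = None
        else:
            # Drop the trailing load address, e.g. "(0x00007f5a522f4000)".
            path = rest.split()[0]
        deps[soname.strip()] = path
    return deps


def has_dependency(deps, stem):
    """True if any listed library is named <stem>.so or <stem>.so.<version>."""
    return any(name.startswith(stem + ".so") for name in deps)


# Shortened excerpt of the ldd output for the bindings library.
bindings_ldd = """\
    linux-vdso.so.1 (0x00007ffd3f79c000)
    my_lib.so => /home/Jim/my_lib.so (0x00007f5a522f4000)
    libcudart.so.10.1 => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudart.so.10.1 (0x00007f5a38f2e000)
    libcufft.so.10 => /usr/local/cuda/lib64/libcufft.so.10 (0x00007f5a26e2f000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f5ab0bd0000)
"""

deps = parse_ldd(bindings_ldd)
for stem in ("libcudart", "libnppicc"):
    status = "present" if has_dependency(deps, stem) else "MISSING"
    print(f"{stem}: {status}")
```

Note that ldd lists transitive dependencies as well as direct ones, so a library that is absent from its output is not being pulled in by any of the listed libraries either.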
It is exceedingly likely that my_python_bindings_lib should depend on libnppicc (since it uses a symbol defined there), and that adding that dependency will fix your import problem.