
Install Python packages using init scripts in a Databricks cluster

I installed the Databricks CLI by running the following command:

pip install databricks-cli

Use the appropriate version of pip for your Python installation. If you are using Python 3, run pip3.

Then, after creating a PAT (personal access token) in Databricks, I ran the following bash script:

JavaScript

python_dependencies.sh script

JavaScript

I use the above script to install Python libraries through the cluster's init scripts.


My problem is that even though everything seems fine and the cluster starts successfully, the libraries are not installed properly. The Libraries tab of the cluster shows that only 1 of the 10 Python libraries is installed.

Appreciate your help and comments.


Answer

I found the solution based on @RedCricket's comment:

JavaScript

The above .sh file installs all the referenced Python dependencies when the cluster starts, so the libraries do not have to be reinstalled each time the notebook is re-executed.
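For readers who cannot see the original script, here is a minimal sketch of what an init script of this kind might look like. It is not the script from this answer. The package names are placeholders, the helper pip_install_cmd is hypothetical, and the pip path is an assumption about where the cluster's Python environment lives, so check it against your runtime.

```shell
#!/bin/bash
# Hypothetical init script sketch; not the script from the answer.

# Assumption: the cluster's Python environment exposes pip at this path.
# Override PIP_BIN if your runtime uses a different location.
PIP_BIN="${PIP_BIN:-/databricks/python/bin/pip}"

# Placeholder package list; replace with the libraries you actually need.
PACKAGES="requests pyyaml"

# Hypothetical helper: print the pip command that installs the given packages.
pip_install_cmd() {
  echo "$PIP_BIN install $*"
}

# Install only when pip exists at the expected path; otherwise report
# the command that was skipped so it shows up in the init script logs.
if [ -x "$PIP_BIN" ]; then
  # $PACKAGES is intentionally unquoted so each name becomes its own argument.
  "$PIP_BIN" install $PACKAGES
else
  echo "pip not found at $PIP_BIN; skipped: $(pip_install_cmd $PACKAGES)" >&2
fi
```

Installing every package in a single pip call lets pip resolve the dependencies together. If only some libraries end up installed, the init script logs are the first place to look for the failing command.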

User contributions licensed under: CC BY-SA