I am trying to update pandas within a SageMaker lifecycle configuration. Following the AWS example, I have this code:
#!/bin/bash

set -e

# OVERVIEW
# This script installs a single pip package in a single SageMaker conda environments.

sudo -u ec2-user -i <<EOF
# PARAMETERS
PACKAGE=pandas
ENVIRONMENT=python3
source /home/ec2-user/anaconda3/bin/activate "$ENVIRONMENT"
pip install --upgrade "$PACKAGE"==0.25.3
source /home/ec2-user/anaconda3/bin/deactivate
EOF
Then I attach it to a notebook instance, and when I start the instance and open a notebook file, I see that pandas has not been updated. Running !pip show pandas, I get:
Name: pandas
Version: 0.24.2
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: http://pandas.pydata.org
Author: None
Author-email: None
License: BSD
Location: /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages
Requires: pytz, python-dateutil, numpy
Required-by: sparkmagic, seaborn, odo, hdijupyterutils, autovizwidget
So I am indeed in the python3 env, although the version is still 0.24.2.
However, the log in CloudWatch shows that 0.25.3 has been installed:
Collecting pandas==0.25.3 Downloading https://files.pythonhosted.org/packages/52/3f/f6a428599e0d4497e1595030965b5ba455fd8ade6e977e3c819973c4b41d/pandas-0.25.3-cp36-cp36m-manylinux1_x86_64.whl (10.4MB)
2020-02-03T12:33:09.065+01:00
Requirement already satisfied, skipping upgrade: pytz>=2017.2 in ./anaconda3/lib/python3.6/site-packages (from pandas==0.25.3) (2018.4)
2020-02-03T12:33:09.065+01:00
Requirement already satisfied, skipping upgrade: python-dateutil>=2.6.1 in ./anaconda3/lib/python3.6/site-packages (from pandas==0.25.3) (2.7.3)
2020-02-03T12:33:09.065+01:00
Requirement already satisfied, skipping upgrade: numpy>=1.13.3 in ./anaconda3/lib/python3.6/site-packages (from pandas==0.25.3) (1.16.4)
2020-02-03T12:33:09.065+01:00
Requirement already satisfied, skipping upgrade: six>=1.5 in ./anaconda3/lib/python3.6/site-packages (from python-dateutil>=2.6.1->pandas==0.25.3) (1.13.0)
2020-02-03T12:33:09.065+01:00
Installing collected packages: pandas Found existing installation: pandas 0.24.2 Uninstalling pandas-0.24.2: Successfully uninstalled pandas-0.24.2
2020-02-03T12:33:12.066+01:00
Successfully installed pandas-0.25.3
What could be the problem?
Answer
If you want to install the package only in the python3 environment, use the following script in your SageMaker lifecycle configuration:
#!/bin/bash
sudo -u ec2-user -i <<'EOF'

# This will affect only the Jupyter kernel called "conda_python3".
source activate python3

# Replace pandas==0.25.3 with the package (and version) you want to install.
pip install pandas==0.25.3
# You can also perform "conda install" here as well.
source deactivate
EOF
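One visible difference from the script in the question is the quoted heredoc delimiter ('EOF' instead of EOF). With an unquoted delimiter, bash expands variables such as $ENVIRONMENT in the outer shell before the body ever reaches the ec2-user shell, where they are still unset, so the activate call may receive an empty environment name. That would be consistent with the CloudWatch log above, which reports dependencies under ./anaconda3/lib/python3.6/site-packages rather than under envs/python3. The sketch below illustrates only the quoting behaviour; it uses a plain bash child process in place of sudo -u ec2-user -i, and the variable names unquoted and quoted are made up for the illustration.

```shell
#!/bin/bash
# Make sure PACKAGE is not defined in the outer shell, as in the lifecycle script.
unset PACKAGE

# Unquoted delimiter: the OUTER shell expands $PACKAGE while building the
# heredoc body, before the inner shell runs its own PACKAGE=pandas assignment.
unquoted=$(bash <<EOF
PACKAGE=pandas
echo "package=[$PACKAGE]"
EOF
)

# Quoted delimiter: the body is passed through literally, so the INNER shell
# expands $PACKAGE after it has assigned the variable.
quoted=$(bash <<'EOF'
PACKAGE=pandas
echo "package=[$PACKAGE]"
EOF
)

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

If you prefer to keep the PACKAGE and ENVIRONMENT parameters from the original script, quoting the delimiter should be enough to make them expand inside the ec2-user shell.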
Reference: “Lifecycle Configuration Best Practices”