For a couple of reasons, we decided to host all our private and public Python dependencies (and their dependencies) on Amazon S3. We intend to download and install packages only from S3 and nowhere else.
I followed the steps described at https://stackoverflow.com/a/57552988/3007402 (I wrote the answer) to set up a PyPI server on S3.
To upload public packages to S3, I would first download them using:
pip download numpy==1.14.2
pip download statsmodels==0.6.1
To install any package, I would use:
pip install pandas --index-url=http://<s3_endpoint> --trusted-host=<s3_endpoint> --no-cache-dir
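For scripted installs, the same command line can be assembled programmatically. This is only a minimal sketch, assuming Python 3 on the machine that drives the install; `build_pip_install_args` is a hypothetical helper name, and `http://my-bucket.example.com` is a stand-in for `<s3_endpoint>`. It derives the `--trusted-host` value from the index URL so the two cannot drift apart.

```python
import sys
from urllib.parse import urlparse


def build_pip_install_args(package, index_url):
    """Return an argv list that installs `package` from a single index only."""
    # netloc is "host" or "host:port", which is the form --trusted-host accepts.
    host = urlparse(index_url).netloc
    if not host:
        raise ValueError("index_url must include a scheme and host: %r" % index_url)
    return [
        sys.executable, "-m", "pip", "install", package,
        "--index-url", index_url,
        "--trusted-host", host,
        "--no-cache-dir",
    ]


if __name__ == "__main__":
    # Hypothetical endpoint, standing in for <s3_endpoint>.
    print(" ".join(build_pip_install_args("pandas", "http://my-bucket.example.com")))
```

The returned list can be passed to `subprocess.check_call` as-is; it is returned rather than executed here so the sketch has no side effects.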
Everything works fine with packages that are downloaded as .whl files. Such packages (e.g. pandas) install themselves and their dependencies (numpy in the case of pandas) without any problems.
The issue is with non-whl packages such as statsmodels. When pip is used to install statsmodels, its dependencies are installed by easy_install rather than by pip. The pip arg --index-url is not passed on to easy_install, so it would download the dependency, numpy, from pypi.org.
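Since only source distributions are affected, it can help to know which downloaded files are wheels and which are not. Below is a minimal sketch for that; the function name `split_wheels_and_sdists` and the list of sdist suffixes are my own assumptions, not something pip provides.

```python
import os
import sys

# Common source-distribution suffixes (an assumption; extend as needed).
SDIST_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2", ".zip")


def split_wheels_and_sdists(filenames):
    """Split file names into (wheels, sdists); anything else is ignored."""
    wheels, sdists = [], []
    for name in filenames:
        lower = name.lower()
        if lower.endswith(".whl"):
            wheels.append(name)
        elif lower.endswith(SDIST_SUFFIXES):
            sdists.append(name)
    return wheels, sdists


if __name__ == "__main__":
    # Directory holding the output of `pip download`; defaults to the current one.
    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    wheels, sdists = split_wheels_and_sdists(sorted(os.listdir(directory)))
    print("wheels:", wheels)
    print("sdists (may invoke easy_install for dependencies):", sdists)
```

Not every sdist triggers easy_install; as the traceback later in this post shows, statsmodels does so through `setup_requires`. The listing only narrows down which packages to check.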
To fix this (download only from S3), I extracted the statsmodels source archive, edited setup.cfg, repackaged it and uploaded it to S3. Below is the content of setup.cfg:
[egg_info]
tag_build =
tag_date = 0
tag_svn_revision = 0

# lines below are added by me
[easy_install]
index_url = http://<s3_link>
find_links = http://<s3_link>
With that change, statsmodels fetches the dependency numpy from S3 and installs it successfully.
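The extract, edit and repackage steps can be scripted so they are repeatable for other source packages. The following is a rough sketch, not the exact procedure I ran by hand: it assumes Python 3, a .tar.gz sdist that already contains a top-level setup.cfg, and the helper names `add_easy_install_section` and `repackage_sdist` are hypothetical. Note that configparser drops comments when it rewrites the file.

```python
import configparser
import io
import tarfile


def add_easy_install_section(cfg_text, index_url):
    """Return setup.cfg text with [easy_install] pointing at index_url."""
    # RawConfigParser avoids '%' interpolation surprises in URLs.
    parser = configparser.RawConfigParser()
    parser.read_string(cfg_text)
    if not parser.has_section("easy_install"):
        parser.add_section("easy_install")
    parser.set("easy_install", "index_url", index_url)
    parser.set("easy_install", "find_links", index_url)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


def repackage_sdist(src_path, dst_path, index_url):
    """Copy a .tar.gz sdist to dst_path, rewriting its top-level setup.cfg."""
    with tarfile.open(src_path, "r:gz") as src, tarfile.open(dst_path, "w:gz") as dst:
        for member in src.getmembers():
            fileobj = src.extractfile(member) if member.isfile() else None
            # Only touch <project>-<version>/setup.cfg, not nested copies.
            is_top_cfg = member.name.count("/") == 1 and member.name.endswith("/setup.cfg")
            if fileobj is not None and is_top_cfg:
                text = fileobj.read().decode("utf-8")
                data = add_easy_install_section(text, index_url).encode("utf-8")
                member.size = len(data)  # header must match the new content length
                fileobj = io.BytesIO(data)
            dst.addfile(member, fileobj)
```

A call such as `repackage_sdist("statsmodels-0.6.1.tar.gz", "patched/statsmodels-0.6.1.tar.gz", "http://<s3_link>")` would produce the archive to upload. If the sdist has no setup.cfg at all, this sketch leaves it unchanged.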
For some odd reason, this only works on Ubuntu (locally and on an EC2 instance running Ubuntu) but fails on an EC2 instance running Amazon Linux. Below is the log I saved using the --log <file> argument to pip. I removed the timestamps for brevity.
Created temporary directory: /tmp/pip-ephem-wheel-cache-7SD5Bu
Created temporary directory: /tmp/pip-req-tracker-du4AEi
Created requirements tracker '/tmp/pip-req-tracker-du4AEi'
Created temporary directory: /tmp/pip-install-G2qw36
Looking in indexes: http://<s3_link>
Collecting statsmodels
1 location(s) to search for versions of statsmodels:
* http://<s3_link>/statsmodels/
Getting page http://<s3_link>/statsmodels/
Found index url http://<s3_link>
Analyzing links from page http://<s3_link>/statsmodels/
Found link http://<s3_link>/statsmodels/statsmodels-0.6.1.tar.gz (from http://<s3_link>/statsmodels/), version: 0.6.1
Given no hashes to check 1 links for project 'statsmodels': discarding no candidates
Using version 0.6.1 (newest of versions: 0.6.1)
Created temporary directory: /tmp/pip-unpack-r8lKU4
Found index url http://<s3_link>
Downloading http://<s3_link>/statsmodels/statsmodels-0.6.1.tar.gz (7.1MB)
Downloading from URL http://<s3_link>/statsmodels/statsmodels-0.6.1.tar.gz (from http://<s3_link>/statsmodels/)
Added statsmodels from http://<s3_link>/statsmodels/statsmodels-0.6.1.tar.gz to build tracker '/tmp/pip-req-tracker-du4AEi'
Running setup.py (path:/tmp/pip-install-G2qw36/statsmodels/setup.py) egg_info for package statsmodels
Running command python setup.py egg_info
No local packages or download links found for numpy
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-install-G2qw36/statsmodels/setup.py", line 449, in <module>
    **setuptools_kwargs)
  File "/usr/lib64/python2.7/distutils/core.py", line 111, in setup
    _setup_distribution = dist = klass(attrs)
  File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/setuptools/dist.py", line 265, in __init__
    self.fetch_build_eggs(attrs['setup_requires'])
  File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/setuptools/dist.py", line 311, in fetch_build_eggs
    replace_conflicting=True,
  File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 797, in resolve
    dist = best[req.key] = env.best_match(req, ws, installer)
  File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1047, in best_match
    return self.obtain(req, installer)
  File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1059, in obtain
    return installer(requirement)
  File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/setuptools/dist.py", line 378, in fetch_build_egg
    return cmd.easy_install(req)
  File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 617, in easy_install
    raise DistutilsError(msg)
distutils.errors.DistutilsError: Could not find suitable distribution for Requirement.parse('numpy')
Output of cat /etc/os-release (Amazon Linux details):
NAME="Amazon Linux AMI"
VERSION="2017.03"
ID="amzn"
ID_LIKE="rhel fedora"
VERSION_ID="2017.03"
PRETTY_NAME="Amazon Linux AMI 2017.03"
Apparently, the EC2 running Amazon Linux was having an older version of
I upgraded to the latest version and my installation went fine. 😅