Unable to fetch packages from a private PyPI server on Amazon EC2

For a couple of reasons, we decided to host all our private and public Python dependencies (and their dependencies) on Amazon S3. We intend to download and install packages only from S3 and nowhere else.

I followed the steps in https://stackoverflow.com/a/57552988/3007402 (I wrote that answer) to set up a PyPI server on S3.

To upload public packages to S3, I first download them using:
pip download numpy==1.14.2
pip download statsmodels==0.6.1
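
As a rough illustration of where each downloaded file would go in the bucket, the sketch below maps a file name to an object key. It assumes the bucket uses the `<project>/<file>` layout visible in the pip log further down (for example `statsmodels/statsmodels-0.6.1.tar.gz`) and that project names are normalized as described in PEP 503. The helper names `normalize`, `project_name` and `s3_key` are hypothetical, not part of pip or any S3 tooling.

```python
import re

# File extensions treated as source distributions in this sketch.
SDIST_SUFFIXES = (".tar.gz", ".tar.bz2", ".zip")


def normalize(name):
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def project_name(filename):
    """Extract the project name from a wheel or sdist file name."""
    if filename.endswith(".whl"):
        # Wheel names start with "<distribution>-<version>-...".
        return filename.split("-", 1)[0]
    for suffix in SDIST_SUFFIXES:
        if filename.endswith(suffix):
            stem = filename[: -len(suffix)]
            # sdists are "<name>-<version>"; split from the right because
            # the name itself may contain hyphens.
            return stem.rsplit("-", 1)[0]
    raise ValueError("unrecognized distribution file: " + filename)


def s3_key(filename):
    """Object key for a file, assuming a "<project>/<file>" bucket layout."""
    return "{0}/{1}".format(normalize(project_name(filename)), filename)


print(s3_key("statsmodels-0.6.1.tar.gz"))
# prints: statsmodels/statsmodels-0.6.1.tar.gz
```

The resulting key can then be passed to whatever upload tool is in use.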

To install a package, I use:

pip install pandas --index-url=http://<s3_endpoint> --trusted-host=<s3_endpoint> --no-cache-dir

Everything works fine with packages that are downloaded as .whl files. Such packages (e.g. pandas) install themselves and their dependencies (numpy, in the case of pandas) without any problems.

The issue is with non-wheel packages such as statsmodels-0.6.1.tar.gz. pip installs statsmodels itself, but statsmodels' setup.py declares build dependencies via setup_requires, and setuptools fetches those with easy_install.
easy_install does not honor pip's --index-url argument, so it tries to download the dependency (numpy) from pypi.org.

To fix this (that is, to download only from S3), I extracted statsmodels-0.6.1.tar.gz, edited setup.cfg, repackaged it, and uploaded it to S3. Below is the content of setup.cfg:

[egg_info]
tag_build =
tag_date = 0
tag_svn_revision = 0

# lines below are added by me
[easy_install]
index_url = http://<s3_link>
find_links = http://<s3_link>
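
The extract, edit and repackage steps could be automated roughly as follows. This is only a sketch under a few assumptions: it runs under Python 3 on the machine that prepares the packages (not inside the Python 2.7 environment from the log), the sdist is a .tar.gz with a single top-level directory, and losing comments in setup.cfg is acceptable (configparser does not preserve them). The function names `add_easy_install_section` and `patch_sdist` are hypothetical.

```python
import configparser
import io
import os
import tarfile
import tempfile


def add_easy_install_section(setup_cfg_text, index_url):
    """Return setup.cfg text with [easy_install] pointing at index_url."""
    parser = configparser.RawConfigParser(strict=False)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(setup_cfg_text)
    if not parser.has_section("easy_install"):
        parser.add_section("easy_install")
    parser.set("easy_install", "index_url", index_url)
    parser.set("easy_install", "find_links", index_url)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


def patch_sdist(src_path, dest_path, index_url):
    """Unpack a .tar.gz sdist, rewrite setup.cfg, and repack it."""
    with tempfile.TemporaryDirectory() as workdir:
        with tarfile.open(src_path, "r:gz") as tar:
            tar.extractall(workdir)
        # Assumes the archive has exactly one top-level directory.
        (root,) = os.listdir(workdir)
        cfg_path = os.path.join(workdir, root, "setup.cfg")
        original = ""
        if os.path.exists(cfg_path):
            with open(cfg_path) as handle:
                original = handle.read()
        with open(cfg_path, "w") as handle:
            handle.write(add_easy_install_section(original, index_url))
        with tarfile.open(dest_path, "w:gz") as tar:
            tar.add(os.path.join(workdir, root), arcname=root)
```

A call such as `patch_sdist("statsmodels-0.6.1.tar.gz", "patched/statsmodels-0.6.1.tar.gz", "http://<s3_link>")` would produce the archive to upload, assuming the `patched` directory already exists.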

With that change, statsmodels fetches the dependency numpy from S3 and installs it successfully.

For some odd reason, this works only on Ubuntu (both locally and on an EC2 instance running Ubuntu) but fails on an EC2 instance running Amazon Linux. Below is the log I saved using pip's --log <file> argument. I removed the timestamps for brevity.

Created temporary directory: /tmp/pip-ephem-wheel-cache-7SD5Bu
Created temporary directory: /tmp/pip-req-tracker-du4AEi
Created requirements tracker '/tmp/pip-req-tracker-du4AEi'
Created temporary directory: /tmp/pip-install-G2qw36
Looking in indexes: http://<s3_link>
Collecting statsmodels
  1 location(s) to search for versions of statsmodels:
  * http://<s3_link>/statsmodels/
  Getting page http://<s3_link>/statsmodels/
  Found index url http://<s3_link>
  Analyzing links from page http://<s3_link>/statsmodels/
    Found link http://<s3_link>/statsmodels/statsmodels-0.6.1.tar.gz (from http://<s3_link>/statsmodels/), version: 0.6.1
  Given no hashes to check 1 links for project 'statsmodels': discarding no candidates
  Using version 0.6.1 (newest of versions: 0.6.1)
  Created temporary directory: /tmp/pip-unpack-r8lKU4
  Found index url http://<s3_link>
  Downloading http://<s3_link>/statsmodels/statsmodels-0.6.1.tar.gz (7.1MB)
  Downloading from URL http://<s3_link>/statsmodels/statsmodels-0.6.1.tar.gz (from http://<s3_link>/statsmodels/)
  Added statsmodels from http://<s3_link>/statsmodels/statsmodels-0.6.1.tar.gz to build tracker '/tmp/pip-req-tracker-du4AEi'
    Running setup.py (path:/tmp/pip-install-G2qw36/statsmodels/setup.py) egg_info for package statsmodels
    Running command python setup.py egg_info
    No local packages or download links found for numpy
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-G2qw36/statsmodels/setup.py", line 449, in <module>
        **setuptools_kwargs)
      File "/usr/lib64/python2.7/distutils/core.py", line 111, in setup
        _setup_distribution = dist = klass(attrs)
      File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/setuptools/dist.py", line 265, in __init__
        self.fetch_build_eggs(attrs['setup_requires'])
      File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/setuptools/dist.py", line 311, in fetch_build_eggs
        replace_conflicting=True,
      File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 797, in resolve
        dist = best[req.key] = env.best_match(req, ws, installer)
      File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1047, in best_match
        return self.obtain(req, installer)
      File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1059, in obtain
        return installer(requirement)
      File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/setuptools/dist.py", line 378, in fetch_build_egg
        return cmd.easy_install(req)
      File "/home/ec2-user/tempenv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 617, in easy_install
        raise DistutilsError(msg)
    distutils.errors.DistutilsError: Could not find suitable distribution for Requirement.parse('numpy')

Output of cat /etc/os-release (Amazon Linux details):

NAME="Amazon Linux AMI"
VERSION="2017.03"
ID="amzn"
ID_LIKE="rhel fedora"
VERSION_ID="2017.03"
PRETTY_NAME="Amazon Linux AMI 2017.03"

Answer

Apparently, the EC2 instance running Amazon Linux had an older version of setuptools.
I upgraded it to the latest version and the installation went fine. 😅
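
To compare environments before and after such an upgrade, a small diagnostic like the one below can be run inside each virtualenv. It is only a sketch: the helper names `setuptools_version` and `report` are hypothetical, and it does not say which setuptools version is new enough, since the post does not state one.

```python
import sys


def setuptools_version():
    """Return the installed setuptools version, or None if it is missing."""
    try:
        import setuptools
    except ImportError:
        return None
    return setuptools.__version__


def report():
    """One-line summary of the interpreter and setuptools versions."""
    python = "{0}.{1}.{2}".format(*sys.version_info[:3])
    return "python {0}, setuptools {1}".format(python, setuptools_version())


print(report())
```

The upgrade itself is typically done with `pip install --upgrade setuptools`. In a setup that installs only from S3, that command would presumably need the same --index-url and --trusted-host arguments, with setuptools uploaded to the bucket first.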

Source: stackoverflow
The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.