Installing pydrill in a Docker image

3

I have an Alpine-based Dockerfile that installs several packages with conda. At the end it installs pydrill with pip, because there is no conda package for it.

from jcrist/alpine-dask

RUN /opt/conda/bin/conda update -n base -c defaults conda -y
RUN /opt/conda/bin/conda update dask
RUN /opt/conda/bin/conda install -c conda-forge dask-ml
RUN /opt/conda/bin/conda install scikit-learn -y
RUN /opt/conda/bin/conda install flask -y
RUN /opt/conda/bin/conda install waitress -y
RUN /opt/conda/bin/conda install gunicorn -y
RUN /opt/conda/bin/conda install pytest -y
RUN /opt/conda/bin/conda install apscheduler -y
RUN /opt/conda/bin/conda install matplotlib -y
RUN /opt/conda/bin/conda install pyodbc -y

USER root
RUN apk update
RUN apk add py-pip
RUN pip install pydrill

The Docker image builds without errors. But when I run the container, the command line starts gunicorn, which then fails with the following message:

  File "/code/app/service/cm/exec/run_drill.py", line 1, in <module>
    from pydrill.client import PyDrill
   
   ModuleNotFoundError: No module named 'pydrill'

Is this pip installation correct? Here is the Docker Compose file:

version: "3.0"
services:

  web:
    image: img-dask
    volumes:
      - vol_py_code:/code
      - vol_dask_data:/data
      - vol_dask_model:/model
    ports:
      - "5000:5000"
    working_dir: /code
    environment:
      - app.config=/code/conf/py.app.json
      - common.config=/code/conf/py.common.json     
    entrypoint:
      - /opt/conda/bin/gunicorn
    command:
      - -b 0.0.0.0:5000
      - --reload
      - app.frontend.app:app


  scheduler:
    image: img-dask
    ports:
      - "8787:8787"
      - "8786:8786"
    entrypoint:
      - /opt/conda/bin/dask-scheduler

  worker:
    image: img-dask
    depends_on:
      - scheduler
    environment:
      - PYTHONPATH=/code
      - MODEL_PATH=/model/rfc_model.pkl
      - PREPROCESSING_PATH=/model/data_columns.pkl
      - SCHEDULER_ADDRESS=scheduler
      - SCHEDULER_PORT=8786
    volumes:
      - vol_py_code:/code
      - vol_dask_data:/data
      - vol_dask_model:/model
    entrypoint:
      - /opt/conda/bin/dask-worker
    command:
      - scheduler:8786
      
volumes:
  vol_py_code:
     name: vol_py_code
  vol_dask_data:
     name: vol_dask_data
  vol_dask_model:
     name: vol_dask_model
  

Update

If I open a shell inside the container, I can see that pydrill is installed, but my code still cannot find the library.

/code/conf # pip3 list
Package    Version  
---------- ---------
certifi    2020.12.5
chardet    4.0.0    
idna       2.10     
pip        18.1     
pydrill    0.3.4    
requests   2.25.1   
setuptools 40.6.2   
urllib3    1.26.4   
You are using pip version 18.1, however version 21.1.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Have you tried /opt/conda/bin/pip install pydrill instead of pip install pydrill? - Elbek
3 Answers

3
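The problem is that pydrill and the conda packages live in different environments: the pip installed with apk targets Alpine's system Python, while gunicorn runs from /opt/conda/bin. When the server starts, it sees only the conda packages, not pydrill.
To fix this, install pip itself into the conda environment and use that pip to install pydrill.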
from jcrist/alpine-dask

USER root
RUN /opt/conda/bin/conda create -p /pyenv -y
RUN /opt/conda/bin/conda install -p /pyenv dask scikit-learn flask waitress gunicorn \
    pytest apscheduler matplotlib pyodbc -y
RUN /opt/conda/bin/conda install -p /pyenv -c conda-forge dask-ml -y
RUN /opt/conda/bin/conda install -p /pyenv pip -y
RUN /pyenv/bin/pip install pydrill
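
One way to confirm the mismatch described above is to ask each interpreter where it lives and whether it can find the module. The following is a minimal sketch in Python, not part of the original answer; the helper name describe_environment is made up for illustration, and the assumption is that the image has both a conda Python and a separate system Python.

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Report which interpreter is running and whether it can find module_name."""
    # find_spec returns None when a top-level module is not importable here
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,  # path of the running interpreter
        "prefix": sys.prefix,          # root directory of its environment
        "found": spec is not None,     # True if the module can be imported
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    # "pydrill" is the module from the question
    for key, value in describe_environment("pydrill").items():
        print(f"{key}: {value}")
```

Saved under a name such as check_env.py (hypothetical), the script could be run once with /opt/conda/bin/python and once with python3, assuming both exist in the image. If only the second run reports found: True, pip installed pydrill into a Python that gunicorn does not use. Note also that the Dockerfile above installs everything into /pyenv, so the Compose entrypoints would presumably need to point to /pyenv/bin rather than /opt/conda/bin.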

0
You could try installing pip with conda install pip instead of apk.
Something like this:
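from jcrist/alpine-dask
# WORKDIR does not affect command lookup, so put conda's bin directory on PATH instead
ENV PATH=/opt/conda/bin:$PATH

RUN conda update -n base -c defaults conda -y
RUN conda update dask -y
RUN conda install -c conda-forge dask-ml -y
# Installing pip with conda makes it target the same Python as the other packages
RUN conda install scikit-learn flask waitress gunicorn \
    pytest apscheduler matplotlib pyodbc pip -y
RUN pip install pydrill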


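When I build the image, I get the following:
[3/7] RUN conda update -n base -c defaults conda -y: #6 0.227 /bin/sh: conda: not found
- ps0604
I'm close now; on the last line I get > [6/6] RUN pip install pydrill: #9 0.254 /bin/sh: pip: not found - ps0604
Hmm, that's strange, really strange. The previous line installed pip successfully, right? - Işık Kaplan
See the update: if I log in to the container, I can see that PyDrill is installed. - ps0604
How do you determine the Python environment in the Dockerfile? - ps0604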

0

I have packaged pydrill for conda-forge, so you can simply run conda install -c conda-forge pydrill to install it.

