My Dockerfile (run by a much larger docker-compose):
    # set base image (host OS)
    FROM python:3.7

    ARG scorer

    # set the working directory in the container
    WORKDIR /code

    # Download and install kenlm and stt for generating scorer files
    RUN git clone https://github.com/kpu/kenlm.git --depth=1
    RUN git clone https://github.com/coqui-ai/stt --depth=1
    RUN apt-get update && apt-get install -y build-essential libboost-all-dev cmake libeigen3-dev
    RUN mkdir /code/kenlm/build
    RUN cd /code/kenlm/build && cmake .. && make -j 4

    # copy the dependencies file to the working directory
    COPY requirements.txt .

    # copy libraries (asr-common)
    COPY lib ./lib

    # install dependencies
    RUN pip install -r requirements.txt

    # copy the content of the local src directory to the working directory
    COPY src_code_folder ./src_code_folder

    RUN mkdir -p /code/models
    RUN ls -lF models
    # empty directory as expected

    COPY models/scorers/$scorer /code/models/${scorer}
    RUN ls -lF models
    # output (as expected):
    # some.scorer*

    RUN curl -o /code/models/model.tflite -L https://coqui.gateway.scarf.sh/english/coqui/v1.0.0-large-vocab/model.tflite
    RUN ls -lF
    # output (as expected):
    # src_code_folder/
    # kenlm/
    # lib/
    # models/
    # requirements.txt*
    # stt/

    RUN ls -F models
    # output (as expected):
    # some.scorer*
    # model.tflite*

    # command to run on container start
    CMD [ "python", "-m", "src_code_folder" ]
and the relevant code from the docker-compose.yml:
    coqui-asr:
      build:
        context: microservices/coqui-asr
        args:
          scorer: some.scorer
      container_name: coqui-asr
      restart: always
      environment:
        - MQTT_ENDPOINT
      depends_on:
        - broker
      volumes:
        - ./microservices/coqui-asr/models:/code/models
The Python code I’m using to check the directory structure:
    import glob
    import os

    # args and logger are defined elsewhere in the module
    pbmms = glob.glob(os.path.join(args.models_dir, "*.tflite"))
    scorers = glob.glob(os.path.join(args.models_dir, "*.scorer"))
    logger.debug(f"Input: {args.models_dir}")
    logger.debug(f"tflite file: {pbmms}")
    logger.debug(f"scorer file: {scorers}")

    logger.debug(f"this directory: {os.path.dirname(os.path.realpath(__file__))}")
    logger.debug(f"current working directory: {os.getcwd()}")

    for (dirpath, dirnames, filenames) in os.walk(os.getcwd()):
        if 'code/stt' in dirpath or 'code/kenlm' in dirpath:
            # these cloned repos have a LOT of folders we don't need to see
            continue
        logger.debug(f"Path: {dirpath}")
        logger.debug(f"\tDirectory: {dirnames}")

    for (dirpath, dirnames, filenames) in os.walk("models/"):
        logger.debug(f"Path: {dirpath}")
        logger.debug(f"\tDirectory: {dirnames}")
        logger.debug(f"\tFile: {filenames}")

    assert len(pbmms) == 1    # passes
    assert len(scorers) == 1  # fails
and its output:
    DEBUG Input: models
    DEBUG tflite file: ['models/model.tflite']
    DEBUG scorer file: []
    DEBUG this directory: /code/src_code_folder
    DEBUG current working directory: /code
    DEBUG Path: /code
    DEBUG Directory: ['src_code_folder', 'models', 'lib', 'kenlm', 'stt']
    ... irrelevant output of all the other folders ...
    DEBUG Path: models/
    DEBUG Directory: ['scorers']    <----- WHY IS THIS HERE
    DEBUG File: ['model.tflite']
    DEBUG Path: models/scorers    <----- WHY DOES THIS APPEAR
    DEBUG Directory: []
    DEBUG File: ['some.scorer', 'other.scorer']
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/code/src_code_folder/__main__.py", line 71, in <module>
        main(args)
      File "/code/src_code_folder/__main__.py", line 33, in main
        assert len(scorers) == 1
    AssertionError
I don’t understand why the directory structure reported during the Docker build is completely different from the one my Python code sees. Even though the Python code runs inside the Docker container, and only specific files and folders are copied into the image, the file system Python sees seems to match my host system’s file structure.
Clearly some files are being copied, but not at all what I would expect based on my Dockerfile commands and their output.
Please let me know if I need to add more information.
Answer
Your Compose file specifies:
    volumes:
      - ./microservices/coqui-asr/models:/code/models
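This tells Compose that, when the container runs, the /code/models directory in the image, and whatever setup the Dockerfile did in it, should be hidden and replaced with the named host directory. The RUN ls commands execute at build time, before any volume is mounted, so they show the image’s contents, while your Python code runs in the container and sees the host directory, scorers subdirectory included.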
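To illustrate why the assertion fails once the host directory is mounted, here is a minimal sketch that recreates the layout from the question's os.walk output in a temporary directory and runs the same non-recursive glob pattern against it. The helper names find_scorers and make_host_layout are made up for this example; the file names are taken from the question's log.

```python
import glob
import os
import tempfile


def find_scorers(models_dir):
    """Non-recursive search using the same pattern as the question."""
    return sorted(glob.glob(os.path.join(models_dir, "*.scorer")))


def make_host_layout(root):
    """Recreate the layout shown in the question's os.walk output."""
    os.makedirs(os.path.join(root, "scorers"))
    names = (
        "model.tflite",
        os.path.join("scorers", "some.scorer"),
        os.path.join("scorers", "other.scorer"),
    )
    for name in names:
        # Empty placeholder files are enough to exercise the glob.
        open(os.path.join(root, name), "w").close()


with tempfile.TemporaryDirectory() as root:
    make_host_layout(root)
    top_level = find_scorers(root)                       # what the question's code does
    nested = find_scorers(os.path.join(root, "scorers"))  # where the files really are
    print(len(top_level), len(nested))
```

The top-level search finds nothing because "*.scorer" does not descend into subdirectories, while the same search inside scorers/ finds both files. That matches the empty "scorer file" list in the log.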
Your image already contains the models, though, and it’s done some additional pre-processing on them. You should delete this volumes: block so that you see the original contents of the image.
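If you want to confirm from inside the container whether something is still mounted over /code/models, one option on Linux is to read /proc/self/mountinfo, where the fifth whitespace-separated field of each line is the mount point. The following is only a sketch: the function names are hypothetical, and it assumes the path contains no spaces or other characters that the kernel escapes in that file.

```python
import os


def mount_points(mountinfo_text):
    """Collect the mount points from /proc/self/mountinfo-style text."""
    points = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 5:
            points.add(fields[4])  # fifth field is the mount point
    return points


def is_mounted_over(path, mountinfo_text):
    """Return True if `path` itself is listed as a mount point."""
    return os.path.normpath(path) in mount_points(mountinfo_text)


def read_mountinfo():
    """Read the mount table of the current process (Linux only)."""
    with open("/proc/self/mountinfo") as f:
        return f.read()
```

Calling is_mounted_over("/code/models", read_mountinfo()) at startup should return True while the volumes: entry is present and False after it is removed, which makes it easy to log which copy of the directory the code is looking at.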