Minimizing Python Docker images

During the transition to a microservice-based approach at Qualislabs, I saw new faces joining my team. There was definitely a learning curve involved, and a learning curve generally reduces the effectiveness of a team for a while. A newcomer should not feel stranded or burdened with too much to take in at once.

Dependency and Environment Hell

Originally we had a monolithic application, written in either Python or Node, and I must say we were able to debug and fix issues easily, even on each other's machines. However, as our numbers grew, the number of differing environments became an issue: every machine had to have the correct dependencies and environment.

12 Factor Applications

If you have worked on making applications cloud-friendly, you may have encountered the 12 factors. We were trying to ensure that every service within our application adhered to all of them, so that we could scale and deploy our services with little to no manual effort.

The Issue

Setting the appropriate dependencies and environment variables for every team member was becoming a burden, and there were times when tests were run against production resources because a developer had forgotten to set the environment variables back to development values.
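One way to guard against that kind of mix-up is to read all configuration from the environment and fail fast before any test touches production. The sketch below is only an illustration: the variable names `APP_ENV` and `DATABASE_URL`, the default values, and the helper names are assumptions made for this example, not something our services actually used.

```python
import os


class UnsafeEnvironmentError(RuntimeError):
    """Raised when tests are about to run against production settings."""


def load_settings(environ=None):
    """Read settings from the environment, defaulting to development values."""
    environ = os.environ if environ is None else environ
    return {
        # APP_ENV and DATABASE_URL are hypothetical names for this sketch.
        "env": environ.get("APP_ENV", "development"),
        "database_url": environ.get("DATABASE_URL", "sqlite:///dev.db"),
    }


def assert_safe_for_tests(settings):
    """Refuse to continue if the settings point at production."""
    if settings["env"] == "production":
        raise UnsafeEnvironmentError(
            "APP_ENV is 'production'; refusing to run tests against it"
        )
    return settings


if __name__ == "__main__":
    settings = assert_safe_for_tests(load_settings())
    print("Running tests with env:", settings["env"])
```

Calling `assert_safe_for_tests(load_settings())` at the top of a test runner turns a forgotten environment variable into an immediate error instead of a test run against live data.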

Docker to our rescue

I later found out about Docker. I must admit it felt like love at first sight. With containerization we were able to overcome most of the challenges we had, though we still missed our former days of monoliths. Developers now only had to build their code and document it in a well-written README file, and I would do the system integration on Friday evening. This was hard since my friends had better plans for me that evening, but I have stayed faithful to Docker. We have had our ups and downs but we still care for each other 😁. We later converted our apps to microservices written in different languages, from the wrath of C to the lovely Python and Go.

Minimizing images

So this is what led me to write this. Most developers starting out follow whatever a YouTube tutorial tells them as if it were a church mass, and then run into problems they don't know how to solve. One of these problems is image size: large images are slower to pull and start, and they take longer to build and test on the way to production. Large images may also hog your disk space when every simple app is built on a base image of Ubuntu or Debian; in a monolith, those apps would at least have shared the same requirements.

101

Docker is a set of platform-as-a-service (PaaS) products that use OS-level virtualization to deliver software in packages called containers.

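# app.py
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello_world():
    return "Hello world from container"

if __name__ == "__main__":
    # Bind to 0.0.0.0 so the app is reachable from outside the container
    app.run(host="0.0.0.0", port=5000)

# Create a virtual environment, install Flask, and record the dependencies
virtualenv venv --python=python3
source venv/bin/activate
pip3 install flask
pip freeze > requirements.txt

# Dockerfile
FROM ubuntu
LABEL Maintainer="Rodney Osodo"
WORKDIR /app
RUN apt-get update
# -y answers the confirmation prompt, which would otherwise abort the build
RUN apt-get install -y python3-pip
COPY . /app
RUN pip3 install -r requirements.txt
EXPOSE 5000
# The exec form is parsed as JSON, so it needs double quotes
CMD ["python3", "app.py"]

# Build the image
docker build -t flask_test:1.0.0 .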

Build process

Let's make another Dockerfile with python as the base image:

FROM python
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]

  1. Reduce build time
  2. Smaller image for production

# Slim variant of the official image
FROM python:3.7-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]

# Slim variant based on Debian stretch
FROM python:3.7-slim-stretch
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]

# Alpine Linux variant
FROM python:3.7-alpine
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]

  1. To benefit from layer caching, order the instructions from least to most frequently changed, with system-level dependencies first.
  2. Do not keep package caches in the image:
    - pip3 --no-cache-dir
    - apk --no-cache

FROM python:3.7-alpine
WORKDIR /app
COPY . .
RUN pip3 install --no-cache-dir -r requirements.txt
CMD ["python", "app.py"]
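To see why instruction order matters for caching, here is a toy model in Python. It is a deliberate simplification and not how Docker implements its build cache: it only captures the idea that a layer can be reused when its instruction, the files it copies, and every layer before it are unchanged. The function names and the two layouts are made up for this illustration.

```python
import hashlib


def layer_keys(steps):
    """Return one cache key per step; each key depends on all earlier steps.

    `steps` is a list of (instruction, content) pairs, where `content`
    stands in for the files that a COPY instruction brings into the image.
    """
    keys = []
    parent = ""
    for instruction, content in steps:
        payload = parent + "\n" + instruction + "\n" + content
        digest = hashlib.sha256(payload.encode()).hexdigest()
        keys.append(digest)
        parent = digest
    return keys


def rebuilt_steps(before, after):
    """List the instructions whose cache key changed between two builds."""
    old, new = layer_keys(before), layer_keys(after)
    return [step[0] for step, a, b in zip(after, old, new) if a != b]


def copy_everything_first(app_code, requirements):
    """Layout used in the Dockerfiles above: copy all files, then install."""
    return [
        ("COPY . .", app_code + requirements),
        ("RUN pip install -r requirements.txt", ""),
    ]


def copy_requirements_first(app_code, requirements):
    """Alternative layout: install from requirements.txt before copying code."""
    return [
        ("COPY requirements.txt .", requirements),
        ("RUN pip install -r requirements.txt", ""),
        ("COPY . .", app_code + requirements),
    ]


if __name__ == "__main__":
    reqs = "flask\n"
    # Simulate editing app.py ("v1" -> "v2") while requirements stay the same.
    for layout in (copy_everything_first, copy_requirements_first):
        changed = rebuilt_steps(layout("v1", reqs), layout("v2", reqs))
        print(layout.__name__, "->", changed)
```

In this model, editing only the application code forces the `pip install` step to run again in the first layout, while the second layout rebuilds nothing but the final `COPY`.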

  • Use a multi-stage build: install the dependencies in a builder stage, then copy the result to a fresh image and label it as the final image.

# Stage 1 - Install build dependencies
FROM python:3.7-alpine AS builder
WORKDIR /app
RUN python -m venv .venv && .venv/bin/pip install --no-cache-dir -U pip setuptools
COPY requirements.txt .
# Install the requirements, then delete test directories and bytecode files.
# The outer \( ... \) makes -exec apply to both groups, not only the second.
RUN .venv/bin/pip install --no-cache-dir -r requirements.txt && \
    find /app/.venv -depth \
        \( \
            \( -type d -a \( -name test -o -name tests \) \) \
            -o \
            \( -type f -a \( -name '*.pyc' -o -name '*.pyo' \) \) \
        \) -exec rm -rf '{}' +

# Stage 2 - Copy only necessary files to the runner stage
FROM python:3.7-alpine
WORKDIR /app
COPY --from=builder /app /app
COPY app.py .
ENV PATH="/app/.venv/bin:$PATH"
CMD ["python", "app.py"]
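The `find` step in the builder stage is dense, so here is its intent spelled out in Python: walk the virtual environment, delete directories named `test` or `tests`, and delete `.pyc` and `.pyo` files. This is only an explanatory sketch under the assumption that this is what the cleanup is meant to do; `prune_venv` is a hypothetical helper and is not used by the Dockerfile.

```python
import os
import shutil
import tempfile

TEST_DIR_NAMES = {"test", "tests"}
BYTECODE_SUFFIXES = {".pyc", ".pyo"}


def prune_venv(root):
    """Delete test directories and bytecode files below `root`.

    Returns the removed paths relative to `root`, sorted.
    """
    removed = []
    for current, dirnames, filenames in os.walk(root):
        for name in list(dirnames):
            path = os.path.join(current, name)
            # Symlinked directories are left alone in this sketch.
            if name in TEST_DIR_NAMES and not os.path.islink(path):
                shutil.rmtree(path)
                dirnames.remove(name)  # keep os.walk out of the deleted tree
                removed.append(os.path.relpath(path, root))
        for name in filenames:
            if os.path.splitext(name)[1] in BYTECODE_SUFFIXES:
                path = os.path.join(current, name)
                os.remove(path)
                removed.append(os.path.relpath(path, root))
    return sorted(removed)


if __name__ == "__main__":
    # Build a throwaway directory tree instead of touching a real venv.
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "pkg", "tests"))
        for name in ("mod.py", "mod.pyc"):
            open(os.path.join(tmp, "pkg", name), "w").close()
        print(prune_venv(tmp))
```

Test suites and bytecode files are not needed to run the app, so dropping them in the builder stage keeps them out of the final image.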

Our benchmark

Our benchmark app was a Go app with the same functionality.
The beauty of compiled code is that it generally runs faster than interpreted code, and the compiler catches many errors before the program ever runs.

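package main

import (
    "fmt"
    "log"
    "net/http"
)

func main() {
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprint(w, "Hello from container")
    })
    // ListenAndServe only returns on failure, so log the error and exit
    log.Fatal(http.ListenAndServe(":5000", nil))
}

# Dockerfile
# Assumes app is a statically linked Linux binary, built for example with:
#   CGO_ENABLED=0 GOOS=linux go build -o app .
FROM scratch
COPY app /
CMD ["/app"]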

Enthusiastic Quantum computing engineer with a clear understanding of Quantum computing and Machine learning and training in Mechatronics engineering.
