Docker – Everything you need to know as a developer

The basic idea behind Docker is to build small, independently deployable units of software (Containers) that run in the same predictable way, no matter where or how many times you deploy them. You don’t have to install and configure all software components yourself on a bunch of different servers, hoping no mistake was made when setting everything up. Docker Containers also scale exceptionally well. Because the whole infrastructure is set up with commands and configuration files, this approach is often referred to as ‘infrastructure as code’.

Advantages for Web Developers

Containers will (usually) help you get your development environment set up much more quickly, because they are pre-configured entities that can be created and moved faster than VMs.

They also help you avoid app conflicts, because Containers run separately and isolated from each other, each with its own versions of frameworks or tools.

With Containers you get consistent environments. For example, you can expect your development environment to run pretty much the same way as your staging or production environment.

All of that allows you to ship software faster in a more reliable, consistent and predictable way.

Basic terminology

Docker Images are the read-only blueprints/templates composed of layered filesystems that are used to share common files. From images you build isolated Containers that you deploy and run on servers that have Docker Engine installed. Again, images are not deployed and executed, it is Containers built from Images that are deployed and run.

There are hundreds of thousands of public Images on Docker Hub for all kinds of pre-configured software, for example nginx, Node.js and many more, including trimmed-down, security-optimized Images of certain software products. Updating to a new nginx version means using an updated Image version. Docker Images are portable between systems and allow snapshots and versioning.

Eventually you will want to create an Image yourself (most probably based on another Image) that contains your application files. To do this you create a Dockerfile that contains statements about which base Image to use, which files to copy over and which commands to run once a Container starts.

Containers can be started, stopped, moved and deleted. A Docker Container is simply another process on your machine that is isolated from all other processes on the host machine and uses its own isolated file system, which is provided by the Image. Containers usually only exist during runtime, but their state can also be persisted back to an Image. They start and stop much faster than virtual machines.
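
Persisting a running Container back to an Image is done with docker commit. A minimal sketch (the Container name and Image tag are placeholders):

# save the current state of a container as a new image
docker commit mycontainer my-snapshot:v1

# run a fresh container from that snapshot
docker run -d my-snapshot:v1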

A Docker repository is where you can store one or more versions of a specific Docker image, just like GitHub is a repository for your code.

A registry stores a collection of repositories. You could say a registry has many repositories and a repository has many different versions of the same image which are individually versioned with tags.
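
For example, you address a specific Image version in a repository by its tag; without a tag Docker assumes latest (the tag below is just illustrative):

# pull a specific tag from the nginx repository on Docker Hub
docker pull nginx:1.21-alpine

# pulls nginx:latest
docker pull nginx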

A Swarm is a group of machines that are running Docker and that are joined into a cluster. The machines in a swarm can be physical or virtual. After joining a swarm, a machine is referred to as a node. A Stack is a group of interrelated services that share dependencies, and can be orchestrated and scaled together.

A Docker Service is one or more Containers with the same configuration running under Docker’s swarm mode. It is similar to docker run in that you spin up a Container; the difference is that you now get orchestration.
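
A rough sketch of how that looks on the command line (the service name and replica count are just examples):

# turn the current machine into a swarm manager node
docker swarm init

# create a service running 3 nginx replicas, published on port 80
docker service create --name web --replicas 3 -p 80:80 nginx

# list services and the tasks (containers) behind them
docker service ls
docker service ps web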

Docker Containers vs. Virtual Machines

Virtual machines take up more space and add much more redundant overhead than Containers; for example, every virtual machine requires its own guest OS.

Getting Docker running

Docker Toolbox is outdated and I won’t describe it in this article, but it used to be the only option for Windows 7/8 users. It includes the Docker Client, Docker Compose, Docker Kitematic (a graphical UI to manage Containers and Images) and Docker Machine, which spins up a boot2docker image inside of VirtualBox. These are Linux Containers running with a Linux kernel inside the VM.

Docker Desktop provides image and container tools. You need either Windows 10 Pro or higher or a Mac, or you can install Docker Desktop on Windows Home using the WSL 2 backend. It uses Hyper-V (Windows) or HyperKit (Mac) to run a VM; on Linux you use Docker Engine directly. Docker Desktop is not meant for production, only for development. You get the Docker Client, Docker Compose and optionally Docker Kitematic.

Windows Server Containers run Windows binaries directly on the host OS, similar to how Linux Containers on a Linux OS do not need a VM.

Installing Docker on Ubuntu 20.04

First you have to add the repository once and verify the fingerprint:

$ sudo apt-get update

$ sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common

$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

$ sudo apt-key fingerprint 0EBFCD88

$ sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"

Next, you install the latest docker engine:

$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli containerd.io

Verify the installation by running the hello-world image:

docker run hello-world

You might get the following error when running docker run hello-world:

docker: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/create: dial unix /var/run/docker.sock: connect: permission denied.
See 'docker run --help'.

Do the following to fix it, so that you can use the docker command without sudo:

sudo groupadd docker
sudo usermod -aG docker ${USER}

Now log out and back in and enter docker run hello-world. If the error still occurs do this:

sudo systemctl restart docker
sudo chmod 666 /var/run/docker.sock

Managing images

An Image actually consists of many stacked Images, aka Image layers. The main reason is that these layers can be cached and thus don’t have to be rebuilt every time you build an Image.

Here are some image commands:

docker image COMMAND

Commands:
  ls          List images
  build       Build an image from a Dockerfile
  rm          Remove one or more images
  prune       Remove unused images
  tag         Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE

  pull        Pull an image or a repository from a registry
  push        Push an image or a repository to a registry

  history     Show the history of an image
  import      Import the contents from a tarball to create a filesystem image
  inspect     Display detailed information on one or more images
  load        Load an image from a tar archive or STDIN
  save        Save one or more images to a tar archive (streamed to STDOUT by default)

Remove images

# Remove specific image
docker rmi <image-id>

# Remove all images (You should first remove all containers)
docker rmi -f $(docker images -a -q)

# Remove dangling images (untagged intermediate images shown as <none>)
docker rmi $(docker images -f "dangling=true" -q)
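
Alternatively, docker image prune removes dangling Images, and with -a it removes every Image that is not used by any Container:

# remove dangling images
docker image prune

# remove all images not referenced by any container
docker image prune -a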

Building an image from a Dockerfile and tagging it

A Dockerfile lets you build a Container Image. Name it Dockerfile without a file extension and place it in the same folder as package.json. An example:

 FROM node:12-alpine
 WORKDIR /app
 COPY . .
 RUN yarn install --production
 CMD ["node", "src/index.js"]

You eventually build an image from the Dockerfile like this:

docker build -t getting-started .
# or
docker image build -t getting-started .

Here we tag (-t) the Image to give it a human-readable name, so we can refer to it more easily when running a Container from it later on.

By the way, we would use the same command again if we wanted to create an updated version of the image.

The dot at the end specifies the build context, i.e. the local directory containing the Dockerfile and the files to send to the Docker daemon; it can also be a Git repository URL.
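
Building from a remote repository could look like this (the URL is a placeholder and assumes a Dockerfile at the repository root):

# use a Git repository as the build context
docker build -t getting-started https://github.com/your-user/getting-started.git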

Now, this is what happens in the background when building an Image: Docker sends your files (the build context) to the Docker daemon, ignoring any files and folders specified in a .dockerignore file. Then Docker processes each line in the Dockerfile. For each line, Docker

  1. creates an intermediate container
  2. executes the argument you specified
  3. creates an image (only if the container has data) and adds it as a layer
  4. finally deletes the intermediate container

Share an image on a registry

To share Docker images, you use a Docker registry and create a repo there. The default registry is Docker Hub (needs an account) which we use in the following command together with a repo name of getting-started:

docker push docker/getting-started

Tag an image for a registry

docker login -u YOUR-USER-NAME
docker tag getting-started YOUR-USER-NAME/getting-started

Then try to push using your user name:

docker push YOUR-USER-NAME/getting-started

Rename image

docker tag OldName:tag NewName:tag

Give an image (ID) a name

If you want to name a non-named image like

<none>   <none>   1d75add374d7   3 minutes ago   134MB

then you achieve this by tagging it:

docker tag 1d75 myrepo:my-image-name

Display layers of a Docker image

See the command that was used to create each layer within an image:

docker image history getting-started

You’ll notice that several of the lines are truncated. If you add the --no-trunc flag, you’ll get the full output.

Layers are cached, but once a layer changes, all downstream layers have to be recreated as well. That means the order of the lines in a Dockerfile matters.

The following is bad because whenever any of our application files change, the COPY . . layer is invalidated and the yarn dependencies have to be reinstalled:

FROM node:12-alpine
WORKDIR /app
COPY . .
RUN yarn install --production
CMD ["node", "src/index.js"]

To fix this, we need to restructure our Dockerfile. For Node-based applications, dependencies are defined in the package.json file. So, what if we copy only that file (and yarn.lock) in first, install the dependencies, and then copy in everything else? Then we only reinstall the yarn dependencies if there was a change to the package.json.

 FROM node:12-alpine
 WORKDIR /app
 COPY package.json yarn.lock ./
 RUN yarn install --production
 COPY . .
 CMD ["node", "src/index.js"]

Ignore files for docker container

Add a .dockerignore file and add folders that should be ignored when copying over files to the container.
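
A typical .dockerignore for a Node project might look like this (the entries are just common examples):

node_modules
npm-debug.log
.git
.env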

Managing containers

Usage:  docker container COMMAND

Commands:
  ls          List containers
  create      Create a new container
  rm          Remove one or more containers
  rename      Rename a container

  start       Start one or more stopped containers
  restart     Restart one or more containers
  stop        Stop one or more running containers
  kill        Kill one or more running containers
  prune       Remove all stopped containers

  run         Run a command in a new container
  exec        Run a command in a running container
  cp          Copy files/folders between a container and the local filesystem

  inspect     Display detailed information on one or more containers
  logs        Fetch the logs of a container
  port        List port mappings or a specific mapping for the container

  top         Display the running processes of a container
  pause       Pause all processes within one or more containers
  unpause     Unpause all processes within one or more containers

  attach      Attach local standard input, output, and error streams to a running container
  commit      Create a new image from a container's changes
  diff        Inspect changes to files or directories on a container's filesystem
  export      Export a container's filesystem as a tar archive
  stats       Display a live stream of container(s) resource usage statistics
  update      Update configuration of one or more containers
  wait        Block until one or more containers stop, then print their exit codes

Start/Run a container from an image

# Start/Run container from an image (container will get random name and default ports)
docker run docker/getting-started

# Start/Run container from an image and specify container name
docker run --name myapp docker/getting-started

# Start/Run container from an image on specified ports
docker run -p 80:80 docker/getting-started

# Start/Run container from an image in the background (detached)
docker run -d docker/getting-started

# Start/Run container from an image detached, with ports, name and command
docker run -dp 80:80 --name myapp docker/getting-started /bin/bash
# or
docker container run -dp 80:80 --name myapp docker/getting-started

List containers (also get their ID)

# list running containers
docker ps

# list all containers, including those not currently running
docker ps -a

Stopping container

# Stop specific
docker stop <the-container-id>

# Stop all containers
docker stop $(docker ps -a -q)

Remove containers

# Remove specific container (if stopped)
docker rm <the-container-id>

# Remove specific container (if running)
docker rm -f <the-container-id>

# Remove all containers including volumes
docker rm -vf $(docker ps -a -q)

Execute command in container

# Runs command in container
docker exec -it <mysql-container-name or id> mysql -p

# Connect to container via shell
docker exec -it <container-name or id> /bin/bash

Inspect a container

Run docker inspect <container> to see detailed information about a Container, such as its mounts, network settings and environment variables.
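
The output is a large JSON document. With --format you can extract a single value using a Go template, for example the Container’s IP address on the default bridge network:

docker inspect --format '{{ .NetworkSettings.IPAddress }}' <container>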

Watch container logs

docker logs -f <container-id>

Copying files from/to container

# from host to container (by the way, this works for kubectl too)
docker cp <src-path> <container>:<dest-path>
kubectl cp <local-src-path> <your-pod-name>:<dest-path>

# from container to host
docker cp <container>:<src-path> <local-dest-path>
kubectl cp <your-pod-name>:<src-path> <local-dest-path>

Volumes

Containers have a “scratch space” that allows you to store files only temporarily, i.e. for as long as the Container exists. Once the Container is removed, that data is gone.

Volumes are a way to persist information even after a Container is destroyed. A volume is a special type of directory in a Container (a mount or alias) that maps back to a folder on the host. That can be done in one of two ways: using named volumes or bind mounts.

Volumes can be shared and reused among containers.
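
For example, two Containers can mount the same named volume (a rough sketch; the volume and Container names are placeholders):

# first container continuously writes to the shared volume
docker run -d --name writer -v shared-data:/data alpine \
    sh -c "while true; do date >> /data/log.txt; sleep 5; done"

# second container reads from the same volume
docker run --rm -v shared-data:/data alpine cat /data/log.txt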

Create a volume that maps to a default folder location on the host

The following command will create a volume that mounts the container’s /var/www folder to a default folder location at the host system:

# mount host's default location to container's /var/www 
docker run -p 8080:3000 -v /var/www node

You can docker inspect <container> to look up the paths that are mounted from the host to the container (Listed under “Mounts”).

Another way to create a Volume is to give your volume a name like todo-db (named volume):

# create named volume
docker volume create todo-db

# run container using named volume (works even without creating named volume before)
docker run -dp 3000:3000 -v todo-db:/etc/todos getting-started

Create a volume that maps to a folder of your choice on the host system

# run container with volume that has host's current working directory pointing to container's /var/www
docker run -p 8080:3000 -v $(pwd):/var/www node

Remove container including volume

To remove a container including its volume run docker rm -v mycontainer.

Start container with volume and run command in container

To run a command in the Docker Container you first specify a working directory with the working directory argument (-w), followed by the Image name and the command:

docker run -dp 3000:3000 \
     -w /app -v "$(pwd):/app" \
     node:12-alpine \
     sh -c "yarn install && yarn run dev"

Inspect a volume

docker volume inspect todo-db

Container networking

Containers run in isolation and don’t know anything about other processes or Containers on the same machine. You have two options to make them communicate: via Container Linking (aka legacy linking) or via Container Networks. Linking Containers is simple to set up, but also limited, and should only be used during development.

Container Linking (Legacy Linking)

Let’s assume you have two Containers: the first one is running MongoDB, the second is running Node.js, and you want to connect them via Container Linking. First you give the MongoDB Container a name:

# Run container with name
docker run -d --name my-mongodb mongo

Then your second container (in this example node) can be linked to it:

# Link node container to my-mongodb and choose alias 'mongodb'
docker run -d --link my-mongodb:mongodb node

The alias mongodb that you chose is now available as a host name within the node Container:

{
  "databaseConfig" : {
    "host" : "mongodb",
    "database" : "myDatabase"
  }
}

You can repeat this process if you want to join more containers.

Creating a network

If two containers are on the same network, they can talk to each other. If they aren’t, they can’t. You can group containers into their own isolated network.

# list networks and their drivers
docker network ls

# list containers attached to the network, subnet, gateway and others
docker network inspect my-network

First you have to create a custom bridge network. After that there are two ways to put a container on a network: Assign the network when starting the Container or add an existing container to the network.

# create bridged network with name 'isolated-network'
docker network create --driver bridge isolated-network

Now, start the first container and attach it to the network:

# run detached container and assign network
docker run -d \
     --network isolated-network \
     --network-alias mongodb \
     mongo

# --network is same as --net
# --network-alias is same as --net-alias

Having network-alias mongodb allows us to connect to this container using mongodb as host name.

Run the second container and attach it to the same network:

# run detached container and assign network
docker run -d \
     --network isolated-network \
     --network-alias node-app \
     node
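
To verify that the alias resolves inside the network, you can start a throwaway Container on the same network and look up the name (nicolaka/netshoot is a community image that ships common networking tools such as dig):

# resolve the 'mongodb' alias from inside the network
docker run -it --rm --network isolated-network nicolaka/netshoot dig mongodb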

Linking multiple containers

If you have many Containers it can be tedious to create networks and attach Containers to them via the command line every time. Instead you should use Docker Compose.
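
A minimal sketch of a docker-compose.yml for the two Containers above (image names, ports and paths are just examples). Compose creates a shared network automatically, and the services can reach each other by their service names:

version: "3.8"
services:
  app:
    image: node:12-alpine
    working_dir: /app
    command: sh -c "yarn install && yarn run dev"
    ports:
      - "3000:3000"
    volumes:
      - ./:/app
  mongodb:
    image: mongo
    volumes:
      - mongo-data:/data/db

volumes:
  mongo-data:

You then start everything in the background with docker-compose up -d and tear it down again with docker-compose down.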

Multi Stage Builds

Multi-stage builds help us reduce overall image size and increase final container security by separating build-time dependencies from runtime dependencies.

You can easily identify multi-stage builds by their multiple FROM statements and the use of COPY --from=<stage> statements.

React example

FROM node:14 AS development
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install
COPY . .
# enable/disable testing
#RUN CI=true npm test
RUN npm run build

FROM nginx AS production
COPY --from=development /usr/src/app/build /usr/share/nginx/html
COPY --from=development /usr/src/app/nginx.conf /etc/nginx/conf.d/default.conf

# Fire up nginx
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]

To only build a specific stage (in this example development) you have to specify the target stage:

docker build -t front-end:dev --target development .

Other useful docker commands

Scanning images for security vulnerabilities

docker scan getting-started
