The Complete Guide to Mastering Docker: Tools, Techniques, and Best Practices

Docker Overview: Docker is an open-source platform that automates the deployment of applications inside lightweight and portable containers. Containers allow developers to package up an application with all the parts it needs, such as libraries and other dependencies, and ship it out as one package.

Core Concepts:

  • Images: Read-only templates used to create containers. Images are created with the docker build command, usually from a Dockerfile that contains instructions on how to build them.
  • Containers: Runnable instances of images that encapsulate the application and its environment at the point of execution.
  • Volumes: Mechanisms for persisting data generated by and used by Docker containers. They are managed outside the lifecycle of a given container.
  • Dockerfile: A script with various commands and instructions to automatically build a given Docker image.
  • Docker Compose: A tool for defining and running multi-container Docker applications.
  • Key Docker Commands:
    • docker build: Builds Docker images from a Dockerfile and a context.
    • docker run: Runs a command in a new container.
    • docker ps: Lists running containers.
    • docker pull: Pulls an image or a repository from a registry.
    • docker push: Pushes an image or a repository to a registry.
    • docker stop: Stops one or more running containers.
    • docker rm: Removes one or more containers.
    • docker rmi: Removes one or more images.
    • docker exec: Runs a command in a running container.
    • docker logs: Fetches the logs of a container.
    • docker network: Manages networks – connect, disconnect, list, remove, etc.
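
These commands fit together in a typical build-run-debug loop; a minimal sketch (the image and container names are assumptions for illustration):

# Build an image from the Dockerfile in the current directory
docker build -t my-app:1.0 .

# Start a container, mapping host port 8080 to container port 8080
docker run -d --name my-app -p 8080:8080 my-app:1.0

# Check that it is running and look at its output
docker ps
docker logs my-app

# Open a shell inside the running container
docker exec -it my-app /bin/sh

# Stop and remove the container when finished
docker stop my-app
docker rm my-app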

Docker Networking:

  • Containers can communicate with each other through networking.
  • Docker provides network drivers to manage the scope and behavior of the network.

Docker Storage:

  • Data volumes can be used for persistent or shared data.
  • Volume drivers allow you to store volumes on remote hosts or cloud providers.
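
For example, a named volume persists independently of any single container (the volume name, image, and path below are placeholders):

# Create a named volume and mount it into a container
docker volume create app-data
docker run -d --name db -v app-data:/var/lib/mysql mysql:8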

Docker Security:

  • Containers should run with the least privileges possible.
  • Image provenance (ensuring the images come from a trusted source) is critical.
  • Docker Content Trust provides the ability to use digital signatures for data sent to and received from remote Docker registries.
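
A hedged example of starting a container with reduced privileges (the image name is a placeholder; all flags are standard docker run options):

docker run -d \
  --read-only \
  --cap-drop ALL \
  --cap-add NET_BIND_SERVICE \
  --user 1000:1000 \
  --security-opt no-new-privileges:true \
  my-app:1.0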

Best Practices:

  • Keep your images as small as possible.
  • Use multi-stage builds.
  • Minimize the number of layers.
  • Use .dockerignore files.
  • Leverage the build cache.
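
A small .dockerignore sketch that keeps the build context lean (the entries are illustrative; adjust them to your project):

# .dockerignore
.git
node_modules
target
*.log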

Dockerfile Instructions:

  • FROM: Set the base image for subsequent instructions.
  • RUN: Execute any commands in a new layer on top of the current image.
  • CMD: Provide defaults for an executing container.
  • LABEL: Add metadata to an image.
  • EXPOSE: Inform Docker that the container listens on the specified network ports at runtime.
  • ENV: Set environment variables.
  • ADD and COPY: Copy new files or directories into the Docker image.
  • ENTRYPOINT: Configure a container that will run as an executable.

Example:

# Use the official Tomcat base image with JDK 11
FROM tomcat:9-jdk11-openjdk-slim

# Set the working directory inside the container to the Tomcat webapps directory
WORKDIR /usr/local/tomcat/webapps/

# Download the WAR file from the GitHub repository and add it to the webapps directory of Tomcat
ADD https://github.com/AKSarav/SampleWebApp/raw/master/dist/SampleWebApp.war /usr/local/tomcat/webapps/SampleWebApp.war

# Expose port 8080
EXPOSE 8080

# Start Tomcat server
CMD ["catalina.sh", "run"]
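
Assuming the Dockerfile above is saved in the current directory, the image could be built and run like this (the tag name is arbitrary):

docker build -t sample-webapp .
docker run -d -p 8080:8080 sample-webapp
# The application should then be reachable at http://localhost:8080/SampleWebApp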

Docker Compose:

  • Purpose: Docker Compose is used to define and run multi-container Docker applications. You define services, networks, and volumes in a docker-compose.yml file, and then use docker-compose up to start the whole application stack.
  • docker-compose.yml: The configuration file where you define your application’s services, networks, and volumes.
  • Commands:
    • docker-compose up: Starts and runs the entire app.
    • docker-compose down: Stops and removes containers, networks, volumes, and images created by up.
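
A minimal docker-compose.yml sketch (the service names, images, and ports are assumptions for illustration):

version: "3.8"
services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example
    volumes:
      - db-data:/var/lib/postgresql/data
volumes:
  db-data: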

Docker Swarm:

  • Description: Docker Swarm is a clustering and scheduling tool for Docker containers. With Swarm, IT administrators and developers can establish and manage a cluster of Docker nodes as a single virtual system.
  • Key Features: Easy to use, declarative service model, scaling, desired state reconciliation, multi-host networking, service discovery, and load balancing.
  • Commands:
    • docker swarm init: Initializes a swarm.
    • docker swarm join: Joins a machine to a swarm.
    • docker service create: Creates a new service.
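
A short sketch of standing up a replicated service on a swarm (the service name, image, and replica counts are illustrative):

# On the manager node
docker swarm init

# Create a replicated service, then scale it
docker service create --name web --replicas 3 -p 8080:80 nginx:alpine
docker service scale web=5
docker service ls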

Docker Security:

  • Namespaces: Docker uses namespaces to provide isolation between containers.
  • Control Groups (cgroups): Limit and prioritize the resources a container can use.
  • Secure Computing Mode (seccomp): Can be used to filter a container’s system calls to the kernel.
  • Capabilities: Grant specific privileges to a container’s root process without granting all the privileges of the host’s root.
  • Docker Bench for Security: A script that checks for dozens of common best practices around deploying Docker containers in production.

Docker Registries and Repositories:

  • Docker Hub: The default registry where Docker looks for images. It’s a service provided by Docker for finding and sharing container images.
  • Private Registry: You can host your own registry and push images to it.
  • Docker Trusted Registry (DTR): Offers a secure, private registry for enterprises.

Docker Volumes and Storage:

  • Bind Mounts: Allow you to map a host file or directory to a container file or directory.
  • tmpfs mounts: Store data in the host system’s memory only, which is not written to the host’s filesystem.
  • Volume Plugins: There are various volume plugins available that allow you to store data on remote hosts or cloud providers, such as Amazon EBS, Azure Blob Storage, or a network file system.
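
Hedged examples of the different mount types with docker run (the paths and image name are placeholders):

# Named volume
docker run -d -v app-data:/data my-app:1.0

# Bind mount of a host directory (read-only)
docker run -d -v /srv/config:/etc/my-app:ro my-app:1.0

# tmpfs mount, kept in the host's memory only
docker run -d --tmpfs /tmp my-app:1.0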

Docker Engine:

  • Components:
    • dockerd: The Docker daemon that runs on the host machine.
    • REST API: An API for interacting with the Docker daemon.
    • CLI: The command-line interface (CLI) that allows users to interact with Docker.

Container Orchestration:

  • Kubernetes: An open-source system for automating deployment, scaling, and management of containerized applications, commonly used with Docker.
  • Docker Swarm: Docker’s native clustering system, which turns a group of Docker engines into a single, virtual Docker engine.

Docker Networking:

  • Network Types:
    • bridge: The default network type. If you don’t specify a network, the container is connected to the default bridge network.
    • host: Removes network isolation between the container and the Docker host, and uses the host’s networking directly.
    • overlay: Connects multiple Docker daemons together and enables swarm services to communicate with each other.
  • Custom Networks: You can create custom networks to define how containers communicate with each other and with the external network.

The sections above provide an intermediate understanding of Docker’s capabilities, typical use cases, and the functionalities offered by the ecosystem around Docker.

Single-Stage vs. Multi-Stage Dockerfile

Single Stage:

# Use an image that includes both JDK and Maven
FROM maven:3.6.3-jdk-8

# Set the working directory in the container
WORKDIR /app

# Copy the source code and pom.xml file
COPY src /app/src
COPY pom.xml /app

# Build the application and package it into a JAR file
# and list the contents of the target directory
RUN mvn clean package -DskipTests && ls /app/target

# Expose the port the app runs on
EXPOSE 8080

# Run the JAR file (update this if the JAR name is different)
ENTRYPOINT ["java", "-jar", "/app/target/sample-0.0.1-SNAPSHOT.jar"]

Multi-stage Docker file

# Stage 1: Build the application
FROM maven:3.6.3-jdk-8 AS build
WORKDIR /app

# Copy the pom.xml and source code
COPY pom.xml .
COPY src ./src

# Build the application
RUN mvn clean package -DskipTests

# Stage 2: Create the runtime image
FROM openjdk:8-jdk-alpine
WORKDIR /app

# Copy the JAR from the build stage
COPY --from=build /app/target/sample-0.0.1-SNAPSHOT.jar /app

# Expose the port the app runs on
EXPOSE 8080

# Run the JAR file
ENTRYPOINT ["java","-jar","sample-0.0.1-SNAPSHOT.jar"]
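
After building both variants, the size difference can be checked with docker images (the tag names and Dockerfile paths below are assumptions):

docker build -t sample-app:single -f Dockerfile.single .
docker build -t sample-app:multi -f Dockerfile.multi .
docker images sample-app
# The multi-stage image is typically much smaller, since it contains only the runtime
# and the built JAR, not Maven, the full JDK, or the source code.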

Sample Java app code: https://github.com/buildpacks/sample-java-app/tree/main

Most Popular Docker Interview Questions and Answers 

What is Docker?

  • Docker is a containerization platform that packages your application and all its dependencies together in the form of containers, ensuring that your application works seamlessly in any environment, whether development, test, or production.

What is the difference between a Docker image and a container?

An instance of an image is called a container. You have an image, which is a set of layers. If you start this image, you have a running container of this image. You can have many running containers of the same image.

What is the difference between the COPY and ADD commands in a Dockerfile?

The COPY command is used to copy files and folders from the host file system to the Docker image. It’s simple and straightforward.

The ADD command can do everything COPY does, but it can also handle URLs and automatically unpack compressed files.

Best practice is to use COPY unless you need ADD for its additional features.
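
A short illustration (the file names and URL are placeholders):

# COPY: plain copy from the build context
COPY app.jar /opt/app/app.jar

# ADD: can also fetch remote URLs and unpack local tar archives
ADD https://example.com/tools.tar.gz /opt/tools/
ADD vendor.tar.gz /opt/vendor/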

What is Docker Hub?

Docker Hub is a cloud-based registry service that lets you link to code repositories, build and test your images, store manually pushed images, and deploy images to your hosts. It provides a centralized resource for container image discovery, distribution and change management, user and team collaboration, and workflow automation throughout the development pipeline.

What are the various states that a Docker container can be in at any given point in time?

At any given point in time, a Docker container can be in one of the following states:

  • Running
  • Paused
  • Restarting
  • Exited

When would you use ‘docker kill’ or ‘docker rm -f’?

docker kill is used for forcefully stopping a container immediately, without waiting for it to shut down gracefully.

docker rm -f not only stops the container forcefully if it’s running but also removes it from the system.

Is there a way to identify the status of a Docker container?

We can identify the status of a Docker container by running the command

docker ps -a
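
For a single container, the state can also be read directly from its metadata (the container name is a placeholder):

docker inspect -f '{{.State.Status}}' my-container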

What is the difference between ‘docker run’ and ‘docker create’?

docker run combines the actions of creating and starting a container. It creates the container with the specified configuration and starts it immediately.

docker create sets up the container but does not start it. It’s used when you want to configure a container that you will start later.
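
A quick sketch (the image and container names are placeholders):

docker create --name web -p 8080:80 nginx:alpine     # container exists but is not running
docker start web                                      # starts the previously created container
docker run -d --name web2 -p 8081:80 nginx:alpine     # create + start in one step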

What is the difference between CMD and ENTRYPOINT in a Dockerfile?

In a Dockerfile, both `CMD` and `ENTRYPOINT` instructions define the command that will be executed when a Docker container starts. However, they are used in different ways:

CMD:
– Provides default arguments to the container at runtime.
– If a command is specified when running the container, the default `CMD` is overridden.
– You can include multiple `CMD` instructions, but only the last one takes effect.

Example:
CMD ["echo", "Hello world"]

If the container is run without a command specified, it will execute `echo Hello world`.

ENTRYPOINT:
– Sets the executable for the container; the main command that the container will run.
– Any arguments passed at runtime are appended to the `ENTRYPOINT`.
– Using `ENTRYPOINT` makes a container run like a binary; you can’t override the `ENTRYPOINT` easily without adding the `--entrypoint` flag.

Example:

ENTRYPOINT ["echo"]

If the container is run with `Hello world` as an argument, it will execute `echo Hello world`.

In summary, `CMD` is for setting default parameters that can be overridden easily, whereas `ENTRYPOINT` is for setting the container to run as a specific executable/service.
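
The two are often combined, with ENTRYPOINT fixing the executable and CMD supplying overridable default arguments; a hedged sketch:

ENTRYPOINT ["ping"]
CMD ["localhost"]

# docker run <image>              -> ping localhost
# docker run <image> example.com  -> ping example.com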

What’s the difference between a repository and a registry?

In the context of Docker and containerization:

A repository is a collection of related Docker images, usually providing different versions of the same application or service. These images are identified by their tags. For example, a repository can contain multiple versions of an Ubuntu image, tagged with different version numbers.

A registry is a service where Docker images are stored, shared, and managed. It’s a sort of ‘storage space’ for Docker images. Docker Hub is a popular public registry, but companies often use private registries to control access to their proprietary images.

Do I lose my data when the Docker container exits?

No, data isn’t lost when a Docker container exits. It remains until the container is explicitly removed. To keep data persistent even after the container is deleted, you should use Docker volumes or bind mounts.

Can you remove (‘docker rm’) a container that is paused?

No, you cannot directly remove a paused container with `docker rm`. You must first unpause it or use the force option `-f` with `docker rm` to remove it.

What is Build Cache in Docker?

Build cache in Docker is a mechanism that speeds up the image building process. When you build a Docker image, Docker looks for an existing image layer that can be reused. If the instructions in your Dockerfile haven’t changed and the cache from previous builds is available, Docker will use the cache rather than executing the instructions again, which saves time and resources.
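
A common way to take advantage of the cache is to copy dependency manifests before the rest of the source, so dependency layers are rebuilt only when the manifest changes; a hedged Maven-style sketch:

COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn package -DskipTests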

How can two Docker containers communicate with each other?

Using Docker networks (recommended method):

1. Create a network:

docker network create my-network

2. Start containers on that network:

docker run --name container1 --network my-network -d some-image
docker run --name container2 --network my-network -d another-image

Containers container1 and container2 are now on the same network and can communicate using the container names as hostnames.

Note: The --link option is deprecated and should be replaced with Docker networks.

What are the most common instructions in Dockerfile?

The most common instructions in a Dockerfile include:

FROM: Specifies the base image from which to start building your image.
RUN: Executes a command inside the container, creating a new layer.
CMD: Provides defaults for executing a container; only the last CMD takes effect.
LABEL: Adds metadata to an image, like version, description, maintainer info.
EXPOSE: Informs Docker that the container listens on the specified network ports at runtime.
ENV: Sets environment variables inside the container.
ADD: Copies files from a source on the host to the container’s filesystem; it can also handle remote URLs and unpack local `.tar` archives.
COPY: Copies new files or directories from the host to the filesystem of the container.
ENTRYPOINT: Configures a container to run as an executable; command line arguments are appended.
VOLUME: Creates a mount point for externally mounted volumes or other containers.
USER: Sets the username or UID to use when running the image.
WORKDIR: Sets the working directory for any `RUN`, `CMD`, `ENTRYPOINT`, `COPY`, and `ADD` instructions that follow it.
ARG: Defines a variable that users can pass at build-time to the builder with the `docker build` command.
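
A hedged sketch that exercises several of these instructions (the base image, names, and values are illustrative):

FROM eclipse-temurin:17-jre
ARG APP_VERSION=1.0.0
ENV APP_HOME=/opt/app
LABEL version=$APP_VERSION
WORKDIR $APP_HOME
COPY app.jar .
VOLUME /opt/app/data
USER 1000
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]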

How do I transfer a Docker image from one machine to another one without using a repository, no matter private or public?

You will need to save the Docker image as a tar file:

docker save -o <path for generated tar file> <image name>

Then copy the tar file to the new system with regular file transfer tools such as cp or scp. After that, load the image into Docker on the target machine:

docker load -i <path to image tar file>

Can you explain what a multi-stage Dockerfile is, and provide a use-case for it?

A multi-stage Dockerfile is a Dockerfile that uses multiple FROM statements, allowing the creation of multiple separate build stages within a single Dockerfile. Each stage can use a different base image, and you can copy artifacts from one stage to another, discarding everything you don’t need in the final image. This is especially useful for creating lightweight production images.

# Build stage
FROM maven:3.6.0-jdk-11-slim AS build
COPY src /home/app/src
COPY pom.xml /home/app
RUN mvn -f /home/app/pom.xml clean package

# Package stage
FROM openjdk:11-jre-slim
COPY --from=build /home/app/target/my-app.jar /usr/local/lib/my-app.jar
EXPOSE 8080
ENTRYPOINT ["java","-jar","/usr/local/lib/my-app.jar"]

What Command Can You Run to Export a Docker Image As an Archive?
You can export a Docker image as an archive using the command docker save -o <path for generated tar file> <image name>. For example, docker save -o ubuntu.tar ubuntu:latest will save the Ubuntu image as a tar file named ubuntu.tar.

What Command Can Be Run to Import a Pre-Exported Docker Image Into Another Docker Host?
To import a Docker image from an archive, use the command docker load -i <path to image tar file>. For instance, docker load -i ubuntu.tar will import the Ubuntu image from the tar file into your Docker host.

Can a Paused Container Be Removed From Docker?
Yes, a paused container can be forcibly removed using the command docker rm -f <container ID>. For example, if a container with ID 1a2b3c is paused, you can remove it with docker rm -f 1a2b3c.

How Do You Get the Number Of Containers Running, Paused, and Stopped?
You can get the number of containers in each state by running docker info and looking at the Containers section of the output, which lists Running, Paused, and Stopped counts (for example, docker info | grep -A 3 'Containers:').

What Are the Key Distinctions Between Daemon Level Logging and Container Level Logging in Docker?

  • Daemon Level Logging: Applies to all containers on the host and is configured at the Docker daemon level, affecting the logging of all containers.
  • Container Level Logging: Configured individually for each container using the --log-driver option when starting the container, allowing for specific logging settings per container.
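
A hedged illustration of the two levels (the file contents and values are examples):

# Daemon level: /etc/docker/daemon.json applies to every container on the host
{
  "log-driver": "json-file",
  "log-opts": { "max-size": "10m", "max-file": "3" }
}

# Container level: override the driver and options per container at run time
docker run -d --log-driver json-file --log-opt max-size=5m my-app:1.0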

What Does the Docker Info Command Do?
The docker info command provides detailed information about the Docker system, including the number of containers and images, configuration details like storage and network drivers, and overall system health metrics.

Where Are Docker Volumes Stored in Docker?
Docker volumes are typically stored within the Docker host’s filesystem at /var/lib/docker/volumes/. This location can be customized but serves as the default storage area for Docker volumes.

Can You Tell the Differences Between a Docker Image and a Layer?
A Docker image consists of multiple layers stacked on top of each other to form a complete image. Each layer represents instructions in the image’s Dockerfile, such as adding files, executing commands, or configuring settings. Layers are reused between images to optimize storage and speed.

Can a Container Restart By Itself?
Containers do not restart by themselves unless configured with restart policies. Docker supports several restart policies like no, on-failure, and always that determine under what circumstances a container should automatically restart.

Why is Docker System Prune Used? What Does It Do?
docker system prune is used to clean up unused Docker objects like stopped containers, unused networks, and dangling images. This command helps in reclaiming disk space by removing objects that are no longer in use.

How Do You Scale Docker Containers Horizontally?
Horizontal scaling of Docker containers can be achieved using orchestration tools like Docker Swarm or Kubernetes, which allow you to specify the number of container replicas you want to run based on the load.
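
For example (the service name and replica counts are illustrative):

docker compose up -d --scale web=3    # Compose: run three replicas of the web service
docker service scale web=5            # Swarm: scale an existing service to five replicas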

What Is the Difference Between Docker Restart Policies “no”, “on-failure,” And “always”?

  • no: Do not automatically restart the container when it exits.
  • on-failure: Restart the container only if it exits with a non-zero status (indicative of an error).
  • always: Always restart the container regardless of the exit status.
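
For example (the image name is a placeholder):

docker run -d --restart on-failure:3 my-app:1.0   # retry at most 3 times after a failing exit
docker run -d --restart always my-app:1.0         # restart automatically whenever it exits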

How Do You Inspect the Metadata of a Docker Image?
You can inspect the metadata of a Docker image using the command docker inspect <image name>. For example, docker inspect ubuntu:latest will provide detailed metadata about the Ubuntu image, including its layers, environment variables, default command, and more.

How Does Docker Handle Container Isolation and Security?
Docker uses namespaces and cgroups to isolate containers from each other and the host system. Namespaces provide a layer of isolation in aspects like PID, network, and filesystem, while cgroups limit and monitor the resources a container can use, such as CPU and memory.

Is it a Good Practice to Run Stateful Applications on Docker?
Running stateful applications on Docker is feasible but requires careful management of data persistence and state across container restarts and redeployments. Using Docker volumes or external storage solutions can help manage state effectively.

What Is the Purpose of Docker Secrets?
Docker secrets provide a secure way to manage sensitive data like passwords and API keys within Docker Swarm environments. Secrets are encrypted during transit and at rest, making them safer than conventional methods like environment variables.
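
A hedged sketch in a swarm (the secret, service, and image names are placeholders):

echo "s3cr3t" | docker secret create db_password -
docker service create --name db --secret db_password postgres:15
# Inside the container, the secret is mounted at /run/secrets/db_password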

How Do You Update a Docker Container Without Losing Data?
To update a container without losing data, use Docker volumes to persist data independent of the container lifecycle. This way, you can stop, update, and restart a container without affecting the data stored in the volume.

How Do You Manage Network Connectivity Between Docker Containers And the Host Machine?
Docker provides several networking options like bridge, host, and overlay networks that facilitate communication between containers and the host. Bridge networks create a network bridge, allowing containers connected to the same bridge to communicate.

How Do You Debug Issues in a Docker Container?
Debugging a Docker container can involve checking the container logs using docker logs <container ID>, inspecting the running processes inside the container with docker top <container ID>, or entering the container to perform diagnostic commands via docker exec -it <container ID> /bin/bash.

What is depends_on in Docker Compose?
depends_on in Docker Compose specifies the dependency between services defined in the docker-compose.yml file. It ensures that services start in dependency order.
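
A hedged example (the service names and images are placeholders). Note that depends_on controls start order only; it does not wait for a service to be "ready" unless a healthcheck-based condition is used:

services:
  web:
    image: my-web:1.0
    depends_on:
      - db
  db:
    image: postgres:15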

Can We Use JSON Instead of YAML for My Compose File in Docker?
Docker Compose supports using JSON instead of YAML for the compose file. You can convert your YAML file to JSON format and specify it during the Docker Compose command using the -f option.
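
For example (the file name is an assumption):

docker-compose -f docker-compose.json up -d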