Docker for Beginners
A collection of general Docker knowledge aimed at beginners. These concepts apply beyond this project; the project simply serves as the example.
Compose File#
Docker Compose files are the most common and arguably the most convenient way to deploy Docker containers. The compose file is usually called docker-compose.yml and is written in YAML.
Indentation Matters
YAML is very particular about indentation. Make sure you exactly mirror the indentation level as documented.
The top-level key is called services, and each key inside it represents a container. Each container then has additional options like image and container_name. A common compose file looks like this:
services:
  container1:
    container_name: name1
    image: image1
  container2:
    container_name: name2
    image: image2
Common Commands#
Run these commands from the directory where you have stored your docker-compose.yml file.
Download#
Download (pull) the container images:
docker compose pull
Start#
Start (up) the containers. This will download the images first, if needed:
docker compose up -d
Stop#
Stop and remove (down) all containers:
docker compose down
Update#
Update (pull and up) all containers to the latest image:
docker compose pull
docker compose up -d
Check Release Notes
For this project, and other projects under development, always check the release notes before updating. If there are any breaking changes or manual steps needed, you'll find instructions there.
Logs#
To get the logs:
docker compose logs
Get the logs of a single container:
docker compose logs tubearchivist
Optionally add -f to follow the logs, meaning new log lines keep appearing as they are written:
docker compose logs -f tubearchivist
The same for all containers:
docker compose logs -f
Volumes#
Data Loss
Understanding Volumes is crucial. Incorrect configurations can lead to data loss.
By default, Docker containers don't persist any data: everything written inside a container is lost when the container is recreated, for example after an update. To persist data, you need to define volumes.
Docker Managed#
In the example docker-compose.yml file you can see the volume definition at the bottom, like this:
volumes:
  media:
  cache:
  redis:
  es:
This defines volumes managed by Docker. In a typical Linux-based environment these get stored under /var/lib/docker/volumes/. You can then see the corresponding volume mount on the container service, like so:
volumes:
  - media:/youtube
This mounts the docker managed volume called media inside the container at /youtube. The colon symbol : splits the definition:
- The left side is the source on the host system: either the name of a docker managed volume, as here, or a path. You can usually freely choose that.
- The right side points to the location inside the container. You usually can't modify that and you have to use the exact same path as documented.
- In some projects you see an additional : with additional options, e.g. :ro, meaning read only. That does not apply to this project.
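To see how the two pieces fit together, here is a minimal sketch of a complete compose file with docker managed volumes. The service name app, the image placeholder image1, and the config volume mounted read only at /config are hypothetical and only illustrate the syntax; they are not taken from this project.

```yaml
services:
  app:                      # hypothetical service name
    image: image1           # placeholder image
    volumes:
      - media:/youtube      # docker managed volume "media" mounted at /youtube
      - config:/config:ro   # hypothetical: same syntax with the read only option

volumes:                    # top-level key: volumes managed by docker
  media:
  config:
```

Every volume name used on the left side of a service mount needs a matching entry under the top-level volumes key.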
User Managed, aka bind mount#
Permission Problems
Using bind mounts can lead to permission problems if the service inside the Docker container runs as a specific user. In that case you'll have to make sure the folder on your host system has the same ownership and permissions as the folder inside the container.
You'll run into this problem when using a bind mount for the ElasticSearch volume. See here for instructions on how to fix that.
If you prefer, you can define where each volume should be stored on the file system by modifying the path in front of the :. Remember, you usually can't change the path inside the container (right side of the :), only the path on the host system.
If you define a bind mount, you can remove the corresponding docker managed volume definition from the volumes key at the bottom of the docker-compose file.
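As a sketch of that change, assume a service that previously used the docker managed volumes media and cache. Moving only media to a bind mount could look like this. The service name app, the image placeholder, the /cache container path, and the host path are assumptions for illustration; relative and absolute host paths are covered in the next two sections.

```yaml
services:
  app:                                # hypothetical service name
    image: image1                     # placeholder image
    volumes:
      - ./volume/youtube:/youtube     # bind mount: path on the host on the left
      - cache:/cache                  # still a docker managed volume

volumes:
  cache:                              # media is removed, it is no longer docker managed
```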
Relative Path#
You can specify a relative file path, starting with ./, which is relative to the location of the docker-compose.yml file.
E.g.:
volumes:
  - ./volume/youtube:/youtube
This stores the content of the container's /youtube folder in volume/youtube, next to your docker-compose.yml file, on your host system.
Absolute Path#
Alternatively you can also specify an absolute path on your host system.
E.g.:
volumes:
  - /media/docker/volume/youtube:/youtube
That will persist the data at /media/docker/volume/youtube on your host system.
Networking#
For docker networking there are two basic concepts to understand: Publishing a service and networking between containers.
Publish#
Publishing a service running inside a container is required to access that service, e.g. through your web browser.
You will see ports defined in the docker-compose.yml file.
E.g.:
ports:
  - "8080:8000"
Similar to volumes, the colon symbol : splits the network definition:
- The left side of the : defines the port used on your host system, aka HOST_PORT. You can usually freely choose that.
- The right side of the : defines the port used inside the container, aka CONTAINER_PORT. You usually can't change that and you need to use the port as defined in the documentation. If a service allows you to customize that, you should see a mention in the docs as well.
Info
You can't have more than one service using the same port on your machine. You'll see an error if you try to do that. The simplest way to rectify that is to change the HOST_PORT to something not in use, but keep the CONTAINER_PORT as is.
E.g.: "8008:8000" to use the port 8008 on the host.
In the example above, "8080:8000" means the container publishes a service running inside the container on port 8000 to port 8080 on the host. That usually means that, when you access that service, you'll need to add the HOST_PORT :8080 to the IP/URL in your browser address bar.
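The port conflict described in the note above can be sketched as follows. Two hypothetical services both listen on port 8000 inside their containers, so each needs its own HOST_PORT. The service names and image placeholders are assumptions for illustration only.

```yaml
services:
  app1:                 # hypothetical service
    image: image1       # placeholder image
    ports:
      - "8080:8000"     # HOST_PORT 8080 -> CONTAINER_PORT 8000
  app2:                 # hypothetical second service using the same container port
    image: image2       # placeholder image
    ports:
      - "8008:8000"     # different HOST_PORT, CONTAINER_PORT unchanged
```

With this sketch, the first service would be reachable on port 8080 of the host and the second on port 8008.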
Networking between containers#
Containers regularly depend on other containers, and they usually interact over the network. E.g. an application uses a traditional database like Postgres or, as in this project, Tube Archivist needs to communicate with Redis and ElasticSearch.
The good news is that Docker handles this for you automatically: Docker's internal DNS resolves the service name to the right container. Usually, you don't need to publish ports for services that are only expected to be accessed by another container. You will see expose keys defined on the service.
E.g.:
expose:
  - "9200"
That does not actually do anything; it mostly documents on which port a given service is expected to be accessed. You will usually set environment variables on the main container that define the connection details. For this project that's REDIS_CON for Redis and ES_URL for ElasticSearch.
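A minimal sketch of how these pieces can fit together is shown below. The service names app and search, the image placeholders, and the exact value of ES_URL are assumptions for illustration; check the project documentation for the real values.

```yaml
services:
  app:                                  # hypothetical main container
    image: image1                       # placeholder image
    ports:
      - "8080:8000"                     # published, reachable from your browser
    environment:
      - ES_URL=http://search:9200       # assumed value: service name "search" as hostname
  search:                               # hypothetical ElasticSearch service
    image: image2                       # placeholder image
    expose:
      - "9200"                          # documentation only, not published to the host
```

Only the main container publishes a port here. The search service has no ports key because only the other container is expected to access it.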