chroot, Docker, LXC, containers, etc.
============================================================

(Updated March 2016)

Containers fall into two broad types:

- containers that contain one process
- containers that contain multiple processes on a minimal OS install

Advantages of containers (to various degrees, depending on container type):

- enhanced security
- isolation (library versions, etc.)
- ease of packaging and deployment

A lot of container systems seem to store their images in /var/lib/(docker|lxc|machines|whatever). Why /var/lib? No idea.

chroot
------------------------------------------------------------

chroot is the simplest form of isolation. chroot changes the root directory ("/") for a process and its children to another directory, which keeps those processes from moving up into the real system root. This provides some security, and is useful for testing. chroot is a "full system" type of container.

Debian chroot:

    # apt-get install binutils debootstrap
    # mkdir -p /var/containers/mychroot
    # debootstrap --arch=amd64 jessie /var/containers/mychroot http://httpredir.debian.org/debian
    # chroot /var/containers/mychroot

Configure the chroot.

Of course, systemd makes this more complicated. Create a service file /etc/systemd/system/mychroot.service, like:

    [Unit]
    Description=A chroot()ed Service

    [Service]
    RootDirectory=/var/containers/mychroot
    ExecStartPre=/usr/local/bin/setup-foo-chroot.sh
    ExecStart=/usr/bin/food
    RootDirectoryStartOnly=yes

With RootDirectoryStartOnly=yes, only ExecStart runs inside the chroot; the ExecStartPre script runs on the host side. The ExecStartPre script can be used to do things like mount /proc and /sys inside the chroot, if needed.

As alternatives to RootDirectory, note ReadOnlyDirectories and InaccessibleDirectories.
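The ExecStartPre script mentioned above might look something like the following. This is only a sketch: the variable names CHROOT_DIR and DRY_RUN and the helper functions are made up for illustration, and the script defaults to printing the mount commands rather than running them (running them for real needs root).

```shell
#!/bin/sh
# Hypothetical /usr/local/bin/setup-foo-chroot.sh, run on the host side
# by ExecStartPre. CHROOT_DIR and DRY_RUN are names assumed for this sketch.
CHROOT_DIR="${CHROOT_DIR:-/var/containers/mychroot}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = run them (needs root)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount one pseudo-filesystem unless something is already mounted there.
mount_once() {
    fstype="$1"
    target="$2"
    if mountpoint -q "$target" 2>/dev/null; then
        return 0
    fi
    run mount -t "$fstype" "$fstype" "$target"
}

setup_chroot() {
    mount_once proc  "$CHROOT_DIR/proc"
    mount_once sysfs "$CHROOT_DIR/sys"
}

setup_chroot
```

Set DRY_RUN=0 in the environment once the printed commands look right. The mountpoint check keeps the script safe to run repeatedly, e.g. on service restarts.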
systemd-nspawn
------------------------------------------------------------

On systemd boxes, a chroot alternative is systemd-nspawn:

    # apt-get install binutils debootstrap systemd-container
    # mkdir -p /var/containers/mynspawn
    # debootstrap --arch=amd64 jessie /var/containers/mynspawn http://httpredir.debian.org/debian
    # systemd-nspawn -D /var/containers/mynspawn

This dumps us into a shell prompt in the container. systemd-nspawn automatically mounts a number of filesystems read-only (e.g. /sys, /proc/sys).

systemd-nspawn provides somewhat more isolation than pure chroot. (On the back-end, systemd-nspawn uses the same kernel namespace features that LXC uses.)

systemd-nspawn can be either a single process or full system container. The `-b` flag to `systemd-nspawn` runs `init` to "boot" the container, instead of just running a shell.

Use `machinectl` to start, stop, and connect to systemd-nspawn containers.

- https://wiki.archlinux.org/index.php/systemd-nspawn
- https://www.linux-magazine.com/Issues/2016/184/systemd-nspawn

Docker
------------------------------------------------------------

Docker is a container system, often used for packaging and distributing software, particularly microservice components. Docker is a single process container. Note that Docker containers are meant to be stateless, so persistent data must be handled outside the container (mounted filesystem, database, etc.). (It's possible to have multiple processes or to attach persistent storage to a Docker container, but that's fighting against the tide.)

Docker can use a number of back ends (libcontainer, libvirt, LXC, systemd-nspawn, etc.).

Docker has a server (the Docker daemon), which builds and runs containers, and a client (the `docker` command), which talks to the daemon. Optionally, a third component, the Docker Registry, stores Docker images and metadata.

    # apt-get install docker.io
    # docker version

A note about how file systems normally work in Docker: a normal Docker container has a read-only file system based on the Docker image from which it was created.
Above this, the container has a read-write layer that stores differences from the read-only original image layer. However, when the container is destroyed, the changes in the read-write layer are discarded; future spin-ups of the Docker image start fresh with the original read-only layer. Docker calls this the Union file system.

Docker has two ways to implement persistent storage (although the default assumption is still that containers will be non-persistent):

- Docker Volumes:
    - A volume outside the container (e.g. /mydata on the container's host) can be passed to the Docker invocation at runtime with the -v switch.
    - A data volume from one container can be accessed from another container with the --volumes-from flag.
    - Changes to the volume are made directly, not mediated through the Union file system.
    - Data volumes persist after the container is destroyed.
- Data-Only Containers:
    - Used to share data between containers.
    - Mounted in other containers with --volumes-from.
    - Multiple --volumes-from flags can be used to combine data volumes from multiple data containers.
    - The data container need not be running for other containers to consume its volume (and it's usually a pointless waste of resources to leave it running). In fact, this is pretty much the only difference between a "Data Container" and a data volume from a running container mounted with --volumes-from.
    - Docker doesn't take any special precautions to prevent multiple containers from hosing each other's data (race conditions, etc.) on mutually mounted volumes. Data integrity protection, if required, must be provided by our applications.
    - Because the volumes persist after a container is destroyed, it's possible to accidentally accumulate "dangling" unwanted data volumes, unless the container was deleted with the -v flag (`docker rm -v mycontainer`). Find dangling volumes with `docker volume ls -f dangling=true`; remove them with `docker volume rm myvolume`.
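The volume options above can be strung together roughly as follows. This is a sketch, not a recipe: the image name (debian), the container names (web, dbstore), and the paths are placeholders, and DRY_RUN and the run helper are invented for this example so that the script prints the docker commands by default instead of executing them.

```shell
#!/bin/sh
# Sketch only: image, container, and path names are placeholders.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the docker commands; 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

volume_demo() {
    # Bind-mount the host directory /mydata at /data inside a container.
    run docker run -d --name web -v /mydata:/data debian sleep 3600

    # Data-only container: created but never started; it just owns /dbdata.
    run docker create -v /dbdata --name dbstore debian /bin/true

    # Consume the data container's volume from another container.
    run docker run --rm --volumes-from dbstore debian ls /dbdata

    # Remove the data container along with its volume, then look for leftovers.
    run docker rm -v dbstore
    run docker volume ls -f dangling=true
}

volume_demo
```

Note that dbstore is only created, never started, which matches the point above that a data container need not be running for others to use its volume.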
NOTE: newer versions of Docker (1.9+) have `docker volume`:

https://docs.docker.com/engine/reference/commandline/volume_create/

For most use cases, this supersedes data-only containers.

LXC
------------------------------------------------------------

See https://paulgorman.org/technical/linux-lxc.txt

LXC is a userspace interface to the Linux kernel's native containment features (namespaces, cgroups). LXC is a full system type of container.

lxc(7)

    # apt-get install lxc
    # lxc-checkconfig
    # lxc-create -n mycontainer -t debian
    # lxc-ls -f
    # lxc-start -n mycontainer -d
    # lxc-info -n mycontainer
    # lxc-attach -n mycontainer

lxc comes with a number of templates for new containers. On Debian, these are found in /usr/share/lxc/templates/.

lxc containers can have various types of backing stores (btrfs, zfs, lvm, etc.); by default, they're created on the file system under /var/lib/lxc/.

Unlike Docker, which is focused on containing a single process without persistent storage, LXC contains a more traditional full system, like FreeBSD jails.

Links
------------------------------------------------------------

- https://www.flockport.com/a-new-users-guide-to-container-technology/
- https://wiki.debian.org/chroot
- http://0pointer.de/blog/projects/changing-roots.html
- https://wiki.archlinux.org/index.php/Systemd-nspawn
- https://lindenberg.io/blog/post/debian-containers-with-systemd-nspawn/
- https://www.flockport.com/lxc-vs-docker/
- https://www.flockport.com/lxc-guide/
- https://wiki.debian.org/LXC
- http://michaeldehaan.net/post/111599240017/skipping-docker-for-lxc-for-local-development
- http://container-solutions.com/understanding-volumes-docker/
- https://docs.docker.com/engine/userguide/containers/dockervolumes/
- https://www.digitalocean.com/community/tutorials/how-to-work-with-docker-data-volumes-on-ubuntu-14-04