paulgorman.org

SysVinit                                 systemd
ls /etc/init.d/                          systemctl
service foo start/stop/restart/reload    systemctl start/stop/restart/reload foo
cat /etc/init.d/foo.sh                   systemctl cat foo.service

systemd

Like it or not, systemd will be the Linux init system for at least the next few years.

Init is the first process on a *nix system: PID 1. All other processes descend from it. systemd fulfills that role, but does significantly more. In these notes, we'll focus on how systemd fulfills its core duties as a replacement for SysV init.

$ ps 1
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:07 /lib/systemd/systemd --system --deserialize 16

systemd manages units. A unit is anything systemd knows how to manage: services, sockets, devices, mount points, timers, etc.

systemd reads unit files for configuration.

Documentation

Documentation for systemd is thin, but growing.

Misc

systemctl --failed                Show failed units/services
systemd-cgtop                     Top-like display of cgroups
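systemctl kill foo                Send SIGTERM (the default) to foo _and_ its children
systemctl kill -s SIGKILL foo     Send SIGKILL to foo _and_ its children
systemctl list-dependencies ssh.service                Show dependencies
systemctl show -p CPUShares ssh.service                Show CPU shares
systemctl set-property ssh.service CPUShares=2048      Set CPU shares (default 1024, range 2 to 262144)
(CPUShares is a cgroup v1 setting; newer systemd releases replace it with CPUWeight.)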

systemctl and boot

Running systemctl without arguments lists the loaded units and their states (it is equivalent to systemctl list-units). A few more details about any one of them can be seen with systemctl status; systemctl status ssh.service, for example.

$ systemctl status ssh.service
● ssh.service - OpenBSD Secure Shell server
   Loaded: loaded (/lib/systemd/system/ssh.service; enabled)
   Active: active (running) since Thu 2015-01-29 20:01:09 EST; 3 weeks 2 days ago
 Main PID: 759 (sshd)
   CGroup: /system.slice/ssh.service
           └─759 /usr/sbin/sshd -D

Unit Files

Unit files configure units. They're scattered all over the place. From systemd.unit(5):

       /etc/systemd/system/*
       /run/systemd/system/*
       /lib/systemd/system/*
       ...

       $XDG_CONFIG_HOME/systemd/user/*
       $HOME/.config/systemd/user/*
       /etc/systemd/user/*
       $XDG_RUNTIME_DIR/systemd/user/*
       /run/systemd/user/*
       $XDG_DATA_HOME/systemd/user/*
       $HOME/.local/share/systemd/user/*
       /lib/systemd/user/*

(The ellipsis is from the man page. Nice.)

After adding or changing a unit file, systemctl daemon-reload will make systemd take notice of the change, although it will not automatically start a new service without systemctl start foo.service.
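
The sequence above can be sketched as a small shell helper. This is only an illustration: deploy_unit is a made-up name, and the optional second argument (destination directory) is an assumption added so the function can be tried out somewhere other than /etc/systemd/system.

```shell
# Sketch: install a unit file and make systemd pick it up.
# deploy_unit is a hypothetical helper, not part of systemd.
# The second argument (destination directory) defaults to /etc/systemd/system.
deploy_unit() {
    unit_src=$1
    unit_dir=${2:-/etc/systemd/system}
    unit_name=$(basename "$unit_src")

    cp "$unit_src" "$unit_dir/$unit_name" || return 1
    systemctl daemon-reload || return 1        # make systemd re-read unit files
    systemctl enable "$unit_name" || return 1  # hook the unit into boot
    systemctl start "$unit_name"               # daemon-reload alone does not start it
}

# Usage (as root): deploy_unit ./myexample.service
```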

What's the difference, if any, between a service file and a unit file?

None. A *.service file is just a unit file for a service. *.socket, *.mount, etc. are also unit files.

The [Unit] section contains generic information about the service. systemd not only manages system services, but also devices, mount points, timers, and other components of the system. The generic term for all these objects in systemd is a unit, and the [Unit] section encodes information about it that might be applicable not only to services but also to the other unit types systemd maintains.

Here's /etc/systemd/system/myexample.service:
[Unit]
Description=MyApp
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
# The leading "-" tells systemd to ignore a failure (e.g. no old container to kill or remove).
ExecStartPre=-/usr/bin/docker kill busybox1
ExecStartPre=-/usr/bin/docker rm busybox1
ExecStartPre=/usr/bin/docker pull busybox
ExecStart=/usr/bin/docker run --name busybox1 busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"

[Install]
WantedBy=multi-user.target

...and then we do systemctl enable /etc/systemd/system/myexample.service and systemctl start myexample.service for our new service.

A real unit file I wrote (/etc/systemd/system/asterisk.service):
[Unit]
Description=Asterisk PBX and telephony daemon
Documentation=man:asterisk(8)
Wants=network.target
After=network.target

[Service]
Type=simple
User=asterisk
Group=asterisk
PermissionsStartOnly=true
ExecStart=/usr/sbin/asterisk -g -f -C /etc/asterisk/asterisk.conf
ExecStop=/usr/sbin/asterisk -rx 'core stop now'
ExecReload=/usr/sbin/asterisk -rx 'core reload'
ExecStartPost=/home/admin/bin/asterisk_status.pl
ExecStartPost=/bin/sh -c 'echo "The Asterisk service on gab restarted. See https://gab.example.com/asterisk-status.txt" | mail -s "Asterisk service restarted" root'
ExecStopPost=/bin/sh -c 'echo "The Asterisk service on gab stopped." | mail -s "Asterisk service stopped" root'

Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
Things to note about the above unit file:

Targets

Targets are a way of grouping units, and are vaguely similar to SysV's run levels. Common targets include multi-user.target and graphical.target.

systemctl get-default shows the default target.

systemctl list-units --type target shows the current target(s).

systemctl list-units --type target --all lists all targets. (Jesus, nice command structure, guys.)

systemctl isolate foo.target changes the current target. systemctl isolate rescue.target is as close as we get to dropping to the single-user runlevel; systemctl rescue is shorthand for this. (systemctl emergency goes a step further and enters emergency.target, an even more minimal environment. On systemd versions of this era, both shorthand commands also send a warning to all users, which a plain isolate does not.)
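
To make the runlevel comparison concrete, here is a rough sketch of the conventional mapping, based on the runlevelN.target compatibility aliases that systemd ships. The function name runlevel_to_target is made up for illustration, and a distro is free to arrange things differently.

```shell
# Sketch: translate a SysV runlevel number into the systemd target
# it conventionally corresponds to (per the runlevelN.target aliases).
# runlevel_to_target is a hypothetical helper name.
runlevel_to_target() {
    case "$1" in
        0)     echo poweroff.target ;;
        1)     echo rescue.target ;;
        2|3|4) echo multi-user.target ;;
        5)     echo graphical.target ;;
        6)     echo reboot.target ;;
        *)     echo "unknown runlevel: $1" >&2; return 1 ;;
    esac
}

# Usage (as root): systemctl isolate "$(runlevel_to_target 3)"
```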

cgroups

cgroups are a feature of the Linux kernel, not systemd. But systemd makes cgroups easier to use (how?).

Imagine we have a web app that forks a bunch of apache processes. It would be handy to be able to manage and measure those processes as a group, apart from any unrelated apache processes on the box. cgroups lets us do that. Furthermore, cgroups let us restrain that gaggle of apache processes from starving other processes on the box by dint of the scheduling advantage their great number gives them.

A cgroup does two things: it groups and labels/tags related processes as a single service, and it lets us control/measure that service. systemd relies on the first feature of cgroups to function; the second is just a useful feature of cgroups.

Processes on traditional *nix systems form a single hierarchy (i.e. all processes descend from init). cgroups ("control groups") bundle processes together such that each cgroup appears to be its own independent process hierarchy. The processes which are part of a cgroup are called "tasks". A cgroup can spawn a new child cgroup, which inherits the attributes of the parent cgroup. Tasks can be moved between cgroups. Tasks can belong to more than one cgroup at a time, so long as those cgroups are not part of the same hierarchy of descent. (If a task is added to a second cgroup that's part of the same hierarchy as its original cgroup, the task is automatically removed from the original cgroup.) When a task forks off a child task, that child task is automatically part of the same cgroup (though it can be subsequently moved to a different cgroup). Forked tasks are independent; child and parent can be changed (e.g. moved to different cgroups) without affecting each other.

By grouping related tasks, we can think about managing resources for services rather than individual processes.

Each cgroup can attach to one or more resource "subsystems". Subsystems include cpu, blkio, net_cls (tagging packets with their originating cgroup), memory, namespaces, etc. (enumerate them with ls /sys/fs/cgroup/). We could, for example, pin the tasks in a cgroup to a particular CPU core. Kernel literature sometimes calls these resource subsystems "controllers" or "resource controllers".

The point of cgroups is the ability to provide accounting (e.g. for billing purposes or provisioning planning), limits/prioritization (e.g. use only so much memory or disk I/O), and isolation (e.g. namespaces) for a group of processes.

(Namespace isolation is technically a separate feature from cgroups.)

In practice, systemd seems to use only a few of the features of cgroups, mainly to organize related processes in a way that makes it easier for administrators to keep track of them. Under sysvinit without using cgroups, orphaned processes are re-parented to PID 1, making it sometimes difficult to know where such a process originated; systemd keeps related processes together in a cgroup, even if the parent dies.

ps can show cgroups:

$ ps axw -o pid,user,cgroup,args
[...snip...]
  729 root     4:devices:/system.slice/rpc /sbin/rpcbind -w
  738 statd    4:devices:/system.slice/nfs /sbin/rpc.statd
  743 root     -                           [rpciod]
  745 root     -                           [nfsiod]
  752 root     4:devices:/system.slice/nfs /usr/sbin/rpc.idmapd
  756 root     4:devices:/system.slice/cro /usr/sbin/cron -f
  757 root     4:devices:/system.slice/sma /usr/sbin/smartd -n
  758 daemon   4:devices:/system.slice/atd /usr/sbin/atd -f
  759 root     4:devices:/system.slice/ssh /usr/sbin/sshd -D
[...snip...]
28267 paulgor+ 4:devices:/user.slice,1:nam rxvt
28268 paulgor+ 4:devices:/user.slice,1:nam rxvt
28269 paulgor+ 4:devices:/user.slice,1:nam bash
28734 paulgor+ 4:devices:/user.slice,1:nam /bin/bash
28736 paulgor+ 4:devices:/user.slice,1:nam iceweasel

The systemd-cgls command gives this information as a tree.
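
The cgroup column that ps prints comes from /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controllers:path. As a minimal sketch (cgroup_paths is a made-up helper name), the path part can be pulled out like this:

```shell
# Sketch: read lines in /proc/<pid>/cgroup format on stdin
# (hierarchy-ID:controllers:path) and print the distinct cgroup paths.
# cgroup_paths is a hypothetical helper name.
cgroup_paths() {
    cut -d: -f3- | sort -u
}

# Usage:
#   cgroup_paths < /proc/self/cgroup     # the current shell
#   cgroup_paths < /proc/759/cgroup      # some other PID
```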

Signaling

Signaling a service through systemd, as in systemctl kill -s SIGTERM foo.service, ensures that all processes that make up the service receive the signal.

Stopping services

systemctl stop foo.service terminates the running service. It will turn back on at the next boot or if something triggers activation for it (hardware plugging, socket activation, etc.).

systemctl disable foo.service unhooks a service from any activation triggers. It will not start on reboot. The service can still be started manually. (Note that disabling a service will not actually stop the currently running instance, if any, so you may also want to run systemctl stop foo.service.)

We can also mask a service, which both disables it and prevents it from being started manually: systemctl mask foo.service.

Finally, doing the same thing by hand with ln -s /dev/null /etc/systemd/system/foo.service; systemctl daemon-reload (this symlink is exactly what mask creates) blocks the service from being started, even manually, because entries in /etc/systemd/ override those in /lib/systemd/.

Services can be brought back up in the way we'd expect (enable, start).
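
Put together, the escalating levels of "off" and the way back might look like the sketch below. retire_service and revive_service are made-up names; the only part not spelled out above is the unmask step, which a masked unit needs before enable and start will work again.

```shell
# Sketch: take a service fully out of action, then bring it back.
# retire_service and revive_service are hypothetical helper names.
retire_service() {
    systemctl stop "$1"      # terminate the running instance
    systemctl disable "$1"   # remove activation hooks; no start at boot
    systemctl mask "$1"      # refuse even manual starts
}

revive_service() {
    systemctl unmask "$1"    # a masked unit must be unmasked first
    systemctl enable "$1"    # start at boot again
    systemctl start "$1"     # and start it now
}

# Usage (as root): retire_service foo.service
```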

/run/ and changes to /etc/

There's a new top-level directory called /run/. This contains things that once went in /var/run/ (or, worse, got stuck in /dev/.foo/ because /var/ wasn't available early enough in the boot process). See this mailing list post about /run/.

/run/ isn't strictly systemd-related, but part of a larger (some might say "overreaching") cleanup, like the newly standardized config files (although the point of those new config files is that systemd can read them directly, without executing a shell, so systemd directly reads /etc/fstab and /etc/hostname).

Because systemd unit files are capable of doing the same job (i.e., offering config options for init scripts that have become too complex for admins to safely edit), systemd has the ambition to phase out /etc/default/ (and /etc/sysconfig on Red Hat-based distros).

Temp files

systemd-tmpfiles creates, deletes, and tidies up temp files based on configuration files in /etc/tmpfiles.d/ and /usr/lib/tmpfiles.d/. The syntax of these files is concise (see tmpfiles.d(5)):

% cat /usr/lib/tmpfiles.d/sshd.conf
d /var/run/sshd 0755 root root
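
As a slightly fuller sketch of that syntax, the function below writes a tmpfiles.d fragment for an imaginary application. The name myapp, both paths, and the helper write_tmpfiles_conf are invented for illustration. The fields are type, path, mode, user, group, and age, where an age of 7d lets systemd-tmpfiles --clean remove files in that directory older than seven days.

```shell
# Sketch: write a hypothetical tmpfiles.d fragment into the directory
# given as $1. "myapp" and its paths are made up for this example.
write_tmpfiles_conf() {
    cat > "$1/myapp.conf" <<'EOF'
# Type Path           Mode UID  GID  Age
d      /run/myapp     0755 root root -
d      /var/tmp/myapp 0750 root root 7d
EOF
}

# Usage (as root):
#   write_tmpfiles_conf /etc/tmpfiles.d
#   systemd-tmpfiles --create /etc/tmpfiles.d/myapp.conf
```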

Timer

systemd can do cron-like stuff, configured with .timer unit files. See systemd.timer(5).
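
A rough sketch of what that looks like: a timer unit activates the service unit of the same name, so a cron-style job needs a pair of files. Everything named nightly-job here (the units, the script path, and the helper function write_timer_units) is hypothetical.

```shell
# Sketch: write a hypothetical service/timer pair into the directory
# given as $1. nightly-job and /usr/local/bin/nightly-job are made up.
write_timer_units() {
    cat > "$1/nightly-job.service" <<'EOF'
[Unit]
Description=Hypothetical nightly job

[Service]
Type=oneshot
ExecStart=/usr/local/bin/nightly-job
EOF

    cat > "$1/nightly-job.timer" <<'EOF'
[Unit]
Description=Run nightly-job.service once a day

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

# Usage (as root):
#   write_timer_units /etc/systemd/system
#   systemctl daemon-reload
#   systemctl enable nightly-job.timer
#   systemctl start nightly-job.timer
#   systemctl list-timers
```

Note that it is the timer that gets enabled and started, not the service. Persistent=true asks systemd to run a missed job at the next boot if the machine was off at the scheduled time.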

Mount

systemd handles mounting filesystems. See systemd.mount(5). It can do automounting, and includes various additions to the traditional /etc/fstab syntax.
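
For illustration, here is a minimal sketch of a native mount unit. The device label, the mount point /mnt/data, and the helper write_mount_unit are all assumptions. The one hard rule is that the unit's file name must match the mount point path, so /mnt/data becomes mnt-data.mount.

```shell
# Sketch: write a hypothetical mount unit into the directory given as $1.
# The device label "data" and mount point /mnt/data are made up; the
# file name mnt-data.mount is derived from the Where= path.
write_mount_unit() {
    cat > "$1/mnt-data.mount" <<'EOF'
[Unit]
Description=Hypothetical data volume

[Mount]
What=/dev/disk/by-label/data
Where=/mnt/data
Type=ext4
Options=defaults,noatime

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root):
#   write_mount_unit /etc/systemd/system
#   systemctl daemon-reload
#   systemctl start mnt-data.mount
```

The fstab route still works too. One of the systemd additions is the x-systemd.automount option, which, combined with noauto on an ordinary fstab line, mounts the filesystem on first access.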

References