Wednesday, November 11, 2009

Faster booting with Upstart

A good portion of the boot time on current Linux systems is spent on system initialisation and on starting dozens of daemons sequentially. The Ubuntu 9.10 development team have started to parallelise and accelerate the boot process through the large-scale use of Upstart.

This article originally appeared in c't magazine 9/09, p. 176

Loading the Linux kernel takes up just a fraction of the time spent waiting for the login prompt during booting. The system spends most of its time sitting waiting for init (which has its origins in Unix System V) to cycle through the various runlevels, during which it runs innumerable sequences of init scripts.

In Ubuntu and Fedora, Upstart has long replaced the traditional SysV init.

The current boot sequence, with services starting consecutively in a fixed sequence, remains unaltered simply because no-one has sat down and adapted the init scripts for these various services to the capabilities of Upstart.

Upstart simply emulates the SysV init runlevels (which actually no longer exist on systems running Upstart) and continues to call the old init scripts.

For Ubuntu 9.10, the development team have finally started to convert some services to Upstart.

Upstart, like SysV init, is the first process to be launched by the kernel (with process ID 1) as soon as the latter has booted and any boot scripts from the initial ramdisk (initrd) have been run.

For SysV init, the lynchpin of system initialisation is the /etc/inittab file. This is where SysV init finds the default runlevel, the name of the first initialisation script and the commands for initialising each runlevel.

During runlevel initialisation, the linked init scripts in the relevant runlevel directory (e.g. /etc/rc5.d) are run sequentially.
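The sequential pass described above can be sketched in a few lines of shell. This is a simplified illustration rather than the actual rc script of any distribution; run_rc_dir is a hypothetical helper name, and the directory to process is passed in as an argument.

```shell
#!/bin/sh
# Sketch only: run_rc_dir is a hypothetical helper, not part of SysV init.
# It runs every S* entry in a directory, one after another, in the
# lexical order produced by the shell's glob expansion.
run_rc_dir() {
    dir=$1
    for script in "$dir"/S*; do
        [ -f "$script" ] || continue   # skip an unmatched glob or non-files
        sh "$script" start             # blocks until the script returns
    done
}
```

Because each call blocks, a single slow script holds up everything that sorts after it.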

For this to work, all services must run in the background as daemons and decouple themselves from the console, since the init script would otherwise hang until the service had terminated.

This decoupling makes it cumbersome for init to determine whether a service is still running or has terminated of its own accord.

Generally, the daemon saves a file containing its process ID (PID) in /var/run, leaving it to the init script to determine whether that PID still belongs to the daemon in question.
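A minimal sketch of such a PID file check is shown below. The helper name pidfile_alive is hypothetical, and the check is deliberately crude: it only tests whether some process with the stored PID exists, not whether that process really is the daemon, and it assumes the caller is allowed to signal the process (init scripts normally run as root).

```shell
#!/bin/sh
# Sketch only: pidfile_alive is a hypothetical helper name.
# Returns 0 if the PID stored in the given file belongs to a live process.
pidfile_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1        # no PID file: assume not running
    pid=$(head -n 1 "$pidfile")          # the first line holds the PID
    case $pid in
        ''|*[!0-9]*) return 1 ;;         # empty or not a number
    esac
    kill -0 "$pid" 2>/dev/null           # signal 0 only tests for existence
}
```

A stale PID file whose number has since been reused by another process would pass this test, which is one reason the approach is unreliable.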

One of the final init scripts launches the GUI. Once SysV init has run all of its scripts, services listed in the /etc/inittab file, such as login consoles, are then launched and monitored.

Upstart, by contrast, is event-oriented and works using 'jobs', with each job file in the /etc/init directory being responsible for launching a service or for a specific component of system initialisation.

There is no fixed sequence; instead each job specifies the events to which it will react. When an event occurs, Upstart starts all jobs that have been waiting for this event, in parallel.

Upstart generates the first event, startup, automatically when it is called. Each job also generates a started Jobname and stopped Jobname event on starting and terminating respectively.

Various jobs take an interest in the startup event in Ubuntu 9.10, including hostname, which sets up the computer name.

The associated job file is /etc/init/hostname.conf. The following example shows a very simplified version of this job:
start on startup
exec hostname -b -F /etc/hostname

The keyword start on specifies the event which will trigger this job. If the job is to be triggered by multiple events, these events must be logically linked:
start on (runlevel [016]
          and (stopped gdm
               or stopped kdm
               or stopped xdm))

Unlike in Ubuntu 9.04, there can no longer be multiple start on lines. If the job subsequently needs to be terminated, stop on is used to define additional stop events which in turn cause the job to be stopped.

If a job needs to be started or stopped manually, this can be achieved using initctl start Jobname and initctl stop Jobname.

The name of the program which the job should run follows the keyword exec. One major difference between Upstart and SysV init is that, with SysV init, services started from init scripts must always run in the background, since they would otherwise cause init to hang.

Upstart, by contrast, expects the process following the exec statement to run in the foreground, since Upstart only considers the job to be running for as long as this process is running.

If a process started using exec ends, Upstart considers the job to have ended and waits for another suitable event to occur (waiting).

Upstart notes the status of each job listed in /etc/init. This information can be viewed using the initctl list and initctl status Jobname commands.

Events as the key

Upstart's event-based design differs fundamentally from that used by SysV init, in which init scripts are stubbornly called in the lexical sequence in the relevant runlevel directory. This allows Upstart to be much more flexible.

If, for example, there is no network connection present when the mail daemon (MTA) is launched, SysV init has to wait for the time-out before continuing to boot the system.

In Upstart, the MTA is only started once the network connection has been established. The network up event is set as the start event for the MTA job.

This event is triggered by the service responsible for network set-up only after the network has been successfully configured – in the case of a laptop out on the road, for example, this means not at all.

Because jobs do not have a fixed start sequence, the system is able to boot faster with Upstart than with SysV init.

In addition, it is possible to process multiple jobs simultaneously, so that initialisation tasks can be executed in parallel. This also represents a potential time saving.
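The effect of running independent tasks side by side can be illustrated with plain shell job control. This is only an analogy and not how Upstart works internally; run_parallel is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: run_parallel is a hypothetical helper, not an Upstart command.
# Starts every command given as an argument in the background, then waits
# for all of them, so the total time is roughly that of the slowest task.
run_parallel() {
    for task in "$@"; do
        sh -c "$task" &
    done
    wait    # returns once every background task has finished
}
```

Called as run_parallel "sleep 2" "sleep 3", it should return after about three seconds rather than five.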

A major change in Ubuntu 9.10 is that Upstart has finally been brought creaking to life, now that many system settings and calls to daemons have been converted to it.

These include mounting drives and starting udev, cron and the GUI. However, Ubuntu is not yet quite at the point of being able to do away with the old init scripts altogether, and there is still a wrapper job, rc, which emulates the SysV init runlevels and calls the init scripts in /etc/init.d via the links in /etc/rc?.d.

Shortly after the kernel has been started and the first pseudo-file system mounted, the /etc/init/rc-sysinit.conf job is started.

The start event for this job is filesystem and its role is to generate the runlevel S event, thereby setting the equivalent of runlevel S. The /etc/init/rcS.conf job then starts; it processes the symbolic links in /etc/rcS.d, taking care of basic system configuration.

The signal for runlevel 2, to which further Upstart jobs are coupled, is then generated and the init scripts linked in /etc/rc2.d are executed.

The /etc/inittab file – which on systems using SysV init deals with tasks such as starting the login console to complete initialisation – is replaced by a number of Upstart jobs, as illustrated by the following extract:
start on stopped rc RUNLEVEL=[2345]
stop on runlevel [!2345]
respawn
exec /sbin/getty -8 38400 tty1

Since the rc job, which emulates the runlevels, terminates after calling the init scripts for the relevant runlevel and is not running during the emulated runlevel, the start signal for the tty jobs is the termination of the rc job rather than, for example, its start.

In addition, the login console needs to be deactivated when shutting down the system (runlevels 0 and 6) and in single user mode.

Getty itself runs in the foreground and, since it is called using the keyword exec, is monitored by Upstart. The additional keyword respawn tells Upstart to keep restarting the process if it terminates. This causes a new login prompt to be displayed after logging off.

To prevent the system from being overloaded by a process constantly restarting, the frequency with which Upstart tries to start a service over a particular period can be limited. Fedora 10 uses this feature in the prefdm job to launch the GUI.
start on stopped rc5
stop on runlevel [!5]
respawn
respawn limit 10 120
exec /etc/X11/prefdm --nodaemon

The limit for restarting is set to 10 attempts within 120 seconds, for which both the respawn and respawn limit commands are required. respawn limit on its own does not cause the service to restart.
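The idea behind a restart limit can be sketched as a shell loop. This only imitates the principle; Upstart's own bookkeeping may differ in detail, and respawn_limited is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: respawn_limited is a hypothetical helper that imitates the
# idea behind "respawn" plus "respawn limit COUNT INTERVAL".
# Usage: respawn_limited COUNT INTERVAL command [args...]
respawn_limited() {
    limit=$1 interval=$2
    shift 2
    window_start=$(date +%s)
    respawns=0
    while :; do
        "$@"                                 # runs in the foreground until it exits
        now=$(date +%s)
        if [ $((now - window_start)) -ge "$interval" ]; then
            window_start=$now                # interval elapsed: start a new window
            respawns=0
        fi
        respawns=$((respawns + 1))
        if [ "$respawns" -gt "$limit" ]; then
            echo "respawning too fast, giving up" >&2
            return 1
        fi
    done
}
```

With a limit of 10 and an interval of 120, a command that exits immediately is restarted ten times and then abandoned, while a command that only dies occasionally keeps being restarted indefinitely.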

No backchat

When it comes to terminating a job, Upstart concerns itself only with the process started in the foreground using exec.

It sends the terminate signal (SIGTERM) and expects the process to quit. Upstart does not tolerate dissent – if the process fails to terminate, it is summarily terminated a few seconds later using the kill signal (SIGKILL). 
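The same two-stage sequence can be reproduced by hand in shell. The sketch below assumes nothing about the length of the grace period Upstart actually uses, which is why it is a parameter; stop_process is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: stop_process is a hypothetical helper that mirrors the
# TERM-then-KILL sequence described above.
# Usage: stop_process PID GRACE_SECONDS
stop_process() {
    pid=$1 grace=$2
    kill -TERM "$pid" 2>/dev/null || return 0   # process is already gone
    waited=0
    while kill -0 "$pid" 2>/dev/null; do        # still there?
        if [ "$waited" -ge "$grace" ]; then
            kill -KILL "$pid" 2>/dev/null       # cannot be caught or ignored
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

A well-behaved process exits on SIGTERM and the loop ends early; one that ignores SIGTERM is killed once the grace period has run out.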

Once a stop event has been triggered, a service can neither block it, delay it nor roll it back.

The keywords pre-stop and post-stop can be used to specify commands which Upstart should execute before and after terminating the service. This is useful for any clean-up work, for example:
post-stop exec rm -f /var/run/

Figure: Time course of an Upstart job.

There are also pre-start and post-start keywords, which can be used to specify commands to be executed immediately before and after starting a service, such as creating required directories before the service starts or adjusting certain system settings afterwards.

Since exec expects the service to be started in the foreground, Upstart can't wait for the service to be terminated before executing post-start.

Upstart therefore executes post-start in parallel with starting the service. The diagram on the right illustrates the sequence of events when starting and ending an Upstart job.

In the above examples, all our commands have been called using exec. It is, however, also possible to use a block of commands book-ended by the keywords script and end script.
pre-start script
  if [ ! -e /var/run/tserv ]; then
    mkdir -p /var/run/tserv
  fi
end script

Where, instead of using exec, a service application is started using a block of commands, it is worth bearing in mind that the commands following the command for calling the service will only be executed if the service terminates immediately after being called.

If the service is subsequently terminated by means of a terminate signal from Upstart, these commands will not be executed.

Migrating init scripts

Its event-based design makes Upstart particularly useful for services which need to react to outside influences, such as VDR, which turns the computer into a digital video recorder.

On a desktop, it can reasonably be assumed that a DVB receiver card, once fitted into the machine, will always be present.

By contrast, on a laptop which is to be used as a DVB-T TV with a USB receiver only intermittently plugged in, this is not necessarily the case.

In this case, we only want VDR to run when the DVB-T receiver is connected. To achieve this, we have to replace the init script with an Upstart job.

The VDR package's own init script needs to be deactivated to prevent it from inopportunely sticking its oar in.

In Ubuntu, this can be achieved temporarily until the VDR package is next updated by using update-rc.d:
update-rc.d -f vdr remove

For Upstart to be aware that a DVB receiver has been connected, a udev rules file which triggers an Upstart event needs to be added under /etc/udev/rules.d:
SUBSYSTEM=="dvb", SUBSYSTEMS=="usb", ACTION=="add", \
  KERNEL=="dvb*.dvr0", RUN+="/sbin/initctl \
  --quiet emit --no-wait -e UDEV_KERNEL=$kernel \
  -e UDEV_DEVPATH=$devpath dvb-device-add"

This udev rule applies only to USB DVB devices which create a DVB output device with the kernel designation dvbX.dvr0.

If the kernel reports the presence of such a device, udev uses initctl emit to generate the dvb-device-add Upstart event. The --quiet parameter tells initctl to skip the usual status messages.

All the other parameters relate to the initctl command emit. Initctl normally waits until the triggered event has been processed in full.

For a service, this means that the initctl call does not return until the service has stopped. This is avoided by using the --no-wait parameter, which terminates initctl as soon as the Upstart event has been triggered.

The -e parameter permits environment variables to be passed to the Upstart job, in this case UDEV_KERNEL containing the device's kernel name and UDEV_DEVPATH containing the path to the device tree under Sysfs.

The dvb Upstart job (see listing at the end of this article) takes care of the dvb-device-add event. For each DVB device, it creates a file in the /var/run/dvb directory which stores the Sysfs path to the device tree.

This ensures that it is subsequently possible to trace which DVB device is associated with which USB device.
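The stored path is obtained with the shell's suffix-removal expansion. The following sketch isolates that step; usb_base_path is a hypothetical helper name, and the device path in the usage note is a made-up example value.

```shell
#!/bin/sh
# Sketch only: usb_base_path is a hypothetical helper name.
# ${VAR%PATTERN} removes the shortest match of PATTERN from the end of VAR,
# here the trailing "/dvb/<kernel name>" component of the device path.
usb_base_path() {
    devpath=$1 kernel=$2
    echo "${devpath%/dvb/${kernel}}"
}
```

Called as usb_base_path /devices/pci0000:00/usb1/1-1/dvb/dvb0.dvr0 dvb0.dvr0, it prints /devices/pci0000:00/usb1/1-1, which is the part of the path that belongs to the USB device itself.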

Finally, the job triggers the vdr-start Upstart signal.

The vdr Upstart job for calling VDR is trivial. It is triggered by the vdr-start and vdr-stop events. VDR also needs to run in runlevels 2 to 5 only.
start on vdr-start
stop on (vdr-stop
         or runlevel [!2345])
exec /usr/sbin/vdr-upstart

It is the vdr-upstart script which does the spadework when it comes to calling VDR. It checks whether VDR is activated in the /etc/default/vdr file, loads various configuration files and calls the runvdr start script in the foreground.

Vdr-upstart is based on the init script from the VDR package.

By adding the udev rule and the two Upstart jobs, we have ensured that VDR starts only if at least one DVB device is connected.

If more than one is connected, it doesn't matter. The dvb job always terminates after creating the file containing the sysfs path in /var/run/dvb and triggering the vdr-start Upstart event and is processed anew for each additional device.

VDR, by contrast, runs in the foreground. The job is therefore listed by initctl list with the status 'running'.

Upstart consequently ignores the vdr-start start signal, preventing multiple instances of VDR from being started by multiple DVB receivers.

Phantom devices

Stopping VDR turns out to be a lot more complicated than starting it. VDR needs to be terminated only when the final DVB receiver is removed.

As long as VDR is running, however, the program holds the DVB devices under /dev/dvb open, for which reason the kernel does not remove them even when the USB receiver has long been packed away – they are phantom devices.

As long as VDR is still running, therefore, no udev event corresponding to a DVB device being removed occurs.

On top of this, since kernel 2.6.29, the device's Sysfs tree is only cleared out when the last device is closed.

With VDR still running, a little educated guesswork is therefore required in order to realise that the DVB receiver has already been put away.

The solution is to use udev to monitor all removal events for USB devices:
SUBSYSTEMS=="usb", ACTION=="remove", \
 RUN+="/sbin/initctl --quiet emit --no-wait \
  -e UDEV_DEVPATH=$devpath device-remove"

The dvb Upstart job also reacts to the device-remove event and checks the device path of the just-removed USB device against the device paths of the USB DVB receivers stored in the /var/run/dvb directory.

If it finds a match, the job assumes that the DVB receiver in question has been removed and deletes the associated file in /var/run/dvb.
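The comparison relies on the shell's prefix-removal expansion. The sketch below shows the test on its own; path_is_below is a hypothetical helper name, and the paths in the usage note are made-up example values.

```shell
#!/bin/sh
# Sketch only: path_is_below is a hypothetical helper name.
# ${VAR#PATTERN} removes PATTERN from the start of VAR. If the result
# differs from VAR itself, VAR began with PATTERN, i.e. the stored base
# path is a prefix of the path of the device that was just removed.
path_is_below() {
    removed=$1 base=$2
    [ "${removed#"$base"}" != "$removed" ]
}
```

path_is_below /devices/usb1/1-1/1-1:1.0 /devices/usb1/1-1 succeeds, whereas a removed path of /devices/usb1/1-2 does not match. Being a pure string prefix test, it would also accept /devices/usb1/1-10, so a stricter check may be preferable on systems with many ports.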

Only once the final DVB receiver has been removed does the dvb job trigger the vdr-stop Upstart event. In response, VDR is stopped and udev clears out the DVB device entries under /dev/dvb.

Our example illustrates the flexibility which can be achieved using Upstart, but also how complicated adapting the old init scripts to the Upstart concept is.

It's likely to be a while yet before the last SysV init script is migrated to Upstart.

Upstart job dvb.conf

env RUNDIR=/var/run/dvb
start on (dvb-device-add
          or device-remove)
emits vdr-start vdr-stop
script
  case "$UPSTART_EVENT" in
    dvb-device-add)
      mkdir -p $RUNDIR
      # Redirect target missing from the source listing; one file per
      # device, named after its kernel name, is assumed here.
      echo ${UDEV_DEVPATH%/dvb/${UDEV_KERNEL}} >\
        $RUNDIR/${UDEV_KERNEL}
      /sbin/initctl --quiet emit --no-wait vdr-start
      ;;
    device-remove)
      if [ -d $RUNDIR ]; then
        for d in $RUNDIR/*; do
          if [ -f $d ]; then
            read basedev < $d
            if [ -z "$basedev" -o "${UDEV_DEVPATH#${basedev}}" != \
              "${UDEV_DEVPATH}" ]; then
              rm -f $d
            fi
          fi
        done
        rmdir --ignore-fail-on-non-empty $RUNDIR
        if [ ! -d $RUNDIR ]; then
          /sbin/initctl --quiet emit --no-wait vdr-stop
        fi
      fi
      ;;
  esac
end script
