Tuesday, November 22, 2011

How does the Linux kernel detect and plug in your hardware? Kernel, sysfs, udev and D-Bus collaboration


I have been administering Linux systems for a while now and have always struggled to „dig deeper“. Today I found myself wondering how Linux detects and plugs in my hardware and shows that pop-up window asking what I want to do with my flash drive. So I launched my web browser and began searching for answers in forums, tutorials and how-tos, which almost ended in complete failure. I say „almost“ because I did find some of the answers, but they were all scattered, incomplete or too old. So I had to bring out the „heavy artillery“ and read through all those manuals… And I think I finally get how it works :) This is what I will try to explain below. *I really hope I didn’t misunderstand something*
Everything starts at the kernel. Operating systems use privilege rings:
In computer science, hierarchical protection domains, often called protection rings, are a mechanism to protect data and functionality from faults (fault tolerance) and malicious behaviour (computer security)
CPU privilege rings
These rings are provided by the CPU, not by the OS. An OS kernel runs in ring 0, the most privileged level, and can communicate directly with the hardware and the CPU. Rings 1 and 2 were meant for device drivers, and ring 3 is used for user-space applications (media players, web servers and anything else the user interacts with directly). Device drivers are a „bridge“ between user-space applications and hardware. Note that Linux does NOT use rings 1 and 2 (at least this is what I found out…), because Linux drivers are either compiled directly into the kernel or loaded as dynamic kernel modules (in both cases the drivers run in ring 0).
Now that we know where the kernel and the drivers live, we can move on.
The Linux kernel constantly watches all of your computer’s buses for changes and new hardware. Once a change is detected on any bus, the magic begins :)

1. Export hardware information to userspace (sysfs)

All buses and hardware information in the kernel are represented as objects (as in object-oriented programming). These objects (the hardware information) are exported to a virtual filesystem, sysfs, which is mounted at the /sys directory:
  • objects -> folders
  • attribute names -> file names
  • attribute values -> file contents
For example, take the file /sys/class/net/eth0/address, which contains the value 00:11:22:33:44:55. Here:
  • eth0/ – object
  • address – object attribute name
  • 00:11:22:33:44:55 – value of the attribute „address“.
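To make the folder/file mapping concrete, here is a minimal Python sketch that reads one sysfs object into a dictionary of attribute names and values. The helper name read_sysfs_object is my own, and eth0 is only an assumed interface name; your system may use a different one.

```python
from pathlib import Path


def read_sysfs_object(obj_dir):
    """Return {attribute name: attribute value} for one sysfs-style directory."""
    attributes = {}
    for entry in sorted(Path(obj_dir).iterdir()):
        if not entry.is_file():
            continue  # directories are child objects, not attributes
        try:
            attributes[entry.name] = entry.read_text().strip()
        except (OSError, UnicodeDecodeError):
            continue  # some attributes are write-only or binary
    return attributes


if __name__ == "__main__":
    iface = Path("/sys/class/net/eth0")  # assumed interface name
    if iface.is_dir():
        for name, value in read_sysfs_object(iface).items():
            print(f"{name} = {value}")
```

The function works on any directory laid out like a sysfs object, so it can be tried on a scratch directory as well as on /sys.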

2. Notify userspace tools that hardware is available (uevent and udevd)

Now that the device information is available, the kernel driver core can notify userspace tools about device initialization or removal. Because ring 0 (kernel) and ring 3 (userspace) can’t communicate with each other directly, the kernel sends out uevents over netlink to interested userspace tools (such as udevd). A uevent (userspace event) is a message sent from the kernel (ring 0) to userspace (ring 3).
Udevd is just a daemon that stands between the kernel and the rest of the udev system and performs some important functions (I’ll mention them later). The udev daemon (udevd) is launched at boot; it reads and parses all the rules found in /etc/udev/rules.d/ and keeps them in memory for later use by udev. After that, udevd starts listening on the netlink socket for uevents coming from the kernel driver core. A received uevent looks like this:
recv(4, // socket id
  "add@/class/input/input9/mouse2\0 // message
  ACTION=add\0 // action type
  DEVPATH=/class/input/input9/mouse2\0 // path in /sys
  SUBSYSTEM=input\0 // subsystem (class)
  SEQNUM=1064\0 // sequence number
  PHYSDEVPATH=/devices/pci0000:00/0000:00:1d.1/usb2/2-2/2-2:1.0\0 // device path in /sys
  PHYSDEVBUS=usb\0 // bus
  PHYSDEVDRIVER=usbhid\0 // driver
  MAJOR=13\0 // major number
  MINOR=34\0", // minor number
  2048, // message buffer size
  0) // flags
= 221 // actual message size
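As the trace shows, a uevent is just a header of the form action@devpath followed by NUL-separated KEY=VALUE pairs. Here is a small Python sketch of how such a message could be taken apart. The function name parse_uevent is my own, and the sample message is a shortened copy of the one above.

```python
def parse_uevent(raw):
    """Split a raw uevent into action, devpath and a dict of properties."""
    fields = [f.decode() for f in raw.split(b"\0") if f]
    header, properties = fields[0], {}
    for field in fields[1:]:
        key, _, value = field.partition("=")
        properties[key] = value
    action, _, devpath = header.partition("@")
    return action, devpath, properties


# Shortened copy of the message from the trace above.
raw = (b"add@/class/input/input9/mouse2\0"
       b"ACTION=add\0"
       b"DEVPATH=/class/input/input9/mouse2\0"
       b"SUBSYSTEM=input\0"
       b"SEQNUM=1064\0"
       b"MAJOR=13\0"
       b"MINOR=34\0")

action, devpath, props = parse_uevent(raw)
print(action, devpath, props["SUBSYSTEM"], props["MAJOR"], props["MINOR"])
# prints: add /class/input/input9/mouse2 input 13 34
```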

3. Process uevents, match them against rules in /etc/udev/rules.d/ and populate /dev directory (udevd and udev)

Because uevents can be sent through netlink very quickly, they may arrive in the wrong order, and the kernel can also send duplicate uevents. Udevd must therefore:
  • ensure the right event order
  • remove duplicate uevents
After removing duplicate uevents and sorting the rest into the right order, udevd passes them to a udev event process.
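The two duties above can be sketched in a few lines of Python. This is only an illustration under my own assumption that the SEQNUM field uniquely identifies an event; it is not how udevd itself is implemented.

```python
def order_and_dedupe(events):
    """Sort uevents by SEQNUM and keep one event per sequence number."""
    seen = set()
    result = []
    for event in sorted(events, key=lambda e: int(e["SEQNUM"])):
        seq = int(event["SEQNUM"])
        if seq in seen:
            continue  # assumption: same SEQNUM means a duplicate
        seen.add(seq)
        result.append(event)
    return result


# Made-up events arriving out of order, one of them twice.
incoming = [
    {"SEQNUM": "1065", "ACTION": "add", "DEVPATH": "/class/input/input9/event4"},
    {"SEQNUM": "1064", "ACTION": "add", "DEVPATH": "/class/input/input9/mouse2"},
    {"SEQNUM": "1065", "ACTION": "add", "DEVPATH": "/class/input/input9/event4"},
]
ordered = order_and_dedupe(incoming)
print([e["SEQNUM"] for e in ordered])  # prints: ['1064', '1065']
```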
Udev performs the following actions before and after creating device nodes:
  • parse /etc/udev/rules.d
  • match the uevent against the udev rules (identify the device)
  • collect information from sysfs
  • populate the /dev directory (create device files)
  • add symlinks to devices in /dev (for example, /dev/cdrom may be a symlink to /dev/sr0)
  • set permissions on devices
  • rename network interfaces
  • store device information in the udev database
  • load device drivers
  • notify userspace applications (e.g., dbus)
When udev finally receives a uevent from udevd, it parses the udev rules and matches the information in the uevent against them (device identification). Udev also gathers the device’s major and minor numbers (and other useful information) from sysfs. Then, based on the udev rules and the information gathered from sysfs, it populates the /dev directory with device nodes representing the currently available hardware, creates symlinks to devices, sets device permissions and stores all the information in the udev database.
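The matching step can be illustrated with a toy model in Python. The rule format below is hypothetical and much simpler than real udev rule files: each rule has some match keys with shell-style patterns and some values to assign when all of them match.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: "match" keys are compared against the uevent,
# "assign" values are applied when every match key fits.
RULES = [
    {"match": {"SUBSYSTEM": "input", "DEVPATH": "*/mouse*"},
     "assign": {"SYMLINK": "input/example-mouse", "MODE": "0640"}},
    {"match": {"SUBSYSTEM": "block"},
     "assign": {"MODE": "0660"}},
]


def apply_rules(event, rules=RULES):
    """Collect the assignments of every rule whose match keys all fit."""
    result = {}
    for rule in rules:
        if all(fnmatchcase(event.get(key, ""), pattern)
               for key, pattern in rule["match"].items()):
            result.update(rule["assign"])
    if "MAJOR" in event and "MINOR" in event:
        # The device node is identified by its (major, minor) pair.
        result["DEVNUM"] = (int(event["MAJOR"]), int(event["MINOR"]))
    return result


mouse_event = {"ACTION": "add", "SUBSYSTEM": "input",
               "DEVPATH": "/class/input/input9/mouse2",
               "MAJOR": "13", "MINOR": "34"}
print(apply_rules(mouse_event))
```

The sketch only computes what would be done; it does not create any device node, symlink or database entry.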

4. Load device drivers (udev, modprobe)

The kernel driver core sends a uevent to udev and sets the MODALIAS environment variable. The udev event process then runs the command modprobe $MODALIAS, and modprobe loads all modules whose aliases match this MODALIAS string. If the kernel did not send the MODALIAS variable, udev could still get this information from sysfs. For example, /sys/class/net/eth0/device/modalias on my PC contains pci:v000010ECd00008139sv00001734sd000010B8bc02sc00i00.
You could actually load ALL the drivers for this device by simply executing the shell command: modprobe pci:v000010ECd00008139sv00001734sd000010B8bc02sc00i00
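To show what is inside such a string and how wildcard matching against module aliases can work, here is a rough Python sketch. The field names in the regular expression and the alias table with its module names are my own illustrative assumptions, not data read from a real system.

```python
import re
from fnmatch import fnmatchcase

# Fields of a PCI modalias: vendor, device, subsystem vendor/device,
# base class, subclass and programming interface (all hexadecimal).
PCI_MODALIAS = re.compile(
    r"pci:v(?P<vendor>[0-9A-F]{8})d(?P<device>[0-9A-F]{8})"
    r"sv(?P<subvendor>[0-9A-F]{8})sd(?P<subdevice>[0-9A-F]{8})"
    r"bc(?P<base_class>[0-9A-F]{2})sc(?P<sub_class>[0-9A-F]{2})"
    r"i(?P<interface>[0-9A-F]{2})"
)

# Hypothetical alias table: wildcard pattern -> made-up module name.
MODULE_ALIASES = {
    "pci:v000010ECd00008139sv*sd*bc*sc*i*": "example_nic_driver",
    "pci:v00008086d*sv*sd*bc03sc*i*": "example_gpu_driver",
}


def parse_pci_modalias(modalias):
    """Split a PCI modalias string into its named fields."""
    match = PCI_MODALIAS.fullmatch(modalias)
    if match is None:
        raise ValueError(f"not a PCI modalias: {modalias!r}")
    return match.groupdict()


def modules_for(modalias, aliases=MODULE_ALIASES):
    """Return every module whose alias pattern matches the modalias."""
    return [module for pattern, module in aliases.items()
            if fnmatchcase(modalias, pattern)]


modalias = "pci:v000010ECd00008139sv00001734sd000010B8bc02sc00i00"
fields = parse_pci_modalias(modalias)
print(fields["vendor"], fields["device"], modules_for(modalias))
```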

5. Notify userspace applications (through D-bus)

D-Bus diagram
Now that all the hardware is set up and the drivers are loaded, the system can notify userspace and GUI applications about the newly available hardware, so your media player can offer to play MP3 files from the USB disk you attached, or your file manager can ask whether you want to open the files on that same disk.
On Linux systems the most common tool for this task is D-Bus, an IPC (inter-process communication) mechanism. Any application can subscribe to D-Bus signals for the hardware events it is interested in.
A diagram best explains how D-Bus works.
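The subscription idea itself can also be shown with a toy, in-process example in Python. This is NOT the D-Bus API; the class ToyBus, the signal names and the device path are all made up to illustrate the publish/subscribe pattern only.

```python
from collections import defaultdict


class ToyBus:
    """In-process stand-in for a message bus; not the D-Bus API."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, signal_name, callback):
        """Register a callback for one kind of signal."""
        self._subscribers[signal_name].append(callback)

    def emit(self, signal_name, **details):
        """Deliver a signal to everyone who subscribed to it."""
        for callback in self._subscribers[signal_name]:
            callback(**details)


bus = ToyBus()
opened = []

# A "file manager" that only cares about newly added devices.
bus.subscribe("DeviceAdded", lambda device: opened.append(device))

bus.emit("DeviceAdded", device="/dev/sdb1")    # delivered
bus.emit("DeviceRemoved", device="/dev/sdb1")  # nobody subscribed
print(opened)  # prints: ['/dev/sdb1']
```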
And finally, a diagram to illustrate everything that I have been trying to explain in this article :)
Linux device (hot)plugging diagram
If you notice any errors or disagree with something I wrote here, please contact me through email, comments or any other means :) Good luck.
