Friday, July 9, 2010

Basics of initrd

Many Linux distributions ship a single, generic kernel image that is intended to boot on as wide a variety of hardware as possible. The device drivers for this generic kernel image are included as loadable modules, because statically compiling them all into one kernel would make it too large to boot on computers with limited memory or from lower-capacity media such as floppy disks.

This then raises the problem of detecting and loading the modules necessary to mount the root file system at boot time (or, for that matter, deducing where or what the root file system is).

To further complicate matters, the root file system may be on a software RAID volume, LVM, NFS (on diskless workstations), or on an encrypted partition. All of these require special preparations to mount.

Another complication is kernel support for hibernation, which suspends the computer to disk by dumping an image of the entire system to a swap partition or a regular file, then powering off. On the next boot, this image has to be made accessible before it can be loaded back into memory.

To avoid having to hardcode handling for so many special cases into the kernel, an initial boot stage with a temporary root file system (now dubbed early user space) is used. This root file system contains user-space helpers that do the hardware detection, module loading and device discovery necessary to get the real root file system mounted.

An image of this initial root file system (along with the kernel image) must be stored somewhere accessible by the Linux bootloader or the boot firmware of the computer. This can be:
• The root file system itself
• A boot image on an optical disc
• A small ext2 or FAT-formatted partition on a local disk (a boot partition)
• A TFTP server (on systems that can boot from Ethernet)

The bootloader will load the kernel and initial root file system image into memory and then start the kernel, passing in the memory address of the image. At the end of its boot sequence, the kernel tries to determine the format of the image from its first few blocks of data:
• In the initrd scheme, the image may be an (optionally compressed) file system image, which is made available in a special block device (/dev/ram) that is then mounted as the initial root file system. The driver for that file system must be compiled statically into the kernel. Many distributions originally used compressed ext2 file system images. Others (including Debian 3.1) used cramfs in order to boot on memory-limited systems, since the cramfs image can be mounted in-place without requiring extra space for decompression.
Once the initial root file system is up, the kernel executes /linuxrc as its first process. When it exits, the kernel assumes that the real root file system has been mounted and executes "/sbin/init" to begin the normal user-space boot process.
• In the initramfs scheme (available in Linux 2.6.13 onwards), the image may be an (optionally compressed) cpio archive. The archive is unpacked by the kernel into a special instance of a tmpfs that becomes the initial root file system. This scheme has the advantage of not requiring an intermediate file system or block drivers to be compiled into the kernel.
On an initramfs, the kernel executes /init as its first process. /init is not expected to exit.
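
To make the /init hand-off concrete, here is a minimal sketch of what such a script might look like if written for a POSIX shell. It is not what any particular distribution ships: the root= parsing helper, the /sysroot mount point, and the presence of a shell, mount and switch_root inside the image are all assumptions made for illustration, and the module-loading step is left as a comment.

```shell
#!/bin/sh
# Hypothetical /init for an initramfs (illustration only).

# Print the value of the last root= parameter found in a kernel
# command line string, or an empty line if there is none.
root_from_cmdline() (
    set -f          # no filename expansion while splitting the string
    root=
    for arg in $1; do
        case "$arg" in
            root=*) root=${arg#root=} ;;
        esac
    done
    printf '%s\n' "$root"
)

main() {
    # Pseudo file systems the helpers need.
    mount -t proc proc /proc
    mount -t sysfs sysfs /sys

    root_dev=$(root_from_cmdline "$(cat /proc/cmdline)")

    # A real script would load the storage and file system modules here
    # (for example with insmod) before the device can be mounted.

    mount -o ro "$root_dev" /sysroot

    # exec replaces this process, so PID 1 becomes the real init
    # and /init itself never returns.
    exec switch_root /sysroot /sbin/init
}

# Only run the boot sequence when started by the kernel as /init.
if [ "$$" -eq 1 ] && [ "${0##*/}" = init ]; then
    main
fi
```

Because the last step uses exec, the real init takes over the same process, which is how /init manages never to exit.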

Depending on which algorithms were compiled statically into it, the kernel can currently unpack initrd/initramfs images compressed with gzip, bzip2 and LZMA.
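
The kernel's check can be imitated from user space by looking at the first few bytes of an image. The sketch below is only an approximation, not the kernel's actual logic: the function names are made up for this example, the LZMA test is a heuristic (the legacy format has no true signature, though files commonly begin with 5d 00 00), and the ext2 test matches any ext2-family file system.

```shell
#!/bin/sh
# Print COUNT bytes of FILE, starting at byte OFFSET, as plain hex digits.
# Prints nothing if the file is too short to reach OFFSET.
hex_at() {
    # $1 = file, $2 = offset, $3 = count
    od -An -tx1 -j "$2" -N "$3" "$1" 2>/dev/null | tr -d ' \n'
}

# Guess the format of an initrd/initramfs image from its leading bytes.
image_format() {
    case "$(hex_at "$1" 0 6)" in
        1f8b*)    echo gzip ;;     # gzip signature
        425a68*)  echo bzip2 ;;    # ASCII "BZh"
        5d0000*)  echo lzma ;;     # heuristic, see the note above
        303730373031|303730373032)
                  echo cpio ;;     # ASCII "070701" or "070702" (newc cpio)
        *)
            # The ext2 superblock magic 0xEF53 is stored little-endian
            # at byte offset 1080 of the file system image.
            if [ "$(hex_at "$1" 1080 2)" = 53ef ]; then
                echo ext2
            else
                echo unknown
            fi
            ;;
    esac
}
```

For example, image_format /boot/initrd.img reports the outer layer only; a gzip result means the image has to be decompressed before the cpio or ext2 check says anything useful.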

The initial RAM disk (initrd) is an initial root file system that is mounted before the real root file system is available. The initrd is bound to the kernel and loaded as part of the kernel boot procedure. The kernel then mounts this initrd as the first stage of a two-stage boot process, loading the modules needed to make the real file systems available and reach the real root file system.

The initrd contains a minimal set of directories and executables to achieve this, such as the insmod tool for loading modules into the kernel.

In the case of desktop or server Linux systems, the initrd is a transient file system. Its lifetime is short, only serving as a bridge to the real root file system. In embedded systems with no mutable storage, the initrd is the permanent root file system.

The initrd image contains the necessary executables and system files to support the second-stage boot of a Linux system.

The method for creating the initial RAM disk varies with the version of Linux you are running. Prior to Fedora Core 3, the initrd was constructed using the loop device. The loop device is a device driver that allows you to mount a file as a block device and then interpret the file system it represents. The loop device may not be present in your kernel, but you can enable it through the kernel's configuration tool (make menuconfig) by selecting Device Drivers > Block Devices > Loopback Device Support. You can inspect the initrd through the loop device as follows (your initrd file name will vary):

Inspecting the initrd (prior to FC3):
# mkdir temp ; cd temp
# cp /boot/initrd.img.gz .
# gunzip initrd.img.gz
# mkdir -p /mnt/initrd
# mount -t ext2 -o loop initrd.img /mnt/initrd
# ls -la /mnt/initrd

You can now inspect the /mnt/initrd subdirectory for the contents of the initrd. Note that even if your initrd image file name does not end with the .gz suffix, it is still a compressed file; add the .gz suffix so that gunzip will accept it.

Beginning with Fedora Core 3, the default initrd image is a compressed cpio archive. Instead of mounting the image through the loop device, you extract it with cpio. To inspect the contents of the archive, use the following commands:

Inspecting the initrd (RHEL 4, FC3 or later):
# mkdir temp ; cd temp
# cp /boot/initrd-2.6.14.2.img initrd-2.6.14.2.img.gz
# gunzip initrd-2.6.14.2.img.gz
# cpio -i --make-directories < initrd-2.6.14.2.img

The result is a small root file system, as shown in Listing 3. A small but necessary set of applications is present in the ./bin directory, including nash ("not a shell", a script interpreter), insmod for loading kernel modules, and lvm (logical volume manager tools).

Listing 3. Default Linux initrd directory structure

# ls -la
drwxr-xr-x 10 root root 4096 May 7 02:48 .
drwxr-x--- 15 root root 4096 May 7 00:54 ..
drwxr-xr-x 2 root root 4096 May 7 02:48 bin
drwxr-xr-x 2 root root 4096 May 7 02:48 dev
drwxr-xr-x 4 root root 4096 May 7 02:48 etc
-rwxr-xr-x 1 root root 812 May 7 02:48 init
-rw-r--r-- 1 root root 1723392 May 7 02:45 initrd-2.6.14.2.img
drwxr-xr-x 2 root root 4096 May 7 02:48 lib
drwxr-xr-x 2 root root 4096 May 7 02:48 loopfs
drwxr-xr-x 2 root root 4096 May 7 02:48 proc
lrwxrwxrwx 1 root root 3 May 7 02:48 sbin -> bin
drwxr-xr-x 2 root root 4096 May 7 02:48 sys
drwxr-xr-x 2 root root 4096 May 7 02:48 sysroot
#
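
Going the other way, a directory tree like the one above can be packed back into a compressed cpio archive. This is a rough sketch that assumes a cpio with support for the newc format (GNU cpio has it); the function names are hypothetical, and distribution tools such as mkinitrd do considerably more work.

```shell
#!/bin/sh
# Pack the directory tree TREE into a gzip-compressed newc cpio archive.
# OUT should live outside TREE, otherwise the archive would include itself.
pack_initramfs() (
    # $1 = TREE, $2 = OUT
    out=$(cd "$(dirname "$2")" && pwd)/$(basename "$2")
    cd "$1" || exit 1
    # Names are fed to cpio relative to the tree root, so the archive
    # unpacks directly into the root of the initial file system.
    find . | cpio -o -H newc 2>/dev/null | gzip -9 > "$out"
)

# List the files in a gzip-compressed cpio image without extracting it.
list_initramfs() {
    gunzip -c "$1" | cpio -it 2>/dev/null
}
```

A call such as pack_initramfs ./temp /tmp/initrd-test.img followed by list_initramfs /tmp/initrd-test.img should show the same names that were in the tree. Note that the extracted directory in the listing still contains the copied initrd-2.6.14.2.img file, which you would want to remove before repacking.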