Home of Rob
Improving Linux Boot Time
Improving the Linux boot-up time
By improving, I mean reducing. If you take a look at the links page on this site there's a link to the Ubuntu forums wiki giving information about many of the init scripts that are run when the kernel first enters userspace. However, there are quite a few things that happen before this. If you're running a normal desktop system then all you really need to worry about is the init scripts. For embedded Linux systems, though, it's important to get the system up and running as quickly as possible, and for this we need to get our hands dirty.
Hacking the kernel
To improve things in the stage before the init scripts are run we need to modify and then rebuild the kernel. This isn't really what you want to be doing if you're just using Linux on your desktop (well, maybe it is for some people...), but for the embedded scene it's required. And if you're writing your own device drivers for an embedded system (as I've been) then you'll probably be rebuilding the kernel anyway.
In this web page I'll concentrate on altering the kernel, and ignore the things you can do to the init scripts. Take a look at the link to the Ubuntu forums for help on the init scripts (it's fairly straightforward - you just need to delete the scripts you don't need). The focus is going to be on changes I've made to the kernel. It isn't the last word on these things, but it should help beginners.
Boot up background
First off, we need to go over some background on how the Linux kernel boots up in the first place.
The first thing that happens when the machine is turned on is that the bootloader runs. On our machine this is the Compulab ArmMon application and is out of our control. This application will:
- configure the CPU speed
- initialise the memory (setup registers, clear memory, determine onboard memory size etc.)
- turn on the caches
- set up the serial port for the boot console
- do hardware tests, or POST (power on self test)
Once the above has been executed the bootloader will locate the kernel image, decompress it and load it into memory (notes on doing this differently later). You can pass arguments to the kernel via the ArmMon interface; if any are present the bootloader loads these into memory too. It then jumps to the kernel entry point.
The bootloader's job is now done and we enter kernel startup:
The kernel configures the environment for the first C routine. The kernel entry point is an assembler routine (arch/arm/kernel/head.S) that turns on the MMU (memory management unit - responsible for mapping memory between physical and virtual addresses), initialises the cache and sets up the stack so that the first C routine can be invoked.
Once the above has been completed we enter the start_kernel() function, located in init/main.c. This function does a lot of things and never really finishes: once its setup work is done it becomes the idle task (process id 0).
One of the first things start_kernel() calls is setup_arch() - this is used to configure the specific architecture on which the kernel is running. It's obviously highly specific code, but the general features of this function include:
- recognise the processor (which flavour of ARM, for instance)
- recognise the board
- parse the command line parameters passed to the kernel
- identify the RAM disk
- call bootmem functions (the initial memory that the kernel can reserve for various purposes before paging steps in and grabs the rest of it).
- call the paging initialisation function (pages the rest of the memory for the system).
After this there are a number of initialisation routines, including:
- Exception initialisation (the trap_init() function) - before this the result of an exception is platform specific.
- Interrupt handling initialisation (the init_IRQ() function).
- Timer initialisation (time_init() function).
- Initialisation of the console ( console_init() ) - including printk.
- Calculate the delay for loops ( calibrate_delay() ) - this is used so that the udelay() function works throughout the kernel.
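As a rough picture of the order described above, here is a minimal userspace sketch in C. The function names match the kernel routines mentioned in the text, but the bodies are stand-ins that only record that they ran. The helpers record(), start_kernel_sketch() and boot_step_index() are made up for this illustration, and the real start_kernel() calls many more routines in between.

```c
#include <string.h>

#define MAX_STEPS 8

static const char *boot_log[MAX_STEPS];  /* names of the steps, in call order */
static int boot_steps;

/* Hypothetical helper: remember that a step ran. */
static void record(const char *name)
{
    if (boot_steps < MAX_STEPS)
        boot_log[boot_steps++] = name;
}

/* Stand-ins for the real kernel routines; each one only logs its name. */
static void setup_arch(void)      { record("setup_arch"); }
static void trap_init(void)       { record("trap_init"); }
static void init_IRQ(void)        { record("init_IRQ"); }
static void time_init(void)       { record("time_init"); }
static void console_init(void)    { record("console_init"); }
static void calibrate_delay(void) { record("calibrate_delay"); }

/* Runs the steps in the order given in the text; returns how many ran. */
int start_kernel_sketch(void)
{
    boot_steps = 0;
    setup_arch();
    trap_init();
    init_IRQ();
    time_init();
    console_init();
    calibrate_delay();
    return boot_steps;
}

/* Position of a step in the recorded order, or -1 if it never ran. */
int boot_step_index(const char *name)
{
    for (int i = 0; i < boot_steps; i++)
        if (strcmp(boot_log[i], name) == 0)
            return i;
    return -1;
}
```

Note how early setup_arch() sits in this order: it runs before the interrupt, timer and console initialisation.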
We've now configured everything that we need behind the scenes and the kernel now mounts the root filesystem, frees up any memory no longer needed and enters userspace. The kernel jumps to userspace by overlaying itself (using execve) with the executable image of a special program known as init. This normally resides in /sbin (though the user can specify a custom application to use with a command line parameter). The init program runs through the /etc/rcS.d init scripts, enters the appropriate runlevel (2 for us) and so on. Device driver loading and all that jazz is done here.
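The command line parameter in question is init=. The following is a simplified userspace sketch of how the program to run could be picked from the kernel command line. The function choose_init() is a made-up name for this illustration and is not the kernel's own code. The real kernel also falls back to a few other locations (/etc/init, /bin/init and finally /bin/sh) when /sbin/init can't be run, which the sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the value of an "init=" argument from a
 * space-separated command line into out, or "/sbin/init" if there is none.
 */
const char *choose_init(const char *cmdline, char *out, size_t outlen)
{
    const char *key = "init=";
    size_t klen = strlen(key);
    const char *p = cmdline;

    assert(outlen > 0);

    while (*p) {
        while (*p == ' ')               /* skip separators */
            p++;
        const char *end = p;
        while (*end && *end != ' ')     /* find the end of this argument */
            end++;

        /* Match only at the start of an argument, so "rdinit=" is ignored. */
        if ((size_t)(end - p) > klen && strncmp(p, key, klen) == 0) {
            size_t len = (size_t)(end - p) - klen;
            if (len >= outlen)
                len = outlen - 1;
            memcpy(out, p + klen, len);
            out[len] = '\0';
            return out;
        }
        p = end;
    }

    /* No init= argument: use the usual default. */
    strncpy(out, "/sbin/init", outlen - 1);
    out[outlen - 1] = '\0';
    return out;
}
```

For example, a command line of "console=ttyS0 init=/bin/sh quiet" selects /bin/sh, which is a handy way to get a shell on a board whose init scripts are broken.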
Where next?
So, getting back to the initial question, what can we do to improve boot times? Well, in my situation the reason for reducing the boot time is to get some text displayed on an LCD. If the user thinks that something is happening when the device is turned on then it matters less that he can't take control yet. Having nothing happening at all is very frustrating! To get something onto the LCD we can modify the kernel code so that it writes to the LCD during bootup. Obviously we need to wait until we've entered the C code (messing around in assembly is likely to go horribly wrong, plus it's difficult and unlikely to make any noticeable difference to the time before the LCD is filled with text). The earliest 'safe' location, in my view, is the setup_arch() routine. This is the place for architecture specific code and we can be reasonably sure that toggling a few GPIO pins isn't going to cause any huge problems.
The setup_arch() routine is in arch/arm/kernel/setup.c and the last thing it does is to make a call into the machine initialisation routine. This is defined in arch/arm/mach-pxa/cm-x270.c and is called cmx270_init(). It is in here that we can put the LCD initialisation code.
A note on memory usage: The linker script for any architecture has an init section, using __init_begin and __init_end to signify the limits of this area. The idea of this section is that it contains text and data that can be thrown away once they've been used. Driver initialisation is one example of use-and-throw functions; once a driver that's statically linked to the kernel has done its registration and initialisation, the function won't be invoked again and so it can be thrown away. Since all these functions are grouped together, the entire block of memory can be freed as one big chunk and will then be available to the memory manager as free pages. This is a particularly useful concept for the embedded engineer, where memory space is limited. A use-and-throw function is declared using the __init macro (and a use-and-throw variable using __initdata).
So all I did was to add my LCD initialisation code into the cm-x270.c file, within the cmx270_init() function. The code is used, the GPIO pins are toggled and the LCD displays some text. The code is then thrown away when no longer in use.
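To make the grouping idea concrete, here is a small userspace sketch, assuming GCC or Clang on an ELF system such as Linux. DEMO_INIT and DEMO_INITDATA are made-up stand-ins for the kernel's macros, and the section names and the LCD-flavoured function are hypothetical. It only shows how an attribute places code and data into named sections. Nothing here frees memory afterwards, which in the kernel is done by the kernel itself once booting has finished.

```c
/* Made-up stand-ins for the kernel's __init and __initdata macros. */
#define DEMO_INIT     __attribute__((section("demo_init_text")))
#define DEMO_INITDATA __attribute__((section("demo_init_data")))

/* Data only needed during initialisation goes into its own section. */
static int lcd_columns DEMO_INITDATA = 16;

/* State that must survive initialisation stays in the normal data area. */
static int lcd_ready;

/* One-shot setup routine, placed alongside other use-and-throw code. */
int DEMO_INIT lcd_setup_once(void)
{
    lcd_ready = 1;
    return lcd_columns;
}

int lcd_is_ready(void)
{
    return lcd_ready;
}
```

In real kernel code you would include <linux/init.h> and write something like static int __init my_setup(void) rather than defining the attributes yourself.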
While you're down there...
While we're messing around in the depths of the kernel code, we can make improvements to the startup time by removing bits of code that we'll never need. So grab those machetes and let's hack away!
In the cmx270_init() function there's initialisation for a monitor, the ethernet, the PCI bus and some audio stuff. I don't use any of this hardware, so removing these initialisations will improve the bootup time. If you find the structure platform_device *platform_devices[] __initdata you'll see definitions for all these items. If you're doing something similar then you can make a judgement as to which you don't need and comment out the corresponding code.
Note that if you remove the PCI from the kernel (remove the PCI drivers completely) then the kernel can fail to build. Well, it fails for me! The problem is this line in cmx270_init_irq() (cm-x270.c):
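As an alternative to commenting entries out by hand, the same trimming can be done with preprocessor switches. The sketch below is a standalone illustration and not the contents of cm-x270.c: the struct is a stand-in for the kernel's struct platform_device, and the BOARD_USES_* switches and demo device names are all hypothetical.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct platform_device. */
struct platform_device {
    const char *name;
};

/* Hypothetical switches: set to 1 only for hardware the board really uses. */
#define BOARD_USES_FLASH    1
#define BOARD_USES_ETHERNET 0
#define BOARD_USES_AUDIO    0

#if BOARD_USES_FLASH
static struct platform_device demo_flash_device = { "demo-flash" };
#endif
#if BOARD_USES_ETHERNET
static struct platform_device demo_eth_device = { "demo-eth" };
#endif
#if BOARD_USES_AUDIO
static struct platform_device demo_audio_device = { "demo-audio" };
#endif

/* Only the enabled devices end up in the table (keep at least one enabled). */
static struct platform_device *platform_devices[] = {
#if BOARD_USES_FLASH
    &demo_flash_device,
#endif
#if BOARD_USES_ETHERNET
    &demo_eth_device,
#endif
#if BOARD_USES_AUDIO
    &demo_audio_device,
#endif
};

size_t device_count(void)
{
    return sizeof(platform_devices) / sizeof(platform_devices[0]);
}

const char *device_name(size_t i)
{
    return i < device_count() ? platform_devices[i]->name : NULL;
}
```

The advantage over commenting out is that hardware can be switched back on in one place if the board design changes.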
IT8152_INTC_PDCNIMR = 0xffff;
The #define on the left references it8152_base_address but this hasn't been defined if CONFIG_PCI is off, so you can get round this by adding the line:
unsigned long it8152_base_address = CMX270_IT8152_VIRT;
OK, I know, it's a bit of a hack, and it's another reason for leaving the PCI stuff turned on (I'm not really sure what other consequences there are of putting in this line...), but I haven't seen any issues just yet.
If anyone else has the same problem and knows a better way to resolve it then let me know.
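One small refinement, offered as a suggestion rather than a tested fix: wrap the extra definition in a CONFIG_PCI check, so that it only exists when the PCI code isn't there to provide it and doesn't clash if PCI is turned back on later. The sketch below compiles on its own. The value given to CMX270_IT8152_VIRT here is a placeholder invented for the illustration, since the real one comes from the board headers.

```c
/*
 * Placeholder so the sketch compiles outside the kernel tree. In the kernel
 * this macro is already defined by the board headers, with a different value.
 */
#ifndef CMX270_IT8152_VIRT
#define CMX270_IT8152_VIRT 0x1000UL
#endif

/* Only supply the definition when the PCI code is not built in. */
#ifndef CONFIG_PCI
unsigned long it8152_base_address = CMX270_IT8152_VIRT;
#endif
```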
Device Drivers
I haven't actually mentioned many ways to improve the startup time of your Linux system yet, so here are some more practical techniques that you can use to get the kernel booting faster. They're not particularly difficult or fancy, but if you don't know about them then they could come in handy.
The first method is to remove all the drivers from the kernel that you don't need, and to load as many as possible of those that you do need as modules. For example, remove the IDE drivers if you have no IDE bus, and remove all keyboard and mouse drivers if you don't have those attached. Obvious, but useful.
Another method is to avoid decompressing the kernel, of which XIP (eXecute In Place) is the extreme case. Normally the kernel image has to be decompressed and copied to RAM before execution moves to the kernel entry address. If we build a kernel that's already decompressed then that step disappears, and with true XIP the kernel runs directly from flash (which needs memory-mapped flash such as NOR), so all the bootloader needs to do is to transfer code execution to the appropriate address. This can save a relatively large amount of time, but not all systems will support it.
Filesystems are another area to consider. If you're using an embedded system with NAND memory then the JFFS2 filesystem may provide benefits over the traditional ext3 system.
Clock synchronisation is something that can be worth investigating too. During boot up the kernel will synchronise the system clock with the hardware clock. It does this on an edge that occurs once every second, which means that the kernel has to wait anywhere from roughly nothing to almost a second. The average delay, of course, will be half a second. If you can cope without an accurate system clock then removing this check will improve bootup considerably (it may be possible to do this check later, after the kernel has booted up, but I haven't investigated this yet).
Quiet mode is also worth pursuing. Writing text out to a serial port console can take quite a bit of time, so passing the kernel command line argument 'quiet' can speed up the boot process. The argument will prevent most messages being sent to the console output, but they will still be present in the log (obtained with the command: dmesg).
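The half-second figure follows from assuming that boot reaches the check at a random point within the second. Here is a minimal sketch of that arithmetic in C. The function names are made up, and it models only the waiting time, not the kernel's actual clock code.

```c
/*
 * Seconds left until the next once-per-second edge, given how far into
 * the current second we already are (phase, between 0.0 and 1.0).
 */
double wait_for_edge(double phase)
{
    return 1.0 - phase;
}

/* Average wait over evenly spread arrival points within one second. */
double average_wait(int samples)
{
    double total = 0.0;

    for (int i = 0; i < samples; i++) {
        double phase = (i + 0.5) / samples;  /* midpoint of each slot */
        total += wait_for_edge(phase);
    }
    return total / samples;
}
```

Arriving a quarter of the way into a second means a wait of 0.75 seconds, and averaged over all arrival points the wait comes out at half a second.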
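To get a feel for how much time the console can cost, here is a back-of-the-envelope calculation in C. It assumes the common 8N1 serial framing, where each character takes 10 bits on the wire (1 start, 8 data, 1 stop). The function name, the baud rates and the log sizes are illustrative assumptions, not measurements from my board.

```c
#include <stddef.h>

/* Time in seconds to push a number of bytes through a serial port. */
double serial_seconds(size_t bytes, unsigned long baud)
{
    const double bits_per_char = 10.0;  /* 1 start + 8 data + 1 stop (8N1) */

    return (double)bytes * bits_per_char / (double)baud;
}
```

With these assumptions, 20,000 bytes of boot messages would take roughly 1.7 seconds at 115200 baud and over 20 seconds at 9600 baud, which is why silencing the console can be worth it.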