Wednesday 17 August 2022

Speeding up the kernel testing loop

When I create kernel contributions, I usually rely on a specific hardware, which makes using a system on which I need to deploy kernels too complicated or time-consuming to be worth it. Yes, I'm an idiot that hacks the kernel directly on their main machine, though in my defense, I usually just need to compile drivers rather than full kernels.

But sometimes I work on a part of the kernel that can't be easily swapped out, like the USB sub-system. In which case I need to test out full kernels.

I usually prefer compiling full kernels as RPMs, on my Fedora system as it makes it easier to remove old test versions and clearly tag more information in the changelog or version numbers if I need to.

Step one, build as non-root

First, if you haven't already done so, create an ~/.rpmmacros file (I know...), and add a few lines so you don't need to be root, or write stuff in /usr to create RPMs.

$ cat ~/.rpmmacros
%_topdir        /home/hadess/Projects/packages
%_tmppath        %{_topdir}/tmp

Easy enough. Now we can use fedpkg or rpmbuild to create RPMs. Don't forget to run those under “powerprofilesctl launch” to speed things up a bit.

Step two, build less

We're hacking the kernel, so let's try and build from upstream. Instead of the aforementioned fedpkg, we'll use “make binrpm-pkg” in the upstream kernel, which builds the kernel locally, as it normally would, and then packages just the binaries into an RPM. This means that you can't really redistribute the results of this command, but it's fine for our use.

 If you choose to build a source RPM using “make rpm-pkg”, know that this one will build the kernel inside rpmbuild, this will be important later.

 Now that we're building from the kernel sources, that's our time to activate the cheat code. Run “make localmodconfig”. It will generate a .config file containing just the currently loaded modules. Don't forget to modify it to include your new driver, or driver for a device you'll need for testing.

Step three, build faster

If running “make rpm-pkg” is the same as running “make ; make modules” and then packaging up the results, does that mean that the “%{?_smp_mflags}” RPM macro is ignored, I make you ask rhetorically. The answer is yes. “make -j16 rpm-pkg”. Boom. Faster.

Step four, build fasterer

As we're building in the kernel tree locally before creating a binary package, already compiled modules and binaries are kept, and shouldn't need to be recompiled. This last trick can however be used to speed up compilation significantly if you use multiple kernel trees, or need to clean the build tree for whatever reason. In my tests, it made things slightly slower for a single tree compilation.

$ sudo dnf install -y ccache
$ make CC="ccache gcc" -j16 binrpm-pkg

Easy.

And if you want to speed up the rpm-pkg build:

$ cat ~/.rpmmacros
[...]
%__cc            ccache gcc
%__cxx            ccache g++

More information is available in Speeding Up Linux Kernel Builds With Ccache.

Step five, package faster

Now, if you've implemented all this, you'll see that the compilation still stops for a significant amount of time just before writing “Wrote kernel...rpm”. A quick look at top will show a single CPU core pegged to 100% CPU. It's rpmbuild compressing the package that you will just install and forget about.

$ cat ~/.rpmmacros
[...]
%_binary_payload    w2T16.xzdio

More information is available in Accelerating Ceph RPM Packaging: Using Multithreaded Compression.

TL;DR and further work

All those changes sped up the kernel compilation part of my development from around 20 minutes to less than 2 minutes on my desktop machine.

$ cat ~/.rpmmacros
%_topdir        /home/hadess/Projects/packages
%_tmppath        %{_topdir}/tmp
%__cc            ccache gcc
%__cxx            ccache g++
%_binary_payload    w2T16.xzdio


$ powerprofilesctl launch make CC="ccache gcc" -j16 binrpm-pkg

I believe there's still significant speed ups that could be done, in the kernel, by parallelising some of the symbols manipulation, caching the BTF parsing for modules, switching the single-threaded vmlinux bzip2 compression, and not generating a headers RPM (note: tested this last one, saves 4 seconds :)

 

The results of my tests. YMMV, etc.

Command Time spent Notes
koji build --scratch --arch-override=x86_64 f36 kernel.src.rpm 129 minutes It's usually quicker, but that day must have been particularly busy
fedpkg local 70 minutes No rpmmacros changes except setting the workdir in $HOME
powerprofilesctl launch fedpkg local 25 minutes
localmodconfig / bin-rpmpkg 19 minutes Defaults to "-j2"
localmodconfig -j16 / bin-rpmpkg 1:48 minutes
powerprofilesctl launch localmodconfig ccache -j16 / bin-rpmpkg 7 minutes Cold cache
powerprofilesctl launch localmodconfig ccache -j16 / bin-rpmpkg 1:45 minutes Hot cache
powerprofilesctl launch localmodconfig xzdio -j16 / bin-rpmpkg 1:20 minutes