Security-hardening ARM64 Linux kernel

Intro

In the first year of my Ph.D., I ambitiously began research on protecting the Linux kernel against data-only attacks. My research focused on protecting security-critical objects and pointers by leveraging ARM’s upcoming hardware security features, Pointer Authentication (PAC) and the Memory Tagging Extension (MTE).

Building such a kernel hardening system required me to tackle several key issues:

  1. Cross-compiling the kernel (x86_64 -> ARM64)
  2. Cross-module kernel analysis
  3. Linux kernel instrumentation
  4. Connecting the cross-module analysis with the kernel build system
  5. Setting up ARM64 environments for both emulation and a real device (Raspberry Pi 4B)

Since I was not very familiar with the kernel or with compilers, each of these issues was quite a challenge. So I thought it would be nice to write about each of them to help others facing similar problems.

0. Environment

My environment was as follows:

  • Host Environment: Linux 5.15 Ubuntu 20.04.3
  • Clang/LLVM: 11.0.0 and 14.0.0
  • GNU Toolchain: aarch64-none-linux-gnu
  • QEMU 5.1.0
  • Raspberry Pi 4B (4GB)

1. Building Kernel

Because the target system is AArch64 and the host system is x86_64, the kernel has to be cross-compiled. At first, I expected Clang/LLVM to handle this with the -march option alone, since it is natively a cross-compiler. However, I ran into some of the known cross-compilation issues of Clang/LLVM. As described in ClangBuiltLinux and a blog, using ARM’s GNU toolchain (aarch64-none-linux-gnu) together with Clang solves the problem.

The final script I used to cross-compile the kernel:

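#!/bin/bash -ve
# Cross-compile the ARM64 kernel on an x86_64 host with Clang and GNU binutils.
PROJ_DIR=${PWD}/..
LLVM_BUILD=${PROJ_DIR}/build/bin
# ARM's GNU toolchain provides the aarch64-none-linux-gnu- binutils.
BINUTIL=~/util/gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu/bin
export PATH=${BINUTIL}:$PATH:${LLVM_BUILD}
# LLVM_COMPILER and BINUTILS_TARGET_PREFIX are wllvm settings (see section 2);
# they are not needed when CC=clang.
export LLVM_COMPILER=clang

export KERNEL=kernel8
# Extra compiler flags that the kernel build adds to every compilation.
export KCFLAGS="-march=armv8.5-a "
pushd ../linux-5.5
    BINUTILS_TARGET_PREFIX=aarch64-none-linux-gnu make \
      ARCH=arm64 CROSS_COMPILE=aarch64-none-linux-gnu- \
      HOSTCC=clang CC=clang -j12
popd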

2. Cross-module kernel analysis

To identify every instruction that accesses a protection target, a cross-module data-flow analysis is performed. Here, a module means a single C file. One approach is to use wllvm to generate a single LLVM bitcode file for the entire kernel (i.e., vmlinux.bc). Some kernel modules contain duplicate symbol names, which can cause linking errors during this step. Such files can be manually excluded from vmlinux.bc as explained in this blog, at the cost of also excluding them from the analysis.
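The wllvm step can be sketched roughly as follows. This is a minimal sketch under assumptions, not the exact invocation used for this project: the function name build_kernel_bitcode and the DRY_RUN switch are hypothetical, the kernel tree is assumed to be configured already, and wllvm and extract-bc are assumed to be on PATH.

```shell
# Sketch only: build_kernel_bitcode and DRY_RUN are hypothetical names.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1: path to the kernel source tree (assumed to be configured already).
build_kernel_bitcode() {
    kernel_dir=$1
    # wllvm reads these two variables: the underlying compiler, and the
    # prefix of the cross binutils it uses on the ARM64 object files.
    export LLVM_COMPILER=clang
    export BINUTILS_TARGET_PREFIX=aarch64-none-linux-gnu
    # Same make invocation as a normal build, but with wllvm as CC.
    run make -C "$kernel_dir" ARCH=arm64 \
        CROSS_COMPILE=aarch64-none-linux-gnu- \
        HOSTCC=clang CC=wllvm -j12
    # extract-bc gathers the recorded per-file bitcode into vmlinux.bc.
    run extract-bc "$kernel_dir/vmlinux"
}
```

Calling build_kernel_bitcode ../linux-5.5 with DRY_RUN=1 only prints the two commands, which is a cheap way to check the invocation before starting a full build.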

Once vmlinux.bc is generated, a custom LLVM analysis pass can be applied to it, enabling the cross-module analysis. The analysis pass is implemented as an LLVM pass plugin, which is loaded by the opt command. Given a list of protection targets (e.g., struct types, named global variables), the analysis pass identifies the memory access instructions (i.e., load/store, copy, alloc/free) that access them.
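Loading a pass plugin into opt looks roughly like the sketch below. The plugin file name and the pass name kdfi-analysis are hypothetical placeholders, as are run_analysis and DRY_RUN. How the list of protection targets reaches the pass is not shown here.

```shell
# Sketch only: the pass name "kdfi-analysis" is a hypothetical placeholder.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1: pass plugin (.so), $2: whole-kernel bitcode file.
run_analysis() {
    plugin=$1
    bitcode=$2
    # -load-pass-plugin loads a new-pass-manager plugin, -passes selects the
    # pipeline to run, and -disable-output skips writing the bitcode back,
    # since an analysis does not change it.
    run opt -load-pass-plugin="$plugin" -passes=kdfi-analysis \
        -disable-output "$bitcode"
}
```

For example, run_analysis ./libKdfiAnalysis.so vmlinux.bc would run the (hypothetical) analysis over the whole-kernel bitcode.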

3. Linux Kernel Instrumentation

The instrumentation is applied during the kernel build to the instructions identified by the analysis. The LLVM tutorial on writing an LLVM pass shows how to write a pass as a plugin for the opt command. However, the kernel build system uses CC (clang) and LD (the linker) to build the kernel, not opt.

One approach is to use the -Xclang option to load the instrumentation pass as a Clang pass plugin. However, for technical reasons that I cannot recall precisely, I did not choose this approach. It may have been because I was unsure whether every module would be built with the instrumentation pass.
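For reference, the plugin route would look roughly like the sketch below. Clang accepts -fpass-plugin=<path> for new-pass-manager plugins and -Xclang -load -Xclang <path> for legacy ones; which pass manager is the default depends on the Clang version. The helper plugin_kcflags and the plugin path are hypothetical.

```shell
# Sketch only: plugin_kcflags and the plugin path are hypothetical.
# $1: "new" or "legacy" pass manager, $2: path to the pass plugin (.so).
plugin_kcflags() {
    case $1 in
        new)    printf '%s\n' "-fpass-plugin=$2" ;;
        legacy) printf '%s\n' "-Xclang -load -Xclang $2" ;;
        *)      printf 'unknown pass manager: %s\n' "$1" >&2; return 1 ;;
    esac
}
```

The result could then be appended to KCFLAGS, e.g. export KCFLAGS="-march=armv8.5-a $(plugin_kcflags legacy /path/to/libKdfiPass.so)", so that the kernel build passes it to every clang invocation.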

The approach I took instead is to implement the instrumentation pass as a sanitizer, which is well supported by both Linux and Clang through the -fsanitize=<sanitizer-name> option. Sanitizers are implemented as LLVM passes and are automatically loaded by clang. Existing sanitizers can be found in llvm-project/llvm/lib/Transforms/Instrumentation.

Adding a new sanitizer required several modifications to Clang/LLVM; these changes can be found in my GitHub repo.

The final script to build the kernel with the instrumentation:

#!/bin/bash -ve
PROJ_DIR=${PWD}/..
LLVM_BUILD=${PROJ_DIR}/build/bin
BINUTIL=~/util/gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu/bin
export PATH=${LLVM_BUILD}:${BINUTIL}:$PATH
export LLVM_COMPILER=clang
export KERNEL=kernel8
export KCFLAGS="-march=armv8.5-a -fsanitize=kdfi_instrument "
pushd ../linux-5.5-kdfi
    BINUTILS_TARGET_PREFIX=aarch64-none-linux-gnu make \
    ARCH=arm64 CROSS_COMPILE=aarch64-none-linux-gnu- \
    HOSTCC=clang CC=clang -j12
popd

4. Connecting Analysis and Instrumentation

Now the kernel can be analyzed and instrumented, but there is one more issue to solve: how do we connect the analysis and the instrumentation? A single LLVM pass for both analysis and instrumentation would be ideal. However, the kernel, like most C/C++ projects, builds each module separately and links them together afterwards, which makes it difficult to perform a cross-module analysis during the kernel build.

To address this, I came up with a two-pass approach. The first pass performs the cross-module analysis and extracts the list of instructions from a pre-compiled whole-kernel bitcode file (vmlinux.bc). The second pass is applied to each module during the build and instruments the previously identified instructions.

This requires a way to share the list of instructions between the two LLVM passes. I simply used a text file containing a raw dump of the instructions in LLVM IR format. Each instruction is accompanied by minimal metadata, such as the function name and the instruction’s position within the function, so that the intended instruction can be located accurately. While this method works, I believe there are more elegant ways to share the instructions between the two passes.
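To make the idea concrete, here is a small sketch of what such a list and a lookup could look like. The tab-separated layout (function name, instruction index, IR text), the function names func_a and func_b, and the IR lines are all made up for illustration; the actual file format may differ.

```shell
# Sketch only: the tab-separated layout and all names are hypothetical.
# Each line: <function name> TAB <instruction index in function> TAB <LLVM IR>

# Write a tiny example list to stdout.
write_example_list() {
    printf '%s\t%s\t%s\n' func_a 3 'store i32 %v, i32* %p, align 4'
    printf '%s\t%s\t%s\n' func_a 7 '%x = load i32, i32* %p, align 4'
    printf '%s\t%s\t%s\n' func_b 1 'store i64 0, i64* %q, align 8'
}

# Print the instruction indices recorded for one function.
# $1: list file, $2: function name.
targets_in_function() {
    awk -F '\t' -v name="$2" '$1 == name { print $2 }' "$1"
}
```

After saving the example with write_example_list > list.txt, running targets_in_function list.txt func_a prints 3 and 7 on separate lines. The instrumentation pass does the equivalent lookup for each function it visits and instruments the instructions at those positions.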

Additionally, to avoid generating an excessively large file by dumping every instruction that needs to be instrumented, a simple filter in the analysis pass skips instructions that can be identified from within a single module. Consequently, the instrumentation pass also performs a simple intra-module analysis to find the instructions that were filtered out of the list.

5. ARM64 Environment Setup

I used two types of testing environments: QEMU and Raspberry Pi 4B (4GB). QEMU is used for debugging and security evaluation, while Raspberry Pi is used for performance evaluation.

For the QEMU environment, I used Buildroot (2022.02.3) to build a root file system. The full command to run QEMU is as follows:

LINUX=$1

sudo ~/util/qemu-5.1.0/aarch64-softmmu/qemu-system-aarch64 \
    -machine virt,mte=on \
    -cpu max \
    -nographic -smp 1 \
    -hda /home/juhee/project/ppac/buildroot-2022.02.3/output/images/rootfs.ext4 \
    -kernel $LINUX/arch/arm64/boot/Image \
    -append "console=ttyAMA0 root=/dev/vda oops=panic panic_on_warn=1 panic=-1 ftrace_dump_on_oops=orig_cpu debug earlyprintk=serial slub_debug=UZ nokaslr " \
    -m 2G \
    -net user,hostfwd=tcp::10023-:22  \
    -net nic -s -S

The emulated kernel can then be debugged with aarch64-none-linux-gnu-gdb.
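Attaching the debugger can be sketched as follows. In the QEMU command above, -s makes QEMU listen for gdb on TCP port 1234 and -S keeps the CPU halted until the debugger continues. The names attach_gdb and DRY_RUN are hypothetical, and the sketch assumes vmlinux sits at the top of the kernel build tree and was built with debug info.

```shell
# Sketch only: attach_gdb and DRY_RUN are hypothetical names.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1: kernel build tree containing vmlinux.
attach_gdb() {
    # Load symbols from vmlinux and connect to QEMU's gdb stub on port 1234.
    run aarch64-none-linux-gnu-gdb "$1/vmlinux" \
        -ex "target remote localhost:1234"
}
```

Once connected, the usual gdb commands apply; for example, setting a breakpoint and then issuing continue lets the halted kernel start booting.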

To build the kernel for Raspberry Pi, I used the official Raspberry Pi firmware for the boot files and modules. The following script builds and prepares the boot files and modules of the instrumented kernel:

#!/bin/bash -ve
PROJ_DIR=${PWD}
LLVM_BUILD=${PROJ_DIR}/build/bin
BINUTIL=~/util/gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu/bin
export PATH=${LLVM_BUILD}:${BINUTIL}:$PATH
export LLVM_COMPILER=clang
export KERNEL=kernel8
export KCFLAGS="-fsanitize=kdfi_instrument "
pushd linux-rpi-6.0-$1
    BINUTILS_TARGET_PREFIX=aarch64-none-linux-gnu make \
        ARCH=arm64 CROSS_COMPILE=aarch64-none-linux-gnu- \
        CFLAGS="-march=armv8-a+crc -mtune=cortex-a72" \
        CXXFLAGS="-march=armv8-a+crc -mtune=cortex-a72" \
        HOSTCC=clang CC=clang -j10 \
        bindeb-pkg Image modules dtbs
    BINUTILS_TARGET_PREFIX=aarch64-none-linux-gnu make \
        ARCH=arm64 CROSS_COMPILE=aarch64-none-linux-gnu- \
        HOSTCC=clang CC=clang -j10 \
        CFLAGS="-march=armv8-a+crc -mtune=cortex-a72" \
        CXXFLAGS="-march=armv8-a+crc -mtune=cortex-a72" \
        INSTALL_MOD_PATH=../modules-$2 \
        modules_install

    cp arch/arm64/boot/Image ../boot-$2/kernel8.img
    cp arch/arm64/boot/dts/overlays/*.dtbo ../boot-$2/overlays
    cp arch/arm64/boot/dts/overlays/README ../boot-$2/overlays
    cp arch/arm64/boot/dts/broadcom/*.dtb ../boot-$2/
popd

# Create gzip-compressed archives for transfer to the Raspberry Pi.
tar -czvf "modules-$2.tar.gz" "modules-$2/"
tar -czvf "boot-$2.tar.gz" "boot-$2/" *.deb

The boot files and modules can be sent to the Raspberry Pi over the network and then installed with the following script:

#!/bin/bash -ve
# Remove stale packages; -f keeps the script (run with -e) going if none exist.
rm -f ./*.deb
tar -xvf boot-$1.tar.gz
sudo rm -r /boot/*
sudo dpkg -i linux-headers-*.deb linux-image-*.deb linux-libc-*.deb
sudo cp -r boot-$1/* /boot/
sudo cp *.txt /boot/

tar -xvf modules-$1.tar.gz
# modules_install places the modules under lib/modules/<kernel release>.
sudo cp -r modules-$1/lib/modules/* /lib/modules/

Conclusion

This post described the issues I faced when building a security hardening system for the Linux kernel. I hope this post will be helpful for others facing similar issues.