KernelGPT in Software Testing

An AI Advancement for Enhancing Software Security

Operating system kernels are the core of every computer system. They manage the hardware resources and provide the foundation for all other software to run. Bugs in operating system kernels can have serious consequences for security, reliability, and usability.


Security is a significant concern because attackers can exploit kernel bugs to gain unauthorized access to the system. This could enable them to steal sensitive data, install malware, or even take complete control of the system.

Reliability suffers when kernel bugs cause system crashes, hangs, and other performance problems. These failures disrupt users' work and can even lead to data loss.

Usability is compromised when kernel bugs cause unexpected behavior within applications, making it difficult for users to interact with their devices and potentially leading to data corruption.

The Impact of Kernel Bugs

Kernel bugs can affect billions of devices and users all over the world. This is because operating system kernels are used in a wide variety of devices, from smartphones and laptops to servers and mainframes.

Examples of Kernel Bugs

Several high-profile kernel bugs have been disclosed in recent years. Notable examples include:

  • CVE-2023-3269: A vulnerability in versions 6.1 to 6.4 of the Linux kernel that allows attackers to escalate their privileges on the system. This is the first time a specific type of error, known as 'use-after-free-by-RCU' (UAFBR), has been found to be exploitable. Severity: High (7.8, NIST score).
  • CVE-2021-27365: A heap buffer overflow caused by incorrect use of the sprintf() function. sprintf() formats data into a destination buffer without checking the buffer's size, so an oversized input can overwrite adjacent memory and give attackers a way to exploit the system.
  • CVE-2021-27363: An accidental exposure of sensitive kernel addresses. A pointer, which should remain a private reference to a location in kernel memory, is used as a unique ID that others can see.
  • CVE-2021-27364: A buffer overread, that is, a read beyond the intended limit of a data buffer. It can either leak sensitive information or crash the system (denial of service), so attackers may see data they shouldn't or disrupt the system's operation.

As software continues to power more and more of our digital infrastructure, ensuring its security and reliability is crucial. Recently, researchers at the University of Illinois Urbana-Champaign (UIUC) proposed an AI technique called “KernelGPT” that automatically improves software testing to catch more bugs. Their focus? The operating system kernel - one of the most fundamental software components underlying all computer systems globally.

Kernel Fuzzing

The kernel acts as the bridge between hardware and software, managing access and coordinating everything seamlessly. Any compromise of its integrity through bugs or security flaws could be tremendously disruptive, affecting billions of devices and users. Hence, for decades, experts have dedicated extensive effort to rigorously testing kernels via “fuzzing,” which automatically generates diverse test inputs to trigger potential issues.

Kernel fuzzing generates test inputs, typically sequences of system calls, to trigger crashes in kernel code. It is a complex task because of the kernel's large code space and unique system call interfaces, and its effectiveness is influenced by various tunable parameters.

  • Syzkaller: A popular coverage-guided kernel fuzzer that has uncovered hundreds of vulnerabilities in the Linux kernel. It automatically generates diverse system calls and tracks code coverage to guide further testing.
  • kAFL: A fuzzer designed for the entire kernel, including drivers. It leverages hardware-assisted virtualization to accelerate testing and isolate crashes.
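
To make the idea of coverage guidance concrete, here is a minimal Python sketch of the feedback loop. It is not how Syzkaller or kAFL is implemented: the `target` function is a hypothetical stand-in for the code under test that simply reports which branches it executed, and the names `mutate` and `fuzz` are made up for this illustration. The point is only that inputs reaching new code are kept and mutated further.

```python
import random


def target(data: bytes) -> set:
    """Hypothetical program under test; returns the IDs of the branches it ran.

    Raises RuntimeError (a stand-in for a crash) when the input starts with b"BUG!".
    """
    covered = {0}
    if len(data) > 0 and data[0] == ord("B"):
        covered.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            covered.add(2)
            if len(data) > 2 and data[2] == ord("G"):
                covered.add(3)
                if len(data) > 3 and data[3] == ord("!"):
                    raise RuntimeError("crash")
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte, or append a random byte."""
    buf = bytearray(data)
    if buf and rng.random() < 0.5:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)


def fuzz(iterations: int, seed: int = 0):
    """Run the coverage-guided loop and return (corpus, seen branches, crashes)."""
    rng = random.Random(seed)
    corpus = [b""]   # inputs worth mutating further
    seen = set()     # every branch reached so far
    crashes = []     # inputs that crashed the target
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            coverage = target(candidate)
        except RuntimeError:
            crashes.append(candidate)
            continue
        if not coverage <= seen:      # the input reached a new branch
            seen |= coverage
            corpus.append(candidate)  # keep it as a seed for later mutations
    return corpus, seen, crashes
```

Because inputs that reach new branches are kept as seeds, the loop can work toward the crashing prefix one byte at a time rather than having to guess all four bytes at once.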

Syzkaller has been at the forefront of coverage-guided kernel fuzzing and is credited with detecting thousands of real-world kernel bugs. However, its effectiveness depends heavily on complex “specifications” that describe the system call interfaces and their parameters so that it can generate valid test cases. Manually authoring these specifications requires deep expertise in kernel internals and is extremely challenging to scale across the vast and continually evolving codebase.
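
The role a specification plays can be sketched in a few lines of Python. Everything here is hypothetical: the classes `IntRange`, `OneOf`, and `CallSpec` are a made-up mini-format for illustration, not Syzkaller's actual specification language, and `toy_driver_ioctl` is an invented stand-in for a driver entry point that rejects malformed arguments early.

```python
import random
from dataclasses import dataclass


@dataclass
class IntRange:
    """An integer argument constrained to the inclusive range [low, high]."""
    low: int
    high: int

    def generate(self, rng: random.Random) -> int:
        return rng.randint(self.low, self.high)


@dataclass
class OneOf:
    """An argument that must be one of a fixed set of constants."""
    choices: tuple

    def generate(self, rng: random.Random) -> int:
        return rng.choice(self.choices)


@dataclass
class CallSpec:
    """Hypothetical description of one call: its name and argument kinds."""
    name: str
    args: dict  # argument name -> IntRange or OneOf

    def generate(self, rng: random.Random) -> dict:
        return {arg: kind.generate(rng) for arg, kind in self.args.items()}


def toy_driver_ioctl(cmd: int, size: int) -> str:
    """Invented driver entry point that validates its arguments first."""
    if cmd not in (0x1001, 0x1002):
        return "EINVAL"
    if not 1 <= size <= 64:
        return "EINVAL"
    return "OK"  # only valid calls get past the checks into deeper code


TOY_SPEC = CallSpec(
    name="toy_ioctl",
    args={"cmd": OneOf((0x1001, 0x1002)), "size": IntRange(1, 64)},
)


def random_call(rng: random.Random) -> dict:
    """Baseline without a specification: arbitrary 32-bit arguments."""
    return {"cmd": rng.randrange(2**32), "size": rng.randrange(2**32)}
```

Calls produced by `TOY_SPEC.generate` always pass the driver's validation, while calls from `random_call` are almost always rejected at the first check. That gap is why accurate specifications matter for reaching deep kernel code.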

  • AFL (American Fuzzy Lop): Although a general-purpose fuzzer, it can be adapted for kernel fuzzing with appropriate configuration.

[Figure: American Fuzzy Lop's afl-fuzz running on a test program]


KernelGPT

KernelGPT is a novel approach that aims to automate the generation of Syzkaller specifications for enhanced kernel fuzzing. It leverages large language models (LLMs) such as the GPT family, which have been pretrained on extensive data, including kernel codebases and documentation, to analyze kernel source code snippets and automatically synthesize syscall specifications. By doing so, KernelGPT greatly reduces the manual effort and deep kernel internals expertise that writing specifications otherwise requires. The tool's preliminary results have demonstrated its ability to help Syzkaller achieve higher coverage and find multiple previously unknown bugs. Additionally, the Syzkaller team has shown interest in upstreaming the specifications inferred by KernelGPT.

Figure 2 illustrates the KernelGPT Workflow, which consists of three steps: identifying drivers, generating specifications, and verifying and correcting them. Initially, KernelGPT identifies drivers along with their specific details. Next, it determines commands and arguments for ioctl handlers by analyzing the kernel code. If any information is missing, KernelGPT requests additional details, which are then utilized in the subsequent step. The final phase involves KernelGPT scrutinizing the specifications and rectifying any inaccuracies. While KernelGPT can locate device operation handlers in the code through basic patterns, its primary function is to convert code into descriptions, rather than merely identifying handlers.
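
The three steps can be sketched in Python as follows. This is a rough illustration under stated assumptions, not KernelGPT's actual code: the `NEED:` reply convention, the helper names, the scripted `fake_model`, and the placeholder specification strings are all invented so that the sketch runs offline. Only the `.unlocked_ioctl = handler` initializer pattern reflects how Linux drivers commonly register ioctl handlers.

```python
import re

# Matches designated initializers such as ".unlocked_ioctl = toy_ioctl,"
IOCTL_HANDLER = re.compile(r"\.(?:unlocked_ioctl|compat_ioctl)\s*=\s*(\w+)")


def find_ioctl_handlers(source: str) -> list:
    """Step 1: locate ioctl handler names using a basic pattern."""
    return sorted(set(IOCTL_HANDLER.findall(source)))


def generate_spec(handler, ask_model, lookup, max_rounds=3):
    """Step 2: query the model, supplying any definitions it asks for."""
    context = lookup(handler)
    for _ in range(max_rounds):
        reply = ask_model(context)
        if not reply.startswith("NEED:"):   # hypothetical reply convention
            return reply
        missing = reply[len("NEED:"):].strip()
        context += "\n" + lookup(missing)   # add the requested definition
    raise RuntimeError("model kept requesting more context")


def repair_spec(spec, validate, ask_model, max_rounds=3):
    """Step 3: feed validation errors back to the model until none remain."""
    for _ in range(max_rounds):
        errors = validate(spec)
        if not errors:
            return spec
        spec = ask_model("Fix these errors:\n" + "\n".join(errors) + "\n" + spec)
    if validate(spec):
        raise RuntimeError("specification still invalid after repair")
    return spec


# --- Invented inputs so the sketch can run without a kernel tree or an LLM ---
DRIVER_FILE = """
static const struct file_operations toy_fops = {
    .owner = THIS_MODULE,
    .unlocked_ioctl = toy_ioctl,
};
"""
KERNEL_SOURCE = {
    "toy_ioctl": "long toy_ioctl(struct file *f, unsigned int cmd, "
                 "unsigned long arg) { /* copies a toy_req from arg */ }",
    "toy_req": "struct toy_req { __u32 size; };",
}


def fake_model(prompt: str) -> str:
    """Scripted stand-in for an LLM."""
    if prompt.startswith("Fix"):
        return "toy_ioctl(cmd, arg ptr to toy_req)"
    if "struct toy_req {" not in prompt:
        return "NEED: toy_req"
    return "toy_ioctl(cmd, arg)"  # deliberately incomplete first draft


def fake_validate(spec: str) -> list:
    """Stand-in validator: complains until the argument type is named."""
    return [] if "toy_req" in spec else ["argument type for 'arg' is missing"]


handlers = find_ioctl_handlers(DRIVER_FILE)
draft = generate_spec(handlers[0], fake_model, KERNEL_SOURCE.__getitem__)
final = repair_spec(draft, fake_validate, fake_model)
```

In this scripted run, the model first asks for the missing `toy_req` definition, then produces a draft that fails validation, and finally returns a corrected description after seeing the error message. In a real setting, the validator would be the fuzzer's own specification tooling and the model would be an actual LLM.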

The results so far are promising: specifications generated by KernelGPT improve code coverage over Syzkaller's baseline specifications. More importantly, KernelGPT has helped detect multiple previously unknown flaws that could become serious kernel security vulnerabilities if exploited. The researchers have also open-sourced KernelGPT to benefit the broader community.

As software security becomes ever more important, tools like KernelGPT exemplify AI's promise to transform how we build, analyze, and safeguard our systems. The researchers next aim to expand its applications, targeting more syscall interfaces beyond just device drivers. With AI helping to find bugs that manual effort alone would miss, the future looks brighter for engineering robust software that we can truly rely upon.