daemon

This ten-year-old vulnerability found by Chris Evans should remind us once more how important it is, on modern Linux systems, to take care in how we monitor the security of software and user behaviour.

Here’s the knot.
This simple assembly code spawns /bin/sh via execve and then exits.

BITS 64
global _start

section .text
_start:

jmp short jump
main:
    pop     rbx        ; pop needs a 64-bit register in 64-bit mode (rbx);
                       ; the string address still fits into 32 bits (ebx)
    xor     eax, eax
    mov     ecx, eax
    mov     edx, eax
    mov     al, 0xb
    int     0x80       ; execve_syscall
    xor     eax,eax
    inc     eax
    int     0x80       ; exit_syscall
jump:
    call main
    message db  "/bin/sh", 0   ; NUL-terminate the path passed to execve

If we assemble and link it as an x64 ELF binary, we can start noticing a few shenanigans.

nasm -f elf64 -o test.o test.nasm
ld -o test test.o

# file test
test: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped


The quirk here is that, from the address space of an x64 binary, we are triggering the system calls via the int 0x80 interrupt, which is a valid operation in both the x86 and x64 worlds (on x64, as long as the kernel is built with 32-bit emulation). With the right 32-bit registers and system call number in place, the kernel will happily dispatch the 32-bit version of the syscall, which in this case is execve followed by exit.

If we compare the two syscall numbers, we can see that they map to different calls on x86 and x64:

===================================
|syscall|    name    |    name    |
|number |    x86     |    x64     |
|=======|============|============|
|   1   | sys_exit   | sys_write  |
|-------|------------|------------|
|  11   | sys_execve | sys_munmap |
===================================
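
The mismatch in the table can be pictured as a lookup keyed by ABI. What follows is only a minimal sketch in Python: the dictionaries hold just the two numbers from the table, and the names SYSCALLS and decode are made up for illustration, not taken from any tool.

```python
# Minimal sketch: the same syscall number resolves to a different name
# depending on which ABI table the decoder assumes.
# Only the two numbers discussed in the text are listed.
SYSCALLS = {
    "x86": {1: "exit", 11: "execve"},   # 32-bit table, reached via int 0x80
    "x64": {1: "write", 11: "munmap"},  # 64-bit table, reached via syscall
}


def decode(number, abi):
    """Return the syscall name for `number` under `abi` (hypothetical helper)."""
    return SYSCALLS[abi].get(number, "unknown")


if __name__ == "__main__":
    for nr in (11, 1):
        print(f"nr={nr}: x86 -> {decode(nr, 'x86')}, x64 -> {decode(nr, 'x64')}")
```

A decoder that picks the wrong table for a given number will report a perfectly plausible, but wrong, syscall name.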

The tool ‘strace’, however, runs in userland and gets fooled into believing that the x64 sys_write and sys_munmap have been called instead.

#strace ./test
[..]
munmap(0x7fd8f4870000, 101178)= 0
...
write(2, "# ", 2# )

The reason strace gets it wrong is that it tries to decode the registers’ content as if the process had made a 64-bit syscall instead of an int 0x80 one. The kernel, however, knows exactly what is really going on, and if we enable the auditd service we retrieve the correct values.

type=SYSCALL msg=audit(1574200519.890:35): arch=40000003 syscall=11 success=yes exit=0 a0=401018 a1=0 a2=0 a3=0 items=3 ppid=1486 pid=2392 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts1 ses=4 comm="sh" exe="/usr/bin/dash" subj==unconfined key="root-commands"

type=SYSCALL msg=audit(1574216365.677:62): arch=40000003 syscall=1 a0=0 a1=7fca7fab9718 a2=7ffe5fe1def8 a3=7ffe5fe1dee8 items=0 ppid=10475 pid=11168 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts3 ses=305 comm="test" exe="/root/Desktop/test" subj==unconfined key="root-commands"
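
The arch field is what gives the game away: 40000003 is the audit identifier for 32-bit x86 (AUDIT_ARCH_I386), whereas a native 64-bit call would be logged with c000003e (AUDIT_ARCH_X86_64). As a rough sketch, and in no way a replacement for the audit tooling, the Python below parses a record like the ones above and reports which ABI the syscall used. The function names, the naive key=value parsing and the abbreviated sample record are assumptions made for illustration.

```python
import re

# Audit arch identifiers as defined in the Linux audit headers:
# the ELF machine number OR'ed with flag bits for 64-bit and little-endian.
AUDIT_ARCH_I386 = 0x40000003    # EM_386 (3) | little-endian
AUDIT_ARCH_X86_64 = 0xC000003E  # EM_X86_64 (62) | 64-bit | little-endian

# Only the two 32-bit syscalls discussed in the text.
I386_NAMES = {1: "exit", 11: "execve"}


def parse_record(line):
    """Split an audit SYSCALL record into key=value fields (naive, illustrative)."""
    return dict(re.findall(r'(\w+)=("[^"]*"|\S+)', line))


def describe(line):
    """Summarise which ABI a logged syscall used (hypothetical helper)."""
    fields = parse_record(line)
    arch = int(fields["arch"], 16)
    nr = int(fields["syscall"])
    if arch == AUDIT_ARCH_I386:
        return f"32-bit syscall {nr} ({I386_NAMES.get(nr, 'unknown')})"
    if arch == AUDIT_ARCH_X86_64:
        return f"64-bit syscall {nr}"
    return f"syscall {nr} on arch {arch:#x}"


if __name__ == "__main__":
    # Abbreviated copy of the first record shown above.
    sample = ('type=SYSCALL msg=audit(1574200519.890:35): arch=40000003 '
              'syscall=11 success=yes exit=0 a0=401018 comm="sh"')
    print(describe(sample))
```

A 32-bit arch value on a record whose exe is a 64-bit binary is exactly the combination this trick produces.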

The main take-away here is the following: when evaluating a Linux EDR or monitoring tool, we should always lean towards the kernel-aware ones and disregard any solution that promises to track software behaviour via tools running exclusively in userland.