Following the wake of the previous post, here is Practical Binary Analysis level 7 CTF walkthrough. As we pay attention to the oracle’s hint, we begin to suspect that this level might be two-staged:

$ ./oracle 0fa355cbec64a05f7a5d050e836b1a1f -h
Find out what I expect, then trace me for a hint

If we try to open the level7 file

1.$ ./lvl7
-bash: ./lvl7: cannot execute binary file: Exec format error

$ file lvl7
lvl7: gzip compressed data, last modified: Sat Dec  1 17:30:15 2018, from Unix

We can see it’s a gzip format, so we need to decompress it first.

$ tar xvzf lvl7
stage1
stage2.zip

So our theory was right: we have indeed two stages that we have to face in order to grab the flag. Let’s start by having a quick look at both stage files.

$ file stage1
stage1: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=ac71081a0951af729a4064c1dafbc5713b1537e3, stripped

It looks like a regular stripped 64-bit ELF, let’s check the stage2.zip to make sure it’s really an archive file as it claims to be and try to unzip it.

$ file stage2.zip
stage2.zip: Zip archive data, at least v2.0 to extract

7$ unzip stage2.zip
Archive:  stage2.zip
[stage2.zip] tmp password:

Well, stage2 is actually a real zip file but, as expected, a password is required in order to reveal its content. This secret is most likely obtained by solving ‘stage1’ first, so let’s go back to it. As we have already seen, ‘stage1’ is a stripped binary, so it might have fewer or no symbols when debugged with GDB. Nevertheless, we can run a couple of tools against it, so we can gather a rough initial idea of how it behaves.

$ strace ./stage1
execve("./stage1", ["./stage1"], [/* 32 vars */]) = 0
brk(NULL)                               = 0x1659000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=98537, ...}) = 0
mmap(NULL, 98537, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f3a44afe000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\t\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1868984, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3a44afd000
mmap(NULL, 3971488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3a44528000
mprotect(0x7f3a446e8000, 2097152, PROT_NONE) = 0
mmap(0x7f3a448e8000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c0000) = 0x7f3a448e8000
mmap(0x7f3a448ee000, 14752, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f3a448ee000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3a44afc000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3a44afb000
arch_prctl(ARCH_SET_FS, 0x7f3a44afc700) = 0
mprotect(0x7f3a448e8000, 16384, PROT_READ) = 0
mprotect(0x600000, 4096, PROT_READ)     = 0
mprotect(0x7f3a44b17000, 4096, PROT_READ) = 0
munmap(0x7f3a44afe000, 98537)           = 0
exit_group(0)                           = ?
+++ exited with 0 +++

$ ltrace -C ./stage1
__libc_start_main(0x4003e0, 1, 0x7ffc940ac188, 0x400520 <no return ...>
+++ exited (status 0) +++

$ strings  ./stage1 | grep -v "^\."
/lib64/ld-linux-x86-64.so.2
libc.so.6
__libc_start_main
__gmon_start__
GLIBC_2.2.5
dump ecx
UH-0
AWAVA
AUATL
[]A\A]A^A_
 S)TA
E2 KE
...snip...

Both ltrace and strace give us almost nothing, whereas strings shows something interesting in the last two lines. Let’s load it in GDB, and see if we can catch any useful string at runtime.

gdb -q ./stage1
Reading symbols from ./stage1...(no debugging symbols found)...done.

gdb-peda$ info file
...
	Entry point: 0x400420

We then break at the entry point and start stepping one instruction at a time, with the aid of PEDA.

gdb-peda$ b *0x4003e0
Breakpoint 1 at 0x4003e0

RAX: 0x54 ('T')
RBX: 0x0
RCX: 0x53 ('S')
RDX: 0x4005a7 --> 0x45ffdefa47114154
RSI: 0x7fffffffe428 --> 0x7fffffffe695 ("/home/binary/code/chapter5/level7/stage1")
RDI: 0x1
RBP: 0x400520 (push   r15)
RSP: 0x7fffffffe348 --> 0x7ffff7a2d830 (<__libc_start_main+240>:	mov    edi,eax)
RIP: 0x4003eb (cmp    eax,0x30)
R8 : 0x400590 (repz ret)
R9 : 0x7ffff7de7ab0 (<_dl_fini>:	push   rbp)
R10: 0x846
R11: 0x7ffff7a2d740 (<__libc_start_main>:	push   r14)
R12: 0x400420 (xor    ebp,ebp)
R13: 0x7fffffffe420 --> 0x1
R14: 0x0
R15: 0x0
EFLAGS: 0x283 (CARRY parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x4003e0:	mov    edx,0x4005a4
   0x4003e5:	nop    DWORD PTR [rax]
   0x4003e8:	movsx  eax,BYTE PTR [rdx]
=> 0x4003eb:	cmp    eax,0x30
   0x4003ee:	jl     0x400402
   0x4003f0:	cmp    eax,0x5a
   0x4003f3:	jg     0x400402
   0x4003f5:	jmp    0x400400
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffe348 --> 0x7ffff7a2d830 (<__libc_start_main+240>:	mov    edi,eax)
0008| 0x7fffffffe350 --> 0x0
0016| 0x7fffffffe358 --> 0x7fffffffe428 --> 0x7fffffffe695 ("/home/binary/code/chapter5/level7/stage1")
0024| 0x7fffffffe360 --> 0x1f7ffcca0
0032| 0x7fffffffe368 --> 0x4003e0 (mov    edx,0x4005a4)
0040| 0x7fffffffe370 --> 0x0
0048| 0x7fffffffe378 --> 0xab001f46e1993fa4
0056| 0x7fffffffe380 --> 0x400420 (xor    ebp,ebp)
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
0x00000000004003eb in ?? ()

We can see some interesting letters appearing in the RAX register, eventually spelling the word ‘STAGE2KEY’. After being referenced in the RDX register, these letters are stored into the EAX register and into RCX later on.

movsx  eax,BYTE PTR [rdx]

Also, if we can check at which memory address RDX is pointing to, we can spot the key sneaking in between other characters.

x/s 0x4005a4
0x4005a4:	" S)TA\021G\372\336\377E2 KE\212Y"

Now that we feel bold and daring, we can try feeding stage2.zip with the recently discovered password ‘STAGE2KEY’.

$ unzip stage2.zip
Archive:  stage2.zip
[stage2.zip] tmp password:
  inflating: tmp
  inflating: stage2

Brilliant, it worked. The zip files contains two other files: stage2 and tmp. However, if we diff them we notice that they are two exact binary copies, so we can just focus our attention on stage2 only.

$ diff -s stage2 tmp
Files stage2 and tmp are identical

As we run the ELF, we notice a rather weird (at least to me) behaviour.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
$ ./stage2
#include <stdio.h>
#include <string.h>
#include <vector>
#include <algorithm>

int main() {
  std::vector < char > hex;
  char q[] = "#include <stdio.h>\n#include <string.h>\n#include <vector>\n#include <algorithm>\n\nint main()\n{\nstd::vector<char> hex;\nchar q[] = \"%s\";\nint i, _0F;\nchar c, qc[4096];\n\nfor(i = 0; i < 32; i++) for(c = '0'; c <= '9'; c++) hex.push_back(c);\nfor(i = 0; i < 32; i++) for(c = 'A'; c <= 'F'; c++) hex.push_back(c);\nstd::srand(55);\nstd::random_shuffle(hex.begin(), hex.end());\n\n_0F = 0;\nfor(i = 0; i < strlen(q); i++)\n{\nif(q[i] == 0xa)\n{\nqc[_0F++] = 0x5c;\nqc[_0F] = 'n';\n}\nelse if(q[i] == 0x22)\n{\nqc[_0F++] = 0x5c;\nqc[_0F] = 0x22;\n}\nelse if(!strncmp(&q[i], \"0F\", 2) && (q[i-1] == '_' || i == 545))\n{\nchar buf[3];\nbuf[0] = q[i];\nbuf[1] = q[i+1];\nbuf[2] = 0;\nunsigned j = strtoul(buf, NULL, 16);\nqc[_0F++] = q[i++] = hex[j];\nqc[_0F] = q[i] = hex[j+1];\n}\nelse qc[_0F] = q[i];\n_0F++;\n}\nqc[_0F] = 0;\n\nprintf(q, qc);\n\nreturn 0;\n}\n";
  int i, _0F;
  char c, qc[4096];

  for (i = 0; i < 32; i++)
    for (c = '0'; c <= '9'; c++) hex.push_back(c);
  for (i = 0; i < 32; i++)
    for (c = 'A'; c <= 'F'; c++) hex.push_back(c);
  std::srand(55);
  std::random_shuffle(hex.begin(), hex.end());

  _0F = 0;
  for (i = 0; i < strlen(q); i++) {
    if (q[i] == 0xa) {
      qc[_0F++] = 0x5c;
      qc[_0F] = 'n';
    } else if (q[i] == 0x22) {
      qc[_0F++] = 0x5c;
      qc[_0F] = 0x22;
    } else if (!strncmp( & q[i], "0F", 2) && (q[i - 1] == '_' || i == 545)) {
      char buf[3];
      buf[0] = q[i];
      buf[1] = q[i + 1];
      buf[2] = 0;
      unsigned j = strtoul(buf, NULL, 16);
      qc[_0F++] = q[i++] = hex[j];
      qc[_0F] = q[i] = hex[j + 1];
    } else qc[_0F] = q[i];
    _0F++;
  }
  qc[_0F] = 0;

  printf(q, qc);

  return 0;
}

The binary seems printing its source code, thus resembling a Quine. If we summon Wikipedia, we find out that:

A quine is a computer program which takes no input and produces a copy of its own source code as its only output.

This definition suggests that we should focus on the printed output. As we take a closer look at the c++ source code, we notice that the program is printing two buffer at the end (line 41s), named q and qc (probably for ‘quine’ and ‘quine code’), q contains a format string %s that is being used to reference qc in the printf statement. The program main duty is to ‘for loop’ through the q variable (where the source code is stored) and print it out.

Everything seems looking shiny and tid however, inside the last ‘else if’ (line 34 and 35), the loop-counter variable _0F gets mangled by a pseudo-random value stored in the 16 byte-long buffer j (line 33). We can prove this theory by compiling the latest generated output, the one with 0F as a variable, into a new c++ file and compile it.

$ g++ test2.cpp -o test2

Bingo. The newly created binary is printing a different variable name, this time *25.

./test2 | grep _
int i, _25;
for(i = 0; i < 32; i++) for(c = '0'; c <= '9'; c++) hex.push_back(c);
for(i = 0; i < 32; i++) for(c = 'A'; c <= 'F'; c++) hex.push_back(c);
std::random_shuffle(hex.begin(), hex.end());
_25 = 0;
qc[_25++] = 0x5c;
qc[_25] = 'n';
qc[_25++] = 0x5c;
qc[_25] = 0x22;
else if(!strncmp(&q[i], "25", 2) && (q[i-1] == '_' || i == 545))
qc[_25++] = q[i++] = hex[j];
qc[_25] = q[i] = hex[j+1];
else qc[_25] = q[i];
_25++;
qc[_25] = 0;

As we begin to feel a pattern, we can compile the resulting output in a recursive fashion and get a sequence of different bytes.

If we rename stage2 into test1, we can automate the whole process through:

#!/bin/bash

for i in {1..16}
do
    j=$(expr $i + 1)
    echo -n $(./test$i | egrep ", \"" | cut -d '"' -f 2) >> flag
    ./test$i > test$j.cpp
    g++ test$j.cpp -o test$j
    echo "done with binary n: $i"
done
echo "here is your flag..."
printf "\n"
cat flag
printf "\n"

Which results in:

done with binary nr: 1
done with binary nr: 2
done with binary nr: 3
[...snip...]
done with binary nr: 16
here is your flag...

0F25E512A7763EEFB7696B3AEDA1F964

We still feel brave and bold and we decide to feed the oracle with it.

$ ./oracle 0f25e512a7763eefb7696b3aeda1f964
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
| Level 7 completed, unlocked lvl8         |
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
Run oracle with -h to show a hint

yo! level completed! ~:)