2016-11-20

RC3 CTF 2016: logmein - Reversing 100

The flavor text of this challenge was:

Logmein
100

This has to be one of the safest and most secure login forms out there. Can you break it o 1337 h4x0r?

This being a binary reverse engineering challenge, it doesn't hurt to do a little digging before trying to execute the file.

If you'd like the binary to follow along, you can grab it here.

-> % file ./logmein
./logmein: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter 
/lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=c8f7fb137d9be24a19eb4f10efc29f7a421578a7, 
stripped

Here we can see that the file provided is a binary, and 64 bits. It is also stripped, meaning function and variable names are gone, among other things.

Before running a binary I don't know anything about, I also like to see what Symbols the program has:

-> % rabin2 -s logmein 
[Symbols]
vaddr=0x004004d0 paddr=0x000004d0 ord=001 fwd=NONE sz=16 bind=GLOBAL type=FUNC name=imp.strlen
vaddr=0x004004e0 paddr=0x000004e0 ord=002 fwd=NONE sz=16 bind=GLOBAL type=FUNC name=imp.printf
vaddr=0x004004f0 paddr=0x000004f0 ord=003 fwd=NONE sz=16 bind=GLOBAL type=FUNC name=imp.__libc_start_main
vaddr=0x00400000 paddr=0x00000000 ord=004 fwd=NONE sz=16 bind=UNKNOWN type=NOTYPE name=imp.__gmon_start__
vaddr=0x00400500 paddr=0x00000500 ord=005 fwd=NONE sz=16 bind=GLOBAL type=FUNC name=imp.__isoc99_scanf
vaddr=0x00400510 paddr=0x00000510 ord=006 fwd=NONE sz=16 bind=GLOBAL type=FUNC name=imp.exit

6 symbols

This gives me a pretty good idea that the program will be printf-ing some stuff to the terminal, reading in input using scanf, and at some point computing the length of a string using strlen

So let's run the program and see what it does:

-> % ./logmein 
Welcome to the RC3 secure password guesser.
To continue, you must enter the correct password.
Enter your guess: HERPDERP
Incorrect password!

Pretty simple eh? Let's see if any of these strings are embedded plainly in the binary:

-> % strings logmein 
<snip>
...
AWAVA
AUATL
[]A\A]A^A_
:"AL_RT^L*.?+6/46
harambe
Welcome to the RC3 secure password guesser.
To continue, you must enter the correct password.
Enter your guess: 
%32s
Incorrect password!
You entered the correct password!
Great job!
;*3$"
GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
.shstrtab
...
<snip>

Huh, looks like harambe is in there, and I didn't remember seeing it in the output.

-> % ./logmein
Welcome to the RC3 secure password guesser.
To continue, you must enter the correct password.
Enter your guess: harambe
Incorrect password!

Shucks. Oh well.

Next I dove in with GDB.

GDBZ as 1-2-3

Sensing that a stripped binary won't let us break main in gdb, I used rabin2 to find the program's entrypoint:

-> % rabin2 -e logmein 
[Entrypoints]
vaddr=0x00400530 paddr=0x00000530 baddr=0x00400000 laddr=0x00000000 type=program

1 entrypoints

Which is listed as to be 0x400530. I set a break point on this address and examine the next 20 instructions in GDB:

-> % gdb -q ./logmein
Reading symbols from ./logmein...(no debugging symbols found)...done.
(gdb) b *0x400530
Breakpoint 1 at 0x400530
(gdb) run
Starting program: /home/tobaljackson/SIT/fall_2016/rc3-ctf-2016/reversing/100_logmein/logmein 

Breakpoint 1, 0x0000000000400530 in ?? ()
(gdb) x/20i $rip
=> 0x400530:    xor    ebp,ebp
   0x400532:    mov    r9,rdx
   0x400535:    pop    rsi
   0x400536:    mov    rdx,rsp
   0x400539:    and    rsp,0xfffffffffffffff0
   0x40053d:    push   rax
   0x40053e:    push   rsp
   0x40053f:    mov    r8,0x400890
   0x400546:    mov    rcx,0x400820
   0x40054d:    mov    rdi,0x400630
   0x400554:    call   0x4004f0 <__libc_start_main@plt>
   0x400559:    hlt    
   0x40055a:    nop    WORD PTR [rax+rax*1+0x0]
   0x400560:    mov    eax,0x601057
   0x400565:    push   rbp
   0x400566:    sub    rax,0x601050
   0x40056c:    cmp    rax,0xe
   0x400570:    mov    rbp,rsp
   0x400573:    jbe    0x400590
   0x400575:    mov    eax,0x0

We can see that the program still seems to have a ways to go until it gets into main execution. So I examine many (300 instructions worth) more instructions to see if we can see a printf or scanf function:

(gdb) x/300i $rip
=> 0x400530:    xor    ebp,ebp
   0x400532:    mov    r9,rdx
   0x400535:    pop    rsi
   0x400536:    mov    rdx,rsp
   0x400539:    and    rsp,0xfffffffffffffff0
...
<snip>
...
   0x400685:    call   0x4004e0 <printf@plt>
   0x40068a:    movabs rdi,0x400905
   0x400694:    mov    DWORD PTR [rbp-0x5c],eax
   0x400697:    mov    al,0x0
   0x400699:    call   0x4004e0 <printf@plt>
   0x40069e:    movabs rdi,0x400938
   0x4006a8:    mov    DWORD PTR [rbp-0x60],eax
   0x4006ab:    mov    al,0x0
   0x4006ad:    call   0x4004e0 <printf@plt>
   0x4006b2:    movabs rdi,0x40094b
   0x4006bc:    lea    rsi,[rbp-0x50]
   0x4006c0:    mov    DWORD PTR [rbp-0x64],eax
   0x4006c3:    mov    al,0x0
   0x4006c5:    call   0x400500 <__isoc99_scanf@plt>
   0x4006ca:    lea    rdi,[rbp-0x20]
   0x4006ce:    lea    rsi,[rbp-0x50]
   0x4006d2:    mov    QWORD PTR [rbp-0x70],rdi
   0x4006d6:    mov    rdi,rsi
   0x4006d9:    mov    DWORD PTR [rbp-0x74],eax
   0x4006dc:    call   0x4004d0 <strlen@plt>
   0x4006e1:    mov    rdi,QWORD PTR [rbp-0x70]
   0x4006e5:    mov    QWORD PTR [rbp-0x80],rax
   0x4006e9:    call   0x4004d0 <strlen@plt>
   0x4006ee:    mov    rsi,QWORD PTR [rbp-0x80]
   0x4006f2:    cmp    rsi,rax
   0x4006f5:    jae    0x400700
   0x4006fb:    call   0x4007c0
   0x400700:    mov    DWORD PTR [rbp-0x54],0x0
   0x400707:    lea    rdi,[rbp-0x50]
   0x40070b:    movsxd rax,DWORD PTR [rbp-0x54]

Here I've highlighted some call functions, and we can see the first three are to printf, the fourth is to scanf, and then immediately following the scanf are two strlen calls, presumably to compare what I've entered with the password!

So let's break on the two strlen calls after the scanf to see if we can find what it's comparing:

(gdb) b *0x4006dc
Breakpoint 3 at 0x4006dc
(gdb) b * 0x4006e9
Breakpoint 4 at 0x4006e9

Next I'll continue execution so I can let the program prompt me for input (executing the scanf function):

(gdb) c
Continuing.
Welcome to the RC3 secure password guesser.
To continue, you must enter the correct password.
Enter your guess: HERPDERP

Breakpoint 3, 0x00000000004006dc in ?? ()

We can see the first strlen call is broken before being called.

(gdb) x/i $rip
=> 0x4006dc:    call   0x4004d0 <strlen@plt>

Now if we examine the rdi register, we'll see the memory address that will have its strlen computed:

(gdb) x/s $rdi
0x7fffffffded0: "HERPDERP"

Which is plainly the string I entered! If we use ni then gdb will execute the strlen call without stepping into it, and the result will be returned into the rax register, which will then be stored onto the stack:

(gdb) ni
0x00000000004006e1 in ?? ()
(gdb) i r rax
rax            0x8  8
(gdb) x/3i $rip
=> 0x4006e1:    mov    rdi,QWORD PTR [rbp-0x70]
   0x4006e5:    mov    QWORD PTR [rbp-0x80],rax
   0x4006e9:    call   0x4004d0 <strlen@plt>

But before rax is stored onto the stack using mov QWORD PTR [rbp-0x80],rax, we see that the argument for the next strlen is set up with mov rdi,QWORD PTR [rbp-0x70]. We can examine this memory address to see what is stored there as well:

(gdb) x/s *(int **)($rbp-0x70)
0x7fffffffdf00: ":\"AL_RT^L*.?+6/46"

We have to cast the value at the named memory address ($rbp-0x70) as a pointer to a pointer (int **) before dereferencing it (*), which allows gdb's string examine x/s to interpret it properly. Doing so shows us the curious string: :"AL_RT^L*.?+6/46.

You'll notice my reproduction of the string removes the \ before the " since GDB uses double quotes to reference strings, and so must escape a literal double quote character.

When we ni three times, we'll see the length of the string returned as 17 characters (0x11):

(gdb) ni
0x00000000004006e5 in ?? ()
(gdb) ni

Breakpoint 4, 0x00000000004006e9 in ?? ()
(gdb) ni
0x00000000004006ee in ?? ()
(gdb) i r rax
rax            0x11 17

Now that the string length of both our input string (HERPDERP) and the embedded strange string (:"AL_RT^L*.?+6/46) have been computed, we can see that the next three instructions will perform a comparison and branch based on these two numbers:

(gdb) x/4i $rip
   0x4006ee:    mov    rsi,QWORD PTR [rbp-0x80]
   0x4006f2:    cmp    rsi,rax
   0x4006f5:    jae    0x400700
   0x4006fb:    call   0x4007c0

We can see the highlighted line above is where the comparison takes place, and since we know rax holds 17, then rsi is loaded with 8 (the length of HERPDERP), and the jump will only occur if the value in rsi is greater than 17, which it is not.

We could assume that the program wants us to supply a password >= 17 characters in length, but we can verify what will happen if the jump is not followed by examining the call 0x4007c0 at 0x4006fb:

(gdb) x/6i 0x4007c0
   0x4007c0:    push   rbp
   0x4007c1:    mov    rbp,rsp
   0x4007c4:    sub    rsp,0x10
   0x4007c8:    movabs rdi,0x400950
   0x4007d2:    mov    al,0x0
   0x4007d4:    call   0x4004e0 <printf@plt>
(gdb) x/s 0x400950
0x400950:   "Incorrect password!\n"

In this alternate execution path there is a call to printf, and the argument supplied is stored in rdi with movabs rdi,0x400950. As you can see in the highlighted lines above, this memory address contains the failure string, so we know what this function call's purpose is.

Changing the future

Instead of re-running the program with a new argument, I'll just manually edit the rsi register after it's loaded to hold a value of 17 so that the jump if above or equal statement is followed:

(gdb) ni
0x00000000004006f2 in ?? ()
(gdb) set $rsi=17
(gdb) i r rsi rax
rsi            0x11 17
rax            0x11 17

And now if we ni twice, we'll see that the jae 0x400700 is followed:

(gdb) ni
0x00000000004006f5 in ?? ()
(gdb) ni
0x0000000000400700 in ?? ()
(gdb) x/50i $rip
=> 0x400700:    mov    DWORD PTR [rbp-0x54],0x0
   0x400707:    lea    rdi,[rbp-0x50]
   0x40070b:    movsxd rax,DWORD PTR [rbp-0x54]
   0x40070f:    mov    QWORD PTR [rbp-0x88],rax
   0x400716:    call   0x4004d0 <strlen@plt>
   0x40071b:    mov    rdi,QWORD PTR [rbp-0x88]
   0x400722:    cmp    rdi,rax
   0x400725:    jae    0x4007ac
...
<snip>

With a keen eye we can see that the next block of code starting at 0x400707 is rather similar to the block of code before. On this line, the string starting at $rbp-0x50 is loaded into rdi, its length computed, and compared with a value stored at $rbp-0x54.

Take note of the instruction at 0x400725: jae 0x4007ac. This address is the same as the one examined before which took us to the failure routine.

If we examine rbp-0x50 we see the familiar HERPDERP, and if we look at the disassembly above, we'll see that the value at $ebp-0x54 is initialized to 0. This should remind you of the for loop construct, since that's what this is.

What we can deduce from this segment of code is that it will process each character, incrementing the loop counter by 1 each time, but that if the correct password is not computed before the end of our input string is reached, the program will jump to the failure routine at 0x4007ac.

If we look at the next lines, we'll see what else happens:

   0x40072b:    lea    rdi,[rbp-0x20]
   0x40072f:    movsxd rax,DWORD PTR [rbp-0x54]
   0x400733:    mov    QWORD PTR [rbp-0x90],rax
   0x40073a:    call   0x4004d0 <strlen@plt>
   0x40073f:    mov    rdi,QWORD PTR [rbp-0x90]
   0x400746:    cmp    rdi,rax
   0x400749:    jb     0x400754
   0x40074f:    call   0x4007c0

Here we can see another 0x40073a: call 0x4004d0 <strlen@plt> which will be working with rbp-0x20, storing that address in rdi. Before we get too far ahead of ourselves, let's catch up eip to where we are looking. We'll set a breakpoint at the next opportunity for a branch (0x400725), and single step to show the branch is not taken (to the failure routine):

(gdb) b * 0x400725
Breakpoint 5 at 0x400725
(gdb) c
Continuing.
Breakpoint 5, 0x0000000000400725 in ?? ()
(gdb) x/2i $rip
=> 0x400725:    jae    0x4007ac
   0x40072b:    lea    rdi,[rbp-0x20]
(gdb) ni
0x000000000040072b in ?? ()
(gdb) i r rip
rip            0x40072b 0x40072b

Now that we are caught up to our analysis, we can look ahead again. At our current instruction, we can see that yet another strlen call is being set up, and that it'll be operating on a string stored at rbp-0x20, and when we x/s $rbp-0x20 we see that it is again our mysterious string: ":"AL_RT^L*.?+6/46".

We can also see that before the cmp @ 0x400746, that our loop counter is moved from rbp-0x54 to rbp-0x90. And then it is compared against the length returned from the mystery string, jumping if below (jb, which it is), to 0x400754, or calling 0x4007c0 if they are equal. If we step 5 times we can inspect rdi and rax before the comparison is done:

(gdb) ni
0x000000000040072f in ?? ()
(gdb) ni
0x0000000000400733 in ?? ()
(gdb) ni
0x000000000040073a in ?? ()
(gdb) ni
0x000000000040073f in ?? ()
(gdb) ni
0x0000000000400746 in ?? ()
(gdb) i r rdi rax
rdi            0x0      0
rax            0x11     17
(gdb) ni
0x0000000000400749 in ?? ()
(gdb) ni
0x0000000000400754 in ?? ()

When we examine the next block of code up until the next comparison-and-jump, we'll notice there is a single xor instruction which operates on edx and edi.

(gdb) x/20i $rip
=> 0x400754:    movsxd rax,DWORD PTR [rbp-0x54]
   0x400758:    mov    cl,BYTE PTR [rbp+rax*1-0x20]
   0x40075c:    mov    BYTE PTR [rbp-0x55],cl
   0x40075f:    mov    eax,DWORD PTR [rbp-0x54]
   0x400762:    cdq
   0x400763:    idiv   DWORD PTR [rbp-0x2c]
   0x400766:    movsxd rsi,edx
   0x400769:    mov    cl,BYTE PTR [rbp+rsi*1-0x28]
   0x40076d:    mov    BYTE PTR [rbp-0x56],cl
   0x400770:    movsx  edx,BYTE PTR [rbp-0x55]
   0x400774:    movsx  edi,BYTE PTR [rbp-0x56]
   0x400778:    xor    edx,edi
   0x40077a:    mov    cl,dl
   0x40077c:    mov    BYTE PTR [rbp-0x57],cl
   0x40077f:    movsxd rsi,DWORD PTR [rbp-0x54]
   0x400783:    movsx  edx,BYTE PTR [rbp+rsi*1-0x50]
   0x400788:    movsx  edi,BYTE PTR [rbp-0x57]
   0x40078c:    cmp    edx,edi
   0x40078e:    je     0x400799
   0x400794:    call   0x4007c0

Prior to this line, we can see that both of these registers are set up relative to our loop counter (rbp-0x54) using the instructions 0x400758: mov cl,BYTE PTR [rbp+rax*1-0x20] and mov cl,BYTE PTR [rbp+rsi*1-0x28], highlighted above.

Take the shortcut!

Let's break on 0x400778 to see what values will be xored together:

(gdb) b * 0x400778
Breakpoint 6 at 0x400778
(gdb) c
Continuing.

Breakpoint 6, 0x0000000000400778 in ?? ()
(gdb) i r edx edi
edx            0x3a     58
edi            0x68     104

Hmm, these values fall in the printable ASCII range. Let's see where they came from:

(gdb) x/c $rbp-0x56
0x7fffffffdeea: 104 'h'
(gdb) x/c $rbp-0x55
0x7fffffffdeeb: 58 ':'

Huh, a lowercase h? And the colon from our mystery string! If we go back even further to see how the values were loaded into rbp-0x55 and rbp-0x56:

(gdb) x/s $rbp+$rax-0x20
0x7fffffffdf20: ":\"AL_RT^L*.?+6/46"
(gdb) x/s $rbp+$rax-0x28
0x7fffffffdf18: "harambe"

It seems that we are xoring harambe together with our mystery string! Let's xor these values together ourselves outside this program and see what we get:

Branch to python3

-> % python3
Python 3.5.2 (default, Nov  7 2016, 11:31:36) 
[GCC 6.2.1 20160830] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> k = 'harambe'
>>> c = ":\"AL_RT^L*.?+6/46"
>>> from itertools import cycle
>>> ''.join([chr(ord(x[0]) ^ ord(x[1])) for x in zip(c, cycle(k))])
'RC3-2016-XORISGUD'

And we have our key!