Inspiration

The inspiration for this came from learning about DLL injection in a Data Structures lecture on libraries, which covered some of their advantages, drawbacks, and security considerations. While learning how to implement such an attack using the Win32 API and running published examples through a debugger for reverse engineering, I noticed that function calls in the Win32 API are not actually processed in the API itself, but forwarded to another library, NTDLL. Because Win32 functions are more abstracted, with more processing (and validation) done before a request reaches the kernel, calling functions directly from the NTAPI lends itself to more effective antivirus/endpoint detection evasion. In essence, the fewer layers there are between our program and the kernel, the less likely our program is to get caught by a machine's onboard security.

What it does

The program and its associated header compile to an executable that takes a process ID as an argument. It starts by getting the address of NTDLL and saving it as a module handle. We then instantiate our own renditions of NTAPI functions by searching for their addresses in NTDLL. The logic for executing the attack is as follows:

1) Use the process ID to get a handle to the process.
2) Allocate a buffer within the virtual address space of the hosting process, sized to our payload, which in this case is shellcode.
3) Write the shellcode to the buffer we just allocated.
4) Start a thread that executes the shellcode in tandem with the hosting process.
5) Close the process handle and deallocate memory.

Technically the injection could be any malicious code, but because this attack aims for a reverse shell, we'll also set up a listener on a VM, which receives the request for a shell from the thread started in step 4. At that point, the attack is successful and remote code execution is possible.

How we built it

In short, though PIDs and handles are similar in concept, a PID is just a number that identifies a process, while a handle is an abstract reference to that process managed by the operating system. Because many lower-level functions take a process handle as a parameter, getting a handle is required. The program then gets the address of NTDLL so we can call NTAPI functions through their own addresses. Win32 and NTAPI functions are very similar and often have one-to-one equivalents. But because the NTAPI is largely undocumented, we need to search for the function addresses ourselves. I used structs and function prototypes to describe the parameters so that when our instantiations are called, we can pass our own arguments and they'll be processed as listed in the prototype. Additional structs are included because the prototypes depend on them.

Challenges we ran into

The Native API is undocumented, but thanks to the work of reverse engineers, there are repositories that collect what is known about it, though it is subject to change and can be unstable depending on system architecture. Research and 'Google Hacking' are some of the skills I honed while building out the header file in a way that's stable. Debugging went differently than expected. Normally, functions return values of type bool, int, void, etc., but NT functions return status values. So understanding where the program went wrong involved printing the status value in hex and using Microsoft's database of error codes to see where the errors lay.
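As a rough illustration of the status-value debugging described above, here is a minimal sketch in portable C. It is not the project's code: `nt_status_t`, `STATUS_OK`, `status_severity`, `severity_name`, and `format_status` are hypothetical names, and `nt_status_t` is only a stand-in for the real Windows NTSTATUS type. The sketch relies on the published NTSTATUS layout, where the top two bits carry the severity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the Windows NTSTATUS type (a signed 32-bit value). */
typedef int32_t nt_status_t;

/* Same idea as the NT_SUCCESS check: non-negative means success or informational. */
#define STATUS_OK(s) ((nt_status_t)(s) >= 0)

/* The top two bits encode severity: 0 success, 1 informational, 2 warning, 3 error. */
unsigned status_severity(nt_status_t s) {
    return ((uint32_t)s >> 30) & 0x3u;
}

const char *severity_name(nt_status_t s) {
    static const char *names[] = {"success", "informational", "warning", "error"};
    return names[status_severity(s)];
}

/* Writes text such as "0xC0000022 (error)" into buf and returns its length. */
int format_status(nt_status_t s, char *buf, size_t len) {
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "0x%08X (%s)", (unsigned)(uint32_t)s, severity_name(s));
}
```

The hex string produced by `format_status` is the form you would search for in Microsoft's list of NTSTATUS values; for example, 0xC0000022 is STATUS_ACCESS_DENIED.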

Accomplishments that we're proud of

I'm proud of overcoming inexplicable error messages, which deepened my understanding of memory addressing and obscure data types unique to Windows. An error of note was when the variable containing the shellcode would be inadvertently overwritten while attempting to write the data to the allocated buffer. I'm also glad to have adopted some of the quality-of-life improvements that other programmers before me have come up with. Namely, using macros with variadic arguments for easy print statements made debugging more efficient and let me focus more on reading error codes and less on the syntax of every printf() statement. Unfortunately, even long stretches of time spent working on a single error still led to missing single-character typos that would inexplicably botch the entire program. I can't explain the satisfaction I had when these were finally caught.
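The variadic print macros mentioned above might look something like the following. This is a sketch under my own assumptions, not the macros used in the project: the names `log_line`, `OKAY`, `INFO`, and `WARN` and the "[+]", "[*]", "[-]" tags are hypothetical. Passing the format string as part of `__VA_ARGS__` keeps the macros valid in standard C99 even when there are no extra arguments.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: prints "<tag> <formatted message>\n" to out
 * and returns the total number of characters written. */
int log_line(FILE *out, const char *tag, const char *fmt, ...) {
    va_list args;
    int n = fprintf(out, "%s ", tag);
    va_start(args, fmt);
    n += vfprintf(out, fmt, args);
    va_end(args);
    n += fprintf(out, "\n");
    return n;
}

/* The format string travels inside __VA_ARGS__, so OKAY("done") and
 * INFO("value %d", 42) both expand to valid calls. */
#define OKAY(...) log_line(stdout, "[+]", __VA_ARGS__)
#define INFO(...) log_line(stdout, "[*]", __VA_ARGS__)
#define WARN(...) log_line(stderr, "[-]", __VA_ARGS__)
```

A call such as `WARN("call failed, status 0x%08X", status)` then replaces a hand-written printf() with its own prefix and newline, which is where most of the single-character slips tend to creep in.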

What we learned

The theory for this project relies on the understanding that most of our applications run in an outer 'ring' (user space), with oversight and built-in protection against accidental or intentional misuse of lower-level functions. But the closer an application can get to the inner 'ring' (kernel space), the easier it is to interact directly with computer hardware and execute instructions without those checks.

What's next for Native API Injection Attack

While the attack in its current state does allow for a reverse TCP shell, there is plenty of room for improvement. The intent was to better evade antivirus, yet compilation and execution aren't even guaranteed with antivirus turned on. I noticed that AirRowdy's Wi-Fi settings don't even allow the shell to be established (or at least this is my intuition, given that it has worked while connected to other networks). The code layout can be cleaned up and optimized, and the shellcode can be placed in a separate .data file instead of directly in the file involved in the injection. The shellcode is signatured, which is what makes it so easy for antivirus to see, so obfuscation using RC4 or XOR encryption is also possible. Ultimately, the next step would be to invoke system calls directly in lieu of Native API function calls, since Native API functions are among those that often get 'hooked' by antivirus solutions. System calls are a subject I'm very new to, but this would be the next step in a logical progression of increasingly evasive injection attacks.

Built With

  • c
  • metasploit
  • msfvenom
  • ntapi
  • ntdoc
  • win32