Analysis of Virtual Machine Protection Technology
Virtual Machine (VM)
The virtual machine discussed in this article is not the same thing as VMware or other virtualization environments. It is a virtual machine-based code protection technology (Virtual Machine-Based Protection, VMP). To be precise, the virtual machine discussed here is an interpretive execution system (for example, the P-Code compilation mode in Visual Basic 6). Many current languages and platforms (such as Ruby, Python, Lua, and .NET) are also interpreted from a certain perspective.
Interpretive execution means that the virtual machine decodes and executes bytecode instructions one by one, rather than compiling the entire program into machine code before execution. This approach is flexible, and because the code is only decoded at runtime, the program's logic is harder to read from the file on disk.
P-Code (Pseudo Code) compilation in Visual Basic 6 compiles Visual Basic code into intermediate code instead of generating native machine code directly. The VB6 interpreter then executes this intermediate code at runtime.
Python bytecode is an intermediate representation form of Python programs, similar to P-Code or Java bytecode.
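As a quick illustration of what "bytecode" means here, Python's standard `dis` module can list the instructions that CPython compiles a function into. The function `add` below is a hypothetical example, and the exact opcode names differ between CPython versions.

```python
import dis


def add(a, b):
    # Hypothetical example function; any small function would do.
    return a + b


# Each entry is one bytecode instruction that the CPython interpreter
# fetches and executes in turn.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)
```

The printed list is the intermediate form that the interpreter walks through, in the same way a protection virtual machine walks through its own bytecode.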
VMP
Virtual machine protection (VMP) is a technology that converts executable code written for the x86 instruction set into code for a custom bytecode instruction set, so that the original instructions are harder to reverse engineer or tamper with.
This instruction execution system does not sit at the same level as Intel's x86 instruction set. x86 assembly instructions are executed directly by the CPU, while bytecode instructions are executed by an interpreter, and that interpreter is itself built from x86 instructions.
Working principle:
Code conversion:
Initial compilation: Compile high-level languages (such as C++, C#, etc.) into x86 assembly code or intermediate representation (IR).
Bytecode generation: Convert the x86 assembly code into virtual machine-specific bytecode instructions. The conversion obfuscates and encrypts the instructions, producing bytecode that is difficult to understand.
Virtual machine execution:
Interpreter: During program execution, a special virtual machine interpreter is responsible for interpreting and executing these byte code instructions.
Dynamic execution: The interpreter reads byte code instructions one by one and executes the corresponding operations on the underlying x86 instruction system.
Virtual machine execution status
In the figure above, there are several components:
VStartVM: the part that initializes the virtual machine.
VMDispatcher: the part that dispatches the Handlers. After a Handler finishes executing, control returns to VMDispatcher, forming a loop.
Bytecode: a sequence of data consisting of the instructions and data defined by the instruction execution system.
Handler: a small routine or procedure.
Overall process:
Virtual machine initialization (VStartVM)
Set the initial state of the virtual machine, including registers, stacks, memory, and so on.
Prepare the environment required for the virtual machine to execute.
Instruction scheduling (VMDispatcher)
Read the next instruction from the bytecode stream.
Look up the corresponding Handler according to the instruction's opcode.
Pass control to the found Handler for specific instruction processing.
Instruction processing (Handler)
Each Handler processes specific bytecode instructions and performs the corresponding operations.
After the Handler completes its execution, it returns control to VMDispatcher.
Loop execution
VMDispatcher continues to read the next bytecode instruction, repeating the scheduling and processing process until the program ends.
Throughout the process, the bytecode plays the role of the binary code running on a real CPU and contains all of the program's instructions and data. VMDispatcher is similar to the CPU's fetch and decode logic: it fetches each instruction, decodes it, and sends it to the corresponding execution unit (a Handler). Each Handler acts as an execution unit, corresponds to one instruction in the virtual machine's instruction set, and performs that instruction's specific operation.
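The fetch, dispatch, and handler loop described above can be sketched in a few lines of Python. This is only a toy model: the three opcodes (0 = push constant, 1 = add, 2 = halt), the handler names, and the encoding are assumptions made for illustration and do not come from any real protector.

```python
def run(bytecode):
    """Interpret a tiny, made-up bytecode; returns the final VM stack."""
    stack = []
    pc = 0                      # virtual instruction pointer
    running = True

    def h_push():               # opcode 0: push the next byte as a constant
        nonlocal pc
        stack.append(bytecode[pc])
        pc += 1

    def h_add():                # opcode 1: pop two values, push their 32-bit sum
        b, a = stack.pop(), stack.pop()
        stack.append((a + b) & 0xFFFFFFFF)

    def h_halt():               # opcode 2: stop the dispatch loop
        nonlocal running
        running = False

    handlers = [h_push, h_add, h_halt]   # handler table indexed by opcode

    while running:              # the dispatcher loop
        opcode = bytecode[pc]   # fetch the next opcode
        pc += 1
        handlers[opcode]()      # dispatch; returning resumes the loop
    return stack


print(run(bytes([0, 2, 0, 3, 1, 2])))   # push 2, push 3, add, halt
```

Indexing a table by opcode keeps the dispatcher tiny, and all instruction-specific logic lives in the handlers.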
VMContext
The **VMContext** structure stores the state of the virtual registers in the virtual machine's execution environment. It contains all the important registers that must be saved and operated on while virtual machine instructions execute.
struct VMContext
{
DWORD v_eax;
DWORD v_ebx;
DWORD v_ecx;
DWORD v_edx;
DWORD v_esi;
DWORD v_edi;
DWORD v_ebp;
DWORD v_efl; // Flags register (virtual EFLAGS)
};
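Handlers address these fields as fixed offsets from a base pointer (for example `[edi + 0x1C]` for `v_efl`), so it helps to see the layout. The sketch below mirrors the structure with Python's `ctypes`, assuming `DWORD` is an unsigned 32-bit integer and that the fields are laid out without padding.

```python
import ctypes


class VMContext(ctypes.Structure):
    # DWORD is modelled as an unsigned 32-bit integer (assumption).
    _fields_ = [
        ("v_eax", ctypes.c_uint32),
        ("v_ebx", ctypes.c_uint32),
        ("v_ecx", ctypes.c_uint32),
        ("v_edx", ctypes.c_uint32),
        ("v_esi", ctypes.c_uint32),
        ("v_edi", ctypes.c_uint32),
        ("v_ebp", ctypes.c_uint32),
        ("v_efl", ctypes.c_uint32),
    ]


# Print the byte offset of every virtual register inside the structure.
for name, _ in VMContext._fields_:
    print(f"{name}: offset 0x{getattr(VMContext, name).offset:02X}")
print("size:", ctypes.sizeof(VMContext))
```

Each field is 4 bytes, so register index `n` lives at offset `n * 4`.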
Calling convention
In virtual machine protection technology, specific register conventions are very important for the execution process of the virtual machine. They ensure that the virtual machine can correctly access and process critical data structures and instruction streams during execution.
As described above, the virtual machine execution process is:
The virtual machine starts by calling VStartVM, initializes the virtual machine context and stack, and saves the current register state.
Load the bytecode address to ESI and set the virtual machine stack and context address.
VMDispatcher reads the opcode from ESI and finds the address of the handler from the JUMPADDR table according to the opcode.
According to the definition of the opcode, the program executes specific instruction operations, such as register operations, memory access, etc.
Return to VMDispatcher to continue dispatching and executing the next bytecode instruction.
The code is as follows:
section .data
; Define the JUMPADDR table, each entry stores the address of a handler
JUMPADDR dd Handler0, Handler1, Handler2, ... ;
section .text
global VStartVM
VStartVM:
push eax ; Save the general-purpose registers and EFLAGS to the stack (8 pushes, 0x20 bytes in total)
push ebx
push ecx
push edx
push esi
push edi
push ebp
pushfd
; Set virtual machine registers and stack
mov esi, [esp + 0x20] ; Load the value on the stack (bytecode address) into ESI
mov ebp, esp ; Set EBP to the current ESP (virtual machine stack)
sub esp, 0x200 ; Allocate space for the virtual machine stack
mov edi, esp ; Set EDI to the new ESP (virtual machine context)
; Enter the virtual machine scheduling loop
jmp VMDispatcher
VMDispatcher:
mov al, byte ptr [esi] ; Read a byte (opcode) from the bytecode address pointed to by ESI
inc esi ; Increment ESI to point to the next byte
movzx eax, al ; Extend AL to EAX
mov eax, dword ptr [eax * 4 + JUMPADDR] ; Obtain the address of the handler from the JUMPADDR table according to the opcode
jmp eax ; Jump to the address of the handler to execute
The register conventions can be summarized as follows:
edi points to the start address of VMContext
The edi register always points to the memory address of the VMContext structure while the virtual machine is running. Through this address, the virtual machine can conveniently access and modify the state of the virtual registers.
esi points to the address of the bytecode
The esi register points to the memory area containing the bytecode instructions. The virtual machine interpreter uses esi to read instructions byte by byte and executes the corresponding Handler according to each instruction's opcode.
ebp points to the VM stack address
The ebp register points to the virtual machine stack while the virtual machine is running. The virtual machine stack is used to store local variables, function parameters, return addresses, and so on.
Handler design
The Handlers in VMP are not Windows handles but small routines or procedures that VMDispatcher dispatches. Each Handler corresponds to one bytecode instruction and is responsible for executing a specific operation.
Handlers fall into two major categories: Auxiliary Handlers and Normal Handlers. Auxiliary Handlers execute important, basic instructions, usually related to the virtual machine's core functions and context management, such as pushing to and popping from the stack. Normal Handlers execute conventional x86 instructions, such as arithmetic operations, logical operations, comparisons, and branches.
Auxiliary Handler
Push:
; Get the value to be pushed onto the stack, assuming the value is in eax
sub ebp, 4 ; Move stack pointer down to reserve 4 bytes of space
mov [ebp], eax ; Push the value in eax onto the top of the stack
jmp VMDispatcher ; Return to dispatcher
Pop:
; Pop the value from the top of the stack into eax
mov eax, [ebp] ; Get value from top of stack to eax
add ebp, 4 ; Move stack pointer up to release 4 bytes of space
jmp VMDispatcher ; Return to dispatcher
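The two handlers above can be modelled in Python to make the stack discipline explicit. This is a sketch under assumptions: the class name `VMStack`, the flat `bytearray` memory, and its size are hypothetical, and values are stored as little-endian 32-bit words as on x86.

```python
import struct


class VMStack:
    """Toy model of a downward-growing VM stack addressed through ebp."""

    def __init__(self, size=0x100):
        self.mem = bytearray(size)   # hypothetical backing memory
        self.ebp = size              # empty stack: ebp sits just past the end

    def push(self, value):
        self.ebp -= 4                                         # sub ebp, 4
        struct.pack_into("<I", self.mem, self.ebp, value & 0xFFFFFFFF)  # mov [ebp], eax

    def pop(self):
        (value,) = struct.unpack_from("<I", self.mem, self.ebp)  # mov eax, [ebp]
        self.ebp += 4                                            # add ebp, 4
        return value


vm_stack = VMStack()
vm_stack.push(0x11111111)
vm_stack.push(0x22222222)
first = vm_stack.pop()
second = vm_stack.pop()
print(hex(first), hex(second))
```

Values come back in last-in, first-out order, and after matching pushes and pops `ebp` returns to its starting value.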
Normal Handler
Add:
mov eax, [edi + 0] ; Load value from virtual register v_eax to eax
mov ebx, [edi + 4] ; Load value from virtual register v_ebx to ebx
add eax, ebx ; Perform addition operation
mov [edi + 0], eax ; Store result back to virtual register v_eax
jmp VMDispatcher ; Return to dispatcher
Sub:
mov eax, [edi + 0]
mov ebx, [edi + 4]
sub eax, ebx
mov [edi + 0], eax
jmp VMDispatcher
And:
mov eax, [edi + 0]
mov ebx, [edi + 4]
and eax, ebx
mov [edi + 0], eax
jmp VMDispatcher
Mov:
movzx eax, byte ptr [esi] ; Read the one-byte source register index
movzx ebx, byte ptr [esi + 1] ; Read the one-byte target register index
mov ecx, [edi + eax * 4] ; Load value from source virtual register to ecx
mov [edi + ebx * 4], ecx ; Store the value into the target virtual register
add esi, 2 ; Skip the source and target register indices in the bytecode
jmp VMDispatcher
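The operand decoding in the Mov handler can be sketched in Python as follows. The encoding assumed here, one byte for the source index followed by one byte for the target index, is inferred from the `add esi, 2` step; the function name `v_mov` and the list-based register file are hypothetical.

```python
def v_mov(regs, bytecode, pc):
    """Copy one virtual register to another; returns the updated pc."""
    src = bytecode[pc]          # source register index (one byte)
    dst = bytecode[pc + 1]      # target register index (one byte)
    regs[dst] = regs[src]       # load from the source slot, store to the target slot
    return pc + 2               # skip both operand bytes (add esi, 2)


regs = [0] * 8                  # v_eax, v_ebx, v_ecx, v_edx, v_esi, v_edi, v_ebp, v_efl
regs[1] = 0xDEADBEEF            # v_ebx
pc = v_mov(regs, bytes([1, 0]), 0)   # operands: source v_ebx, target v_eax
print(hex(regs[0]), pc)
```

The register index is multiplied by 4 in the assembly version because each virtual register occupies one 4-byte slot in VMContext; the Python list hides that scaling.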
Flag Bit Handling
VStcHandler:
push dword ptr [edi + 0x1C] ; Push v_efl from the virtual machine context
popfd ; Load the virtual flags into the real EFLAGS
stc ; Set the carry flag (CF)
pushfd ; Push the modified EFLAGS
pop dword ptr [edi + 0x1C] ; Store the modified flags back to v_efl
jmp VMDispatcher ; Return to dispatcher
Many x86 instructions involve flag bits: some set them and others test them. The virtual flags in v_efl should therefore be loaded into EFLAGS before the flag-related operation and saved back to v_efl afterwards. In the Handler above, stc sets the CF flag to 1.
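The effect of the stc handler on the stored flags can be expressed as plain bit arithmetic. The sketch below relies on the documented x86 EFLAGS layout (CF is bit 0, ZF is bit 6); the function name `v_stc` and the sample flags value are illustrative assumptions.

```python
CF = 1 << 0    # carry flag: bit 0 of EFLAGS
ZF = 1 << 6    # zero flag: bit 6 of EFLAGS


def v_stc(v_efl):
    """Return v_efl with the carry flag set and every other bit unchanged."""
    return v_efl | CF


v_efl = 0x00000246             # sample value: IF, ZF, PF and reserved bit 1 set
v_efl = v_stc(v_efl)
print(hex(v_efl))
```

Only bit 0 changes, which is why the virtual flags must be carried through the handler intact rather than replaced wholesale.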
Branch Instructions
Branch instructions include conditional jumps, unconditional jumps, call, and retn.
When implementing branch instructions, note that esi points to the address of the current bytecode. esi plays the same role as the eip register of a real CPU, so rewriting the value of esi changes the control flow.
The Handler for the unconditional jump instruction jmp is as follows:
vJmp:
mov esi, dword ptr [ebp] ; The top of the VM stack holds the address to jump to
add ebp, 4 ; Pop the address from the VM stack
jmp VMDispatcher ;
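To see why overwriting the virtual instruction pointer is enough to change control flow, here is a small Python sketch. The symbolic program format (a list of `(op, arg)` tuples) and the function name `run_trace` are hypothetical, and the jump target is assumed to have been pushed onto the VM stack beforehand.

```python
def run_trace(program):
    """Run a symbolic program and return the list of executed operations."""
    stack, trace, pc = [], [], 0
    while True:
        op, arg = program[pc]
        pc += 1                  # advance past the current instruction
        trace.append(op)
        if op == "push":
            stack.append(arg)    # push a jump target (an index into program)
        elif op == "jmp":
            pc = stack.pop()     # overwrite the instruction pointer
        elif op == "halt":
            return trace
        # any other op (such as "nop") does nothing


program = [("push", 3), ("jmp", None), ("nop", None), ("halt", None)]
trace = run_trace(program)
print(trace)
```

The `nop` at index 2 never runs, because the jump replaces the instruction pointer before the dispatcher fetches again.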
Conditional branch instructions such as ja and je are slightly more troublesome, because they choose the flow based on the flag bits. They can be implemented by substituting conditional move instructions.
Conditional Branch Instruction | Conditional Move Instruction |
---|---|
ja | cmova |
jae | cmovae |
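
The substitution in the table amounts to selecting the next bytecode address from the flags instead of branching in native code. The sketch below models that selection in Python using the documented x86 conditions (`ja`: CF = 0 and ZF = 0; `jae`: CF = 0). The function names and addresses are hypothetical, and Python's conditional expression only stands in for the `cmov` instruction.

```python
CF = 1 << 0    # carry flag: bit 0 of EFLAGS
ZF = 1 << 6    # zero flag: bit 6 of EFLAGS


def v_ja(v_efl, fallthrough, target):
    """Pick the next pc the way cmova would: taken when CF == 0 and ZF == 0."""
    above = (v_efl & (CF | ZF)) == 0
    return target if above else fallthrough


def v_jae(v_efl, fallthrough, target):
    """Pick the next pc the way cmovae would: taken when CF == 0."""
    above_or_equal = (v_efl & CF) == 0
    return target if above_or_equal else fallthrough


print(v_ja(0, 10, 50), v_ja(ZF, 10, 50), v_jae(ZF, 10, 50), v_jae(CF, 10, 50))
```

Both candidate addresses are computed up front and the flags only decide which one becomes the new instruction pointer, so the handler itself contains no conditional jump.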
