Windows PE and PE File Header Programming
Basic Concepts
Introduction
PE (Portable Executable) is a file format for executables, object code, and dynamic link libraries. It is used mainly by the Windows operating system, and the common format lets linked EXE files be described in the same way across the different CPU instruction sets that Windows supports.
Windows has many kinds of executable programs, such as COM, PIF, SCR, and EXE. Several of these are PE files (COM and PIF are older, non-PE formats): EXE is the most common PE file, and dynamic link library (DLL) files are also in PE format.

The most common file format in Linux is the ELF file format.
Address
In the PE file structure, generally, four types of addresses are involved, namely:
Virtual Memory Address (VA)
Relative Virtual Memory Address (RVA)
File Offset Address (FOA)
Base Address (Imagebase)
Virtual Memory Address:
After the operating system loads a PE file into memory, the process gets its own virtual address space. An address in this space is called a virtual address (VA); it is an abstract address rather than a physical one.
Base Address:
After the PE file is loaded into memory, its related dynamic link libraries are also loaded. At this time, the loaded file is called a module (Module), and the starting address of the mapping file is called a module handle (hModule), which can be used to access other data structures in memory. This initial memory address is also known as the base address (ImageBase)
The base address indicates where the operating system should start storing the module, and the base address of different modules is generally different.
Relative virtual memory address:
An RVA is an offset relative to the base address: it locates a specific position in virtual memory, and its value is the distance from that position to the base address of the module that contains it. An RVA is therefore specific to a module.
The relationship is: VA = ImageBase + RVA.
Note: RVA is specific to a module, therefore RVA has a range, from the beginning of the module to the end of the module, RVA outside this range is invalid and is called out of bounds.
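The relationship between VA, RVA, and ImageBase can be sketched in a few lines of C. This is only an illustration under assumptions: the helper names (rva_to_va, va_to_rva, rva_in_bounds) are hypothetical, and 32-bit addresses are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: VA = ImageBase + RVA. */
static uint32_t rva_to_va(uint32_t image_base, uint32_t rva) {
    return image_base + rva;
}

/* Hypothetical helper: the inverse, RVA = VA - ImageBase. */
static uint32_t va_to_rva(uint32_t image_base, uint32_t va) {
    return va - image_base;
}

/* An RVA is in bounds only if it lies inside the module, i.e. in [0, module_size). */
static bool rva_in_bounds(uint32_t rva, uint32_t module_size) {
    return rva < module_size;
}
```

For a module loaded at 00400000h, an RVA of 1000h corresponds to the VA 00401000h.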
File offset address:
FOA is unrelated to memory; it is the offset from the start of the file to a certain position. When you open a PE file with a hexadecimal editor such as WinHex, the offsets you see are FOAs.
Alignment
The concept of alignment is present in many file formats. In PE, three types of alignment are classified: alignment of data in memory, alignment of data in the file, and alignment of resource data in resource files.
Memory alignment
Because Windows manages memory with a paging mechanism, memory is divided into pages, so the alignment unit of sections in memory is normally at least the size of one page. On x86 and x64 Windows the page size is 4KB (1000h); IA-64 (Itanium) systems use 8KB (2000h) pages.
File alignment
The alignment unit for sections in the file is generally much smaller than the memory alignment unit. It is usually 512 bytes (200h).
Resource data alignment
In resource files, the resource bytecode generally has to be aligned on double words (4 bytes).
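All three kinds of alignment come down to rounding a size or offset up to a multiple of the alignment unit. A minimal sketch, assuming the unit is a power of two (true for the values quoted above); align_up is a hypothetical helper name:

```c
#include <stdint.h>

/* Round value up to the next multiple of alignment.
   Assumption: alignment is a non-zero power of two (200h, 1000h, 4, ...). */
static uint32_t align_up(uint32_t value, uint32_t alignment) {
    return (value + alignment - 1u) & ~(alignment - 1u);
}
```

For example, 1234h bytes of section data occupy 1400h bytes in a file aligned to 200h, and 2000h bytes in memory aligned to 1000h.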
Paging mechanism is a commonly used memory management method in modern operating systems. By dividing physical memory into fixed-size pages (Page) and dividing the virtual address space of the process into pages of the same size, the conversion from virtual address to physical address can be mapped through the page table (Page Table).
The page table is used to store the mapping relationship of pages, that is, the correspondence between virtual addresses and physical addresses. Each entry in the page table is called a page table entry (PTE), which contains the address of the page in physical memory and some control information (such as whether the page is in memory, access permissions, etc.).
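With fixed-size pages, a virtual address splits into a page number (used to look up the page table entry) and an offset inside the page. The following is a simplified sketch that assumes 4KB pages and a flat 32-bit address; real page tables are multi-level, and the names used here are hypothetical.

```c
#include <stdint.h>

/* Assumption: 4KB pages, so the low 12 bits are the offset within the page. */
#define PE_PAGE_SHIFT 12u
#define PE_PAGE_SIZE  (1u << PE_PAGE_SHIFT)

/* Page number: the part of the address that selects a page table entry. */
static uint32_t page_number(uint32_t va) {
    return va >> PE_PAGE_SHIFT;
}

/* Offset of the address inside its page. */
static uint32_t page_offset(uint32_t va) {
    return va & (PE_PAGE_SIZE - 1u);
}
```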
Overview of PE structure
PE structure diagram
PE structure as seen by programmers.
As shown in the figure above, a standard PE file generally consists of four major parts:
DOS header
PE header (IMAGE_NT_HEADERS)
Section table (multiple IMAGE_SECTION_HEADER structures)
Section content
Among them, the data structure of the PE header is the most complex. Simply put, the PE header includes:
4-byte signature (Signature)
20-byte standard header information (IMAGE_FILE_HEADER)
224-byte extended header information (IMAGE_OPTIONAL_HEADER32)
From the perspective of a "header + body" organization of the information:
PE file header = DOS header + PE header + section table
PE file body = section content
Various data structures appear in the section content, such as the import table, export table, resource table, and relocation table.
PE file header
DOS MZ header
The DOS MZ header exists mainly so that modern PE files stay compatible with early DOS files. In the Windows PE format it is defined by the IMAGE_DOS_HEADER structure.
It is 64 bytes in size, with two important members being:
e_magic: DOS signature (4D5A, MZ)
e_lfanew: Indicates the offset of the NT header (the value is different for different files)
The DOS MZ header is followed by the DOS stub. The entire DOS stub is a block of bytes whose content depends on the linker used, and PE defines no structure for it.
Example:
When analyzing malicious samples, one often sees malware searching the computer for PE executable files. Such code typically recognizes a PE file by checking e_magic and then following e_lfanew to the PE signature.
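The kind of check such code performs can be sketched as follows. This is an illustration under assumptions rather than code from any particular sample: the file is assumed to be already read into a byte buffer, the helper names (rd16, rd32, looks_like_pe) are hypothetical, and the PE signature value 00004550h is the one described in the next section.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Little-endian reads that do not depend on host alignment or byte order. */
static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns true if buf starts with "MZ" and has "PE\0\0" at offset e_lfanew. */
static bool looks_like_pe(const uint8_t *buf, size_t size) {
    if (size < 0x40) return false;           /* IMAGE_DOS_HEADER is 64 bytes */
    if (rd16(buf) != 0x5A4D) return false;   /* e_magic: bytes 4D 5A, "MZ" */
    uint32_t e_lfanew = rd32(buf + 0x3C);    /* e_lfanew is the last DOS header field */
    if (e_lfanew > size - 4) return false;   /* signature must fit in the buffer */
    return rd32(buf + e_lfanew) == 0x00004550u;
}
```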
PE header (NT header)
The NT header contains important information for loading executable files on Windows. It is defined by the IMAGE_NT_HEADERS structure.
As the name of the structure suggests, IMAGE_NT_HEADERS is composed of several parts: the Signature (IMAGE_NT_SIGNATURE), IMAGE_FILE_HEADER, and IMAGE_OPTIONAL_HEADER.
The position of the NT header in the PE file is not fixed; it is given by the e_lfanew field of the DOS header.
When the executable is run on an operating system that supports the PE format, the PE loader reads the starting offset of the NT header from the e_lfanew field of the IMAGE_DOS_HEADER structure and adds it to the base address to obtain a pointer to the PE header.
Definition:
PE header identifier Signature
The PE file identifier is defined as 00004550h; the corresponding ASCII string is "PE\0\0".
It is the Signature member of the IMAGE_NT_HEADERS structure. It follows the DOS stub, at the position pointed to by IMAGE_DOS_HEADER.e_lfanew.
If the file identifier is changed, the operating system will not be able to recognize the file as the correct PE file.
Standard PE header IMAGE_FILE_HEADER
File header IMAGE_FILE_HEADER
The twenty bytes that follow the PE identifier hold the standard PE header data structure.
Microsoft's official documentation calls this structure the standard Common Object File Format (COFF) header. It records global properties of the PE file, such as the platform on which the file runs, the type of the file (EXE or DLL), and the total number of sections. Most importantly, it gives the size of the next structure, IMAGE_OPTIONAL_HEADER32.
Taking this program as an example:
Pay special attention to the Machine member, NumberOfSections member, SizeOfOptionalHeader member, and Characteristics member.
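The layout of the standard header can be sketched with fixed-width types. This is a portable mirror of IMAGE_FILE_HEADER for illustration (the real definition lives in winnt.h); the type name FileHeader and the helpers read_file_header and section_table_offset are hypothetical. Offsets in the comments are relative to the PE signature.

```c
#include <stdint.h>

static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

typedef struct {
    uint16_t Machine;              /* +04h */
    uint16_t NumberOfSections;     /* +06h */
    uint32_t TimeDateStamp;        /* +08h */
    uint32_t PointerToSymbolTable; /* +0Ch */
    uint32_t NumberOfSymbols;      /* +10h */
    uint16_t SizeOfOptionalHeader; /* +14h */
    uint16_t Characteristics;      /* +16h */
} FileHeader;                      /* 20 bytes */

/* nt points at the PE signature; the file header starts 4 bytes later. */
static FileHeader read_file_header(const uint8_t *nt) {
    FileHeader h;
    h.Machine              = rd16(nt + 0x04);
    h.NumberOfSections     = rd16(nt + 0x06);
    h.TimeDateStamp        = rd32(nt + 0x08);
    h.PointerToSymbolTable = rd32(nt + 0x0C);
    h.NumberOfSymbols      = rd32(nt + 0x10);
    h.SizeOfOptionalHeader = rd16(nt + 0x14);
    h.Characteristics      = rd16(nt + 0x16);
    return h;
}

/* The section table starts right after the optional header:
   e_lfanew + 4 (signature) + 20 (file header) + SizeOfOptionalHeader. */
static uint32_t section_table_offset(uint32_t e_lfanew, const FileHeader *h) {
    return e_lfanew + 4u + 20u + h->SizeOfOptionalHeader;
}
```

This is why SizeOfOptionalHeader matters so much: without it, the section table cannot be located.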
Machine
+0004h, a single word.
Each CPU has a unique Machine code, used to specify the platform on which the PE file runs. Here it is 0x8664, corresponding to the AMD64 CPU.
#define IMAGE_FILE_MACHINE_UNKNOWN 0
#define IMAGE_FILE_MACHINE_TARGET_HOST 0x0001 // Useful for indicating we want to interact with the host and not a WoW guest.
#define IMAGE_FILE_MACHINE_I386 0x014c // Intel 386.
#define IMAGE_FILE_MACHINE_R3000 0x0162 // MIPS little-endian, 0x160 big-endian
#define IMAGE_FILE_MACHINE_R4000 0x0166 // MIPS little-endian
#define IMAGE_FILE_MACHINE_R10000 0x0168 // MIPS little-endian
#define IMAGE_FILE_MACHINE_WCEMIPSV2 0x0169 // MIPS little-endian WCE v2
#define IMAGE_FILE_MACHINE_ALPHA 0x0184 // Alpha_AXP
#define IMAGE_FILE_MACHINE_SH3 0x01a2 // SH3 little-endian
#define IMAGE_FILE_MACHINE_SH3DSP 0x01a3
#define IMAGE_FILE_MACHINE_SH3E 0x01a4 // SH3E little-endian
#define IMAGE_FILE_MACHINE_SH4 0x01a6 // SH4 little-endian
#define IMAGE_FILE_MACHINE_SH5 0x01a8 // SH5
#define IMAGE_FILE_MACHINE_ARM 0x01c0 // ARM Little-Endian
#define IMAGE_FILE_MACHINE_THUMB 0x01c2 // ARM Thumb/Thumb-2 Little-Endian
#define IMAGE_FILE_MACHINE_ARMNT 0x01c4 // ARM Thumb-2 Little-Endian
#define IMAGE_FILE_MACHINE_AM33 0x01d3
#define IMAGE_FILE_MACHINE_POWERPC 0x01F0 // IBM PowerPC Little-Endian
#define IMAGE_FILE_MACHINE_POWERPCFP 0x01f1
#define IMAGE_FILE_MACHINE_IA64 0x0200 // Intel 64
#define IMAGE_FILE_MACHINE_MIPS16 0x0266 // MIPS
#define IMAGE_FILE_MACHINE_ALPHA64 0x0284 // ALPHA64
#define IMAGE_FILE_MACHINE_MIPSFPU 0x0366 // MIPS
#define IMAGE_FILE_MACHINE_MIPSFPU16 0x0466 // MIPS
#define IMAGE_FILE_MACHINE_AXP64 IMAGE_FILE_MACHINE_ALPHA64
#define IMAGE_FILE_MACHINE_TRICORE 0x0520 // Infineon
#define IMAGE_FILE_MACHINE_CEF 0x0CEF
#define IMAGE_FILE_MACHINE_EBC 0x0EBC // EFI Byte Code
#define IMAGE_FILE_MACHINE_AMD64 0x8664 // AMD64 (K8)
#define IMAGE_FILE_MACHINE_M32R 0x9041 // M32R little-endian
#define IMAGE_FILE_MACHINE_ARM64 0xAA64 // ARM64 Little-Endian
#define IMAGE_FILE_MACHINE_CEE 0xC0EE
NumberOfSections
+0006h, a single word.
This member indicates the number of sections existing in the file.
TimeDateStamp
+0008h, a double word.
The timestamp created by the compiler for this file.
PointerToSymbolTable
+000Ch, a double word.
The file offset of the COFF symbol table.
NumberOfSymbols
+0010h, a double word.
The number of elements in the symbol table.
SizeOfOptionalHeader
+0014h, a single word.
Indicates the length of the IMAGE_OPTIONAL_HEADER structure that follows. For 32-bit PE files (IMAGE_OPTIONAL_HEADER32), this field is usually 00E0h; for 64-bit PE32+ files, it is 00F0h.
Note: this value can be customized. After changing it, manually expand IMAGE_OPTIONAL_HEADER32 to the specified size, and make sure the file alignment is still respected after the expansion.
Characteristics
+0016h, a single word.
Identifies file attributes, whether the file is an executable form, whether it is a DLL, etc., combined by bit OR.
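Because the attributes are combined with bitwise OR, each one is tested with a bitwise AND. A minimal sketch: the constant values mirror the IMAGE_FILE_* definitions in winnt.h, but the shortened names and the helper functions here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror IMAGE_FILE_* in winnt.h; the FILE_* names are local to this sketch. */
#define FILE_RELOCS_STRIPPED     0x0001u
#define FILE_EXECUTABLE_IMAGE    0x0002u
#define FILE_LARGE_ADDRESS_AWARE 0x0020u
#define FILE_32BIT_MACHINE       0x0100u
#define FILE_DLL                 0x2000u

/* True if the given attribute bit is set in Characteristics. */
static bool has_flag(uint16_t characteristics, uint16_t flag) {
    return (characteristics & flag) != 0;
}

static bool is_dll(uint16_t characteristics) {
    return has_flag(characteristics, FILE_DLL);
}

static bool is_executable_image(uint16_t characteristics) {
    return has_flag(characteristics, FILE_EXECUTABLE_IMAGE);
}
```

A value of 2102h, for example, decodes as DLL + 32-bit machine + executable image.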
Extended PE header IMAGE_OPTIONAL_HEADER
The IMAGE_OPTIONAL_HEADER structure differs between 32-bit and 64-bit files. The 32-bit version, IMAGE_OPTIONAL_HEADER32, is defined as follows (the offsets in the comments are relative to the start of the NT header):
typedef struct _IMAGE_OPTIONAL_HEADER
{
    //
    // Standard fields.
    //
    WORD  Magic;                       // +18h Signature word: ROM image (0107h), PE32 executable (010Bh), PE32+ executable (020Bh)
    BYTE  MajorLinkerVersion;          // +1Ah Major version number of the linker
    BYTE  MinorLinkerVersion;          // +1Bh Minor version number of the linker
    DWORD SizeOfCode;                  // +1Ch Total size of all sections containing code
    DWORD SizeOfInitializedData;       // +20h Total size of all sections containing initialized data
    DWORD SizeOfUninitializedData;     // +24h Total size of all sections containing uninitialized data
    DWORD AddressOfEntryPoint;         // +28h RVA of the program entry point
    DWORD BaseOfCode;                  // +2Ch Starting RVA of the code
    DWORD BaseOfData;                  // +30h Starting RVA of the data
    //
    // NT additional fields.
    //
    DWORD ImageBase;                   // +34h Preferred load address of the program
    DWORD SectionAlignment;            // +38h Alignment of sections in memory
    DWORD FileAlignment;               // +3Ch Alignment of sections in the file
    WORD  MajorOperatingSystemVersion; // +40h Major version of the minimum required operating system
    WORD  MinorOperatingSystemVersion; // +42h Minor version of the minimum required operating system
    WORD  MajorImageVersion;           // +44h Major version number of the image
    WORD  MinorImageVersion;           // +46h Minor version number of the image
    WORD  MajorSubsystemVersion;       // +48h Major version of the minimum required subsystem
    WORD  MinorSubsystemVersion;       // +4Ah Minor version of the minimum required subsystem
    DWORD Win32VersionValue;           // +4Ch Reserved; normally 0 unless abused by a virus
    DWORD SizeOfImage;                 // +50h Total size of the image once loaded into memory
    DWORD SizeOfHeaders;               // +54h Size of all headers plus the section table
    DWORD CheckSum;                    // +58h Checksum of the image
    WORD  Subsystem;                   // +5Ch Subsystem expected by the executable
    WORD  DllCharacteristics;          // +5Eh DLL characteristic flags (for example ASLR and NX compatibility); originally described when DllMain() is called
    DWORD SizeOfStackReserve;          // +60h Stack size reserved at initialization
    DWORD SizeOfStackCommit;           // +64h Stack size actually committed at initialization
    DWORD SizeOfHeapReserve;           // +68h Heap size reserved at initialization
    DWORD SizeOfHeapCommit;            // +6Ch Heap size actually committed at initialization
    DWORD LoaderFlags;                 // +70h Related to debugging, default is 0
    DWORD NumberOfRvaAndSizes;         // +74h Number of items in the data directory below
    IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES]; // +78h Data directory table
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;
The more important members are described below.
Magic
The magic number; it indicates the type of the file.
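Since the rest of the optional header has a different layout in 32-bit and 64-bit files, a parser usually checks Magic first. A minimal sketch; optional_header_bits is a hypothetical helper name, and the magic values are the documented ones (010Bh for PE32, 020Bh for PE32+, 0107h for ROM images).

```c
#include <stdint.h>

/* Returns 32 for PE32 (010Bh), 64 for PE32+ (020Bh),
   and 0 for anything else, including ROM images (0107h). */
static int optional_header_bits(uint16_t magic) {
    switch (magic) {
    case 0x010B: return 32;
    case 0x020B: return 64;
    default:     return 0;
    }
}
```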
AddressOfEntryPoint
The entry address. It is a relative virtual address, abbreviated EP (EntryPoint), that points to the first code the program executes. If the program is packed (obfuscated), the value of this field is modified. Finding the value the field had before packing means the original entry point, called the OEP, has been found.
The address in this field is not the address of main() or WinMain(), but the address of the runtime library's startup code.
If a piece of code is attached to an executable file and should run first, it is enough to point the entry address at the attached code.
ImageBase
This field gives the preferred load address. If possible (the address is not occupied), the operating system loads the image at this address, which makes loading faster because no address fixups are needed; if the address is occupied by another module, the loaded file has to be relocated.
For 32-bit EXE files, the default load address is 0x00400000. For 32-bit DLL files, the default load address is 0x10000000.
SectionAlignment,FileAlignment
The SectionAlignment field specifies the alignment unit of sections once they are loaded into memory. The FileAlignment field specifies the alignment unit of sections in the file.
SizeOfImage
The size of the entire PE image once it is mapped into memory. Taking HelloWorld.exe as an example, its file header occupies 1000h bytes in memory and its three sections each occupy 1000h bytes, so the image takes up 4000h bytes in total. This value may be larger than the actual size but not smaller, and it must be an integer multiple of the SectionAlignment value.
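The 4000h figure can be reproduced with a short calculation: round the headers and the end of every section up to SectionAlignment and take the largest result. This is a sketch under assumptions, not the loader's actual code; SectionExtent, align_up, and compute_size_of_image are hypothetical names, and the section sizes in the usage note are made up to match the HelloWorld.exe description.

```c
#include <stddef.h>
#include <stdint.h>

/* Only the two section fields this calculation needs. */
typedef struct {
    uint32_t VirtualAddress; /* RVA where the section starts */
    uint32_t VirtualSize;    /* size of the section in memory */
} SectionExtent;

/* Round up to a multiple of alignment (assumed to be a power of two). */
static uint32_t align_up(uint32_t value, uint32_t alignment) {
    return (value + alignment - 1u) & ~(alignment - 1u);
}

/* SizeOfImage: the aligned end of the headers or of the last section, whichever is larger. */
static uint32_t compute_size_of_image(uint32_t size_of_headers,
                                      const SectionExtent *sections,
                                      size_t count,
                                      uint32_t section_alignment) {
    uint32_t end = align_up(size_of_headers, section_alignment);
    for (size_t i = 0; i < count; ++i) {
        uint32_t section_end = align_up(
            sections[i].VirtualAddress + sections[i].VirtualSize,
            section_alignment);
        if (section_end > end) end = section_end;
    }
    return end;
}
```

With 400h bytes of headers, three sections at RVAs 1000h, 2000h, and 3000h, and a SectionAlignment of 1000h, the function returns 4000h.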
SizeOfHeaders
The total size of the MS-DOS header, the PE file header, and the section table.
Subsystem
An enumeration value indicating the expected subsystem (user interface type) of the executable file.
NumberOfRvaAndSizes
The number of items in the data directory. Generally 00000010h, that is, 16.
DataDirectory
The data directory. This is an array of 16 identical IMAGE_DATA_DIRECTORY structures, 128 bytes in total (16 entries of 8 bytes each), that point to the export table, import table, resources, and other data. See below.
Data directory item IMAGE_DATA_DIRECTORY
The last field of the IMAGE_OPTIONAL_HEADER32 (extended PE header) structure is DataDirectory. This field holds the directory information for all the different types of data that appear in the PE file.
As mentioned earlier, data in the application is divided into many types according to purpose, such as the export table, import table, resources, and relocation table. In memory, the operating system organizes this data in pages and assigns each its own access attributes; in the file, the data is organized in a similar way and placed at designated positions according to category.
This structure describes the position and size of the different categories of data in the file (and in memory), which makes the field quite important.
The data directory defines 16 types of data in total, and PE uses the IMAGE_DATA_DIRECTORY structure to describe each of them.
The structure has two fields, in order: VirtualAddress and Size. As shown in the figure, the whole data directory consists of 16 consecutive IMAGE_DATA_DIRECTORY structures.
Description of data directory table items:
As shown in Figure 3-11, looking up a specific type of data starts from this structure. For example, to see which dynamic link library functions the PE calls, take the second element of the data directory table (array index 1), obtain the starting position and size of the import table from that IMAGE_DATA_DIRECTORY structure, and then locate the bytes of the import table using its VirtualAddress.
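The lookup described above can be sketched over raw bytes. The index values follow the standard order of the data directory (0 export, 1 import, 2 resource, and so on), but the type DataDirectory, the DIR_* names, and read_data_directory are hypothetical names for this illustration; only some of the 16 indices are listed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirror of IMAGE_DATA_DIRECTORY: 8 bytes per entry. */
typedef struct {
    uint32_t VirtualAddress; /* RVA of the data */
    uint32_t Size;           /* size of the data in bytes */
} DataDirectory;

/* A subset of the standard directory indices. */
enum {
    DIR_EXPORT = 0, DIR_IMPORT = 1, DIR_RESOURCE = 2, DIR_EXCEPTION = 3,
    DIR_SECURITY = 4, DIR_BASERELOC = 5, DIR_DEBUG = 6, DIR_TLS = 9, DIR_IAT = 12
};

static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* dirs points at the first directory entry; count is NumberOfRvaAndSizes.
   Returns false when the requested index is not present in the file. */
static bool read_data_directory(const uint8_t *dirs, uint32_t count,
                                uint32_t index, DataDirectory *out) {
    if (index >= count) return false;
    const uint8_t *entry = dirs + index * 8u;
    out->VirtualAddress = rd32(entry);
    out->Size = rd32(entry + 4);
    return true;
}
```

Checking the index against NumberOfRvaAndSizes matters because a file may declare fewer than 16 entries.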
Section table entry IMAGE_SECTION_HEADER
As the earlier PE structure diagram shows, the section table is made up of multiple section table entries. Each entry records information about one section of the PE, such as its attributes, characteristics, and access permissions. The number of entries in the section table is given by NumberOfSections in IMAGE_FILE_HEADER.
There are four important members:
VirtualSize: The size of the section in memory
VirtualAddress: The starting address of the section in memory (RVA)
SizeOfRawData: The size of the section in the disk file
Characteristics: Section attributes (bit OR)
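For reference, the full 40-byte entry and a decoder for the access bits in Characteristics can be sketched as below. The structure mirrors IMAGE_SECTION_HEADER with fixed-width types, and the flag values mirror IMAGE_SCN_MEM_* in winnt.h; the names SectionHeader, SCN_* and section_perms are hypothetical and used only for this illustration.

```c
#include <stdint.h>

/* Mirror of IMAGE_SECTION_HEADER (40 bytes). */
typedef struct {
    uint8_t  Name[8];              /* not NUL-terminated if all 8 bytes are used */
    uint32_t VirtualSize;          /* size of the section in memory */
    uint32_t VirtualAddress;       /* RVA of the section in memory */
    uint32_t SizeOfRawData;        /* size of the section in the file */
    uint32_t PointerToRawData;     /* FOA of the section in the file */
    uint32_t PointerToRelocations;
    uint32_t PointerToLinenumbers;
    uint16_t NumberOfRelocations;
    uint16_t NumberOfLinenumbers;
    uint32_t Characteristics;      /* attribute flags, combined with bitwise OR */
} SectionHeader;

#define SCN_MEM_EXECUTE 0x20000000u
#define SCN_MEM_READ    0x40000000u
#define SCN_MEM_WRITE   0x80000000u

/* Writes an "rwx"-style string (3 characters plus NUL) for the access bits. */
static void section_perms(uint32_t characteristics, char out[4]) {
    out[0] = (characteristics & SCN_MEM_READ)    ? 'r' : '-';
    out[1] = (characteristics & SCN_MEM_WRITE)   ? 'w' : '-';
    out[2] = (characteristics & SCN_MEM_EXECUTE) ? 'x' : '-';
    out[3] = '\0';
}
```

A typical code section with Characteristics 60000020h decodes as "r-x", and a typical data section with C0000040h as "rw-".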
PE file header programming
PE memory image
PE memory image refers to the organization of the PE file after it is loaded into memory. As mentioned earlier, Windows has a memory management mechanism, and each running program will have its own independent running space. Each part is aligned according to the size of 1000h. Therefore, there will be a corresponding relationship between the PE file image and the PE memory image.
Loading process:
Reading the PE header and section table: The loader first reads the PE header and section table to understand the structure of the file and the positions of each section.
Memory allocation: Allocate appropriate memory space for the program based on the information in the PE header.
Loading each section: Load the sections in the PE file into the corresponding positions in memory. Different sections may have different loading methods, such as the code section (.text) is typically read-only, and the data section (.data) is readable and writable.
Relocation: If the base address (Preferred Base Address) of the PE file cannot be satisfied, the loader will adjust the addresses in memory based on the relocation table.
Parsing the import table (Import Table): The loader will load the required dynamic link libraries (DLLs) based on the import table and parse the addresses of functions and variables.
As shown in the figure above, the data in the 'header + section table' part is unchanged from file to memory; the extra space is simply zero padding. The alignment of each section is defined by the fields IMAGE_OPTIONAL_HEADER32.FileAlignment and IMAGE_OPTIONAL_HEADER32.SectionAlignment.
Conversion between RVA and FOA
RVA is relative virtual address, FOA is file offset. Before learning file header programming, it is necessary to know how to convert between the two.
Method:
Determine which section the specified RVA falls into.
Calculate the starting RVA of the section (sectionStartRVA).
Calculate the offset within the section (offsetWithinSection).
Calculate the offset (fileOffset) of the RVA relative to the start of the disk file: the section's PointerToRawData plus the offset within the section.
Code implementation:
#include <windows.h>

// Convert an RVA to a file offset (FOA); returns (DWORD)-1 if no section contains the RVA.
DWORD RVAToOffset(PBYTE lpFileHead, DWORD dwRVA) {
    PIMAGE_DOS_HEADER dosHeader;
    PIMAGE_NT_HEADERS ntHeaders;
    PIMAGE_SECTION_HEADER sectionHeader;
    DWORD fileOffset = (DWORD)-1;
    dosHeader = (PIMAGE_DOS_HEADER)lpFileHead;
    ntHeaders = (PIMAGE_NT_HEADERS)(lpFileHead + dosHeader->e_lfanew);
    // The section table follows the optional header, whose size is given by SizeOfOptionalHeader
    sectionHeader = IMAGE_FIRST_SECTION(ntHeaders);
    for (int i = 0; i < ntHeaders->FileHeader.NumberOfSections; ++i) {
        DWORD sectionStartRVA = sectionHeader[i].VirtualAddress;
        DWORD sectionEndRVA = sectionStartRVA + sectionHeader[i].SizeOfRawData;
        if (dwRVA >= sectionStartRVA && dwRVA < sectionEndRVA) {
            DWORD offsetWithinSection = dwRVA - sectionStartRVA;
            fileOffset = sectionHeader[i].PointerToRawData + offsetWithinSection;
            break;
        }
    }
    return fileOffset;
}
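The conversion also works in the opposite direction, from FOA back to RVA, by matching the offset against each section's range in the file instead of its range in memory. A portable sketch under assumptions: SectionInfo, INVALID_RVA, and foa_to_rva are hypothetical names, and the section values in the usage note are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* The three section fields the conversion needs. */
typedef struct {
    uint32_t VirtualAddress;   /* RVA of the section in memory */
    uint32_t SizeOfRawData;    /* size of the section in the file */
    uint32_t PointerToRawData; /* FOA of the section in the file */
} SectionInfo;

#define INVALID_RVA 0xFFFFFFFFu

/* Returns the RVA for a file offset, or INVALID_RVA if no section contains it. */
static uint32_t foa_to_rva(const SectionInfo *sections, size_t count, uint32_t foa) {
    for (size_t i = 0; i < count; ++i) {
        uint32_t start = sections[i].PointerToRawData;
        uint32_t end = start + sections[i].SizeOfRawData;
        if (foa >= start && foa < end) {
            return sections[i].VirtualAddress + (foa - start);
        }
    }
    return INVALID_RVA;
}
```

With a section at RVA 2000h stored at file offset 600h, the FOA 610h maps to the RVA 2010h, and the same numbers run the other way for RVAToOffset.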
Data location
When processing PE files, the location steps are the prerequisite conditions for any data access and operation.
Location of the PE header: find the location of the PE header by the e_lfanew field in the DOS header.
The location of the data directory table entry: locate the specific data directory table entry in the DataDirectory array of the optional header.
Section table entry location: Traverse the section table to find the section containing the specified RVA.
These location steps are applicable to both handling PE files on the disk and handling PE file images in memory.
PE header location
PE header location refers to finding the core structure of the PE file—the PE header. The PE header contains basic information about the file and the runtime image.
#include <windows.h>

// Define constants for the values of dwFlag1 and dwFlag2
#define PE_IMAGE_HEADER 0
#define PE_MEMORY_MAPPED 1
#define RETURN_RVA_MODULE_BASE 0
#define RETURN_FOA_FILE_BASE 1
#define RETURN_RVA 2
#define RETURN_FOA 3

// Results are returned as DWORD, so the pointer-valued cases assume a 32-bit build.
DWORD rPE(PBYTE lpHeader, DWORD dwFlag1, DWORD dwFlag2) {
    DWORD ret = 0;
    DWORD imageBase = 0;
    // Header pointer conversion
    PIMAGE_DOS_HEADER dosHeader = (PIMAGE_DOS_HEADER)lpHeader;
    PIMAGE_NT_HEADERS ntHeaders = (PIMAGE_NT_HEADERS)(lpHeader + dosHeader->e_lfanew);
    // Get the recommended load address of the program
    imageBase = (DWORD)ntHeaders->OptionalHeader.ImageBase;
    if (dwFlag1 == PE_IMAGE_HEADER) { // lpHeader is a PE image header
        if (dwFlag2 == RETURN_RVA_MODULE_BASE) { // Return RVA + module base address
            ret = (DWORD)(ULONG_PTR)ntHeaders;
        } else if (dwFlag2 == RETURN_FOA_FILE_BASE) { // File base is meaningless here; return FOA
            ret = (DWORD)((BYTE*)ntHeaders - lpHeader);
        } else if (dwFlag2 == RETURN_RVA) { // Return RVA
            ret = (DWORD)((BYTE*)ntHeaders - lpHeader);
        } else if (dwFlag2 == RETURN_FOA) { // Return FOA
            ret = (DWORD)((BYTE*)ntHeaders - lpHeader);
        }
    } else if (dwFlag1 == PE_MEMORY_MAPPED) { // lpHeader is the memory-mapped file header
        if (dwFlag2 == RETURN_RVA_MODULE_BASE) { // Return RVA + module base address
            ret = (DWORD)((BYTE*)ntHeaders - lpHeader) + imageBase;
        } else if (dwFlag2 == RETURN_FOA_FILE_BASE) { // Return FOA + file base address
            ret = (DWORD)(ULONG_PTR)ntHeaders;
        } else if (dwFlag2 == RETURN_RVA) { // Return RVA
            ret = (DWORD)((BYTE*)ntHeaders - lpHeader);
        } else if (dwFlag2 == RETURN_FOA) { // Return FOA
            ret = (DWORD)((BYTE*)ntHeaders - lpHeader);
        }
    }
    return ret;
}
The function first checks dwFlag1: if it is PE_IMAGE_HEADER, lpHeader points to a PE image header; otherwise it points to the header of a memory-mapped file.
Then, according to dwFlag2:
If dwFlag2 is RETURN_RVA_MODULE_BASE, return RVA + module base address.
If dwFlag2 is RETURN_FOA_FILE_BASE, return FOA + file base address.
If dwFlag2 is RETURN_RVA, return RVA.
If dwFlag2 is RETURN_FOA, return FOA.
One special case: for a PE image header, the file base address is meaningless, so RETURN_FOA_FILE_BASE simply returns the FOA.
Data directory table entry location
The data directory table entry is part of the optional header in the PE file, pointing to various data structures, such as import table, export table, resource table, etc.
Code:
#include <windows.h>

// Define constants for the values of dwFlag1 and dwFlag2
#define PE_IMAGE_HEADER 0
#define PE_MEMORY_MAPPED 1
#define RETURN_RVA_MODULE_BASE 0
#define RETURN_FOA_FILE_BASE 1
#define RETURN_RVA 2
#define RETURN_FOA 3

// Auxiliary function: Convert RVA to file offset (FOA)
DWORD RVAToOffset(PBYTE lpFileHead, DWORD dwRVA) {
    PIMAGE_DOS_HEADER dosHeader = (PIMAGE_DOS_HEADER)lpFileHead;
    PIMAGE_NT_HEADERS ntHeaders = (PIMAGE_NT_HEADERS)(lpFileHead + dosHeader->e_lfanew);
    PIMAGE_SECTION_HEADER sectionHeader = IMAGE_FIRST_SECTION(ntHeaders);
    for (int i = 0; i < ntHeaders->FileHeader.NumberOfSections; ++i, ++sectionHeader) {
        DWORD sectionStartRVA = sectionHeader->VirtualAddress;
        DWORD sectionEndRVA = sectionStartRVA + sectionHeader->SizeOfRawData;
        if (dwRVA >= sectionStartRVA && dwRVA < sectionEndRVA) {
            return sectionHeader->PointerToRawData + (dwRVA - sectionStartRVA);
        }
    }
    return (DWORD)-1; // Return invalid offset
}

// Results are returned as DWORD, so the pointer-valued cases assume a 32-bit build.
DWORD rDDEntry(PBYTE lpHeader, DWORD index, DWORD dwFlag1, DWORD dwFlag2) {
    DWORD ret = 0, ret1 = 0, ret2 = 0;
    DWORD imageBase = 0;
    // Header pointer conversion
    PIMAGE_DOS_HEADER dosHeader = (PIMAGE_DOS_HEADER)lpHeader;
    PIMAGE_NT_HEADERS ntHeaders = (PIMAGE_NT_HEADERS)(lpHeader + dosHeader->e_lfanew);
    // Get the recommended load address of the program
    imageBase = (DWORD)ntHeaders->OptionalHeader.ImageBase;
    // Point to the requested DataDirectory entry
    PIMAGE_DATA_DIRECTORY dataDirectory = &ntHeaders->OptionalHeader.DataDirectory[index];
    // Extract the position of the data described by that entry, which is an RVA
    ret1 = dataDirectory->VirtualAddress;
    if (dwFlag1 == PE_IMAGE_HEADER) { // lpHeader is a PE image header
        if (dwFlag2 == RETURN_RVA_MODULE_BASE) { // Return RVA + module base address
            ret = ret1 + (DWORD)(ULONG_PTR)lpHeader;
        } else if (dwFlag2 == RETURN_FOA_FILE_BASE) { // File base is meaningless here; return FOA
            ret = RVAToOffset(lpHeader, ret1);
        } else if (dwFlag2 == RETURN_RVA) { // Return RVA
            ret = ret1;
        } else if (dwFlag2 == RETURN_FOA) { // Return FOA
            ret = RVAToOffset(lpHeader, ret1);
        }
    } else if (dwFlag1 == PE_MEMORY_MAPPED) { // lpHeader is the memory-mapped file header
        if (dwFlag2 == RETURN_RVA_MODULE_BASE) { // Return RVA + module base address
            ret = ret1 + imageBase;
        } else if (dwFlag2 == RETURN_FOA_FILE_BASE) { // Return FOA + file base address
            ret2 = RVAToOffset(lpHeader, ret1);
            ret = ret2 + (DWORD)(ULONG_PTR)lpHeader;
        } else if (dwFlag2 == RETURN_RVA) { // Return RVA
            ret = ret1;
        } else if (dwFlag2 == RETURN_FOA) { // Return FOA
            ret = RVAToOffset(lpHeader, ret1);
        }
    }
    return ret;
}
Section table entry location
The section table entry contains information about the sections of the PE file, each section describing a segment of code or data.
#include <windows.h>

// Define constants for the values of dwFlag1 and dwFlag2
#define PE_IMAGE_HEADER 0
#define PE_MEMORY_MAPPED 1
#define RETURN_RVA_MODULE_BASE 0
#define RETURN_FOA_FILE_BASE 1
#define RETURN_RVA 2
#define RETURN_FOA 3

// Results are returned as DWORD, so the pointer-valued cases assume a 32-bit build.
DWORD rSection(PBYTE lpHeader, DWORD index, DWORD dwFlag1, DWORD dwFlag2) {
    DWORD ret = 0;
    DWORD imageBase = 0;
    // Header pointer conversion
    PIMAGE_DOS_HEADER dosHeader = (PIMAGE_DOS_HEADER)lpHeader;
    PIMAGE_NT_HEADERS ntHeaders = (PIMAGE_NT_HEADERS)(lpHeader + dosHeader->e_lfanew);
    // Get the recommended load address of the program
    imageBase = (DWORD)ntHeaders->OptionalHeader.ImageBase;
    // Get the starting address of the section table
    PIMAGE_SECTION_HEADER sectionHeader = IMAGE_FIRST_SECTION(ntHeaders);
    // Point to the section table entry at the specified index
    sectionHeader += index;
    if (dwFlag1 == PE_IMAGE_HEADER) { // lpHeader is a PE image header
        if (dwFlag2 == RETURN_RVA_MODULE_BASE) { // Return RVA + module base address
            ret = (DWORD)(ULONG_PTR)sectionHeader;
        } else { // Return relative offset (RVA or FOA)
            ret = (DWORD)((BYTE*)sectionHeader - lpHeader);
        }
    } else if (dwFlag1 == PE_MEMORY_MAPPED) { // lpHeader is a memory-mapped file header
        if (dwFlag2 == RETURN_RVA_MODULE_BASE) { // Return RVA + module base address
            ret = (DWORD)((BYTE*)sectionHeader - lpHeader) + imageBase;
        } else if (dwFlag2 == RETURN_FOA_FILE_BASE) { // Return FOA + file base address
            ret = (DWORD)(ULONG_PTR)sectionHeader;
        } else { // Return relative offset (RVA or FOA)
            ret = (DWORD)((BYTE*)sectionHeader - lpHeader);
        }
    }
    return ret;
}
To find the section that contains a given RVA, traverse the section table and compare the RVA with the address range of each section. If the RVA falls within the range of a section, return a pointer to that section's name.
#include <windows.h>
#include <stdio.h>
#include <string.h>

const char* getRVASectionName(PBYTE lpFileHead, DWORD dwRVA) {
    // The section name is copied into a static buffer because Name is an 8-byte
    // field that is not NUL-terminated when all 8 bytes are used (not thread-safe)
    static char nameBuffer[IMAGE_SIZEOF_SHORT_NAME + 1];
    // Define a return value to save the found section name
    const char* szNotFound = "Not Found";
    const char* sectionName = szNotFound;
    // Get the DOS header and NT header
    PIMAGE_DOS_HEADER dosHeader = (PIMAGE_DOS_HEADER)lpFileHead;
    PIMAGE_NT_HEADERS ntHeaders = (PIMAGE_NT_HEADERS)(lpFileHead + dosHeader->e_lfanew);
    // Get the starting address of the section table
    PIMAGE_SECTION_HEADER sectionHeader = IMAGE_FIRST_SECTION(ntHeaders);
    WORD numberOfSections = ntHeaders->FileHeader.NumberOfSections;
    // Traverse the section table
    for (int i = 0; i < numberOfSections; i++, sectionHeader++) {
        DWORD sectionStartRVA = sectionHeader->VirtualAddress;
        DWORD sectionEndRVA = sectionStartRVA + sectionHeader->SizeOfRawData;
        // Determine if the RVA is within the current section range
        if (dwRVA >= sectionStartRVA && dwRVA < sectionEndRVA) {
            memcpy(nameBuffer, sectionHeader->Name, IMAGE_SIZEOF_SHORT_NAME);
            nameBuffer[IMAGE_SIZEOF_SHORT_NAME] = '\0';
            sectionName = nameBuffer;
            break;
        }
    }
    return sectionName;
}

int main() {
    // Example call, assuming lpFileHead points to the memory image of the PE file, and dwRVA is the RVA to be queried
    PBYTE lpFileHead = ...; // Replace with the actual address of the PE file header
    DWORD dwRVA = ...; // Replace with actual RVA
    const char* sectionName = getRVASectionName(lpFileHead, dwRVA);
    printf("Section Name: %s\n", sectionName);
    return 0;
}
PE checksum
A checksum is a value computed from a block of data by some algorithm; it is usually used to judge whether the data has been modified illegally. In a PE file it is stored in the DWORD field IMAGE_OPTIONAL_HEADER32.CheckSum.
The algorithm for the checksum in the PE file header is very simple and has three steps:
Clear the field IMAGE_OPTIONAL_HEADER32.CheckSum.
Add up the data in WORD units with carry; whatever overflows a WORD is folded back into the sum.
Add the file length to the cumulative sum.
#include <windows.h>
#include <imagehlp.h>
#include <stdio.h>

// MapFileAndCheckSumA is exported by imagehlp.dll
#pragma comment(lib, "imagehlp.lib")

DWORD CalculatePEChecksumAPI(const char* lpExeFile) {
    DWORD cSum = 0, hSum = 0;
    // Call MapFileAndCheckSum API: hSum receives the checksum stored in the header, cSum the computed one
    DWORD result = MapFileAndCheckSumA(lpExeFile, &hSum, &cSum);
    if (result == CHECKSUM_SUCCESS) {
        return cSum;
    } else {
        // Handle error conditions, such as file does not exist or cannot be accessed
        return 0;
    }
}

DWORD CalculatePEChecksumManual(const char* lpExeFile) {
    HANDLE hFile = INVALID_HANDLE_VALUE;
    DWORD dwSize = 0, bytesRead = 0;
    DWORD ret = 0;
    PBYTE hBase = NULL;
    // Open file
    hFile = CreateFileA(lpExeFile, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) {
        return 0;
    }
    // Get file size
    dwSize = GetFileSize(hFile, NULL);
    if (dwSize == INVALID_FILE_SIZE) {
        CloseHandle(hFile);
        return 0;
    }
    // Allocate memory and read the file (VirtualAlloc zero-fills, so an odd trailing byte is padded with 0)
    hBase = (PBYTE)VirtualAlloc(NULL, dwSize, MEM_COMMIT, PAGE_READWRITE);
    if (hBase == NULL) {
        CloseHandle(hFile);
        return 0;
    }
    if (!ReadFile(hFile, hBase, dwSize, &bytesRead, NULL) || bytesRead != dwSize) {
        VirtualFree(hBase, 0, MEM_RELEASE);
        CloseHandle(hFile);
        return 0;
    }
    // Close file handle
    CloseHandle(hFile);
    // Clear CheckSum
    PIMAGE_DOS_HEADER dosHeader = (PIMAGE_DOS_HEADER)hBase;
    PIMAGE_NT_HEADERS ntHeaders = (PIMAGE_NT_HEADERS)(hBase + dosHeader->e_lfanew);
    ntHeaders->OptionalHeader.CheckSum = 0;
    // Add the data in WORD units and fold the carry back into the low 16 bits
    DWORD cSum = 0;
    DWORD wordCount = (dwSize + 1) / 2; // Number of WORDs, rounding up for an odd file size
    WORD* pWord = (WORD*)hBase;
    for (DWORD i = 0; i < wordCount; i++) {
        cSum += *pWord++;
        cSum = (cSum >> 16) + (cSum & 0xFFFF); // Handle carry
    }
    // Release memory
    VirtualFree(hBase, 0, MEM_RELEASE);
    // Add file length
    cSum += dwSize;
    ret = cSum;
    return ret;
}

int main() {
    const char* filePath = "your_pe_file.exe"; // Replace with the actual PE file path
    DWORD apiChecksum = CalculatePEChecksumAPI(filePath);
    printf("API Checksum: %08X\n", apiChecksum);
    DWORD manualChecksum = CalculatePEChecksumManual(filePath);
    printf("Manual Checksum: %08X\n", manualChecksum);
    return 0;
}
