Introduction to dynamic linking
As mentioned earlier, static linking has several drawbacks (this does not mean nobody uses it; in some cases static linking is still required), for example:
It wastes too much disk and memory space
It cannot be updated dynamically
Dynamic linking is used to solve these problems.
The problems that dynamic linking needs to solve
The essence of dynamic linking is that it delays the linking process until runtime.
An executable produced by static linking can run as soon as the operating system loads it. As mentioned in the previous article, during linking all the code, data, and other sections of the object files are merged into the executable, and every external symbol (variable, function, etc.) used by the object files is relocated to its position in the virtual address space. The executable therefore runs without depending on external modules (the required "external modules" have already been merged into it). In summary: for static linking, relocation is completed at link time by directly patching the locations in the executable that refer to the symbols.
This is not so easy in dynamic linking, because linking is delayed until runtime: after the executable and the dynamic link libraries it depends on have been loaded into memory, and before the entry function of the executable is called, the external symbols referenced by the executable are relocated (that is, the places where these symbols are used are filled with their virtual addresses inside the dynamic link libraries).
At this time, a very critical problem will arise:
The code segment is not writable!!
The reason hardly needs explaining: if the code segment were writable there would be no security, and the execution logic of the code could be corrupted (mechanisms with a writable code segment did exist in the early days). So the approach used by static relocation,
directly patching the address at each place in the code where an external function or variable is used, is impossible here.
How to solve this problem is the core of dynamic linking.
There is a saying popularized by Butler Lampson:
All problems in computer science can be solved by adding another level of indirection
Although the code segment is not writable after loading, where is it writable?
Data segment!
Although we cannot directly modify the actual addresses of external variables/functions in the code segment, we can place those addresses at a fixed position in the data segment and let the code segment refer to the content stored there. During relocation, the dynamic linker only has to write the virtual address of the external variable/function into that data-segment slot, and dynamic linking is complete.
This is the preferred way to implement dynamic linking, and it has another name:
Position-independent code (PIC, Position-Independent Code), which is what GCC generates when the -fPIC option is added
With this approach, only one copy of the shared library's code needs to exist in memory and it is shared by all processes; the library's data, if used, is copied separately for each process.
But this approach also creates a security issue. The data segment stores the relocated virtual addresses of the functions the executable will call, and the data segment is writable, so by some means (such as an overflow) an attacker can overwrite such an address and redirect the call to another function. For example, if the executable calls a libc function, overwriting its address makes it possible to reach another libc function, system, and thereby spawn a shell.
Another way to implement dynamic linking
Position-independent code is the preferred dynamic linking solution, but if a shared object is built with only the -shared option and without -fPIC, another dynamic linking method is used: load-time relocation.
A shared object, also known as a dynamic link library, is ideally present only once in physical memory, regardless of how many processes use it. Each process, however, maps the shared object into its own virtual address space, and the mapped address (base address) differs between processes most of the time. Therefore, when the library is mapped into a process at load time, the addresses inside its instructions (jump targets, variable accesses, etc.) must be modified according to the base address in that process.
This creates a disadvantage: the same instructions cannot be shared among multiple processes
Experimental observation
Some functions that may be used
dlopen()
dlsym()
dlerror()
dlclose()
dlopen()
Opens a shared object (dynamic library), loads it into the process address space, and completes its initialization
void *dlopen(const char *filename, int flags);
const char *filename
The path of the dynamic library to open. If filename is NULL (0), the returned handle is for the main program.
In that case dlopen returns the handle of the process's global symbol table, and any symbol in that table can be looked up and executed at runtime. The global symbol table covers the executable itself, all shared modules loaded into the process by the dynamic linker, and modules opened with dlopen using RTLD_GLOBAL.
int flags
How function symbols are resolved. Choose one of the following:
RTLD_LAZY
Delayed binding (explained in detail later): the dynamic linker performs binding (symbol lookup, relocation, etc.) only when a function is used for the first time; functions that are never used are never bound. This improves program startup speed
RTLD_NOW
All function binding is completed when the module is loaded. If any undefined symbol reference cannot be bound, dlopen() returns an error
RETURN VALUE
On success, dlopen() returns a non-NULL handle for the loaded object. On error (file could not be found, was not readable, had the wrong format, or caused errors during loading), it returns NULL
dlsym()
Looks up the specified symbol in the dynamic library handle (handle) obtained from dlopen()
void *dlsym(void *handle, const char *symbol);
void *handle
The dynamic library handle
const char *symbol
The symbol to search for
RETURN VALUE
On success, returns the address associated with the symbol. On failure, returns NULL; the cause of the error can be diagnosed using dlerror(3).
dlerror()
Determines whether the previous call succeeded
char *dlerror(void);
After each call to dlopen(), dlsym(), or dlclose(), dlerror() can be called to determine whether that call succeeded. The return type of dlerror() is char*: a NULL return value indicates that the last call succeeded; otherwise it returns the corresponding error message
dlclose()
Close a dynamic library, unload a module that has been loaded
int dlclose(void *handle);
Experimental code
Current code relationship:
b.c → libb.so
a.c → liba.so
main.c → main
liba.so depends on libb.so
main.c is dynamically linked together with liba.so and libb.so to form main
gcc -m32 -fPIC -shared b.c -o libb.so
gcc -m32 -fPIC -shared a.c -o liba.so ./libb.so
sudo ldconfig <directory containing the dynamic libraries>
gcc -m32 -fPIC main.c -o main ./liba.so libb.so
# Or gcc -m32 -fPIC main.c -L. liba.so libb.so -o main
b.c
#include <stdio.h>

int g_nVarB = 30;

void fnB(void)
{
    printf("Processing in function B --- nVarB = %d \n", g_nVarB);
}
a.c
#include <stdio.h>

// Internal static variable
static int s_nVarA1 = 10;
// Define global variable
int g_nVarA2 = 20;
// Declare external variable g_nVarB
extern int g_nVarB;
// Declare external function fnB
extern void fnB(void);

static void s_fnA2(void)
{
    printf("Processing in function A2 \n");
}

void fnA3(void)
{
    printf("Processing in function A3 \n");
}

void fnA1(void)
{
    printf("Processing in function A1 \n");
    // Modify internal static variable
    s_nVarA1 = 11;
    // Modify non-static global variable
    g_nVarA2 = 20;
    // Modify external global variable
    g_nVarB = 31;
    // Call internal static function
    s_fnA2();
    // Call internal function
    fnA3();
    // Call external function
    fnB();
}
main.c
This program is fairly long and may be a bit cumbersome to read. It does the following:
- Uses dlopen to get the handle of this process's global symbol table and uses dlsym to locate symbols in it
- Locates the symbols that liba.so brings into main (if a symbol cannot be found, it prints "get address of ... failed!")
- Locates the symbols that libb.so brings into main (if a symbol cannot be found, it prints "get address of ... failed!")
If a symbol exists, its address is printed
#include <stdio.h>
#include <unistd.h>
#include <dlfcn.h>
#include <stdlib.h>

extern int g_nVarA2;
extern void fnA1(void);

typedef void (*pFunc)(void);

int main(int argc, char **argv)
{
    printf("Processing in main function \n");

    // Open the global symbol table of the process
    void *handle = dlopen(0, RTLD_NOW);
    if (handle == NULL)
    {
        fprintf(stderr, "dlopen(): %s\n", dlerror());
        // exit(1);
    }

    printf("---------------------------- main function ------------------------------ \n");
    // Print the address of the main symbol
    pFunc addr_main = dlsym(handle, "main");
    if (addr_main == NULL)
    {
        fprintf(stderr, "get address of main failed!\n");
        // exit(1);
    }
    else
    {
        printf("main function address is : %p \n", (void *)addr_main);
    }

    // Virtual addresses of related symbols in liba.so
    printf("------------------------------ liba.so ------------------------------------ \n");
    unsigned int *addr_s_nVarA1 = dlsym(handle, "s_nVarA1");
    if (addr_s_nVarA1 == NULL)
    {
        fprintf(stderr, "get address of s_nVarA1 failed!\n");
        // exit(1);
    }
    else
    {
        printf("liba.so:s_nVarA1 address is : %p \n", (void *)addr_s_nVarA1);
    }

    unsigned int *addr_g_nVarA2 = dlsym(handle, "g_nVarA2");
    if (addr_g_nVarA2 == NULL)
    {
        fprintf(stderr, "get address of g_nVarA2 failed!\n");
        // exit(1);
    }
    else
    {
        printf("liba.so:g_nVarA2 address is : %p \n", (void *)addr_g_nVarA2);
    }

    pFunc addr_fnA1 = dlsym(handle, "fnA1");
    if (addr_fnA1 == NULL)
    {
        fprintf(stderr, "get address of fnA1 failed!\n");
        // exit(1);
    }
    else
    {
        printf("liba.so:fnA1 function address is : %p \n", (void *)addr_fnA1);
    }

    pFunc addr_s_fnA2 = dlsym(handle, "s_fnA2");
    if (addr_s_fnA2 == NULL)
    {
        fprintf(stderr, "get address of s_fnA2 failed!\n");
        // exit(1);
    }
    else
    {
        printf("liba.so:s_fnA2 function address is : %p \n", (void *)addr_s_fnA2);
    }

    pFunc addr_fnA3 = dlsym(handle, "fnA3");
    if (addr_fnA3 == NULL)
    {
        fprintf(stderr, "get address of fnA3 failed!\n");
        // exit(1);
    }
    else
    {
        printf("liba.so:fnA3 function address is : %p \n", (void *)addr_fnA3);
    }

    // Virtual addresses of related symbols in libb.so
    printf("---------------------------------- libb.so --------------------------------- \n");
    unsigned int *addr_g_nVarB = dlsym(handle, "g_nVarB");
    if (addr_g_nVarB == NULL)
    {
        fprintf(stderr, "get address of g_nVarB failed!\n");
    }
    else
    {
        printf("libb.so:g_nVarB address is : %p \n", (void *)addr_g_nVarB);
    }

    pFunc addr_fnB = dlsym(handle, "fnB");
    if (addr_fnB == NULL)
    {
        fprintf(stderr, "get address of fnB failed!\n");
        // exit(1);
    }
    else
    {
        printf("libb.so:fnB function address is : %p \n", (void *)addr_fnB);
    }

    dlclose(handle);

    // Assign a value to a global variable in the liba dynamic library
    g_nVarA2 = 100;
    // Call a function in liba
    fnA1();

    // Keep the program alive so that it can be observed
    while (1)
    {
        sleep(5);
    }
    return 0;
}
The loading process of dynamic libraries
Observe the dependency relationships between dynamic libraries
From the dependency relationships described at the beginning, the dependencies between the main program, liba.so, and libb.so are:
main depends on liba.so
liba.so depends on libb.so
The two commands ldd and patchelf can be used to observe these dependency relationships
ldd
patchelf
1: Bootstrapping the dynamic linker
When a program is executed, the dynamic linker is responsible for loading the dynamic libraries it depends on
When main is executed, the operating system first maps main into memory and then obtains the path of the dynamic linker from the information in the .interp section
The content of .interp is very simple: it holds a single string, which is the path of the dynamic linker required by the executable
Use objdump to view the content of this section
objdump -s main
From this, the path of the dynamic linker is /lib/ld-linux.so.2
As can be seen, the dynamic linker is itself a shared object (part of Glibc). What makes it special is that it must not depend on any other dynamic library, otherwise nothing would be available to load that library for it, and it must perform the relocation of its own global and static variables by itself. This self-relocation code has to run as soon as the dynamic linker starts, a process usually called bootstrapping (Bootstrap), so the entry address of the dynamic linker is the beginning of the bootstrap code
In this sense the dynamic linker itself is statically linked
Therefore, when main is executed, the dynamic linker is first mapped into the virtual address space of the main process. (This passage will be easier to understand after reading the whole article.)
The dynamic linker then runs its bootstrap code. It first locates its own GOT, whose first entry holds the offset address of the .dynamic section. From the information in this section the bootstrap code obtains the dynamic linker's own relocation table and symbol table, finds its own relocation entries, and relocates them all. From this point on, the dynamic linker's code can use its own global and static variables
After the basic bootstrap is finished, the dynamic linker merges the symbol tables of main and of the linker itself into a single symbol table, the global symbol table
Next, the linker looks for the shared objects that the executable depends on. The .dynamic section contains entries of type DT_NEEDED, which name the shared objects the current executable depends on. From them the linker can list all the shared objects required by main and put their names into a loading set:
liba.so
libc.so.6
The linker then takes the name of a required shared object from the set, finds the corresponding file, opens it, reads its ELF file header and .dynamic section, and maps its code and data segments into the process space
If this ELF shared object depends on other shared objects, their names are added to the loading set as well. The process repeats until all dependent shared objects have been loaded:
libb.so (a dependency of liba.so)
After main is started, there are two ways to observe its virtual address space
cat /proc/PID/maps
pmap PID
Global Symbol Table and .dynsym Table
During static linking, the linker extracts every symbol from every object file in a first scan to build the global symbol table. In a second scan it checks each object file for symbols that need relocation, looks up their positions in the global symbol table, and fills the address into each reference location
With dynamic linking, the actual address of a symbol is only known at runtime, so the responsibility for building the global symbol table is handed over to the dynamic linker
The main program uses dlopen and dlsym to print the addresses of the relevant symbols in the global symbol table of the main process. Let's first look at its execution result
The symbols whose names start with s_ (the static ones), whether variables or functions, cannot be found in main's global symbol table, while the others are found normally. Why is that?
Before answering this question, let's talk about two tables: .dynsym and .symtab
.symtab should be familiar: it is the symbol table, which contains all the symbols in the module
.dynsym is what is commonly called the dynamic symbol table. It records the symbols the module exports and imports, and it is a subset of .symtab
So for a dynamic link library, only the symbols in the dynamic symbol table are exported. Because static gives a symbol internal linkage, symbols declared static are not placed in the dynamic symbol table, which explains why the static symbols of liba.so and libb.so cannot be found in main's global symbol table
liba.so symbol table and dynamic symbol table
libb.so dynamic symbol table
main dynamic symbol table
After the above steps are complete, the linker proceeds to the part we care about most: relocation and initialization
2: Relocation and Initialization
The linker now revisits the relocation tables (.rel.) of the executable and of every shared object and corrects the positions in their GOT/PLT that need to be relocated
Because liba.so has the most dependencies, the key tables are examined below using liba.so as the example
Relocation table .rel.
Dynamic linking tells the linker which symbols need to be relocated through two relocation tables:
.rel.dyn
Corrections for data references (the positions it corrects are in the GOT (.got) and the data segment)
.rel.plt
Corrections for function references (the positions it corrects are in the GOT (.got.plt))
Use the command readelf -r liba.so to observe the relocation tables
The relocation entries are of two types:
R_386_GLOB_DAT
R_386_JUMP_SLOT
For both types, the position to be corrected only needs to be filled with the symbol's address; that position is an entry in the GOT
Global Offset Table GOT
As mentioned before, runtime relocation is achieved by adding an intermediate layer, and this intermediate layer is the Global Offset Table (GOT)
For example, as shown above, liba.so refers to the global variable g_nVarB in libb.so. In the code segment of liba.so, every place that refers to g_nVarB uses the address of the corresponding GOT entry instead of the address of g_nVarB itself, so relocation only needs to put the address of g_nVarB into the GOT
GOT table in the dynamic link library
There are two GOT tables in the dynamic link library:
- .got, for relocating variable symbols
- .got.plt, for relocating function symbols
As can be seen from the figure, .got and .got.plt are each 24 bytes in size, which means each has 6 entries (each entry occupies 4 bytes)
From the Off column:
The offset of .got from the file header is 0x002fe8
The offset of .got.plt from the file header is 0x003000
So where are the .got and .got.plt tables located from the loader's point of view?
Note one point: the VirtAddr (virtual address) of the first segment is 0x00000000. This is because position-independent code is enabled (-fPIC). When the code and data segments of liba.so are loaded into memory, the dynamic linker finds a free region, uses its starting address as the base address of liba.so, and treats VirtAddr as an offset from that base
As can be seen, the 4th LOAD segment starts at virtual address 0x003ef8, which is the virtual address of .init_array (not the final virtual address; the final one is obtained by adding the runtime base address)
From the Section Headers, the offset between .init_array and .got is 0x2fe8 - 0x2ef8 = 0xf0
0x3ef8 + 0xf0 = 0x3fe8, which is the starting address of .got
0x3fe8 + 0x18 = 0x4000, which is the starting address of .got.plt
Of course, this calculation is not strictly needed: the Addr column of the Section Headers gives these addresses directly
The relationship between the relocation table and the GOT table
Careful observation shows that the offsets of the symbols to be relocated in the relocation table fall exactly on entries of .got and .got.plt. This confirms what was said at the beginning: in dynamic linking, the positions to be relocated are all GOT entries, so the dynamic linker only needs to relocate the GOT entry at each offset
The first three entries of the GOT are common entries with the following roles:
got[0]: the address of the dynamic section (.dynamic) of this ELF
got[1]: the address of the link_map data structure descriptor (the module ID of this module)
got[2]: the address of the _dl_runtime_resolve function
liba.so disassembly code
The role of <__x86.get_pc_thunk.bx>
It obtains the address of the instruction that follows the call to it in the calling function
When the caller executes call 1080 <__x86.get_pc_thunk.bx>, the address of the next instruction is pushed onto the stack as the return address, so the thunk simply moves the value at the address pointed to by the stack pointer esp into ebx
⚠️ Note: only the 32-bit environment obtains the address of the next instruction this way, because 32-bit x86 does not allow the eip register to be accessed directly; the 64-bit environment can address relative to rip directly
A strange section: .plt.got
Attention: this is not .got.plt! It is not important and can be skipped
a program (written with C++ in Linux) calls __cxa_finalize when exiting the main function
gcc already provides destructors, which will be called by __cxa_finalize.
__cxa_finalize is called on library unload (either when a program is exiting or by a dlopen)
atexit, __cxa_atexit, and __cxa_finalize interaction
<fnA1> Analysis
For convenience, here is the source code again
void fnA1(void)
{
    printf("Processing in function A1 \n");
    // Modify internal static variable
    s_nVarA1 = 11;
    // Modify non-static global variable
    g_nVarA2 = 20;
    // Modify external global variable
    g_nVarB = 31;
    // Call internal static function
    s_fnA2();
    // Call internal function
    fnA3();
    // Call external function
    fnB();
}
First, let's look at the two lines of code at 11da
1128: e8 53 ff ff ff call 1080 <__x86.get_pc_thunk.bx>
112d: 81 c3 d3 2e 00 00 add ebx,0x2ed3
call 1080 <__x86.get_pc_thunk.bx> puts the address of the next instruction, 112d, into ebx
Then add ebx,0x2ed3 gives:
ebx = ebx + 0x2ed3 = 0x112d + 0x2ed3 = 0x4000
This 0x4000 is the offset address of .got.plt
⚠️ When the dynamic library is loaded during actual program execution, the value obtained through call is a virtual address, and adding the offset 0x2ed3 to it yields the actual virtual address of .got.plt
The subsequent accesses to the variables in the relocation table are all performed relative to ebx (variables that are not in the relocation table also use ebx, because the .data section is located right after .got.plt)
Function calls, however, go through another section: .plt
To discuss .plt, another feature must be mentioned: delayed binding (PLT)
Delayed Binding (PLT)
Under dynamic linking, accesses to global data and calls between modules must first locate the GOT and then perform indirect addressing or an indirect call, which reduces program execution speed by about 1%~5%. Moreover, because the linking work is done at runtime, program startup also becomes slower.
During execution, many functions are never used (error handling functions, rarely used feature modules, etc.), so there is no need to link all functions at the beginning
The basic idea of the delayed binding used by ELF is that the dynamic linker performs binding (symbol lookup, relocation, etc.) only when a function is used for the first time and never binds the unused ones. This improves the program's startup speed
How to Implement Delayed Binding
ELF uses the PLT (Procedure Linkage Table) to achieve delayed binding
In Glibc, the function with which the dynamic linker completes the binding work is called _dl_runtime_resolve()
It must know in which module and for which function the binding occurs, so assume its prototype is _dl_runtime_resolve(module, function)
When a function of an external module is called, the call does not jump directly through the GOT; it jumps through the PLT, which contains a corresponding entry for each external function
Take <fnA3@plt> as an example
001050 <fnA3@plt>:
1050: ff a3 10 00 00 00 jmp DWORD PTR [ebx+0x10]
1056: 68 08 00 00 00 push 0x8
105b: e9 d0 ff ff ff jmp 1030 <_init+0x30>
To achieve delayed binding, the linker does not fill the address of fnA3() into the GOT entry at [ebx+0x10] during initialization. Instead it fills in the address of the push 0x8 instruction. This step requires no symbol lookup, so its cost is very low. The first instruction therefore initially just jumps to the second instruction, which is equivalent to doing nothing
The second instruction, push 0x8, pushes 0x8 onto the stack; this number identifies the relocation entry for the fnA3 symbol reference in the relocation table .rel.plt
Then jmp 1030 <_init+0x30> jumps to the code that calls _dl_runtime_resolve
_dl_runtime_resolve performs the symbol resolution and relocation work and then fills the actual address of fnA3() into the GOT entry at [ebx+0x10]
The next time <fnA3@plt> is called, the jmp instruction jumps straight to the actual fnA3() function
