Background
The size of an application package affects download traffic, installation time, and the disk space used on the user's device. According to Google Play statistics, every 6 MB increase in application size lowers the install conversion rate by 1%.
Package size depends on many factors, and dex files, resource files, and .so files each call for different optimization strategies. This article does not cover them all. It records how I trimmed the file size of a dynamic link library during development.

The library I developed is written in Rust and uses the Android JNI interface for calls between the Java layer and the native layer. I chose Rust mainly for two reasons:
1. Stability. Android's JNI calls are complex, and the native layer also has to manage its own memory. As the codebase grows, keeping the code safe and stable becomes much harder. With Rust, developers rarely need to worry about memory management issues. As long as the code follows the language's rules and passes the compiler's checks, it is hard to crash the program, and this has held up since the code went into production.
2. Security. Unprotected code compiled from C or C++ is easy to reverse engineer: mature tools such as IDA and Ghidra can turn the assembly back into high-level code. Binaries compiled from Rust use different calling conventions between internal functions, and there are currently no comparably complete decompilation tools for them, which raises the software's resistance to cracking.
However, one obvious drawback of Rust is that the compiled artifacts are too large. With the default Rust release options and only strip enabled, my dynamic library came to 495k.
Optimization plan
Drawing on approaches that others have shared online, I applied the following optimizations in sequence.
Adjust the optimization level
The default optimization level for release builds is 3 (O3). It aims to make the code run faster, but it also unrolls some loops, which increases the binary size. Here the goal is a smaller binary, so set the optimization level to 'z', which tells the compiler to optimize for the smallest binary size:
[profile.release]
opt-level = 'z'
Size before and after optimization
Compilation Options | Size |
strip | 495k |
strip + opt-level = 'z' | 437k |
Enable LTO
LTO (Link Time Optimization) can eliminate redundant code during linking, reducing the binary size - at the cost of a longer linking time.
Cargo.toml
[profile.release]
opt-level = 'z'
lto = true
Size before and after optimization
Compilation Options | Size |
strip | 495k |
strip + opt-level = 'z' | 437k |
strip + opt-level = 'z' + lto | 436k |
The effect is marginal, but it is better than nothing.
Abort immediately on panic
By default, a Rust panic unwinds the stack and can print a backtrace, which helps when locating problems. The unwinding support also adds to the binary size, so switch the panic strategy to abort.
[profile.release]
opt-level = 'z'
lto = true
panic = 'abort'
Size before and after optimization
Compilation Options | Size |
strip | 495k |
strip + opt-level = 'z' | 437k |
strip + opt-level = 'z' + lto | 436k |
strip + opt-level = 'z' + lto + panic = 'abort' | 366k |
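One side effect of panic = 'abort' is worth keeping in mind at the JNI boundary: std::panic::catch_unwind can only intercept a panic when the unwind strategy is in effect. The sketch below is illustrative and not taken from the library; guarded_divide and panic_strategy are hypothetical helpers. Built with the default unwind strategy, the panic is turned into None, while under panic = 'abort' the same panic terminates the process.

```rust
use std::panic;

/// Hypothetical helper: runs a division that may panic and converts the
/// panic into `None`. This only works when panics unwind.
fn guarded_divide(a: i32, b: i32) -> Option<i32> {
    panic::catch_unwind(move || a / b).ok()
}

/// Reports which panic strategy this build was compiled with.
fn panic_strategy() -> &'static str {
    if cfg!(panic = "abort") {
        "abort"
    } else {
        "unwind"
    }
}

fn main() {
    println!("panic strategy: {}", panic_strategy());
    println!("6 / 3 = {:?}", guarded_divide(6, 3));
    // Division by zero panics. With the unwind strategy the panic is caught
    // and `None` is printed; with `panic = 'abort'` the process is
    // terminated before this line can print a result.
    println!("1 / 0 = {:?}", guarded_divide(1, 0));
}
```

If the library relied on catching panics before returning to Java, that logic would need to be replaced with explicit error handling once abort is enabled.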
At this point the conventional options are exhausted, and further reductions require code changes.
Analyzing the artifact with cargo-bloat, a size analysis tool for Rust, gives the following result:
File .text Size Crate
4.1% 69.0% 192.7KiB std
1.0% 16.8% 46.9KiB jdmp
0.5% 8.1% 22.7KiB [Unknown]
0.2% 3.8% 10.5KiB jni
0.0% 0.5% 1.5KiB cesu8
0.0% 0.4% 1.1KiB adler32
0.0% 0.3% 904B bytes
0.0% 0.2% 640B aho_corasick
0.0% 0.2% 588B regex_syntax
0.0% 0.2% 572B regex_automata
0.0% 0.2% 440B log
0.0% 0.1% 304B memchr
0.0% 0.0% 52B combine
0.0% 0.0% 8B jni_sys
To my surprise, my own core module, jdmp, takes up only 46.9k, yet it drags in several hundred k of additional overhead!
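The gap between the library's own code and everything else can be checked with a little arithmetic on the table above. The following sketch re-enters the Size column by hand and computes how much of .text belongs to one crate. The ROWS constant and both helper functions are illustrative and are not part of cargo-bloat.

```rust
/// (crate, size) pairs copied from the Size and Crate columns above.
const ROWS: [(&str, &str); 14] = [
    ("std", "192.7KiB"),
    ("jdmp", "46.9KiB"),
    ("[Unknown]", "22.7KiB"),
    ("jni", "10.5KiB"),
    ("cesu8", "1.5KiB"),
    ("adler32", "1.1KiB"),
    ("bytes", "904B"),
    ("aho_corasick", "640B"),
    ("regex_syntax", "588B"),
    ("regex_automata", "572B"),
    ("log", "440B"),
    ("memchr", "304B"),
    ("combine", "52B"),
    ("jni_sys", "8B"),
];

/// Parses a size as printed in the table, e.g. "192.7KiB" or "904B",
/// and returns it in bytes.
fn parse_size(text: &str) -> Option<f64> {
    if let Some(kib) = text.strip_suffix("KiB") {
        kib.parse::<f64>().ok().map(|v| v * 1024.0)
    } else if let Some(bytes) = text.strip_suffix('B') {
        bytes.parse::<f64>().ok()
    } else {
        None
    }
}

/// Returns (total bytes of all rows, share of `name` in percent).
fn share_of(rows: &[(&str, &str)], name: &str) -> Option<(f64, f64)> {
    let mut total = 0.0;
    let mut own = 0.0;
    for (krate, size) in rows {
        let bytes = parse_size(size)?;
        total += bytes;
        if *krate == name {
            own = bytes;
        }
    }
    Some((total, 100.0 * own / total))
}

fn main() {
    if let Some((total, share)) = share_of(&ROWS, "jdmp") {
        let rest = total * (1.0 - share / 100.0);
        println!(".text total:     {:.1} KiB", total / 1024.0);
        println!("jdmp share:      {:.1}%", share);
        println!("everything else: {:.1} KiB", rest / 1024.0);
    }
}
```

Because the table values are rounded, the computed shares can differ from the printed percentages in the last digit.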
Remove unnecessary strings
The third-party dependencies contain many strings, most of which exist to make runtime error messages more informative. Modifying and simplifying these dependencies and deleting unneeded code saves more space.
In addition, even with panic = 'abort', the Rust compiler still emits the formatted strings used by panic messages. The panic_immediate_abort feature of the standard library turns this off. It requires rebuilding std with the unstable build-std option, which is only available on a nightly toolchain.
.cargo/config.toml
[unstable]
build-std-features = ["panic_immediate_abort"]
build-std = ["std","panic_abort"]
Size before and after optimization
Compilation Options | Size |
strip | 495k |
strip + opt-level = 'z' | 437k |
strip + opt-level = 'z' + lto | 436k |
strip + opt-level = 'z' + lto + panic = 'abort' + code trimming + panic_immediate_abort | 135k |
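The kind of code trimming described above can be illustrated in a few lines. The sketch below is hypothetical and is not taken from jdmp or its dependencies: parse_port_verbose builds a formatted message for every failure, while parse_port_compact reports a one-byte error code, so the error path needs no message text and no formatting code.

```rust
/// Verbose variant: every failure carries a formatted message, which tends
/// to pull string data and formatting code into the binary.
fn parse_port_verbose(input: &str) -> Result<u16, String> {
    input
        .parse::<u16>()
        .map_err(|e| format!("invalid port {:?}: {}", input, e))
}

/// Compact error type (hypothetical): one byte per error, no strings.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum ParseError {
    Empty = 1,
    Invalid = 2,
}

/// Compact variant: the caller gets a numeric code instead of a message.
fn parse_port_compact(input: &str) -> Result<u16, ParseError> {
    if input.is_empty() {
        return Err(ParseError::Empty);
    }
    input.parse::<u16>().map_err(|_| ParseError::Invalid)
}

fn main() {
    println!("{:?}", parse_port_verbose("abc"));
    match parse_port_compact("abc") {
        Ok(port) => println!("port {}", port),
        // Only the numeric code crosses the boundary, e.g. back to Java.
        Err(code) => println!("error code {}", code as u8),
    }
}
```

The trade-off is the same one the article accepts for panics: smaller binaries in exchange for less descriptive diagnostics.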
Analyzing the artifact again shows that the whole file is down to 135k, and my own core code now accounts for 52% of the code size, which is roughly what I expected.
File .text Size Crate
14.2% 52.0% 41.3KiB jdmp
3.2% 11.7% 9.3KiB core
3.1% 11.4% 9.1KiB jni
3.0% 11.0% 8.8KiB [Unknown]
1.9% 6.8% 5.4KiB std
0.9% 3.3% 2.6KiB alloc
0.3% 1.1% 936B cesu8
0.3% 1.0% 792B adler32
0.1% 0.5% 372B aho_corasick
0.1% 0.4% 316B regex_automata
0.1% 0.3% 220B log
0.1% 0.3% 216B hashbrown
0.0% 0.1% 108B bytes
0.0% 0.1% 44B combine
0.0% 0.1% 44B rustc_demangle
0.0% 0.0% 8B compiler_builtins
0.0% 0.0% 8B jni_sys
Optimize the linker script
Although the file is now much smaller than at the start, it still did not meet the size requirement for integration. Examining the sections of the ELF file with readelf revealed some further room for optimization.
$ aarch64-linux-gnu-readelf -S target/aarch64-linux-android/release/libjdmp.so
There are 24 section headers, starting at offset 0x21738:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .note.android.ide NOTE 0000000000000270 00000270
0000000000000098 0000000000000000 A 0 0 4
[ 2] .dynsym DYNSYM 0000000000000308 00000308
00000000000002e8 0000000000000018 A 7 1 8
[ 3] .gnu.version VERSYM 00000000000005f0 000005f0
000000000000003e 0000000000000002 A 2 0 2
[ 4] .gnu.version_r VERNEED 0000000000000630 00000630
0000000000000040 0000000000000000 A 7 2 4
[ 5] .gnu.hash GNU_HASH 0000000000000670 00000670
0000000000000024 0000000000000000 A 2 0 8
[ 6] .hash HASH 0000000000000694 00000694
0000000000000100 0000000000000004 A 2 0 4
[ 7] .dynstr STRTAB 0000000000000794 00000794
000000000000014d 0000000000000000 A 0 0 1
[ 8] .rela.dyn RELA 00000000000008e8 000008e8
00000000000007f8 0000000000000018 A 2 0 8
[ 9] .rela.plt RELA 00000000000010e0 000010e0
00000000000002a0 0000000000000018 AI 2 19 8
[10] .rodata PROGBITS 0000000000001380 00001380
0000000000001d83 0000000000000000 AM 0 0 8
[11] .eh_frame_hdr PROGBITS 0000000000003104 00003104
0000000000002494 0000000000000000 A 0 0 4
[12] .eh_frame PROGBITS 0000000000005598 00005598
00000000000078cc 0000000000000000 A 0 0 8
[13] .text PROGBITS 000000000000de64 0000ce64
0000000000013e0c 0000000000000000 AX 0 0 4
[14] .plt PROGBITS 0000000000021c70 00020c70
00000000000001e0 0000000000000000 AX 0 0 16
[15] .data.rel.ro PROGBITS 0000000000022e50 00020e50
0000000000000430 0000000000000000 WA 0 0 8
[16] .fini_array FINI_ARRAY 0000000000023280 00021280
0000000000000010 0000000000000008 WA 0 0 8
[17] .dynamic DYNAMIC 0000000000023290 00021290
0000000000000180 0000000000000010 WA 7 0 8
[18] .got PROGBITS 0000000000023410 00021410
0000000000000048 0000000000000000 WA 0 0 8
[19] .got.plt PROGBITS 0000000000023458 00021458
00000000000000f8 0000000000000000 WA 0 0 8
[20] .data PROGBITS 0000000000024550 00021550
0000000000000060 0000000000000000 WA 0 0 8
[21] .bss NOBITS 00000000000245b0 000215b0
0000000000000101 0000000000000000 WA 0 0 8
[22] .comment PROGBITS 0000000000000000 000215b0
00000000000000b2 0000000000000001 MS 0 0 1
[23] .shstrtab STRTAB 0000000000000000 00021662
00000000000000d3 0000000000000000 0 0 1
Before trimming these sections, it is necessary to understand what each one does at run time.
Section | Function |
.text | Code |
.data .rodata .bss | Data |
.plt .got .dynamic .dynsym .rela.dyn .rela.plt .hash .gnu.hash | Read by the dynamic linker at run time for symbol lookup and relocation |
.eh_frame .eh_frame_hdr | Call frame information for each function, used for stack unwinding |
.gnu.version .gnu.version_r | Symbol versioning metadata |
.shstrtab | Section name string table, not read at run time |
At run time the code and data sections are indispensable, and the sections required for dynamic linking must be kept. The remaining sections can be removed to shrink the file further. Note that once .eh_frame and .eh_frame_hdr are deleted, a crash yields only the crash address, and the stack cannot be unwound.
Create a linker script that keeps only the sections the program minimally needs to run.
PHDRS
{
  headers PT_PHDR PHDRS ;
  text PT_LOAD FILEHDR PHDRS ;
  data PT_LOAD ;
  dynamic PT_DYNAMIC ;
}
ENTRY(Reset);
EXTERN(RESET_VECTOR);
SECTIONS
{
  . = SIZEOF_HEADERS;
  /* Code and read-only data go into the text segment. */
  .text : { *(.text .text.*) } :text
  .rodata : { *(.rodata .rodata.*) } :text
  /* Skip 0x1000 bytes of address space before the writable data. */
  . = . + 0x1000;
  .data : { *(.data .data.*) *(.fini_array .fini_array.*) *(.got .got.*) *(.got.plt .got.plt.*) } :data
  .bss : { *(.bss .bss.*) } :data
  .dynamic : { *(.dynamic .dynamic.*) } :data :dynamic
  /* Drop the sections that are not needed at run time. */
  /DISCARD/ :
  {
    *(.ARM.exidx .ARM.exidx.*);
    *(.gnu.version .gnu.version.*);
    *(.gnu.version_r .gnu.version_r.*);
    *(.eh_frame_hdr .eh_frame .eh_frame_hdr.* .eh_frame.* );
    *(.note.android.ident .note.android.ident.*);
    *(.comment .comment.*);
  }
}
Modify the build configuration so that this script replaces the default linker script:
.cargo/config.toml
[build]
target = ["aarch64-linux-android","armv7-linux-androideabi"]
[unstable]
build-std-features = ["panic_immediate_abort"]
build-std = ["std","panic_abort"]
[target.aarch64-linux-android]
rustflags = ["-C", "link-arg=-Tlinker.lds"]
[target.armv7-linux-androideabi]
rustflags = ["-C", "link-arg=-Tlinker.lds"]
After this series of changes, the library was finally reduced to 95k, which meets the requirement.
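The saving from the linker script can be cross-checked against the readelf listing. The sketch below adds up the sizes of the sections that appear in both the listing and the /DISCARD/ block of the script (readelf truncates .note.android.ident to .note.android.ide in its output). The constant and function names are illustrative.

```rust
/// (section, size in bytes) for the discarded sections, with the sizes
/// taken from the Size column of the readelf listing.
const DISCARDED: [(&str, u64); 6] = [
    (".note.android.ident", 0x98),
    (".gnu.version", 0x3e),
    (".gnu.version_r", 0x40),
    (".eh_frame_hdr", 0x2494),
    (".eh_frame", 0x78cc),
    (".comment", 0xb2),
];

/// Sums the sizes of the given sections.
fn discarded_bytes(sections: &[(&str, u64)]) -> u64 {
    sections.iter().map(|(_, size)| size).sum()
}

fn main() {
    let total = discarded_bytes(&DISCARDED);
    println!("discarded: {} bytes ({:.1} KiB)", total, total as f64 / 1024.0);
}
```

The total is 40744 bytes, roughly 40k, which lines up with the drop from 135k to 95k. The match is only approximate, since alignment and padding in the output file also change. Almost all of the saving comes from .eh_frame and .eh_frame_hdr.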
Summary
Compilation Options | Size |
strip | 495k |
strip + opt-level = 'z' | 437k |
strip + opt-level = 'z' + lto | 436k |
strip + opt-level = 'z' + lto + panic = 'abort' + code trimming + panic_immediate_abort | 135k |
strip + opt-level = 'z' + lto + panic = 'abort' + code trimming + panic_immediate_abort + remove section | 95k |
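To see which steps mattered most, the relative reduction of each row can be computed from the table. This is a small illustrative calculation; STEPS and reduction_percent are names introduced here, and the sizes are the ones listed above.

```rust
/// (label, size in k) for each row of the summary table.
const STEPS: [(&str, f64); 5] = [
    ("strip", 495.0),
    ("+ opt-level = 'z'", 437.0),
    ("+ lto", 436.0),
    ("+ panic = 'abort' + code trimming + panic_immediate_abort", 135.0),
    ("+ remove section", 95.0),
];

/// Relative size reduction from `before` to `after`, in percent.
fn reduction_percent(before: f64, after: f64) -> f64 {
    100.0 * (before - after) / before
}

fn main() {
    // Compare each row with the one before it.
    for pair in STEPS.windows(2) {
        let (_, before) = pair[0];
        let (label, after) = pair[1];
        println!("{:<60} -{:.1}%", label, reduction_percent(before, after));
    }
    let first = STEPS[0].1;
    let last = STEPS[STEPS.len() - 1].1;
    println!("overall: -{:.1}%", reduction_percent(first, last));
}
```

Overall the library shrinks by about 81%, and most of that comes from the panic-related changes and code trimming rather than from opt-level or LTO.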
This article records the steps I took to reduce the compiled size. Some of these strategies also apply to C and C++ development.
Author: Shang Hongze
Source: JD Cloud Developer Community. Please indicate the source when republishing.
