Exploration and practice of optimizing the file size of Android dynamic link libraries


Background introduction

The size of an application package affects many things, such as the number of downloads, installation time, and the disk space used on the device. According to Google Play statistics, every 6MB increase in application size lowers the install conversion rate by 1%.

Package size is affected by many factors, and there are different optimization strategies for dex files, resource files, and .so files; they are not covered one by one here. This article records how the file size of a dynamic link library was trimmed during development.


The library I developed is written in Rust, and the Java layer and the native layer call each other through the Android JNI interface. Rust was chosen mainly for the following reasons:

1. Stable. Android's JNI interface is complex to call, and the native layer also has to manage memory. As the code base grows, its safety and stability come under increasing pressure. With Rust, developers rarely need to worry about memory management: as long as the code follows the rules and passes the compiler's checks, it is hard to make the program crash, and this has been confirmed since the code went into production.

2. Secure. Code written in C or C++ is easy to reverse-engineer with disassembly tools after compilation if it is not protected, and mature tools such as IDA and Ghidra can decompile the machine code back into readable high-level pseudocode. Binaries compiled from Rust use a different calling convention between internal functions, and there are currently no comparably complete decompilation tools for them, so the resistance to cracking is raised by a level.

However, Rust has one very obvious shortcoming: the compiled binaries are too large. With the default Rust compilation options and only strip enabled, my dynamic library came to 495k.

Optimization plan

Drawing on experience shared by others on the Internet, the following optimizations were applied one after another.

Adjust the optimization level

The default optimization level for release builds is 3 (O3). It aims at execution speed, but it also unrolls some loops, which increases the binary size. Since the goal here is a smaller binary, the optimization level is changed to 'z', which asks the compiler to generate the smallest binary:

[profile.release]
opt-level = 'z'

Size before and after optimization

Compilation options        Size
strip                      495k
strip + opt-level = 'z'    437k

Enable LTO

LTO (Link Time Optimization) can eliminate redundant code during linking, reducing the binary size - at the cost of a longer linking time.

Cargo.toml
[profile.release]
opt-level = 'z'
lto = true

Size before and after optimization

Compilation options              Size
strip                            495k
strip + opt-level = 'z'          437k
strip + opt-level = 'z' + lto    436k

The effect is barely noticeable, but it is better than nothing.

Panic terminates immediately

By default, a Rust panic unwinds the stack and can print a backtrace, which is convenient for locating problems. However, the unwinding support code also increases the binary size, so the panic strategy is switched to abort.

[profile.release]
opt-level = 'z'
lto = true
panic = 'abort'

Size before and after optimization

Compilation options                                Size
strip                                              495k
strip + opt-level = 'z'                            437k
strip + opt-level = 'z' + lto                      436k
strip + opt-level = 'z' + lto + panic = 'abort'    366k
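
One side effect to keep in mind: with panic = 'abort', std::panic::catch_unwind can no longer catch a panic, so any panic inside the library terminates the whole process, which for a JNI library means the host application. The following is only a minimal sketch of one way to cope with that, assuming a hypothetical helper parse_len that is not part of the original project: fallible operations report failure through a sentinel value instead of calling unwrap().

```rust
/// Hypothetical helper for a native entry point (name and purpose are
/// assumptions for illustration): parse a decimal length field.
///
/// Returns the parsed value on success and -1 on failure, instead of
/// calling `unwrap()` and risking a panic that would abort the process
/// when the crate is built with `panic = 'abort'`.
pub fn parse_len(input: &str) -> i64 {
    match input.trim().parse::<u32>() {
        // Widen to i64 so that every valid u32 fits and -1 stays free
        // to act as the error marker.
        Ok(n) => i64::from(n),
        Err(_) => -1,
    }
}
```

The caller on the Java side can then check for -1 and raise an exception there, so a malformed input does not take the application down.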

At this point the conventional options have been exhausted, and further optimization requires some code changes.

The binary was analyzed with cargo-bloat, a size analysis tool for Rust binaries. The per-crate results are as follows:

File  .text     Size Crate
4.1%  69.0% 192.7KiB std
1.0%  16.8%  46.9KiB jdmp
0.5%   8.1%  22.7KiB [Unknown]
0.2%   3.8%  10.5KiB jni
0.0%   0.5%   1.5KiB cesu8
0.0%   0.4%   1.1KiB adler32
0.0%   0.3%     904B bytes
0.0%   0.2%     640B aho_corasick
0.0%   0.2%     588B regex_syntax
0.0%   0.2%     572B regex_automata
0.0%   0.2%     440B log
0.0%   0.1%     304B memchr
0.0%   0.0%      52B combine
0.0%   0.0%       8B jni_sys

To my surprise, my core code, the jdmp module, takes up only 46.9KiB, yet it drags in several hundred kilobytes of additional overhead!

Remove some unnecessary strings

The third-party dependencies contain many strings, most of which exist to produce better error messages at run time. By modifying and simplifying these dependencies and deleting unnecessary code, some more space can be saved.
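
As an illustration of the kind of trimming described above, the sketch below replaces message-carrying errors with small numeric codes. The names DumpError and check_header and the magic value "JDMP" are hypothetical; they are not taken from the original code or from any of its dependencies.

```rust
/// Hypothetical error type: each variant is a one-byte code and carries
/// no message text. `Debug` is deliberately not derived, because the
/// derived implementation would embed the variant names as strings.
#[derive(Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum DumpError {
    InvalidHeader = 1,
    Truncated = 2,
}

impl DumpError {
    /// Numeric code that can be handed back across JNI and decoded
    /// offline from a lookup table kept outside the binary.
    pub fn code(self) -> u8 {
        self as u8
    }
}

/// Check that a buffer is at least 4 bytes long and starts with the
/// (assumed) 4-byte magic value.
pub fn check_header(buf: &[u8]) -> Result<(), DumpError> {
    if buf.len() < 4 {
        return Err(DumpError::Truncated);
    }
    if !buf.starts_with(b"JDMP") {
        return Err(DumpError::InvalidHeader);
    }
    Ok(())
}
```

The trade-off is that error reports become less readable, so a code table has to be maintained somewhere outside the library.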

In addition, even with panic = 'abort', the precompiled standard library still contains the panic message formatting code and its strings. These can be removed with the panic_immediate_abort feature of the standard library, which requires rebuilding std through the unstable build-std option on a nightly toolchain.

.cargo/config.toml
[unstable]
build-std-features = ["panic_immediate_abort"]
build-std = ["std","panic_abort"]

Size before and after optimization

Compilation options                                                                        Size
strip                                                                                      495k
strip + opt-level = 'z'                                                                    437k
strip + opt-level = 'z' + lto                                                              436k
strip + opt-level = 'z' + lto + panic = 'abort' + code trimming + panic_immediate_abort    135k

Analyzing the result again shows that the whole file has dropped to 135k, with my own core code accounting for 52% of the .text section, which is basically in line with expectations.

File  .text    Size Crate
14.2%  52.0% 41.3KiB jdmp
 3.2%  11.7%  9.3KiB core
 3.1%  11.4%  9.1KiB jni
 3.0%  11.0%  8.8KiB [Unknown]
 1.9%   6.8%  5.4KiB std
 0.9%   3.3%  2.6KiB alloc
 0.3%   1.1%    936B cesu8
 0.3%   1.0%    792B adler32
 0.1%   0.5%    372B aho_corasick
 0.1%   0.4%    316B regex_automata
 0.1%   0.3%    220B log
 0.1%   0.3%    216B hashbrown
 0.0%   0.1%    108B bytes
 0.0%   0.1%     44B combine
 0.0%   0.1%     44B rustc_demangle
 0.0%   0.0%      8B compiler_builtins
 0.0%   0.0%      8B jni_sys

Optimize linker script

Although the file size has been optimized quite a lot compared to the beginning, it has not yet reached the required level for integration. By further analyzing the sections of the ELF file with readelf, I found some additional optimization space.

$ aarch64-linux-gnu-readelf -S target/aarch64-linux-android/release/libjdmp.so
There are 24 section headers: starting at offset 0x21738:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .note.android.ide NOTE             0000000000000270  00000270
       0000000000000098  0000000000000000   A       0     0     4
  [ 2] .dynsym           DYNSYM           0000000000000308  00000308
       00000000000002e8  0000000000000018   A       7     1     8
  [ 3] .gnu.version      VERSYM           00000000000005f0  000005f0
       000000000000003e  0000000000000002   A       2     0     2
  [ 4] .gnu.version_r    VERNEED          0000000000000630  00000630
       0000000000000040  0000000000000000   A       7     2     4
  [ 5] .gnu.hash         GNU_HASH         0000000000000670  00000670
       0000000000000024  0000000000000000   A       2     0     8
  [ 6] .hash             HASH             0000000000000694  00000694
       0000000000000100  0000000000000004   A       2     0     4
  [ 7] .dynstr           STRTAB           0000000000000794  00000794
       000000000000014d  0000000000000000   A       0     0     1
  [ 8] .rela.dyn         RELA             00000000000008e8  000008e8
       00000000000007f8  0000000000000018   A       2     0     8
  [ 9] .rela.plt         RELA             00000000000010e0  000010e0
       00000000000002a0  0000000000000018  AI       2    19     8
  [10] .rodata           PROGBITS         0000000000001380  00001380
       0000000000001d83  0000000000000000  AM       0     0     8
  [11] .eh_frame_hdr     PROGBITS         0000000000003104  00003104
       0000000000002494  0000000000000000   A       0     0     4
  [12] .eh_frame         PROGBITS         0000000000005598  00005598
       00000000000078cc  0000000000000000   A       0     0     8
  [13] .text             PROGBITS         000000000000de64  0000ce64
       0000000000013e0c  0000000000000000  AX       0     0     4
  [14] .plt              PROGBITS         0000000000021c70  00020c70
       00000000000001e0  0000000000000000  AX       0     0     16
  [15] .data.rel.ro      PROGBITS         0000000000022e50  00020e50
       0000000000000430  0000000000000000  WA       0     0     8
  [16] .fini_array       FINI_ARRAY       0000000000023280  00021280
       0000000000000010  0000000000000008  WA       0     0     8
  [17] .dynamic          DYNAMIC          0000000000023290  00021290
       0000000000000180  0000000000000010  WA       7     0     8
  [18] .got              PROGBITS         0000000000023410  00021410
       0000000000000048  0000000000000000  WA       0     0     8
  [19] .got.plt          PROGBITS         0000000000023458  00021458
       00000000000000f8  0000000000000000  WA       0     0     8
  [20] .data             PROGBITS         0000000000024550  00021550
       0000000000000060  0000000000000000  WA       0     0     8
  [21] .bss              NOBITS           00000000000245b0  000215b0
       0000000000000101  0000000000000000  WA       0     0     8
  [22] .comment          PROGBITS         0000000000000000  000215b0
       00000000000000b2  0000000000000001  MS       0     0     1
  [23] .shstrtab         STRTAB           0000000000000000  00021662
       00000000000000d3  0000000000000000           0     0     1

Before trimming these sections, it is necessary to understand the role each one plays when the program runs.

Section                                                     Function
.text                                                       Code segment
.data .rodata .bss                                          Data segment
.plt .got .dynamic .dynsym .rela.dyn .rela.plt .shstrtab    Parsed at run time by the dynamic linker, used for dynamic linking
.gnu.hash .hash                                             Symbol hash tables, used by the dynamic linker for symbol lookup
.eh_frame .eh_frame_hdr                                     Call frame information, used for stack unwinding
.gnu.version .gnu.version_r                                 Symbol versioning metadata of the compiled file

For normal operation, the code segment and data segment are indispensable, and the sections required for dynamic linking must be retained. The remaining sections can be removed to reduce the file size further. Note that once .eh_frame and .eh_frame_hdr are deleted, a crash yields only the crash address, and the stack can no longer be unwound.
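
As a rough estimate of what this step can save, the sketch below adds up the sizes that the readelf listing above reports for the sections not needed at run time. The choice of these six sections is an assumption based on the role table; the sizes are copied from the Size column of the listing.

```rust
/// (section name, size in bytes), copied from the readelf output above.
/// Treating exactly these six sections as removable is an assumption.
pub const DISCARDED: [(&str, u64); 6] = [
    (".note.android.ident", 0x98),
    (".gnu.version", 0x3e),
    (".gnu.version_r", 0x40),
    (".eh_frame_hdr", 0x2494),
    (".eh_frame", 0x78cc),
    (".comment", 0xb2),
];

/// Total number of bytes occupied by the sections listed above.
pub fn discarded_bytes() -> u64 {
    DISCARDED.iter().map(|&(_, size)| size).sum()
}

/// The same total expressed in KiB.
pub fn discarded_kib() -> f64 {
    discarded_bytes() as f64 / 1024.0
}
```

The total is 40744 bytes, a little under 40 KiB, and almost all of it comes from .eh_frame and .eh_frame_hdr. Alignment padding and the section header table also change, so the real saving can differ slightly.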

Create a linker script that keeps only the sections the program needs in order to run.

PHDRS
{
  headers PT_PHDR PHDRS ;
  text PT_LOAD FILEHDR PHDRS ;
  data PT_LOAD ;
  dynamic PT_DYNAMIC ;
}
ENTRY(Reset);
EXTERN(RESET_VECTOR); 
SECTIONS
{
  . = SIZEOF_HEADERS;
  .text : { *(.text .text.*) } :text
  .rodata : { *(.rodata .rodata.*) } :text

  . = . + 0x1000;
  .data : { *(.data .data.*) *(.fini_array .fini_array.*) *(.got .got.*) *(.got.plt .got.plt.*) } : data
  .bss : {*(.bss .bss.*)} : data
  .dynamic : { *(.dynamic .dynamic.*) } :data :dynamic

  /DISCARD/:
  {
    *(.ARM.exidx .ARM.exidx.*);
    *(.gnu.version .gnu.version.*);
    *(.gnu.version_r .gnu.version_r.*);
    *(.eh_frame_hdr .eh_frame .eh_frame_hdr.* .eh_frame.* );
    *(.note.android.ident .note.android.ident.*);
    *(.comment .comment.*);
  }
}

Modify the build configuration so that the linker uses this script instead of its default one.

.cargo/config.toml

[build]
target = ["aarch64-linux-android","armv7-linux-androideabi"]

[unstable]
build-std-features = ["panic_immediate_abort"]
build-std = ["std","panic_abort"]

[target.aarch64-linux-android]
rustflags = ["-C", "link-arg=-Tlinker.lds"]

[target.armv7-linux-androideabi]
rustflags = ["-C", "link-arg=-Tlinker.lds"]
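
The relative path in -Tlinker.lds depends on the directory the linker is started in. A possible alternative, shown here only as a sketch and not as the setup used in this article, is to emit the flag from a build script with an absolute path. It assumes that the library is built as a cdylib and that linker.lds sits next to Cargo.toml; the function name linker_script_directive is hypothetical.

```rust
/// Build the Cargo build-script directive that passes `-T<script>` to
/// the linker for cdylib targets, using an absolute path made from the
/// package directory and the script file name.
pub fn linker_script_directive(manifest_dir: &str, script: &str) -> String {
    format!("cargo:rustc-cdylib-link-arg=-T{}/{}", manifest_dir, script)
}

// In build.rs, `main` would print the directive so that Cargo picks it up
// (assumed layout: linker.lds in the same directory as Cargo.toml):
//
//     let dir = std::env::var("CARGO_MANIFEST_DIR").unwrap();
//     println!("cargo:rerun-if-changed=linker.lds");
//     println!("{}", linker_script_directive(&dir, "linker.lds"));
```

With this approach the rustflags entries for the two targets would no longer be needed, because the build script applies to every target the crate is built for.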

After this series of changes, the size of the library was finally reduced to 95k, which meets the requirement.

Summary

Compilation options                                                                                         Size
strip                                                                                                       495k
strip + opt-level = 'z'                                                                                     437k
strip + opt-level = 'z' + lto                                                                               436k
strip + opt-level = 'z' + lto + panic = 'abort' + code trimming + panic_immediate_abort                     135k
strip + opt-level = 'z' + lto + panic = 'abort' + code trimming + panic_immediate_abort + remove section    95k
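
To put the steps in proportion, the sketch below computes how much each configuration saves relative to the previous one and to the strip-only baseline. The sizes are the ones from the summary table; the function names are only illustrative.

```rust
/// (build configuration, size in k) as reported in the summary table.
pub const STEPS: [(&str, u32); 5] = [
    ("strip", 495),
    ("+ opt-level = 'z'", 437),
    ("+ lto", 436),
    ("+ panic = 'abort' + code trimming + panic_immediate_abort", 135),
    ("+ remove section", 95),
];

/// Percentage by which `after` is smaller than `before`.
/// Assumes `after <= before`, which holds for every step in the table.
pub fn reduction_percent(before: u32, after: u32) -> f64 {
    f64::from(before - after) * 100.0 / f64::from(before)
}

/// One line per step after the baseline: the saving relative to the
/// previous step and relative to the strip-only baseline.
pub fn report() -> Vec<String> {
    let baseline = STEPS[0].1;
    STEPS
        .windows(2)
        .map(|pair| {
            let (_, prev) = pair[0];
            let (name, size) = pair[1];
            format!(
                "{}: {}k ({:.1}% vs previous, {:.1}% vs baseline)",
                name,
                size,
                reduction_percent(prev, size),
                reduction_percent(baseline, size)
            )
        })
        .collect()
}
```

Overall the library shrinks by about 80.8%, and the largest single drop comes from the step that combines panic = 'abort', code trimming, and panic_immediate_abort.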

This article records the steps I took to reduce the compiled size. Some of these strategies also apply to C and C++ development.

Author: Shang Hongze

Source: JD Cloud Developer Community. Please indicate the source when republishing.
