How does Wine implement the cross-platform compatibility layer for Windows?

Wine is a compatibility layer that allows Windows applications to run on several POSIX-compliant operating systems, such as Linux, macOS, and BSD (https://www.winehq.org).

If you have been using Linux for a while, you have probably used Wine at some point, maybe to run that one very important Windows program that has no Linux version, or to play World of Warcraft or other games. Interestingly, Valve's Steam Deck runs games using a Wine-based solution (known as Proton).

Over the past year, I have spent a considerable amount of time developing a debugger that can debug both the Wine layer and the Windows applications running under it. Learning about Wine's internals was fascinating: I had used Wine many times before but never knew how it worked. If you have ever wondered how a Windows executable can run on Linux without any modification, read on.

Disclaimer

This article greatly simplifies the reality, and I do not claim to know all the details. However, I hope that the explanation here gives you a general understanding of how Wine operates.

Not an emulator

Before describing how Wine works, let's first discuss how it doesn't work. Wine is an acronym for "Wine Is Not an Emulator." Why not? There are many excellent emulators that are suitable for both old architectures and modern game consoles. Can Wine be implemented as an emulator? Yes, but there are good reasons not to do so. Let's quickly look at how emulators generally work.

Imagine that we have some simple hardware with two instructions:

  • push - Push the given value onto the stack

  • setpxl - Pop three values from the stack and draw a pixel with color arg1 at (arg2, arg3).

(This should be enough to create some cool demo scenes, right?)

> dump-instructions game.rom
...
# draw red dot at (10,10)
push 10
push 10
push 0xFF0000
setpxl
# draw green dot at (15,15)
push 15
push 15
push 0x00FF00
setpxl

The binary game file (or ROM cartridge) is a sequence of these instructions that the hardware can load into memory and then execute. Real hardware can execute them natively, but what if we want to play the game on a modern laptop? We write a software emulator -- a program that loads the ROM into memory and then executes its instructions (an interpreter or a virtual machine, if you prefer). The implementation of an emulator for our two-instruction console can be quite simple.

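// The two instructions our imaginary console understands.
enum Opcode {
    Push(i32),
    SetPixel,
}

const WIDTH: usize = 160;
const HEIGHT: usize = 144;

fn main() {
    // A real emulator would decode these from "game.rom"; they are inlined
    // here so that the example is self-contained.
    let program = vec![
        Opcode::Push(10),
        Opcode::Push(10),
        Opcode::Push(0xFF0000),
        Opcode::SetPixel,
    ];
    let mut screen = vec![0i32; WIDTH * HEIGHT]; // Virtual screen of 160x144 pixels
    let mut stack: Vec<i32> = Vec::new(); // Stack for passing arguments

    for opcode in program {
        match opcode {
            Opcode::Push(value) => stack.push(value),
            Opcode::SetPixel => {
                // pop() returns an Option, so an empty stack must be handled.
                let color = stack.pop().expect("stack underflow");
                let x = stack.pop().expect("stack underflow") as usize;
                let y = stack.pop().expect("stack underflow") as usize;
                screen[y * WIDTH + x] = color; // stands in for window.set_pixel(x, y, color)
            }
        }
    }
    println!("pixel at (10, 10): {:#08X}", screen[10 * WIDTH + 10]);
}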

A real emulator is much more complex, but the basic idea is the same: maintain some context (memory, registers, etc.), handle input (such as keyboard/mouse) and output (such as drawing to a window), parse the input data (the ROM), and execute instructions one by one, applying their side effects.

This might be one way to implement Wine, but there are two reasons against it. First, emulators are slow: executing every instruction in software carries a large overhead. That may be acceptable for old hardware, but not for modern software (and video games have always been among the most demanding kinds of application). Second, it is unnecessary! Linux and macOS are perfectly capable of running Windows binaries natively; they just need a little push...

Let's compile a simple program for Linux and Windows and compare the results.

int foo(int x) {
    return x * x;
}

int main(int argc) {
    int code = foo(argc);
    return code;
}

(Image: compiler output for the program above. Left - Linux, Right - Windows)

The results are obviously different, but the instruction set is actually the same: push, pop, mov, add, sub, imul, ret. Therefore, if we had an "emulator" that could execute these instructions, it should in theory be able to execute both programs. And as it turns out, we do have such a thing -- it's our CPU.

How does Linux run binary files

Before running Windows binary files on Linux, let's first understand how to run a normal Linux binary file.

❯ cat app.cc
#include <stdio.h>

int main() {
  printf("Hello!\n");
  return 0;
}

❯ clang app.cc -o app

❯ ./app
Hello!  # works!

It's simple enough, so let's delve deeper. What happens when we run ./app?

❯ ldd app
        linux-vdso.so.1 (0x00007ffddc586000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f743fcdc000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f743fed3000)

❯ readelf -l app

Elf file type is DYN (Position-Independent Executable file)
Entry point 0x1050
There are 13 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
                 0x00000000000002d8 0x00000000000002d8  R      0x8
  INTERP         0x0000000000000318 0x0000000000000318 0x0000000000000318
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
...

First, we see that the application is a dynamically linked executable. This means it depends on some dynamic libraries, which must be present at runtime. Another interesting thing is the 'Requesting program interpreter' part. What is an interpreter doing here? I thought C++ was a compiled language, unlike Python...

In this case, the interpreter is the 'dynamic loader'. It is a special program that bootstraps the execution of the original program: it resolves and loads its dependencies, and then passes control to it.
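
To see where that interpreter path comes from, here is a minimal sketch in Python that does by hand what readelf shows above: it walks the program headers of a 64-bit little-endian ELF image and returns the contents of the INTERP segment. The function name find_interpreter is my own, and the sketch ignores 32-bit and big-endian files; it is an illustration of the file layout, not how the kernel is actually written.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def find_interpreter(elf: bytes):
    """Return the interpreter path of a 64-bit little-endian ELF image, or None."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if elf[4] != 2 or elf[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")

    # ELF header fields: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)

    for i in range(e_phnum):
        offset = e_phoff + i * e_phentsize
        # First six fields of a 64-bit program header.
        p_type, _flags, p_offset, _vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", elf, offset
        )
        if p_type == PT_INTERP:
            raw = elf[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None  # statically linked: no interpreter requested
```

Calling find_interpreter(open("app", "rb").read()) on the binary built above should return the same path that readelf prints, and None for a statically linked binary.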

❯ ./app
Hello!  # This works!

❯ /lib64/ld-linux-x86-64.so.2 ./app
Hello!  # This works too!

# Homework exercise, run this and try to make sense of the output.
❯ LD_DEBUG=all /lib64/ld-linux-x86-64.so.2 ./app

When running an executable file, the Linux kernel detects that it is dynamic and requires a loader. Then it executes the loader, and the loader completes all the work. For example, we can verify this by running the program under the debugger.

❯ lldb ./app
(lldb) target create "./app"
Current executable set to '/home/werat/src/cpp/app' (x86_64).
(lldb) process launch --stop-at-entry
Process 351228 stopped
* thread #1, name = 'app', stop reason = signal SIGSTOP
    frame #0: 0x00007ffff7fcd050 ld-2.33.so`_start
ld-2.33.so`_start:
    0x7ffff7fcd050 <+0>: movq   %rsp, %rdi
    0x7ffff7fcd053 <+3>: callq  0x7ffff7fcdd70            ; _dl_start at rtld.c:503:1

ld-2.33.so`_dl_start_user:
    0x7ffff7fcd058 <+0>: movq   %rax, %r12
    0x7ffff7fcd05b <+3>: movl   0x2ec57(%rip), %eax       ; _dl_skip_args
Process 351228 launched: '/home/werat/src/cpp/app' (x86_64)

Here we can see that the first instruction executed belongs to ld-2.33.so, not to the application binary.

To summarize, the process of running a dynamically linked executable file on Linux is roughly as follows:

  1. The kernel loads the image (≈ binary file) and sees that it is a dynamic executable

  2. The kernel loads the dynamic loader (ld.so) and gives it control

  3. The dynamic loader resolves dependencies and loads them

  4. The dynamic loader passes control to the original binary file

  5. The original binary file begins execution at _start() and eventually enters main().

At this point, it is clear why simply running a Windows executable does not work -- it has a different format, and the kernel has no idea how to handle it.

❯ ./HalfLife4.exe
-bash: HalfLife4.exe: cannot execute binary file: Exec format error
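
How can the format be told apart? Executable formats announce themselves in their first few bytes: ELF files start with 0x7f 'E' 'L' 'F', Windows executables start with 'MZ', and scripts start with '#!'. The sketch below is a rough Python imitation of that check. The function name guess_format and the labels it returns are made up for illustration, and the real kernel logic is more involved (for example, Linux can be configured through binfmt_misc to hand files with a given signature to a helper such as Wine).

```python
def guess_format(header: bytes) -> str:
    """Classify an executable by its leading magic bytes (illustrative only)."""
    if header.startswith(b"\x7fELF"):
        return "ELF"      # native Linux/BSD executables and shared objects
    if header.startswith(b"MZ"):
        return "PE"       # DOS/Windows executables and DLLs
    if header.startswith(b"#!"):
        return "script"   # handed to the interpreter named on the first line
    return "unknown"
```

For example, guess_format(open("HalfLife4.exe", "rb").read(4)) would return "PE", a format that a stock kernel has no loader for.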

However, if we could skip steps 1-4 and somehow reach step 5, theoretically it should be feasible, right? Since we are talking about 'execution', from the perspective of the operating system, what does it mean to 'run' a binary file?

Each executable file has a .text section, which contains serialized CPU instructions.

❯ objdump -drS app

app:     file format elf64-x86-64

...

Disassembly of section .text:

0000000000001050 <_start>:
    1050:       31 ed                   xor    %ebp,%ebp
    1052:       49 89 d1                mov    %rdx,%r9
    1055:       5e                      pop    %rsi
    1056:       48 89 e2                mov    %rsp,%rdx
    1059:       48 83 e4 f0             and    $0xfffffffffffffff0,%rsp
    105d:       50                      push   %rax
    105e:       54                      push   %rsp
    105f:       4c 8d 05 6a 01 00 00    lea    0x16a(%rip),%r8        # 11d0 <__libc_csu_fini>
    1066:       48 8d 0d 03 01 00 00    lea    0x103(%rip),%rcx        # 1170 <__libc_csu_init>
    106d:       48 8d 3d cc 00 00 00    lea    0xcc(%rip),%rdi        # 1140 <main>
    1074:       ff 15 4e 2f 00 00       call   *0x2f4e(%rip)        # 3fc8 <__libc_start_main@GLIBC_2.2.5>
    107a:       f4                      hlt
    107b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
...

To 'run' an executable file, the operating system loads the binary file into memory (especially the .text section), sets the current instruction pointer to the address of the code, and thus, the executable file begins to run. Can we do the same thing to Windows executable files?

Yes! The code within the executable is "portable" between Windows and Linux (assuming the same CPU architecture). If we just take the code out of the Windows executable, load it into memory, and set %rip to the correct place--the processor will be happy to execute it.
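
To make "load it into memory and jump to it" concrete, here is a small Python sketch that assumes x86-64 Linux. The six bytes are my own hand-assembled encoding of foo() from earlier (they are not copied from the screenshot), and run_machine_code is a hypothetical helper name. It copies the bytes into an executable memory page and calls them like a C function.

```python
import ctypes
import mmap
import platform
import sys

# Only meaningful where the bytes below are valid code and mmap has PROT_EXEC.
SUPPORTED = sys.platform.startswith("linux") and platform.machine() == "x86_64"

# int foo(int x) { return x * x; } for the System V x86-64 calling convention
SQUARE_CODE = bytes([
    0x89, 0xF8,        # mov  eax, edi   (first argument arrives in edi)
    0x0F, 0xAF, 0xC7,  # imul eax, edi
    0xC3,              # ret             (result is returned in eax)
])


def run_machine_code(code: bytes, arg: int) -> int:
    """Copy raw instructions into an executable page and call them as int f(int)."""
    page = mmap.mmap(
        -1, mmap.PAGESIZE,
        prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC,
    )
    page.write(code)
    address = ctypes.addressof(ctypes.c_char.from_buffer(page))
    func = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)(address)
    return func(arg)
```

On a supported machine, run_machine_code(SQUARE_CODE, 7) returns 49 without any compiler or emulator involved. Note that a Windows compiler would pass the first argument in ecx rather than edi, so the instructions are portable but the calling conventions are not.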

Hello, Wine!

Essentially, Wine is the "dynamic loader" for Windows executable files. It is a native Linux binary file, so it can run normally, and it knows how to handle EXEs and DLLs. It is somewhat similar to ld-linux-x86-64.so.2.

# running an ELF binary
❯ /lib64/ld-linux-x86-64.so.2 ./app

# running a PE binary
❯ wine64 HalfLife4.exe

Wine loads the Windows executable file into memory, parses it, finds the dependencies, locates the executable code (i.e., the .text section), and then finally jumps to that code.

Well, in reality, it jumps into something like ntdll.dll!RtlUserThreadStart(), which is the "user space" entry point in the Windows world. It eventually enters mainCRTStartup() (equivalent to _start), and then finally into the actual main().
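
As a rough illustration of the "parse it and locate the code" step, the Python sketch below reads just enough of the PE headers to report the target CPU, the entry point, and where each section (including .text) lives in the file. The name describe_pe and the shape of the returned dictionary are my own choices. Wine's real loader does far more (mapping sections, applying relocations, resolving imports), so treat this as a sketch of the file layout only.

```python
import struct

# Values of the COFF "Machine" field for the two common x86 targets.
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x86-64"}


def describe_pe(image: bytes) -> dict:
    """Extract machine type, entry point and section table from a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")

    # 20-byte COFF file header follows the 4-byte signature.
    machine, n_sections, _time, _symtab, _nsyms, opt_size, _flags = struct.unpack_from(
        "<HHIIIHH", image, pe_offset + 4
    )
    opt_offset = pe_offset + 24
    # AddressOfEntryPoint sits 16 bytes into the optional header.
    (entry_rva,) = struct.unpack_from("<I", image, opt_offset + 16)

    sections = {}
    table = opt_offset + opt_size  # section table follows the optional header
    for i in range(n_sections):
        name, _vsize, rva, raw_size, raw_offset = struct.unpack_from(
            "<8sIIII", image, table + 40 * i
        )
        sections[name.rstrip(b"\0").decode()] = {
            "rva": rva, "file_offset": raw_offset, "size": raw_size,
        }
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "entry_rva": entry_rva,
        "sections": sections,
    }
```

The machine field is also how a loader can tell a 32-bit executable from a 64-bit one before deciding how to run it.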

At this point, our Linux system is executing code originally compiled for Windows, and everything seems to be working. Except...

System calls

System calls, commonly referred to as syscalls, are the reason why Wine is so complex. A syscall is a call to a function that is implemented in the operating system (hence "system call") rather than in the application binary or any of its dynamic libraries. The set of syscalls provided by an operating system is essentially its API.

Examples on Linux: read, write, open, brk, getpid

Examples on Windows: NtReadFile, NtCreateProcess, NtCreateMutant

System calls are not regular function calls in the code. For example, opening a file must be done by the kernel itself, because the kernel is the one tracking file descriptors. Therefore, application code needs a way to "interrupt" itself and hand control over to the kernel (this operation is usually called a context switch).

The set of functions exposed by the operating system, and the way these functions are called, differ from one operating system to another. For example, on Linux (x86-64), to call read() the binary places the file descriptor in register %rdi, the buffer pointer in %rsi, and the number of bytes to read in %rdx. The Windows kernel, however, has no read() function, so these arguments would mean nothing to it. Binaries compiled for Windows therefore make system calls the Windows way, which does not work on Linux. I will not delve into how syscalls actually work; here is a good article about the Linux implementation -- https://blog.packagecloud.io/the-definitive-guide-to-linux-system-calls/.
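
For the curious, the Linux side of this convention can be poked at from Python. The sketch below assumes x86-64 Linux, where the syscall number of read is 0. It uses libc's generic syscall() wrapper, which places the number and the three arguments in the registers described above and then executes the syscall instruction. The helper name raw_read is mine.

```python
import ctypes
import os
import platform
import sys

SYS_read = 0  # syscall number of read on x86-64 Linux; other architectures differ
ON_X86_64_LINUX = sys.platform.startswith("linux") and platform.machine() == "x86_64"


def raw_read(fd: int, count: int) -> bytes:
    """Read up to `count` bytes from `fd` via syscall(SYS_read, fd, buf, count)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    libc.syscall.argtypes = [
        ctypes.c_long,    # syscall number
        ctypes.c_long,    # file descriptor
        ctypes.c_void_p,  # buffer pointer
        ctypes.c_size_t,  # number of bytes to read
    ]
    buf = ctypes.create_string_buffer(count)
    n = libc.syscall(SYS_read, fd, buf, count)
    if n < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return buf.raw[:n]
```

Nothing in this call would make sense to a Windows kernel: the number 0, the register assignment, and the notion of a file descriptor are all Linux conventions.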

Let's compile a small program again to compare the code generated on Linux and Windows.

#include <stdio.h>

int main() {
    printf("Hello!\n");
    return 0;
}

(Image: compiler output for the program above. Left - Linux, Right - Windows)

This time we call a function from the standard library, which ultimately performs a system call. In the screenshot above, the Linux version calls puts, while the Windows version calls printf. These functions come from the standard library (libc.so on Linux, ucrtbase.dll on Windows), which applications use to simplify communication with the kernel. On Linux, it is now quite common to build statically linked binaries that do not depend on any dynamic libraries. In that case, the implementation of puts is embedded in the binary, and libc.so is not involved at runtime.

On Windows, at least until recently, "only malware would use direct system calls" [citation needed]. Normal applications always depend on kernel32.dll/kernelbase.dll/ntdll.dll, which hide the low-level magic of communicating with the kernel. Applications just call a function, and the library handles the rest.

(Image, courtesy of https://alice.climent-pommeret.red/posts/a-syscall-journey-in-the-windows-kernel/)

At this point, you may have a feeling for what we are going to do next.

Runtime translation of system calls

What if we could "intercept" system calls? For example, every time an application calls NtWriteFile(), we would step in, call write(), and return the result in the format the binary expects. This should be feasible. A crude solution for the example above might look like this.

// HelloWorld.exe
lea     rcx, OFFSET FLAT:`string'
call    printf
  ↓↓
// "Fake" ucrtbase.dll
mov rdi, rcx   // Convert the arguments to Linux ABI
call puts@PLT  // Call the real Linux implementation
  ↓↓
// Real libc.so
mov rsi, rdi       // pointer to "Hello!\n"
mov rdi, 1         // file descriptor 1 is stdout
mov rdx, 7         // how many bytes to write
mov rax, 1         // syscall number of write on x86-64 Linux
syscall

We can provide a custom version of ucrtbase.dll with a special printf implementation. Instead of trying to call the Windows kernel, it follows the Linux ABI and calls the real implementation in libc.so. In practice, however, applications can be statically linked against the C runtime, and we cannot modify the code of the binary itself for many reasons -- it is messy and complicated, it interferes with DRM, and so on.

Therefore, we will modify the layer between the binary and the kernel: ntdll.dll. This is the 'gateway' to the kernel, and Wine indeed provides its own implementation. In recent versions of Wine, it consists of two parts: ntdll.dll (a PE library) and ntdll.so (an ELF library). The first is a thin layer that simply redirects calls to its ELF counterpart. The ELF library contains a special function named __wine_syscall_dispatcher, which performs the magic of switching the current stack from the Windows side to the Linux side and back.

Therefore, when making system calls, the call stack of the process running with Wine looks like this.

(Image: call stack of a process running under Wine while it makes a system call)

The syscall dispatcher is the bridge between the Windows world and the Linux world. It takes care of the calling conventions: allocating some stack space, moving registers around, and so on. Once execution is inside the Linux library (ntdll.so), we can freely use any regular Linux API (such as libc or syscalls) and actually read/write files, lock/unlock mutexes, etc.
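
The translation idea itself fits in a few lines. The toy Python class below is emphatically not Wine's code: the class name, the handle numbering, and the two-argument nt_write_file are all invented for illustration (the real NtWriteFile takes nine parameters). It only shows the shape of the work: map a Windows-style handle to a Linux file descriptor, call the native API, and report the outcome as an NTSTATUS code.

```python
import os

# Real NTSTATUS values; everything else in this sketch is hypothetical.
STATUS_SUCCESS = 0x00000000
STATUS_INVALID_HANDLE = 0xC0000008


class FakeNtdll:
    """Toy translation layer from a Windows-style call to a Linux one."""

    def __init__(self):
        self._fds = {}         # Windows-style HANDLE -> Linux file descriptor
        self._next_handle = 4  # made-up handle numbering

    def register_fd(self, fd: int) -> int:
        """Hand out a fake HANDLE for an already open file descriptor."""
        handle = self._next_handle
        self._next_handle += 4
        self._fds[handle] = fd
        return handle

    def nt_write_file(self, handle: int, data: bytes):
        """Heavily simplified NtWriteFile: returns (status, bytes_written)."""
        fd = self._fds.get(handle)
        if fd is None:
            return STATUS_INVALID_HANDLE, 0
        written = os.write(fd, data)  # the native Linux call does the real work
        return STATUS_SUCCESS, written
```

With ntdll = FakeNtdll() and handle = ntdll.register_fd(1), calling ntdll.nt_write_file(handle, b"Hello!\n") prints to the terminal and reports success in the form a Windows caller expects.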

Is that it?

This sounds almost too easy, and it is. First, there are a lot of Windows APIs. They are poorly documented and have known (and unknown, ha-ha) bugs, which must be reproduced exactly as they are. Most of Wine's source code is the implementation of various Windows DLLs.

Second, there are different ways to make system calls. Technically, nothing prevents an application from making direct system calls with the syscall instruction, and ideally this should work too (remember, Windows games do all sorts of crazy things). The Linux kernel has a special mechanism to handle this (Syscall User Dispatch), which of course only adds complexity.

Third, there is the entire compatibility issue between 32-bit and 64-bit. There are many old 32-bit games that will never be re-released as 64-bit. Wine supports both, which again increases the overall complexity of the system.

Fourth, I haven't even mentioned wineserver -- a separate process spawned by Wine that maintains the 'state' of the kernel (open file descriptors, mutexes, etc.).

Fifth, oh, do you want to run a game? Not just a hello world? Then you need to deal with DirectX, audio, input devices (game controllers, joysticks), and so on. This is a huge task!

Wine has been developed for many years and has made significant progress. Today, you can run the latest games like 'Cyberpunk 2077' or 'Elden Ring' without any problems. Damn, sometimes Wine's performance is even better than Windows! What a time we are living in...


I hope this article has given you a basic understanding of how Wine works. As I warned in the disclaimer, I have simplified a lot of things, and some details may be incorrect (I hope not too many). If you find that I have something completely wrong, please reach out and correct me!

Author: Andy Hippo. Original article: https://werat.dev/blog/how-wine-works-101/
