3. Multi-party Security Computation - MPC (Secure Multi-Party Computation)

1. Background

Nowadays, organizations are increasingly concerned about data security when collecting, storing sensitive personal information, and processing, sharing personal information in external environments (such as the cloud). This is a strong demand for compliance with privacy regulations: for example, the California Consumer Privacy Act (CCPA) in the United States, the General Data Protection Regulation (GDPR) of the European Union, and other emerging regulations around the world, as well as China's 'Data Security Law' and 'Personal Information Protection Law', all require the secure handling of sensitive data.

Encrypting static data is not enough to prevent data leakage. Static data encryption creates an 'encrypted boundary', outside of which data can be accessed in plain text form. Because processing usually requires plain text data, this boundary usually exists potential data leakage risks. Static data encryption also does not support data sharing solutions with other organizations. In order to make data useful, they usually need to be accessed in plain text form within the application, which greatly reduces the protective power of encryption. Therefore, different industries need new privacy protection computing methods to meet legal compliance requirements and provide privacy guarantees for data sharing.

General privacy computing technology is a computational theory and method for the full lifecycle protection of privacy information, covering all computational operations of information owners, information forwarders, and information receivers in the full lifecycle process of information collection, storage, processing, release (including exchange), and destruction, and is a series of technologies for the secure circulation of data under the premise of privacy protection. Including but not limited to:

• Technologies for data restriction release: such as data desensitization, de-identification (masking, suppression, generalization, truncation, confusion, k-anonymity, l-diversity, t-closeness, etc.)

• Technologies based on data distortion: such as random perturbation, differential privacy, and synthetic data

• Privacy computing technology: based on the three major routes of multi-party secure computing, federated learning, and trusted execution environment

• Related auxiliary technologies: such as blockchain, verifiable computing, and traceability audit technology

2. Preface

With the rapid development of digitalization and big data analysis technology, the demand for multi-party data is also increasing. For example, to assess personal credit risk, it is necessary to conduct joint statistical analysis of multiple attribute features. The information needed in the process of personal credit reporting includes basic information such as personal identity, address, and occupation, personal credit transaction records formed by credit activities such as personal loans, credit cards, quasi-credit cards, and guarantees, as well as other information reflecting the personal credit status, covering all aspects of personal life. In order for credit reporting agencies to achieve accurate credit assessment through their own credit systems and withstand market scrutiny, their credit models need to use more sources, more timely, and more valuable massive data to continuously model.

However, legal supervision and collection costs make data sharing difficult. As credit data is a non-exclusively owned special resource with characteristics such as low replication costs, difficult to determine ownership, difficult to control circulation channels, and difficult to limit usage scope, it is easy to leak data if not handled carefully. How to achieve the integration of credit data while ensuring data security and privacy? How to ensure that the use and dissemination of credit data are controllable? Multi-party Computation (Secure Multi-Party Computation, abbreviated as MPC) exactly solves this problem, providing a technical possibility for the abundant flow of credit data.

MPC technology has the advantages of protecting privacy, ensuring correct results, and controllable operations, which can solve the technical dilemma of credit data integration. MPC can ensure that the data of the parties involved in the credit data integration is not leaked through cryptographic means; by using ciphertext functions for computation, it can ensure that the computational results are the same as those with plaintext computation; it can also specify the purpose and amount of credit data usage, making every operation controllable, restricted, and auditable.

3. Multi-party Security Computation - MPC (Secure Multi-Party Computation)

Multi-party Computation (MPC) is a computing paradigm where multiple parties collaborate to achieve a computational goal without a trusted third party, ensuring that each participant cannot obtain any input information from other participants except for the computational result. The problem of multi-party computation originated from the Millionaires' Problem proposed by Professor Yao in 1982, and after years of continuous development, it has become an important branch of cryptography. Multi-party computation is based on cryptographic algorithms and protocols to achieve privacy protection, with common multi-party computation protocols including Secret Sharing (SS), Garbled Circuit (GC), Homomorphic Encryption (HE), Oblivious Transfer (OT), and others. Multi-party computation technology can obtain the value of data usage without leaking the original data content, protecting privacy, and is applicable to various scenarios such as joint data analysis, private information retrieval (PIR), private set intersection (PSI), and trusted data exchange. A series of open-source libraries for MPC (such as ABY, EMP-toolkit, FRESCO, JIFF, MP-SPDZ, MPyC, SCALE-MAMBA, and TinyGable, etc.) have been developed, further promoting the application and deployment of MPC.

Table 1 shows a comparison of the characteristics of multi-party secure computation and federated learning with trusted execution environments. Multi-party secure computation is a provably secure computation based on cryptography, with high security, but high network requirements, and can be applied in high-security scenarios such as banking and government. Federated learning is efficient and suitable for joint machine learning scenarios with large amounts of data, although there is a gradient leakage problem, but it can be combined with differential privacy or multi-party secure computation to enhance security. The trusted execution environment belongs to centralized computation after data encryption, with high security and high accuracy characteristics, but requires data to be encrypted and centralized to a third-party environment, which limits its application scenarios.

In MPC, calculations can be performed on data contributed by multiple parties, and no party can see the content of the data they contribute. This enables secure calculations to be performed without the need for a trusted third party. The figure below illustrates how participants collaborate in calculations, knowing only the result of the calculation, but not the specific data contributed by others. Taking the calculation of the average salary as an example, a simple way is to have a trusted third party F responsible for collecting the salary income data of A, B, and C, and calculating the average. However, in real life, a trusted third party is not always available, and it is not necessarily secure, as there is a possibility that the collected private data may be leaked. Another way is to use multi-party secure computation to ensure that the salary data of all parties is not leaked and the average salary can be calculated without a trusted third party.

As shown in the figure below, Allie (A) earns 100 dollars. In additive secret sharing, 100 dollars is divided into three randomly generated parts (or secret shares): for example, 20 dollars, 30 dollars, and 50 dollars. Allie retains one secret share for herself (50 dollars) and allocates one secret share to Brian (B) (30 dollars) and Caroline (C) (20 dollars). Brian and Caroline also follow the same process to secretly share their salaries. Each participant sums up their secret shares locally to calculate a partial result; in this example, each partial result is one-third of the information needed to calculate the final answer. Then, the partial results {-30, 480, 150} are recombined, and the complete secret sharing set distributed earlier is added together to find the average, which results in the average salary of Allie, Brian, and Caroline being 200 dollars. It can be seen that throughout the process, no party can deduce the original salary information of others based on the partial secret shares they hold and the final result.

4. Several Important Concepts of MPC

Secure multi-party computation needs to ensure privacy and correctness. In the presence of adversarial behavior, it is necessary to ensure the security properties. The main consideration are two classic adversary models:

• Semi-Honest: A semi-honest adversary (also known as a passive adversary) follows the protocol specifications but may try to learn more information from the protocol records;

• Malicious: A malicious adversary (also known as an active adversary) can run any attack strategy to try to disrupt the protocol.

Communication Mode: The default communication method of MPC is an authenticated channel, which can be implemented in practice using known standard technologies. In a multi-party setting, parties are also connected through point-to-point channels, and sometimes a broadcast channel is also needed. The standard two-round echo broadcast protocol can effectively implement the broadcast channel.

There are two main methods to design MPC protocols:

• (1) Secret Sharing Method, which enables each party to interact for each nonlinear gate of the circuit, with low communication bandwidth but interaction rounds that are linearly related to the circuit depth;

• (2) Scrambled Circuit Method, allowing each party to construct an encrypted version of the circuit, which can only be calculated once, and the number of interaction rounds is constant, but the communication bandwidth is large.

4.1 Oblivious Transfer (OT - Oblivious Transfer)

Oblivious Transfer (OT) is a cryptographic protocol, which is currently widely used in secure multi-party computation (MPC, Secure Multi-Party Computation). It was proposed by Rabin [1] in 1981. Alice has a secret S1, and Bob has a secret S2. Alice and Bob want to exchange secrets, requiring that both parties have the possibility of obtaining the secret, and the secret holder does not know whether the other party has obtained the secret.

Even et al. proposed a new 1-out-of-2 OT protocol using the public key cryptosystem, which provides the definition and implementation of OT axiomatization. Compared to the one proposed by Rabin et al., where there is only a 1/2 probability for one party to obtain the secret, Even et al. improved it, namely: Alice has two secrets (M0, M1), while Bob wants to know one of them. After the execution of the OT protocol, Bob obtained one of the secrets, but did not know the other secret, and Alice also did not know whether Bob chose M0 or M1.

4.2 MPC Protocol Based on Secret Sharing (Secret Sharing)

The idea of secret sharing is to split the secret in an appropriate way, and each share after splitting is managed by different participants. A single participant cannot recover the secret information, and only a number of participants can collaborate to recover the secret message. More importantly, when any participant within the corresponding range has a problem, the secret can still be fully recovered. It is an important means in information security and data confidentiality.

Currently, specific and efficient MPC protocols mainly adopt three linear secret sharing schemes (LSSS): additive secret sharing, Shamir secret sharing, and replicated secret sharing (also known as CNF secret sharing), among which additive secret sharing is mainly used in MPC protocols with dishonest majority settings, while Shamir and replicated secret sharing are used in honest majority MPC protocols. To achieve malicious security, additive secret sharing requires information theory message authentication codes (IT-MACs).

The three LSSS used in MPC are all (n, t)-threshold secret sharing schemes, which allow n parties to share a secret x among the parties, so that any subset of t parties cannot obtain any information about the secret x, and any subset of t + 1 parties can reconstruct the secret x. Additive secret sharing can only be performed when t = n - 1, while Shamir/replicated secret sharing allows any t < n (we often use t < n/2 to represent honest majority MPC).

4.3 Constant-Round MPC Protocol Based on Obfuscated Circuits

Its core technology is to compile the secure computation functions of the two parties into the form of Boolean circuits, and to encrypt and shuffle the truth table, thus achieving normal circuit output without leaking the private information of the two parties involved in the computation. The known practical and effective constant-round MPC protocol is based on obfuscated circuits, which are encrypted versions of the circuits and can only be computed once.

The obfuscated circuit protocol is divided into four parts.

Step 1: Alice generates the obfuscated circuit

Step 2: Alice and Bob communicate

Step 3: Bob evaluates the generated obfuscated circuit

Step 4: Share the results

4.4 Semi-honest Protocol

Partially honest participants follow the execution of the protocol, but they retain the intermediate computation state of the protocol. In fact, partially honest participants only save the internal coin-tossing process (the process of generating random numbers) and all messages received from other participants. Particularly, a partially honest participant will choose a random number and operate according to a predetermined program, that is, to fairly generate random numbers and execute input and output according to a predetermined program.

The first constant-round secure two-party computation (2PC) protocol was proposed by Yao [6], which achieved semi-honest security. Yao's 2PC protocol uses garbled circuits (GC) and OT as building blocks. In the multi-party setting, constant-round MPC must handle the situation where the parties collude to deceive the honest party. Therefore, it cannot only allow one party to construct the garbled circuit, but instead, the parties should jointly construct the garbled circuit in a distributed manner.

4.5 Malicious security protocols

To achieve malicious security, some check programs need to be added. The underlying technology to ensure resistance to malicious adversary attacks is different between dishonest majority MPC and honest majority MPC. For example, MPC in the dishonest majority setting requires information-theoretic message authentication codes IT-MAC to verify the shared values among the parties.

Secure two-party computation:

For constant-round 2PC protocols, before 2017, a popular method for designing maliciously secure protocols was to use the 'Cut-and-Choose' (C&C) technique. There are two different styles of this technique.

• The first is the circuit-level C&C method introduced by Lindell and Pinkas

• The second is the gate-level C&C method introduced by Nielsen and Orlandi

Currently, the most advanced method for malicious security 2PC is to adopt the distributed obfuscation method, which is obviously superior to the two C&C methods.

Secure multi-party computation:

Dishonest majority.For constant-round MPC protocols that tolerate non-malicious destruction, some studies adopt the cut-and-choose method or the combination of BMR and SPDZ to construct MPC protocols. However, their execution efficiency is very low. Goldreich, Micali, and Wigderson (GMW) proposed a universal compiler to convert semi-honest MPC protocols into maliciously secure MPC protocols to complete the same computational tasks. However, this compiler is non-black-box, using general zero-knowledge proofs to prove the correctness of each step of the computation, thus the efficiency is not high. Later, Ishai, Prabhakaran, and Sahai (IPS) proposed a black-box compiler, where the semi-honest secure internal MPC protocol computes circuits in the OT-mixed model, and the external MPC protocol with malicious security in the honest majority setting is used to ensure the existence of malicious adversaries in the entire MPC protocol. The SPDZ framework is the most advanced protocol in the dishonest majority malicious setting. The original SPDZ protocol uses a depth 1 homomorphic encryption (HE) scheme (i.e., the underlying HE scheme can support a single multiplication) to generate verified triples in the preprocessing phase and can quickly evaluate circuits in the online phase.

The honest majority.In the honest majority setting, constant-round MPC protocols can be constructed with less communication and computation based on replicated secret sharing. In the malicious setting, we only need to check the correctness of the multiplication gates, because the addition gates are locally computed and always correct. Specifically, it can be used to verify the correctness of multiplication gates with sub-linear communication distributed zero-knowledge proofs.

5. Application scenarios of MPC

With the deep penetration of government and enterprise digital transformation, the cumulative data assets of enterprises have increased, with an average number of orders reaching 3.2PB. Massive data assets mean that there is a large amount of value waiting to be explored on the one hand, and on the other hand, it means that higher requirements are put forward for data security protection capabilities. From the perspective of various industries,Telecommunications operatorsWith the most abundant and valuable data assets, with the help of privacy computing technology, the productivity of data elements can be fully released, realizing data monetization. At the same time, it provides core infrastructure for the safe development of the data element market and empowers all industries.Government dataOpening up has become the key to improving government service. Privacy computing can enhance the data collaboration of the whole society while ensuring data security, and promote the upgrading of industrial elements.Financial enterprise dataAs a production factor, more and more business scenarios require multi-party data circulation and sharing, breaking the 'data islands'. Health and medical big data is an important basic strategic resource of the country. Through integrationMedical institutionsInternal and cross-institutional multi-source heterogeneous data, constructing a unified big data model of health and medical care, can play an important role in supporting industry governance, medical research, and public services. Especially in medical research and pharmaceutical equipment development, through the sharing and application of health and medical big data among multiple institutions, it can enhance the connectivity and collaboration in clinical trials, access, and supervision, accelerate the research and transformation of key technologies such as new-generation gene sequencing, tumor immunotherapy, stem cell and regenerative medicine, and big data analysis in biomedicine, and promote the application of precision solutions and decision support systems for early screening of major diseases and personalized treatment. Privacy computing can take into account the security and efficiency of multi-party collaboration, and the following lists some typical application scenarios of MPC in various fields:

Financial field:

Government affairs field:

Medical and Internet fields:

6. Conclusion

Governments, enterprises, organizations, and individuals are increasingly concerned about data privacy, and the solutions usually implemented are no longer able to provide sufficient protection capabilities to prevent data theft and privacy leaks. Encrypting static data is not enough to prevent data leaks, and different industries have begun to use new privacy protection technologies. The development of new technologies has made it possible to securely share data and protect personal privacy. These technologies can allow for the sharing of data usage value between enterprises and organizations, search for encrypted data in data lakes and clouds without compromising data privacy, while still maintaining the analytical quality of the data.

However, while bringing security, it is inevitable to have some loss of data analysis performance. We still need new privacy-preserving computing methods to help find new opportunities and find an appropriate balance between privacy, security, and compliance.

Author: Yang Bo, Privacy Computing Product Department, JD Technology

7. References

1. Privacy Computing Alliance, 'Research Report on Privacy Computing Applications (2022)',https://mp.weixin.qq.com/s/B-z_z_LqvHsj9Xg9irknaQ

2. Privacy Computing Alliance, 'Research Report on Trusted Privacy Computing (2022)',https://mp.weixin.qq.com/s/ItTFfLP3h8mTYZoCytyvYg

3. Shamir A. (1979) 'How to Share a Secret'. Communications of the ACM, 22, 612-613.

4. Yao A C. 'How to Generate and Exchange Secrets'. FOCS 1986: 162-167

5. Gennaro R, Gentry C, Parno B. 'Non-interactive Verifiable Computing: Outsourcing Computation to Untrusted Workers'[M]// Advances in Cryptology – CRYPTO 2010. Springer Berlin Heidelberg, 2010: 465-482.

6. Rabin M O. 'How to Exchange Secrets by Oblivious Transfer'[J]. Technical Memo TR-81, 1981.

7. Feng D and Yang K. 'Concretely efficient secure multi-party computation protocols: survey and more'. Security and Safety 2022; 1: 2021001. https://doi.org/10.1051/sands/2021001

8. China Academy of Information and Communications Technology, 'White Paper on Privacy Computing (2021)',

你可能想看：

Analysis of a Separated Storage and Computing Lakehouse Architecture Supporting Multi-Model Data Analysis Exploration (Part 1)

snapchat hackers for hire(Snapchat Hack)

4.5 Main person in charge reviews the simulation results, sorts out the separated simulation issues, and allows the red and blue teams to improve as soon as possible. The main issues are as follows

As announced today, Glupteba is a multi-component botnet targeting Windows computers. Google has taken action to disrupt the operation of Glupteba, and we believe this action will have a significant i

Data security： Data does not land, presenting a variety of allure and resistance, half push and half pull

How to make ConcurrentMapCacheManager support automatic deletion of expired items

5. Collect exercise results The main person in charge reviews the exercise results, sorts out the separated exercise issues, and allows the red and blue sides to improve as soon as possible. The main

The rise of multi-vector attacks- An increasingly complex security landscape

Analysis of PyTorch library RPC framework deserialization RCE vulnerability (CVE

Fpicker： A suite of fuzz testing tools that support multiple modes