1. PPML: Creating an Efficient and Secure AI Experience


As of June 30th, the Confidential Computing Summit 2023 had concluded in San Francisco, USA. Organized by the Confidential Computing Consortium, the summit promoted the adoption of confidential computing in industries such as healthcare and finance by highlighting outstanding solutions and practical cases. It drew global cloud service providers, confidential computing hardware and software vendors, and academic experts from institutions such as MIT and ETH Zurich. More than ten enterprises and institutions shared industry cases.

The ByteDance Security Research Team and the Intel BigDL Team attended the summit and, for the first time, publicly showcased the latest capability of Jeddak Sandbox (Jeddak Data Security Sandbox): PPML (Privacy-Preserving Machine Learning). By showing attendees how PPML helps users break through 'data silos', the teams presented the product strengths and customer value of Jeddak Sandbox: it provides privacy and security guarantees for all parties' data throughout the machine learning process, so that data is 'available but invisible' and its use remains compliant.

1. PPML: Creating an Efficient and Secure AI Experience

Jeddak Data Security Sandbox integrates commonly used machine learning engines and provides machine learning capabilities that support multi-source data and are customizable, debuggable, efficient, and easy to use. These capabilities help users solve privacy compliance issues in various AI scenarios and make full use of the value of their data. At present, the sandbox serves multiple internal and external business modeling and prediction scenarios, protecting the privacy of all parties' data across its full lifecycle.
In building the product, the sandbox team has worked closely with the Intel BigDL team, integrating the security enhancements and performance optimizations that BigDL provides to improve the product experience.

  • The sandbox integrates the BigDL team's acceleration solutions (such as BigDL Nano), allowing users to complete computing tasks faster and improving business execution efficiency.
  • The sandbox adopts BigDL's privacy enhancement solution, which integrates commonly used big data analysis and machine learning frameworks with the TEE at low cost and lets the sandbox offer richer product functions on that basis.

2. Powerful Joint Modeling Tools

The sandbox provides a powerful joint modeling tool that lets both data owners and experienced algorithm providers carry out privacy-protected modeling according to the needs of their actual scenarios and obtain high-quality models.

  • Simple and easy to use. The sandbox PPML has a variety of built-in machine learning algorithms, including logistic regression, XGBoost, and general neural network models, to help users carry out standardized modeling. Users do not need to write complex code: after configuring data and parameters through a graphical interface, they can start modeling. The sandbox can also provide real-time training metrics and evaluation results on request, giving users an accurate basis for optimizing their models.
  • Flexible and customizable. For more complex scenarios, the sandbox supports customized modeling, allowing users to develop their own training scripts. The sandbox also provides debugging capabilities: to keep the real data secure, scripts are run and debugged against debugging data simulated from the real data, which helps users quickly locate and solve problems during development.
  • Multiple optimizations. The sandbox has been optimized for usability, security, and efficiency. For example, it supports joint modeling over multi-party data and provides functions such as data alignment and I/O encryption to help users handle data. For performance, the sandbox combines the acceleration and distributed training features provided by BigDL Nano to improve training efficiency. Distributed training has also received targeted security hardening, such as using RA-TLS to protect communication between distributed nodes.
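The data alignment step mentioned in the list above can be pictured with a minimal sketch. The article does not describe how the sandbox implements it, so this is only an illustration: the function name align_records, the record layout, and the sample IDs are all hypothetical. It assumes one party holds feature vectors and the other holds labels, and that the join runs inside the trusted environment.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: each party maps a shared sample ID to its own columns.
Features = List[float]


def align_records(
    party_a: Dict[str, Features],
    party_b: Dict[str, int],
) -> Tuple[List[str], List[Features], List[int]]:
    """Inner-join two parties' data on the sample IDs they have in common.

    party_a holds feature vectors and party_b holds labels. IDs present on
    only one side are dropped, and the output is sorted by ID so that both
    columns line up deterministically.
    """
    common_ids = sorted(party_a.keys() & party_b.keys())
    features = [party_a[sample_id] for sample_id in common_ids]
    labels = [party_b[sample_id] for sample_id in common_ids]
    return common_ids, features, labels


if __name__ == "__main__":
    a = {"u1": [0.2, 1.0], "u2": [0.7, 0.0], "u4": [0.1, 0.5]}
    b = {"u2": 1, "u3": 0, "u4": 0}
    ids, x, y = align_records(a, b)
    print(ids, x, y)
```

Only the samples that both parties know about ("u2" and "u4" here) survive the join; the aligned features and labels could then be passed to any of the built-in algorithms.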

3. Efficient Online Prediction Service

To meet users' needs for in-depth data analysis and prediction, and to respond quickly as their data changes, the sandbox provides online prediction capabilities. Users can call trained machine learning models through the API provided by the sandbox to make predictions in real time.

  • Focus on performance enhancement. The sandbox has undergone a series of optimizations to make prediction more efficient. First, it uses a performance-optimized online prediction framework. Second, it adopts a distributed architecture to process highly concurrent requests quickly. In addition, it makes full use of the model optimization strategies provided by BigDL Nano, such as IPEX (Intel Extension for PyTorch), JIT, and model quantization based on half-precision (BF16) instructions, to improve prediction efficiency.
  • Fully ensure security. Beyond prediction efficiency, online prediction in the sandbox has been designed with targeted security measures. First, the sandbox supports end-to-end communication encryption, ensuring that users' requests are decrypted only within the TEE. Second, the sandbox adds authentication and authorization mechanisms for model access, so that only authorized users can reach the service, which protects the model's intellectual property.

As a result, the sandbox can deploy trained models quickly, securely, and accurately, giving users a prediction experience that is both secure and efficient.
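The authentication and authorization gate in front of a model can be sketched as follows. This is not the sandbox's actual mechanism, which the article does not detail; it is a minimal illustration assuming an HMAC-based access token. The key, the user list, and the names issue_token and predict_if_authorized are hypothetical, and the sketch does not cover end-to-end encryption or TEE attestation.

```python
import hashlib
import hmac
import json
from typing import Callable, Sequence

# Hypothetical values for illustration only. A real deployment would keep the
# key inside the TEE and manage the user list through a proper identity service.
SECRET_KEY = b"demo-key-held-inside-the-tee"
AUTHORIZED_USERS = {"alice"}


def issue_token(user: str, model_id: str, key: bytes = SECRET_KEY) -> str:
    """Return an HMAC-SHA256 tag that binds a user name to one model ID."""
    message = json.dumps([user, model_id]).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def predict_if_authorized(
    user: str,
    model_id: str,
    token: str,
    model: Callable[[Sequence[float]], float],
    features: Sequence[float],
) -> float:
    """Authenticate the token, check authorization, then run the model."""
    expected = issue_token(user, model_id)
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8")):
        raise PermissionError("invalid token")
    if user not in AUTHORIZED_USERS:
        raise PermissionError("user is not authorized to call this model")
    return model(features)


if __name__ == "__main__":
    def toy_model(x: Sequence[float]) -> float:
        return 0.5 * x[0] - 0.25 * x[1]  # stand-in for a trained model

    token = issue_token("alice", "demo-model")
    print(predict_if_authorized("alice", "demo-model", token, toy_model, [2.0, 2.0]))
```

The two checks are kept separate on purpose: a caller with a forged token fails authentication, while a caller with a valid token who is not on the access list fails authorization, and in both cases the model is never invoked.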

4. Performance Summary

The security sandbox team, together with Intel BigDL, conducted a series of end-to-end performance tests on the sandbox's modeling and prediction capabilities. The test results are shown in the figure below.

[Figure: end-to-end performance test results]

The results show that adopting TEE technology affects the sandbox's modeling and prediction performance to some extent, but the TEE-based solution does not differ significantly from the native solution: the baseline performance loss is below 10% (see the Baseline section of the figure).

Optimization more than makes up for the performance loss introduced by the TEE. Test results show that, in the Nano-optimized version, the sandbox's modeling and prediction performance is 3 to 4 times that of the native solution (compare the Baseline results with the results after Nano optimization).
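To make the two figures concrete, the short sketch below shows one way to compute them. The throughput numbers are hypothetical placeholders chosen to fall within the ranges quoted above, not measurements from the tests, and the function names are illustrative.

```python
def relative_overhead(native: float, in_tee: float) -> float:
    """Fraction of native throughput lost by running the same workload in a TEE."""
    return 1.0 - in_tee / native


def speedup(native: float, optimized: float) -> float:
    """How many times faster the optimized run is than the native baseline."""
    return optimized / native


if __name__ == "__main__":
    # Hypothetical throughputs in samples per second (not measured values).
    native, tee_baseline, tee_nano = 100.0, 92.0, 350.0
    print(f"TEE overhead: {relative_overhead(native, tee_baseline):.0%}")
    print(f"Nano speedup: {speedup(native, tee_nano):.1f}x")
```

With these placeholder numbers, the unoptimized TEE run loses under 10% of native throughput, while the Nano-optimized run inside the TEE lands between 3 and 4 times the native baseline.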


5. Summary and Outlook

The Jeddak Data Security Sandbox will continue to follow industry developments and technology trends, innovating and optimizing to provide users with a more secure, efficient, and user-friendly PPML solution. The sandbox will also expand algorithm support, improve its customized modeling and debugging capabilities, and simplify operations, so that users can develop, debug, and use models more conveniently and quickly.
Meanwhile, the sandbox team is studying how to combine TEE and GPU capabilities into a full-chain trusted security solution spanning CPU and GPU, in order to improve the efficiency of modeling and prediction. The sandbox will also keep exploring new application scenarios, including large language models (LLMs), promote the research and application of cutting-edge technologies, and help users better meet their business needs and challenges.
