2. Use ShardingSphere-Proxy to implement sensitive data encryption

For enterprise data security governance, it is not enough to know the applicable laws and regulations, minimize the information collected, and state privacy agreements clearly at service entry points. More importantly, an enterprise needs to build internal foundations: data recognition, classification and grading, data encryption, and permission control. These are the basic capabilities of data security.

This article takes a data-centric view. It walks through data recognition, classification and grading, and basic protection, and demonstrates each with open-source software, in the hope of giving readers who need it a concrete first look at data security.

Data recognition: build a data asset dashboard on top of it to manage risk identification, monitoring, and operations across the full lifecycle of data assets.

Classification and grading: classify and grade data assets, and concentrate the best resources on protecting key assets.

Basic protection: beyond a secure and stable infrastructure and architecture, use the results of recognition and classification to encrypt sensitive data in storage and in transit, control account permissions, desensitize data, and manage data distribution. Combined with tracking of internal and external risk changes, this leads toward data security risk control.

1. Data recognition and classification

In the era of big data, many enterprises struggle to recognize and classify their data well and to build full-lifecycle management of data assets on that basis. For example: how many databases in the enterprise store phone numbers in plaintext, how many interfaces expose phone number fields externally, what risks do those databases and interfaces face, and how can those risks be controlled across the full lifecycle? Recognition has to cover structured data (fields in database tables), semi-structured data (log contents), and unstructured data (images, audio, and video files). For many enterprises, the breadth of recognition capability, recognition accuracy, and performance impact all remain considerable challenges.

1.1. Content recognition example

Data recognition can be implemented with keywords, regular expressions, algorithms, and similar techniques. Many online articles cover the topic, and some large companies have mature recognition technologies and solutions. The implementation depends mainly on the business scenario. By data type, recognition divides into structured, semi-structured, and unstructured data:

Structured: Relational databases

Semi-structured: Log data, JSON data, XML documents, etc.

Unstructured: HTML web pages, office documents, images, audio and video files, etc.
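
As a minimal sketch of keyword and regular-expression recognition on semi-structured data, the following scans a log line for phone numbers and e-mail addresses. The rule names, the patterns, and the sample log line are assumptions made for illustration; real rules need tuning against actual data and will otherwise produce false positives.

```python
import re

# Hypothetical rule set: label -> compiled pattern.
# The patterns are deliberately simple and are not production-grade.
RULES = {
    # 11-digit mainland China mobile number, not embedded in a longer digit run
    "phone_cn": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def recognize(text):
    """Return a sorted list of (label, matched_text) pairs found in text."""
    hits = []
    for label, pattern in RULES.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return sorted(hits)


# Illustrative log line, not taken from a real system.
log_line = 'user=3 phone=18516014944 contact="ops@example.com" order_id=13'
for label, value in recognize(log_line):
    print(label, value)
```

The same function can be pointed at column samples from a database table (structured data); unstructured data such as images would need a different technique first, for example text extraction.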

1.2. Classification and tiered management display
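
As a hypothetical illustration of grading, a table can be graded by its most sensitive column. The level names, the type-to-level mapping, and the default for unknown types are all assumptions for this sketch, not a scheme defined in this article.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical sensitivity levels, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    SENSITIVE = 3
    RESTRICTED = 4


# Hypothetical mapping from a recognized data type to its level.
TYPE_LEVELS = {
    "phone": Level.SENSITIVE,
    "email": Level.SENSITIVE,
    "id_card": Level.RESTRICTED,
    "order_id": Level.INTERNAL,
}


def grade_table(column_types):
    """Grade a table by its most sensitive column.

    column_types maps column name -> recognized data type label.
    Unknown types default to INTERNAL in this sketch.
    """
    per_column = {
        column: TYPE_LEVELS.get(data_type, Level.INTERNAL)
        for column, data_type in column_types.items()
    }
    table_level = max(per_column.values(), default=Level.PUBLIC)
    return table_level, per_column


table_level, per_column = grade_table(
    {"order_id": "order_id", "user_id": "user_id", "phone": "phone"}
)
print(table_level.name)
```

Grading by the maximum keeps the rule simple: a single sensitive column is enough to raise the protection level of the whole table.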

2. Use ShardingSphere-Proxy to implement sensitive data encryption

ShardingSphere is an Apache top-level open-source project aimed at building standards and ecosystems on top of heterogeneous databases. It focuses on how to fully and reasonably utilize the computational and storage capabilities of databases rather than implementing a new database. ShardingSphere stands at the upper level of the database, focusing more on their collaboration than on the database itself.

Connection, incremental, and pluggable are the core concepts of Apache ShardingSphere.

  • Connection: Connect applications with multi-model heterogeneous databases quickly through flexible adaptation to database protocols, SQL dialects, and database storage.
  • Incremental: Intercept database access traffic and transparently add functions on top of it, such as traffic redirection (data sharding, read/write splitting, shadow database), traffic transformation (data encryption, data desensitization), traffic authentication (security, audit, permissions), traffic governance (circuit breaking, throttling), and traffic analysis (service quality analysis, observability).
  • Pluggable: The project adopts a microkernel + three-layer pluggable model, making the kernel, functional components, and ecological connection fully capable of being flexibly expanded in a plug-and-play manner. Developers can customize their unique systems as if using building blocks.

ShardingSphere-Proxy is positioned as a transparent database proxy. It is a standalone server that implements database binary protocols, so applications written in any language can connect to it. It currently supports the MySQL and PostgreSQL protocols.

Installation

Download the latest release of ShardingSphere-Proxy and unpack it. Edit conf/server.yaml and the files prefixed with config-; field encryption is configured in conf/config-encrypt.yaml. Other configuration, such as sharding rules and read/write splitting rules, is not discussed here. On Linux, start the proxy with bin/start.sh; the proxy port can be passed as an argument, for example bin/start.sh 3308.

Sensitive field configuration

(base) gengdeMacBook-Pro:conf js2thon$ mysql -h127.0.0.1 -uroot -P3308
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 13
Server version: 8.0.20-Sharding-Proxy 4.1.0
Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show tables;
+----------------------+------------+
| Tables_in_encrypt_db | Table_type |
+----------------------+------------+
| t_encrypt            | BASE TABLE |
+----------------------+------------+
1 row in set (0.03 sec)
mysql> select * from t_encrypt;
+----------+---------+-------------+
| order_id | user_id | phone       |
+----------+---------+-------------+
|       10 |       0 | 18516014911 |
|       11 |       1 | 18516014922 |
|       12 |       2 | 18516014933 |
|       13 |       3 | 18516014944 |
|       14 |       4 | 18516014955 |
+----------+---------+-------------+
5 rows in set (0.09 sec)
mysql> select * from t_encrypt;
+----------+---------+--------------------------+-------------+
| order_id | user_id | phone_cipher             | phone       |
+----------+---------+--------------------------+-------------+
|       10 |       0 | uFZ1RGQfxsUM+GUJqI5rlQ== | 18516014911 |
|       11 |       1 | SGxnMaUHY/HR50hJcYp6Vg== | 18516014922 |
|       12 |       2 | Z5NBefdS9WN3Bl6p45R1Dw== | 18516014933 |
|       13 |       3 | SKqYOUF4dxloUH5M9t/wEg== | NULL        |
|       14 |       4 | 4q+dOa+bxUTFSzX6AOjvUg== | NULL        |
+----------+---------+--------------------------+-------------+

The first result set is what a client sees through the proxy: the logical phone column, returned in plaintext. The second appears to be the same table queried with its physical columns, as stored in the backend database: phone_cipher holds the ciphertext, and the plaintext phone column is NULL for rows 13 and 14.

3. Implementing database dynamic credentials and data encryption/decryption interface calls using Vault

HashiCorp Vault addresses the problem of managing sensitive information such as database credentials and API keys, which must be stored securely and delivered securely to applications. Vault supports many secrets engines, including:

Key-Value: Simple static key-value pairs

Dynamically generated credentials: Generated by Vault based on client requests

Encryption keys: Used to encrypt and decrypt data supplied by clients

3.1. Database dynamic credentials

Installing and configuring Vault is relatively simple, so only the relevant configuration steps are listed here:

Database link configuration

Role configuration
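
As a rough sketch of these two steps through Vault's HTTP API, the following builds the requests for a database connection and a read-only role. It assumes the database secrets engine is mounted at database/; the Vault address, token, connection name, role name, and MySQL admin account are placeholders, and the legacy MySQL plugin is chosen only because the server shown later in this section is MySQL 5.6. The requests are built but not sent.

```python
import json
import urllib.request

VAULT_ADDR = "http://127.0.0.1:8200"  # assumption: local Vault server
VAULT_TOKEN = "replace-me"            # assumption: token allowed to write database/*
CONNECTION = "my-mysql-database"      # hypothetical connection name
ROLE = "my-role"                      # hypothetical role name


def vault_post(path, payload):
    """Build (but do not send) a POST request against Vault's HTTP API."""
    req = urllib.request.Request(
        f"{VAULT_ADDR}/v1/{path}",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
    )
    req.add_header("X-Vault-Token", VAULT_TOKEN)
    req.add_header("Content-Type", "application/json")
    return req


def connection_config(admin_user, admin_password):
    """Payload for database/config/<name>: how Vault reaches MySQL."""
    return {
        # The legacy plugin targets older MySQL releases with short user names.
        "plugin_name": "mysql-legacy-database-plugin",
        "connection_url": "{{username}}:{{password}}@tcp(127.0.0.1:3306)/",
        "allowed_roles": [ROLE],
        "username": admin_user,
        "password": admin_password,
    }


def readonly_role(database="mysql_test", ttl="1h"):
    """Payload for database/roles/<name>: SQL that Vault runs per credential."""
    return {
        "db_name": CONNECTION,  # name of the Vault connection, not the schema
        "creation_statements": [
            "CREATE USER '{{name}}'@'%' IDENTIFIED BY '{{password}}';",
            "GRANT SELECT ON " + database + ".* TO '{{name}}'@'%';",
        ],
        "default_ttl": ttl,
        "max_ttl": ttl,
    }


requests_to_send = [
    vault_post(f"database/config/{CONNECTION}",
               connection_config("vault_admin", "change-me")),
    vault_post(f"database/roles/{ROLE}", readonly_role()),
]
for req in requests_to_send:
    # To apply against a running Vault: urllib.request.urlopen(req, timeout=5)
    print(req.get_method(), req.full_url)
```

Granting only SELECT in the role's creation statements is what limits every account Vault later generates for this role to read-only access.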

Get database credentials
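
A minimal sketch of requesting credentials over Vault's HTTP API follows, assuming the database secrets engine is mounted at database/ and a role named my-role exists (the role name, address, and token are placeholders). The sample response is illustrative and only mirrors the fields the code reads; its values are not real output.

```python
import json
import urllib.request

VAULT_ADDR = "http://127.0.0.1:8200"  # assumption: local Vault server
VAULT_TOKEN = "replace-me"            # assumption: token allowed to read database/creds/*
ROLE = "my-role"                      # hypothetical role name


def parse_credentials(body):
    """Extract the generated account and its lease from a creds response."""
    return {
        "username": body["data"]["username"],
        "password": body["data"]["password"],
        "lease_id": body["lease_id"],
        "ttl_seconds": body["lease_duration"],
    }


def fetch_credentials(role=ROLE):
    """Ask Vault for a fresh database account (needs a running Vault)."""
    req = urllib.request.Request(
        f"{VAULT_ADDR}/v1/database/creds/{role}",
        headers={"X-Vault-Token": VAULT_TOKEN},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return parse_credentials(json.load(resp))


# Illustrative response body with placeholder values; no server is contacted.
sample = json.loads("""
{
  "lease_id": "database/creds/my-role/example-lease",
  "lease_duration": 3600,
  "renewable": true,
  "data": {"username": "v-my-r-example", "password": "example-password"}
}
""")
creds = parse_credentials(sample)
print(creds["username"], creds["ttl_seconds"])
```

Each call to fetch_credentials returns a different account, and Vault revokes that account when the lease expires, so applications should not cache the credentials beyond ttl_seconds.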

Use dynamic credentials for login verification

(base) js2thondeMacBook-Pro:Downloads js2thon$ mysql -u v-my-r-owFmZ3LFu -pM8DdaYZXYRU-rNIm2CbQ
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1379
Server version: 5.6.41-log MySQL Community Server (GPL)
Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>

In Vault, this user was configured with SELECT permission only. Queries work as expected:

mysql> use mysql_test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+----------------------+
| Tables_in_mysql_test |
+----------------------+
| account              |
+----------------------+
mysql> select * from account;
+------+------+
| id   | name |
+------+------+
|  100 | abc  |
+------+------+
1 row in set (0.00 sec)

The INSERT fails: MySQL denies it because the account Vault generated was granted SELECT only.

mysql> insert into account values(101,'def');
ERROR 1142 (42000): INSERT command denied to user 'v-my-r-owFmZ3LFu'@'localhost' for table 'account'

3.2. Data encryption and decryption interface calls

Acting as a KMS, Vault exposes encryption and decryption interfaces, and applications encrypt and decrypt data by calling them. For creating data keys in Vault, see the official documentation; it is not covered here.

Key creation in Vault management backend

Encryption and decryption interface calls are implemented in Python
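
A minimal sketch of such calls follows, assuming the key lives in Vault's transit secrets engine mounted at transit/. The key name phone-key, the address, and the token are placeholders. Only the encoding helpers run here; encrypt and decrypt need a running Vault.

```python
import base64
import json
import urllib.request

VAULT_ADDR = "http://127.0.0.1:8200"  # assumption: local Vault server
VAULT_TOKEN = "replace-me"            # assumption: token with access to transit/*
KEY_NAME = "phone-key"                # hypothetical transit key name


def encrypt_payload(plaintext):
    """Transit expects the plaintext base64-encoded."""
    encoded = base64.b64encode(plaintext.encode("utf-8")).decode("ascii")
    return {"plaintext": encoded}


def decode_plaintext(response_body):
    """Decrypt responses carry base64-encoded plaintext under data.plaintext."""
    return base64.b64decode(response_body["data"]["plaintext"]).decode("utf-8")


def post(path, payload):
    """POST JSON to Vault and return the parsed response body."""
    req = urllib.request.Request(
        f"{VAULT_ADDR}/v1/{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Vault-Token": VAULT_TOKEN, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)


def encrypt(plaintext):
    """Return the ciphertext string Vault produces for plaintext."""
    body = post(f"transit/encrypt/{KEY_NAME}", encrypt_payload(plaintext))
    return body["data"]["ciphertext"]


def decrypt(ciphertext):
    """Return the original plaintext for a ciphertext produced by encrypt()."""
    body = post(f"transit/decrypt/{KEY_NAME}", {"ciphertext": ciphertext})
    return decode_plaintext(body)


# Offline round trip of the encoding helpers; no Vault server is contacted.
payload = encrypt_payload("18516014911")
print(decode_plaintext({"data": payload}))
```

With this approach the key never leaves Vault: the application stores only the returned ciphertext and calls decrypt when it needs the original value.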

4. Issues and Thoughts

This article has walked through data recognition, classification and grading, configuration management, and encryption of stored fields; in practice, each of these needs considerable extension both horizontally and vertically. Unified key management, account permissions, an asset dashboard, risk monitoring, and operational management together make up the basic capabilities of data security. The tools listed above cover only some of what open-source software offers, and only simple single-point demonstrations were done, without a detailed study of implementation mechanisms or trade-offs. We plan to research these in more depth later, and we hope to exchange ideas with peers who have practical implementation experience, for example on architecture design, heterogeneous adaptation, and performance and stability.