Discussion on Zero Trust Network Construction and Some Details (Part Two)

This article runs to about 7,000 words. It continues 'Discussion on Zero Trust Network Construction and Some Details' and describes the difficulties we encountered, and the solution designs we arrived at, during the second phase of our zero trust network construction. The first phase focused on sensible isolation of network domains, unified layer 7 access, and unified 4A capabilities at the layer 7 gateway. The second phase explored secure terminal access, secure layer 4 network access, continuous risk control for terminals, and boundaryless office capabilities. We share these difficulties and construction methods here in the hope of discussing them with readers.

01 Background of Construction

In the first phase of Beike's zero trust construction, we implemented zero trust control of layer 7 traffic and gained fine-grained control over layer 7 protocol traffic. On this basis, we can better implement unified 4A capabilities (account, authentication, authorization, audit) for layer 7 inbound traffic. However, when 'people' use 'terminals' to access 'internal networks' and 'business resources', the following issues remain:

Terminal security baseline is uncontrollable:

The first phase concentrated on converging network access at the zero trust gateway and paid little attention to controlling devices. Whether a connected device meets the basic security baseline (for example, whether required security patches are installed, high-risk ports are closed, and required antivirus software is installed) often matters a great deal in a complex internal network.

Terminal security risks are not perceptible:

Beyond the basic security baseline, device security often needs continuous monitoring tailored to the company's internal network and business: whether a computer shows lateral movement activity, connects to external mining pools, crawls data at scale, and so on. Terminals therefore also need continuous risk assessment and control.

Lack of terminal data security protection capabilities:

Data security is becoming more important and its risks more prominent, and many companies face the risk of data leaking from terminals. The need for DLP (Data Loss Prevention) is now a consensus in terminal data security governance. One goal of this phase is to combine the terminal security baseline, continuous terminal risk control, and DLP so that the risk of terminal data leakage can be both perceived and prevented.

Boundaryless office capabilities need to be improved:

Our definition of boundaryless office work has two aspects:

(1) The first is removing the boundary outside the company. Because of the pandemic and other factors, working remotely from home has become a necessity. The traditional solution is to reach the internal network through a VPN, but VPN authorization management is flat and the attack surface spreads easily. The layer 4 encrypted tunnel built in the first phase only provided 'VPN' functionality and lacked fine-grained permission management and continuous terminal risk control. Without these security capabilities, it is hard to open the whole internal network with confidence. The traditional practice is therefore to open only some network segments to each user group, which conflicts with the reality that employees cannot work on site during the pandemic.

(2) The second is removing the boundary inside the company without giving up security. For security reasons, companies generally divide the network into office, test, and production segments and isolate them with policies. Product and R&D teams, however, often need cross-domain access over layer 4 protocols such as database connections and SSH. The conventional approach is to open IP-based whitelists on firewalls, but this exposes sensitive resources and cannot scope authorization to individuals. Meeting these R&D access needs while staying secure is a major challenge we need to solve.

Building on the zero trust construction described above, we have extended the panoramic view of our zero trust architecture to cover these problems and difficulties.

In the figure above, the modules introduced in the first article of this series are not covered again. This article explains the modules added to solve the problems above (marked with orange numbers in the figure). Their functions and features are summarized below:

Terminal Security Three-dimensional Prevention and Control:

As shown in the module marked with orange number 1, to address imperceptible terminal security risks and missing data security protection, we added UEM, antivirus, vulnerability scanning, DLP, and other functional modules to the zero trust client. The client, which originally served only as a network tunnel, now hosts a range of terminal security and data security modules, and it handles module integration, information collection, policy distribution, and security control in a unified way.

Improvement of terminal information collection and policy distribution:

As shown in the module marked with orange number 2, newly added terminals are registered and managed in a unified way, and accounts are bound to devices, for example by limiting the number of devices logged in at the same time or binding frequently used devices.

Layer 7 Access Service:

As shown in the module marked with orange number 3, layer 7 provides unified 4A capabilities for business services through the unified gateway and can coordinate with the zero trust client to obtain its security baseline, login information, and device status. In addition, for remote work from external networks and for access to highly sensitive internal resources, layer 7 services are placed behind the layer 4 gateway to achieve stronger security control over accounts, devices, and resources.

Fine-grained authorization and control of layer 4 network traffic:

As shown in the module marked with orange number 4, to improve boundaryless office capability and achieve fine-grained authorization and control of layer 4 traffic, we added a fine-grained layer 4 authorization and control engine to the decision center. It can manage authorization for any layer 4 resource (IP, port, protocol) by individual, organization, IP, and environment, which is finer-grained than what traditional VPNs and firewalls offer.

Risk Control Integration of Layer 7 and Layer 4 Zero Trust Networks:

As shown in the module marked with orange number 5, to strengthen dynamic authorization and control, dynamic risk control and authorization for layer 4 traffic is integrated on top of the existing dynamic risk control and authorization for layer 7 traffic. The risk control and audit capabilities of layer 4 and layer 7 are connected, and the two layers share one continuous risk control model.

The newly added modules and related content are the focus of this article. For the design of the other modules, see the previous article in this series, 'Zero Trust Network Construction and Some Details Discussion (Part 1)'; it is not repeated here.

02 Terminal Security Awareness and Access Control

The risks that terminals may face fall into the following categories:

Risk	Domain	Governance Component
Malicious Programs	Terminal Security	Antivirus Engine
System Vulnerabilities	Terminal Security	Vulnerability Scanning
Illegal Software	Software Compliance	Software Management
Data Leakage	Data Security	Terminal DLP

To ship quickly and limit the R&D cost of the client, all modules are purchased from external vendors, and we agree API calling conventions with each vendor for issuing policies and querying security results.

The self-developed zero trust client integrates the modules, collects their information, and applies control. Each security module detects and reports terminal security status, which provides the basis for the security baseline checks that follow. The baseline check policies are as follows:

Policy Name	Description
Integrity of Security Components	Devices joining the network must have all security components installed and intact, and the components must stay logged in and running. Otherwise, access to office resources is prohibited.
Malware Check	Devices joining the network must not be infected with malicious programs such as viruses, trojans, or mining programs. Malware must be detected and removed; otherwise, access to office resources is prohibited. The policy center also issues malware scan-and-remove tasks on a regular schedule to check and report the current terminal security status.
System Vulnerability Check	Devices joining the network must have no high-risk or medium-risk vulnerabilities. Vulnerabilities must be patched; otherwise, access to office resources is prohibited. The policy center also issues system vulnerability check tasks on a regular schedule to check and report the current terminal security status.
Illegal Software	Software compliance check. If non-compliant software is installed, login to the network is prohibited.
Data Leakage	The device's data leakage risk is evaluated from the real-time DLP check results. Devices with data leakage are prohibited or restricted from accessing internal network resources.

The zero trust client performs a unified integrity check across these modules: it verifies the integrity of each component program and obtains each component's current login and running status. The results are reported periodically to the device information management service.

When a user terminal accesses layer 7 office resources, the admission control service coordinates the modules as follows:

When a user tries to access office resources through a browser, layer 7 zero trust access control requires the user to log in first. JS code embedded in the HTML page of the login service calls the local service interface of the zero trust client to obtain the device number.

With the device number, the admission service queries the device's security attributes, such as the integrity of security components, vulnerability scan results, malware infection status, and data leakage status, and performs a comprehensive terminal security assessment. If the terminal does not meet the security conditions for network access, the login fails and access to office OA resources is denied. This creates a checkpoint for terminal security admission.
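The admission checkpoint can be sketched in a few lines of Python. This is only an illustration under assumptions: the `DeviceStatus` fields and check names are hypothetical stand-ins for the attributes the admission service queries, and the data leakage policy is simplified to "prohibit" even though the table above also allows "restrict".

```python
from dataclasses import dataclass


@dataclass
class DeviceStatus:
    """Security attributes reported for one device (field names are hypothetical)."""
    components_intact: bool      # all security components installed, logged in, running
    malware_found: bool          # result of the latest antivirus scan
    open_vuln_levels: frozenset  # severities of unpatched vulnerabilities, e.g. {"high"}
    illegal_software: bool       # non-compliant software detected
    data_leak_risk: bool         # DLP flagged the device


def admission_failures(status: DeviceStatus) -> list:
    """Return the names of the baseline checks the device fails (empty list = admit)."""
    failures = []
    if not status.components_intact:
        failures.append("component_integrity")
    if status.malware_found:
        failures.append("malware")
    # Only high-risk and medium-risk vulnerabilities block admission.
    if status.open_vuln_levels & {"high", "medium"}:
        failures.append("system_vulnerability")
    if status.illegal_software:
        failures.append("illegal_software")
    if status.data_leak_risk:
        failures.append("data_leakage")
    return failures


def may_log_in(status: DeviceStatus) -> bool:
    """Login succeeds only when every baseline check passes."""
    return not admission_failures(status)
```

Returning the list of failed checks, instead of a bare boolean, would let the login page tell the user which baseline item to fix.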

03 Layer 4 Network Admission and Authorization Control

To secure the terminal's internal network traffic, reduce the risk of traffic hijacking and eavesdropping, and enforce access control and traffic admission at layer 4, the zero trust client can establish a secure tunnel for internal network access. Identity credentials and the original data packets are encapsulated in TLS and sent to the zero trust layer 4 gateway. The gateway verifies the device and identity credentials, queries the authorization result, and then decides whether to forward the traffic to the backend service. The technical details were described in the first article of this series and are not repeated here.

However, because the office network environment is complex and the zero trust layer 4 gateway is under performance pressure, among other reasons, the range of traffic proxied by the terminal must differ by scenario. The overall policy is as follows:

Terminal environment	Access target	Proxy controlled network segments
External network	Internal network resources	All internal network segments.
Internal network (office network)	Test, preview, and production environment resources	Test, preview, and production network segments.
Internal network (office network)	Highly data-sensitive resources	Highly sensitive service network segment.
Internal network (office network)	Office network	No proxy control is performed.

The zero trust client can detect the current network environment and configures proxy routing accordingly.

When employees work from an external network, the zero trust client proxies access to all of the company's internal network segments, which enables boundaryless work from outside.

When employees work inside the office network, the zero trust client proxies internal network segments selectively:

Perform network proxy when accessing across firewall security domains:

As described in the first article of this series, before implementing zero trust, we divided and isolated the company's internal network segments through firewalls. The company's internal network is divided into different levels of security domains such as office, test, preview, and production. Direct access between different security domains is not allowed.

In practice, however, R&D colleagues need cross-segment access, for example to test and debug preview environment services from the office environment. Before layer 4 zero trust was in place, we met these needs by temporarily opening the firewall. But a firewall can only authorize on the five-tuple and cannot authorize at a fine granularity by person, organization, access device, and so on. With the zero trust client and the layer 4 zero trust gateway delivered in the second phase, fine-grained authorization is achieved by configuring layer 4 zero trust authorization policies, which replaces the old process of applying for firewall whitelist entries.

Therefore, access across isolated network segments must be proxied through layer 4 zero trust, which applies fine-grained permission control.

Highly data-sensitive services require a network proxy:

To protect data, access to highly data-sensitive services must be proxied through the zero trust terminal, and access control follows the configured layer 4 authorization policy. A resource can be accessed only when both authentication and authorization succeed.

Access to office network resources does not require a proxy:

Because of the accumulated complexity of the office network environment, access to office network resources is not proxied. The complexity includes the historical policies and performance optimizations the IT department has applied to the office network, IP-based configuration in the domain control service, and so on. If a user inside the office network had all internal network segments proxied, all traffic would be sent to the zero trust gateway, bypassing the office network environment and overriding the performance optimizations and other work already done there. This type of traffic is therefore not proxied.
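The routing rules in this section can be expressed as a small lookup. The Python sketch below is an assumption-laden illustration: the segment names follow the table above, but every address range is invented for the example, and a real client would receive the ranges from the policy center.

```python
import ipaddress

# Hypothetical segment plan; the real ranges are not given in this article.
SEGMENTS = {
    "office": ipaddress.ip_network("10.10.0.0/16"),
    "test": ipaddress.ip_network("10.20.0.0/16"),
    "preview": ipaddress.ip_network("10.30.0.0/16"),
    "production": ipaddress.ip_network("10.40.0.0/16"),
    "sensitive": ipaddress.ip_network("10.50.0.0/24"),  # highly sensitive services
}


def proxied_segments(client_env: str) -> list:
    """Return the segments whose traffic the client routes into the tunnel."""
    if client_env == "external":
        # Outside the company: proxy every internal segment.
        names = ["office", "test", "preview", "production", "sensitive"]
    elif client_env == "office":
        # Inside the office network: office traffic itself is left alone.
        names = ["test", "preview", "production", "sensitive"]
    else:
        raise ValueError(f"unknown environment: {client_env}")
    return [SEGMENTS[name] for name in names]


def should_proxy(client_env: str, dest_ip: str) -> bool:
    """Decide whether a destination address goes through the layer 4 tunnel."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in net for net in proxied_segments(client_env))
```

For example, `should_proxy("office", "10.10.1.5")` is false under this plan, while the same destination is proxied when the client is on an external network.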

04 Construction of convenient office capabilities

While improving security capabilities, we also need to consider the user's office experience. Therefore, for the zero-trust client, we have built the following convenient office capabilities.

Zero-trust client and office system login status connection:

This work improves the following scenario: after logging in to the zero trust client, a user who opens a web office system that requires SSO authentication has to log in a second time, which hurts the user experience. We therefore connected the login state of the zero trust client with that of the office systems. After a single login to the zero trust client, the user does not need to log in again to any office system on the network. The single sign-on point moves to the zero trust client, which stores and maintains the login state in one place. The specific process is shown in the following figure:

Boundaryless office capability:

The boundaryless office capability built on the terminal's layer 4 tunnel lets users on external networks reach internal resources more conveniently than with a VPN, while keeping access secure and authorization finer-grained.

Zero-trust client message channel:

We built a message tunnel to the terminal so that the server can issue commands and push error messages to the zero trust client. It mainly addresses the following scenario. When a user reaches layer 7 zero trust through a browser and lacks permission for the current service, an error page can explain why the service is unavailable. With layer 4 traffic control, however, a user whose access to a target resource fails cannot tell whether the cause is network unavailability, a service failure, or missing access rights, and there is no universal way to return error information to the user over TCP and UDP.

When a user does not have permission to access a certain SSH server, the traffic is intercepted by the front-end layer 4 zero-trust client and is not forwarded, so the user cannot see why the server is unreachable. We push the error information through the message tunnel to the corresponding client, and the client shows the reason to the user in a notification bubble, which gives a better user experience. The detailed process is shown in the following figure:
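As an illustration of what such a pushed message might carry, here is a minimal Python sketch. The JSON fields, the reason codes, and the bubble wording are all assumptions made for the example; the article does not specify the message format.

```python
import json

# Hypothetical reason codes; the article only says the cause is reported to the user.
REASONS = {
    "no_permission": "You are not authorized to access this resource.",
    "service_down": "The target service is not responding.",
}


def build_denial_message(user: str, dest_ip: str, dest_port: int,
                         protocol: str, reason: str) -> str:
    """Server side: serialize one error notice for the message tunnel."""
    if reason not in REASONS:
        raise ValueError(f"unknown reason: {reason}")
    return json.dumps({
        "type": "access_denied",
        "user": user,
        "resource": {"ip": dest_ip, "port": dest_port, "protocol": protocol},
        "reason": reason,
    })


def bubble_text(raw_message: str) -> str:
    """Client side: turn the pushed notice into the text shown in the bubble."""
    msg = json.loads(raw_message)
    res = msg["resource"]
    return f"{res['protocol']}://{res['ip']}:{res['port']}: {REASONS[msg['reason']]}"
```

Sending a reason code instead of free text keeps the wording under the client's control, which matters if the client is localized.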

05 Reliability

With layer 4 zero trust in place, the zero trust terminal proxies traffic from the office network to the test, preview, and production networks. To improve the overall reliability of the system, we did the following work:

The gateway is split into internal and external network clusters:

The layer 4 gateway is divided into an internal network cluster and an external network cluster. The zero trust client connects to the external cluster when it is on an external network and to the internal cluster when it is on the internal network. This covers scenarios such as a DDoS attack on the external layer 4 gateway: the internal gateway stays available, and internal users accessing resources through the zero trust client are not affected.

Three-dimensional monitoring, alerting, and degrading:

A comprehensive monitoring, alerting, and degrading design has been implemented for the four-layer gateway and its surrounding services. It achieves multi-dimensional monitoring, alerting, and degrading capabilities, as illustrated in the following figure:

First, to analyze the system in more detail and more comprehensively, we divide it into layers. For the layer 4 gateway service, these are the application layer, service layer, data layer, and system layer. We then enumerate the core business functions and peripheral dependencies of each layer. For example, in the system layer of the layer 4 gateway program itself, the core functions include authentication of incoming traffic, authorization, and traffic forwarding. In the data layer, the gateway depends on external Redis caching services.

Next, we design three-dimensional monitoring for the enumerated core functions and core dependencies. 'Three-dimensional' here means that we monitor from the user perspective and from the system perspective separately.

The user perspective means perceiving system stability the way users experience it. For example, for the core authentication function at the application layer, we monitor the login rate and the authentication success rate against the same period in the past. When either rate drops sharply compared with that baseline, an alert is raised, so problems can be detected before users report them.

The system perspective covers the stability indicators found in most systems: database errors, slow queries, timeouts, complete service unavailability, and the CPU, memory, and disk usage of the servers hosting the gateway. Alerts are raised on these indicators.
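A period-over-period check of this kind can be sketched as follows. This is a minimal Python illustration; the 20% relative-drop threshold and the function names are assumptions, not values from our monitoring system.

```python
def success_rate(successes: int, attempts: int) -> float:
    """Fraction of successful authentications; 0.0 when there was no traffic."""
    return successes / attempts if attempts else 0.0


def should_alert(current_rate: float, baseline_rate: float,
                 max_relative_drop: float = 0.2) -> bool:
    """Alert when the current rate fell more than the allowed share of the baseline.

    baseline_rate is the rate observed in the comparable historical period.
    The 0.2 default is an assumed threshold chosen only for illustration.
    """
    if baseline_rate <= 0:
        return False  # no meaningful baseline to compare against
    drop = (baseline_rate - current_rate) / baseline_rate
    return drop > max_relative_drop
```

Comparing against the same period in the past, instead of a fixed threshold, avoids false alarms from normal daily and weekly traffic patterns.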

For degradation, we first identify whether the system as a whole depends strongly on a given function or service. A strong dependency is core business and cannot be degraded. For example, authentication at the layer 4 gateway is a strong dependency for security reasons: if it failed open or were degraded, anyone could access the network, so it must never be degraded. For weak dependencies, a degradation policy can be defined. Authorization is one example: when authorization queries fail, we can accept falling back to VPN-level security and execute the corresponding degradation policy.
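The difference between the two kinds of dependency can be shown in a short Python sketch. The callables and the `degrade_authorization` switch are hypothetical; the point is only that authentication fails closed while authorization may fall back to allowing an authenticated user.

```python
class DependencyError(Exception):
    """Raised by a backing service call that failed or timed out."""


def decide(authenticate, authorize, degrade_authorization: bool) -> bool:
    """Return True when the traffic may be forwarded.

    authenticate and authorize are zero-argument callables returning bool
    and raising DependencyError when their backing service is unavailable.
    """
    # Strong dependency: any authentication failure, including an outage, denies access.
    try:
        if not authenticate():
            return False
    except DependencyError:
        return False

    # Weak dependency: when the authorization service is down and degradation
    # is switched on, an authenticated user is let through (VPN-level security).
    try:
        return authorize()
    except DependencyError:
        return degrade_authorization
```

Note that an explicit "deny" from a healthy authorization service is still honored; only an outage triggers the fallback.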

Zero Trust client error log upload and analysis:

Besides letting the server issue instructions and push error messages to the client, the Zero Trust client message tunnel described earlier also lets the client upload errors to the server for aggregation. Every client on the network can upload its error logs through the message tunnel, which allows stability analysis across all clients. For example, if client components become incompatible after an operating system upgrade on terminals, the problem can be detected early and its damage contained early.

06 Security

Security is the core feature of a security product, so we designed the following aspects in detail:

Authentication of layer 4 traffic:

In layer 4 Zero Trust, the user's identity information is carried inside the layer 4 encrypted tunnel. We also use cryptographic methods to implement a dynamic identity credential with a fresh key for each use, and the credential is verified every time a user accesses the network.
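One possible way to build a single-use credential is an HMAC over the identity, a timestamp, and a random nonce. The Python sketch below is only an illustration of that general technique; the token layout, the 30-second validity window, and the in-memory nonce set are assumptions, and the actual scheme is not described in this article.

```python
import hashlib
import hmac
import secrets


def issue_credential(key: bytes, user: str, device: str, now: int) -> str:
    """Client side: build a credential that is valid once and only briefly.

    Assumes user and device identifiers never contain the '|' separator.
    """
    nonce = secrets.token_hex(8)
    payload = f"{user}|{device}|{now}|{nonce}"
    mac = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{mac}"


def verify_credential(key: bytes, token: str, now: int,
                      seen_nonces: set, max_age: int = 30) -> bool:
    """Gateway side: check integrity, freshness, and single use."""
    parts = token.split("|")
    if len(parts) != 5:
        return False
    _user, _device, ts, nonce, mac = parts
    expected = hmac.new(key, "|".join(parts[:4]).encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, expected):
        return False  # forged or tampered credential
    if not ts.isdigit() or abs(now - int(ts)) > max_age:
        return False  # stale credential
    if nonce in seen_nonces:
        return False  # replayed credential
    seen_nonces.add(nonce)
    return True
```

In a gateway cluster the nonce set would have to live in shared storage with an expiry matching `max_age`; a local set is used here only to keep the sketch self-contained.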

The layer 4 tunnel encapsulation format is shown as follows:

The Zero Trust client encapsulates the user identity credential and the original traffic in the TLS data. The Zero Trust layer 4 gateway parses the private header and the original data packet, retrieves the user identity credential from the private header, and asks the decision center to authenticate it. If authentication succeeds, the encrypted tunnel is kept open; otherwise, the user's connection is terminated.
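To make the idea of a private header concrete, here is a Python sketch of one possible framing of the data carried inside TLS. The layout (a magic value, a version byte, and a length-prefixed credential followed by the original packet) is entirely hypothetical and is not the actual wire format.

```python
import struct

MAGIC = 0x5A54                   # hypothetical marker for this example
VERSION = 1
HEADER = struct.Struct("!HBH")   # magic, version, credential length (network byte order)


def encapsulate(credential: bytes, packet: bytes) -> bytes:
    """Client side: prepend the private header and credential to the original packet."""
    return HEADER.pack(MAGIC, VERSION, len(credential)) + credential + packet


def decapsulate(record: bytes):
    """Gateway side: split a record into (credential, original packet)."""
    if len(record) < HEADER.size:
        raise ValueError("record too short")
    magic, version, cred_len = HEADER.unpack_from(record)
    if magic != MAGIC or version != VERSION:
        raise ValueError("unknown header")
    end = HEADER.size + cred_len
    if len(record) < end:
        raise ValueError("truncated credential")
    return record[HEADER.size:end], record[end:]
```

The gateway would pass the extracted credential to the decision center and forward the remaining bytes only after a positive answer.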

Authorization of layer 4 traffic:

Compared to firewalls and VPNs, layer 4 zero trust authorizes layer 4 traffic at a finer granularity, as shown in the following illustration:

In layer 4 zero trust, a unique Resource is defined by the IP, port, and protocol of a service, and authorization on that resource can be managed by person, organization, IP, and device. On the far right of the figure above there are three OA resources: OA1 (IP_OA1, PORT_OA1, PROTO_OA1), OA2 (IP_OA2, PORT_OA2, PROTO_OA2), and OA3 (IP_OA3, PORT_OA3, PROTO_OA3). The blue box in the figure contains three authorization policies for these resources. The first one, for example, requires that OA1, OA2, and OA3 be accessed by UserA from DeviceA. The layer 4 gateway parses the private header and the original data packet of every packet to obtain the person, device, and target resource, and sends this information to the decision center for a per-packet authorization query. The current packet is forwarded once authorization is granted.
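The matching logic described above can be sketched in Python. The class and field names are assumptions for illustration; a policy field left as `None` means "any value", and a request is authorized when at least one policy matches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resource:
    """A layer 4 resource, identified by IP, port, and protocol."""
    ip: str
    port: int
    protocol: str


@dataclass(frozen=True)
class Request:
    """What the gateway extracts from the private header and original packet."""
    user: str
    organization: str
    device: str
    source_ip: str
    resource: Resource


@dataclass(frozen=True)
class Policy:
    """Grants access to a set of resources; None fields match any value."""
    resources: frozenset
    user: Optional[str] = None
    organization: Optional[str] = None
    device: Optional[str] = None
    source_ip: Optional[str] = None

    def matches(self, req: Request) -> bool:
        if req.resource not in self.resources:
            return False
        pairs = [
            (self.user, req.user),
            (self.organization, req.organization),
            (self.device, req.device),
            (self.source_ip, req.source_ip),
        ]
        return all(want is None or want == got for want, got in pairs)


def is_authorized(policies, req: Request) -> bool:
    """Default deny: forward only when some policy explicitly matches."""
    return any(policy.matches(req) for policy in policies)
```

The first policy in the figure would then read `Policy(frozenset({oa1, oa2, oa3}), user="UserA", device="DeviceA")`, where `oa1` to `oa3` are `Resource` values.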

Enhanced auditability, transparent transmission of the original IP:

Traffic proxied from the client to the internal network must pass authentication, authorization, and forwarding at the layer 4 zero trust gateway before it reaches the backend system. Because the gateway applies NAT, the source IP seen by the backend system is the gateway's IP, and the client's real IP cannot be obtained. This greatly hinders audit tracing, authorization across layer 4 and layer 7 zero trust, and risk control. To solve this problem, we tried the following schemes:

(1) Storing and recording the gateway's NAT logs. The data volume is large, and the records cannot be correlated effectively.
(2) The PROXY protocol scheme. It requires both the client side and the server side to add protocol support. A backend service that has not been modified fails at connection setup and becomes unreachable. For historical reasons, the company's services are deployed in a mixed manner, for example in the cloud and in IDCs. Adopting this scheme would require a large amount of modification, and any omission would have a high impact on online services, so for a security infrastructure service we did not adopt it.
(3) The TOA (TCP Option Address) scheme. The zero trust gateway modifies the packet and writes the client's original IP into a TCP option in the agreed format, and the TOA plugin loaded on the backend server reads the original IP written by the gateway. If the backend server has not loaded the kernel plugin, connectivity is not affected, which suits the complex mixed deployment of services within the company. Stress testing showed that loading the TOA plugin causes no significant performance loss and that traffic fluctuation while loading is extremely low. We finally adopted this scheme for transparent transmission of the client's original IP.
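For readers unfamiliar with TOA, the Python sketch below shows how such an option can be built and parsed. It assumes the layout used by common open-source TOA kernel modules for IPv4 (option kind 254, length 8, then a 2-byte port and a 4-byte address); whether our gateway uses exactly this layout is not stated in this article. Real deployments do this in the kernel, so the code is for illustration only.

```python
import socket
import struct

TOA_KIND = 254  # option kind used by common open-source TOA modules (assumed here)
TOA_LEN = 8     # kind (1) + length (1) + port (2) + IPv4 address (4)


def build_toa_option(client_ip: str, client_port: int) -> bytes:
    """Gateway side: encode the client's original address as a TCP option."""
    return struct.pack("!BBH4s", TOA_KIND, TOA_LEN, client_port,
                       socket.inet_aton(client_ip))


def parse_toa_option(options: bytes):
    """Backend side: walk the TCP options and return (ip, port), or None if absent."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:        # End of Option List
            break
        if kind == 1:        # No-Operation padding, single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break            # malformed: missing length byte
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break            # malformed length
        if kind == TOA_KIND and length == TOA_LEN:
            port, raw_ip = struct.unpack_from("!H4s", options, i + 2)
            return socket.inet_ntoa(raw_ip), port
        i += length          # skip options this parser does not care about
    return None
```

Because a TCP stack ignores options it does not recognize, a backend without the plugin simply skips the option, which is why connectivity is preserved.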

Gradient Authentication:

For internal network services, we apply tiered governance and gradient authentication. Highly sensitive services require users to raise their authentication level dynamically before access. We model the security level required to access a service from its data sensitivity level and business level, as shown in the following figure:

The data sensitivity level and business level determine the security level S(n) required to access the service. The authentication methods required for each security level are as follows:

If a user first accesses a backend business with security level S1, static username and password authentication is enough. If the user later accesses a backend business with security level S4, the Zero Trust system redirects to the login page, and the user must authenticate with a Network Shield dynamic code before continuing.
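The step-up decision itself is a simple comparison, sketched below in Python. The combining rule in `required_level` (take the stricter of two ratings on a 1 to 4 scale) is a placeholder assumption, since the actual mapping is given only in the figure.

```python
def required_level(data_sensitivity: int, business_level: int) -> int:
    """Hypothetical S(n): the stricter of two ratings, each from 1 to 4."""
    for rating in (data_sensitivity, business_level):
        if not 1 <= rating <= 4:
            raise ValueError("ratings must be between 1 and 4")
    return max(data_sensitivity, business_level)


def access_decision(session_level: int, service_level: int) -> str:
    """Return 'allow' when the session is strong enough, else 'step_up'.

    session_level is the highest level the user has authenticated at so far,
    for example 1 after a static password login.
    """
    return "allow" if session_level >= service_level else "step_up"
```

With a password-only session (level 1), `access_decision(1, 4)` returns `"step_up"`, which corresponds to the redirect to the login page described above; after the dynamic code check the session level becomes 4 and the same call allows access.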

Continuous layer 4 and layer 7 risk control and dynamic authorization:

To achieve comprehensive, dynamic network admission and authorization control, we established the following risk control mechanism based on the business characteristics of Zero Trust:

When a risk control policy is hit, the corresponding enforcement policies are issued to the Zero Trust gateway, the Zero Trust decision center, and the Zero Trust client. The effects include blocking network access, requiring additional gradient authentication, and revoking authorization.

Detailed strategy examples are as follows:

07 Performance Optimization

The Zero Trust layer 4 gateway carries the layer 4 traffic of the office network and, in principle, has to authenticate and authorize every packet, so it faces significant performance pressure. To relieve the load from authorization queries, we authenticate once per tunnel and authorize once per flow:

The Zero Trust client communicates with the Zero Trust gateway over a long-lived TLS connection and sends the original data packets inside this tunnel. During the lifetime of the connection, the connection only needs to be authenticated once. Querying user authorization for every single packet, however, would be very expensive. We therefore optimize authorization with multi-level caching plus per-flow authorization within each tunnel, as shown in the following figure:

If the authorization result for the user and target resource is already in the multi-level cache, the policy center is not queried again. This reduces the number of authorization queries and lowers the authorization granularity from per packet to per flow.
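The caching idea can be illustrated with a small Python class. This is a sketch under assumptions: the two levels (a per-tunnel flow cache and a shared user and resource cache with a TTL), the 60-second TTL, and all names are hypothetical, and the article's figure may define the levels differently.

```python
import time


class AuthorizationCache:
    """Two cache levels in front of the decision center (illustrative only)."""

    def __init__(self, query_decision_center, ttl: float = 60.0, clock=time.monotonic):
        self._query = query_decision_center  # callable(user, resource) -> bool
        self._ttl = ttl
        self._clock = clock
        self._flow_cache = {}    # level 1: (tunnel_id, flow) -> bool
        self._shared_cache = {}  # level 2: (user, resource) -> (bool, expiry time)
        self.queries = 0         # number of real decision-center lookups

    def is_allowed(self, tunnel_id, flow, user, resource) -> bool:
        flow_key = (tunnel_id, flow)
        if flow_key in self._flow_cache:
            # Every packet after the first one in a flow stops here.
            return self._flow_cache[flow_key]
        now = self._clock()
        entry = self._shared_cache.get((user, resource))
        if entry is None or entry[1] <= now:
            self.queries += 1
            entry = (self._query(user, resource), now + self._ttl)
            self._shared_cache[(user, resource)] = entry
        self._flow_cache[flow_key] = entry[0]
        return entry[0]

    def close_tunnel(self, tunnel_id) -> None:
        """Drop per-flow results when the TLS tunnel goes away."""
        self._flow_cache = {k: v for k, v in self._flow_cache.items()
                            if k[0] != tunnel_id}
```

A cache like this trades freshness for speed, so when risk control revokes an authorization the cached entries for that user would also need to be invalidated; that step is omitted from the sketch.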

08 Conclusion

With the second phase of our Zero Trust security network, we have achieved controllable terminal security, baseline control of terminal security, dynamic authorization based on risk control, and more complete boundaryless office capabilities. As the rollout deepens, new scenarios and issues are bound to emerge, which makes this a challenging, long-term task.

This article summarizes and reflects on the new scenarios and issues we encountered after the first phase went into production. We hope it shares useful insights and prompts further discussion of how to build Zero Trust and of the difficulties and challenges that arise as it is rolled out in depth.


Last modified: 2025-03-28 12:54:15