Apollo is an open-source configuration management center developed by Ctrip's framework department. It centrally manages configuration across environments and clusters, pushes configuration changes to applications in real time, and provides standardized authorization and process governance. This article tests Apollo's high availability and security.
Chapter 1: Test Purpose
As programs grow more complex, so does their configuration: feature switches, parameter settings, server addresses, and so on.

Expectations for configuration are rising as well: changes that take effect in real time, gray release, management by environment and cluster, and thorough permission and audit mechanisms.
Against this background, traditional approaches such as configuration files and databases increasingly fail to meet developers' configuration management needs.
The Apollo configuration center emerged to fill this gap.
This test evaluates Apollo's high availability and security.
Chapter 2: Test Scope
This test includes the following aspects:
Check whether the modification of the configuration file takes effect
Simulate a disaster (such as a server crash or network fluctuation) to see if the backup Apollo can switch and work normally
Simulate the concurrent release of a large application to see if Apollo can withstand pressure and work normally
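The first check can be scripted by polling a Config Service over HTTP until a newly published value appears. The sketch below is only one possible approach. It assumes Apollo's cached JSON endpoint `/configfiles/json/{appId}/{clusterName}/{namespaceName}` and a Config Service listening on port 8080. The app id `demo-app`, the key `timeout`, and the helper names are hypothetical and do not come from the test itself.

```python
import json
import time
import urllib.request


def config_url(base_url, app_id, cluster="default", namespace="application"):
    """Build the URL of the cached JSON config endpoint (assumed path)."""
    return f"{base_url.rstrip('/')}/configfiles/json/{app_id}/{cluster}/{namespace}"


def http_fetch(url, timeout=5):
    """Fetch a URL and decode its JSON body into a dict."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_value(fetch, url, key, expected,
                   attempts=10, interval=1.0, sleep=time.sleep):
    """Poll until config[key] == expected.

    Returns the number of polls that were needed, or None if the value
    never appeared within `attempts` polls.
    """
    for attempt in range(1, attempts + 1):
        try:
            if fetch(url).get(key) == expected:
                return attempt
        except OSError:
            pass  # service briefly unreachable; retry on the next poll
        if attempt < attempts:
            sleep(interval)
    return None
```

After publishing `timeout=200` in the Portal, a call such as `wait_for_value(http_fetch, config_url("http://192.168.103.111:8080", "demo-app"), "timeout", "200")` would report how many polls passed before the change became visible.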
Chapter 3: Test Environment
3.1 Logical topology
3.2 Network topology
3.3 Software/hardware environment
Environment | Role | IP address | Point to |
---|---|---|---|
Portal | MySQL, Apollo, Eureka | 192.168.103.111 | Apollo-portal, Apollo-config, Apollo-admin; MySQL portalDB -> 111; MySQL configDB -> 111; Eureka -> 111, 112 |
DEV | | 192.168.103.111, 192.168.103.112 | Apollo-config, Apollo-admin; MySQL portalDB -> 111; MySQL configDB -> 111; Eureka -> 111, 112 |
PRO | | 192.168.103.113, 192.168.103.114 | Apollo-config, Apollo-admin; MySQL portalDB -> 111; MySQL configDB -> 113; Eureka -> 113, 114 |
Chapter 4: Comparison of Apollo Test Items
Check whether the modification of the configuration file takes effect
Tested and confirmed effective
Passed
Simulate a disaster (such as a server crash or network fluctuation) to see if the backup Apollo can switch and work normally
With one server taken down, releases can still be made
Passed
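The failover behaviour exercised here can be pictured with a small simulation. This is a sketch under stated assumptions, not Apollo's client implementation: the real client discovers Config Services through the Meta Server, whereas this example hard-codes the two DEV addresses from the environment table and uses a fake fetch function in place of a network call.

```python
DEV_SERVERS = ["192.168.103.111", "192.168.103.112"]


def fetch_with_failover(servers, fetch):
    """Try each Config Service in order.

    Returns (server, config) from the first server that answers and
    raises ConnectionError if none of them does.
    """
    failed = []
    for server in servers:
        try:
            return server, fetch(server)
        except OSError:
            failed.append(server)  # remember the failure, try the next one
    raise ConnectionError(f"all Config Services unreachable: {failed}")


def make_fake_fetch(down):
    """Build a fetch function that fails for every server listed in `down`."""
    def fetch(server):
        if server in down:
            raise ConnectionRefusedError(f"{server} is down")
        return {"served_by": server}
    return fetch


if __name__ == "__main__":
    # Simulate taking 192.168.103.111 offline; the request lands on .112.
    print(fetch_with_failover(DEV_SERVERS, make_fake_fetch({"192.168.103.111"})))
```

Because the Config Service is stateless, any surviving instance can serve the request, which is why a simple "try the next one" loop is enough to model the observed behaviour.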
Simulate the concurrent release of a large application to see if Apollo can withstand pressure and work normally
After testing, the following stress test data was obtained for Apollo
Because the project as a whole is built on the Apollo framework, a stress test of the whole project also serves as a stress test of Apollo. Its conclusions are as follows:
After a series of stress tests on business interfaces, the system performed excellently in both response time and concurrency and remained stable throughout. Server resource usage fluctuated within normal bounds, and the error rate across all business scenarios was 0% over ten million interactions.
In the 8-hour mixed-business scenario (recharge : query : transaction = 1:2:1.5), the aggregate report showed a 0% error rate over 7.2 million business interactions, 95% of response times below 1000 milliseconds, and TPS stable at 250/sec.
Passed
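The figures for the 8-hour scenario are consistent with one another: 250 TPS sustained for 8 hours is 7.2 million interactions. The short calculation below checks this and also splits the total by the 1:2:1.5 mix. The split assumes the ratio is measured in interaction counts, which the report does not state, so the per-type numbers are illustrative only.

```python
def total_interactions(tps, hours):
    """Interactions completed at a constant TPS over the given number of hours."""
    return tps * hours * 3600


def split_by_ratio(total, ratio):
    """Divide `total` among categories in proportion to integer ratio weights."""
    parts = sum(ratio.values())
    return {name: total * weight // parts for name, weight in ratio.items()}


total = total_interactions(tps=250, hours=8)
print(total)  # 7200000

# 1 : 2 : 1.5 written with integer weights as 2 : 4 : 3.
mix = split_by_ratio(total, {"recharge": 2, "query": 4, "transaction": 3})
print(mix)  # {'recharge': 1600000, 'query': 3200000, 'transaction': 2400000}
```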
Chapter 5: Emergency Measures
If the framework fails, the following situations may occur:
1. Clients cannot receive configuration updates
2. The server cannot publish configuration releases
In both cases, the root cause within the framework must be analyzed carefully before it can be fixed. Until then, the emergency plan is to fall back to the earlier practice of modifying the configuration and releasing it manually. (Note: whichever symptom occurs, only automated release is affected.)
Chapter 6: Test Conclusion
Performance testing shows that Apollo is stable and can handle a large processing load. It integrates closely with Spring, which makes it convenient for R&D colleagues to adopt. It provides automatic release and fits the operations team's distributed, clustered deployment model, so the failure of one server does not affect applications. Its permission control is also strong: roles are clearly divided, which makes access easy to manage.
The following is an overall availability analysis:
Scenario | Impact | Degradation | Reason |
---|---|---|---|
A specific Config Service goes offline | No impact | | Config Service is stateless, and the client reconnects to other Config Services |
All Config Services go offline | The client cannot read the latest configuration, but the Portal is unaffected | The client can read the local cache configuration file when restarting. If it is a newly expanded machine, it can obtain the cached configuration file from other machines | |
A specific Admin Service goes offline | No impact | | Admin Service is stateless, and the Portal reconnects to other Admin Services |
All Admin Services go offline | The client is unaffected, and the Portal cannot update the configuration | | |
A Portal goes offline | No impact | | The Portal domain name is bound to multiple servers through SLB, and after retrying, it points to an available server |
All Portals go offline | The client is unaffected, and the Portal cannot update the configuration | | |
A data center goes offline | No impact | | Multi-data center deployment, data is fully synchronized, and the Meta Server/Portal domain name is automatically switched to other surviving data centers through SLB |
Database crash | The client is unaffected, and the Portal cannot update the configuration | After the Config Service enables configuration caching, the reading of configurations is not affected by database crashes | |
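The degradation described for the "All Config Services go offline" row, where a restarting client reads its local cache file, can be sketched as follows. This is a minimal illustration under assumptions: the JSON cache file, its name `application.json`, and the `load_config` helper are hypothetical and do not reflect the Apollo client's actual cache format or location.

```python
import json
import tempfile
from pathlib import Path


def load_config(fetch_remote, cache_path):
    """Return (config, source).

    On success the remote config is written to the local cache file.
    If the remote fetch fails, the cached copy is returned instead; if
    there is no cached copy either, the original error is re-raised.
    """
    cache_path = Path(cache_path)
    try:
        config = fetch_remote()
    except OSError:
        if not cache_path.exists():
            raise
        return json.loads(cache_path.read_text(encoding="utf-8")), "local-cache"
    cache_path.write_text(json.dumps(config), encoding="utf-8")
    return config, "remote"


def demo():
    """Simulate one healthy fetch followed by a total Config Service outage."""
    def remote_down():
        raise ConnectionRefusedError("all Config Services offline")

    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "application.json"
        first = load_config(lambda: {"timeout": "200"}, cache)
        second = load_config(remote_down, cache)
    return first, second
```

In `demo()`, the second call succeeds from the cache even though the remote side is down. A newly expanded machine has no cache file, so the same call would raise, which matches the table's note that such a machine needs a cached file copied from another machine.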
In summary, this framework suits large R&D teams. As a team grows, adopting a mature framework becomes imperative. Apollo meets the company's needs, passed all tests, and can be put into use.
*The original author of this article is Yuyang. This article is part of the FreeBuf Original Reward Program and may not be reproduced without permission.
