MogDB Database Disaster Recovery Solution

Database High Availability (HA) and Disaster Recovery (DR) architecture is the core technical means of maximizing business continuity, data reliability, and data protection in the event of a database system failure, protecting enterprises from potential financial and reputational losses. Two of the most important parameters of a disaster recovery or data protection plan are RTO (Recovery Time Objective) and RPO (Recovery Point Objective).
The MogDB DR solution uses Cluster Manager (CM) to autonomously detect faults and perform failover, automatic virtual IP switching, and related operations. To ensure business continuity, data security, and reliability in the event of a system failure, database failover time can be shortened from the usual minutes to mere seconds (RTO < 30s) with "zero data loss" (RPO = 0) to maximize data protection. In addition, deploying "one master with multiple replicas" on a dual-plane redundant network further enhances the reliability of the disaster recovery system.

Pain points and challenges

Without a proper HA and DR solution, an enterprise faces:
1.Business Interruption
System downtime prevents the business from running normally, affecting the user experience and business processes.
2.Data Loss
Unsaved data or work in progress at the time of the failure may be lost or left inconsistent.
3.Business Loss
The business may incur financial losses from its inability to conduct normal operations and transactions.
4.Social Impact
A prolonged period of system downtime damages the reputation of the enterprise, reduces customers’ trust in it, and may even cause wider social issues.
Take the financial industry as an example: if a bank’s online banking system is down, customers cannot make payments online, transfer money through the bank’s website or mobile app, or make contactless payments in store using the mobile app. If the system is down for a prolonged period, customers may be forced to switch to other banks or to other means of making purchases and payments.

Solution Details

CM (Cluster Manager) is the high availability management software for MogDB. It consists of four major modules: cm_server, cm_agent, om_monitor, and cm_ctl. Their core functions are as follows:
1.cm_server
Sends commands (such as start, stop, status query, switchover and failover of database instances) to the cm_agent on each node and receives their responses.
Receives the database instance status information reported by the cm_agent on each node and performs high availability arbitration for the database instances on each node.
2.cm_agent
Receives and executes the commands issued by cm_server, such as start, stop, status query, switchover and failover of database instances. In addition, monitors the status of the database instances running on its node and reports it to cm_server.
3.om_monitor
Monitors the cm_agent service running on its node to ensure its availability.
4.cm_ctl
The client tool of the CM cluster management software, used to manage MogDB database clusters. It provides cluster status monitoring, performance optimization, fault handling and other functions.
The automatic switching module covers node detection, fault judgment, automatic switching, notification and alarms, fault recovery, monitoring, and logging. In a transparent failover setup:
-The end user does not need to know that the standby server(s) exist.
-Data replication only needs to happen between the master and the replica.
-The master and replica servers do not need to decide when to fail over, or to carry out the series of complex operations involved in a switchover themselves.
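The division of labor described above (agents report instance status, the server arbitrates) can be illustrated with a minimal Python sketch. It is not CM's actual arbitration algorithm: the InstanceReport type, the replay_lsn field, and the rule "promote the healthy standby that has applied the most log" are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InstanceReport:
    """Status a (hypothetical) agent reports for the instance on its node."""
    node: str
    role: str          # "primary" or "standby"
    healthy: bool
    replay_lsn: int    # amount of log the instance has applied (assumed metric)


def arbitrate(reports: List[InstanceReport]) -> Optional[str]:
    """Return the node to promote, or None if no failover is needed or possible."""
    # A healthy primary means nothing has to change.
    if any(r.role == "primary" and r.healthy for r in reports):
        return None
    # Otherwise pick the healthy standby that has applied the most log.
    candidates = [r for r in reports if r.role == "standby" and r.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.replay_lsn).node


reports = [
    InstanceReport("node1", "primary", healthy=False, replay_lsn=1000),
    InstanceReport("node2", "standby", healthy=True, replay_lsn=1000),
    InstanceReport("node3", "standby", healthy=True, replay_lsn=990),
]
print(arbitrate(reports))  # node2
```

Because the decision is taken centrally from the reported statuses, neither the master nor the replicas need any failover logic of their own, which is the point of the third bullet above.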
Driver connecting to multiple physical IPs
jdbc:postgresql://node1,node2/database
In the driver's connection string, configure the physical IP addresses of both the primary and replica database nodes. The driver automatically detects the status of each node.
After a primary/replica switchover, the driver automatically sends connection requests to the new primary node, ensuring business continuity for the application.
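As a rough illustration of what a multi-host driver does with such a connection string, the Python sketch below parses the host list and returns the first host that reports itself as primary. The probe_role callable is a hypothetical stand-in for a real connection attempt; an actual driver's detection logic and options will differ.

```python
from typing import Callable, List, Optional


def parse_hosts(url: str) -> List[str]:
    """Extract the comma-separated host list from a multi-host JDBC-style URL."""
    # "...://node1,node2/database" -> "node1,node2"
    authority = url.split("://", 1)[1].split("/", 1)[0]
    return [h.strip() for h in authority.split(",") if h.strip()]


def find_primary(url: str,
                 probe_role: Callable[[str], Optional[str]]) -> Optional[str]:
    """Try each host in order and return the first one reporting itself primary.

    probe_role stands in for a real connection attempt: it returns
    "primary", "standby", or None when the host is unreachable.
    """
    for host in parse_hosts(url):
        if probe_role(host) == "primary":
            return host
    return None


url = "jdbc:postgresql://node1,node2/database"

# Before the switchover: node1 is the primary.
roles = {"node1": "primary", "node2": "standby"}
print(find_primary(url, roles.get))  # node1

# After the switchover: node1 is down and node2 has been promoted.
roles = {"node2": "primary"}
print(find_primary(url, roles.get))  # node2
```

The application keeps using the same connection string before and after the switchover; only the result of the probe changes.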

Driver connecting to a single virtual IP
jdbc:postgresql://vip/database
The database cluster manager provides virtual IP (VIP) capabilities and automatically binds the VIP to the new master node during an active/standby switchover. When the driver re-establishes the connection through the VIP, it automatically connects to the new master node, ensuring business continuity for the application.
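The VIP behavior can be pictured with a small Python model. Everything here is hypothetical (the Cluster class, its methods, and the reconnect helper are not part of MogDB or CM); the sketch only shows that the client keeps retrying one address while the cluster manager moves that address to the new master.

```python
from typing import Dict, Optional


class Cluster:
    """Toy model of a cluster whose manager moves a VIP between nodes."""

    def __init__(self, primary: str, standby: str) -> None:
        self.up: Dict[str, bool] = {primary: True, standby: True}
        self.vip_owner: Optional[str] = primary  # node the VIP is bound to

    def fail(self, node: str) -> None:
        """Simulate a node failure; the VIP is unbound until failover runs."""
        self.up[node] = False
        if self.vip_owner == node:
            self.vip_owner = None

    def failover(self) -> None:
        """Promote a surviving node and bind the VIP to it."""
        survivors = [n for n, ok in self.up.items() if ok]
        self.vip_owner = survivors[0] if survivors else None

    def connect_via_vip(self) -> str:
        """Return the node that answers on the VIP, or raise if none does."""
        if self.vip_owner is None:
            raise ConnectionError("VIP is not bound to any node")
        return self.vip_owner


def reconnect(cluster: Cluster, attempts: int, on_retry=lambda: None) -> str:
    """Retry the same VIP address until a node answers."""
    for _ in range(attempts):
        try:
            return cluster.connect_via_vip()
        except ConnectionError:
            on_retry()  # a real client would sleep or back off here
    raise ConnectionError("gave up reconnecting through the VIP")


cluster = Cluster(primary="node1", standby="node2")
print(cluster.connect_via_vip())  # node1
cluster.fail("node1")
# The first attempt fails; the failover completes before the second attempt.
print(reconnect(cluster, attempts=3, on_retry=cluster.failover))  # node2
```

In this model the on_retry hook merely simulates the cluster manager finishing the switchover between two client attempts; in a real deployment the client and the cluster manager act independently.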

Key Benefits

The MogDB Cluster Manager (CM) disaster recovery architecture has the following characteristics:
1.Comprehensive Fault Prevention
MogDB CM can ensure business continuity and data reliability in all of the following failure scenarios:
Instance failure (instance hung, instance down, software corruption)
Server failure (faulty hardware, disk failure, power failure, abnormal shutdown)
Database failure (file corruption, bad data blocks)
Storage failure (disk array failure, link failure, media failure)
Server room disaster (natural disaster, fire, power outage or other failures not caused by humans)
Operation and maintenance failure (planned maintenance downtime, accidental file deletion, accidental disk formatting)
2.Flexible Configuration
Master-with-replicas architecture, supporting 1 master and up to 8 replicas as backups. Supports deployment in a single datacenter or across multiple datacenters in the same or different cities.
3.Balance between Performance and Disaster Recovery
Supports synchronous replication and asynchronous replication modes.
In synchronous replication mode, RPO = 0 can be achieved. In asynchronous replication mode, the delay incurred when the main database commits transactions is effectively reduced, which can improve overall performance.
4.Seamless Failover
Network redundancy configuration and an automatic reconnection mechanism help maintain system connectivity during seamless failover.
5.Dual Plane Network Redundancy
To prevent a network link failure from disrupting data synchronization between the master and the replicas, MogDB CM supports "dual-plane network" deployment, further improving the reliability of data replication in the disaster recovery system.
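The trade-off between synchronous and asynchronous replication listed under "Balance between Performance and Disaster Recovery" can be made concrete with a toy Python model. It is a simplified sketch, not MogDB's replication protocol: the ReplicatedLog class and its methods are invented for illustration, and real replication involves log shipping, flushing, and acknowledgements that are not modeled here.

```python
from typing import List


class ReplicatedLog:
    """Toy primary/replica pair illustrating commit acknowledgement rules."""

    def __init__(self, synchronous: bool) -> None:
        self.synchronous = synchronous
        self.primary: List[str] = []    # transactions acknowledged to clients
        self.replica: List[str] = []    # transactions the replica has received
        self.in_flight: List[str] = []  # sent asynchronously, not yet received

    def commit(self, txn: str) -> None:
        if self.synchronous:
            # Acknowledge only after the replica has the transaction.
            self.replica.append(txn)
        else:
            # Acknowledge immediately; shipping happens in the background.
            self.in_flight.append(txn)
        self.primary.append(txn)

    def ship(self) -> None:
        """Background shipping catches the replica up (asynchronous mode)."""
        self.replica.extend(self.in_flight)
        self.in_flight.clear()

    def lost_on_primary_crash(self) -> List[str]:
        """Acknowledged transactions the replica would be missing."""
        return [t for t in self.primary if t not in self.replica]


sync = ReplicatedLog(synchronous=True)
async_ = ReplicatedLog(synchronous=False)
for log in (sync, async_):
    log.commit("t1")
    log.commit("t2")

print(sync.lost_on_primary_crash())    # []
print(async_.lost_on_primary_crash())  # ['t1', 't2']
async_.ship()
print(async_.lost_on_primary_crash())  # []
```

In the synchronous case the replica never lags behind an acknowledged commit, which corresponds to RPO = 0; in the asynchronous case commits return sooner, but a crash before shipping completes can lose the in-flight transactions.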