What is a typical process for MySQL master failover?
Jun 19, 2025, 01:06 AM
MySQL master failover mainly consists of four steps. 1. Fault detection: regularly check the master's process, connection status, and a simple query to determine whether it is down, use a retry mechanism to avoid false positives, and optionally rely on tools such as MHA, Orchestrator, or Keepalived; 2. Select the new master: choose the most suitable slave based on data synchronization progress (Seconds_Behind_Master), binlog completeness, network latency, and load, and perform data compensation or manual intervention if necessary; 3. Switch topology: point the other slaves to the new master, execute RESET MASTER on it or rely on GTID, and update the VIP, DNS, or proxy configuration so that the switch is transparent to applications; 4. Application-layer cooperation: clients need automatic reconnection, exception handling, and DNS cache refresh, combined with a connection pool such as HikariCP, to improve fault tolerance and keep the service transition smooth.
MySQL master failover is a key part of a high-availability architecture. Its purpose is to move traffic to a slave quickly when the master fails, so that the service stays available. The process usually consists of detecting the failure, selecting a new master, and switching client connections.
1. Fault detection: How do you determine whether the master has a problem?
The first step of the whole process is to confirm, automatically or manually, that the master is really down. A common practice is to have a monitoring system check the master's health at regular intervals, for example:
- Check if the MySQL process is running
- Try to establish a database connection
- Execute a simple SQL query (such as SELECT 1)
Once the check fails several times in a row, the master is considered unavailable. Network jitter can also cause false positives, so a retry mechanism and a timeout are usually configured to avoid unnecessary switchovers.
Some tools such as MHA, Orchestrator or Keepalived can be used for fault detection.
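The retry idea can be sketched in a few lines of Python. This is only a minimal illustration, not how MHA, Orchestrator, or Keepalived implement detection: the `probe` callable, the `max_failures` threshold, and the interval are all assumptions made for the example, and the probe is simulated so the snippet runs without a database.

```python
import time
from typing import Callable


def master_is_down(
    probe: Callable[[], bool],
    max_failures: int = 3,
    interval_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True only if `probe` fails `max_failures` times in a row.

    `probe` is a hypothetical health check (for example, open a connection
    with a short timeout and run SELECT 1); it returns True when the master
    answers and returns False, or raises, when it does not.
    """
    for attempt in range(max_failures):
        try:
            if probe():
                return False  # one success is enough to call the master alive
        except Exception:
            pass  # treat a connection error or timeout as a failed probe
        if attempt < max_failures - 1:
            sleep(interval_seconds)  # wait before retrying to ride out jitter
    return True


# Simulated probes: a transient glitch recovers, a real outage does not.
flaky = iter([False, False, True])
print(master_is_down(lambda: next(flaky), sleep=lambda s: None))  # False
print(master_is_down(lambda: False, sleep=lambda s: None))        # True
```

In a real deployment the probe would usually run from more than one vantage point, so that a network problem on the monitoring host alone does not trigger a switchover.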
2. Choose a new master: Who will take over?
After confirming that the master has failed, the next step is to select a suitable node from the existing slaves as the new master.
Selection criteria generally include:
- Data synchronization progress (see Seconds_Behind_Master in SHOW SLAVE STATUS)
- Whether it has the latest binlog data
- Network latency and stability
- Instance load situation
Ideally, choose the slave whose data is closest to that of the original master, which reduces the risk of data loss. If the slaves are lagging far behind, data compensation or manual intervention may be required.
MHA automatically tries to fill in the missing log events on the slaves, while Orchestrator lets you inspect the topology and make decisions through its API.
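As a rough illustration of these criteria, the sketch below ranks slaves by how much of the old master's binlog they have executed and breaks ties by load. The `SlaveStatus` class and the `load` metric are hypothetical; the two position fields are named after the Relay_Master_Log_File and Exec_Master_Log_Pos columns of SHOW SLAVE STATUS, and the binlog file names are assumed to end in a numeric suffix.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SlaveStatus:
    """Hypothetical snapshot of one slave, e.g. built from SHOW SLAVE STATUS."""
    host: str
    master_log_file: str      # Relay_Master_Log_File: last master binlog file executed
    exec_master_log_pos: int  # Exec_Master_Log_Pos: position executed within that file
    load: float               # any load metric where lower is better (assumption)


def binlog_index(log_file: str) -> int:
    """Extract the numeric suffix, e.g. 'mysql-bin.000042' -> 42."""
    return int(log_file.rsplit(".", 1)[1])


def pick_new_master(slaves: List[SlaveStatus]) -> SlaveStatus:
    """Prefer the slave that executed the most of the old master's binlog;
    break ties by choosing the least loaded one."""
    if not slaves:
        raise ValueError("no slave available for promotion")
    return max(
        slaves,
        key=lambda s: (binlog_index(s.master_log_file), s.exec_master_log_pos, -s.load),
    )


candidates = [
    SlaveStatus("db2", "mysql-bin.000042", 1500, load=0.7),
    SlaveStatus("db3", "mysql-bin.000042", 9800, load=0.9),
    SlaveStatus("db4", "mysql-bin.000041", 99999, load=0.1),
]
print(pick_new_master(candidates).host)  # db3
```

Here data freshness always outranks load, so a busy but up-to-date slave wins over an idle one that is behind. With GTID, the comparison would be made on executed GTID sets instead of file and position.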
3. Switch topology: Let other nodes know who the new master is
Once the new master is determined, the next step is to point the other slaves to it and update the application-layer configuration so that write requests reach the right node.
Specific operations include:
- Execute RESET MASTER on the new master (this deletes its existing binary logs), or rely on GTID auto-positioning if GTID is enabled
- Have the other slaves execute CHANGE MASTER TO so that they point to the new master
- Update the VIP or DNS configuration so that applications switch transparently
- If you use a proxy (such as ProxySQL, MaxScale), you must also update its configuration
The key to this step is to make the switch as seamless as possible and avoid business interruptions. Some solutions also simplify the switching logic by combining it with a virtual IP or a service discovery mechanism.
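The database-side part of the switch can be written down as an ordered list of statements per host. The sketch below only builds and prints that plan; it does not connect to any server. It shows one possible variant that assumes GTID-based replication (MASTER_AUTO_POSITION = 1), so RESET MASTER is not used, and it assumes the replication user and password are already set on each slave. The STOP SLAVE, RESET SLAVE ALL, and read_only steps on the promoted node are an addition for the example rather than something the steps above spell out, and the host names are placeholders. The statements use the older SLAVE/MASTER syntax to match this article; MySQL 8.0 deprecates it in favor of STOP REPLICA and CHANGE REPLICATION SOURCE TO.

```python
from typing import Dict, List


def promotion_plan(new_master: str, other_slaves: List[str], port: int = 3306) -> Dict[str, List[str]]:
    """Return, per host, the SQL statements to run in order.

    Assumes GTID-based replication and pre-configured replication
    credentials; host names are placeholders.
    """
    plan: Dict[str, List[str]] = {
        new_master: [
            "STOP SLAVE",                        # stop applying events from the old master
            "RESET SLAVE ALL",                   # forget the old master's connection settings
            "SET GLOBAL super_read_only = OFF",
            "SET GLOBAL read_only = OFF",        # allow application writes
        ]
    }
    for host in other_slaves:
        plan[host] = [
            "STOP SLAVE",
            f"CHANGE MASTER TO MASTER_HOST = '{new_master}', "
            f"MASTER_PORT = {port}, MASTER_AUTO_POSITION = 1",
            "START SLAVE",
        ]
    return plan


for host, statements in promotion_plan("db3", ["db2", "db4"]).items():
    for sql in statements:
        print(f"{host}: {sql};")
```

Keeping the plan as data makes it easy to review or log before anything is executed, which is useful when rehearsing a failover.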
4. Application layer cooperation: Don't forget the client
Even after the switch is complete at the database level, connection errors will still occur if the application does not notice the change in time.
To deal with this problem, the following methods can be considered:
- Use a connection pool or middleware that supports automatic reconnection and read-write splitting
- Add retry logic and exception handling to the client code
- Configure a DNS cache refresh strategy so that the application picks up the latest address
For example, in a Spring Boot application that uses HikariCP (often together with MyBatis), you can tune the pool's connection timeout and add retry logic in the application to improve fault tolerance.
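Client-side retry logic might look like the following sketch. It is written in Python for brevity and is not HikariCP or MyBatis configuration; `with_retry`, `write_order`, and the backoff values are hypothetical names and numbers chosen for the example. The operation is simulated, so the snippet runs without a database.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retry(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on ConnectionError with exponential backoff.

    `operation` is a hypothetical callable that opens a fresh connection
    (re-resolving the VIP or DNS name) and runs one statement.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            sleep(base_delay * 2 ** (attempt - 1))  # 0.2s, 0.4s, 0.8s, ...
    raise ValueError("max_attempts must be at least 1")


# Simulate two failed attempts during the switchover, then success.
calls = {"n": 0}


def write_order() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("master unreachable")
    return "committed"


print(with_retry(write_order, sleep=lambda s: None))  # committed
```

Retrying a write is only safe when the operation is idempotent or when the application can tell whether the first attempt was committed, so retries of this kind are best limited to connection-phase errors.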
Those are the basic steps. Although the overall process does not look complicated, many details are easy to overlook in a real deployment, such as replication delay, GTID consistency, and permission configuration. It is recommended to run several drills before going live to make sure the failover executes smoothly.