数据库运维
记录DBA学习成长历程

MySQL-MHA高可用技术

文章目录

MHA高可用工作原理
1.监控节点,通过配置文件获取所有节点信息
2.主库宕机
  1)如果判断从库GTID或者position号,数据有差距,最接近于master的从库会选为备选主
  2)如果两个从库数据一致,按照配置文件的顺序进行选择备选主
  3)如果设定有权重,按照权重强制指定备选主(默认如果一个slave落后master100M的话,即使有权重也会失效,如果配合check_repl_delay=0的话,即使落后100M,也强制选定为备主)
3.数据补偿
  1)当ssh能连接时,从库对比主库GTID或者position号,立即将二进制日志保存至各个节点,并且应用
  2)当ssh不能连接时,对比从库之间的relaylog的差异
4.failover
  将被选主身份切换,并且对外提供服务,其余从库和新的主库建立主从
5.VIP漂移
  配置vip,自动切换到新主库上,直接配置在配置文件中,需要一个脚本
6.binlog server
  二次数据补偿,一台额外的机器,实时同步主库的binlog,直接配置在配置文件中

配置

node节点:db01(10.0.0.51),db02(10.0.0.52),db03(10.0.0.53)
manager节点:m01(10.0.0.61)
搭建好一主两从,我选db01为主,并且都以GTID复制
关键程序进行软连接,我是通过tar包安装的MySQL5.7.20
ln -s /app/mysql/bin/mysqlbinlog    /usr/bin/mysqlbinlog
ln -s /app/mysql/bin/mysql          /usr/bin/mysql

配置互信
[root@m01 ~]# ssh-keygen
[root@m01 ~]# cd /root/.ssh/
[root@m01 .ssh]# mv id_rsa.pub authorized_keys
[root@m01 .ssh]# scp -r /root/.ssh 10.0.0.51:/root
[root@m01 .ssh]# scp -r /root/.ssh 10.0.0.52:/root
[root@m01 .ssh]# scp -r /root/.ssh 10.0.0.53:/root

所有节点安装node依赖包
[root@m01 .ssh]# yum install perl-DBD-MySQL -y
[root@m01 .ssh]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm

在主库中创建mha专用用户
db01 [(none)]> grant all privileges on *.* to mha@'10.0.0.%' identified by 'mha';
db01 [(none)]>select user,host from mysql.user;
+---------------+-----------+
| user          | host      |
+---------------+-----------+
| root          | %         |
| aaaa          | 10.0.0.%  |
| mha           | 10.0.0.%  |
| mysql.session | localhost |
| mysql.sys     | localhost |
| root          | localhost |
+---------------+-----------+
6 rows in set (0.00 sec)
node节点安装结束
下载manager依赖并安装
[root@m01 ~]# yum install -y perl-Config-Tiny epel-release perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
[root@m01 ~]# rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm

准备配置文件目录
[root@m01 ~]# mkdir -p /etc/mha

创建日志目录
[root@m01 ~]# mkdir -p /var/log/mha/app1

编辑MHA配置文件
[root@m01 ~]# vim /etc/mha/app1.cnf

[binlog1]
hostname=10.0.0.61                binlog server要放在哪儿
master_binlog_dir=/data/binlog2   日志存放路径
no_master=1                       

[server default]
manager_log=/var/log/mha/app1/manager
manager_workdir=/var/log/mha/app1
master_binlog_dir=/data/mysql/binlog
master_ip_failover_script=/usr/local/bin/master_ip_failover     vip脚本的存放路径
password=mha
ping_interval=2
repl_password=123                 主从复制专用用户密码
repl_user=aaaa                    主从复制专用用户名
ssh_user=root
user=mha

[server1]
hostname=10.0.0.51
port=3306

[server2]
hostname=10.0.0.52
port=3306

[server3]
hostname=10.0.0.53
port=3306

创建binlog server路径
mkdir /data/binlog2
chown -R mysql.mysql /data/*

拉取日志
[root@m01 ~]# cd /data/binlog2/
[root@m01 ~]# mysqlbinlog  -R --host=10.0.0.51 --user=mha --password=mha --raw  --stop-never mysql-bin.000001 &
[root@m01 binlog2]# ll
total 696
-rw-r----- 1 mysql mysql 690968 Jul  4 09:01 mysql-bin.000001
-rw-r----- 1 mysql mysql    194 Jul  4 09:01 mysql-bin.000002
-rw-r----- 1 mysql mysql    954 Jul  4 09:01 mysql-bin.000003
-rw-r----- 1 mysql mysql    371 Jul  4 09:01 mysql-bin.000004
-rw-r----- 1 mysql mysql    257 Jul  4 09:01 mysql-bin.000005
-rw-r----- 1 mysql mysql    234 Jul  4 09:01 mysql-bin.000006

上传vip脚本到/usr/local/bin/
[root@m01 binlog2]# cd /usr/local/bin/
[root@m01 bin]# ll
-rwxr-xr-x 1 root root 2166 Jul  3 17:33 master_ip_failover
[root@m01 bin]# chmod +x master_ip_failover

第一次使用vip要手动在主机创建
[root@db01 ~]# ifconfig eth0:1 10.0.0.55/24在主库

manager配置完成,这里直接把MHA的binlog server 和VIP功能加进去了
启动前检测所有节点
[root@m01 ~]#  masterha_check_ssh --conf=/etc/mha/app1.cnf
Thu Jul  4 09:56:22 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Jul  4 09:56:22 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Thu Jul  4 09:56:22 2019 - [info] Reading server configuration from /etc/mha/app1.cnf..
Thu Jul  4 09:56:22 2019 - [info] Starting SSH connection tests..
Thu Jul  4 09:56:23 2019 - [debug] 
Thu Jul  4 09:56:22 2019 - [debug]  Connecting via SSH from root@10.0.0.51(10.0.0.51:22) to root@10.0.0.52(10.0.0.52:22)..
Thu Jul  4 09:56:23 2019 - [debug]   ok.
Thu Jul  4 09:56:23 2019 - [debug]  Connecting via SSH from root@10.0.0.51(10.0.0.51:22) to root@10.0.0.53(10.0.0.53:22)..
Thu Jul  4 09:56:23 2019 - [debug]   ok.
Thu Jul  4 09:56:24 2019 - [debug] 
Thu Jul  4 09:56:23 2019 - [debug]  Connecting via SSH from root@10.0.0.52(10.0.0.52:22) to root@10.0.0.51(10.0.0.51:22)..
Thu Jul  4 09:56:23 2019 - [debug]   ok.
Thu Jul  4 09:56:23 2019 - [debug]  Connecting via SSH from root@10.0.0.52(10.0.0.52:22) to root@10.0.0.53(10.0.0.53:22)..
Thu Jul  4 09:56:24 2019 - [debug]   ok.
Thu Jul  4 09:56:25 2019 - [debug] 
Thu Jul  4 09:56:23 2019 - [debug]  Connecting via SSH from root@10.0.0.53(10.0.0.53:22) to root@10.0.0.51(10.0.0.51:22)..
Thu Jul  4 09:56:24 2019 - [debug]   ok.
Thu Jul  4 09:56:24 2019 - [debug]  Connecting via SSH from root@10.0.0.53(10.0.0.53:22) to root@10.0.0.52(10.0.0.52:22)..
Thu Jul  4 09:56:24 2019 - [debug]   ok.
Thu Jul  4 09:56:25 2019 - [info] All SSH connection tests passed successfully.

检测主从状态
[root@m01 ~]# masterha_check_repl --conf=/etc/mha/app1.cnf
MySQL Replication Health is OK.

一切正常,启动MHA
[root@m01 ~]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover  < /dev/null> /var/log/mha/app1/manager.log 2>&1 &

检查运行状态
[root@m01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:1748) is running(0:PING_OK), master:10.0.0.51

模拟故障

把主库kill掉
[root@db01 ~]# pkill mysqld

在db03上查看状态,自动把db02当做主库并自动构建主从
db03 [(none)]>show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.0.52
                  Master_User: aaaa
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000007
          Read_Master_Log_Pos: 234
               Relay_Log_File: db03-relay-bin.000002
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000007
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
查看db02的vip
[root@db02 ~]# ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:90:b8:9d brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.52/24 brd 10.0.0.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever
    inet 10.0.0.55/24 brd 10.0.0.255 scope global secondary eth0:1

恢复

启动db01mysql,构建主从,db02为主
[root@db01 ~]# systemctl start mysqld.service 

我们可以在mha日志内找到db03的切换主库的命令,可以直接复制过来
[root@m01 ~]# cat /var/log/mha/app1/manager |grep -n 'CHANGE MASTER'
189:Wed Jul  3 21:13:47 2019 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='10.0.0.51', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='aaaa', MASTER_PASSWORD='xxx';
215:Wed Jul  3 21:13:47 2019 - [info]  Executed CHANGE MASTER.
486:Thu Jul  4 09:59:39 2019 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='10.0.0.52', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='aaaa', MASTER_PASSWORD='xxx';
512:Thu Jul  4 09:59:39 2019 - [info]  Executed CHANGE MASTER.

486行
db01 [(none)]>CHANGE MASTER TO MASTER_HOST='10.0.0.52', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='aaaa', MASTER_PASSWORD='123';
Query OK, 0 rows affected, 2 warnings (0.01 sec)
db01 [(none)]>start slave;
Query OK, 0 rows affected (0.00 sec)
db01 [(none)]>show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.0.52
                  Master_User: aaaa
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000007
          Read_Master_Log_Pos: 234
               Relay_Log_File: db01-relay-bin.000002
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000007
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

MHA挂掉,binlog server也要重新布置,把原来的日志删掉,重新拉取新主库的日志
[root@m01 binlog2]# rm -rf /data/binlog2/*
[root@m01 binlog2]# mysqlbinlog  -R --host=10.0.0.52 --user=mha --password=mha --raw  --stop-never mysql-bin.000001 &
[root@m01 binlog2]# ll
total 700
-rw-r----- 1 root root 690913 Jul  4 10:12 mysql-bin.000001
-rw-r----- 1 root root   1108 Jul  4 10:12 mysql-bin.000002
-rw-r----- 1 root root    281 Jul  4 10:12 mysql-bin.000003
-rw-r----- 1 root root    257 Jul  4 10:12 mysql-bin.000004
-rw-r----- 1 root root    257 Jul  4 10:12 mysql-bin.000005
-rw-r----- 1 root root    257 Jul  4 10:12 mysql-bin.000006
-rw-r----- 1 root root    234 Jul  4 10:12 mysql-bin.000007

进行切换后,配置文件内自动清除发生故障的sever,所以启动之前添加配置文件server

准备完毕,启动MHA
[root@m01 ~]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
[2] 4060
[root@m01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:4060) is running(0:PING_OK), master:10.0.0.52

检查binlog server
db02 [(none)]>db02 [(none)]>show master status;
+------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                                                                 |
+------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------+
| mysql-bin.000008 |      234 |              |                  | 8d397b30-8791-11e9-acbd-000c2902fc28:1-17,
c74ed2c4-998e-11e9-b9ee-000c2990b89d:1 |
+------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------+
1 row in set (0.00 sec)
db02 [(none)]>flush logs;
Query OK, 0 rows affected (0.00 sec)
db02 [(none)]>show master status;
+------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                                                                 |
+------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------+
| mysql-bin.000009 |      234 |              |                  | 8d397b30-8791-11e9-acbd-000c2902fc28:1-17,
c74ed2c4-998e-11e9-b9ee-000c2990b89d:1 |
+------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------+
1 row in set (0.00 sec)

已复制成功
[root@m01 ~]# ll /data/binlog2/
total 708
-rw-r----- 1 root root 690913 Jul  4 10:12 mysql-bin.000001
-rw-r----- 1 root root   1108 Jul  4 10:12 mysql-bin.000002
-rw-r----- 1 root root    281 Jul  4 10:12 mysql-bin.000003
-rw-r----- 1 root root    257 Jul  4 10:12 mysql-bin.000004
-rw-r----- 1 root root    257 Jul  4 10:12 mysql-bin.000005
-rw-r----- 1 root root    257 Jul  4 10:12 mysql-bin.000006
-rw-r----- 1 root root    281 Jul  4 10:16 mysql-bin.000007
-rw-r----- 1 root root    281 Jul  4 10:16 mysql-bin.000008
-rw-r----- 1 root root    234 Jul  4 10:16 mysql-bin.000009

至此,MHA的高可用功能,binlog server功能和vip漂移功能,部署并测试成功

 

赞(1)
MySQL学习笔记 » MySQL-MHA高可用技术