刚处理完“挖矿”事件,在做最后一个MySQL
NBU备份的时候,发现从库有问题,好奇的是怎么主从状态异常没有告警呢?先不管这么多了,处理了这个问题再完善告警内容。

 

站内信:网站运营人员可以向单个(单发)或多个(群发)用户推送消息,站内信其实就是向数据库中插入一条条记录。

一、错误信息

Preface

数据库设计:

message【站内信发件箱表】{每次运营人员发送一条站内信(无论是单发或是群发)就会在发件箱表中添加一条记录}

 图片 1

 表字段:ID、SendID、RecID、MessageID、Statue、DateTime

messagetext【站内信发件内容表】{每次运营人员发送站内信的内容就会保存到站内信发件内容表中}

图片 2

表字段:ID、Title、Message、PData

messageuserinfo【站内信接收箱表】{用户站内信收件箱}

图片 3

表字段:ID、SendID、RecID、MessageID、Statue、DateTime

 

从库show slave status \G看到的错误信息如下:

 

业务逻辑:在运营人员发送了一条站内信后,当用户再次登录时就会刷新历史站内信,而那些不活跃的用户就不会更新站内信。

场景:运营人员向某个(单发)或多个(群发)用户发送站内信

添加发件内容:向messagetext站内信发件内容表中添加一条内容记录
{这里先添加messagetext站内信发件内容表产生一条记录,后面message站内信发件箱表中才有MessageID可以关联}

添加发件人发件记录:向message站内信发件箱表中添加一条发送记录
{备注:如果是单发RecID则是指定用户的UserID,如果是群发RecID则是0。(0表示所有用户)}

用户登录时刷新与自己有关的站内信,并将数据添加到messageuserinfo站内信接收箱表

  1. 查询条件:【发件人不是自己】并且【收件人是自己或者收件人是所有人】并且【自己收件表中不存在的记录】

    select m.id from Message as m where (m.recid='所有人' or m.recid='自己的ID') and m.sendid!='自己的ID' and m.MessageID not in (select u.messageid from MessageUserInfo as u  where u.recid='自己的ID')
    

      

  2. 将相关的站内信消息添加到自己的收件表中,并标记未读。

    1.   将上面【查询条件】中查询出来的id信息用做查询message站内信发件箱表的条件

      select * from message  where id=上面的m.id
      

        

    2. 将获取到message站内信发件箱表数据添加到messageuserinfo站内信接收箱表

  3. 查询出自己的站内信记录

    select u.ID,u.RecID,m.Title,m.Message,u.Statue,m.PDate from messageuserinfo as u join messagetext as m on u.MessageID = m.ID where u.RecID='自己的ID'
    

      

     

Slave_IO_Running: No
Slave_SQL_Running: Yes
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'mysql-bin.000081' at 480141113, the last event read from './mysql-bin.000081' at 4, the last byte read from './mysql-bin.000081' at 4.'

    MasterHA is a tool which can be used
in MySQL HA architecture.I’m gonna implement it and have some tests
depend on it later.

二、错误原因

 

这里看到从库的io_thread已经终止,错误编号是1236,具体是由于读取主库的binlog日志位置(the first event ‘mysql-bin.000081’ at 480141113, the
last event read from ‘./mysql-bin.000081’ at
4)不对导致主从失败建立失败。

Framework

三、解决方案

 

1.检查从库状态以及读取、执行的binlog信息

Hostname IP Port Identity OS Version MySQL Version
zlm2 192.168.1.101 3306 master CentOS 7.0 5.7.21
zlm3 192.168.1.102 3306 slave/mha-manager CentOS 7.0 5.7.21
null 192.168.1.200 null vip null null
mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: xx.xx.xx.xx
                  Master_User: username
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000081
          Read_Master_Log_Pos: 480141113
               Relay_Log_File: mysql9017-relay-bin.000163
                Relay_Log_Pos: 480141259
        Relay_Master_Log_File: mysql-bin.000081
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 480141113
              Relay_Log_Space: 480141462
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'mysql-bin.000081' at 480141113, the last event read from './mysql-bin.000081' at 4, the last byte read from './mysql-bin.000081' at 4.'
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 17
1 row in set (0.00 sec)

 

2.查看主库的binlog内容

Procedure

[backup]# mysqlbinlog  mysql-bin.000081 >mysql-bin.log

 

图片 4

Downlaod the rpm package of MasterHA using
below web link first:

看到主库binlog日志mysql-bin.000081最大的pos为480140557,但从库要读取的是’mysql-bin.000081′ at
480141113,显然从库要读的pos值比主库本身存在的pos值大,导致读取不到,进而失败。

https://code.google.com/archive/p/mysql-master-ha/downloads

可通过下面语句查看binlog的pos信息和日志内容
mysql> show binlog events in  ‘mysql-bin.000081’ from 480140557 limit
10;       
Empty set (0.04 sec)
3.更改从库的同步位置,完成数据重新同步

MasterHA’s
wiki document.

发表评论

电子邮件地址不会被公开。 必填项已用*标注