系统变量group_replication_group_seeds为空导致MySQL节点无法启动组复制
MySQL InnoDB Cluster集群中一个节点,在服务器重启过后,启动MySQL实例后,发现status为MISSING,另外memberState为OFFLINE状态。如下所示:
MySQL mysqldbu02:7306 ssl JS > cluster.status()
{
"clusterName": "yssps",
"defaultReplicaSet": {
"name": "default",
"primary": "mysqldbu02:7306",
"ssl": "REQUIRED",
"status": "OK_NO_TOLERANCE_PARTIAL",
"statusText": "Cluster is NOT tolerant to any failures. 1 member is not active.",
"topology": {
"mysqldbu01:7306": {
"address": "mysqldbu01:7306",
"instanceErrors": [
"NOTE: group_replication is stopped."
],
"memberRole": "SECONDARY",
"memberState": "OFFLINE",
"mode": "n/a",
"readReplicas": {},
"role": "HA",
"status": "(MISSING)",
"version": "8.0.35"
},
"mysqldbu02:7306": {
"address": "mysqldbu02:7306",
"memberRole": "PRIMARY",
"mode": "R/W",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
},
"mysqldbu03:7306": {
"address": "mysqldbu03:7306",
"memberRole": "SECONDARY",
"mode": "R/O",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
}
},
"topologyMode": "Single-Primary"
},
"groupInformationSourceMember": "mysqldbu02:7306"
}
MySQL mysqldbu02:7306 ssl JS >
从错误提示信息来看是组复制没有启动(group_replication is stopped)。其实MySQL InnoDB Cluster的节点重启操作在生产环境和测试环境都做过多次,还是第一次遇到这种情况。
另外,在这个节点上查看节点的状态如下:
mysql> select * from performance_schema.replication_group_members order by member_host\G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
MEMBER_ID: 5ad73846-3fce-11ee-a86c-5254003c1c6e
MEMBER_HOST: mysqldbu01
MEMBER_PORT: 7306
MEMBER_STATE: OFFLINE
MEMBER_ROLE:
MEMBER_VERSION:
MEMBER_COMMUNICATION_STACK: MySQL
1 row in set (0.00 sec)
mysql>
具体的错误日志如下所示
2024-11-06T10:03:19.369090+08:00 29 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv4
localhost address to the allowlist. It is mandatory that it is added.'
2024-11-06T10:03:19.369171+08:00 29 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Automatically adding IPv6
localhost address to the allowlist. It is mandatory that it is added.'
2024-11-06T10:03:19.371586+08:00 40 [System] [MY-010597] [Repl] 'CHANGE REPLICATION SOURCE TO FOR CHANNEL 'group_replication_applier
' executed'. Previous state source_host='<NULL>', source_port= 0, source_log_file='', source_log_pos= 4, source_bind=''. New state s
ource_host='<NULL>', source_port= 0, source_log_file='', source_log_pos= 4, source_bind=''.
2024-11-06T10:03:19.391449+08:00 29 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to join the group: peers not configured. '
2024-11-06T10:03:19.391512+08:00 29 [ERROR] [MY-011639] [Repl] Plugin group_replication reported: 'Error on group communication engine start'
2024-11-06T10:03:19.391581+08:00 29 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is leaving a group without being on one.'
关键信息为:[Repl] Plugin group_replication reported: '[GCS] Unable to join the group: peers not configured.
这个错误信息意味着没有配置系统变量group_replication_group_seeds。 另外,手工启动组复制,错误日志依然报同样的错误,执行启动组复制命令的错误信息如下所示:
mysql> start group_replication;
ERROR 3097 (HY000): The START GROUP_REPLICATION command failed as there was an error when joining the communication group.
mysql>
检查发现系统变量group_replication_group_seeds的值为空(my.cnf中没有设置这个系统变量,在mysqld-auto.cnf中系统变量为空值),因为是测试环境,不清楚什么时候动过这个系统变量,,毕竟测试环境经常被折腾来折腾去。
mysql> show variables like 'group_replication_local_address';
+---------------------------------+---------------------+
| Variable_name | Value |
+---------------------------------+---------------------+
| group_replication_local_address | mysqldbu01:7306 |
+---------------------------------+---------------------+
1 row in set (0.01 sec)
mysql>
mysql> show variables like 'group_replication_group_seeds';
+-------------------------------+-------+
| Variable_name | Value |
+-------------------------------+-------+
| group_replication_group_seeds | |
+-------------------------------+-------+
1 row in set (0.00 sec)
mysql> select * from performance_schema.persisted_variables where variable_name='group_replication_group_seeds';
+-------------------------------+----------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+-------------------------------+----------------+
| group_replication_group_seeds | |
+-------------------------------+----------------+
1 row in set (0.01 sec)
mysql>
设置系统变量group_replication_group_seeds的值后,重启组复制就正常了。
mysql> set persist group_replication_group_seeds='mysqldbu02:7306';
Query OK, 0 rows affected (0.01 sec)
mysql> start group_replication;
Query OK, 0 rows affected, 1 warning (11.28 sec)
mysql>
系统变量group_replication_group_seeds为空导致MySQL节点无法启动组复制系统变量group_replication_group_seeds为空导致MySQL节点无法启动组复制系统变量group_replication_group_seeds为空导致MySQL节点无法启动组复制系统变量group_replication_group_seeds为空导致MySQL节点无法启动组复制系统变量group_replication_group_seeds为空导致MySQL节点无法启动组复制系统变量group_replication_group_seeds为空导致MySQL节点无法启动组复制系统变量group_replication_group_seeds为空导致MySQL节点无法启动组复制系统变量group_replication_group_seeds为空导致MySQL节点无法启动组复制系统变量group_replication_group_seeds为空导致MySQL节点无法启动组复制