pc.ignore_sb=true: node still becomes non-primary during a split brain, although I added this parameter to keep it working #643

Open
WarnAndFine opened this issue Jul 13, 2023 · 0 comments

Comments

@WarnAndFine

I read the documentation for this parameter. I have a cluster of three machines, and all three are configured with it. When I disabled the network interface on one of the machines, its status changed from primary to non-primary and it could no longer serve reads or writes. Every statement fails with this error:

ERROR 1047 (08S01): WSREP has not yet prepared node for application use

Querying the provider options shows that the parameter is set:

show variables like 'wsrep_provider_options';
pc.ignore_sb = true

environment:

mariadb:mariadb-10.11.2-linux-systemd-x86_64
galera:galera-4-26.4.14-1.el7.centos.x86_64
system: Linux 3.10.0-1127.el7.x86_64 centos7

Log from the node after its interface was shut down:

2023-07-13 15:34:01 0 [Note] WSREP: (c918c367-908e, 'tcp://0.0.0.0:4567') connection to peer b28c6e35-ab02 with addr tcp://192.168.1.179:4567 timed out, no messages seen in PT3S, socket stats: rtt: 8982 rttvar: 15039 rto: 209000 lost: 0 last_data_recv: 3323 cwnd: 10 last_queued_since: 322666886 last_delivered_since: 3322090455 send_queue_length: 0 send_queue_bytes: 0 segment: 0 messages: 0
2023-07-13 15:34:01 0 [Note] WSREP: (c918c367-908e, 'tcp://0.0.0.0:4567') connection to peer 4c100681-8496 with addr tcp://192.168.1.74:4567 timed out, no messages seen in PT3S, socket stats: rtt: 9888 rttvar: 16262 rto: 210000 lost: 0 last_data_recv: 3323 cwnd: 10 last_queued_since: 6209 last_delivered_since: 3322191023 send_queue_length: 1 send_queue_bytes: 212 segment: 0 messages: 1
2023-07-13 15:34:01 0 [Note] WSREP: Deferred close timer started for socket with remote endpoint: tcp://192.168.1.74:4567
2023-07-13 15:34:01 0 [Note] WSREP: (c918c367-908e, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://192.168.1.179:4567 tcp://192.168.1.74:4567
2023-07-13 15:34:01 0 [Note] WSREP: Deferred close timer handle_wait Operation aborted. for 0x7fa5200019e8
2023-07-13 15:34:01 0 [Note] WSREP: Deferred close timer destruct
2023-07-13 15:34:03 0 [Note] WSREP: (c918c367-908e, 'tcp://0.0.0.0:4567') reconnecting to b28c6e35-ab02 (tcp://192.168.1.179:4567), attempt 0
2023-07-13 15:34:03 0 [Note] WSREP: (c918c367-908e, 'tcp://0.0.0.0:4567') reconnecting to 4c100681-8496 (tcp://192.168.1.74:4567), attempt 0
2023-07-13 15:34:03 0 [Note] WSREP: evs::proto(c918c367-908e, OPERATIONAL, view_id(REG,4c100681-8496,20)) suspecting node: 4c100681-8496
2023-07-13 15:34:03 0 [Note] WSREP: evs::proto(c918c367-908e, OPERATIONAL, view_id(REG,4c100681-8496,20)) suspected node without join message, declaring inactive
2023-07-13 15:34:03 0 [Note] WSREP: evs::proto(c918c367-908e, OPERATIONAL, view_id(REG,4c100681-8496,20)) suspecting node: b28c6e35-ab02
2023-07-13 15:34:03 0 [Note] WSREP: evs::proto(c918c367-908e, OPERATIONAL, view_id(REG,4c100681-8496,20)) suspected node without join message, declaring inactive
2023-07-13 15:34:04 0 [Note] WSREP: view(view_id(NON_PRIM,4c100681-8496,20) memb {
c918c367-908e,0
} joined {
} left {
} partitioned {
4c100681-8496,0
b28c6e35-ab02,0
})
2023-07-13 15:34:04 0 [Note] WSREP: view(view_id(NON_PRIM,c918c367-908e,21) memb {
c918c367-908e,0
} joined {
} left {
} partitioned {
4c100681-8496,0
b28c6e35-ab02,0
})
2023-07-13 15:34:04 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2023-07-13 15:34:04 0 [Note] WSREP: Flow-control interval: [16, 16]
2023-07-13 15:34:04 0 [Note] WSREP: Received NON-PRIMARY.
2023-07-13 15:34:04 0 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 1089)
2023-07-13 15:34:04 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2023-07-13 15:34:04 0 [Note] WSREP: Flow-control interval: [16, 16]
2023-07-13 15:34:04 0 [Note] WSREP: Received NON-PRIMARY.
2023-07-13 15:34:04 2 [Note] WSREP: ================================================
View:
id: 16797e05-1ed5-11ee-bc52-9b6c8a07bc93:1089
status: non-primary
protocol_version: 4
capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
final: no
own_index: 0
members(1):
0: c918c367-2149-11ee-908e-b7c35a520d3d, test2

2023-07-13 15:34:04 2 [Note] WSREP: Non-primary view
2023-07-13 15:34:04 2 [Note] WSREP: Server status change synced -> connected
2023-07-13 15:34:04 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2023-07-13 15:34:04 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2023-07-13 15:34:04 2 [Note] WSREP: ================================================
View:
id: 16797e05-1ed5-11ee-bc52-9b6c8a07bc93:1089
status: non-primary
protocol_version: 4
capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
final: no
own_index: 0
members(1):
0: c918c367-2149-11ee-908e-b7c35a520d3d, test2

2023-07-13 15:34:04 2 [Note] WSREP: Non-primary view
2023-07-13 15:34:04 2 [Note] WSREP: Server status change connected -> connected
2023-07-13 15:34:04 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2023-07-13 15:34:04 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2023-07-13 15:34:24 12 [ERROR] Slave I/O: error connecting to master '[email protected]:3306' - retry-time: 60 maximum-retries: 100000 message: Can't connect to server on '192.168.1.179' (101 "Network is unreachable"), Internal MariaDB error code: 2003

My configuration file is as follows:

server.cnf
[mysqld]
datadir=/home/mariadb_data
socket=/usr/local/mariadb/socket/mysql.sock
bind-address=0.0.0.0
user=mysql
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
binlog_format=ROW
log-error=/usr/local/mariadb/log/mysqld.log
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_node_name='test2'
wsrep_node_address="192.168.1.193"
wsrep_cluster_name='galera-cluster'
wsrep_cluster_address="gcomm://192.168.1.179,191.168.1.193,192.168.1.74"
wsrep_provider_options="gcache.size=1G"
wsrep_slave_threads=4
wsrep_sst_method=rsync
wsrep_provider_options="pc.ignore_sb=TRUE"
#pc.ignore_sb=true
#wsrep_provider_options="pc.ignore_quorum=true"
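
Two things in the file above may be worth ruling out; this is a sketch under stated assumptions, not a confirmed diagnosis. First, wsrep_provider_options is a single semicolon-separated string, and when the same option appears twice in a config file the later line normally overrides the earlier one, so gcache.size=1G is probably being dropped. Second, the middle entry in wsrep_cluster_address (191.168.1.193) differs from wsrep_node_address (192.168.1.193) and may be a typo. The fragment assumes both are unintended:

```ini
[galera]
# Assumption: the second address was meant to match wsrep_node_address.
wsrep_cluster_address="gcomm://192.168.1.179,192.168.1.193,192.168.1.74"

# Assumption: both provider options are wanted, so they are merged into
# one semicolon-separated string instead of two separate lines.
wsrep_provider_options="gcache.size=1G;pc.ignore_sb=TRUE"
```

Neither change is known to explain the non-primary state shown in the log; they only remove two possible sources of confusion before testing again.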

My question: as I understand the documentation, with this parameter set the nodes should still be able to insert and modify data during a split brain, but with the configuration above that is not possible. If convenient, please offer some suggestions or references.
