Description
We have simplified the environment down to a single container running on the Docker server, with the mysql client accessing the mysql container.
All configuration changes recommended by Atlassian have been applied, as the initial environment was running JIRA, Confluence and MySQL containers. MySQL is configured with an 8 hour timeout for connections.
We are seeing connections being dropped in the mysql client ("No connection. Trying to reconnect..."). There is no pattern to how long a connection survives before this occurs; it varies arbitrarily from 5 minutes to 50 minutes.
[root@ost-clb-atl-dmc-c01 ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.43 MySQL Community Server (GPL)
Copyright (c) 2000, 2024, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> select now();
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Connection id: 3
Current database: *** NONE ***
Docker Network
[root@ost-clb-atl-dmc-c01 mysql]# docker network ls
NETWORK ID NAME DRIVER SCOPE
3352cad49ba7 bridge bridge local
bf9ba81a8e28 docker_gwbridge bridge local
cc040dee87d0 host host local
v8zswgiyduo0 ingress overlay swarm
r04hth47im8z mysql-private overlay swarm
1a1e3f13b5e6 none null local
Docker Containers
[root@ost-clb-atl-dmc-c01 mysql]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
52c2c7d87f92 nexus.ostravam.corp.telstra.com:5000/mysql:5.7.43 "docker-entrypoint.s…" 4 seconds ago Up 3 seconds 3306/tcp, 33060/tcp mysql_mysql.1.5ltdc2y8k1ju4r7kv6l8mrc36
Network Logging
gwbridge network receives a reset packet for the initial connection that was established (AEST timezone)
16:30:18.913810 IP 172.31.1.1.51658 > 172.31.1.2.3306: Flags [.], ack 3827, win 1409, options [nop,nop,TS val 1710428538 ecr 3304546861], length 0
16:30:24.370359 ARP, Request who-has 172.31.1.2 tell 172.31.1.1, length 28
16:30:24.370373 ARP, Request who-has 172.31.1.1 tell 172.31.1.2, length 28
16:30:24.370379 ARP, Reply 172.31.1.1 is-at 02:42:d3:a7:b4:1e, length 28
16:30:24.370390 ARP, Reply 172.31.1.2 is-at 02:42:ac:1f:01:02, length 28
16:55:21.152848 IP 172.31.1.1.51658 > 172.31.1.2.3306: Flags [P.], seq 1251:1297, ack 3827, win 1409, options [nop,nop,TS val 1711930777 ecr 3304546861], length 46
16:55:21.152904 IP 172.31.1.2.3306 > 10.145.247.114.51658: Flags [R], seq 724319931, win 0, length 0
16:55:21.154149 IP 172.31.1.1.38042 > 172.31.1.2.3306: Flags [S], seq 1145254527, win 43690, options [mss 65495,sackOK,TS val 1711930778 ecr 0,nop,wscale 7], length 0
ingress network also receives a reset packet for the initial connection that was established (UTC timezone)
06:30:18.913817 eth1 In IP 172.31.1.1.51658 > 172.31.1.2.3306: Flags [.], ack 3827, win 1409, options [nop,nop,TS val 1710428538 ecr 3304546861], length 0
06:30:18.913824 eth0 Out IP 10.0.0.2.51658 > 10.0.0.4.3306: Flags [.], ack 3827, win 1409, options [nop,nop,TS val 1710428538 ecr 3304546861], length 0
06:30:24.370350 eth0 Out ARP, Request who-has 10.0.0.4 tell 10.0.0.2, length 28
06:30:24.370353 eth1 Out ARP, Request who-has 172.31.1.1 tell 172.31.1.2, length 28
06:30:24.370377 eth1 In ARP, Request who-has 172.31.1.2 tell 172.31.1.1, length 28
06:30:24.370382 eth1 Out ARP, Reply 172.31.1.2 is-at 02:42:ac:1f:01:02, length 28
06:30:24.370387 eth1 In ARP, Reply 172.31.1.1 is-at 02:42:d3:a7:b4:1e, length 28
06:30:24.370389 eth0 In ARP, Request who-has 10.0.0.2 tell 10.0.0.4, length 28
06:30:24.370391 eth0 Out ARP, Reply 10.0.0.2 is-at 02:42:0a:00:00:02, length 28
06:30:24.370394 eth0 In ARP, Reply 10.0.0.4 is-at 02:42:0a:00:00:04, length 28
06:55:21.152860 eth1 In IP 172.31.1.1.51658 > 172.31.1.2.3306: Flags [P.], seq 1252:1298, ack 3827, win 1409, options [nop,nop,TS val 1711930777 ecr 3304546861], length 46
06:55:21.152887 eth1 Out IP 172.31.1.2.3306 > 172.31.1.1.51658: Flags [R], seq 724319931, win 0, length 0
06:55:21.154154 eth1 In IP 172.31.1.1.38042 > 172.31.1.2.3306: Flags [S], seq 1145254527, win 43690, options [mss 65495,sackOK,TS val 1711930778 ecr 0,nop,wscale 7], length 0
06:55:21.154180 eth0 Out IP 10.0.0.2.38042 > 10.0.0.4.3306: Flags [S], seq 1145254527, win 43690, options [mss 65495,sackOK,TS val 1711930778 ecr 0,nop,wscale 7], length 0
On the mysql container network the initial connection is not dropped; a new connection is simply established
06:30:18.913614 IP 10.0.0.2.51658 > 10.0.0.4.3306: Flags [P.], seq 1206:1252, ack 3727, win 1409, options [nop,nop,TS val 1710428538 ecr 3303806990], length 46
06:30:18.913760 IP 10.0.0.4.3306 > 10.0.0.2.51658: Flags [P.], seq 3727:3827, ack 1252, win 244, options [nop,nop,TS val 3304546861 ecr 1710428538], length 100
06:30:18.913827 IP 10.0.0.2.51658 > 10.0.0.4.3306: Flags [.], ack 3827, win 1409, options [nop,nop,TS val 1710428538 ecr 3304546861], length 0
06:30:24.370362 ARP, Request who-has 10.0.0.2 tell 10.0.0.4, length 28
06:30:24.370384 ARP, Request who-has 10.0.0.4 tell 10.0.0.2, length 28
06:30:24.370388 ARP, Reply 10.0.0.4 is-at 02:42:0a:00:00:04, length 28
06:30:24.370394 ARP, Reply 10.0.0.2 is-at 02:42:0a:00:00:02, length 28
06:55:21.154195 IP 10.0.0.2.38042 > 10.0.0.4.3306: Flags [S], seq 1145254527, win 43690, options [mss 65495,sackOK,TS val 1711930778 ecr 0,nop,wscale 7], length 0
06:55:21.154218 IP 10.0.0.4.3306 > 10.0.0.2.38042: Flags [S.], seq 2416441610, ack 1145254528, win 27960, options [mss 1410,sackOK,TS val 3306049101 ecr 1711930778,nop,wscale 7], length 0
06:55:21.154278 IP 10.0.0.2.38042 > 10.0.0.4.3306: Flags [.], ack 1, win 342, options [nop,nop,TS val 1711930778 ecr 3306049101], length 0
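To put a number on how long the original connection sat idle before the reset, the two packets for client port 51658 in the ingress capture above can be compared directly. The following is only a rough sketch: the helper names `parse` and `idle_gap` are made up for illustration, and it assumes both lines come from the same day so that the clock times can be subtracted without a date.

```python
from datetime import datetime
import re

# Two lines copied verbatim from the ingress capture (UTC) for client port 51658:
# the last ACK seen on the original connection, and the next packet the client
# sent, which was answered with the RST.
CAPTURE = """\
06:30:18.913817 eth1 In IP 172.31.1.1.51658 > 172.31.1.2.3306: Flags [.], ack 3827, win 1409, options [nop,nop,TS val 1710428538 ecr 3304546861], length 0
06:55:21.152860 eth1 In IP 172.31.1.1.51658 > 172.31.1.2.3306: Flags [P.], seq 1252:1298, ack 3827, win 1409, options [nop,nop,TS val 1711930777 ecr 3304546861], length 46
"""

# Capture the wall-clock stamp at the start of the line and the TCP "TS val" option.
LINE_RE = re.compile(r"^(\d{1,2}:\d{2}:\d{2}\.\d{6}) .*TS val (\d+) ecr")


def parse(line):
    """Return (wall-clock time, TCP timestamp value) for one tcpdump line."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("unrecognised tcpdump line: " + line)
    stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
    return stamp, int(match.group(2))


def idle_gap(capture):
    """Return (seconds, TS val ticks) between the first and last line given."""
    rows = [parse(line) for line in capture.splitlines() if line.strip()]
    (first_time, first_ts), (last_time, last_ts) = rows[0], rows[-1]
    # Assumption: both lines are from the same day, so no midnight rollover.
    return (last_time - first_time).total_seconds(), last_ts - first_ts


if __name__ == "__main__":
    seconds, ticks = idle_gap(CAPTURE)
    print("idle for %.3f s (%.1f min), TS val advanced by %d" % (seconds, seconds / 60, ticks))
```

For these two lines the gap works out to about 1502 seconds, roughly 25 minutes, during which no traffic for this connection appears in the capture. The client's TS val advances by a matching amount, which suggests the timestamps in the capture are consistent with each other.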
Any ideas on what is causing these dropouts and how to remedy them would be appreciated.