Error ERROR 2013 (HY000): Lost connection to MySQL server during query setting up group replication on Lightsail

I am trying to configure sql group replication between 2 vps, one being a non AWS Lightsail instance and acting as primary.

However I cannot seem to be able to get the Lightsail instance working. It launched without problems on the primary, but when I try to start the group replication on the Lightsail server

START GROUP_REPLICATION; 
ERROR 2013 (HY000): Lost connection to MySQL server during query

(timemout is set to 20)

Ports 3306 and 33061 are open both servers in ufw and also open on the AWS Lightsail’s firewall.

my.cnf on Lightsail looks like this

# Shared replication group configuration
loose-group_replication_group_name = "<uuid>"
loose-group_replication_ip_whitelist = "<primary_static_public_ip>,<lightsail_static_public_ip>"
loose-group_replication_group_seeds = "<primary_static_public_ip>:33061,<lightsail_static_public_ip>:33061"

# Single or Multi-primary mode? Uncomment these two lines
# for multi-primary mode, where any host can accept writes
loose-group_replication_single_primary_mode = OFF
loose-group_replication_enforce_update_everywhere_checks = ON

# Host specific replication configuration
server_id = 2
bind-address = "0.0.0.0"
report_host = "<lightsail_static_public_ip>"
loose-group_replication_local_address = "<lightsail_static_public_ip>:33061"

On the primary I only use the primary’s address and whitelist Lightsail public address. I cannot figure out why it refuses to connect.

I could however connect from Lightsail to primary using mysql -h command and the replication username.

Answer

UPDATE

So, apparently there is no way of having this working remotely without servers being inside the same VPC or local network.

Therefore the only solution is to use the “deprecated” solution (according to mysql official website) of Master/Master replication.

To do so one has to first set-up a usual Master/Slave replication (I will not explain this here as there is plenty of tutorial on how to set it up, eg: https://www.digitalocean.com/community/tutorials/how-to-set-up-master-slave-replication-in-mysql).

Once the replication is in place and works:

on the SLAVE:

  • edit /etc/mysql/my.cnf to add the following lines

    #use public IP address of the server or 0.0.0.0 if you wish to bind it to  
    #more than a single IP address (ie using localhost)
    bind-address = "0.0.0.0" 
    binlog_do_db = db_to_replicate
    binlog_ignore_db = db_to_ignore
    
  • Then if you are not syncing the mysql database you will need to create a slave user on the slave

    CREATE USER 'slave_usr'@'%' IDENTIFIED BY 'password' REQUIRE SSL; GRANT REPLICATION SLAVE ON *.* TO 'slave_usr'@'%';

  • then run service mysql restart

On the MASTER:

  • Get the binlog file and position using: SHOW MASTER STATUS;

  • Run the CHANGE MASTER command: CHANGE MASTER TO MASTER_SSL=1, MASTER_HOST='slave_host_public_ip',MASTER_USER='<slave_usr>', MASTER_PASSWORD='password', MASTER_LOG_FILE='mysql-bin.00000X', MASTER_LOG_POS= binlog_pos;

    you might need run stop slave beforehand.

  • Then run START SLAVE;

You can check that all is working correctly using SHOW SLAVE STATUSG on both servers.

If you bind-address to 0.0.0.0 it is recommended to reject all IP’s that will not need to connect to the servers.

Example using ufw: sudo ufw allow from <ip> to any port 3306