[Problem Series] Solutions to the problem of disconnection between consumers and MQ (1)

1. Problem description

When RabbitMQ is used as the message broker and a Python service acts as the consumer, the following situation may occur: after a long period with no message delivery, the connection between the consumer and RabbitMQ is dropped, and new messages are no longer processed. Restarting the Python consumer service restores the connection, but that is only a workaround; the steps below address the cause.

2. Solution steps

To troubleshoot and resolve this issue, work through the following steps:

  1. Connection settings review
  2. Network status check
  3. Consumer code review
  4. RabbitMQ server checks
  5. Monitoring and alarm settings
  6. Version compatibility

2.1 Connection settings review

  • Heartbeat timeout: RabbitMQ uses a heartbeat mechanism. If the broker receives no heartbeat from the consumer within the negotiated timeout, it closes the connection. Make sure the heartbeat interval in your connection settings is reasonable, so that an idle consumer is not misjudged as dead and disconnected.
  • Connection timeout: Check the timeout values in the connection parameters. In pika, socket_timeout limits how long establishing the connection may take, so it should be long enough for the connection to succeed on a slow network; on its own it does not keep an idle connection open.

1. Heartbeat setting example:

import pika

# RabbitMQ server address
rabbitmq_host = 'localhost'

# RabbitMQ server port
rabbitmq_port = 5672

# RabbitMQ virtual host
rabbitmq_virtual_host = '/'

# RabbitMQ username and password
rabbitmq_credentials = pika.PlainCredentials(username='guest', password='guest')

# Build the connection parameters
connection_params = pika.ConnectionParameters(
    host=rabbitmq_host,
    port=rabbitmq_port,
    virtual_host=rabbitmq_virtual_host,
    credentials=rabbitmq_credentials,
    heartbeat=600,  # Heartbeat timeout, in seconds
)

# Open the connection
connection = pika.BlockingConnection(connection_params)

# Open a channel
channel = connection.channel()

# Add your consumer logic here
# ...

# Close the connection
connection.close()
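
To judge whether a heartbeat value such as the 600 seconds above is reasonable, it can help to work out how often heartbeat frames actually travel over the connection. The sketch below assumes, following RabbitMQ's heartbeat documentation, that frames are sent roughly every half of the negotiated timeout. The idle_cutoff argument is a hypothetical figure: the number of seconds after which a firewall, NAT device, or load balancer on your network drops a silent TCP connection. You would need to look that value up for your own environment, and the negotiated timeout may differ from the value the client requests.

```python
def heartbeat_send_interval(heartbeat_timeout):
    """Approximate seconds between heartbeat frames for a given timeout.

    Assumption: frames are sent about every timeout / 2 seconds.
    """
    return heartbeat_timeout / 2


def survives_idle_cutoff(heartbeat_timeout, idle_cutoff):
    """True if heartbeat frames are sent more often than the idle cutoff.

    idle_cutoff is a hypothetical, environment-specific number of seconds
    after which a device between consumer and broker drops a silent link.
    """
    return heartbeat_send_interval(heartbeat_timeout) < idle_cutoff


# Compare a few candidate heartbeat values against an assumed 120 s cutoff.
for timeout in (600, 120, 60):
    print(timeout, heartbeat_send_interval(timeout), survives_idle_cutoff(timeout, 120))
```

If the check fails for your chosen value, a shorter heartbeat is one option to try; it is not the only possible cause of idle disconnects.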

2. Connection timeout example:

import pika

# RabbitMQ server address
rabbitmq_host = 'localhost'

# RabbitMQ server port
rabbitmq_port = 5672

# RabbitMQ virtual host
rabbitmq_virtual_host = '/'

# RabbitMQ username and password
rabbitmq_credentials = pika.PlainCredentials(username='guest', password='guest')

# Connection timeout, in seconds
connection_timeout = 10

# Build the connection parameters
connection_params = pika.ConnectionParameters(
    host=rabbitmq_host,
    port=rabbitmq_port,
    virtual_host=rabbitmq_virtual_host,
    credentials=rabbitmq_credentials,
    connection_attempts=3,  # Number of connection attempts
    retry_delay=5,  # Delay between connection attempts, in seconds
    socket_timeout=connection_timeout,
)

# Open the connection
connection = pika.BlockingConnection(connection_params)

# Open a channel
channel = connection.channel()

# Add your consumer logic here
# ...

# Close the connection
connection.close()

In the example above, socket_timeout is set to connection_timeout, which is the timeout for establishing the connection. Adjust this value to suit your environment. The connection_attempts and retry_delay parameters set the number of connection attempts and the delay between attempts, respectively.

Adjust the connection parameters to your situation and confirm that the timeout behaves as you expect. It should be long enough for the connection to be established even when the network is unstable or the server is busy.
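
When tuning these three parameters together, a rough estimate of how long a failed connect can block the consumer is useful. The sketch below is plain arithmetic, not pika's internal logic: it assumes each attempt fails only after the full socket_timeout and that retry_delay is applied only between attempts. Real behavior can differ, for example when a host name resolves to several addresses or when other timeouts apply. The function name is made up for this illustration.

```python
def worst_case_connect_seconds(connection_attempts, retry_delay, socket_timeout):
    """Rough upper bound on the time spent before giving up on connecting.

    Assumptions: every attempt runs into socket_timeout, and the retry
    delay is waited only between attempts, not after the last one.
    """
    waits = max(connection_attempts - 1, 0) * retry_delay
    return connection_attempts * socket_timeout + waits


# Values from the example above: 3 attempts, 5 s delay, 10 s timeout.
print(worst_case_connect_seconds(3, 5, 10))  # 3*10 + 2*5 = 40
```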

2.2 Network status check

  • Make sure the firewall allows connections to the RabbitMQ service port.
  • Check network stability to rule out connection problems caused by an unstable network.

To check and adjust the firewall rules, assuming RabbitMQ listens on its default port 5672:

1. View existing firewall rules

sudo iptables -L

This lists the current firewall rules. Make sure none of them blocks the RabbitMQ port (5672 by default).

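2. Open the RabbitMQ port

sudo iptables -A INPUT -p tcp --dport 5672 -j ACCEPT

Because -A appends the rule to the end of the INPUT chain, it has no effect if an earlier rule already drops or rejects the traffic. In that case, insert the rule earlier in the chain with -I instead.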
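
Firewall rules on the broker are only one side of the picture, so it is worth confirming from the consumer host that the port is reachable. The following is a minimal sketch using only the standard library; the function name and the 3-second default timeout are choices made for this illustration. It only checks that a TCP connection can be opened and says nothing about AMQP authentication or virtual hosts.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # Assumes RabbitMQ on localhost with the default port.
    print(port_is_reachable("localhost", 5672))
```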

2.3 Consumer code review

  • Make sure there is a robust exception handling mechanism in the consumer code to prevent exceptions from causing connection interruption.
  • Add an automatic reconnection mechanism to ensure that the connection can be reestablished after being disconnected.
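
One way to make the exception handling concrete is to acknowledge messages manually, so that a message is only confirmed after it has been processed. The sketch below is an assumption about how you might structure the callback, not the only approach: handle_message is a hypothetical placeholder for your business logic, and the callback is meant to be registered with basic_consume using auto_ack=False. Rejecting with requeue=False is also a design choice made here to avoid redelivering a message that always fails.

```python
def handle_message(body):
    """Hypothetical business logic; replace with real processing."""
    if not body:
        raise ValueError("empty message body")
    return body.decode("utf-8")


def on_message(ch, method, properties, body):
    """Callback intended for basic_consume(..., auto_ack=False)."""
    try:
        handle_message(body)
    except Exception as exc:
        print(f"Error processing message: {exc}")
        # Reject without requeueing so a failing message is not redelivered forever.
        ch.basic_nack(delivery_tag=method.delivery_tag, requeue=False)
    else:
        # Confirm the message only after processing succeeded.
        ch.basic_ack(delivery_tag=method.delivery_tag)
```

With this pattern, a failure inside the handler does not escape the callback, and an unprocessed message is never silently counted as delivered.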

With automatic reconnection in the consumer code, the service can recover from a dropped connection without a manual restart.

Exception handling and automatic reconnection mechanism:

import pika
import time

def consume_callback(ch, method, properties, body):
    try:
        # Add your message-handling logic here
        print("Received message:", body.decode('utf-8'))
    except Exception as e:
        # Catch and report any error raised while handling the message
        print(f"Error processing message: {str(e)}")

def connect_rabbitmq():
    # Build the connection parameters
    connection_params = pika.ConnectionParameters(
        host=rabbitmq_host,
        port=rabbitmq_port,
        virtual_host=rabbitmq_virtual_host,
        credentials=rabbitmq_credentials,
    )

    while True:
        # Reset on every attempt so the finally block never sees an unbound name
        connection = None
        try:
            # Open the connection
            connection = pika.BlockingConnection(connection_params)

            # Open a channel
            channel = connection.channel()

            # Declare the queue
            channel.queue_declare(queue='your_queue_name', durable=True)

            # Register the consumer callback
            channel.basic_consume(queue='your_queue_name', on_message_callback=consume_callback, auto_ack=True)

            # Start consuming messages
            print('Waiting for messages. To exit press CTRL+C')
            channel.start_consuming()

        except Exception as e:
            # Catch errors raised while connecting or consuming, then retry
            print(f"Error connecting to RabbitMQ: {str(e)}")
            print("Retrying in 5 seconds...")
            time.sleep(5)

        finally:
            # Always close the connection if it is still open
            if connection is not None and connection.is_open:
                connection.close()

# RabbitMQ server address
rabbitmq_host = 'localhost'

# RabbitMQ server port
rabbitmq_port = 5672

# RabbitMQ virtual host
rabbitmq_virtual_host = '/'

# RabbitMQ username and password
rabbitmq_credentials = pika.PlainCredentials(username='guest', password='guest')

if __name__ == "__main__":
    connect_rabbitmq()
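
The loop above retries after a fixed 5 seconds. A common variation, sketched here as an option and not as part of the original solution, is capped exponential backoff, which retries quickly at first and more slowly during a long outage. The function name, defaults, and the optional jitter factor are all choices made for this illustration.

```python
import random


def backoff_delays(base=1.0, cap=60.0, attempts=6, jitter=0.0, rng=random.random):
    """Return a list of retry delays in seconds, doubling up to a cap.

    jitter is the maximum fraction of extra random delay added to each
    value (0.0 disables it); rng must return a float in [0, 1).
    """
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * (2 ** attempt))
        delay += delay * jitter * rng()
        delays.append(delay)
    return delays


print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```

In the reconnect loop, the fixed time.sleep(5) could be replaced by sleeping for the next value from such a list and resetting to the first value after a successful connection.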

Combining the strategies above makes the connection between the consumer and the message queue considerably more stable, so that the system continues to process messages even after long idle periods.

Origin blog.csdn.net/weixin_36755535/article/details/134673267