    Improvement of ProxySQL Monitoring Configuration · 79897566
    Michal Arbet authored
    This update enhances the monitoring of the database cluster
    in ProxySQL. The default monitoring intervals were insufficient
    for reliably detecting failures in the Galera cluster environment.
    
    A detailed configuration for monitoring intervals has been
    introduced, providing better control over how quickly and accurately
    ProxySQL can identify issues.
    
      - Variables such as `mariadb_monitor_connect_interval`,
        `mariadb_monitor_galera_healthcheck_interval`, and
        `mariadb_monitor_ping_interval` significantly reduce
        the time between monitoring checks.
    
      - Timeouts like `mariadb_monitor_galera_healthcheck_timeout`
        and `mariadb_monitor_ping_timeout` allow faster failure
        detection, while `mariadb_monitor_galera_healthcheck_max_timeout_count`
        sets the maximum number of allowed timeouts before marking a node as down.
    
    Calculation:
    
     - Galera healthcheck:
    
       4 seconds (interval) + 1 second (timeout) + 4 seconds (interval)
       + 1 second (timeout) = 10 seconds.
    
     - Ping healthcheck:
    
       3 seconds (interval) + 2 seconds (timeout) + 3 seconds (interval)
       + 2 seconds (timeout) = 10 seconds.
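
The arithmetic above can be reproduced with a minimal sketch. The function
name and the `failed_checks` parameter are hypothetical; the value of 2 is
an assumption inferred from the two interval/timeout rounds in the
calculation, not a setting quoted from the change itself.

```python
def worst_case_detection_seconds(interval_s, timeout_s, failed_checks):
    """Upper bound on the time to mark a node down.

    Each failed check is assumed to cost one full interval (waiting for
    the next check to start) plus one full timeout (waiting for it to fail).
    """
    return failed_checks * (interval_s + timeout_s)


# Assumption: two consecutive failed checks are needed, as in the
# calculation above (interval + timeout, twice).
galera = worst_case_detection_seconds(interval_s=4, timeout_s=1, failed_checks=2)
ping = worst_case_detection_seconds(interval_s=3, timeout_s=2, failed_checks=2)

print(f"Galera healthcheck: {galera} s, ping: {ping} s")
```

Both results come out to 10 seconds, matching the figures in the calculation.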
    
    The Galera health check and the ping check operate independently, and
    each detects a node failure within a maximum of 10 seconds. A failure
    in either mechanism marks the node as failed.
    
    Health Check Failure Detection: Up to 10 seconds.
    Ping Failure Detection: Up to 10 seconds.
    Connect Attempts: ProxySQL also tries to connect every 2 seconds, which
    helps monitor connectivity.
    
    These changes ensure that ProxySQL can detect a failed node within
    10 seconds, the same as haproxy. Compared to the default settings, this
    makes monitoring faster and more reliable and reduces potential downtime
    in production environments.
    
    Change-Id: Ic28801519cdb35ed2387a1468b9df661847a5476