  1. Apr 13, 2023
    • Remove RabbitMQ ha-all policy when not required · c85b64d1
      Matt Crees authored
      With the addition of the variable
      `om_enable_rabbitmq_high_availability`, this feature in the upgrade
      task should be brought back. It is also now used in the deploy task. The
      `ha-all` policy is cleared only when
      `om_enable_rabbitmq_high_availability` is set to `false`.
      
      Change-Id: Ia056aa40e996b1f0fed43c0f672466c7e4a2f547
  2. Apr 12, 2023
  3. Apr 03, 2023
  4. Mar 29, 2023
  5. Mar 23, 2023
  6. Mar 21, 2023
  7. Mar 15, 2023
  8. Mar 06, 2023
  9. Mar 02, 2023
  10. Mar 01, 2023
  11. Feb 20, 2023
  12. Feb 18, 2023
  13. Feb 14, 2023
    • Fix deploy/genconfig in check mode · 572ff2f8
      Mark Goddard authored
      Previously, running either of the following commands:
      
        kolla-ansible deploy --check
        kolla-ansible genconfig --check
      
      caused deployment or configuration generation to fail for several
      reasons:
      
      * MariaDB failed to look up the existing cluster.
      * Keystone failed to generate the cron config.
      * Nova-cell failed to get the cell settings.
      
      Closes-Bug: #2002661
      Change-Id: I5e765f498ae86d213d0a4379ca5d473db1499962
    • Improve RabbitMQ performance by reducing ha replicas · 6cf22b0c
      John Garbutt authored
      Currently we do not follow the RabbitMQ advice on replicas here:
      https://www.rabbitmq.com/ha.html#replication-factor
      
      Here we reduce the number of replicas to n // 2 + 1 as advised
      above. The hope is that this helps speed up recovery from RabbitMQ
      issues.
      
      Related-Bug: #1954925
      Change-Id: Ib6bcb26c499c9884faa4a0cd51abaec00cacb096
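
The replica arithmetic described in this commit can be sketched as below. This is only an illustration: the function name `ha_replica_count` is hypothetical and does not come from the change itself.

```python
def ha_replica_count(cluster_size: int) -> int:
    """Return a majority-sized replica count, n // 2 + 1, for n nodes."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


if __name__ == "__main__":
    # Print the replica count for a few example cluster sizes.
    for nodes in (1, 3, 5):
        print(nodes, ha_replica_count(nodes))
```

With three RabbitMQ nodes this gives two replicas per queue instead of three.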
    • Add flag to change RabbitMQ ha-mode definition · e13072a9
      Matt Crees authored
      Adds the flag `rabbitmq_ha_replica_count` to change how many different
      nodes a queue should be mirrored across. If the value is not set, then
      it defaults to "ha-mode":"all". This value is unset by default to avoid
      any unexpected changes to the RabbitMQ definitions.json file, as that
      would trigger an unexpected restart of RabbitMQ during the next deploy.
      
      Change-Id: Iee98cd937197a73a3b04aa8501fa325e8ecfff24
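
A minimal sketch of how the flag might map onto a classic mirroring policy. The helper name `ha_policy_definition` is hypothetical, and the use of `"ha-mode": "exactly"` with `"ha-params"` when a count is set is an assumption based on RabbitMQ's documented policy keys, not something stated in the commit message.

```python
from typing import Optional


def ha_policy_definition(replica_count: Optional[int] = None) -> dict:
    """Build the mirroring part of a policy definition (hypothetical helper)."""
    if replica_count is None:
        # Flag unset: keep the existing behaviour of mirroring to all nodes.
        return {"ha-mode": "all"}
    # Assumption: a set count is expressed with RabbitMQ's "exactly" mode.
    return {"ha-mode": "exactly", "ha-params": replica_count}


if __name__ == "__main__":
    print(ha_policy_definition())
    print(ha_policy_definition(2))
```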
    • Use loadbalancer to connect to etcd · e2c7dace
      Will Szumski authored
      Hardcoding the first etcd host creates a single point of failure.
      
      Change-Id: I0f83030fcd84ddcdc4bf2226e76605c7cab84cbb
  14. Feb 13, 2023
    • Put etcd behind HTTP loadbalancer · 6f536a4f
      Will Szumski authored
      
      etcd-compatible tooz drivers do not support multiple endpoints via
      backend_url. We can put a loadbalancer in front of etcd and configure
      backend_url to use the VIP instead. The issue with hard coding the first
      host is that we break coordination if we take this host offline. In the
      case of cinder, we would not be able to perform any volume related
      operations.
      
      Co-Authored-By: Mark Goddard <mark@stackhpc.com>
      Change-Id: Ib684501ba03c386dc5ac71e5cbea05c99f191665
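
To illustrate the idea of pointing `backend_url` at the VIP rather than at the first etcd host, here is a rough sketch. The helper name, the `etcd3+http` scheme, the port 2379 and the example address are all assumptions made for illustration; the commit message does not spell them out.

```python
def etcd_backend_url(vip: str, port: int = 2379, scheme: str = "etcd3+http") -> str:
    """Build a single-endpoint coordination URL that targets the VIP.

    Hypothetical helper: it does not handle IPv6 literals, which would need
    to be wrapped in square brackets.
    """
    return f"{scheme}://{vip}:{port}"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address standing in for the VIP.
    print(etcd_backend_url("192.0.2.10"))
```

Because the URL names only the VIP, taking any single etcd host offline no longer changes the endpoint that services use for coordination.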
  15. Feb 09, 2023
    • RabbitMQ: Support setting ha-promote-on-shutdown · 94f3ce0c
      John Garbutt authored
      By default, ha-promote-on-shutdown=when-synced. However, we are seeing
      RabbitMQ fail to recover automatically when nodes are restarted.
      https://www.rabbitmq.com/ha.html#cluster-shutdown
      
      Rather than waiting for operator intervention, it is better to allow
      recovery to happen, even if that means we may lose some messages.
      A few failed and timed-out operations are better than a totally broken
      cloud. This is achieved using ha-promote-on-shutdown=always.
      
      Note, when a node failure is detected, this is already the default
      behaviour from 3.7.5 onwards:
      https://www.rabbitmq.com/ha.html#promoting-unsynchronised-mirrors
      
      This patch adds the option to change the ha-promote-on-shutdown
      definition, using the flag `rabbitmq_ha_promote_on_shutdown`. This
      value is unset by default to avoid any unexpected changes to the
      RabbitMQ definitions.json file, as that would trigger an unexpected
      restart of RabbitMQ during the next deploy.
      
      Related-Bug: #1954925
      
      Change-Id: I2146bda2c72ddac2c9923c6941b0596395fd9ab5
  16. Feb 07, 2023
  17. Feb 04, 2023
    • Fix kolla_docker module · 63b9fa56
      Michal Arbet authored
      This patch fixes the kolla_docker module, which did not take the
      common_options parameter into account. As the patch set shows, the
      module's default values were always used, even if the user had
      overridden a parameter in the common_options dict.
      
      Closes-Bug: #2003079
      
      Change-Id: I677fde708dd004decaff4bd39f2173d8d81052fb
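
The intended behaviour can be pictured as a layered merge. The sketch below is not the module's code: the function name, the example parameter names and values, and the rule that an explicit task parameter beats `common_options` are assumptions for illustration.

```python
def effective_params(task_params: dict, common_options: dict, defaults: dict) -> dict:
    """Merge parameters so later layers override earlier ones (hypothetical)."""
    merged = dict(defaults)        # lowest priority: module defaults
    merged.update(common_options)  # user overrides shared by all tasks
    merged.update(task_params)     # assumed highest priority: per-task values
    return merged


if __name__ == "__main__":
    # Illustrative values only.
    defaults = {"restart_policy": "unless-stopped", "graceful_timeout": 10}
    common = {"graceful_timeout": 60}
    print(effective_params({}, common, defaults))
```

The bug described above corresponds to skipping the middle layer, so that the value from `defaults` always won.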
  18. Feb 03, 2023
  19. Feb 02, 2023
  20. Jan 31, 2023
  21. Jan 30, 2023
  22. Jan 29, 2023
  23. Jan 26, 2023
  24. Jan 23, 2023
    • Adding optional delay between l3 agent restarts · 391aa467
      Alex-Welsh authored
      This change serialises the neutron l3 agent restart process and adds a
      user configurable delay between restarts. This can prevent connectivity
      loss due to all agents being restarted at the same time.
      
      The more routers an agent hosts, the longer it takes to recover,
      which makes this issue more prevalent.
      
      Change-Id: I3be0ebfa12965e6ae32d1b5f13f8fd23c3f52b8c
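
The serialised restart with a configurable delay can be sketched as a simple loop. The change itself is implemented in Ansible; this Python version only models the control flow, and the names `restart_serially`, `restart` and `delay_seconds` are hypothetical.

```python
import time
from typing import Callable, Iterable


def restart_serially(
    hosts: Iterable[str],
    restart: Callable[[str], None],
    delay_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Restart one host at a time, pausing between consecutive restarts."""
    host_list = list(hosts)
    for index, host in enumerate(host_list):
        restart(host)
        # No pause is needed after the last host has been restarted.
        if delay_seconds > 0 and index < len(host_list) - 1:
            sleep(delay_seconds)


if __name__ == "__main__":
    restart_serially(["net1", "net2"], lambda h: print("restarting", h), 0.1)
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.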
  25. Jan 20, 2023
    • Set scheduler.max_attempts for nova conductor · 0b62db7c
      Stanislav Dmitriev authored
      To honour the configured maximum number of attempts, the option
      must be present in nova.conf inside the nova_conductor container;
      otherwise the default value of 3 is used.
      
      Closes-Bug: #2003587
      Change-Id: I928af332b8658223444594f96417830233057284
  26. Jan 19, 2023
  27. Jan 17, 2023
  28. Jan 16, 2023
  29. Jan 13, 2023
    • Add a flag to handle RabbitMQ high availability · 09df6fc1
      Matt Crees authored
      A combination of durable queues and classic queue mirroring can be used
      to provide high availability of RabbitMQ. However, these options should
      only be used together, otherwise the system will become unstable. Using
      the flag ``om_enable_rabbitmq_high_availability`` will either enable
      both options at once, or neither of them.
      
      There are some queues that should not be mirrored:
      
      * ``reply`` queues (these have a single consumer and TTL policy)
      * ``fanout`` queues (these have a TTL policy)
      * ``amq`` queues (these are auto-delete queues, with a single consumer)
      
      An exclusionary pattern is used in the classic mirroring policy. This
      pattern is ``^(?!(amq\\.)|(.*_fanout_)|(reply_)).*``
      
      Change-Id: I51c8023b260eb40b2eaa91bd276b46890c215c25
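
The effect of the exclusionary pattern can be checked with a short script. This sketch assumes the doubled backslash in the commit message is an escaping artefact, so the regex itself contains `amq\.`; the queue names and the helper `is_mirrored` are made up for illustration. RabbitMQ evaluates the pattern itself, and Python's `re` is used here only because it supports the same negative lookahead syntax.

```python
import re

# Pattern from the commit message, written with a single regex-level backslash.
MIRROR_PATTERN = re.compile(r"^(?!(amq\.)|(.*_fanout_)|(reply_)).*")


def is_mirrored(queue_name: str) -> bool:
    """Return True if the policy pattern would apply to this queue name."""
    return MIRROR_PATTERN.match(queue_name) is not None


if __name__ == "__main__":
    # Hypothetical queue names covering each excluded category and one match.
    for name in ("conductor", "amq.gen-abc", "reply_0123", "scheduler_fanout_0123"):
        print(name, is_mirrored(name))
```

The negative lookahead rejects any name that starts with `amq.` or `reply_`, or that contains `_fanout_`, and lets every other queue name match.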
  30. Jan 12, 2023
    • Fix prechecks in check mode · 46aeb984
      Mark Goddard authored
      When running in check mode, some prechecks previously failed because
      they use the command module, which is silently skipped in check mode.
      Other prechecks did not run correctly in check mode because, for
      example, they searched for a string in empty command output or did not
      query which containers are running.
      
      This change fixes these issues.
      
      Closes-Bug: #2002657
      Change-Id: I5219cb42c48d5444943a2d48106dc338aa08fa7c
  31. Jan 11, 2023