  1. Feb 15, 2024
  2. Feb 07, 2024
    • Michal Arbet's avatar
      Fix horizon deployment · 4108aea8
      Michal Arbet authored
       The new Horizon release uses [1] as its cache backend
       instead of [2], which was used in previous versions.
      
      This patch:
      
       1. Removes the override from the config and configures
          only the memcached endpoints, not the backend
          specification itself. This avoids future bugs
          should the BACKEND be switched again.
      
       2. Removes the 'memcached' context from the kolla_address
          filter and uses 'url' instead, as [1] does not support
          inet6:[{address}] for IPv6 but does support
          [{address}], which 'url' provides.
      
      [1] django.core.cache.backends.memcached.PyMemcacheCache
      [2] django.core.cache.backends.memcached.MemcachedCache
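A minimal sketch (not the actual kolla-ansible template; addresses are hypothetical) of what the resulting Django settings look like after this change: only the memcached endpoints are configured, and the backend specification is left to Horizon's own defaults.

```python
# Sketch of Horizon's Django CACHES setting as kolla-ansible now renders it.
CACHES = {
    "default": {
        # BACKEND is intentionally omitted: Horizon's own defaults select
        # PyMemcacheCache, so a future backend switch needs no change here.
        # IPv6 endpoints use the plain [{address}]:port form produced by the
        # 'url' filter, not inet6:[{address}], which PyMemcacheCache does
        # not understand.
        "LOCATION": ["[fd00::10]:11211", "[fd00::11]:11211"],
    }
}
```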
      
      Change-Id: Ie3a8f47e7b776b6aa2bb9b1522fdd4514ea1484b
      4108aea8
    • Michal Arbet's avatar
      Rework horizon role to support local_settings.d · b5aa63de
      Michal Arbet authored
       This patch implements Horizon's preferred way of
       configuring itself, as described in the docs [1].
      
      [1] https://docs.openstack.org/horizon/latest/configuration/settings.html
      
      Depends-On: https://review.opendev.org/c/openstack/kolla/+/906339
      Change-Id: I60ab4634bf4333c47d00b12fc4ec00570062bd18
      b5aa63de
    • Michal Nasiadka's avatar
      openvswitch: Set fail_mode to standalone for external bridges · 5016b3ef
      Michal Nasiadka authored
       That is the ovs-vsctl default, but the Ansible module fails in the
       reconfigure step, and 'secure' breaks external connectivity in
       OVN.
      
      From OVS docs:
      fail_mode: optional string, either secure or standalone
      
       When a controller is configured, it is, ordinarily, responsible
       for setting up all flows on the switch. Thus, if the connection
       to the controller fails, no new network connections can be set
       up. If the connection to the controller stays down long enough,
       no packets can pass through the switch at all. This setting
       determines the switch's response to such a situation. It may be
       set to one of the following:
       
       standalone
           If no message is received from the controller for three
           times the inactivity probe interval (see inactivity_probe),
           then Open vSwitch will take over responsibility for setting
           up flows. In this mode, Open vSwitch causes the bridge to
           act like an ordinary MAC-learning switch. Open vSwitch will
           continue to retry connecting to the controller in the
           background and, when the connection succeeds, it will
           discontinue its standalone behavior.
       
       secure
           Open vSwitch will not set up flows on its own when the
           controller connection fails or when no controllers are
           defined. The bridge will continue to retry connecting to
           any defined controllers forever.
       
       The default is standalone if the value is unset, but future
       versions of Open vSwitch may change the default.
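The equivalent manual configuration can be sketched with ovs-vsctl (the bridge name br-ex is illustrative):

```shell
# Set the external bridge back to the ovs-vsctl default fail mode.
ovs-vsctl set-fail-mode br-ex standalone
# Inspect the currently configured fail mode.
ovs-vsctl get-fail-mode br-ex
```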
      
      Change-Id: Ica4dda2914113e8f8349e7227161cb81a02b33ee
      5016b3ef
  3. Feb 05, 2024
  4. Jan 31, 2024
  5. Jan 30, 2024
  6. Jan 29, 2024
    • Alex-Welsh's avatar
      Update keystone service user passwords · ffd6e3bf
      Alex-Welsh authored
      Service user passwords will now be updated in keystone if services are
      reconfigured with new passwords set in config. This behaviour can be
      overridden.
      
      Closes-Bug: #2045990
      Change-Id: I91671dda2242255e789b521d19348b0cccec266f
      ffd6e3bf
  7. Jan 24, 2024
  8. Jan 17, 2024
    • Matt Crees's avatar
      Fix OpenSearch upgrade tasks idempotency · e502b65b
      Matt Crees authored
      Shard allocation is disabled at the start of the OpenSearch upgrade
      task. This is set as a transient setting, meaning it will be removed
       once the containers are restarted. However, if there is no change in
       the OpenSearch container, it will not be restarted, so the cluster is
       left in a broken state: unable to allocate shards.
      
      This patch moves the pre-upgrade tasks to within the handlers, so shard
      allocation and the flush are only performed when the OpenSearch
      container is going to be restarted.
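The transient setting in question can be sketched with the standard OpenSearch cluster settings API (endpoint illustrative); because it is transient, it disappears on restart, which is why it must be paired with an actual container restart:

```shell
# Disable shard allocation (except primaries) as a transient setting.
curl -XPUT http://localhost:9200/_cluster/settings \
  -H 'Content-Type: application/json' \
  -d '{"transient": {"cluster.routing.allocation.enable": "primaries"}}'
```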
      
      Closes-Bug: #2049512
      Change-Id: Ia03ba23bfbde7d50a88dc16e4f117dec3c98a448
      e502b65b
  9. Jan 11, 2024
  10. Jan 08, 2024
    • Pierre Riteau's avatar
      Fix Nova scp failures on Debian Bookworm · bfa9dd97
      Pierre Riteau authored
      The addition of an instance resize operation [1] to CI testing is
      triggering a failure in kolla-ansible-debian-ovn jobs, which are using a
      nodeset with multiple nodes:
      
          oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
          Command: scp -r /var/lib/nova/instances/8ca2c7e8-acae-404c-af7d-6cac38e354b8_resize/disk 192.0.2.2:/var/lib/nova/instances/8ca2c7e8-acae-404c-af7d-6cac38e354b8/disk
          Exit code: 255
          Stdout: ''
          Stderr: "Warning: Permanently added '[192.0.2.2]:8022' (ED25519) to the list of known hosts.\r\nsubsystem request failed on channel 0\r\nscp: Connection closed\r\n"
      
      This is not seen on Ubuntu Jammy, which uses OpenSSH 8.9, while Debian
      Bookworm uses OpenSSH 9.2. This is likely related to this change in
      OpenSSH 9.0 [2]:
      
          This release switches scp(1) from using the legacy scp/rcp protocol
          to using the SFTP protocol by default.
      
       Configure the sftp subsystem as on RHEL9 derivatives. Even though it
       is not yet required for Ubuntu, we configure it there too so we are
       ready for the Noble release.
      
      [1] https://review.opendev.org/c/openstack/kolla-ansible/+/904249
      [2] https://www.openssh.com/txt/release-9.0
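The fix amounts to an sshd_config fragment along these lines (the sftp-server paths are distro-specific and shown here as illustration):

```
# RHEL9 derivatives:
Subsystem sftp /usr/libexec/openssh/sftp-server
# Debian/Ubuntu typically use:
# Subsystem sftp /usr/lib/openssh/sftp-server
```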
      
      Closes-Bug: #2048700
      Change-Id: I9f1129136d7664d5cc3b57ae5f7e8d05c499a2a5
      bfa9dd97
    • Michal Arbet's avatar
      Enable glance proxying behaviour · 9ecfcf5a
      Michal Arbet authored
       This patch sets the URL to the Glance worker.
      If this is set, other glance workers will know how to contact this one
      directly if needed. For image import, a single worker stages the image
      and other workers need to be able to proxy the import request to the
      right one.
      
       With the current setup, Glance image import simply does not work.
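A sketch of the resulting glance-api.conf (host and port are illustrative), using Glance's worker_self_reference_url option:

```ini
[DEFAULT]
# Lets other Glance workers proxy image-import requests to the worker
# that staged the image.
worker_self_reference_url = http://192.0.2.10:9292
```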
      
      Closes-Bug: #2048525
      
      Change-Id: I4246dc8a80038358cd5b6e44e991b3e2ed72be0e
      9ecfcf5a
  11. Jan 05, 2024
    • Mark Goddard's avatar
      cadvisor: Set housekeeping interval to Prometheus scrape interval · 97e5c0e9
      Mark Goddard authored
      The prometheus_cadvisor container has high CPU usage. On various
      production systems I checked it sits around 13-16% on controllers,
      averaged over the prometheus 1m scrape interval. When viewed with top we
       can see that it is a bit spiky and can jump over 100%.
      
      There are various bugs about this, but I found
      https://github.com/google/cadvisor/issues/2523 which suggests reducing
      the per-container housekeeping interval. This defaults to 1s, which
      provides far greater granularity than we need with the default
      prometheus scrape interval of 60s.
      
      Reducing the housekeeping interval to 60s on a production controller
      reduced the CPU usage from 13% to 3.5% average. This still seems high,
      but is more reasonable.
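The change boils down to passing cadvisor's housekeeping flag, roughly as follows (aligning it with a 60s Prometheus scrape interval):

```shell
# Reduce per-container housekeeping from the 1s default to 60s.
cadvisor --housekeeping_interval=60s
```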
      
      Change-Id: I89c62a45b1f358aafadcc0317ce882f4609543e7
      Closes-Bug: #2048223
      97e5c0e9
    • Dawud's avatar
      Enable HAProxy Prometheus metrics endpoint · 140722f7
      Dawud authored
      
       HAProxy exposes a Prometheus metrics endpoint; it just needs to be
       enabled. Enable this and remove the configuration for
      prometheus-haproxy-exporter. Remaining prometheus-haproxy-exporter
      containers will automatically be removed.
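A minimal haproxy.cfg sketch of the built-in exporter (bind port illustrative; the prometheus-exporter service ships with HAProxy 2.0+):

```
frontend prometheus
    bind *:8405
    http-request use-service prometheus-exporter if { path /metrics }
```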
      
      Change-Id: If6e75691d2a996b06a9b95cb0aae772db54389fb
      Co-Authored-By: default avatarMatt Anson <matta@stackhpc.com>
      140722f7
    • Michal Arbet's avatar
      Fix long service restarts while using systemd · b1fd2b40
      Michal Arbet authored
       Some containers exit with code 143 instead of 0, but
       this is still OK. This patch simply allows
       exit code 143 (SIGTERM) as a fix. Details are in the
       bug report.
      
      Services which exited with 143 (SIGTERM):
      
      kolla-cron-container.service
      kolla-designate_producer-container.service
      kolla-keystone_fernet-container.service
      kolla-letsencrypt_lego-container.service
      kolla-magnum_api-container.service
      kolla-mariadb_clustercheck-container.service
      kolla-neutron_l3_agent-container.service
      kolla-openvswitch_db-container.service
      kolla-openvswitch_vswitchd-container.service
      kolla-proxysql-container.service
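The fix can be sketched as a systemd unit override (unit name illustrative) using the standard SuccessExitStatus directive:

```ini
[Service]
# Treat SIGTERM's exit code (143) as a clean stop in addition to 0.
SuccessExitStatus=143
```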
      
      Partial-Bug: #2048130
      Change-Id: Ia8c85d03404cfb368e4013066c67acd2a2f68deb
      b1fd2b40
  12. Jan 04, 2024
  13. Jan 03, 2024
  14. Jan 02, 2024
  15. Dec 28, 2023
  16. Dec 21, 2023
    • Doug Szumski's avatar
      Set a log retention policy for OpenSearch · 5e5a2dca
      Doug Szumski authored
      We previously used ElasticSearch Curator for managing log
      retention. Now that we have moved to OpenSearch, we can use
      the Index State Management (ISM) plugin which is bundled with
      OpenSearch.
      
      This change adds support for automating the configuration of
      the ISM plugin via the OpenSearch API. By default, it has
      similar behaviour to the previous ElasticSearch Curator
      default policy.
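An ISM policy applied through the OpenSearch API has roughly this shape (retention period and index pattern are illustrative, not the exact defaults this change ships):

```json
{
  "policy": {
    "description": "Illustrative retention policy: delete indices after 31 days",
    "default_state": "retain",
    "states": [
      {
        "name": "retain",
        "actions": [],
        "transitions": [
          {"state_name": "delete", "conditions": {"min_index_age": "31d"}}
        ]
      },
      {"name": "delete", "actions": [{"delete": {}}], "transitions": []}
    ],
    "ism_template": {"index_patterns": ["flog-*"]}
  }
}
```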
      
      Closes-Bug: #2047037
      
      Change-Id: I5c6d938f2bc380f1575ee4f16fe17c6dca37dcba
      5e5a2dca
    • Alex-Welsh's avatar
      Remove nova cell sync comment · e9e7362f
      Alex-Welsh authored
      Removed a comment suggesting we use nova-manage db sync --local_cell
      when bootstrapping the nova service, since that suggestion has now been
      implemented in Kolla. See [1] for more details.
      
      [1]: https://review.opendev.org/c/openstack/kolla/+/902057
      
      Related-Bug: #2045558
      Depends-On: Ic64eb51325b3503a14ebab9b9ff2f4d9caec734a
      Change-Id: I591f83c4886f5718e36011982c77c0ece6c4cbd7
      e9e7362f
  17. Dec 20, 2023
  18. Dec 19, 2023
  19. Dec 18, 2023
  20. Dec 14, 2023
  21. Dec 13, 2023
  22. Dec 05, 2023
    • Andrey Kurilin's avatar
      Fix broken list concatenation in horizon role · 97cd1731
      Andrey Kurilin authored
      
       Starting with ansible-core 2.13, the list concatenation format changed
       and no longer supports concatenation operations outside of the Jinja
       template.
      
      The format change:
      
        "[1] + {{ [2] }}" -> "{{ [1] + [2] }}"
      
      This affects the horizon role that iterates over existing policy files to
      override and concatenate them into a single variable.
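As a sketch (variable names are hypothetical, not the role's actual ones), the concatenation has to move entirely inside the Jinja template:

```yaml
# Old form, rejected by ansible-core >= 2.13:
# custom_policy_files: "existing_files + {{ [item] }}"

# New form, with the whole expression inside the template:
custom_policy_files: "{{ existing_files + [item] }}"
```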
      
      Co-Authored-By: default avatarDr. Jens Harbott <harbott@osism.tech>
      
      Closes-Bug: #2045660
      Change-Id: I91a2101ff26cb8568f4615b4cdca52dcf09e6978
      97cd1731
    • Mark Goddard's avatar
      Support Ansible max_fail_percentage · af6e1ca4
      Mark Goddard authored
       This allows execution to continue until a certain proportion of hosts
       fail. This can be useful at scale, where failures are common and
       restarting a deployment is time-consuming.
      
      The default max failure percentage is 100, keeping the default
      behaviour. A global max failure percentage may be set via
      kolla_max_fail_percentage, and individual services may define a max
      failure percentage via <service>_max_fail_percentage.
      
      Note that all hosts in the inventory must be reachable for fact
      gathering, even those not included in a --limit.
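In globals.yml this can be sketched as follows (the percentages and the nova example are illustrative, following the <service>_max_fail_percentage pattern described above):

```yaml
# Global limit: abort if more than 20% of hosts fail.
kolla_max_fail_percentage: 20
# Per-service override for nova.
nova_max_fail_percentage: 10
```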
      
      Closes-Bug: #1833737
      Change-Id: I808474a75c0f0e8b539dc0421374b06cea44be4f
      af6e1ca4
  23. Dec 02, 2023