  1. Jun 26, 2024
    • Fix the docker container dimensions comparison for short notation · a086b780
      Pedro Henrique authored
      When short notations like `1g` or `512m` are used to define the
      container dimensions, the container is restarted on every
      kolla-ansible run, even when nothing in the container config has
      actually changed.
      
      Change-Id: Ic8e2dd42b95a8f5c2141a820c55642a3ed7beabd
      Closes-Bug: #2070494
      a086b780
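A minimal sketch of the kind of comparison this fix calls for: sizes are normalised to bytes before comparing, so `1g` and `1073741824` count as equal. The helper names (`to_bytes`, `dimensions_differ`), the binary unit multipliers, and the list of size-valued keys are assumptions for illustration, not the code from the patch.

```python
import re

# Assumed binary multipliers for the short-notation suffixes.
_UNITS = {"": 1, "b": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def to_bytes(value):
    """Convert 1073741824, "1073741824", "512m" or "1g" to an integer byte count."""
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"\s*(\d+)\s*([bkmgt]?)b?\s*", str(value).lower())
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def dimensions_differ(wanted, current,
                      size_keys=("mem_limit", "mem_reservation",
                                 "memswap_limit", "kernel_memory")):
    """Return True only if a wanted dimension really differs from the current one."""
    for key, value in wanted.items():
        have = current.get(key)
        if key in size_keys:
            # Compare byte counts, not the raw strings.
            if to_bytes(value) != to_bytes(have or 0):
                return True
        elif value != have:
            return True
    return False
```

With this normalisation, `dimensions_differ({"mem_limit": "1g"}, {"mem_limit": 1073741824})` reports no difference, so no restart would be triggered.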
  2. Jan 26, 2024
  3. Jan 24, 2024
  4. Jan 22, 2024
    • Bump ansible-lint version · 47ddac41
      Michal Arbet authored
      
      The version that we were capping to is no longer compatible with latest
      upper-constraints.txt, so let us free float again.
      
      The resulting linting errors are included for now to unblock the gate,
      these will still need to be discussed or fixed later.
      
      NOTE(kevko): Temporarily disabling horizon deployment, as it's not
      possible to unblock gates without it
      
      Co-Authored-By: Michal Arbet <michal.arbet@ultimum.io>
      Change-Id: Ib7f72b2663199ef80844a412bc436c6ef09322cc
      47ddac41
  5. Jan 18, 2024
  6. Jan 17, 2024
    • Drop more remnants of install_type · 76f5d0cb
      Pierre Riteau authored
      Change-Id: I8e5e42db48c6235deb93dcb185e044fce983ba5a
      76f5d0cb
    • use docker_custom_config override for Kolla CI upgrade jobs · 1d38ff5e
      Bartosz Bezak authored
      In Kolla CI, the K-A upgrade job needs the docker_custom_config
      override because the docker_registry var is used both for the docker
      daemon config (for kolla image builds) and for the kolla-ansible
      container image sources, where we use the quay.io mirror.
      docker_custom_config takes precedence in the docker daemon
      configuration.
      
      docker_custom_config was removed in [1].
      
      [1] https://review.opendev.org/c/openstack/kolla-ansible/+/904067
      
      Change-Id: I1e890223faf25b1169a49e22a9529f90806d2f3a
      1d38ff5e
    • Fix OpenSearch upgrade tasks idempotency · e502b65b
      Matt Crees authored
      Shard allocation is disabled at the start of the OpenSearch upgrade
      task. This is set as a transient setting, meaning it will be removed
      once the containers are restarted. However, if there is no change in
      the OpenSearch container, it will not be restarted, so the cluster is
      left in a broken state: unable to allocate shards.
      
      This patch moves the pre-upgrade tasks into the handlers, so shard
      allocation is only disabled and the flush is only performed when the
      OpenSearch container is going to be restarted.
      
      Closes-Bug: #2049512
      Change-Id: Ia03ba23bfbde7d50a88dc16e4f117dec3c98a448
      e502b65b
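The idea behind this fix can be sketched as follows: the pre-upgrade calls are only produced when a restart is actually going to happen. The function name and the `container_changed` flag are hypothetical, and the exact setting value used by the patch is an assumption; the `_cluster/settings` and `_flush` endpoints are standard OpenSearch APIs.

```python
def pre_upgrade_requests(container_changed):
    """Return the (method, path, body) calls to issue before restarting OpenSearch."""
    if not container_changed:
        # No restart will follow, so a transient setting would never be
        # cleared and shards would stay unallocated. Do nothing.
        return []
    return [
        # Disable shard allocation; transient settings are dropped on restart.
        ("PUT", "/_cluster/settings",
         {"transient": {"cluster.routing.allocation.enable": "none"}}),
        # Flush so that less data needs to be replayed after the restart.
        ("POST", "/_flush", None),
    ]
```

Tying these calls to the restart condition is what makes repeated runs idempotent: a run that changes nothing leaves the cluster settings untouched.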
  7. Jan 15, 2024
  8. Jan 12, 2024
  9. Jan 11, 2024
  10. Jan 10, 2024
  11. Jan 09, 2024
  12. Jan 08, 2024
    • CI: Test Nova server resize functionality · f86ed027
      Pierre Riteau authored
      This adds an extra resize operation to core OpenStack tests. This should
      be fast since we are only increasing the number of cores of the VM and
      could help catch additional errors in CI tests.
      
      Change-Id: Ia61b995dbffcda4f1e6494548df457231cb67bd7
      f86ed027
    • Fix Nova scp failures on Debian Bookworm · bfa9dd97
      Pierre Riteau authored
      The addition of an instance resize operation [1] to CI testing is
      triggering a failure in kolla-ansible-debian-ovn jobs, which are using a
      nodeset with multiple nodes:
      
          oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
          Command: scp -r /var/lib/nova/instances/8ca2c7e8-acae-404c-af7d-6cac38e354b8_resize/disk 192.0.2.2:/var/lib/nova/instances/8ca2c7e8-acae-404c-af7d-6cac38e354b8/disk
          Exit code: 255
          Stdout: ''
          Stderr: "Warning: Permanently added '[192.0.2.2]:8022' (ED25519) to the list of known hosts.\r\nsubsystem request failed on channel 0\r\nscp: Connection closed\r\n"
      
      This is not seen on Ubuntu Jammy, which uses OpenSSH 8.9, while Debian
      Bookworm uses OpenSSH 9.2. This is likely related to this change in
      OpenSSH 9.0 [2]:
      
          This release switches scp(1) from using the legacy scp/rcp protocol
          to using the SFTP protocol by default.
      
      Configure the sftp subsystem, as is already done on RHEL 9
      derivatives. Even though it is not yet required for Ubuntu, we also
      configure it there so we are ready for the Noble release.
      
      [1] https://review.opendev.org/c/openstack/kolla-ansible/+/904249
      [2] https://www.openssh.com/txt/release-9.0
      
      Closes-Bug: #2048700
      Change-Id: I9f1129136d7664d5cc3b57ae5f7e8d05c499a2a5
      bfa9dd97
    • Enable glance proxying behaviour · 9ecfcf5a
      Michal Arbet authored
      This patch sets the URL of the glance worker.
      If this is set, other glance workers will know how to contact this one
      directly if needed. For image import, a single worker stages the image
      and other workers need to be able to proxy the import request to the
      right one.
      
      With the current setup, glance image import does not work.
      
      Closes-Bug: #2048525
      
      Change-Id: I4246dc8a80038358cd5b6e44e991b3e2ed72be0e
      9ecfcf5a
    • Merge "CI: Use ControlPersist and ControlMaster" · 15380925
      Zuul authored
      15380925
  13. Jan 06, 2024
  14. Jan 05, 2024
    • cadvisor: Set housekeeping interval to Prometheus scrape interval · 97e5c0e9
      Mark Goddard authored
      The prometheus_cadvisor container has high CPU usage. On various
      production systems I checked it sits around 13-16% on controllers,
      averaged over the Prometheus 1m scrape interval. When viewed with top we
      can see it is a bit spiky and can jump over 100%.
      
      There are various bugs about this, but I found
      https://github.com/google/cadvisor/issues/2523 which suggests running
      the per-container housekeeping less often. Its interval defaults to 1s,
      which provides far greater granularity than we need with the default
      Prometheus scrape interval of 60s.
      
      Raising the housekeeping interval from 1s to 60s on a production
      controller reduced the CPU usage from 13% to 3.5% average. This still
      seems high, but is more reasonable.
      
      Change-Id: I89c62a45b1f358aafadcc0317ce882f4609543e7
      Closes-Bug: #2048223
      97e5c0e9
    • Fix long service restarts while using systemd · b1fd2b40
      Michal Arbet authored
      Some containers exit with 143 instead of 0, but
      this is still OK. As a fix, this patch simply allows
      exit code 143 (SIGTERM). Details are in the
      bug report.
      
      Services which exited with 143 (SIGTERM):
      
      kolla-cron-container.service
      kolla-designate_producer-container.service
      kolla-keystone_fernet-container.service
      kolla-letsencrypt_lego-container.service
      kolla-magnum_api-container.service
      kolla-mariadb_clustercheck-container.service
      kolla-neutron_l3_agent-container.service
      kolla-openvswitch_db-container.service
      kolla-openvswitch_vswitchd-container.service
      kolla-proxysql-container.service
      
      Partial-Bug: #2048130
      Change-Id: Ia8c85d03404cfb368e4013066c67acd2a2f68deb
      b1fd2b40
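For context, 143 is the conventional exit status of a process terminated by SIGTERM: 128 plus the signal number 15. A minimal sketch of a check that accepts it as a clean stop follows; the function name and the `allowed_signals` parameter are hypothetical, and in a systemd unit the equivalent is normally expressed with the `SuccessExitStatus=` directive (whether the patch uses exactly that is an assumption here).

```python
import signal


def is_clean_stop(exit_code, allowed_signals=(signal.SIGTERM,)):
    """Treat 0, and 128 + N for each allowed signal N, as a successful stop."""
    allowed_codes = {128 + int(sig) for sig in allowed_signals}
    return exit_code == 0 or exit_code in allowed_codes
```

Under this rule a container stopped by SIGTERM (143) is accepted, while one killed by SIGKILL (137) is still reported as a failure.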
  15. Jan 04, 2024
  16. Jan 03, 2024