  1. Apr 22, 2022
    • nova: use any_errors_fatal for once-per-cell tasks · 832989d0
      Mark Goddard authored
      We run some nova tasks once per cell, using a condition to match a
      single host in the cell. In other similar tasks, we use run_once, which
      will fail all hosts if the task fails. Typically these tasks are
      critical, and that is desirable. However, with the approach used in
      nova-cell to support multiple cells, if a once-per-cell task fails, then
      other hosts will continue to execute, which could lead to unexpected
      results.
      
      This change adds any_errors_fatal to the plays or blocks that run these
      tasks.
      
      Closes-Bug: #1948694
      
      Change-Id: I2a5871ccd4e8198171ef3239ce95f475f3e4b051
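
The behaviour described above can be illustrated with a minimal sketch. This is not the actual Kolla Ansible play: the play name, host group, command, and the condition that selects a single host are all hypothetical.

```yaml
# Hypothetical sketch, not the real nova-cell tasks.
- name: Run once-per-cell tasks
  hosts: compute                 # assumed group name
  any_errors_fatal: true         # a failure on any host aborts the play for all hosts
  tasks:
    - name: Example once-per-cell task
      ansible.builtin.command: /bin/true
      changed_when: false
      # Stand-in for the condition that matches a single host in the cell.
      when: inventory_hostname == ansible_play_hosts | first
```

Without any_errors_fatal, a failure of the conditional task would only fail the single matching host, and the remaining hosts would carry on.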
  2. Apr 05, 2022
    • libvirt: Fix nova-libvirt-cleanup command · 188b3285
      Mark Goddard authored
      This change addresses an issue in the nova-libvirt-cleanup command,
      added in I46854ed7eaf1d5b5e3ccd8531c963427848bdc99.
      
      Check for rc=1 from the pgrep command, since a lack of matches is a pass.
      
      Also, use bash for set -o pipefail.
      
      Change-Id: Iffda0dfffce8768324ffec55e629134c70e2e996
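
A minimal sketch of the two fixes, assuming a hypothetical process pattern and pipeline (the real task may differ): pgrep exits with 1 when nothing matches, so only other non-zero codes are treated as failures, and bash is requested explicitly because pipefail is not available in every /bin/sh.

```yaml
# Hypothetical sketch; the pattern "qemu" and the pipeline are illustrative only.
- name: Check for running processes
  ansible.builtin.shell:
    cmd: set -o pipefail && pgrep -l qemu | sort
    executable: /bin/bash        # pipefail requires bash
  register: pgrep_result
  changed_when: false
  # rc=0: matches found, rc=1: no matches (both acceptable here).
  failed_when: pgrep_result.rc not in [0, 1]
```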
  3. Mar 29, 2022
    • nova: improve compute service registration failure handling · f1d3ff11
      Mark Goddard authored
      If any nova compute service fails to register itself, Kolla Ansible will
      fail the host that queries the Nova API. This is the first compute host
      in the inventory, and fails in the task:
      
          Waiting for nova-compute services to register themselves
      
      Other hosts continue, often leading to further errors later on. Clearly
      this is not ideal.
      
      This change modifies the behaviour to query the compute service list
      until all expected hosts are present, but does not fail the querying
      host if they are not. A new task is added that executes for all hosts,
      and fails only those hosts that have not registered successfully.
      
      Alternatively, to fail all hosts in a cell when any compute service
      fails to register, set nova_compute_registration_fatal to true.
      
      Change-Id: I12c1928cf1f1fb9e28f1741e7fe4968004ea1816
      Closes-Bug: #1940119
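
The per-host failure step could look roughly like the sketch below. It assumes a hypothetical variable, registered_compute_hosts, holding the host names returned by the compute service list query; the real task and variable names are not shown in this log.

```yaml
# Hypothetical sketch; registered_compute_hosts is an assumed list variable.
- name: Fail hosts whose nova-compute service did not register
  ansible.builtin.fail:
    msg: "nova-compute on {{ ansible_facts.hostname }} is not registered"
  when: ansible_facts.hostname not in registered_compute_hosts
```

Because the task runs on every host, only the hosts missing from the list fail, rather than the single host that performed the query.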
  4. Mar 21, 2022
    • libvirt: add nova-libvirt-cleanup command · 80b311be
      Mark Goddard authored
      Change Ia1239069ccee39416b20959cbabad962c56693cf added support for
      running a libvirt daemon on the host, rather than using the nova_libvirt
      container. It did not cover migration of existing hosts from using a
      container to using a host daemon.
      
      This change adds a kolla-ansible nova-libvirt-cleanup command which may
      be used to clean up the nova_libvirt container, volumes and related
      items on hosts, once it has been disabled.
      
      The playbook assumes that compute hosts have been emptied of VMs before
      it runs. A future extension could support migration of existing VMs, but
      this is currently out of scope.
      
      Change-Id: I46854ed7eaf1d5b5e3ccd8531c963427848bdc99
    • libvirt: make it possible to run libvirt on the host · 4e41acd8
      Mark Goddard authored
      In some cases it may be desirable to run the libvirt daemon on the host.
      For example, when mixing host and container OS distributions or
      versions.
      
      This change makes it possible to disable the nova_libvirt container, by
      setting enable_nova_libvirt_container to false. The default values of
      some Docker mounts and other paths have been updated to point to default
      host directories rather than Docker volumes when using a host libvirt
      daemon.
      
      This change does not handle migration of existing systems from using
      a nova_libvirt container to libvirt on the host.
      
      Depends-On: https://review.opendev.org/c/openstack/ansible-collection-kolla/+/830504
      
      Change-Id: Ia1239069ccee39416b20959cbabad962c56693cf
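
As a minimal configuration sketch, the option named above could be set in the deployment's globals.yml. The file location is an assumption about a typical Kolla Ansible setup.

```yaml
# Assumed to live in /etc/kolla/globals.yml.
# Disable the nova_libvirt container so that a libvirt daemon on the host is used.
enable_nova_libvirt_container: false
```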
  5. Mar 18, 2022
  6. Mar 10, 2022
    • libvirt: support SASL authentication · d2d4b53d
      Mark Goddard authored
      In Kolla Ansible OpenStack deployments, by default, libvirt is
      configured to allow read-write access via an unauthenticated,
      unencrypted TCP connection, using the internal API network.  This is to
      facilitate migration between hosts.
      
      By default, Kolla Ansible does not use encryption for services on the
      internal network (and did not support it until Ussuri). However, most
      other services on the internal network are at least authenticated
      (usually via passwords), ensuring that they cannot be used by anyone
      with access to the network, unless they have credentials.
      
      The main issue here is the lack of authentication. Any client with
      access to the internal network is able to connect to the libvirt TCP
      port and make arbitrary changes to the hypervisor. This could include
      starting a VM, modifying an existing VM, etc. Given the flexibility of
      the domain options, it could be seen as equivalent to having root access
      to the hypervisor.
      
      Kolla Ansible has supported libvirt TLS [1] since the Train release, using
      client and server certificates for mutual authentication and encryption.
      However, this feature is not enabled by default, and requires
      certificates to be generated for each compute host.
      
      This change adds support for libvirt SASL authentication, and enables it
      by default. This provides a base level of security. Deployments requiring
      further security should use libvirt TLS.
      
      [1] https://docs.openstack.org/kolla-ansible/latest/reference/compute/libvirt-guide.html#libvirt-tls
      
      Depends-On: https://review.opendev.org/c/openstack/kolla/+/833021
      Closes-Bug: #1964013
      Change-Id: Ia91ceeb609e4cdb144433122b443028c0278b71e
  7. Jan 10, 2022
  8. Oct 27, 2021
  9. Oct 01, 2021
  10. Sep 28, 2021
    • Transition Keystone admin user to system scope · 2e933dce
      Niklas Hagman authored
      A system-scoped token implies the user has authorization to act on the
      deployment system. These tokens are useful for interacting with
      resources that affect the deployment as a whole, or that expose resources
      that may otherwise violate project or domain isolation.
      
      Since Queens, the keystone-manage bootstrap command assigns the admin
      role to the admin user with system scope, as well as in the admin
      project. This patch transitions the Keystone admin user from
      authenticating using project scoped tokens to system scoped tokens.
      This is a necessary step towards being able to enable the updated oslo
      policies in services that allow finer grained access to system-level
      resources and APIs.
      
      An etherpad with discussion about the transition to the new oslo
      service policies is:
      
      https://etherpad.opendev.org/p/enabling-system-scope-in-kolla-ansible

      Change-Id: Ib631e2211682862296cce9ea179f2661c90fa585
      Signed-off-by: Niklas Hagman <ubuntu@post.blinkiz.com>
  11. Aug 12, 2021
    • Trivial fix nova's healthchecks · 85879afc
      Michal Arbet authored
      The Kolla Ansible upgrade task calls different handlers than the
      deploy task, and these handlers are missing the healthcheck key.
      This patch adds it.
      
      Closes-Bug: #1939679
      Change-Id: Id83d20bfd89c27ccf70a3a79938f428cdb5d40fc
  12. Aug 10, 2021
    • Refactor and optimise image pulling · 9ff2ecb0
      Radosław Piliszek authored
      We get a nice optimisation by using a filtered loop instead
      of task skipping per service with 'when'.
      
      Partially-Implements: blueprint performance-improvements
      Change-Id: I8f68100870ab90cb2d6b68a66a4c97df9ea4ff52
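
The difference between the two approaches can be sketched as follows. The services dictionary, its enabled and image keys, and the debug tasks are hypothetical stand-ins; the actual role may use its own filter to select services.

```yaml
# Hypothetical sketch; "services" is an assumed dict of name -> {enabled, image}.

# Skipping per item: every service is visited and most are skipped.
- name: Pull images (skip with when)
  ansible.builtin.debug:
    msg: "pull {{ item.value.image }}"
  loop: "{{ services | dict2items }}"
  when: item.value.enabled | bool

# Filtered loop: disabled services never enter the loop.
- name: Pull images (filtered loop)
  ansible.builtin.debug:
    msg: "pull {{ item.value.image }}"
  loop: "{{ services | dict2items | selectattr('value.enabled') | list }}"
```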
  13. Aug 02, 2021
    • Fix freezed spice console in horizon · c281a018
      Michal Arbet authored
      This trivial patch sets "timeout tunnel" in the HAProxy
      configuration for spicehtml5proxy. This option extends the time
      before the SPICE websocket connection is closed, so the console
      does not freeze. The default value is 1h, as it is for novnc.
      
      Closes-Bug: #1938549
      Change-Id: I3a5cd98ecf4916ebd0748e7c08111ad0e4dca0b2
  14. Jul 27, 2021
  15. Jun 23, 2021
    • Use ansible_facts to reference facts · ade5bfa3
      Mark Goddard authored
      By default, Ansible injects a variable for every fact, prefixed with
      ansible_. This can result in a large number of variables for each host,
      which at scale can incur a performance penalty. Ansible provides a
      configuration option [0] that can be set to False to prevent this
      injection of facts. In this case, facts should be referenced via
      ansible_facts.<fact>.
      
      This change updates all references to Ansible facts within Kolla Ansible
      from using individual fact variables to using the items in the
      ansible_facts dictionary. This allows users to disable fact variable
      injection in their Ansible configuration, which may provide some
      performance improvement.
      
      This change disables fact variable injection in the ansible
      configuration used in CI, to catch any attempts to use the injected
      variables.
      
      [0] https://docs.ansible.com/ansible/latest/reference_appendices/config.html#inject-facts-as-vars
      
      Change-Id: I7e9d5c9b8b9164d4aee3abb4e37c8f28d98ff5d1
      Partially-Implements: blueprint performance-improvements
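
A minimal illustration of the two reference styles, using the hostname fact as an example. The task itself is hypothetical.

```yaml
# Hypothetical sketch of referencing a fact through the ansible_facts dictionary.
- name: Show the hostname fact
  ansible.builtin.debug:
    # Works whether or not fact variable injection is enabled.
    msg: "{{ ansible_facts.hostname }}"
    # The injected form, {{ ansible_hostname }}, is undefined when
    # fact variable injection is disabled.
```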
  16. May 30, 2021
  17. Mar 02, 2021
  18. Dec 14, 2020
    • Revert "Performance: Use import_tasks in the main plays" · db4fc85c
      Mark Goddard authored
      This reverts commit 9cae59be.
      
      Reason for revert: This patch was found to introduce issues with fluentd customisation. The underlying issue is not currently fully understood, but could be a sign of other obscure issues.
      
      Change-Id: Ia4859c23d85699621a3b734d6cedb70225576dfc
      Closes-Bug: #1906288
  19. Oct 27, 2020
    • Do not set 'always' tag where unnecessary · 71e9c603
      Radosław Piliszek authored
      Makes 'import_tasks' not change behaviour compared to
      'include_tasks'.
      
      Change-Id: I600be7c3bd763b3b924bd4a45b4e7b4dca7a33e3
    • Performance: Use import_tasks in the main plays · 9cae59be
      Radosław Piliszek authored
      Main plays are action-redirect-stubs, ideal for import_tasks.
      
      This avoids 'include' penalty and makes logs/ara look nicer.
      
      Fixes haproxy and rabbitmq not to check the host group as well.
      
      Change-Id: I46136fc40b815e341befff80b54a91ef431eabc0
      Partially-Implements: blueprint performance-improvements
  20. Oct 12, 2020
    • Performance: optimize genconfig · 3411b9e4
      Radosław Piliszek authored
      Config plays do not need to check containers. This avoids skipping
      tasks during the genconfig action.
      
      Ironic and Glance rolling upgrades are handled specially.
      
      Swift and Bifrost do not use the handlers at all.
      
      Partially-Implements: blueprint performance-improvements
      Change-Id: I140bf71d62e8f0932c96270d1f08940a5ba4542a
  21. Oct 05, 2020
    • Use Docker healthchecks for core services · c52a89ae
      Michal Nasiadka authored
      This change enables the use of Docker healthchecks for core OpenStack
      services.
      Also check-failures.sh has been updated to treat containers with
      unhealthy status as failed.
      
      Implements: blueprint container-health-check
      Change-Id: I79c6b11511ce8af70f77e2f6a490b59b477fefbb
  22. Sep 21, 2020
  23. Sep 18, 2020
  24. Aug 28, 2020
  25. Aug 22, 2020
  26. Jul 28, 2020
    • Performance: use import_tasks for check-containers.yml · 9702d4c3
      Mark Goddard authored
      Including tasks has a performance penalty when compared with importing
      tasks. If the include has a condition associated with it, then the
      overhead of the include may be lower than the overhead of skipping all
      imported tasks. In the case of the check-containers.yml include, the
      included file only has a single task, so the overhead of skipping this
      task will not be greater than the overhead of the task import. It
      therefore makes sense to switch to use import_tasks there.
      
      Partially-Implements: blueprint performance-improvements
      
      Change-Id: I65d911670649960708b9f6a4c110d1a7df1ad8f7
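
The two forms being compared can be sketched like this. The condition variable is hypothetical, and the two tasks are alternatives rather than a sequence.

```yaml
# Hypothetical sketch; run_container_checks is an assumed variable.

# Dynamic include: the condition is evaluated once, then the file is loaded at run time.
- name: Check containers (dynamic include)
  ansible.builtin.include_tasks: check-containers.yml
  when: run_container_checks | bool

# Static import: the file is inlined at parse time and the condition
# is applied to each imported task individually.
- name: Check containers (static import)
  ansible.builtin.import_tasks: check-containers.yml
  when: run_container_checks | bool
```

With a single task in the imported file, at most one task is skipped when the condition is false, which is why the import is the cheaper choice in this case.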
  27. Jul 17, 2020
    • Make /dev/kvm permissions handling more robust · 202365e7
      Radosław Piliszek authored
      This makes use of udev rules to make the handling smarter and to override
      host-level package settings.
      Additionally, this masks an Ubuntu-only service that is another
      pain point in terms of /dev/kvm permissions.
      Fingers crossed for no further surprises.
      
      Change-Id: I61235b51e2e1325b8a9b4f85bf634f663c7ec3cc
      Closes-bug: #1681461
  28. Jul 08, 2020
    • Load br_netfilter module in nova-cell role · 2f91be9f
      Mark Goddard authored
      The nova-cell role sets the following sysctls on compute hosts, which
      require the br_netfilter kernel module to be loaded:
      
          net.bridge.bridge-nf-call-iptables
          net.bridge.bridge-nf-call-ip6tables
      
      If it is not loaded, then we see the following errors:
      
          Failed to reload sysctl:
          sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory
          sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory
      
      Loading the br_netfilter module resolves this issue.
      
      Typically we do not see this since installing Docker and configuring it
      to manage iptables rules causes the br_netfilter module to be loaded.
      There are good reasons [1] to disable Docker's iptables management
      however, in which case we are likely to hit this issue.
      
      This change loads the br_netfilter module in the nova-cell role for
      compute hosts.
      
      [1] https://bugs.launchpad.net/kolla-ansible/+bug/1849275

      Co-Authored-By: Dincer Celik <hello@dincercelik.com>
      
      Change-Id: Id52668ba8dab460ad4c33fad430fc8611e70825e
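
A minimal sketch of the ordering described above, loading the module before setting the sysctls. The task names, the use of these particular collection modules, and the sysctl value of 1 are assumptions, not the role's actual tasks.

```yaml
# Hypothetical sketch; module choice and the value "1" are assumptions.
- name: Load the br_netfilter kernel module
  community.general.modprobe:
    name: br_netfilter
    state: present
  become: true

- name: Set bridge netfilter sysctls
  ansible.posix.sysctl:
    name: "{{ item }}"
    value: "1"
    sysctl_set: true
  become: true
  loop:
    - net.bridge.bridge-nf-call-iptables
    - net.bridge.bridge-nf-call-ip6tables
```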
  29. Jun 16, 2020
  30. Jun 09, 2020
  31. Jun 07, 2020
  32. Apr 16, 2020
    • Ansible lint: lines longer than 160 chars · d403690b
      Michal Nasiadka authored
      Change-Id: I500cc8800c412bc0e95edb15babad5c1189e6ee4
    • Fix nova cell message queue URL with separate notification queue · e8ad5f37
      Mark Goddard authored
      If using a separate message queue for nova notifications, i.e.
      nova_cell_notify_transport_url is different from
      nova_cell_rpc_transport_url, then Kolla Ansible will unnecessarily
      update the cell. This should not cause any issues since the URL is taken
      from nova.conf.
      
      This change fixes the comparison to use the correct URL.
      
      Change-Id: I5f0e30957bfd70295f2c22c86349ebbb4c1fb155
      Closes-Bug: #1873255
  33. Apr 14, 2020
    • Fix nova compute addition with limit · 3af28d21
      Mark Goddard authored
      Deploy a small cloud. Add one host to the compute group in the
      inventory, and scale out:
      
          $ kolla-ansible deploy --limit <new compute host>
      
      The command succeeds, but creating an instance fails with the following:
      
          Host 'compute0' is not mapped to any cell
      
      This happens because we only discover computes on the first host in the
      cell's nova conductor group. If that host is not in the specified limit,
      the discovery will not happen.
      
      This change fixes the issue by running compute discovery when any ironic
      or virtualised compute hosts are in the play batch, and delegating it to
      a conductor.
      
      Change-Id: Ie984806240d147add825ffa8446ae6ff55ca4814
      Closes-Bug: #1869371
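
The approach described above could be sketched as follows. The group names, the way the command is invoked, and the condition are assumptions for illustration; the real role runs discovery through its own tasks.

```yaml
# Hypothetical sketch; group names and the direct nova-manage call are assumptions.
- name: Discover compute hosts
  ansible.builtin.command: nova-manage cell_v2 discover_hosts
  changed_when: false
  run_once: true
  # Run on a conductor even if it is outside the --limit.
  delegate_to: "{{ groups['nova-conductor'][0] }}"
  # Only run when the current batch contains at least one compute host.
  when: ansible_play_batch | intersect(groups['compute']) | length > 0
```

Delegation means discovery no longer depends on the first conductor host being part of the limited play.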
    • Refactor copy certificates task · 4d155d69
      James Kirsch authored
      Refactor service configuration to use the copy certificates task. This
      reduces code duplication and simplifies implementing encrypting backend
      HAProxy traffic for individual services.
      
      Change-Id: I0474324b60a5f792ef5210ab336639edf7a8cd9e
  34. Apr 08, 2020
    • Perform host configuration during upgrade · 1d70f509
      Mark Goddard authored
      This is a follow up to I001defc75d1f1e6caa9b1e11246abc6ce17c775b. To
      maintain previous behaviour, and ensure we catch any host configuration
      changes, we should perform host configuration during upgrade.
      
      Change-Id: I79fcbf1efb02b7187406d3c3fccea6f200bcea69
      Related-Bug: #1860161