  1. Mar 29, 2022
    • nova: improve compute service registration failure handling · f1d3ff11
      Mark Goddard authored
      If any nova compute service fails to register itself, Kolla Ansible will
      fail the host that queries the Nova API. This is the first compute host
      in the inventory, and fails in the task:
      
          Waiting for nova-compute services to register themselves
      
      Other hosts continue, often leading to further errors later on. Clearly
      this is not ideal.
      
      This change modifies the behaviour to query the compute service list
      until all expected hosts are present, but does not fail the querying
      host if they are not. A new task is added that executes for all hosts,
      and fails only those hosts that have not registered successfully.
      
      Alternatively, to fail all hosts in a cell when any compute service
      fails to register, set nova_compute_registration_fatal to true.
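      
      A minimal globals.yml sketch of this option (the variable name is taken
      from this message):
      
          nova_compute_registration_fatal: true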
      
      Change-Id: I12c1928cf1f1fb9e28f1741e7fe4968004ea1816
      Closes-Bug: #1940119
  2. Mar 24, 2022
    • designate: Allow to disable notifications · a19e1eb4
      Michał Nasiadka authored
      Designate sink is an optional service that consumes notifications.
      Users should have an option to disable it when they don't use them.
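      
      A hypothetical globals.yml sketch (the exact variable name is not given
      in this message and is assumed here):
      
          designate_enable_notifications_sink: "no"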
      
      Change-Id: I1d5465d9845aea94cff39ff5158cd8b1dccc4834
  3. Mar 21, 2022
    • libvirt: add nova-libvirt-cleanup command · 80b311be
      Mark Goddard authored
      Change Ia1239069ccee39416b20959cbabad962c56693cf added support for
      running a libvirt daemon on the host, rather than using the nova_libvirt
      container. It did not cover migration of existing hosts from using a
      container to using a host daemon.
      
      This change adds a kolla-ansible nova-libvirt-cleanup command which may
      be used to clean up the nova_libvirt container, volumes and related
      items on hosts, once it has been disabled.
      
      The playbook assumes that compute hosts have been emptied of VMs before
      it runs. A future extension could support migration of existing VMs, but
      this is currently out of scope.
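      
      Example usage once nova_libvirt has been disabled (a sketch; the
      standard kolla-ansible CLI options are assumed):
      
          kolla-ansible -i /etc/kolla/multinode nova-libvirt-cleanup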
      
      Change-Id: I46854ed7eaf1d5b5e3ccd8531c963427848bdc99
    • libvirt: make it possible to run libvirt on the host · 4e41acd8
      Mark Goddard authored
      In some cases it may be desirable to run the libvirt daemon on the host.
      For example, when mixing host and container OS distributions or
      versions.
      
      This change makes it possible to disable the nova_libvirt container, by
      setting enable_nova_libvirt_container to false. The default values of
      some Docker mounts and other paths have been updated to point to default
      host directories rather than Docker volumes when using a host libvirt
      daemon.
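      
      For example, in globals.yml (the variable name is taken from this
      message):
      
          enable_nova_libvirt_container: false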
      
      This change does not handle migration of existing systems from using
      a nova_libvirt container to libvirt on the host.
      
      Depends-On: https://review.opendev.org/c/openstack/ansible-collection-kolla/+/830504
      
      Change-Id: Ia1239069ccee39416b20959cbabad962c56693cf
  4. Mar 10, 2022
    • libvirt: support SASL authentication · d2d4b53d
      Mark Goddard authored
      In Kolla Ansible OpenStack deployments, by default, libvirt is
      configured to allow read-write access via an unauthenticated,
      unencrypted TCP connection, using the internal API network.  This is to
      facilitate migration between hosts.
      
      By default, Kolla Ansible does not use encryption for services on the
      internal network (and did not support it until Ussuri). However, most
      other services on the internal network are at least authenticated
      (usually via passwords), ensuring that they cannot be used by anyone
      with access to the network, unless they have credentials.
      
      The main issue here is the lack of authentication. Any client with
      access to the internal network is able to connect to the libvirt TCP
      port and make arbitrary changes to the hypervisor. This could include
      starting a VM, modifying an existing VM, etc. Given the flexibility of
      the domain options, it could be seen as equivalent to having root access
      to the hypervisor.
      
      Kolla Ansible supports libvirt TLS [1] since the Train release, using
      client and server certificates for mutual authentication and encryption.
      However, this feature is not enabled by default, and requires
      certificates to be generated for each compute host.
      
      This change adds support for libvirt SASL authentication, and enables it
      by default. This provides a base level of security. Deployments
      requiring further security should use libvirt TLS.
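      
      A globals.yml sketch to opt out (the variable name libvirt_enable_sasl
      is an assumption, not stated in this message):
      
          libvirt_enable_sasl: false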
      
      [1] https://docs.openstack.org/kolla-ansible/latest/reference/compute/libvirt-guide.html#libvirt-tls
      
      Depends-On: https://review.opendev.org/c/openstack/kolla/+/833021
      Closes-Bug: #1964013
      Change-Id: Ia91ceeb609e4cdb144433122b443028c0278b71e
  5. Dec 31, 2021
    • Move project_name and kolla_role_name to role vars · 56fc74f2
      Pierre Riteau authored
      Role vars have a higher precedence than role defaults. This makes it
      possible to import default vars from another role via vars_files
      without overriding project_name (see related bug for details).
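      
      An illustrative layout (a sketch; the paths follow standard Ansible
      role conventions):
      
          # roles/nova/vars/main.yml - role vars take precedence and are not
          # overridden when another role's defaults are imported
          project_name: "nova"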
      
      Change-Id: I3d919736e53d6f3e1a70d1267cf42c8d2c0ad221
      Related-Bug: #1951785
  6. Dec 01, 2021
    • Update noVNC URL for noVNC >= 1.0.0 · 546122f1
      Pierre Riteau authored
      The documentation for novncproxy_base_url says:
      
          If using noVNC >= 1.0.0, you should use ``vnc_lite.html`` instead of
          ``vnc_auto.html``.
      
      While novnc packages in CentOS, Debian, and Ubuntu still provide
      vnc_auto.html for compatibility, this could be dropped in the future.
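      
      An illustrative nova.conf setting (novncproxy_base_url is a standard
      Nova option; the host and port are placeholders):
      
          [vnc]
          novncproxy_base_url = http://192.0.2.10:6080/vnc_lite.html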
      
      Change-Id: I04883c877015c1835c8b6b2c8be1fb7156ceb340
  7. Sep 03, 2021
    • Bump libvirtd memlock ulimit · 11d7233c
      Radosław Piliszek authored
      This is required for libvirtd with cgroupsv2 (Debian Bullseye and
      soon others).
      Otherwise, device attachments simply fail.
      The warning message suggests that filtering will be disabled, but the
      action actually just fails entirely.
      
      Change-Id: Id1fbd49a31a6e6e51b667f646278b93897c05b21
      Closes-Bug: #1941940
  8. Aug 30, 2021
    • Restore libvirtd cgroupfs mount · 34c49b9d
      Radosław Piliszek authored
      It was removed in [1] as part of the cgroupsv2 cleanup.
      However, testing did not catch the fact that the legacy cgroups
      behaviour was actually still broken, despite the latest Docker and the
      setting to use the host's cgroups namespace.
      
      [1] 286a03ba
      
      Closes-Bug: #1941706
      Change-Id: I629bb9e70a3fd6bd1e26b2ca22ffcff5e9e8c731
  9. Aug 10, 2021
    • Refactor and optimise image pulling · 9ff2ecb0
      Radosław Piliszek authored
      We get a nice optimisation by using a filtered loop instead
      of task skipping per service with 'when'.
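      
      An illustrative sketch of the pattern (not the actual kolla-ansible
      tasks; the service structure is assumed):
      
          # Before: one task result per service, skipped when disabled
          - name: Pull image
            kolla_docker:
              action: pull_image
              image: "{{ item.value.image }}"
            when: item.value.enabled | bool
            with_dict: "{{ services }}"
      
          # After: loop only over the enabled services
          - name: Pull images
            kolla_docker:
              action: pull_image
              image: "{{ item.value.image }}"
            with_dict: "{{ services | dict2items | selectattr('value.enabled') | list | items2dict }}"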
      
      Partially-Implements: blueprint performance-improvements
      Change-Id: I8f68100870ab90cb2d6b68a66a4c97df9ea4ff52
  10. Aug 02, 2021
    • Fix frozen spice console in horizon · c281a018
      Michal Arbet authored
      This trivial patch sets "timeout tunnel" in haproxy's configuration for
      spicehtml5proxy. This option extends the time before spice's websocket
      connection is closed, so the spice console will not freeze. The default
      value is set to 1h, as it is for novnc.
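      
      An illustrative haproxy.cfg fragment ("timeout tunnel" is standard
      HAProxy syntax; the listen section name is assumed):
      
          listen nova_spicehtml5proxy
              timeout tunnel 1h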
      
      Closes-Bug: #1938549
      Change-Id: I3a5cd98ecf4916ebd0748e7c08111ad0e4dca0b2
  11. Jun 23, 2021
    • Use ansible_facts to reference facts · ade5bfa3
      Mark Goddard authored
      By default, Ansible injects a variable for every fact, prefixed with
      ansible_. This can result in a large number of variables for each host,
      which at scale can incur a performance penalty. Ansible provides a
      configuration option [0] that can be set to False to prevent this
      injection of facts. In this case, facts should be referenced via
      ansible_facts.<fact>.
      
      This change updates all references to Ansible facts within Kolla Ansible
      from using individual fact variables to using the items in the
      ansible_facts dictionary. This allows users to disable fact variable
      injection in their Ansible configuration, which may provide some
      performance improvement.
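      
      An illustrative before/after (inject_facts_as_vars is a standard
      ansible.cfg option):
      
          # Before: injected fact variable
          {{ ansible_hostname }}
          # After: item in the ansible_facts dictionary
          {{ ansible_facts.hostname }}
      
          # ansible.cfg
          [defaults]
          inject_facts_as_vars = False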
      
      This change disables fact variable injection in the ansible
      configuration used in CI, to catch any attempts to use the injected
      variables.
      
      [0] https://docs.ansible.com/ansible/latest/reference_appendices/config.html#inject-facts-as-vars
      
      Change-Id: I7e9d5c9b8b9164d4aee3abb4e37c8f28d98ff5d1
      Partially-Implements: blueprint performance-improvements
  12. Apr 25, 2021
    • Skip setting rp_filter by default · 7e81e20e
      Radosław Piliszek authored
      We don't do the best job with it and it's better to rely on users'
      and distros' default policies than try to water those down.
      
      Closes-Bug: #1837551
      Change-Id: I72b13adef60900fc31f1293c516030026f004216
  13. Mar 10, 2021
    • Introduce nova_libvirt_logging_debug · eabdf1e9
      Michał Nasiadka authored
      In order to disable libvirt debug logging in CI (which consumes a vast
      amount of storage), this change introduces nova_libvirt_logging_debug
      and disables it in CI.
      
      Change-Id: I90bfd1b300ad3202ea4d139fda6d6beb44c5820f
  14. Jan 26, 2021
    • Persist nova libvirt secrets in a Docker volume · 1c63eb20
      Mark Goddard authored
      Libvirt may reasonably expect that its secrets directory
      (/etc/libvirt/secrets) is persistent. However, the nova_libvirt
      container does not map the secrets directory to a volume, so it will not
      survive a recreation of the container. Furthermore, if Cinder or Nova
      Ceph RBD integration is enabled, nova_libvirt's config.json includes an
      entry for /etc/libvirt/secrets which will wipe out the directory on a
      restart of the container.
      
      Previously, this appeared to cause an issue with encrypted volumes,
      which could fail to attach in certain situations as described in bug
      1821696. Nova has since made a related change, and the issue can no
      longer be reproduced. However, making the secret store persistent seems
      like a sensible thing to do, and may prevent hitting other corner cases.
      
      This change maps /etc/libvirt/secrets to a Docker volume in the
      nova_libvirt container.  We also modify config.json for the nova_libvirt
      container to merge the /etc/libvirt/secrets directory, to ensure that
      secrets added in the container during runtime are not overwritten when
      the container restarts.
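      
      An illustrative config.json entry (a sketch; kolla's config handling
      supports a "merge" flag, but the exact entry is assumed here):
      
          {
              "source": "{{ container_config_directory }}/secrets",
              "dest": "/etc/libvirt/secrets",
              "owner": "root",
              "perm": "0600",
              "merge": true
          }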
      
      Change-Id: Ia7e923dddb77ff6db3c9160af931354a2b305e8d
      Related-Bug: #1821696
  15. Oct 05, 2020
    • Use Docker healthchecks for core services · c52a89ae
      Michal Nasiadka authored
      This change enables the use of Docker healthchecks for core OpenStack
      services.
      Also check-failures.sh has been updated to treat containers with
      unhealthy status as failed.
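      
      Container health can then be inspected directly (a standard Docker
      command; nova_api is one such core service container):
      
          docker inspect -f '{{ .State.Health.Status }}' nova_api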
      
      Implements: blueprint container-health-check
      Change-Id: I79c6b11511ce8af70f77e2f6a490b59b477fefbb
  16. Aug 10, 2020
    • Mount /etc/timezone based on host OS · 146b00ef
      Mark Goddard authored
      Previously we mounted /etc/timezone if the kolla_base_distro is debian
      or ubuntu. This would fail prechecks if debian or ubuntu images were
      deployed on CentOS. While this is not a supported combination, for
      correctness we should fix the condition to reference the host OS rather
      than the container OS, since that is where the /etc/timezone file is
      located.
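      
      An illustrative condition (a sketch, not the exact task):
      
          # Before: references the container OS
          when: kolla_base_distro in ['debian', 'ubuntu']
          # After: references the host OS
          when: ansible_facts.os_family == 'Debian'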
      
      Change-Id: Ifc252ae793e6974356fcdca810b373f362d24ba5
      Closes-Bug: #1882553
  17. Jul 17, 2020
    • Make /dev/kvm permissions handling more robust · 202365e7
      Radosław Piliszek authored
      This makes use of udev rules to make the handling smarter and to
      override host-level package settings.
      Additionally, this masks an Ubuntu-only service that is another pain
      point in terms of /dev/kvm permissions.
      Fingers crossed for no further surprises.
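      
      An illustrative udev rule (a sketch; the exact rule shipped by this
      change is assumed):
      
          KERNEL=="kvm", GROUP="kvm", MODE="0666"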
      
      Change-Id: I61235b51e2e1325b8a9b4f85bf634f663c7ec3cc
      Closes-bug: #1681461
  18. Jun 22, 2020
    • Fix nova-cell role clone failure · a9c94aee
      wu.chunyang authored
      When kolla_dev_mod is enabled, the nova-cell role fails to clone the
      code, because it uses a nova-cell repository which does not exist.
      In fact, the nova-cell role should use the nova repository too.
      
      Change-Id: I7fa62726d0d5b0aeb3bd5fa06dc0e59667f94fa0
  19. May 15, 2020
    • Configure RabbitMQ user tags in nova-cell role · 869e3f21
      Jeffrey Zhang authored
      The RabbitMQ 'openstack' user has the 'administrator' tag assigned via
      the RabbitMQ definitions.json file.
      
      Since the Train release, the nova-cell role also configures the RabbitMQ
      user, but omits the tag. This causes the tag to be removed from the
      user, which prevents it from accessing the management UI and API.
      
      This change adds support for configuring user tags to the
      service-rabbitmq role, and sets the administrator tag by default.
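      
      The equivalent rabbitmqctl operation, shown for illustration (a
      standard RabbitMQ command):
      
          rabbitmqctl set_user_tags openstack administrator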
      
      Change-Id: I7a5d6fe324dd133e0929804d431583e5b5c1853d
      Closes-Bug: #1875786
  20. Apr 09, 2020
    • Introduce /etc/timezone to Debian/Ubuntu containers · 4b5df0d8
      Dincer Celik authored
      Some services look for /etc/timezone on Debian/Ubuntu, so we should
      introduce it to the containers.
      
      In addition, added prechecks for /etc/localtime and /etc/timezone.
      
      Closes-Bug: #1821592
      Change-Id: I9fef14643d1bcc7eee9547eb87fa1fb436d8a6b3
  21. Jan 30, 2020
    • Python 3: Use distro_python_version for dev mode · 5a786436
      Mark Goddard authored
      In dev mode, the python source is currently mounted under the python2.7
      site-packages directory. This change uses the distro_python_version
      variable to ensure dev mode works with Python 3 images.
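      
      An illustrative mount path (a sketch; the exact venv path is assumed):
      
          /var/lib/kolla/venv/lib/python{{ distro_python_version }}/site-packages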
      
      Change-Id: Ieae3778a02f1b79023b4f1c20eff27b37f481077
      Partially-Implements: blueprint python-3
  22. Jan 10, 2020
    • CentOS 8: Support variable image tag suffix · 9755c924
      Mark Goddard authored
      For the CentOS 7 to 8 transition, we will have a period where both
      CentOS 7 and 8 images are available. We differentiate these images via a
      tag - the CentOS 8 images will have a tag of train-centos8 (or
      master-centos8 temporarily).
      
      To achieve this, and maintain backwards compatibility for the
      openstack_release variable, we introduce a new 'openstack_tag' variable.
      This variable is based on openstack_release, but has a suffix of
      'openstack_tag_suffix', which is empty except on CentOS 8 where it has a
      value of '-centos8'.
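      
      An illustrative definition (a sketch based on the description above):
      
          openstack_tag: "{{ openstack_release }}{{ openstack_tag_suffix }}"
          # '-centos8' on CentOS 8, otherwise empty
          openstack_tag_suffix: ""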
      
      Change-Id: I12ce4661afb3c255136cdc1aabe7cbd25560d625
      Partially-Implements: blueprint centos-rhel-8
  23. Oct 16, 2019
    • Support multiple nova cells · 78a828ef
      Doug Szumski authored
      
      This patch adds initial support for deploying multiple Nova cells.
      
      Splitting a nova-cell role out from the Nova role allows a more granular
      approach to deploying and configuring Nova services.
      
      A new enable_cells flag has been added that enables the support of
      multiple cells via the introduction of a super conductor in addition to
      cell-specific conductors. When this flag is not set (the default), nova
      is configured in the same manner as before - with a single conductor.
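      
      For example, in globals.yml (the flag name is taken from this message):
      
          enable_cells: "yes"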
      
      The nova role now deploys the global services:
      
      * nova-api
      * nova-scheduler
      * nova-super-conductor (if enable_cells is true)
      
      The nova-cell role handles services specific to a cell:
      
      * nova-compute
      * nova-compute-ironic
      * nova-conductor
      * nova-libvirt
      * nova-novncproxy
      * nova-serialproxy
      * nova-spicehtml5proxy
      * nova-ssh
      
      This patch does not support using a single cell controller for managing
      more than one cell. Support for sharing a cell controller will be added
      in a future patch.
      
      This patch should be backwards compatible and is tested by existing CI
      jobs. A new CI job has been added that tests a multi-cell environment.
      
      ceph-mon has been removed from the play hosts list as it is not
      necessary - delegate_to does not require the host to be in the play.
      
      Documentation will be added in a separate patch.
      
      Partially Implements: blueprint support-nova-cells
      Co-Authored-By: Mark Goddard <mark@stackhpc.com>
      Change-Id: I810aad7d49db3f5a7fd9a2f0f746fd912fe03917
  24. Oct 01, 2019
    • Copy Nova role as a basis for the Nova cell role · 952b5308
      Doug Szumski authored
      The idea is to factor out a role for deploying Nova related services
      to cells. Since all deployments use cells, this role can be used
      in both regular deployments which have just cell0 and cell1,
      and deployments with many cells.
      
      Partially Implements: blueprint support-nova-cells
      Change-Id: Ib1f36ec0a773c384f2c1eac1843782a3e766045a
  25. Sep 19, 2019
    • Add support for libvirt+tls · f8cfccb9
      Kris Lindgren authored
      To securely support live migration between compute nodes we should
      enable TLS with certificate authentication, instead of TCP with no
      authentication.
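      
      A globals.yml sketch (the variable name libvirt_tls is an assumption,
      not stated in this message):
      
          libvirt_tls: "yes"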
      
      Implements: blueprint libvirt-tls
      
      Change-Id: I22ea6233933c840b853fdcc8e03400b2bf577271
  26. Aug 22, 2019
    • Remove stale nova-consoleauth variables · 67c59b1c
      Mark Goddard authored
      Nova-consoleauth support was removed in
      I099080979f5497537e390f531005a517ab12aa7a, but these variables were
      left.
      
      Change-Id: I1ce1631119bba991225835e8e409f11d53276550
  27. Jun 27, 2019
    • Restart all nova services after upgrade · e6d2b922
      Mark Goddard authored
      During an upgrade, nova pins the version of RPC calls to the minimum
      seen across all services. This ensures that old services do not receive
      data they cannot handle. After the upgrade is complete, all nova
      services are supposed to be reloaded via SIGHUP, causing them to check
      the RPC versions of services again and use the new latest version,
      which should now be supported by all running services.
      
      Due to a bug [1] in oslo.service, sending services SIGHUP is currently
      broken. We replaced the HUP with a restart for the nova_compute
      container for bug 1821362, but not other nova services. It seems we need
      to restart all nova services to allow the RPC version pin to be removed.
      
      Testing in a Queens to Rocky upgrade, we find the following in the logs:
      
      Automatically selected compute RPC version 5.0 from minimum service
      version 30
      
      However, the service version in Rocky is 35.
      
      There is a second issue in that it takes some time for the upgraded
      services to update the nova services database table with their new
      version. We need to wait until all nova-compute services have done this
      before the restart is performed, otherwise the RPC version cap will
      remain in place. There is currently no interface in nova available for
      checking these versions [2], so as a workaround we use a configurable
      delay with a default duration of 30 seconds. Testing showed it takes
      about 10 seconds for the version to be updated, so this gives us some
      headroom.
      
      This change restarts all nova services after an upgrade, after a 30
      second delay.
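      
      A minimal sketch of the workaround (the variable name is hypothetical;
      the message only states that the delay is configurable):
      
          - name: Wait for nova-compute services to update their service version
            pause:
              # hypothetical variable; the 30 second default is from this message
              seconds: "{{ nova_service_version_delay | default(30) }}"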
      
      [1] https://bugs.launchpad.net/oslo.service/+bug/1715374
      [2] https://bugs.launchpad.net/nova/+bug/1833542
      
      Change-Id: Ia6fc9011ee6f5461f40a1307b72709d769814a79
      Closes-Bug: #1833069
      Related-Bug: #1833542