- Apr 22, 2022
-
-
Mark Goddard authored
We run some nova tasks once per cell, using a condition to match a single host in the cell. In other similar tasks, we use run_once, which will fail all hosts if the task fails. Typically these tasks are critical, and that is desirable. However, with the approach used in nova-cell to support multiple cells, if a once-per-cell task fails, then other hosts will continue to execute, which could lead to unexpected results. This change adds any_errors_fatal to the plays or blocks that run these tasks. Closes-Bug: #1948694 Change-Id: I2a5871ccd4e8198171ef3239ce95f475f3e4b051
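A minimal sketch of the pattern described above, not the actual nova-cell code. The group name, the `first_host_in_cell` variable and the command are hypothetical placeholders. A task gated by `when` fails only the matching host, so `any_errors_fatal` is set on the enclosing block to stop the remaining hosts as well:

```yaml
- name: Run a once-per-cell task (illustrative sketch)
  hosts: compute                      # hypothetical group name
  tasks:
    - name: Group once-per-cell tasks so that a failure stops every host
      any_errors_fatal: true
      block:
        - name: Run on a single host in each cell
          ansible.builtin.command: /bin/true   # placeholder for the real task
          changed_when: false
          # first_host_in_cell is a hypothetical variable naming the host
          # chosen to act on behalf of the cell.
          when: inventory_hostname == first_host_in_cell
```

Setting the keyword on a block rather than on the whole play limits the fail-everything behaviour to the tasks that are critical.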
-
- Apr 05, 2022
-
-
Mark Goddard authored
This change addresses an issue in the nova-libvirt-cleanup command, added in I46854ed7eaf1d5b5e3ccd8531c963427848bdc99. Accept rc=1 from the pgrep command, since a lack of matches is a pass. Also, use bash for set -o pipefail. Change-Id: Iffda0dfffce8768324ffec55e629134c70e2e996
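The two fixes can be sketched as a single task. This is an illustration under assumptions, not the task from the change: the `qemu` pattern and the `sort` stage are placeholders chosen only to form a pipeline.

```yaml
- name: Check for running processes (illustrative sketch)
  become: true
  ansible.builtin.shell:
    cmd: |
      set -o pipefail
      pgrep -l qemu | sort
    executable: /bin/bash     # pipefail is a bash option, not POSIX sh
  register: pgrep_result
  changed_when: false
  # pgrep exits 0 when something matched and 1 when nothing matched;
  # only higher return codes indicate a real error.
  failed_when: pgrep_result.rc not in [0, 1]
```

Without `pipefail`, the return code of the pipeline would be that of its last command, which would hide a pgrep failure.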
-
- Mar 29, 2022
-
-
Mark Goddard authored
If any nova compute service fails to register itself, Kolla Ansible will fail the host that queries the Nova API. This is the first compute host in the inventory, and it fails in the task "Waiting for nova-compute services to register themselves". Other hosts continue, often leading to further errors later on. Clearly this is not ideal. This change modifies the behaviour to query the compute service list until all expected hosts are present, but does not fail the querying host if they are not. A new task is added that executes for all hosts, and fails only those hosts that have not registered successfully. Alternatively, to fail all hosts in a cell when any compute service fails to register, set nova_compute_registration_fatal to true. Change-Id: I12c1928cf1f1fb9e28f1741e7fe4968004ea1816 Closes-Bug: #1940119
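A rough sketch of the per-host failure task described above. The variable `registered_compute_hosts` is hypothetical and stands for a list built from the compute service list returned by the Nova API. The real task may also compare a different host name than `inventory_hostname`.

```yaml
- name: Fail hosts whose compute service did not register (illustrative sketch)
  ansible.builtin.fail:
    msg: "nova-compute on {{ inventory_hostname }} failed to register"
  # Runs on every host, so only unregistered hosts are failed and the
  # host that queried the API is not penalised for the others.
  when: inventory_hostname not in registered_compute_hosts
```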
-
- Mar 21, 2022
-
-
Mark Goddard authored
Change Ia1239069ccee39416b20959cbabad962c56693cf added support for running a libvirt daemon on the host, rather than using the nova_libvirt container. It did not cover migration of existing hosts from using a container to using a host daemon. This change adds a kolla-ansible nova-libvirt-cleanup command which may be used to clean up the nova_libvirt container, volumes and related items on hosts, once it has been disabled. The playbook assumes that compute hosts have been emptied of VMs before it runs. A future extension could support migration of existing VMs, but this is currently out of scope. Change-Id: I46854ed7eaf1d5b5e3ccd8531c963427848bdc99
-
Mark Goddard authored
In some cases it may be desirable to run the libvirt daemon on the host. For example, when mixing host and container OS distributions or versions. This change makes it possible to disable the nova_libvirt container, by setting enable_nova_libvirt_container to false. The default values of some Docker mounts and other paths have been updated to point to default host directories rather than Docker volumes when using a host libvirt daemon. This change does not handle migration of existing systems from using a nova_libvirt container to libvirt on the host. Depends-On: https://review.opendev.org/c/openstack/ansible-collection-kolla/+/830504 Change-Id: Ia1239069ccee39416b20959cbabad962c56693cf
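As a sketch, the option named above would typically be set in the deployment's globals file. The file location shown in the comment is an assumption about a conventional setup.

```yaml
# globals.yml (commonly /etc/kolla/globals.yml; the location is an assumption)
# Disable the nova_libvirt container and use a libvirt daemon on the host.
enable_nova_libvirt_container: false
```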
-
- Mar 18, 2022
-
-
Imran Hussain authored
Consistently use template instead of copy. This has the added advantage of allowing variables inside ceph conf files and keyrings. Closes-Bug: 1959565 Signed-off-by: Imran Hussain <ih@imranh.co.uk> Change-Id: Ibd0ff2641a54267ff06d3c89a26915a455dff1c1
-
- Mar 10, 2022
-
-
Mark Goddard authored
In Kolla Ansible OpenStack deployments, by default, libvirt is configured to allow read-write access via an unauthenticated, unencrypted TCP connection, using the internal API network. This is to facilitate migration between hosts. By default, Kolla Ansible does not use encryption for services on the internal network (and did not support it until Ussuri). However, most other services on the internal network are at least authenticated (usually via passwords), ensuring that they cannot be used by anyone with access to the network, unless they have credentials. The main issue here is the lack of authentication. Any client with access to the internal network is able to connect to the libvirt TCP port and make arbitrary changes to the hypervisor. This could include starting a VM, modifying an existing VM, etc. Given the flexibility of the domain options, it could be seen as equivalent to having root access to the hypervisor. Kolla Ansible supports libvirt TLS [1] since the Train release, using client and server certificates for mutual authentication and encryption. However, this feature is not enabled by default, and requires certificates to be generated for each compute host. This change adds support for libvirt SASL authentication, and enables it by default. This provides a base level of security. Deployments requiring further security should use libvirt TLS. [1] https://docs.openstack.org/kolla-ansible/latest/reference/compute/libvirt-guide.html#libvirt-tls Depends-On: https://review.opendev.org/c/openstack/kolla/+/833021 Closes-Bug: #1964013 Change-Id: Ia91ceeb609e4cdb144433122b443028c0278b71e
-
- Jan 10, 2022
-
-
Radosław Piliszek authored
This is required as nova_compute tries to reach my_ip of the other node when resizing an instance and my_ip is set to api_interface_address. This potential issue was introduced with [1]. [1] https://review.opendev.org/c/openstack/kolla-ansible/+/569131 Closes-Bug: #1956976 Change-Id: Id57a672c69a2d5aa74e55f252d05bb756bbc945a
-
- Oct 27, 2021
-
-
Mark Goddard authored
This reverts commit 15259002. Reason for revert: The iptables_firewall produces warnings without it. Change-Id: Id046a3048436c4c18dd1fd9700ac9971d8c42c57
-
- Oct 01, 2021
-
-
Radosław Piliszek authored
Nor set related sysctls. More details in the reno. Change-Id: I898548ecc6df3caa094c3222159b7ba1e16dc211 Closes-Bug: #1945789
-
- Sep 28, 2021
-
-
Niklas Hagman authored
A system-scoped token implies the user has authorization to act on the deployment system. These tokens are useful for interacting with resources that affect the deployment as a whole, or expose resources that may otherwise violate project or domain isolation. Since Queens, the keystone-manage bootstrap command assigns the admin role to the admin user with system scope, as well as in the admin project. This patch transitions the Keystone admin user from authenticating using project scoped tokens to system scoped tokens. This is a necessary step towards being able to enable the updated oslo policies in services that allow finer grained access to system-level resources and APIs. An etherpad with discussion about the transition to the new oslo service policies is: https://etherpad.opendev.org/p/enabling-system-scope-in-kolla-ansible Change-Id: Ib631e2211682862296cce9ea179f2661c90fa585 Signed-off-by: Niklas Hagman <ubuntu@post.blinkiz.com>
-
- Aug 12, 2021
-
-
Michal Arbet authored
The Kolla Ansible upgrade task calls different handlers from the deploy task, and these handlers are missing the healthcheck key. This patch fixes that. Closes-Bug: #1939679 Change-Id: Id83d20bfd89c27ccf70a3a79938f428cdb5d40fc
-
- Aug 10, 2021
-
-
Radosław Piliszek authored
We get a nice optimisation by using a filtered loop instead of task skipping per service with 'when'. Partially-Implements: blueprint performance-improvements Change-Id: I8f68100870ab90cb2d6b68a66a4c97df9ea4ff52
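The optimisation can be illustrated with a before-and-after pair. This is a generic sketch rather than the code from the change: `example_services` is a hypothetical dictionary whose values carry a boolean `enabled` key, and the path is a placeholder.

```yaml
# Before: iterate over every service and skip the ones that do not apply.
# Each skipped item still costs time and adds noise to the output.
- name: Ensure config directories exist (skip per service)
  ansible.builtin.file:
    path: "/etc/kolla/{{ item.key }}"
    state: directory
  when: item.value.enabled | bool
  with_dict: "{{ example_services }}"

# After: filter the dictionary first, so only relevant services are iterated.
# selectattr without a test keeps items whose attribute is truthy, so this
# assumes 'enabled' holds real booleans rather than strings.
- name: Ensure config directories exist (filtered loop)
  ansible.builtin.file:
    path: "/etc/kolla/{{ item.key }}"
    state: directory
  loop: "{{ example_services | dict2items | selectattr('value.enabled') | list }}"
```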
-
- Aug 02, 2021
-
-
Michal Arbet authored
This trivial patch sets "timeout tunnel" in the HAProxy configuration for spicehtml5proxy. This option extends the time before spice's websocket connection is closed, so spice will not freeze. The default value is set to 1h, as it is for novnc. Closes-Bug: #1938549 Change-Id: I3a5cd98ecf4916ebd0748e7c08111ad0e4dca0b2
-
- Jul 27, 2021
-
-
wu.chunyang authored
Nova always tries to create the rabbitmq user regardless of whether RabbitMQ is enabled or not. This patch set also adds an external RabbitMQ doc. Change-Id: Iec517226e4c82ea351889b55689a3efceaadcc76
-
- Jun 23, 2021
-
-
Mark Goddard authored
By default, Ansible injects a variable for every fact, prefixed with ansible_. This can result in a large number of variables for each host, which at scale can incur a performance penalty. Ansible provides a configuration option [0] that can be set to False to prevent this injection of facts. In this case, facts should be referenced via ansible_facts.<fact>. This change updates all references to Ansible facts within Kolla Ansible from using individual fact variables to using the items in the ansible_facts dictionary. This allows users to disable fact variable injection in their Ansible configuration, which may provide some performance improvement. This change disables fact variable injection in the ansible configuration used in CI, to catch any attempts to use the injected variables. [0] https://docs.ansible.com/ansible/latest/reference_appendices/config.html#inject-facts-as-vars Change-Id: I7e9d5c9b8b9164d4aee3abb4e37c8f28d98ff5d1 Partially-Implements: blueprint performance-improvements
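A small sketch of the reference style described above, using the `hostname` fact as an example. The task itself is illustrative only.

```yaml
- name: Show the hostname via the facts dictionary
  ansible.builtin.debug:
    # ansible_facts.hostname keeps working when fact injection is disabled;
    # the injected ansible_hostname variable would then be undefined.
    msg: "{{ ansible_facts.hostname }}"
```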
-
- May 30, 2021
-
-
Radosław Piliszek authored
Makes nova-libvirt container always run in 'host' CgroupnsMode to ensure it works. Change-Id: I75105baf434977c68bc5c8ca1f5213e602c52c8c
-
- Mar 02, 2021
-
-
Michał Nasiadka authored
Change-Id: Ib6719a033b37be3e248b682795b7243c60b22b84
-
- Dec 14, 2020
-
-
Mark Goddard authored
This reverts commit 9cae59be. Reason for revert: This patch was found to introduce issues with fluentd customisation. The underlying issue is not currently fully understood, but could be a sign of other obscure issues. Change-Id: Ia4859c23d85699621a3b734d6cedb70225576dfc Closes-Bug: #1906288
-
- Oct 27, 2020
-
-
Radosław Piliszek authored
Makes 'import_tasks' not change behaviour compared to 'include_tasks'. Change-Id: I600be7c3bd763b3b924bd4a45b4e7b4dca7a33e3
-
Radosław Piliszek authored
Main plays are action-redirect-stubs, ideal for import_tasks. This avoids 'include' penalty and makes logs/ara look nicer. Fixes haproxy and rabbitmq not to check the host group as well. Change-Id: I46136fc40b815e341befff80b54a91ef431eabc0 Partially-Implements: blueprint performance-improvements
-
- Oct 12, 2020
-
-
Radosław Piliszek authored
Config plays do not need to check containers. This avoids skipping tasks during the genconfig action. Ironic and Glance rolling upgrades are handled specially. Swift and Bifrost do not use the handlers at all. Partially-Implements: blueprint performance-improvements Change-Id: I140bf71d62e8f0932c96270d1f08940a5ba4542a
-
- Oct 05, 2020
-
-
Michal Nasiadka authored
This change enables the use of Docker healthchecks for core OpenStack services. Also check-failures.sh has been updated to treat containers with unhealthy status as failed. Implements: blueprint container-health-check Change-Id: I79c6b11511ce8af70f77e2f6a490b59b477fefbb
-
- Sep 21, 2020
-
-
Radosław Piliszek authored
via KOLLA_SKIP and KOLLA_UNSET Change-Id: I7d9af21c2dd8c303066eb1ee4dff7a72bca24283 Related-Bug: #1837551
-
Radosław Piliszek authored
via kolla_sysctl_conf_path Change-Id: I09b20fa008a7fecedcb599b4792f24215179b853
-
- Sep 18, 2020
-
-
wu.chunyang authored
Replace hardcoded 'internal' with {{ openstack_interface }}. Change-Id: I885622967ffde2a7a1a08fedbde2eb0e4e330e22
-
- Aug 28, 2020
-
-
Mark Goddard authored
Including tasks has a performance penalty when compared with importing tasks. The nova-cell role uses include_tasks twice when generating certificates and keys for libvirt TLS. While a dynamic include makes sense here for a non-default feature, we can use one include rather than two with the same effect. Since this task runs against compute nodes the overhead is significant. See [1] for benchmarks of include_tasks and import_tasks. [1] https://github.com/stackhpc/ansible-scaling/blob/master/doc/include-and-import.md Partially-Implements: blueprint performance-improvements Change-Id: Ic687d2f7d4625aede386e576ebb174da72142756
-
Mark Goddard authored
Including tasks has a performance penalty when compared with importing tasks. If the include has a condition associated with it, then the overhead of the include may be lower than the overhead of skipping all imported tasks. For unconditionally included tasks, switching to import_tasks provides a clear benefit. Benchmarking of include vs. import is available at [1]. This change switches from include_tasks to import_tasks where there is no condition applied to the include. [1] https://github.com/stackhpc/ansible-scaling/blob/master/doc/include-and-import.md#task-include-and-import Partially-Implements: blueprint performance-improvements Change-Id: Ia45af4a198e422773d9f009c7f7b2e32ce9e3b97
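The rule of thumb from the two changes above can be sketched as follows. The file names and the `enable_optional_feature` variable are hypothetical.

```yaml
# Unconditional: a static import is resolved at parse time and avoids the
# per-host cost of a dynamic include.
- import_tasks: config.yml

# Conditional: a dynamic include is skipped as a single unit when the
# condition is false, instead of skipping every task in the file.
- include_tasks: optional-feature.yml
  when: enable_optional_feature | bool
```

With `import_tasks`, a `when` condition is applied to each imported task individually, which is why the conditional case can favour `include_tasks` when the included file is long.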
-
- Aug 22, 2020
-
-
wu.chunyang authored
openstackclient doesn't support the os-tenant-name parameter. Use os-project-name instead of os-tenant-name. https://docs.openstack.org/python-openstackclient/ussuri/cli/man/openstack.html Change-Id: Ibf17424c49118b4c3b7e621e04b43c8cdcf308a4
-
- Jul 28, 2020
-
-
Mark Goddard authored
Including tasks has a performance penalty when compared with importing tasks. If the include has a condition associated with it, then the overhead of the include may be lower than the overhead of skipping all imported tasks. In the case of the check-containers.yml include, the included file only has a single task, so the overhead of skipping this task will not be greater than the overhead of the task import. It therefore makes sense to switch to use import_tasks there. Partially-Implements: blueprint performance-improvements Change-Id: I65d911670649960708b9f6a4c110d1a7df1ad8f7
-
- Jul 17, 2020
-
-
Radosław Piliszek authored
This makes use of udev rules to make it smarter and override host-level packages settings. Additionally, this masks Ubuntu-only service that is another pain point in terms of /dev/kvm permissions. Fingers crossed for no further surprises. Change-Id: I61235b51e2e1325b8a9b4f85bf634f663c7ec3cc Closes-bug: #1681461
-
- Jul 08, 2020
-
-
Mark Goddard authored
The nova-cell role sets the following sysctls on compute hosts, which require the br_netfilter kernel module to be loaded:
    net.bridge.bridge-nf-call-iptables
    net.bridge.bridge-nf-call-ip6tables
If it is not loaded, then we see the following errors:
    Failed to reload sysctl:
    sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory
    sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory
Loading the br_netfilter module resolves this issue. Typically we do not see this since installing Docker and configuring it to manage iptables rules causes the br_netfilter module to be loaded. There are good reasons [1] to disable Docker's iptables management however, in which case we are likely to hit this issue. This change loads the br_netfilter module in the nova-cell role for compute hosts. [1] https://bugs.launchpad.net/kolla-ansible/+bug/1849275 Co-Authored-By: Dincer Celik <hello@dincercelik.com> Change-Id: Id52668ba8dab460ad4c33fad430fc8611e70825e
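The ordering described above, loading the module before touching the sysctls, might look like the following sketch. The choice of modules and the value of 1 are assumptions for illustration; the role may implement this differently.

```yaml
- name: Load the br_netfilter kernel module
  become: true
  community.general.modprobe:
    name: br_netfilter
    state: present

# The /proc/sys/net/bridge entries only exist once br_netfilter is loaded,
# so this task must come after the one above.
- name: Set bridge netfilter sysctls
  become: true
  ansible.posix.sysctl:
    name: "{{ item }}"
    value: "1"              # assumed value, for illustration only
    sysctl_set: true
  loop:
    - net.bridge.bridge-nf-call-iptables
    - net.bridge.bridge-nf-call-ip6tables
```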
-
- Jun 16, 2020
-
-
gugug authored
The double quotes are not necessary for include_tasks; this patch set cleans them up. Change-Id: I0701035d185fdf19286cced7fe51fc277511e4c1
-
- Jun 09, 2020
-
-
Christian Berendt authored
Change-Id: Iea3f4f3d2e5c6040c1e0bc7bfae8719cc7d8ac55
-
- Jun 07, 2020
-
-
wu.chunyang authored
A non-root user has no permission to create a directory under /opt. Use "become: true" to resolve it. Change-Id: I155efc4b1e0691da0aaf6ef19ca709e9dc2d9168
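A minimal sketch of the fix, with a hypothetical path and mode:

```yaml
- name: Ensure a directory exists under /opt
  become: true              # /opt is normally writable only by root
  ansible.builtin.file:
    path: /opt/example      # hypothetical path
    state: directory
    mode: "0755"
```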
-
- Apr 16, 2020
-
-
Michal Nasiadka authored
Change-Id: I500cc8800c412bc0e95edb15babad5c1189e6ee4
-
Mark Goddard authored
If using a separate message queue for nova notifications, i.e. nova_cell_notify_transport_url is different from nova_cell_rpc_transport_url, then Kolla Ansible will unnecessarily update the cell. This should not cause any issues since the URL is taken from nova.conf. This change fixes the comparison to use the correct URL. Change-Id: I5f0e30957bfd70295f2c22c86349ebbb4c1fb155 Closes-Bug: #1873255
-
- Apr 14, 2020
-
-
Mark Goddard authored
Deploy a small cloud. Add one host to the compute group in the inventory, and scale out:
    $ kolla-ansible deploy --limit <new compute host>
The command succeeds, but creating an instance fails with the following:
    Host 'compute0' is not mapped to any cell
This happens because we only discover computes on the first host in the cell's nova conductor group. If that host is not in the specified limit, the discovery will not happen. This change fixes the issue by running compute discovery when any ironic or virtualised compute hosts are in the play batch, and delegating it to a conductor. Change-Id: Ie984806240d147add825ffa8446ae6ff55ca4814 Closes-Bug: #1869371
-
James Kirsch authored
Refactor service configuration to use the copy certificates task. This reduces code duplication and simplifies implementing encrypting backend HAProxy traffic for individual services. Change-Id: I0474324b60a5f792ef5210ab336639edf7a8cd9e
-
- Apr 08, 2020
-
-
Mark Goddard authored
This is a follow up to I001defc75d1f1e6caa9b1e11246abc6ce17c775b. To maintain previous behaviour, and ensure we catch any host configuration changes, we should perform host configuration during upgrade. Change-Id: I79fcbf1efb02b7187406d3c3fccea6f200bcea69 Related-Bug: #1860161
-