  1. Mar 29, 2022
    • nova: improve compute service registration failure handling · f1d3ff11
      Mark Goddard authored
      If any nova compute service fails to register itself, Kolla Ansible will
      fail the host that queries the Nova API. This is the first compute host
      in the inventory, and fails in the task:
      
          Waiting for nova-compute services to register themselves
      
      Other hosts continue, often leading to further errors later on. Clearly
      this is not ideal.
      
      This change modifies the behaviour to query the compute service list
      until all expected hosts are present, but does not fail the querying
      host if they are not. A new task is added that executes for all hosts,
      and fails only those hosts that have not registered successfully.
      
      Alternatively, to fail all hosts in a cell when any compute service
      fails to register, set nova_compute_registration_fatal to true.
      
      Change-Id: I12c1928cf1f1fb9e28f1741e7fe4968004ea1816
      Closes-Bug: #1940119
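Under the new behaviour, the fail-fast semantics can be restored per deployment via the variable named above; a minimal globals.yml sketch:

```yaml
# /etc/kolla/globals.yml
# Fail all hosts in a cell if any compute service fails to register
# (the default is the new non-fatal behaviour):
nova_compute_registration_fatal: true
```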
  2. Mar 25, 2022
    • Change grafana provisioning.yaml indentation · 3d91e69a
      Jan Horstmann authored
      This commit changes the indentation scheme used in
      `ansible/roles/grafana/templates/provisioning.yaml.j2` to the commonly
      used pattern of two whitespaces.
      
      Change-Id: I2f9d34930ed06aa2e63f7cc28bfdda7046fc3e67
  3. Mar 24, 2022
  4. Mar 23, 2022
  5. Mar 22, 2022
  6. Mar 21, 2022
    • Ironic: rebootstrap ironic-pxe on upgrade · 1db06b32
      Radosław Piliszek authored
      Like other containers.
      
      This ensures that upgrade already updates PXE components and no
      additional deploy/reconfigure is needed.
      
      Closes-Bug: #1963752
      Change-Id: I368780143086bc5baab1556a5ec75c19950d5e3c
    • Support Prometheus as metrics database for Ceilometer · 6cf03122
      Juan Pablo Suazo authored
      
      This commit adds support for pushing Ceilometer metrics
      to Prometheus instead of Gnocchi or alongside it.
      
      
      Closes-Bug: #1964135
      Signed-off-by: Juan Pablo Suazo <jsuazo@whitestack.com>
      Change-Id: I9fd32f63913a534c59e2d17703702074eea5dd76
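In Ceilometer terms, pushing to Prometheus is done through a pipeline publisher; a sketch of a pipeline.yaml sink, assuming a hypothetical Prometheus-compatible push endpoint on prometheus-host (the exact Kolla Ansible variables are not shown in this message):

```yaml
# pipeline.yaml sink sketch (endpoint hypothetical)
sinks:
  - name: meter_sink
    publishers:
      # push samples to a Prometheus-compatible endpoint
      - prometheus://prometheus-host:9091/metrics/job/ceilometer
      # optionally keep publishing to Gnocchi alongside
      - gnocchi://
```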
    • libvirt: add nova-libvirt-cleanup command · 80b311be
      Mark Goddard authored
      Change Ia1239069ccee39416b20959cbabad962c56693cf added support for
      running a libvirt daemon on the host, rather than using the nova_libvirt
      container. It did not cover migration of existing hosts from using a
      container to using a host daemon.
      
      This change adds a kolla-ansible nova-libvirt-cleanup command which may
      be used to clean up the nova_libvirt container, volumes and related
      items on hosts, once it has been disabled.
      
      The playbook assumes that compute hosts have been emptied of VMs before
      it runs. A future extension could support migration of existing VMs, but
      this is currently out of scope.
      
      Change-Id: I46854ed7eaf1d5b5e3ccd8531c963427848bdc99
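A sketch of the intended workflow, assuming the inventory lives at a hypothetical path and enable_nova_libvirt_container has already been set to false:

```shell
# Drain VMs from the affected compute hosts first, then:
kolla-ansible nova-libvirt-cleanup -i /etc/kolla/multinode
```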
    • libvirt: make it possible to run libvirt on the host · 4e41acd8
      Mark Goddard authored
      In some cases it may be desirable to run the libvirt daemon on the host.
      For example, when mixing host and container OS distributions or
      versions.
      
      This change makes it possible to disable the nova_libvirt container, by
      setting enable_nova_libvirt_container to false. The default values of
      some Docker mounts and other paths have been updated to point to default
      host directories rather than Docker volumes when using a host libvirt
      daemon.
      
      This change does not handle migration of existing systems from using
      a nova_libvirt container to libvirt on the host.
      
      Depends-On: https://review.opendev.org/c/openstack/ansible-collection-kolla/+/830504
      
      Change-Id: Ia1239069ccee39416b20959cbabad962c56693cf
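The opt-out described above is a single switch in globals.yml:

```yaml
# /etc/kolla/globals.yml
# Run the libvirt daemon on the host instead of the nova_libvirt container:
enable_nova_libvirt_container: false
```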
  7. Mar 18, 2022
  8. Mar 17, 2022
    • ADD venus for kolla-ansible · 3ccb176f
      jinyuanliu authored
      This project [1] provides a one-stop solution for log collection,
      cleaning, indexing, analysis, alarming, visualization and report
      generation, helping operators and maintainers to quickly locate
      and resolve problems, grasp the operational health of the
      platform, and improve the level of platform management.
      
      [1] https://wiki.openstack.org/wiki/Venus
      
      Change-Id: If3562bbed6181002b76831bab54f863041c5a885
  9. Mar 10, 2022
    • libvirt: support SASL authentication · d2d4b53d
      Mark Goddard authored
      In Kolla Ansible OpenStack deployments, by default, libvirt is
      configured to allow read-write access via an unauthenticated,
      unencrypted TCP connection, using the internal API network.  This is to
      facilitate migration between hosts.
      
      By default, Kolla Ansible does not use encryption for services on the
      internal network (and did not support it until Ussuri). However, most
      other services on the internal network are at least authenticated
      (usually via passwords), ensuring that they cannot be used by anyone
      with access to the network, unless they have credentials.
      
      The main issue here is the lack of authentication. Any client with
      access to the internal network is able to connect to the libvirt TCP
      port and make arbitrary changes to the hypervisor. This could include
      starting a VM, modifying an existing VM, etc. Given the flexibility of
      the domain options, it could be seen as equivalent to having root access
      to the hypervisor.
      
      Kolla Ansible supports libvirt TLS [1] since the Train release, using
      client and server certificates for mutual authentication and encryption.
      However, this feature is not enabled by default, and requires
      certificates to be generated for each compute host.
      
      This change adds support for libvirt SASL authentication, and enables it
      by default. This provides a base level of security. Deployments requiring
      further security should use libvirt TLS.
      
      [1] https://docs.openstack.org/kolla-ansible/latest/reference/compute/libvirt-guide.html#libvirt-tls
      
      Depends-On: https://review.opendev.org/c/openstack/kolla/+/833021
      Closes-Bug: #1964013
      Change-Id: Ia91ceeb609e4cdb144433122b443028c0278b71e
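At the libvirt level, SASL on the plain TCP listener corresponds to the standard libvirtd options below; this is a sketch of the generic libvirt configuration, not necessarily the exact template this change ships:

```
# libvirtd.conf sketch: require SASL on the unencrypted TCP listener
listen_tcp = 1
auth_tcp = "sasl"
```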
  10. Mar 08, 2022
    • Adds etcd endpoints as a Prometheus scrape target · 0f2794a0
      Nathan Taylor authored
      Add "enable_prometheus_etcd_integration" configuration parameter which
      can be used to configure Prometheus to scrape etcd metrics endpoints.
      The default value of "enable_prometheus_etcd_integration" is set to
      the combined values of "enable_prometheus" and "enable_etcd".
      
      Change-Id: I7a0b802c5687e2d508e06baf55e355d9761e806f
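The derived default can be overridden explicitly in globals.yml, for example to opt out of scraping etcd while keeping both services enabled:

```yaml
# /etc/kolla/globals.yml
enable_prometheus_etcd_integration: "no"
```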
  11. Mar 07, 2022
    • Explicitly unset net.ipv4.ip_forward sysctl · caf33be5
      Mark Goddard authored
      While I8bb398e299aa68147004723a18d3a1ec459011e5 stopped setting
      the net.ipv4.ip_forward sysctl, this change explicitly removes the
      option from the Kolla sysctl config file. In the absence of another
      source for this sysctl, it should revert to the default of 0 after the
      next reboot.
      
      A deployer looking to more aggressively change the value may set
      neutron_l3_agent_host_ipv4_ip_forward to 0. Any deployments still
      relying on the previous value may set
      neutron_l3_agent_host_ipv4_ip_forward to 1.
      
      Related-Bug: #1945453
      
      Change-Id: I9b39307ad8d6c51e215fe3d3bc56aab998d218ec
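The two overrides mentioned above are plain globals.yml settings:

```yaml
# /etc/kolla/globals.yml
# Keep forwarding enabled for deployments relying on the previous value:
neutron_l3_agent_host_ipv4_ip_forward: 1
# Or, to change the value more aggressively:
# neutron_l3_agent_host_ipv4_ip_forward: 0
```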
  12. Mar 04, 2022
  13. Mar 03, 2022
  14. Mar 02, 2022
  15. Feb 24, 2022
  16. Feb 23, 2022
  17. Feb 22, 2022
  18. Feb 21, 2022
    • Remove classic queue mirroring for internal RabbitMQ · 6bfe1927
      Doug Szumski authored
      When OpenStack is deployed with Kolla-Ansible, by default there
      are no durable queues or exchanges created by the OpenStack
      services in RabbitMQ. In Rabbit terminology, not being durable
      is referred to as `transient`, and this means that the queue
      is generally held in memory.
      
      Whether OpenStack services create durable or transient queues is
      traditionally controlled by the Oslo Notification config option:
      `amqp_durable_queues`. In Kolla-Ansible, this remains set to
      the default of `False` in all services. The only `durable`
      objects are the `amq*` exchanges which are internal to RabbitMQ.
      
      More recently, Oslo Notification has introduced support for
      Quorum queues [7]. These are a successor to durable classic
      queues, however it isn't yet clear if they are a good fit for
      OpenStack in general [8].
      
      For clustered RabbitMQ deployments, Kolla-Ansible configures all
      queues as `replicated` [1]. Replication occurs over all nodes
      in the cluster. RabbitMQ refers to this as 'mirroring of classic
      queues'.
      
      In summary, this means that a multi-node Kolla-Ansible deployment
      will end up with a large number of transient, mirrored queues
      and exchanges. However, the RabbitMQ documentation warns against
      this, stating that 'For replicated queues, the only reasonable
      option is to use durable queues' [2]. This is discussed
      further in the following bug report: [3].
      
      Whilst we could try enabling the `amqp_durable_queues` option
      for each service (this is suggested in [4]), there are
      a number of complexities with this approach, not limited to:
      
      1) RabbitMQ is planning to remove classic queue mirroring in
         favor of 'Quorum queues' in a forthcoming release [5].
      2) Durable queues will be written to disk, which may cause
         performance problems at scale. Note that this includes
         Quorum queues which are always durable.
      3) Potential for race conditions and other complexity
         discussed recently on the mailing list under:
         `[ops] [kolla] RabbitMQ High Availability`
      
      The remaining option, proposed here, is to use classic
      non-mirrored queues everywhere, and rely on services to recover
      if the node hosting a queue or exchange they are using fails.
      There is some discussion of this approach in [6]. The downside
      of potential message loss needs to be weighed against the real
      upsides of increasing the performance of RabbitMQ, and moving
      to a configuration which is officially supported and hopefully
      more stable. In the future, we can then consider promoting
      specific queues to quorum queues, in cases where message loss
      can result in failure states which are hard to recover from.
      
      [1] https://www.rabbitmq.com/ha.html
      [2] https://www.rabbitmq.com/queues.html
      [3] https://github.com/rabbitmq/rabbitmq-server/issues/2045
      [4] https://wiki.openstack.org/wiki/Large_Scale_Configuration_Rabbit
      [5] https://blog.rabbitmq.com/posts/2021/08/4.0-deprecation-announcements/
      [6] https://fuel-ccp.readthedocs.io/en/latest/design/ref_arch_1000_nodes.html#replication
      [7] https://bugs.launchpad.net/oslo.messaging/+bug/1942933
      [8] https://www.rabbitmq.com/quorum-queues.html#use-cases
      
      Partial-Bug: #1954925
      Change-Id: I91d0e23b22319cf3fdb7603f5401d24e3b76a56e
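For comparison, the durable-queue alternative discussed above would be toggled per service through the Oslo option named earlier, for example via a Kolla config override (sketch; file path illustrative):

```ini
# /etc/kolla/config/nova.conf (override sketch)
[oslo_messaging_rabbit]
amqp_durable_queues = true
```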
    • cloudkitty: fix URL used for Prometheus collector · b36c91b6
      Pierre Riteau authored
      The Prometheus HTTP API is reachable under /api/v1. Without this fix,
      CloudKitty receives 404 errors from Prometheus.
      
      Change-Id: Ie872da5ccddbcb8028b8b57022e2427372ed474e
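A sketch of the corrected collector endpoint in cloudkitty.conf, with host and port illustrative (prometheus_url is CloudKitty's standard option name, assumed here):

```ini
[collector_prometheus]
# Must include the /api/v1 prefix of the Prometheus HTTP API:
prometheus_url = http://prometheus-host:9090/api/v1
```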
    • Install openstack.kolla collection · f63f1f30
      Mark Goddard authored
      This change adds an Ansible Galaxy requirements file including the
      openstack.kolla collection. A new 'kolla-ansible install-deps' command
      is provided to install the requirements.
      
      With the new collection in place, this change also switches to using the
      baremetal role from the openstack.kolla collection, and removes the
      baremetal role from this repository.
      
      Depends-On: https://review.opendev.org/c/openstack/ansible-collection-kolla/+/820168
      
      Change-Id: I9708f57b4bb9d64eb4903c253684fe0d9147bd4a
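The requirements file described might be as minimal as the following sketch (exact contents are not shown in this message); `kolla-ansible install-deps` then installs it, presumably via ansible-galaxy:

```yaml
# requirements.yml sketch
collections:
  - name: openstack.kolla
```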
  19. Feb 18, 2022
    • Configure node-exporter to report correct file system metrics · b210dcd6
      Pierre Riteau authored
      Without this configuration, all mount points are reporting the same
      utilisation metrics [1]. With the rslave option, all root mounts from
      the host are visible in the container, so we can remove the bind mounts
      for /proc and /sys.
      
      [1] https://github.com/prometheus/node_exporter#docker
      
      Change-Id: I4087dc81f9d1fa5daa24b9df6daf1f9e1ccd702f
      Closes-Bug: #1961438
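The upstream node_exporter Docker guidance referenced in [1] amounts to mounting the host root with rslave propagation and pointing the exporter at it; a sketch of that shape:

```yaml
# docker-compose style sketch, per the node_exporter README
volumes:
  - '/:/host:ro,rslave'
command:
  - '--path.rootfs=/host'
```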
    • Add support for VMware First Class Disk (FCD) · 812e03f7
      alecorps authored
      An FCD, also known as an Improved Virtual Disk (IVD) or
      Managed Virtual Disk, is a named virtual disk independent of
      a virtual machine. Using FCDs for Cinder volumes eliminates
      the need for shadow virtual machines.
      This patch adds Kolla support.
      
      Change-Id: Ic0b66269e6d32762e786c95cf6da78cb201d2765
    • Allow to define extra parameters for Prometheus exporters · dcba8297
      Pierre Riteau authored
      The following variables are added:
      
      * prometheus_blackbox_exporter_cmdline_extras
      * prometheus_elasticsearch_exporter_cmdline_extras
      * prometheus_haproxy_exporter_cmdline_extras
      * prometheus_memcached_exporter_cmdline_extras
      * prometheus_mysqld_exporter_cmdline_extras
      * prometheus_node_exporter_cmdline_extras
      * prometheus_openstack_exporter_cmdline_extras
      
      Change-Id: I5da2031b9367115384045775c515628e2acb1aa4
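Each variable presumably appends raw flags to the corresponding exporter's command line; a globals.yml sketch (the flag shown is a real node_exporter option, chosen for illustration):

```yaml
# /etc/kolla/globals.yml
prometheus_node_exporter_cmdline_extras: "--collector.textfile.directory=/var/lib/node_exporter"
```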
  20. Feb 17, 2022
    • Add support for VMware NSXP · 458c8b13
      Alban Lecorps authored
      NSXP is the OpenStack support for the NSX Policy platform.
      It has been supported by Neutron since the Stein release. This
      patch adds Kolla support.
      
      This adds a new neutron_plugin_agent type 'vmware_nsxp'. The plugin
      does not run any neutron agents.
      
      Change-Id: I9e9d8f07e586bdc143d293e572031368af7f3fca
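Selecting the new plugin is a globals.yml setting, using the value named above:

```yaml
# /etc/kolla/globals.yml
neutron_plugin_agent: "vmware_nsxp"
```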
  21. Feb 15, 2022
  22. Feb 11, 2022
  23. Feb 10, 2022
    • neutron: fix placement endpoint type configuration · 50edb94d
      Pierre Riteau authored
      Change-Id: I3362bd283eb7fb80f5da70f2a388f89f220617ea
      Closes-Bug: #1960503
    • ironic: sync default inspection UEFI iPXE bootloader with Ironic · 556d9799
      Mark Goddard authored
      The bootloader used to boot Ironic nodes in UEFI boot mode during
      inspection when iPXE is enabled has been changed from ipxe.efi to
      snponly.efi. This is in line with the default UEFI iPXE bootloader used
      in Ironic since the Xena release. The bootloader may be changed via
      ironic_dnsmasq_uefi_ipxe_boot_file.
      
      Note that snponly.efi was not available in the ironic-pxe image
      prior to I79e78dca550262fc86b092a036f9ea96b214ab48.
      
      Related-Bug: #1959203
      
      Change-Id: I879db340769cc1b076e77313dff15876e27fcac4
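Deployments needing the previous bootloader can revert via the variable named above:

```yaml
# /etc/kolla/globals.yml
# Revert to the pre-change default if snponly.efi causes problems:
ironic_dnsmasq_uefi_ipxe_boot_file: "ipxe.efi"
```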
  24. Feb 09, 2022
  25. Feb 08, 2022