  1. Feb 14, 2023
    • Improve RabbitMQ performance by reducing ha replicas · 6cf22b0c
      John Garbutt authored
      Currently we do not follow the RabbitMQ advice on replicas here:
      https://www.rabbitmq.com/ha.html#replication-factor
      
      Here we reduce the number of replicas to n // 2 + 1 as advised
      above. The hope is that this helps speed up recovery from RabbitMQ
      issues.
      
      Related-Bug: #1954925
      Change-Id: Ib6bcb26c499c9884faa4a0cd51abaec00cacb096
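      For illustration, the arithmetic for a three-node cluster is
      n // 2 + 1 = 3 // 2 + 1 = 2, giving a minimal sketch of the resulting
      classic mirroring policy. The JSON keys follow the standard RabbitMQ
      ha-mode/ha-params policy format; the exact layout inside
      definitions.json is an assumption, not taken from the patch:

          {"ha-mode": "exactly", "ha-params": 2}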
    • Add flag to change RabbitMQ ha-mode definition · e13072a9
      Matt Crees authored
      Adds the flag `rabbitmq_ha_replica_count` to change how many different
      nodes a queue should be mirrored across. If the value is not set, then
      it defaults to "ha-mode":"all". This value is unset by default to avoid
      any unexpected changes to the RabbitMQ definitions.json file, as such a
      change would trigger a restart of RabbitMQ during the next deploy.
      
      Change-Id: Iee98cd937197a73a3b04aa8501fa325e8ecfff24
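      A minimal sketch of setting this flag in globals.yml; the flag name is
      taken from the commit, the file path is the usual Kolla Ansible
      convention, and the value assumes a three-node cluster:

          # /etc/kolla/globals.yml
          rabbitmq_ha_replica_count: 2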
  2. Feb 09, 2023
    • RabbitMQ: Support setting ha-promote-on-shutdown · 94f3ce0c
      John Garbutt authored
      By default, ha-promote-on-shutdown=when-synced. However, we are seeing
      issues with RabbitMQ failing to recover automatically when nodes are
      restarted.
      https://www.rabbitmq.com/ha.html#cluster-shutdown

      Rather than waiting for operator intervention, it is better to allow
      recovery to happen, even if that means we may lose some messages.
      A few failed and timed-out operations are better than a totally broken
      cloud. This is achieved using ha-promote-on-shutdown=always.
      
      Note that when a node failure is detected, this is already the default
      behaviour from 3.7.5 onwards:
      https://www.rabbitmq.com/ha.html#promoting-unsynchronised-mirrors
      
      This patch adds the option to change the ha-promote-on-shutdown
      definition, using the flag `rabbitmq_ha_promote_on_shutdown`. This
      value is unset by default to avoid any unexpected changes to the
      RabbitMQ definitions.json file, as such a change would trigger a
      restart of RabbitMQ during the next deploy.
      
      Related-Bug: #1954925
      
      Change-Id: I2146bda2c72ddac2c9923c6941b0596395fd9ab5
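      A minimal sketch of enabling this in globals.yml; the flag name is
      taken from the commit, and the file path is the usual Kolla Ansible
      convention:

          # /etc/kolla/globals.yml
          rabbitmq_ha_promote_on_shutdown: "always"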
  3. Jan 23, 2023
    • Adding optional delay between l3 agent restarts · 391aa467
      Alex-Welsh authored
      This change serialises the neutron l3 agent restart process and adds a
      user-configurable delay between restarts. This can prevent connectivity
      loss due to all agents being restarted at the same time.

      A large number of routers increases the recovery time, making this
      issue more prevalent.
      
      Change-Id: I3be0ebfa12965e6ae32d1b5f13f8fd23c3f52b8c
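      A minimal sketch of what tuning this delay could look like in
      globals.yml; the variable name and value are assumptions, as the
      commit message does not name the flag:

          # /etc/kolla/globals.yml (variable name assumed, not in the commit)
          neutron_l3_agent_failover_delay: 40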
  4. Jan 20, 2023
    • Set scheduler.max_attempts for nova conductor · 0b62db7c
      Stanislav Dmitriev authored
      In order to honour the configured maximum number of scheduling
      attempts, the option has to be present in nova.conf inside the
      nova_conductor container; otherwise the default value of 3 will be
      used.
      
      Closes-Bug: #2003587
      Change-Id: I928af332b8658223444594f96417830233057284
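      For illustration, the relevant nova.conf override; in Kolla Ansible
      this would typically live in /etc/kolla/config/nova.conf, and the
      value 5 is an arbitrary example:

          [scheduler]
          max_attempts = 5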
  5. Jan 13, 2023
    • Add a flag to handle RabbitMQ high availability · 09df6fc1
      Matt Crees authored
      A combination of durable queues and classic queue mirroring can be used
      to provide high availability of RabbitMQ. However, these options should
      only be used together, otherwise the system will become unstable. Using
      the flag ``om_enable_rabbitmq_high_availability`` will either enable
      both options at once, or neither of them.
      
      There are some queues that should not be mirrored:
      * ``reply`` queues (these have a single consumer and TTL policy)
      * ``fanout`` queues (these have a TTL policy)
      * ``amq`` queues (these are auto-delete queues, with a single consumer)
      An exclusionary pattern is used in the classic mirroring policy. This
      pattern is ``^(?!(amq\\.)|(.*_fanout_)|(reply_)).*``
      
      Change-Id: I51c8023b260eb40b2eaa91bd276b46890c215c25
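      A minimal sketch of enabling this in globals.yml (flag name taken from
      the commit, file path per the usual Kolla Ansible convention), together
      with the exclusionary pattern quoted above as it would appear in a
      classic mirroring policy:

          # /etc/kolla/globals.yml
          om_enable_rabbitmq_high_availability: true

          # Mirroring policy pattern: queues NOT matching amq.*, *_fanout_*
          # or reply_* are mirrored.
          # ^(?!(amq\.)|(.*_fanout_)|(reply_)).*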
  6. Jan 12, 2023
    • Fix prechecks in check mode · 46aeb984
      Mark Goddard authored
      When running in check mode, some prechecks previously failed because
      they use the command module, which is silently skipped in check mode.
      Other prechecks were not running correctly in check mode due to, for
      example, looking for a string in empty command output or not querying
      which containers are running.
      
      This change fixes these issues.
      
      Closes-Bug: #2002657
      Change-Id: I5219cb42c48d5444943a2d48106dc338aa08fa7c
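      As an illustration of the general technique (a sketch, not the actual
      tasks from the patch), an Ansible precheck can force a command to run
      even under --check:

          - name: List running containers (runs even with --check)
            command: docker ps
            register: docker_ps
            check_mode: false
            changed_when: false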
  7. Dec 22, 2022
    • ovn: add ovn-monitor-all variable · 20355edb
      labedz authored
      Setting ovn-monitor-all to 'true' configures
      ovn-controller to monitor all records in the OVN
      Southbound database unconditionally. That releases some
      CPU resources on the OVN Southbound DB server but
      increases the number of events coming to ovn-controller.

      The default value is 'false'.
      
      Change-Id: I291e166013d8c88f00e84ceaf308251c352c9a79
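      A minimal sketch of setting this in globals.yml; the variable name
      ovn_monitor_all is an assumption based on the commit title, which
      does not spell out the exact flag:

          # /etc/kolla/globals.yml (variable name assumed from the title)
          ovn_monitor_all: true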
    • ovn: Change order of deployment · 3a94996b
      Michal Nasiadka authored
      ovn-controller should be deployed first, according to the OVN upgrade
      guide. Since we are getting newer OVN/OVS versions from RDO/Ubuntu
      within a release cycle, let's apply the same ordering to deployment.
      
      Closes-Bug: #1979329
      
      Change-Id: I017aec611a057db1634cfc2634164b21cb210193
  8. Dec 21, 2022
    • Integrate oslo-config-validator · 6c2aace8
      Matt Crees authored
      Regularly, we experience issues in Kolla Ansible deployments because we
      use wrong options in OpenStack configuration files. This is because
      OpenStack services ignore unknown options. We also need to keep on top
      of deprecated options that may be removed in the future. Integrating
      oslo-config-validator into Kolla Ansible will greatly help.
      
      Adds a shared role to run oslo-config-validator on each service. Takes
      into account that services have multiple containers, and these may also
      use multiple config files. Service roles are extended to use this shared
      role. Executed with the new command ``kolla-ansible validate-config``.
      
      Change-Id: Ic10b410fc115646d96d2ce39d9618e7c46cb3fbc
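      For illustration, the new command, plus a sketch of the kind of
      invocation the shared role would run per service; the
      oslo-config-validator arguments shown are an assumption about typical
      usage, not taken from the patch:

          kolla-ansible validate-config

          # Roughly what runs against each config file (arguments assumed):
          oslo-config-validator --input-file /etc/nova/nova.conf \
              --namespace nova.conf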
  9. Nov 11, 2022
    • Remove support for deploying OpenStack Monasca · adb8f89a
      Doug Szumski authored
      Kolla Ansible is switching to OpenSearch and is dropping support for
      deploying ElasticSearch. This is because the final OSS release of
      ElasticSearch has exceeded its end of life.
      
      Monasca is affected because it uses both Logstash and ElasticSearch.
      Whilst it may continue to work with OpenSearch, Logstash remains an
      issue.
      
      In the absence of any renewed interest in the project, we remove
      support for deploying it. This helps to reduce the complexity
      of log processing configuration in Kolla Ansible, freeing up
      development time.
      
      Change-Id: I6fc7842bcda18e417a3fd21c11e28979a470f1cf