Commits · c85b64d1589a515da2f3cf2dcc082d15df1d6edd · Very Demiurge Very Mindful / Kolla Ansible

Apr 13, 2023

Remove RabbitMQ ha-all policy when not required · c85b64d1

Matt Crees authored 2 years ago

With the addition of the variable
`om_enable_rabbitmq_high_availability`, this feature in the upgrade
task should be brought back. It is also now used in the deploy task. The
`ha-all` policy is cleared only when
`om_enable_rabbitmq_high_availability` is set to `false`.

Change-Id: Ia056aa40e996b1f0fed43c0f672466c7e4a2f547

c85b64d1

Apr 12, 2023

rabbitmq: Do not stop containers on upgrade · b30c7bc8

Michal Nasiadka authored 2 years ago

Since RMQ 3.8 we can use rolling upgrade [1].

Depends-On: https://review.opendev.org/c/openstack/kolla/+/872393

[1]: https://www.rabbitmq.com/upgrade.html#rolling-upgrades

Change-Id: If6a7c6c12d9226a2406728108b3c87b3485ac55f

b30c7bc8

Apr 03, 2023
- nova: Fix live migration on RHEL9 derivatives · 7c32e6f3
  Michal Nasiadka authored 1 year ago
  
  Closes-Bug: #2005119
  Change-Id: I542f7ae19b4400355b04854f42a1d1802a6efeea
  7c32e6f3
Mar 29, 2023

Add LimitRequestBody configuration for Horizon · d907790f

Maksim Malchuk authored 2 years ago

Since CVE-2022-29404 was fixed [1,2], the default value for the
LimitRequestBody directive in the Apache HTTP Server has changed
from 0 (unlimited) to 1 GiB. This limits the size of uploads made
through Horizon, such as images. This change adds the ability to
configure the limit.
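
To give a sense of the sizes involved, here is a minimal sketch that converts a limit in GiB into the byte count that `LimitRequestBody` takes. The helper name `limit_request_body_bytes` is hypothetical and is not part of Kolla Ansible.

```python
GIB = 1024 ** 3  # bytes in one GiB


def limit_request_body_bytes(gib: float) -> int:
    """Return a byte count suitable for the Apache LimitRequestBody directive.

    Hypothetical helper for illustration only; in Apache, 0 means "unlimited".
    """
    if gib < 0:
        raise ValueError("limit must not be negative")
    return int(gib * GIB)


# The new Apache default of 1 GiB, expressed in bytes.
print(limit_request_body_bytes(1))  # 1073741824
```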

1. https://access.redhat.com/articles/6975397
2. https://ubuntu.com/security/CVE-2022-29404



Closes-Bug: #2012588
Change-Id: I4cd9dd088cbcf38ff6f8d188ebcc56be7d9ea1c9
Signed-off-by: Maksim Malchuk <maksim.malchuk@gmail.com>

d907790f

Mar 23, 2023
- magnum: Fix trustee creation after ansible-collections-openstack bump · 8dd409ce
  Michal Nasiadka authored 2 years ago
  
  Change-Id: I54e68a3002d69f7b1be2704259c6a072f81aa586
  8dd409ce
- Fix restart_container when restart_policy is no · cdcf6220
  Michal Nasiadka authored 2 years ago
  
  Closes-Bug: #2012654
  Change-Id: I9735b4409a48d80851cbc26a9edbf370af1d45bf
  cdcf6220
Mar 21, 2023

Set RabbitMQ message TTL and queue expiry · fd30dfb8

John Garbutt authored 3 years ago

Following ideas here:
https://wiki.openstack.org/wiki/Large_Scale_Configuration_Rabbit

Make sure old messages with no consumer are dropped after the message
TTL of 10 mins, longer than the 1 min RPC timeout.
Also ensure queues expire after an hour of inactivity, so queues from
removed nodes or renamed nodes don't grow over time.

Change-Id: Ifb28ac68b6328adb604a7474d01e5f7a47b2e788

fd30dfb8

Add flags for RabbitMQ message TTL & queue expiry · dae2cbca

Matt Crees authored 2 years ago

Adds two new flags to alter behaviour in RabbitMQ:
    * `rabbitmq_message_ttl_ms`, which lets you set a TTL on messages.
    * `rabbitmq_queue_expiry_ms`, which lets you set an expiry time on queues.
See https://www.rabbitmq.com/ttl.html for more information on both.
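
As a rough sketch of what the two flags feed into, the snippet below converts the 10 minute TTL and one hour expiry from the related commit into milliseconds and builds a policy definition using RabbitMQ's `message-ttl` and `expires` policy keys. The function names are hypothetical, and skipping unset values is an assumption rather than a description of the actual template.

```python
from typing import Optional


def minutes_to_ms(minutes: int) -> int:
    """Convert minutes to milliseconds."""
    return minutes * 60 * 1000


def ttl_policy(message_ttl_ms: Optional[int], queue_expiry_ms: Optional[int]) -> dict:
    """Build a RabbitMQ policy definition, skipping values that are unset.

    Hypothetical helper; the arguments mirror the rabbitmq_message_ttl_ms
    and rabbitmq_queue_expiry_ms flags.
    """
    definition = {}
    if message_ttl_ms is not None:
        definition["message-ttl"] = message_ttl_ms  # per-message TTL in ms
    if queue_expiry_ms is not None:
        definition["expires"] = queue_expiry_ms  # idle-queue expiry in ms
    return definition


policy = ttl_policy(minutes_to_ms(10), minutes_to_ms(60))
print(policy)  # {'message-ttl': 600000, 'expires': 3600000}
```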

Change-Id: I51ca37ffbb1bb5c07f2d39873f0f33ca20263f2a

dae2cbca

Mar 15, 2023
- ironic: fix dev mode for inspector · f5b3f9d2
  Michal Nasiadka authored 2 years ago
  
  Change-Id: I1649a389bdc3977b936402c3ce3e55056d74ba08
  f5b3f9d2
Mar 06, 2023

Add neutron_ovn_availability_zones parameter · 6768b760

Christian Berendt authored 2 years ago

With the new ``neutron_ovn_availability_zones`` parameter it is possible
to define network availability zones for OVN. Further details can be found
in the Neutron OVN documentation:
https://docs.openstack.org/neutron/latest/admin/ovn/availability_zones.html#how-to-configure-it

Change-Id: I203e0d400a3218d0b4a41f2a948207032c4febec

6768b760

Mar 02, 2023

Set the etcd internal hostname and cacert for tls internal enabled deployments · 5d3eed23

Matthew N Heler authored 2 years ago

This allows services to work with etcd when coordination is enabled
for TLS internal deployments. Without this fix, we fail to connect to
etcd with the coordination backend and the service itself crashes.

Change-Id: I0c1d6b87e663e48c15a846a2774b0a4531a3ca68

5d3eed23

Mar 01, 2023
- etcd: Set the proper peer and client protocol when tls is enabled · ee336ac4
  Matthew N Heler authored 2 years ago
  
  Partial-Bug: #1930109
  Change-Id: I383b2b5a139d24a419145473b66a34c06e32060a
  ee336ac4
Feb 20, 2023

Refactor DockerWorker into ContainerWorker · 9a14a306

Ivan Halomi authored 2 years ago

Fourth part of patchset
https://review.opendev.org/c/openstack/kolla-ansible/+/799229/
which was suggested to be split into smaller patches.

This commit refactors selected methods from the DockerWorker class
into a new ContainerWorker class. The new class contains the
Docker-independent methods that are also used by the Podman
introduction, and is intended as a parent class for engine-specific
worker classes.
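
The shape of such a split can be sketched as below. Only the class names ContainerWorker and DockerWorker come from this commit; the methods and attributes (`container_name`, `start_container`, `changed`) are hypothetical and chosen purely for illustration.

```python
from abc import ABC, abstractmethod


class ContainerWorker(ABC):
    """Engine-independent logic shared by all workers (illustrative sketch)."""

    def __init__(self, params: dict):
        self.params = params
        self.changed = False

    def container_name(self) -> str:
        # Engine-independent helper: needs no Docker or Podman API call.
        return self.params["name"]

    @abstractmethod
    def start_container(self) -> None:
        """Engine-specific: each subclass talks to its own runtime."""


class DockerWorker(ContainerWorker):
    def start_container(self) -> None:
        # A real worker would call the Docker API here (omitted in this sketch).
        self.changed = True


worker = DockerWorker({"name": "rabbitmq"})
worker.start_container()
print(worker.container_name(), worker.changed)  # rabbitmq True
```

A Podman-specific worker would then subclass ContainerWorker in the same way and supply its own engine calls.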

Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Co-authored-by: Martin Hiner <m.hiner@partner.samsung.com>
Change-Id: I2dd5920410dda053f2dfedc4e2666c56b1a7095a

9a14a306

Feb 18, 2023

hacluster: Use nodename to align with nova service names · e1ae8223

Matthew N Heler authored 2 years ago

For Masakari and HACluster to work properly, the hostnames used
in HACluster need to match with the hostnames used in Nova.

Change-Id: Iac917ef4471905caab591cd64eab379e150a8524

e1ae8223

Feb 14, 2023

Fix deploy/genconfig in check mode · 572ff2f8

Mark Goddard authored 2 years ago

Previously, running one of the following commands:

  kolla-ansible deploy --check
  kolla-ansible genconfig --check

caused deployment or configuration generation to fail for various reasons:

MariaDB failed to look up the existing cluster.

Keystone failed to generate cron config.

Nova-cell failed to get the cell settings.

Closes-Bug: #2002661
Change-Id: I5e765f498ae86d213d0a4379ca5d473db1499962

572ff2f8

Improve RabbitMQ performance by reducing ha replicas · 6cf22b0c

John Garbutt authored 3 years ago

Currently we do not follow the RabbitMQ advice on replicas here:
https://www.rabbitmq.com/ha.html#replication-factor

Here we reduce the number of replicas to n // 2 + 1 as advised
above. The hope is that this helps speed up recovery from RabbitMQ
issues.
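
The arithmetic is a simple majority of the cluster nodes. A minimal sketch, with a hypothetical function name:

```python
def ha_replica_count(node_count: int) -> int:
    """Return n // 2 + 1, a simple majority of the cluster nodes."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    return node_count // 2 + 1


for n in (1, 3, 5):
    print(n, ha_replica_count(n))
# 1 1
# 3 2
# 5 3
```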

Related-Bug: #1954925
Change-Id: Ib6bcb26c499c9884faa4a0cd51abaec00cacb096

6cf22b0c

Add flag to change RabbitMQ ha-mode definition · e13072a9

Matt Crees authored 2 years ago

Adds the flag `rabbitmq_ha_replica_count` to change how many different
nodes a queue should be mirrored across. If the value is not set, then
it defaults to "ha-mode":"all". This value is unset by default to avoid
any unexpected changes to the RabbitMQ definitions.json file, as that
would trigger an unexpected restart of RabbitMQ during the next deploy.
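
The default-versus-set behaviour can be sketched as follows. RabbitMQ's classic mirroring policies do accept `"ha-mode": "exactly"` together with `"ha-params"`, but that the flag maps to exactly these keys is an assumption here, and the function name is hypothetical.

```python
from typing import Optional


def ha_policy_definition(replica_count: Optional[int] = None) -> dict:
    """Return the mirroring part of a policy definition.

    Unset (None) keeps the previous "all" behaviour, so the rendered
    definitions stay unchanged unless the operator opts in.
    """
    if replica_count is None:
        return {"ha-mode": "all"}
    # Assumption: a fixed replica count maps to ha-mode "exactly".
    return {"ha-mode": "exactly", "ha-params": int(replica_count)}


print(ha_policy_definition())   # {'ha-mode': 'all'}
print(ha_policy_definition(2))  # {'ha-mode': 'exactly', 'ha-params': 2}
```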

Change-Id: Iee98cd937197a73a3b04aa8501fa325e8ecfff24

e13072a9

Use loadbalancer to connect to etcd · e2c7dace

Will Szumski authored 2 years ago

Hardcoding the first etcd host creates a single point of failure.

Change-Id: I0f83030fcd84ddcdc4bf2226e76605c7cab84cbb

e2c7dace

Feb 13, 2023

Put etcd behind HTTP loadbalancer · 6f536a4f

Will Szumski authored 2 years ago


etcd-compatible tooz drivers do not support multiple endpoints via
backend_url. We can put a loadbalancer in front of etcd and configure
backend_url to use the VIP instead. The issue with hard coding the first
host is that we break coordination if we take this host offline. In the
case of cinder, we would not be able to perform any volume related
operations.
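
A minimal sketch of building such a `backend_url`, assuming the tooz `etcd3+http` / `etcd3+https` driver schemes and etcd's default client port 2379. The function name and the VIP address are hypothetical.

```python
def etcd_backend_url(vip: str, port: int = 2379, tls: bool = False) -> str:
    """Build a tooz backend_url that points at the load balancer VIP.

    Using the VIP instead of the first etcd host means coordination
    keeps working when any single etcd host is taken offline.
    """
    scheme = "https" if tls else "http"
    return f"etcd3+{scheme}://{vip}:{port}"


print(etcd_backend_url("10.0.0.10"))            # etcd3+http://10.0.0.10:2379
print(etcd_backend_url("10.0.0.10", tls=True))  # etcd3+https://10.0.0.10:2379
```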

Co-Authored-By: Mark Goddard <mark@stackhpc.com>
Change-Id: Ib684501ba03c386dc5ac71e5cbea05c99f191665

6f536a4f

Feb 09, 2023

RabbitMQ: Support setting ha-promote-on-shutdown · 94f3ce0c

John Garbutt authored 3 years ago

By default ha-promote-on-shutdown=when-synced. However, we are seeing
issues with RabbitMQ automatically recovering when nodes are restarted.
https://www.rabbitmq.com/ha.html#cluster-shutdown

Rather than waiting for operator intervention, it is better to allow
recovery to happen, even if that means we may lose some messages.
A few failed and timed-out operations are better than a totally broken
cloud. This is achieved using ha-promote-on-shutdown=always.

Note, when a node failure is detected, this is already the default
behaviour from 3.7.5 onwards:
https://www.rabbitmq.com/ha.html#promoting-unsynchronised-mirrors

This patch adds the option to change the ha-promote-on-shutdown
definition, using the flag `rabbitmq_ha_promote_on_shutdown`. This
value is unset by default to avoid any unexpected changes to the
RabbitMQ definitions.json file, as that would trigger an unexpected
restart of RabbitMQ during the next deploy.

Related-Bug: #1954925

Change-Id: I2146bda2c72ddac2c9923c6941b0596395fd9ab5

94f3ce0c

Feb 07, 2023
- remove elasticsearch remnants in antelope cycle · ee658f45
  Bartosz Bezak authored 2 years ago
  
  Change-Id: I115b491eca413437926f5bcaf53336151f9a7c0b
  ee658f45
Feb 04, 2023

Fix kolla_docker module · 63b9fa56

Michal Arbet authored 2 years ago

This patch fixes the kolla_docker module,
which did not take the common_options
parameter into account. The patchset shows that the module's
default values were always used, even if the user overrode
a parameter in the common_options dict.
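
The intended precedence can be sketched as below. This is not the module's actual code: the parameter names and default values are illustrative, and letting explicit task parameters win over common_options is an assumption.

```python
# Illustrative defaults; not taken from the real module.
MODULE_DEFAULTS = {"restart_policy": "unless-stopped", "graceful_timeout": 10}


def effective_params(module_params: dict, common_options: dict) -> dict:
    """Resolve parameters: defaults < common_options < explicit params."""
    resolved = dict(MODULE_DEFAULTS)
    resolved.update(common_options)
    # Explicit (non-None) task parameters win over everything else.
    resolved.update({k: v for k, v in module_params.items() if v is not None})
    return resolved


print(effective_params({"graceful_timeout": None}, {"graceful_timeout": 60}))
# {'restart_policy': 'unless-stopped', 'graceful_timeout': 60}
```

The bug described above corresponds to skipping the `common_options` step, so the result would always carry the module default.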

Closes-Bug: #2003079

Change-Id: I677fde708dd004decaff4bd39f2173d8d81052fb

63b9fa56

Feb 03, 2023

Do not support dimensions:kernel_memory on Docker API 1.42 · f253f99c

Michal Nasiadka authored 2 years ago

It is deprecated in 20.10 and removed in 23.0 (and 23.0 is out) [1], [2].

[1]: https://docs.docker.com/engine/deprecated/#kernel-memory-limit
[2]: https://docs.docker.com/engine/api/version-history/#v142-api-changes

Change-Id: Ia6fa85172aad7bcd5f958922d3c224ef79882e6c

f253f99c

Feb 02, 2023

Switch trove-api to wsgi running under apache. · 303998e2

wu.chunyang authored 2 years ago

This change also adds support for Trove backend TLS.

Depends-On: https://review.opendev.org/c/openstack/kolla/+/854744
Change-Id: I2acf7820b24b112b57b0c00a01f5c4b8cb85ce25

303998e2

Jan 31, 2023

Trivial: Add connection: local for keystone-fernet cron generate task · 78cf9585

Michal Arbet authored 2 years ago

This patch adds `connection: local` to the above-mentioned task, as
kolla-ansible can be executed inside a Docker container, as in
my case.

Without `connection: local`, Ansible tries to connect
to localhost via SSH, where the specified Python script is not available.

With `connection: local`, everything works as expected, as the file
is found inside the container.

Closes-Bug: #2004224

Change-Id: I219a958b4f101efb71a2935e6d910dae5c65f0be

78cf9585

Add skyline service · 113b77c8

yangshaoxue authored 3 years ago

Add support for deploying Skyline with kolla-ansible.

Implements: blueprint skyline
Depends-On: https://review.opendev.org/c/openstack/kolla/+/826948

Change-Id: Ice5621491a432ba32138abd6f62d1f815cc219e0

113b77c8

Jan 30, 2023

Default neutron_tls_proxy and glance_tls_proxy to haproxy_tag · 95895d5b

Bartosz Bezak authored 2 years ago

neutron_tls_proxy and glance_tls_proxy use the haproxy container
image. Pin them to haproxy_tag directly.

Change-Id: I73142db48ebe6641520d21b560f16de892e07c34

95895d5b

Jan 29, 2023

Remove support for Ubuntu Focal 20.04 hosts · 6db6bc0a

Bartosz Bezak authored 2 years ago

Users running on a Focal host will now fail in prechecks.

Change-Id: Icaef4b25458490e46f623b055658abc678d2f1c6

6db6bc0a

Jan 26, 2023

Remove system scope token to access services · 283fa242

Ghanshyam Mann authored 2 years ago

As per the new RBAC direction in the Zed cycle, we have dropped the
system scope from API policies, and all policies are hardcoded
to project scope, so any user accessing APIs with a system scope token
will get a 403 error. System scope is dropped from all OpenStack services
except Ironic, which keeps system scope. To support Ironic-only
deployments, we are keeping system as well as project
scope in Keystone.

The complete discussion and direction can be found in the Gerrit
change and TC goal below:

- https://review.opendev.org/c/openstack/governance/+/847418
- https://governance.openstack.org/tc/goals/selected/consistent-and-secure-rbac.html#the-issues-we-are-facing-with-scope-concept

As phase 2 of the RBAC goal, services will start enabling the new
defaults and project scope by default. For example, Nova did so in:
- https://review.opendev.org/c/openstack/nova/+/866218

Kolla Ansible started accessing the services using a system scope token in:
- https://review.opendev.org/c/openstack/kolla-ansible/+/692179

This commit partially reverts the above change, keeping
system scope usage only for Keystone and Ironic. All other services are
changed to use a project scope token.

It also enables the scope and new defaults for Nova, which were disabled
by https://review.opendev.org/c/openstack/kolla-ansible/+/870804

Change-Id: I0adbe0a6c39e11d7c9542569085fc5d580f26c9d

283fa242

Jan 23, 2023

Adding optional delay between l3 agent restarts · 391aa467

Alex-Welsh authored 2 years ago

This change serialises the neutron l3 agent restart process and adds a
user configurable delay between restarts. This can prevent connectivity
loss due to all agents being restarted at the same time.

Routers increase the recovery time, making this issue more prevalent.
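
The serialised restart with a delay can be sketched as follows. All names here are hypothetical; the actual change is implemented in Ansible tasks, not in Python.

```python
import time
from typing import Callable, Iterable, List


def restart_serially(
    hosts: Iterable[str],
    restart: Callable[[str], None],
    delay_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Restart one agent at a time, pausing between restarts."""
    done: List[str] = []
    for host in hosts:
        if done and delay_seconds > 0:
            sleep(delay_seconds)  # give the previous agent time to recover
        restart(host)
        done.append(host)
    return done


# `print` stands in for the real restart action in this sketch.
order = restart_serially(["net1", "net2", "net3"], restart=print, delay_seconds=0)
```

Because only one agent is down at any moment, the remaining agents can keep serving traffic while each restart completes.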

Change-Id: I3be0ebfa12965e6ae32d1b5f13f8fd23c3f52b8c

391aa467

Jan 20, 2023

Set scheduler.max_attempts for nova conductor · 0b62db7c

Stanislav Dmitriev authored 2 years ago

In order to honour the configured maximum number of attempts,
the option has to be present in nova.conf inside the
nova_conductor container; otherwise the default value
of 3 will be used.

Closes-Bug: #2003587
Change-Id: I928af332b8658223444594f96417830233057284

0b62db7c

Jan 19, 2023

Add systemd container control · 4866017e

Martin Hiner authored 3 years ago


This commit adds SystemdWorker class to kolla_docker ansible module.
It is used to manage container state via systemd calls.

Change-Id: I20e65a6771ebeee462a3aaaabaa5f0596bdd0581
Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Signed-off-by: Martin Hiner <m.hiner@partner.samsung.com>

4866017e

Jan 17, 2023

Add ability to configure rabbitmq · 701dc20f

Michal Arbet authored 2 years ago

As RabbitMQ's configuration file is neither an INI nor a YAML file,
there is no way to extend the configuration with new
options via merge_configs or merge_yaml.

This patch moves the config options to a dictionary
so they can be overridden in /etc/kolla/globals.yml.
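
The idea of a mergeable dictionary rendered into `key = value` lines can be sketched as below. The function and the `defaults`/`overrides` dictionaries are hypothetical stand-ins; the real variable names are not given in this commit message.

```python
def render_rabbitmq_conf(defaults: dict, overrides: dict) -> str:
    """Merge overrides into defaults and render `key = value` lines."""
    merged = {**defaults, **overrides}  # overrides win on key clashes
    return "\n".join(f"{key} = {value}" for key, value in sorted(merged.items()))


# Example keys only; the operator-supplied dict would come from globals.yml.
defaults = {"listeners.tcp.default": 5672, "log.file.level": "info"}
overrides = {"log.file.level": "debug"}
print(render_rabbitmq_conf(defaults, overrides))
# listeners.tcp.default = 5672
# log.file.level = debug
```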

Change-Id: I5cd772f4fb80a0e200fb24d67be735ca81e3fdeb

701dc20f

Disable new defaults and scope for Nova API policies (RBAC) · 941abf9e

Pierre Riteau authored 2 years ago

Nova changes to RBAC [1] are breaking Kolla Ansible and causing most CI
jobs to fail. Disable these changes until we can adapt.

[1] https://review.opendev.org/c/openstack/nova/+/866218

Change-Id: I506697d2b374e74a6b066c788bd2d61edc8d4876

941abf9e

Jan 16, 2023

Remove [trustee]/auth_uri option from heat.conf · 943fedee

Pierre Riteau authored 2 years ago

According to the code, docs and oslo-config-validator, this
configuration option is not supported.

Change-Id: I34410e5267d527ec629748f35771f227183810b6

943fedee

Remove use_forwarded_for configuration option · bbe8374f

Pierre Riteau authored 2 years ago

This option has never been supported by Glance.

Change-Id: I08113292ec862d6ef72b870dcf12577bf02d3771

bbe8374f

Fix issue with genconfig and octavia_auto_configure · 2bf4d4db

Will Szumski authored 2 years ago

Makes sure the facts required to generate octavia.conf are available
when using genconfig.

This change also ensures that the necessary tasks run when using Ansible
check mode.

Closes-Bug: #1987299
Change-Id: Ib8fbee2d3abdcfd2eae0f9b3e9b69eeb0e3086e0

2bf4d4db

Jan 13, 2023

Add a flag to handle RabbitMQ high availability · 09df6fc1

Matt Crees authored 2 years ago

A combination of durable queues and classic queue mirroring can be used
to provide high availability of RabbitMQ. However, these options should
only be used together, otherwise the system will become unstable. Using
the flag ``om_enable_rabbitmq_high_availability`` will either enable
both options at once, or neither of them.

There are some queues that should not be mirrored:
* ``reply`` queues (these have a single consumer and TTL policy)
* ``fanout`` queues (these have a TTL policy)
* ``amq`` queues (these are auto-delete queues, with a single consumer)
An exclusionary pattern is used in the classic mirroring policy. This
pattern is ``^(?!(amq\\.)|(.*_fanout_)|(reply_)).*``
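
To see which queues the pattern selects, here is a small check using Python's `re` module. It assumes the doubled backslash above is JSON escaping, so a raw string needs only one. The queue names are made up for illustration.

```python
import re

# In a Python raw string the backslash is written once.
MIRROR_PATTERN = re.compile(r"^(?!(amq\.)|(.*_fanout_)|(reply_)).*")


def is_mirrored(queue_name: str) -> bool:
    """Return True if the mirroring policy pattern matches the queue name."""
    return MIRROR_PATTERN.match(queue_name) is not None


for name in ("conductor", "amq.gen-abc", "scheduler_fanout_1a2b", "reply_1a2b"):
    print(name, is_mirrored(name))
# conductor True
# amq.gen-abc False
# scheduler_fanout_1a2b False
# reply_1a2b False
```

The negative lookahead rejects names that start with `amq.` or `reply_`, and names that contain `_fanout_` anywhere, so only the remaining queues are mirrored.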

Change-Id: I51c8023b260eb40b2eaa91bd276b46890c215c25

09df6fc1

Jan 12, 2023

Fix prechecks in check mode · 46aeb984

Mark Goddard authored 2 years ago

When running in check mode, some prechecks previously failed because
they use the command module which is silently not run in check mode.
Other prechecks were not running correctly in check mode due to e.g.
looking for a string in empty command output or not querying which
containers are running.

This change fixes these issues.

Closes-Bug: #2002657
Change-Id: I5219cb42c48d5444943a2d48106dc338aa08fa7c

46aeb984

Jan 11, 2023

Stop firewalld config during kolla genconfig · 86870bd7

Jack Hodgkiss authored 2 years ago

Prevent the haproxy-config role from attempting to modify firewalld when
running kolla-ansible genconfig.

Closes-Bug: #2002522
Change-Id: Ie8a524cc944aa8cb9cf0999b1b8da79f30b40092

86870bd7